II. Elasticsearch Basic Syntax

Author: 创始人

2025-05-29 06:07:59

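I. A quick look at the IK analyzer (tokenization results)
- 1. standard (single-character tokenization for Chinese; the ES default analyzer)
- 2. ik_smart (coarse-grained splitting)
- 3. ik_max_word (finest-grained splitting)
II. Specifying a default analyzer
- 1. Setting the default analyzer for an index
III. Working with data in ES
- 1. Overview
- 2. Creating an index
- 3. Querying an index
- 4. Deleting an index
- 5. Adding a document
- 6. Querying documents
  - 6.1 Querying all documents in an index
  - 6.2 Simple equality query
  - 6.3 Simple range query
  - 6.4 IN-style query by id
  - 6.5 Paged query
  - 6.6 Returning only selected fields
  - 6.7 Sorted query
- 7. Updating a document
- 8. Deleting a document
- 9. Difference between PUT and POST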

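I. A quick look at the IK analyzer (tokenization results)

1. standard (single-character tokenization for Chinese; the ES default analyzer)

The standard analyzer is built into Elasticsearch and is not part of IK; the IK analyzer is a third-party plugin.

POST _analyze
{
  "analyzer": "standard",
  "text": "我爱学搜索引擎"
}

Result (the text is split character by character, so every character becomes its own token):

{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "<IDEOGRAPHIC>", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "<IDEOGRAPHIC>", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "<IDEOGRAPHIC>", "position" : 2},
    {"token" : "搜", "start_offset" : 3, "end_offset" : 4, "type" : "<IDEOGRAPHIC>", "position" : 3},
    {"token" : "索", "start_offset" : 4, "end_offset" : 5, "type" : "<IDEOGRAPHIC>", "position" : 4},
    {"token" : "引", "start_offset" : 5, "end_offset" : 6, "type" : "<IDEOGRAPHIC>", "position" : 5},
    {"token" : "擎", "start_offset" : 6, "end_offset" : 7, "type" : "<IDEOGRAPHIC>", "position" : 6}
  ]
}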

2. ik_smart (coarse-grained splitting)

Unlike the standard analyzer, ik_smart tokenizes at a coarse granularity: here it keeps 搜索引擎 ("search engine") as a single token.

POST _analyze
{
  "analyzer": "ik_smart",
  "text": "我爱学搜索引擎"
}

Result:

{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "CN_CHAR", "position" : 2},
    {"token" : "搜索引擎", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 3}
  ]
}

3. ik_max_word (finest-grained splitting)

ik_max_word tokenizes at the finest granularity: it emits every combination it recognizes as a word, so 搜索引擎 also yields 搜索, 索引, and 引擎.

POST _analyze
{
  "analyzer": "ik_max_word",
  "text": "我爱学搜索引擎"
}

Result:

{
  "tokens" : [
    {"token" : "我", "start_offset" : 0, "end_offset" : 1, "type" : "CN_CHAR", "position" : 0},
    {"token" : "爱", "start_offset" : 1, "end_offset" : 2, "type" : "CN_CHAR", "position" : 1},
    {"token" : "学", "start_offset" : 2, "end_offset" : 3, "type" : "CN_CHAR", "position" : 2},
    {"token" : "搜索引擎", "start_offset" : 3, "end_offset" : 7, "type" : "CN_WORD", "position" : 3},
    {"token" : "搜索", "start_offset" : 3, "end_offset" : 5, "type" : "CN_WORD", "position" : 4},
    {"token" : "索引", "start_offset" : 4, "end_offset" : 6, "type" : "CN_WORD", "position" : 5},
    {"token" : "引擎", "start_offset" : 5, "end_offset" : 7, "type" : "CN_WORD", "position" : 6}
  ]
}
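
When comparing analyzers it can help to reduce each _analyze response to just its token strings. The following is a minimal Python sketch of that idea, not part of ES itself: the helper name token_texts is hypothetical, and the sample string is a copy of the ik_smart response shown earlier.

```python
import json
from typing import List


def token_texts(analyze_response: str) -> List[str]:
    """Return the token strings of an _analyze response, ordered by position."""
    tokens = json.loads(analyze_response)["tokens"]
    return [t["token"] for t in sorted(tokens, key=lambda t: t["position"])]


# Copy of the ik_smart response from the section above.
IK_SMART_RESPONSE = """
{"tokens": [
  {"token": "我", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
  {"token": "爱", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
  {"token": "学", "start_offset": 2, "end_offset": 3, "type": "CN_CHAR", "position": 2},
  {"token": "搜索引擎", "start_offset": 3, "end_offset": 7, "type": "CN_WORD", "position": 3}
]}
"""

if __name__ == "__main__":
    # Prints the four ik_smart tokens as a Python list.
    print(token_texts(IK_SMART_RESPONSE))
```

Running the same helper over the standard and ik_max_word responses makes the difference in granularity easy to see at a glance.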

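II. Specifying a default analyzer

1. Setting the default analyzer for an index

Create an index named test_index_database (an index is roughly comparable to a database in MySQL) and set its default analyzer to ik_max_word:

PUT /test_index_database
{
  "settings": {
    "index": {
      "analysis.analyzer.default.type": "ik_max_word"
    }
  }
}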
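
Index settings can be written with dotted keys, as in the request above, or as nested JSON objects; both forms describe the same setting. The Python sketch below only illustrates that equivalence by expanding dotted keys into nested dictionaries. The helper name expand_dotted is hypothetical, and it does not merge colliding keys, so treat it as a sketch rather than a general-purpose tool.

```python
import json


def expand_dotted(settings: dict) -> dict:
    """Expand dotted keys such as "a.b.c" into nested dictionaries.

    Colliding keys are overwritten, not merged, which is enough for this example.
    """
    result: dict = {}
    for key, value in settings.items():
        if isinstance(value, dict):
            value = expand_dotted(value)
        node = result
        parts = key.split(".")
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


# Request body from the section above, using a dotted key.
DOTTED_BODY = {
    "settings": {"index": {"analysis.analyzer.default.type": "ik_max_word"}}
}

if __name__ == "__main__":
    # Prints the same settings as nested JSON objects.
    print(json.dumps(expand_dotted(DOTTED_BODY), indent=2))
```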

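III. Working with data in ES

From 7.x onward, the document type defaults to _doc.

1. Overview

ES is document-oriented: it stores whole objects (documents) and can index, search, sort, and filter them.
JSON is the serialization format for documents.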

2. Creating an index

PUT /test_index01

3. Querying an index

GET /test_index01

The response is shown below. number_of_shards is the number of primary shards and number_of_replicas is the number of replicas.
In ES 7.6.1 both default to 1. The defaults depend on the ES version: before 7.0, for example, the default number of primary shards was 5.

{
  "test_index01" : {
    "aliases" : { },
    "mappings" : { },
    "settings" : {
      "index" : {
        "creation_date" : "1678969193239",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "n6tD0dyxTB2aOQjqyDK0QQ",
        "version" : {
          "created" : "7060199"
        },
        "provided_name" : "test_index01"
      }
    }
  }
}

4. Deleting an index

DELETE /test_index01

5. Adding a document

Format: PUT /index_name/type/id

PUT /test_index01/_doc/1
{
  "name": "张三",
  "sex": 1,
  "age": 25,
  "address": "北京",
  "remark": "java"
}

Fields in the response:
_index: index name
_type: type
_id: document id
_version: version (it increases each time the document is changed, so it will not always be 1)
result: outcome of the operation (created, updated, and so on)

{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

6. Querying documents

Format: GET /index_name/type/id

GET /test_index01/_doc/1

Result:

{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "name" : "张三",
    "sex" : 1,
    "age" : 25,
    "address" : "北京",
    "remark" : "java"
  }
}

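6.1 Querying all documents in an index

Format: GET /index_name/type/_search

GET /test_index01/_doc/_search

This is roughly the equivalent of select * in MySQL, except that a search returns only the first 10 hits by default.
Result (there is only one document here; its values match the document after the update in section 7). The deprecation warning appears because specifying a type in search requests is deprecated in 7.x; GET /test_index01/_search avoids it.

#! Deprecation: [types removal] Specifying types in search requests is deprecated.
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "test_index01",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "name" : "秀儿",
          "sex" : 1,
          "age" : 25,
          "address" : "上海",
          "remark" : "java"
        }
      }
    ]
  }
}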

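6.2 Simple equality query

Format: GET /index_name/type/_search?q=field:value

GET /test_index01/_doc/_search?q=age:25

6.3 Simple range query

Format: GET /index_name/type/_search?q=field:[left TO right]

Square brackets make both bounds inclusive.

GET /test_index01/_doc/_search?q=age:[25 TO 26]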

6.4 IN-style query by id

Format: GET /index_name/type/_mget

GET /test_index01/_doc/_mget
{
  "ids": ["1", "2"]
}

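6.5 Paged query

Format:
GET /index_name/type/_search?from=0&size=1
GET /index_name/type/_search?q=condition&from=0&size=1

from is the offset of the first hit to return and size is the number of hits per page.

GET /test_index01/_doc/_search?from=0&size=1

GET /test_index01/_doc/_search?q=age:[25 TO 26]&from=0&size=1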
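
Kibana accepts spaces and brackets in the q parameter as typed, but an ordinary HTTP client usually needs them percent-encoded. The Python sketch below builds such a path with the standard library. The helper name search_path is hypothetical, and it assumes the typeless /index/_search form that 7.x recommends in place of /index/_doc/_search.

```python
from urllib.parse import quote, urlencode


def search_path(index: str, params: dict) -> str:
    """Build a URI-search path with percent-encoded query parameters."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    query = urlencode(params, quote_via=quote)
    return f"/{index}/_search" + (f"?{query}" if query else "")


if __name__ == "__main__":
    # A paged range query on age: first page, one hit per page.
    print(search_path("test_index01", {"q": "age:[25 TO 26]", "from": 0, "size": 1}))
```

The parameters are passed as a dictionary because from is a reserved word in Python and cannot be used as a keyword argument.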

6.6 Returning only selected fields

Format: GET /index_name/type/_search?_source=field,field

GET /test_index01/_doc/_search?_source=name,age

6.7 Sorted query

Format: GET /index_name/type/_search?sort=field:desc (or field:asc)

GET /test_index01/_doc/_search?sort=age:desc
GET /test_index01/_doc/_search?sort=age:asc
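
The URI parameters used in 6.1 to 6.7 (q, from, size, _source, sort) each have a counterpart in the JSON request body of _search. The Python sketch below assembles such a body; the helper name build_search_body and its parameter names are hypothetical, while query_string, match_all, from, size, _source, and sort are existing request-body fields. It only builds the dictionary and does not contact a cluster.

```python
import json
from typing import List, Optional, Tuple


def build_search_body(
    query_string: Optional[str] = None,
    from_: int = 0,
    size: int = 10,
    source_fields: Optional[List[str]] = None,
    sort: Optional[List[Tuple[str, str]]] = None,
) -> dict:
    """Build a _search request body mirroring the URI parameters."""
    body: dict = {"from": from_, "size": size}
    if query_string is not None:
        # Same syntax as the q= parameter, e.g. "age:[25 TO 26]".
        body["query"] = {"query_string": {"query": query_string}}
    else:
        # No condition: match every document, like a bare _search.
        body["query"] = {"match_all": {}}
    if source_fields:
        body["_source"] = list(source_fields)
    if sort:
        body["sort"] = [{field: {"order": order}} for field, order in sort]
    return body


if __name__ == "__main__":
    body = build_search_body(
        "age:[25 TO 26]", from_=0, size=1,
        source_fields=["name", "age"], sort=[("age", "desc")],
    )
    # The printed JSON could be sent as the body of GET /test_index01/_search.
    print(json.dumps(body, indent=2))
```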

7. Updating a document

Format: PUT /index_name/type/id

The request replaces the whole document with the given id.

PUT /test_index01/_doc/1
{
  "name": "秀儿",
  "sex": 1,
  "age": 25,
  "address": "上海",
  "remark": "java"
}

Result (_version is now 2 and result is updated):

{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 2,
  "result" : "updated",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 1,
  "_primary_term" : 1
}

8. Deleting a document

Format: DELETE /index_name/type/id

DELETE /test_index01/_doc/1

Result:

{
  "_index" : "test_index01",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 3,
  "result" : "deleted",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 2,
  "_primary_term" : 1
}

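9. Difference between PUT and POST

Both POST and PUT can create and update documents.
①PUT:
(1) It operates on one specific resource, so an id is required for both create and update; without an id the request fails.
(2) It replaces the entire JSON document.
(3) Like DELETE, it is idempotent: repeating the same request leaves the document in the same state (only _version increases).
②POST:
(1) It operates on the resource collection: without an id, ES generates a unique id and creates the document; with an id, it creates or updates that document.
(2) POST /index_name/_doc/id also replaces the entire document. To change only some fields, send POST /index_name/_update/id with a body of the form {"doc": {...}}; only the fields listed there are updated.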
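
The difference between replacing a whole document and updating only some fields can be modelled with a plain dictionary. The Python sketch below is an in-memory illustration, not ES code: put_document and partial_update are hypothetical names, and it assumes the partial update is sent through the _update endpoint with a {"doc": {...}} body. The merge here is shallow, whereas ES merges nested objects recursively.

```python
def put_document(store: dict, doc_id: str, body: dict) -> str:
    """Model of PUT /index/_doc/{id}: the stored document is replaced wholesale."""
    created = doc_id not in store
    store[doc_id] = dict(body)
    return "created" if created else "updated"


def partial_update(store: dict, doc_id: str, partial: dict) -> str:
    """Model of POST /index/_update/{id} with {"doc": partial}: fields are merged."""
    if doc_id not in store:
        raise KeyError(f"document {doc_id} does not exist")
    merged = {**store[doc_id], **partial}  # shallow merge, for illustration only
    if merged == store[doc_id]:
        return "noop"  # nothing changed, so nothing is rewritten
    store[doc_id] = merged
    return "updated"


if __name__ == "__main__":
    store: dict = {}
    print(put_document(store, "1", {"name": "张三", "age": 25}))
    # A second PUT with fewer fields drops the fields it does not mention.
    print(put_document(store, "1", {"name": "秀儿"}), store["1"])
    # A partial update keeps existing fields and adds or changes the given ones.
    print(partial_update(store, "1", {"age": 26}), store["1"])
```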
