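Full-text search over Chinese text in Elasticsearch needs an analyzer that can segment Chinese. The default (standard) analyzer can be used, but it splits Chinese text into single characters and gives poor results, so the IK analyzer is a common choice.
The IK plugin version should match the Elasticsearch version.
This walkthrough uses elasticsearch-7.16.0, so the IK plugin version is 7.16.0 as well.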
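The version-matching rule can be sketched as a small helper that derives the plugin download URL from the Elasticsearch version. This is only an illustration: the function name ik_release_url is hypothetical, and it assumes that other releases follow the same URL pattern as the 7.16.0 link used later in this article, which may not hold for every version.

```python
def ik_release_url(es_version: str) -> str:
    """Build the IK plugin download URL for an Elasticsearch version.

    Assumption: every release uses the same URL layout as the 7.16.0
    link shown in this article. Check the release page before relying on it.
    """
    parts = es_version.split(".")
    # Accept only plain "major.minor.patch" versions such as 7.16.0.
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"expected a version like 7.16.0, got {es_version!r}")
    base = "https://github.com/medcl/elasticsearch-analysis-ik/releases/download"
    return f"{base}/v{es_version}/elasticsearch-analysis-ik-{es_version}.zip"


print(ik_release_url("7.16.0"))
```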
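The IK analyzer can be installed in two ways.
1. Download the matching plugin archive, unzip it into the plugins directory, and restart Elasticsearch when done.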
- cd /data/es/elasticsearch-7.16.0-node-1/plugins/
- mkdir ik
- cd ik
- wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.16.0/elasticsearch-analysis-ik-7.16.0.zip
- unzip elasticsearch-analysis-ik-7.16.0.zip
2. Install with elasticsearch-plugin (supported since v5.5.1), then restart Elasticsearch. The install command takes the archive URL directly.
- cd /data/es/elasticsearch-7.16.0-node-1/
- ./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.16.0/elasticsearch-analysis-ik-7.16.0.zip
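After the restart it can be useful to confirm that the node actually loaded the plugin. One way is to read the output of GET /_cat/plugins?format=json and look for the IK entry. The sketch below only parses such a response; the helper name has_plugin is hypothetical, and both the component name analysis-ik and the sample row are assumptions about what a 7.16.0 node would report.

```python
import json


def has_plugin(cat_plugins_json: str, component: str = "analysis-ik") -> bool:
    """Return True if a _cat/plugins JSON response lists the given component.

    Assumption: the response is a JSON array of rows with a "component"
    key, and the IK plugin registers itself as "analysis-ik".
    """
    rows = json.loads(cat_plugins_json)
    return any(row.get("component") == component for row in rows)


# Made-up sample row for illustration, not captured from a real cluster.
sample = '[{"name": "node-1", "component": "analysis-ik", "version": "7.16.0"}]'
print(has_plugin(sample))
```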
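Test the analyzer by sending the following body to the _analyze API (for example, POST http://127.0.0.1:9200/_analyze):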
- {
- "analyzer" : "ik_max_word",
- "text": "河北省石家庄市高新区虚度大道"
- }
Response:
- {
- "tokens": [
- {
- "token": "河北省",
- "start_offset": 0,
- "end_offset": 3,
- "type": "CN_WORD",
- "position": 0
- },
- {
- "token": "河北",
- "start_offset": 0,
- "end_offset": 2,
- "type": "CN_WORD",
- "position": 1
- },
- {
- "token": "省",
- "start_offset": 2,
- "end_offset": 3,
- "type": "CN_CHAR",
- "position": 2
- },
- {
- "token": "石家庄市",
- "start_offset": 3,
- "end_offset": 7,
- "type": "CN_WORD",
- "position": 3
- },
- {
- "token": "石家庄",
- "start_offset": 3,
- "end_offset": 6,
- "type": "CN_WORD",
- "position": 4
- },
- {
- "token": "家庄",
- "start_offset": 4,
- "end_offset": 6,
- "type": "CN_WORD",
- "position": 5
- },
- {
- "token": "市",
- "start_offset": 6,
- "end_offset": 7,
- "type": "CN_CHAR",
- "position": 6
- },
- {
- "token": "高新区",
- "start_offset": 7,
- "end_offset": 10,
- "type": "CN_WORD",
- "position": 7
- },
- {
- "token": "高新",
- "start_offset": 7,
- "end_offset": 9,
- "type": "CN_WORD",
- "position": 8
- },
- {
- "token": "新区",
- "start_offset": 8,
- "end_offset": 10,
- "type": "CN_WORD",
- "position": 9
- },
- {
- "token": "虚度",
- "start_offset": 10,
- "end_offset": 12,
- "type": "CN_WORD",
- "position": 10
- },
- {
- "token": "大道",
- "start_offset": 12,
- "end_offset": 14,
- "type": "CN_WORD",
- "position": 11
- }
- ]
- }
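The same test can be scripted. The following is a minimal sketch, assuming a node at http://127.0.0.1:9200 as in the rest of this article; the helper names build_analyze_request and token_spans are hypothetical. It builds the request without sending it and then reduces a response to (token, start, end) tuples, which makes the overlapping spans produced by ik_max_word easy to see.

```python
import json
import urllib.request


def build_analyze_request(text, analyzer="ik_max_word",
                          base_url="http://127.0.0.1:9200"):
    """Build (but do not send) a POST request to the _analyze API."""
    body = json.dumps({"analyzer": analyzer, "text": text},
                      ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/_analyze",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def token_spans(response):
    """Reduce an _analyze response to (token, start_offset, end_offset)."""
    return [(t["token"], t["start_offset"], t["end_offset"])
            for t in response["tokens"]]


# First three tokens copied from the response shown above.
sample_response = {"tokens": [
    {"token": "河北省", "start_offset": 0, "end_offset": 3,
     "type": "CN_WORD", "position": 0},
    {"token": "河北", "start_offset": 0, "end_offset": 2,
     "type": "CN_WORD", "position": 1},
    {"token": "省", "start_offset": 2, "end_offset": 3,
     "type": "CN_CHAR", "position": 2},
]}

request = build_analyze_request("河北省石家庄市高新区虚度大道")
print(request.get_method(), request.full_url)
print(token_spans(sample_response))
```

To run it against a live node, pass the request to urllib.request.urlopen and decode the body with json.loads.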
Elasticsearch exposes a RESTful API, so the operations below are plain HTTP requests using methods such as GET, POST, PUT, and DELETE.
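Create an index and specify the analyzer. The content field is indexed with ik_max_word (finest-grained segmentation) and searched with ik_smart (coarsest-grained segmentation).
Request URL: http://127.0.0.1:9200/test_analysis?pretty
- PUT /test_analysis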
- {
- "mappings": {
- "properties": {
- "content": {
- "type": "text",
- "analyzer": "ik_max_word",
- "search_analyzer": "ik_smart"
- }
- }
- }
- }
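For scripted setups, the index creation request above could be assembled like this. It is a sketch under stated assumptions: the function name build_create_index_request is hypothetical, the base URL is the local node used in this article, and the request is only built, not sent.

```python
import json
import urllib.request


def build_create_index_request(index, field,
                               base_url="http://127.0.0.1:9200"):
    """Build (but do not send) a PUT request that creates an index whose
    text field is indexed with ik_max_word and searched with ik_smart."""
    mapping = {
        "mappings": {
            "properties": {
                field: {
                    "type": "text",
                    "analyzer": "ik_max_word",     # used when indexing
                    "search_analyzer": "ik_smart"  # used for query text
                }
            }
        }
    }
    return urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


create_request = build_create_index_request("test_analysis", "content")
print(create_request.get_method(), create_request.full_url)
```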
View the index mapping:
curl 'http://127.0.0.1:9200/test_analysis/_mapping?pretty'
Delete the index:
curl -X DELETE 'http://127.0.0.1:9200/test_analysis?pretty'
Add a document (indexing a document requires POST, not GET):
- curl -X POST 'http://elastic:dsydnn@127.0.0.1:9200/test_analysis/_doc?pretty' \
- -H 'Content-Type: application/json' \
- -d '
- {
- "content" : "我是中国人"
- }'
Query. A term query is not analyzed, so it matches only when the exact term exists in the index:
- curl -X GET 'http://127.0.0.1:9200/test_analysis/_search?pretty' \
- -H 'Content-Type: application/json' \
- -d '{
- "query": {
- "bool": {
- "must": [
- {
- "term": {
- "content": "中"
- }
- }
- ]
- }
- }
- }'
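Because a term query skips analysis, it is often worth comparing it with a match query, which runs the query text through the field's search analyzer (ik_smart in the mapping above). The sketch below builds both bodies and a search request without sending it. The helper names are hypothetical, the base URL is the local node from this article, and POST is used here because the _search endpoint accepts a body with either GET or POST.

```python
import json
import urllib.request


def term_query(field, value):
    """Exact-term lookup: the value is NOT analyzed before matching."""
    return {"query": {"term": {field: value}}}


def match_query(field, text):
    """Full-text lookup: the text is analyzed with the search analyzer."""
    return {"query": {"match": {field: text}}}


def build_search_request(index, body, base_url="http://127.0.0.1:9200"):
    """Build (but do not send) a _search request carrying the given body."""
    return urllib.request.Request(
        f"{base_url}/{index}/_search",
        data=json.dumps(body, ensure_ascii=False).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


search_request = build_search_request(
    "test_analysis", match_query("content", "中国人"))
print(search_request.full_url)
print(json.dumps(term_query("content", "中"), ensure_ascii=False))
print(search_request.data.decode("utf-8"))
```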