
Elasticsearch in Production: IK Analyzer, Pinyin Analyzer, Autocomplete, and Spelling Correction


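Contents

I. IK Analyzer

1. Introduction to the IK analyzer

2. Installation

3. Usage

4. Custom dictionary

II. Pinyin Analyzer

1. Introduction to the pinyin analyzer

2. Installation

III. Autocomplete

1. Demo

2. Hands-on

IV. Spelling Correction

1. Scenario

2. DSL implementation

3. Java implementation

V. JD.com-Style Search in Practice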


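I. IK Analyzer

1. Introduction to the IK analyzer

By default, Elasticsearch treats every Chinese character as a separate token, which is rarely what a search application needs. Installing a Chinese analyzer solves this.

IK is a relatively simple Chinese analyzer developed in China. Its original developer has not maintained it since 2012, but it remains one of the more popular choices in engineering practice. This article walks through how to use it.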

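2. Installation

Download the IK analyzer (pick the release that matches your Elasticsearch version) from:
https://github.com/medcl/elasticsearch-analysis-ik/releases

Link: https://pan.baidu.com/s/1z49plwtgCzxprTibFviLlw
Extraction code: c8xg

(1) Unzip the archive and rename the extracted elasticsearch folder to ik.

(2) Copy the ik folder into the elasticsearch/plugins directory.

(3) Restart Elasticsearch; the IK analyzer is loaded on startup.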

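3. Usage

Let's look at the effect. Without the IK analyzer, every character is split out as its own token.

Postman test, coarsest-grained segmentation (ik_smart): http://127.0.0.1:9200/testindex/_analyze

Send a POST request with the body: {"analyzer": "ik_smart", "text": "北京天安门" }

Postman test, finest-grained segmentation (ik_max_word): http://127.0.0.1:9200/testindex/_analyze

Send a POST request with the body: {"analyzer": "ik_max_word", "text": "北京天安门" }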
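
The two Postman calls can also be issued from plain Java. The following is a minimal sketch, assuming a local node at http://127.0.0.1:9200 on which testindex exists; the class and method names (AnalyzeRequestSketch, jsonEscape, buildAnalyzeBody, analyze) are illustrative and not part of the original project. It uses only the JDK 11+ java.net.http client.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

final class AnalyzeRequestSketch {

    // Escapes backslashes and double quotes for use inside a JSON string.
    // Control characters are not handled in this sketch.
    static String jsonEscape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same body that the Postman examples send to _analyze.
    static String buildAnalyzeBody(String analyzer, String text) {
        return "{\"analyzer\":\"" + jsonEscape(analyzer)
                + "\",\"text\":\"" + jsonEscape(text) + "\"}";
    }

    // Sends the body to <baseUrl>/<index>/_analyze and returns the raw JSON response.
    static String analyze(String baseUrl, String index, String analyzer, String text)
            throws Exception {
        HttpRequest request = HttpRequest
                .newBuilder(URI.create(baseUrl + "/" + index + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(buildAnalyzeBody(analyzer, text)))
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) throws Exception {
        for (String analyzer : new String[] {"ik_smart", "ik_max_word"}) {
            System.out.println(buildAnalyzeBody(analyzer, "北京天安门"));
            // Uncomment once Elasticsearch is running locally with the IK plugin installed:
            // System.out.println(analyze("http://127.0.0.1:9200", "testindex", analyzer, "北京天安门"));
        }
    }
}
```

Comparing the two responses side by side makes the difference visible: ik_max_word returns more, overlapping tokens than ik_smart for the same text.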

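4. Custom dictionary

When the text contains a special term that is not in IK's dictionary, IK still splits it into individual characters. A custom dictionary fixes this.

For example:

http://127.0.0.1:9200/testindex/_analyze

{"analyzer": "ik_max_word", "text": "你个老6" }

(1) Go to the elasticsearch/plugins/ik/config directory.

(2) Create a file named my.dic containing the term: 你个老6

(3) Edit IKAnalyzer.cfg.xml (in the ik/config directory) so that its ext_dict entry points to my.dic.

(4) Restart Elasticsearch and run the analysis again to check the result.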
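
IKAnalyzer.cfg.xml uses the Java properties XML format, so the dictionary file and the ext_dict entry can be sketched with the JDK alone. This is a hedged illustration, not the plugin's own tooling: the class IkCustomDictSketch and its methods are hypothetical, and a temporary directory stands in for elasticsearch/plugins/ik/config. In a real installation you would normally edit the existing ext_dict entry by hand rather than regenerate the file, since the shipped file may contain other entries.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class IkCustomDictSketch {

    // Renders a properties-format XML document that registers the dictionary file.
    static String buildConfigXml(String dictFileName) throws Exception {
        Properties props = new Properties();
        props.setProperty("ext_dict", dictFileName);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        props.storeToXML(out, "IK Analyzer extension configuration", "UTF-8");
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    // Writes one term per line, UTF-8 encoded, into configDir/dictFileName.
    static Path writeDictionary(Path configDir, String dictFileName, String... terms)
            throws Exception {
        Path dict = configDir.resolve(dictFileName);
        String content = String.join("\n", terms) + "\n";
        Files.write(dict, content.getBytes(StandardCharsets.UTF_8));
        return dict;
    }

    public static void main(String[] args) throws Exception {
        // A temporary directory stands in for elasticsearch/plugins/ik/config.
        Path configDir = Files.createTempDirectory("ik-config");
        Path dict = writeDictionary(configDir, "my.dic", "你个老6");
        System.out.println("Dictionary written to " + dict);
        System.out.println(buildConfigXml("my.dic"));
    }
}
```

The line to look for in the printed XML is the entry whose key is ext_dict and whose value is my.dic. The dictionary file itself should be saved as UTF-8, one term per line.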

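II. Pinyin Analyzer

1. Introduction to the pinyin analyzer

The pinyin analyzer lets users type pinyin and still find the matching Chinese keywords. For example, typing shuihu into a shop's search box matches 水壶 (kettle), which makes for a much better search experience.

2. Installation

Download the pinyin analyzer from:

Releases · medcl/elasticsearch-analysis-pinyin · GitHub

Download the release that matches your Elasticsearch version from GitHub, unzip it into the plugins directory, rename the folder to pinyin, and restart Elasticsearch for it to take effect.

Test

Create the index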

    # Create the index
    PUT /medcl/
    {
      "settings": {
        "analysis": {
          "analyzer": {
            "pinyin_analyzer": {
              "tokenizer": "my_pinyin"
            }
          },
          "tokenizer": {
            "my_pinyin": {
              "type": "pinyin",
              "keep_separate_first_letter": false,
              "keep_full_pinyin": true,
              "keep_original": true,
              "limit_first_letter_length": 16,
              "lowercase": true,
              "remove_duplicated_term": true
            }
          }
        }
      }
    }

    # Check the analysis result
    GET /medcl/_analyze
    {
      "text": ["汤兵兵"],
      "analyzer": "pinyin_analyzer"
    }

    # Response
    {
      "tokens": [
        { "token": "tang", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0 },
        { "token": "汤兵兵", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0 },
        { "token": "tbb", "start_offset": 0, "end_offset": 0, "type": "word", "position": 0 },
        { "token": "bing", "start_offset": 0, "end_offset": 0, "type": "word", "position": 1 }
      ]
    }

Create the mapping

    # Create the mapping
    POST /medcl/_mapping
    {
      "properties": {
        "name": {
          "type": "keyword",
          "fields": {
            "pinyin": {
              "type": "text",
              "store": false,
              "term_vector": "with_offsets",
              "analyzer": "pinyin_analyzer",
              "boost": 10
            }
          }
        }
      }
    }

Bulk-index the data

    # Bulk-index the data
    POST /medcl/_bulk
    {"index": {"_index": "medcl","_id": "1"}}
    {"name":"汤兵兵"}
    {"index": {"_index": "medcl","_id": "2"}}
    {"name":"阚阳阳"}
    {"index": {"_index": "medcl","_id": "3"}}
    {"name":"汤一辰"}
    {"index": {"_index": "medcl","_id": "4"}}
    {"name":"汤得明"}
    {"index": {"_index": "medcl","_id": "5"}}
    {"name":"张继琴"}
    {"index": {"_index": "medcl","_id": "6"}}
    {"name":"阚佳武"}
    {"index": {"_index": "medcl","_id": "7"}}
    {"name":"施玉芬"}
    {"index": {"_index": "medcl","_id": "8"}}
    {"name":"陆毅"}
    {"index": {"_index": "medcl","_id": "9"}}
    {"name":"刘德华"}

Test the pinyin search

    # Search by pinyin
    GET /medcl/_search
    {
      "query": {
        "match": {
          "name.pinyin": "tbb"
        }
      }
    }

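Optional parameters of the pinyin analyzer

    keep_first_letter: when enabled, e.g. 刘德华 > ldh. Default: true.
    keep_separate_first_letter: when enabled, the first letters are kept as separate terms, e.g. 刘德华 > l,d,h. Default: false. Note: query results may be too fuzzy because these single-letter terms are very frequent.
    limit_first_letter_length: maximum length of the first-letter result. Default: 16.
    keep_full_pinyin: when enabled, e.g. 刘德华 > [liu,de,hua]. Default: true.
    keep_joined_full_pinyin: when enabled, e.g. 刘德华 > [liudehua]. Default: false.
    keep_none_chinese: keep non-Chinese letters and digits in the result. Default: true.
    keep_none_chinese_together: keep non-Chinese letters together. Default: true, e.g. DJ音乐家 > DJ,yin,yue,jia; when set to false, DJ音乐家 > D,J,yin,yue,jia. Note: keep_none_chinese must be enabled first.
    keep_none_chinese_in_first_letter: keep non-Chinese letters in the first-letter term, e.g. 刘德华AT2016 > ldhat2016. Default: true.
    keep_none_chinese_in_joined_full_pinyin: keep non-Chinese letters in the joined full pinyin, e.g. 刘德华2016 > liudehua2016. Default: false.
    none_chinese_pinyin_tokenize: break non-Chinese letters into separate pinyin terms if they form valid pinyin. Default: true, e.g. liudehuaalibaba13zhuanghan > liu,de,hua,a,li,ba,ba,13,zhuang,han. Note: keep_none_chinese and keep_none_chinese_together must be enabled first.
    keep_original: when enabled, the original input is kept as well. Default: false.
    lowercase: lowercase non-Chinese letters. Default: true.
    trim_whitespace: Default: true.
    remove_duplicated_term: when enabled, duplicated terms are removed to save index space, e.g. de的 > de. Default: false. Note: position-related queries may be affected.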
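
To make the three most commonly used options concrete, here is a toy Java sketch of what keep_full_pinyin, keep_joined_full_pinyin, and keep_first_letter produce for 刘德华. It is not how the plugin works internally: the readings table is hand-written for these three characters only, and the class PinyinOptionsSketch and its methods are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class PinyinOptionsSketch {

    // Hand-written readings for the three characters of the example name only.
    static final Map<Character, String> READINGS =
            Map.of('刘', "liu", '德', "de", '华', "hua");

    // Mirrors keep_full_pinyin: one pinyin term per character.
    static List<String> fullPinyin(String text) {
        List<String> terms = new ArrayList<>();
        for (char c : text.toCharArray()) {
            terms.add(READINGS.get(c));
        }
        return terms;
    }

    // Mirrors keep_joined_full_pinyin: all readings concatenated into one term.
    static String joinedFullPinyin(String text) {
        return String.join("", fullPinyin(text));
    }

    // Mirrors keep_first_letter, truncated to limit_first_letter_length.
    static String firstLetters(String text, int limit) {
        StringBuilder letters = new StringBuilder();
        for (String reading : fullPinyin(text)) {
            letters.append(reading.charAt(0));
        }
        return letters.length() > limit ? letters.substring(0, limit) : letters.toString();
    }

    public static void main(String[] args) {
        System.out.println(fullPinyin("刘德华"));        // [liu, de, hua]
        System.out.println(joinedFullPinyin("刘德华"));  // liudehua
        System.out.println(firstLetters("刘德华", 16));  // ldh
    }
}
```

Which of these terms end up in the index decides which inputs can match: a user typing ldh only finds 刘德华 if the first-letter term was indexed.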

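III. Autocomplete

1. Demo

The demo covers three kinds of completion: by Chinese characters, by full pinyin, and by pinyin initials.

2. Hands-on

(1) Prepare the data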

    # Create the index
    PUT /product_completion_index/
    {
      "settings": {
        "number_of_shards": 1,
        "number_of_replicas": 2,
        "analysis": {
          "analyzer": {
            "ik_pinyin_analyzer": {
              "type": "custom",
              "tokenizer": "ik_smart",
              "filter": ["pinyin_filter"]
            }
          },
          "filter": {
            "pinyin_filter": {
              "type": "pinyin",
              "keep_first_letter": true,
              "keep_separate_first_letter": false,
              "keep_full_pinyin": true,
              "keep_original": true,
              "limit_first_letter_length": 16,
              "lowercase": true,
              "remove_duplicated_term": true
            }
          }
        }
      }
    }

    # Create the mapping
    POST /product_completion_index/_mapping
    {
      "properties": {
        "name": {
          "type": "keyword"
        },
        "searchkey": {
          "type": "completion",
          "analyzer": "ik_pinyin_analyzer"
        }
      }
    }

    # Bulk-index the data
    POST /product_completion_index/_bulk
    {"index":{"_index":"product_completion_index","_id":"1"}}
    {"name":"小米(MI)","searchkey":"小米手机"}
    {"index":{"_index":"product_completion_index","_id":"2"}}
    {"searchkey":"小米10","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"3"}}
    {"searchkey":"小米电视","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"4"}}
    {"searchkey":"小米路由器","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"5"}}
    {"searchkey":"小米9","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"6"}}
    {"searchkey":"小米手机","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"7"}}
    {"searchkey":"小米耳环","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"8"}}
    {"searchkey":"小米8","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"9"}}
    {"searchkey":"小米10Pro","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"10"}}
    {"searchkey":"小米笔记本","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"11"}}
    {"searchkey":"小米摄像头","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"12"}}
    {"searchkey":"小米电饭煲","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"13"}}
    {"searchkey":"小米充电宝","name":"小米(MI)"}
    {"index":{"_index":"product_completion_index","_id":"14"}}
    {"searchkey":"adidas男鞋","name":"adidas男鞋"}
    {"index":{"_index":"product_completion_index","_id":"15"}}
    {"searchkey":"adidas女鞋","name":"adidas女鞋"}
    {"index":{"_index":"product_completion_index","_id":"16"}}
    {"searchkey":"adidas外套","name":"adidas外套"}
    {"index":{"_index":"product_completion_index","_id":"17"}}
    {"searchkey":"adidas裤子","name":"adidas裤子"}
    {"index":{"_index":"product_completion_index","_id":"18"}}
    {"searchkey":"adidas官方旗舰店","name":"adidas官方旗舰店"}
    {"index":{"_index":"product_completion_index","_id":"19"}}
    {"searchkey":"阿迪达斯袜子","name":"阿迪达斯袜子"}
    {"index":{"_index":"product_completion_index","_id":"20"}}
    {"searchkey":"阿迪达斯外套","name":"阿迪达斯外套"}
    {"index":{"_index":"product_completion_index","_id":"21"}}
    {"searchkey":"阿迪达斯运动鞋","name":"阿迪达斯运动鞋"}
    {"index":{"_index":"product_completion_index","_id":"22"}}
    {"searchkey":"耐克外套","name":"耐克外套"}
    {"index":{"_index":"product_completion_index","_id":"23"}}
    {"searchkey":"耐克运动鞋","name":"耐克运动鞋"}
(2) Test

    # Completion by Chinese characters
    GET product_completion_index/_search
    {
      "from": 0,
      "size": 100,
      "suggest": {
        "czbk-suggest": {
          "prefix": "小米",
          "completion": {
            "field": "searchkey",
            "size": 20,
            "skip_duplicates": true
          }
        }
      }
    }

    # Result
    {
      "took": 2,
      "timed_out": false,
      "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
      "hits": {
        "total": { "value": 0, "relation": "eq" },
        "max_score": null,
        "hits": []
      },
      "suggest": {
        "czbk-suggest": [
          {
            "text": "xm",
            "offset": 0,
            "length": 2,
            "options": [
              { "text": "小米10", "_index": "product_completion_index", "_type": "_doc", "_id": "2", "_score": 1.0, "_source": { "searchkey": "小米10", "name": "小米(MI)" } },
              { "text": "小米10Pro", "_index": "product_completion_index", "_type": "_doc", "_id": "9", "_score": 1.0, "_source": { "searchkey": "小米10Pro", "name": "小米(MI)" } },
              { "text": "小米8", "_index": "product_completion_index", "_type": "_doc", "_id": "8", "_score": 1.0, "_source": { "searchkey": "小米8", "name": "小米(MI)" } },
              { "text": "小米9", "_index": "product_completion_index", "_type": "_doc", "_id": "5", "_score": 1.0, "_source": { "searchkey": "小米9", "name": "小米(MI)" } },
              { "text": "小米充电宝", "_index": "product_completion_index", "_type": "_doc", "_id": "13", "_score": 1.0, "_source": { "searchkey": "小米充电宝", "name": "小米(MI)" } },
              { "text": "小米手机", "_index": "product_completion_index", "_type": "_doc", "_id": "1", "_score": 1.0, "_source": { "name": "小米(MI)", "searchkey": "小米手机" } },
              { "text": "小米摄像头", "_index": "product_completion_index", "_type": "_doc", "_id": "11", "_score": 1.0, "_source": { "searchkey": "小米摄像头", "name": "小米(MI)" } },
              { "text": "小米电视", "_index": "product_completion_index", "_type": "_doc", "_id": "3", "_score": 1.0, "_source": { "searchkey": "小米电视", "name": "小米(MI)" } },
              { "text": "小米电饭煲", "_index": "product_completion_index", "_type": "_doc", "_id": "12", "_score": 1.0, "_source": { "searchkey": "小米电饭煲", "name": "小米(MI)" } },
              { "text": "小米笔记本", "_index": "product_completion_index", "_type": "_doc", "_id": "10", "_score": 1.0, "_source": { "searchkey": "小米笔记本", "name": "小米(MI)" } },
              { "text": "小米耳环", "_index": "product_completion_index", "_type": "_doc", "_id": "7", "_score": 1.0, "_source": { "searchkey": "小米耳环", "name": "小米(MI)" } },
              { "text": "小米路由器", "_index": "product_completion_index", "_type": "_doc", "_id": "4", "_score": 1.0, "_source": { "searchkey": "小米路由器", "name": "小米(MI)" } }
            ]
          }
        ]
      }
    }

    # Completion by full pinyin
    GET product_completion_index/_search
    {
      "from": 0,
      "size": 100,
      "suggest": {
        "czbk-suggest": {
          "prefix": "xiaomi",
          "completion": {
            "field": "searchkey",
            "size": 20,
            "skip_duplicates": true
          }
        }
      }
    }

    # Result
    {
      "took": 1,
      "timed_out": false,
      "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
      "hits": {
        "total": { "value": 0, "relation": "eq" },
        "max_score": null,
        "hits": []
      },
      "suggest": {
        "czbk-suggest": [
          {
            "text": "xiaomi",
            "offset": 0,
            "length": 6,
            "options": [
              { "text": "小米10", "_index": "product_completion_index", "_type": "_doc", "_id": "2", "_score": 1.0, "_source": { "searchkey": "小米10", "name": "小米(MI)" } },
              { "text": "小米10Pro", "_index": "product_completion_index", "_type": "_doc", "_id": "9", "_score": 1.0, "_source": { "searchkey": "小米10Pro", "name": "小米(MI)" } },
              { "text": "小米8", "_index": "product_completion_index", "_type": "_doc", "_id": "8", "_score": 1.0, "_source": { "searchkey": "小米8", "name": "小米(MI)" } },
              { "text": "小米9", "_index": "product_completion_index", "_type": "_doc", "_id": "5", "_score": 1.0, "_source": { "searchkey": "小米9", "name": "小米(MI)" } },
              { "text": "小米充电宝", "_index": "product_completion_index", "_type": "_doc", "_id": "13", "_score": 1.0, "_source": { "searchkey": "小米充电宝", "name": "小米(MI)" } },
              { "text": "小米手机", "_index": "product_completion_index", "_type": "_doc", "_id": "1", "_score": 1.0, "_source": { "name": "小米(MI)", "searchkey": "小米手机" } },
              { "text": "小米摄像头", "_index": "product_completion_index", "_type": "_doc", "_id": "11", "_score": 1.0, "_source": { "searchkey": "小米摄像头", "name": "小米(MI)" } },
              { "text": "小米电视", "_index": "product_completion_index", "_type": "_doc", "_id": "3", "_score": 1.0, "_source": { "searchkey": "小米电视", "name": "小米(MI)" } },
              { "text": "小米电饭煲", "_index": "product_completion_index", "_type": "_doc", "_id": "12", "_score": 1.0, "_source": { "searchkey": "小米电饭煲", "name": "小米(MI)" } },
              { "text": "小米笔记本", "_index": "product_completion_index", "_type": "_doc", "_id": "10", "_score": 1.0, "_source": { "searchkey": "小米笔记本", "name": "小米(MI)" } },
              { "text": "小米耳环", "_index": "product_completion_index", "_type": "_doc", "_id": "7", "_score": 1.0, "_source": { "searchkey": "小米耳环", "name": "小米(MI)" } },
              { "text": "小米路由器", "_index": "product_completion_index", "_type": "_doc", "_id": "4", "_score": 1.0, "_source": { "searchkey": "小米路由器", "name": "小米(MI)" } }
            ]
          }
        ]
      }
    }

    # Completion by pinyin initials
    GET product_completion_index/_search
    {
      "from": 0,
      "size": 100,
      "suggest": {
        "czbk-suggest": {
          "prefix": "xm",
          "completion": {
            "field": "searchkey",
            "size": 20,
            "skip_duplicates": true
          }
        }
      }
    }

    # Result
    {
      "took": 1,
      "timed_out": false,
      "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
      "hits": {
        "total": { "value": 0, "relation": "eq" },
        "max_score": null,
        "hits": []
      },
      "suggest": {
        "czbk-suggest": [
          {
            "text": "xm",
            "offset": 0,
            "length": 2,
            "options": [
              { "text": "小米10", "_index": "product_completion_index", "_type": "_doc", "_id": "2", "_score": 1.0, "_source": { "searchkey": "小米10", "name": "小米(MI)" } },
              { "text": "小米10Pro", "_index": "product_completion_index", "_type": "_doc", "_id": "9", "_score": 1.0, "_source": { "searchkey": "小米10Pro", "name": "小米(MI)" } },
              { "text": "小米8", "_index": "product_completion_index", "_type": "_doc", "_id": "8", "_score": 1.0, "_source": { "searchkey": "小米8", "name": "小米(MI)" } },
              { "text": "小米9", "_index": "product_completion_index", "_type": "_doc", "_id": "5", "_score": 1.0, "_source": { "searchkey": "小米9", "name": "小米(MI)" } },
              { "text": "小米充电宝", "_index": "product_completion_index", "_type": "_doc", "_id": "13", "_score": 1.0, "_source": { "searchkey": "小米充电宝", "name": "小米(MI)" } },
              { "text": "小米手机", "_index": "product_completion_index", "_type": "_doc", "_id": "1", "_score": 1.0, "_source": { "name": "小米(MI)", "searchkey": "小米手机" } },
              { "text": "小米摄像头", "_index": "product_completion_index", "_type": "_doc", "_id": "11", "_score": 1.0, "_source": { "searchkey": "小米摄像头", "name": "小米(MI)" } },
              { "text": "小米电视", "_index": "product_completion_index", "_type": "_doc", "_id": "3", "_score": 1.0, "_source": { "searchkey": "小米电视", "name": "小米(MI)" } },
              { "text": "小米电饭煲", "_index": "product_completion_index", "_type": "_doc", "_id": "12", "_score": 1.0, "_source": { "searchkey": "小米电饭煲", "name": "小米(MI)" } },
              { "text": "小米笔记本", "_index": "product_completion_index", "_type": "_doc", "_id": "10", "_score": 1.0, "_source": { "searchkey": "小米笔记本", "name": "小米(MI)" } },
              { "text": "小米耳环", "_index": "product_completion_index", "_type": "_doc", "_id": "7", "_score": 1.0, "_source": { "searchkey": "小米耳环", "name": "小米(MI)" } },
              { "text": "小米路由器", "_index": "product_completion_index", "_type": "_doc", "_id": "4", "_score": 1.0, "_source": { "searchkey": "小米路由器", "name": "小米(MI)" } }
            ]
          }
        ]
      }
    }
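
The three queries differ only in their prefix: because the searchkey field is analyzed with ik_pinyin_analyzer, the original term, its full pinyin, and its initials are all indexed, so one completion field serves all three inputs. A small Java sketch of a request-body builder makes that explicit. CompletionSuggestBodySketch and its methods are hypothetical names; the from and size parameters of the console examples are left out because only the suggest part matters here.

```java
final class CompletionSuggestBodySketch {

    // Escapes backslashes and double quotes for use inside a JSON string.
    // Control characters are not handled in this sketch.
    static String jsonEscape(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the suggest part of the _search body used in the console examples.
    static String buildBody(String suggestName, String field, String prefix, int size) {
        return "{\"suggest\":{\"" + jsonEscape(suggestName) + "\":{"
                + "\"prefix\":\"" + jsonEscape(prefix) + "\","
                + "\"completion\":{"
                + "\"field\":\"" + jsonEscape(field) + "\","
                + "\"size\":" + size + ","
                + "\"skip_duplicates\":true}}}}";
    }

    public static void main(String[] args) {
        // Chinese characters, full pinyin, and pinyin initials all target the same field.
        for (String prefix : new String[] {"小米", "xiaomi", "xm"}) {
            System.out.println(buildBody("czbk-suggest", "searchkey", prefix, 20));
        }
    }
}
```

Each printed body can be posted to product_completion_index/_search as is.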

(3) Code implementation

Response wrapper

    package com.tangbb.elasticsearch.pojo;

    import com.fasterxml.jackson.annotation.JsonInclude;

    import java.io.Serializable;

    /**
     * @Class: ResponseData
     * @Package com.tangbb.elasticsearch.pojo
     * @Description: Wrapper class for response data
     * @Company: http://www.itheima.com/
     */
    // Fields whose value is null are left out of the serialized JSON
    @JsonInclude(JsonInclude.Include.NON_NULL)
    public class ResponseData<T> implements Serializable {
        // Response code
        private String code;
        // Response message
        private String desc;
        // Response payload
        private T data;
        // Total number of returned items
        private Integer count;

        public Integer getCount() {
            return count;
        }

        public void setCount(Integer count) {
            this.count = count;
        }

        public String getCode() {
            return code;
        }

        public void setCode(String code) {
            this.code = code;
        }

        public String getDesc() {
            return desc;
        }

        public T getData() {
            return data;
        }

        public ResponseData(T data, ResultEnum resultEnum) {
            this.code = resultEnum.getCode();
            this.desc = resultEnum.getDecs();
            this.data = data;
        }

        public ResponseData(ResultEnum resultEnum) {
            this.code = resultEnum.getCode();
            this.desc = resultEnum.getDecs();
        }

        public ResponseData(String code, String desc) {
            this.code = code;
            this.desc = desc;
        }

        public ResponseData<T> setResultEnum(ResultEnum result) {
            this.code = result.getCode();
            this.desc = result.getDecs();
            return this;
        }

        public ResponseData<T> setResultEnum(T data, ResultEnum resultEnum, Integer count) {
            this.code = resultEnum.getCode();
            this.desc = resultEnum.getDecs();
            this.data = data;
            this.count = count;
            return this;
        }

        public ResponseData(T data, ResultEnum resultEnum, Integer count) {
            this.code = resultEnum.getCode();
            this.desc = resultEnum.getDecs();
            this.data = data;
            this.count = count;
        }

        public ResponseData() {
        }

        public ResponseData<T> setResultEnum(String code, String desc) {
            this.code = code;
            this.desc = desc;
            return this;
        }
    }

Result enum

    package com.tangbb.elasticsearch.pojo;

    /**
     * @Class: ResultEnum
     * @Package com.tangbb.elasticsearch.pojo
     * @Description: Enum of operation result codes and messages
     * @Company: http://www.itheima.com/
     */
    public enum ResultEnum {
        success("200", "Operation succeeded!"),
        param_isnull("-400", "Parameter is empty"),
        error("-402", "Operation failed!"),
        server_error("-500", "Service error"),
        data_existent("-504", "Data does not exist"),
        result_empty("-000", "Query returned no content"),
        NOT_SYSTEM_API("404", "Not a designated system API"),
        REPEAT("666", "Data already exists"),
        HTTP_ERROR("-405", "Request error");

        private final String code;
        private final String decs;

        public String getCode() {
            return code;
        }

        public String getDecs() {
            return decs;
        }

        ResultEnum(String code, String decs) {
            this.code = code;
            this.decs = decs;
        }
    }

Log message enum

    package com.tangbb.elasticsearch.pojo;

    /**
     * @Class: TipsEnum
     * @Package com.tangbb.elasticsearch.pojo
     * @Description: Application-level operation messages
     * @Company: http://www.itheima.com/
     */
    public enum TipsEnum {
        create_index_success("Index created successfully!"),
        create_index_fail("Failed to create index!"),
        delete_index_success("Index deleted successfully!"),
        delete_index_fail("Failed to delete index!"),
        open_index_success("Index opened successfully!"),
        open_index_fail("Failed to open index!"),
        close_index_success("Index closed successfully!"),
        close_index_fail("Failed to close index!"),
        alias_index_success("Index alias set successfully!"),
        alias_index_fail("Failed to set index alias!"),
        exists_index_success("Index existence check succeeded!"),
        exists_index_fail("Index existence check failed!"),
        create_doc_success("Document created successfully!"),
        create_doc_fail("Failed to create document!"),
        batch_create_doc_success("Documents created in bulk successfully!"),
        batch_create_doc_fail("Failed to create documents in bulk!"),
        update_doc_success("Document updated successfully!"),
        update_doc_fail("Failed to update document!"),
        get_doc_success("Document retrieved successfully!"),
        batch_get_doc_fail("Failed to retrieve documents in bulk!"),
        batch_get_doc_success("Documents retrieved in bulk successfully!"),
        get_doc_fail("Failed to retrieve document!"),
        delete_doc_success("Document deleted successfully!"),
        delete_doc_fail("Failed to delete document!"),
        csuggest_get_doc_fail("Failed to get autocomplete suggestions!"),
        csuggest_get_doc_success("Autocomplete suggestions retrieved successfully!"),
        psuggest_get_doc_fail("Failed to get spelling correction!"),
        psuggest_get_doc_success("Spelling correction retrieved successfully!"),
        tsuggest_get_doc_fail("Failed to get search recommendations!"),
        tsuggest_get_doc_success("Search recommendations retrieved successfully!"),
        hotwords_get_doc_fail("Failed to get hot search terms!"),
        hotwords_get_doc_success("Hot search terms retrieved successfully!"),
        metricagg_get_doc_fail("Metric aggregation failed!"),
        metricagg_get_doc_success("Metric aggregation succeeded!"),
        bucketagg_get_doc_fail("Bucket aggregation failed!"),
        bucketagg_get_doc_success("Bucket aggregation succeeded!"),
        index_default("Index creation failed!");

        private final String message;

        public String getMessage() {
            return message;
        }

        TipsEnum(String message) {
            this.message = message;
        }
    }

Entity class

    package com.tangbb.elasticsearch.pojo;

    import com.fasterxml.jackson.annotation.JsonInclude;

    import java.io.Serializable;
    import java.util.List;
    import java.util.Map;

    /**
     * @Class: CommonEntity
     * @Package com.tangbb.elasticsearch.pojo
     * @Description: Common request entity
     * @Company: http://www.itheima.com/
     */
    // Fields whose value is null are left out of the serialized JSON
    @JsonInclude(JsonInclude.Include.NON_NULL)
    public class CommonEntity implements Serializable {
        // Page number
        private int pageNumber;
        // Number of items per page
        private int pageSize;
        // Index name
        private String indexName;
        // Field to highlight
        private String highlight;
        // Sort order: DESC or ASC
        private String sortOrder;
        // Field to sort by
        private String sortField;
        // Field used for suggestions
        private String suggestFileld;
        // Input value to get suggestions for
        private String suggestValue;
        // Number of suggestions to return
        private Integer suggestCount;
        // Dynamic query parameters
        private Map<String, Object> map;
        // Documents for bulk indexing
        private List<Map<String, Object>> list;

        public int getPageNumber() {
            return pageNumber;
        }

        public void setPageNumber(int pageNumber) {
            this.pageNumber = pageNumber;
        }

        public int getPageSize() {
            return pageSize;
        }

        public void setPageSize(int pageSize) {
            this.pageSize = pageSize;
        }

        public String getIndexName() {
            return indexName;
        }

        public void setIndexName(String indexName) {
            this.indexName = indexName;
        }

        public String getHighlight() {
            return highlight;
        }

        public void setHighlight(String highlight) {
            this.highlight = highlight;
        }

        public String getSortOrder() {
            return sortOrder;
        }

        public void setSortOrder(String sortOrder) {
            this.sortOrder = sortOrder;
        }

        public String getSortField() {
            return sortField;
        }

        public void setSortField(String sortField) {
            this.sortField = sortField;
        }

        public String getSuggestFileld() {
            return suggestFileld;
        }

        public void setSuggestFileld(String suggestFileld) {
            this.suggestFileld = suggestFileld;
        }

        public String getSuggestValue() {
            return suggestValue;
        }

        public void setSuggestValue(String suggestValue) {
            this.suggestValue = suggestValue;
        }

        public Integer getSuggestCount() {
            return suggestCount;
        }

        public void setSuggestCount(Integer suggestCount) {
            this.suggestCount = suggestCount;
        }

        public Map<String, Object> getMap() {
            return map;
        }

        public void setMap(Map<String, Object> map) {
            this.map = map;
        }

        public List<Map<String, Object>> getList() {
            return list;
        }

        public void setList(List<Map<String, Object>> list) {
            this.list = list;
        }
    }

Controller layer

    /*
     * @Description: Autocomplete
     * @Method: cSuggest
     * @Param: [commonEntity]
     * @Update:
     * @since: 1.0.0
     * @Return: com.tangbb.elasticsearch.pojo.ResponseData
     *
     */
    // Handler method of a Spring controller class; contentService and logger are fields of that class.
    @GetMapping(value = "/csuggest")
    public ResponseData cSuggest(@RequestBody CommonEntity commonEntity) {
        // Build the response object
        ResponseData rData = new ResponseData();
        // Reject requests that lack an index name, a suggest field, or a suggest value
        if (StringUtils.isEmpty(commonEntity.getIndexName()) ||
                StringUtils.isEmpty(commonEntity.getSuggestFileld()) ||
                StringUtils.isEmpty(commonEntity.getSuggestValue())) {
            rData.setResultEnum(ResultEnum.param_isnull);
            return rData;
        }
        // Completion suggestions returned by the service layer
        List<String> result = null;
        try {
            // Ask the service layer for completion suggestions
            result = contentService.cSuggest(commonEntity);
            // Wrap the suggestions, the success status, and the number of suggestions
            rData.setResultEnum(result, ResultEnum.success, result.size());
            // Log success
            logger.info(TipsEnum.csuggest_get_doc_success.getMessage());
        } catch (Exception e) {
            // Log the failure
            logger.error(TipsEnum.csuggest_get_doc_fail.getMessage(), e);
            // Build the error response
            rData.setResultEnum(ResultEnum.error);
        }
        return rData;
    }

Service layer

    // Elasticsearch imports used by this method (place them at the top of the enclosing service class)
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.sort.ScoreSortBuilder;
    import org.elasticsearch.search.sort.SortOrder;
    import org.elasticsearch.search.suggest.SuggestBuilder;
    import org.elasticsearch.search.suggest.completion.CompletionSuggestion;
    import org.elasticsearch.search.suggest.completion.CompletionSuggestionBuilder;

    /*
     * @Description: Autocomplete; suggests likely words or phrases based on the user's input
     * @Method: cSuggest
     * @Param: [commonEntity]
     * @Update:
     * @since: 1.0.0
     * @Return: java.util.List<java.lang.String>
     *
     */
    public List<String> cSuggest(CommonEntity commonEntity) throws Exception {
        // Suggestions to return
        List<String> suggestList = new ArrayList<>();
        // Build the search request for the target index
        SearchRequest searchRequest = new SearchRequest(commonEntity.getIndexName());
        // Sort by score, descending
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.sort(new ScoreSortBuilder().order(SortOrder.DESC));
        // Build the completion suggester on the configured field
        CompletionSuggestionBuilder completionSuggestionBuilder =
                new CompletionSuggestionBuilder(commonEntity.getSuggestFileld());
        // Prefix typed by the user
        completionSuggestionBuilder.prefix(commonEntity.getSuggestValue());
        // Drop duplicate suggestions
        completionSuggestionBuilder.skipDuplicates(true);
        // Number of suggestions; when not supplied, the Elasticsearch default applies
        if (commonEntity.getSuggestCount() != null) {
            completionSuggestionBuilder.size(commonEntity.getSuggestCount());
        }
        // "czbk-suggest" is the key under which all suggestions are returned; it can be hard-coded
        searchSourceBuilder.suggest(new SuggestBuilder().addSuggestion("czbk-suggest", completionSuggestionBuilder));
        searchRequest.source(searchSourceBuilder);
        // Execute the search
        SearchResponse suggestResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // Read the completion suggestion from the response
        CompletionSuggestion completionSuggestion =
                suggestResponse.getSuggest().getSuggestion("czbk-suggest");
        List<CompletionSuggestion.Entry.Option> optionsList =
                completionSuggestion.getEntries().get(0).getOptions();
        // Collect the suggestion texts
        if (!CollectionUtils.isEmpty(optionsList)) {
            optionsList.forEach(item -> suggestList.add(item.getText().toString()));
        }
        return suggestList;
    }

Test endpoint

http://localhost:9090/csuggest

IV. Spelling Correction

1. Scenario

For example, the misspelled input "adidaas官方旗舰店" should be corrected to "adidas官方旗舰店".

2. DSL implementation

    # Spelling correction
    GET product_completion_index/_search
    {
      "suggest": {
        "czbk-suggestion": {
          "text": "adidaas官方旗舰店",
          "phrase": {
            "field": "name",
            "size": 13
          }
        }
      }
    }

3. Java implementation

Controller layer

    /*
     * @Description: Spelling correction
     * @Method: pSuggest
     * @Param: [commonEntity]
     * @Update:
     * @since: 1.0.0
     * @Return: com.tangbb.elasticsearch.pojo.ResponseData
     *
     */
    // Handler method of a Spring controller class; contentService and logger are fields of that class.
    @GetMapping(value = "/psuggest")
    public ResponseData pSuggest(@RequestBody CommonEntity commonEntity) {
        // Build the response object
        ResponseData rData = new ResponseData();
        // Reject requests that lack an index name, a suggest field, or a suggest value
        if (StringUtils.isEmpty(commonEntity.getIndexName()) ||
                StringUtils.isEmpty(commonEntity.getSuggestFileld()) ||
                StringUtils.isEmpty(commonEntity.getSuggestValue())) {
            rData.setResultEnum(ResultEnum.param_isnull);
            return rData;
        }
        // Corrected text returned by the service layer
        String result = null;
        try {
            // Ask the service layer for the corrected text
            result = contentService.pSuggest(commonEntity);
            // Wrap the corrected text and the success status (no count for a single value)
            rData.setResultEnum(result, ResultEnum.success, null);
            // Log success
            logger.info(TipsEnum.psuggest_get_doc_success.getMessage());
        } catch (Exception e) {
            // Log the failure
            logger.error(TipsEnum.psuggest_get_doc_fail.getMessage(), e);
            // Build the error response
            rData.setResultEnum(ResultEnum.error);
        }
        return rData;
    }

Service layer

    // Elasticsearch imports used by this method (place them at the top of the enclosing service class)
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.sort.ScoreSortBuilder;
    import org.elasticsearch.search.sort.SortOrder;
    import org.elasticsearch.search.suggest.SuggestBuilder;
    import org.elasticsearch.search.suggest.phrase.PhraseSuggestion;
    import org.elasticsearch.search.suggest.phrase.PhraseSuggestionBuilder;

    /*
     * @Description: Spelling correction
     * @Method: pSuggest
     * @Param: [commonEntity]
     * @Update:
     * @since: 1.0.0
     * @Return: java.lang.String
     *
     */
    public String pSuggest(CommonEntity commonEntity) throws Exception {
        // Corrected text to return; stays empty when there is no suggestion
        String pSuggestString = "";
        // Build the search request for the target index
        SearchRequest searchRequest = new SearchRequest(commonEntity.getIndexName());
        // Build the search source
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
        // Sort by score, descending
        searchSourceBuilder.sort(new ScoreSortBuilder().order(SortOrder.DESC));
        // Build the phrase suggester on the configured field
        PhraseSuggestionBuilder pSuggestionBuilder =
                new PhraseSuggestionBuilder(commonEntity.getSuggestFileld());
        // Text to be corrected
        pSuggestionBuilder.text(commonEntity.getSuggestValue());
        // Only the best correction is needed
        pSuggestionBuilder.size(1);
        searchSourceBuilder.suggest(new SuggestBuilder().addSuggestion("czbk-suggest", pSuggestionBuilder));
        searchRequest.source(searchSourceBuilder);
        // Execute the search
        SearchResponse suggestResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        // Read the phrase suggestion from the response
        PhraseSuggestion phraseSuggestion =
                suggestResponse.getSuggest().getSuggestion("czbk-suggest");
        // Get the returned options
        List<PhraseSuggestion.Entry.Option> optionsList =
                phraseSuggestion.getEntries().get(0).getOptions();
        // Take the first option and remove the spaces inserted between tokens
        if (!CollectionUtils.isEmpty(optionsList)
                && optionsList.get(0).getText() != null) {
            pSuggestString = optionsList.get(0).getText().string().replaceAll(" ", "");
        }
        return pSuggestString;
    }
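
The last step of the service method, picking the top option and removing the spaces between tokens, can be isolated into a small pure function that is easy to unit-test without a cluster. The sketch below is a variation rather than the original behavior: it falls back to the user's input when there is no suggestion, whereas the method above returns an empty string. PhraseCorrectionSketch and pickCorrection are hypothetical names, and the option text in main is an assumed example of what a suggester might return, not captured output.

```java
import java.util.List;

final class PhraseCorrectionSketch {

    // Returns the top suggestion with token-separating spaces removed,
    // or the original input when no suggestion is available.
    static String pickCorrection(String originalInput, List<String> optionTexts) {
        if (optionTexts == null || optionTexts.isEmpty() || optionTexts.get(0) == null) {
            return originalInput;
        }
        return optionTexts.get(0).replace(" ", "");
    }

    public static void main(String[] args) {
        // Assumed option text with spaces between tokens.
        System.out.println(pickCorrection("adidaas官方旗舰店", List.of("adidas 官方 旗舰店")));
        // No suggestion: the input is returned unchanged.
        System.out.println(pickCorrection("adidas官方旗舰店", List.of()));
    }
}
```

Falling back to the original input lets the caller run the search with whatever text comes back, without a separate empty-string check.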

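Test endpoint

http://localhost:9090/psuggest

V. JD.com-Style Search in Practice

The complete code for the JD.com-style search project is available in my Gitee repository:

https://gitee.com/EkkoBoy/elasticsearch.git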
