Elasticsearch Learning, Part 7: The bool Filter

The bool filter

The bool filter combines the results of several filter conditions using Boolean logic. It consists of three parts:

    {
      "bool": {
        "must": [],
        "should": [],
        "must_not": []
      }
    }

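It supports the following operators:

  • must: every condition must match; equivalent to AND
  • must_not: none of the conditions may match; equivalent to NOT
  • should: at least one condition should match; equivalent to OR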

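Notes:

  1. Multiple conditions inside must are combined with AND (all must match), and a document is excluded if it matches any condition inside must_not. Multiple conditions inside should are combined with OR.
  2. When a bool query contains must or filter clauses in addition to should, a document does not have to satisfy any should condition, because minimum_should_match then defaults to 0. A document that does satisfy a should condition gets a higher relevance score.
  3. The minimum_should_match parameter controls how many should conditions must be satisfied, as a number or a percentage. It is normally used together with should.
  4. must, must_not, and should each accept an array of clauses. Conditions in a bool query that should not contribute to the relevance score can be placed in the filter clause.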
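The matching rules in these notes can be sketched in a few lines of Python. This is only an illustrative model, not how Elasticsearch is implemented: the function name `bool_matches` and the use of plain Python predicates as clauses are assumptions made for the example, and scoring is left out entirely. The one rule it encodes beyond the notes is the documented default for minimum_should_match: 1 when the query has should clauses but no must or filter, otherwise 0.

```python
from typing import Callable, Dict, Optional, Sequence

Doc = Dict[str, object]
Clause = Callable[[Doc], bool]  # hypothetical stand-in for a query clause


def bool_matches(
    doc: Doc,
    must: Sequence[Clause] = (),
    filter_: Sequence[Clause] = (),
    should: Sequence[Clause] = (),
    must_not: Sequence[Clause] = (),
    minimum_should_match: Optional[int] = None,
) -> bool:
    """Return True if doc would be a hit under bool-query matching rules."""
    if minimum_should_match is None:
        # Default: should is required only when it is the sole positive clause.
        has_required = bool(must) or bool(filter_)
        minimum_should_match = 1 if should and not has_required else 0

    # must and filter: every clause has to match (AND).
    if not all(clause(doc) for clause in list(must) + list(filter_)):
        return False
    # must_not: a single matching clause excludes the document.
    if any(clause(doc) for clause in must_not):
        return False
    # should: count matches and compare with the threshold.
    matched = sum(1 for clause in should if clause(doc))
    return matched >= minimum_should_match


doc = {"age": 35, "last_name": "Haijing"}
older_than_30 = lambda d: d["age"] > 30
is_smith = lambda d: d["last_name"] == "Smith"

# With a filter present, the unmatched should clause does not block the hit.
print(bool_matches(doc, filter_=[older_than_30], should=[is_smith]))
# Forcing minimum_should_match=1 makes the should clause mandatory.
print(bool_matches(doc, filter_=[older_than_30], should=[is_smith],
                   minimum_should_match=1))
```

The first call prints True and the second prints False, which is the behaviour the rest of this article runs into.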

Query example

The test data is as follows:

 

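| _index   | _type    | _id | _score | first_name | last_name | age | about                         |
|----------|----------|-----|--------|------------|-----------|-----|-------------------------------|
| megacorp | employee | 5   | 1      | 国庆       |           | 38  | I like to shopping foods      |
| megacorp | employee | 8   | 1      | Li         | Haijing   | 35  | I like to shopping foods1     |
| megacorp | employee | 2   | 1      | Jane       | Smith     | 32  | I like to collect rock albums |
| megacorp | employee | 4   | 1      | Li         | Haijing   | 35  | I like to shopping foods      |
| megacorp | employee | 6   | 1      | 张国庆     |           | 28  | I like to shopping foods      |
| megacorp | employee | 1   | 1      | John       | Smith     | 25  | I love to go rock climbing    |
| megacorp | employee | 3   | 1      | Douglas    | Fir       | 35  | I like to build cabinets      |

(Documents 5 and 6 have only a single name value in the source data.)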

The mapping is as follows:

    {
      "mapping": {
        "employee": {
          "properties": {
            "about": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "age": {
              "type": "long"
            },
            "first_name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "interests": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            },
            "last_name": {
              "type": "text",
              "fields": {
                "keyword": {
                  "type": "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
      }
    }

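The text fields are analyzed with Elasticsearch's default standard analyzer. It splits text on the word boundaries defined by the Unicode Consortium, removes most punctuation, and finally lowercases every term. For example, Smith is indexed as the lowercase term smith.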
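To get a feel for what ends up in the index, here is a rough Python approximation of that behaviour. It is a sketch only: the function name `approx_standard_analyze` is made up for this example, and a simple regular expression stands in for the real Unicode word-boundary rules, so it is only reasonable for plain English text like the about field in the test data.

```python
import re
from typing import List


def approx_standard_analyze(text: str) -> List[str]:
    """Split on non-word characters and lowercase each token.

    Only an approximation of the standard analyzer: the real one follows
    Unicode text segmentation rules, which this regex does not implement.
    """
    return [token.lower() for token in re.findall(r"\w+", text)]


print(approx_standard_analyze("Smith"))
print(approx_standard_analyze("I love to go rock climbing!"))
```

The first call yields the single lowercase term smith, and the second drops the exclamation mark and lowercases the leading "I".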

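  • Query requirement

      Find everyone who is older than 30 but not 38, and whose first_name is Douglas or whose last_name is Smith. This is equivalent to the following SQL:

select * from employee where age > 30 and age <> 38 and (first_name = 'Douglas' or last_name = 'Smith')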
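As a sanity check on what this requirement should return, the same predicate can be run over the test data in plain Python. This is just a sketch for working out the expected answer, independent of Elasticsearch; the list name `employees` and the helper `matches` are illustrative, and placing the single name of documents 5 and 6 under first_name is an assumption (it does not affect the result, since both are excluded by age).

```python
# Test data from the table above (only the fields the query uses).
employees = [
    {"id": 5, "first_name": "国庆", "last_name": None, "age": 38},    # name placement assumed
    {"id": 8, "first_name": "Li", "last_name": "Haijing", "age": 35},
    {"id": 2, "first_name": "Jane", "last_name": "Smith", "age": 32},
    {"id": 4, "first_name": "Li", "last_name": "Haijing", "age": 35},
    {"id": 6, "first_name": "张国庆", "last_name": None, "age": 28},  # name placement assumed
    {"id": 1, "first_name": "John", "last_name": "Smith", "age": 25},
    {"id": 3, "first_name": "Douglas", "last_name": "Fir", "age": 35},
]


def matches(e: dict) -> bool:
    """Python version of the WHERE clause."""
    return (
        e["age"] > 30
        and e["age"] != 38
        and (e["first_name"] == "Douglas" or e["last_name"] == "Smith")
    )


expected_ids = [e["id"] for e in employees if matches(e)]
print(expected_ids)  # [2, 3]
```

So the query is expected to return exactly two people: Jane Smith and Douglas Fir. John Smith is excluded because he is 25.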

  • The bool filter query in Elasticsearch

    GET /megacorp/employee/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "age": { "gt": 30 }
            }
          },
          "must_not": {
            "term": { "age": 38 }
          },
          "should": [
            { "term": { "last_name": "Smith" } },
            { "term": { "first_name": "Douglas" } }
          ]
        }
      }
    }

Result:

    {
      "took": 2,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 4,
        "max_score": 0.0,
        "hits": [
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "8",
            "_score": 0.0,
            "_source": {
              "first_name": "Li",
              "last_name": "Haijing",
              "age": 35,
              "about": "I like to shopping foods1",
              "interests": [
                "music1"
              ]
            }
          },
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "2",
            "_score": 0.0,
            "_source": {
              "first_name": "Jane",
              "last_name": "Smith",
              "age": 32,
              "about": "I like to collect rock albums",
              "interests": [
                "music"
              ]
            }
          },
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "4",
            "_score": 0.0,
            "_source": {
              "first_name": "Li",
              "last_name": "Haijing",
              "age": "35",
              "about": "I like to shopping foods",
              "interests": [
                "forestry"
              ]
            }
          },
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "3",
            "_score": 0.0,
            "_source": {
              "first_name": "Douglas",
              "last_name": "Fir",
              "age": 35,
              "about": "I like to build cabinets",
              "interests": [
                "forestry"
              ]
            }
          }
        ]
      }
    }

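The query hits 4 documents, whereas I had originally expected only two: Jane Smith (_id 2) and Douglas Fir (_id 3).

In practice the two Li Haijing documents (_id 8 and _id 4) are returned as well. The reason is as follows:

When a bool query contains must or filter clauses in addition to should, a document does not have to satisfy any should condition, because minimum_should_match then defaults to 0. A document that does satisfy a should condition gets a higher relevance score.

This example shows that the should conditions are optional here: the should clause does not change which documents are hit, only how they may be scored. But what if we want at least one of the should conditions to be required?

There are two ways to achieve this: set the minimum_should_match parameter, or wrap the should clause in a must clause.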

  • Option 1: minimum_should_match specifies the minimum number of should conditions that must match. With minimum_should_match set to 1, at least one should condition has to be satisfied. The query looks like this:

    GET /megacorp/employee/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "age": { "gt": 30 }
            }
          },
          "must_not": {
            "term": { "age": 38 }
          },
          "should": [
            { "term": { "last_name": "Smith" } },
            { "term": { "first_name": "Douglas" } }
          ],
          "minimum_should_match": 1
        }
      }
    }

This time the result is:

    {
      "took": 0,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
      }
    }

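Nothing is hit. The reason is that the should clause uses term queries, which do not analyze the search term: the input is matched exactly as written. The text fields in the index, however, were analyzed with the standard analyzer, so the indexed terms are lowercase (smith, douglas). The capitalized search terms Smith and Douglas therefore match no document.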
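The mismatch can be reproduced with a toy model of how one field value is stored. This is a simplified sketch, not Elasticsearch's real index structure: `index_field` and `term_query` are hypothetical helpers, and the tokenizer is a crude stand-in for the standard analyzer. The text sub-field holds the analyzed, lowercase terms; the keyword sub-field holds the original string untouched.

```python
import re
from typing import Dict, Set


def index_field(value: str) -> Dict[str, Set[str]]:
    """Model one field as it is indexed under the mapping in this article."""
    analyzed = {t.lower() for t in re.findall(r"\w+", value)}  # crude analyzer
    return {
        "text": analyzed,    # e.g. last_name
        "keyword": {value},  # e.g. last_name.keyword, stored verbatim
    }


def term_query(indexed: Dict[str, Set[str]], subfield: str, term: str) -> bool:
    """A term query compares the search term as-is, without analysis."""
    return term in indexed[subfield]


last_name = index_field("Smith")
print(term_query(last_name, "text", "Smith"))     # False: index holds "smith"
print(term_query(last_name, "text", "smith"))     # True
print(term_query(last_name, "keyword", "Smith"))  # True: exact original value
```

Under this model there are two ways out: lowercase the search term yourself, or query the keyword sub-field with the original spelling.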

In this case we can match exactly against each field's keyword sub-field:

    GET /megacorp/employee/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "age": { "gt": 30 }
            }
          },
          "must_not": {
            "term": { "age": 38 }
          },
          "should": [
            { "term": { "last_name.keyword": "Smith" } },
            { "term": { "first_name.keyword": "Douglas" } }
          ],
          "minimum_should_match": 1
        }
      }
    }

Result:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.9808292,
        "hits": [
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "2",
            "_score": 0.9808292,
            "_source": {
              "first_name": "Jane",
              "last_name": "Smith",
              "age": 32,
              "about": "I like to collect rock albums",
              "interests": [
                "music"
              ]
            }
          },
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "3",
            "_score": 0.2876821,
            "_source": {
              "first_name": "Douglas",
              "last_name": "Fir",
              "age": 35,
              "about": "I like to build cabinets",
              "interests": [
                "forestry"
              ]
            }
          }
        ]
      }
    }
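
  • Option 2

Wrap the should clause in a must clause. The inner bool query contains only should, so at least one of its conditions has to match, and must makes that inner query mandatory.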

    GET /megacorp/employee/_search
    {
      "query": {
        "bool": {
          "filter": {
            "range": {
              "age": { "gt": 30 }
            }
          },
          "must_not": {
            "term": { "age": 38 }
          },
          "must": [
            {
              "bool": {
                "should": [
                  { "term": { "last_name.keyword": "Smith" } },
                  { "term": { "first_name.keyword": "Douglas" } }
                ]
              }
            }
          ]
        }
      }
    }

The result is:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
      },
      "hits": {
        "total": 2,
        "max_score": 0.9808292,
        "hits": [
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "2",
            "_score": 0.9808292,
            "_source": {
              "first_name": "Jane",
              "last_name": "Smith",
              "age": 32,
              "about": "I like to collect rock albums",
              "interests": [
                "music"
              ]
            }
          },
          {
            "_index": "megacorp",
            "_type": "employee",
            "_id": "3",
            "_score": 0.2876821,
            "_source": {
              "first_name": "Douglas",
              "last_name": "Fir",
              "age": 35,
              "about": "I like to build cabinets",
              "interests": [
                "forestry"
              ]
            }
          }
        ]
      }
    }
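
If the queries are assembled in application code rather than typed by hand, the two options can be generated from the same building blocks. The sketch below builds both request bodies as plain Python dictionaries; the helper names `with_minimum_should_match` and `with_must_wrapper` are made up for this example, and sending the body to Elasticsearch is left to whatever client you use.

```python
import json
from typing import Dict, List

# The optional name conditions from the example, using the keyword sub-fields.
NAME_CLAUSES: List[Dict] = [
    {"term": {"last_name.keyword": "Smith"}},
    {"term": {"first_name.keyword": "Douglas"}},
]


def base_bool() -> Dict:
    """Clauses shared by both options: age > 30 and age != 38."""
    return {
        "filter": {"range": {"age": {"gt": 30}}},
        "must_not": {"term": {"age": 38}},
    }


def with_minimum_should_match(should: List[Dict], minimum: int = 1) -> Dict:
    """Option 1: keep should at the top level and require `minimum` matches."""
    body = base_bool()
    body["should"] = list(should)
    body["minimum_should_match"] = minimum
    return {"query": {"bool": body}}


def with_must_wrapper(should: List[Dict]) -> Dict:
    """Option 2: nest should in its own bool query and place that under must."""
    body = base_bool()
    body["must"] = [{"bool": {"should": list(should)}}]
    return {"query": {"bool": body}}


option1 = with_minimum_should_match(NAME_CLAUSES)
option2 = with_must_wrapper(NAME_CLAUSES)
print(json.dumps(option1, indent=2))
print(json.dumps(option2, indent=2))
```

Both bodies have the same structure as the hand-written queries above. Option 1 is shorter; option 2 has the advantage that several independent OR groups can each be wrapped and listed side by side under must.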

 
