
From Beginner to Advanced: Elasticsearch Advanced Query Techniques Explained


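Elasticsearch is a powerful full-text search engine built on the Lucene library, which handles the low-level indexing and searching. It offers many advanced query features that help you retrieve data more precisely and efficiently. This tutorial walks through those features and provides sample code showing how to use them.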

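1. Boolean Queries

Elasticsearch supports boolean queries through the "bool" query, whose "must", "should", and "must_not" clauses play the roles of AND, OR, and NOT. This lets you combine several conditions to narrow the results.

For example, the following query returns all documents that match both "foo" and "bar":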

    GET /_search
    {
      "query": {
        "bool": {
          "must": [
            { "match": { "content": "foo" }},
            { "match": { "content": "bar" }}
          ]
        }
      }
    }

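You can also use "should" clauses to match any one of several conditions. When a bool query has no "must" or "filter" clause, at least one "should" clause has to match, so the following query returns all documents that match "foo" or "bar":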

    GET /_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "content": "foo" }},
            { "match": { "content": "bar" }}
          ]
        }
      }
    }
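
To make the clause semantics concrete, here is a small Java sketch that models "must", "should", and "must_not" as checks against the set of terms in a document. It only illustrates the boolean logic; it ignores scoring and the "filter" clause, and the names BoolSketch and matches are made up for this example rather than taken from any Elasticsearch API.

```java
import java.util.List;
import java.util.Set;

// Illustrative model of bool-query matching over a document's terms.
// Scoring and the "filter" clause are left out on purpose.
class BoolSketch {

    static boolean matches(Set<String> docTerms, List<String> must,
                           List<String> should, List<String> mustNot) {
        // Every "must" term has to be present (AND).
        if (!docTerms.containsAll(must)) {
            return false;
        }
        // No "must_not" term may be present (NOT).
        for (String term : mustNot) {
            if (docTerms.contains(term)) {
                return false;
            }
        }
        // Without "must" clauses, at least one "should" term is required (OR).
        if (must.isEmpty() && !should.isEmpty()) {
            return should.stream().anyMatch(docTerms::contains);
        }
        return true;
    }

    public static void main(String[] args) {
        Set<String> doc = Set.of("foo", "baz");
        System.out.println(matches(doc, List.of("foo", "bar"), List.of(), List.of())); // false
        System.out.println(matches(doc, List.of(), List.of("foo", "bar"), List.of())); // true
        System.out.println(matches(doc, List.of(), List.of("foo"), List.of("baz")));   // false
    }
}
```

When "must" and "should" appear together, the "should" clauses become optional and only influence the relevance score, which is why the sketch requires a "should" match only when "must" is empty.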

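2. Range Queries

Elasticsearch supports range queries, which test whether a field's value falls within a given range. The two most common cases are numeric ranges and date ranges. The "gte" and "lte" parameters make the bounds inclusive, while "gt" and "lt" make them exclusive.

For example, the following query returns all users aged 18 to 30: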

    GET /_search
    {
      "query": {
        "range": {
          "age": {
            "gte": 18,
            "lte": 30
          }
        }
      }
    }

The following query returns all users whose registration date falls between January 1, 2019 and January 1, 2020:

    GET /_search
    {
      "query": {
        "range": {
          "registered_at": {
            "gte": "2019-01-01",
            "lte": "2020-01-01"
          }
        }
      }
    }

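3. Fuzzy Queries

Elasticsearch supports fuzzy queries, which find documents containing misspellings or near matches of the search term. A fuzzy query uses edit distance to decide which indexed terms are close enough to the search term.

For example, the following query returns documents containing "fox" as well as nearby terms such as "fix":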

    GET /_search
    {
      "query": {
        "fuzzy": {
          "content": {
            "value": "fox",
            "fuzziness": "2"
          }
        }
      }
    }

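The "fuzziness" parameter sets the maximum allowed edit distance. In the example above it is 2, so the query matches any term whose edit distance from "fox" is 0, 1, or 2. For a three-letter term that is quite permissive (terms such as "box" or "for" also qualify); the value "AUTO", which chooses the distance based on the length of the term, is usually a better default.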
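
To see what an edit distance of 1 or 2 means in practice, the following Java sketch computes the classic Levenshtein distance between two strings. It is an illustration only: the class name EditDistanceSketch is made up for this example, and Elasticsearch by default also counts a swap of two adjacent characters as a single edit, which this plain Levenshtein version does not.

```java
// Plain Levenshtein distance: the minimum number of single-character
// insertions, deletions, and substitutions needed to turn a into b.
class EditDistanceSketch {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Turning the empty prefix of a into the first j characters of b takes j insertions.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting the first i characters of a
            for (int j = 1; j <= b.length(); j++) {
                int substitution = prev[j - 1] + (a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1);
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                curr[j] = Math.min(substitution, Math.min(insertion, deletion));
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("fox", "fix"));   // 1
        System.out.println(levenshtein("fox", "foxes")); // 2
        System.out.println(levenshtein("fox", "cat"));   // 3
    }
}
```

With "fuzziness" set to 2, "fix" and "foxes" would be within range of "fox", while "cat" would not.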

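4. Regular Expression Queries

Elasticsearch supports regular expression queries through the "regexp" query, which finds text matching a given pattern. The pattern is applied to the individual terms stored in the index and has to match an entire term, not just part of it.

For example, the following query returns documents whose "content" field contains the term "foo" or the term "bar":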

    GET /_search
    {
      "query": {
        "regexp": {
          "content": "foo|bar"
        }
      }
    }
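
The whole-term behaviour of "regexp" can be imitated in Java with Matcher.matches(), which also requires the pattern to cover the entire input. The sketch below is only an analogy: Elasticsearch uses Lucene's regular expression syntax, which differs from java.util.regex in several details, and the class name RegexpTermSketch is made up for this example.

```java
import java.util.List;
import java.util.regex.Pattern;

// Models a document as a list of indexed terms and checks whether
// any single term is matched in full by the pattern.
class RegexpTermSketch {

    static boolean docMatches(List<String> terms, String regex) {
        Pattern pattern = Pattern.compile(regex);
        // matches() succeeds only if the pattern covers the whole term.
        return terms.stream().anyMatch(term -> pattern.matcher(term).matches());
    }

    public static void main(String[] args) {
        System.out.println(docMatches(List.of("foo", "qux"), "foo|bar")); // true
        System.out.println(docMatches(List.of("foobar"), "foo|bar"));     // false
        System.out.println(docMatches(List.of("foobar"), "foo.*"));       // true
    }
}
```

The second call shows why "foo|bar" does not find a document whose only term is "foobar": neither alternative covers the whole term.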

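5. Wildcard Queries

Elasticsearch supports wildcard queries through the "wildcard" query, which finds terms matching a wildcard pattern. In the pattern, "*" matches any sequence of characters (including none) and "?" matches any single character. A wildcard pattern has no OR operator, so alternative patterns are combined with a bool query.

For example, the following query returns documents containing a term that starts with "foo" or "bar":

    GET /_search
    {
      "query": {
        "bool": {
          "should": [
            { "wildcard": { "content": "foo*" }},
            { "wildcard": { "content": "bar*" }}
          ]
        }
      }
    }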
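
As a rough illustration of how "*" and "?" behave, the Java sketch below translates a wildcard pattern into a regular expression and tests it against a single term. It is a simplified model, not Elasticsearch's implementation (escaping with a backslash is not handled), and the class name WildcardSketch is made up for this example.

```java
import java.util.regex.Pattern;

// Translates a wildcard pattern into a regular expression:
// '*' becomes ".*", '?' becomes ".", everything else is matched literally.
class WildcardSketch {

    static Pattern toRegex(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    static boolean matches(String term, String wildcard) {
        // The pattern has to cover the whole term.
        return toRegex(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("foobar", "foo*"));         // true
        System.out.println(matches("xfoo", "foo*"));           // false
        System.out.println(matches("foobar", "foo* OR bar*")); // false
    }
}
```

The last call shows that text such as " OR " inside a pattern is treated as literal characters, not as an operator.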

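6. Phrase Queries

Elasticsearch supports phrase queries through the "match_phrase" query, which matches documents containing the given words next to each other and in the same order. A single "match_phrase" clause looks for one phrase, so to accept either of two phrases, wrap two clauses in a bool query.

For example, the following query returns documents containing the phrase "quick brown fox" or the phrase "lazy dog":

    GET /_search
    {
      "query": {
        "bool": {
          "should": [
            { "match_phrase": { "content": "quick brown fox" }},
            { "match_phrase": { "content": "lazy dog" }}
          ]
        }
      }
    }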
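
The difference between matching a phrase and matching individual words can be sketched in a few lines of Java. The example below splits text into lowercase tokens and checks whether the phrase tokens appear consecutively. It is a simplified stand-in for what an analyzer and position-aware matching do in Elasticsearch, and the class name PhraseSketch is made up for this example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

// Checks whether the tokens of a phrase occur consecutively, in order, in a text.
class PhraseSketch {

    // Very rough tokenizer: lowercase, split on non-word characters.
    static List<String> tokenize(String text) {
        return Arrays.asList(text.toLowerCase(Locale.ROOT).split("\\W+"));
    }

    static boolean containsPhrase(String text, String phrase) {
        // indexOfSubList returns -1 when the phrase tokens are not a contiguous run.
        return Collections.indexOfSubList(tokenize(text), tokenize(phrase)) >= 0;
    }

    public static void main(String[] args) {
        String doc = "The quick brown fox jumps over the lazy dog";
        System.out.println(containsPhrase(doc, "quick brown fox"));          // true
        System.out.println(containsPhrase(doc, "lazy dog"));                 // true
        System.out.println(containsPhrase(doc, "quick brown fox lazy dog")); // false
    }
}
```

The last call shows why two phrases cannot simply be concatenated into one "match_phrase" value: the five words would have to appear side by side in the document.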

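7. Highlighting

Elasticsearch can highlight the matched keywords in search results, which makes the results easier to read. Highlighting is enabled with the "highlight" parameter; by default, matched terms are wrapped in <em> tags.

For example, the following query returns documents that match "foo" or "bar" and highlights the matched keywords in the "content" field: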

    GET /_search
    {
      "query": {
        "bool": {
          "should": [
            { "match": { "content": "foo" }},
            { "match": { "content": "bar" }}
          ]
        }
      },
      "highlight": {
        "fields": {
          "content": {}
        }
      }
    }
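
To give an idea of what a highlighted fragment looks like, the Java sketch below wraps every word that matches one of the query terms in <em> tags. It only imitates the shape of the output; real highlighters also select and truncate fragments, and the class name HighlightSketch is made up for this example.

```java
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Wraps words that match a query term (case-insensitively) in <em> tags.
class HighlightSketch {

    private static final Pattern WORD = Pattern.compile("\\w+");

    static String highlight(String text, Set<String> lowercaseTerms) {
        StringBuilder out = new StringBuilder();
        Matcher matcher = WORD.matcher(text);
        int last = 0;
        while (matcher.find()) {
            // Copy whatever sits between the previous word and this one unchanged.
            out.append(text, last, matcher.start());
            String word = matcher.group();
            if (lowercaseTerms.contains(word.toLowerCase(Locale.ROOT))) {
                out.append("<em>").append(word).append("</em>");
            } else {
                out.append(word);
            }
            last = matcher.end();
        }
        out.append(text.substring(last));
        return out.toString();
    }

    public static void main(String[] args) {
        // Prints: <em>Foo</em> and <em>bar</em>.
        System.out.println(highlight("Foo and bar.", Set.of("foo", "bar")));
    }
}
```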

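8. Pagination and Sorting

Elasticsearch supports paginating and sorting search results. The "from" and "size" parameters set the starting offset and the number of results to return, and the "sort" parameter sets the sort order. The "from" offset is zero-based, and by default "from" plus "size" may not exceed 10,000 (the "index.max_result_window" setting).

For example, the following query skips the first 10 documents and returns the next 5, sorted by the "age" field in ascending order: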

    GET /_search
    {
      "from": 10,
      "size": 5,
      "query": {
        "match_all": {}
      },
      "sort": [
        { "age": "asc" }
      ]
    }
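
Applications usually think in page numbers, not offsets. The Java sketch below shows the arithmetic for turning a 1-based page number into a "from" value and for checking a request against the default result window of 10,000. The class and method names are made up for this example.

```java
// Helper arithmetic for from/size pagination.
class PagingSketch {

    // Default value of the index.max_result_window setting.
    static final int MAX_RESULT_WINDOW = 10_000;

    // Converts a 1-based page number into the zero-based "from" offset.
    static int fromForPage(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be positive");
        }
        return Math.multiplyExact(page - 1, size);
    }

    // A request is accepted only while from + size stays within the window.
    static boolean withinResultWindow(int from, int size) {
        return (long) from + size <= MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(fromForPage(3, 5));            // 10
        System.out.println(withinResultWindow(10, 5));    // true
        System.out.println(withinResultWindow(9_996, 5)); // false
    }
}
```

Page 3 with 5 results per page therefore corresponds to the "from": 10, "size": 5 request shown above. For paging beyond the result window, Elasticsearch offers the "search_after" parameter.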

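9. Aggregations

Elasticsearch supports aggregations, which compute statistics over documents and group them into buckets. Aggregations are requested with the "aggs" parameter.

For example, the following query returns, for each term in the "content" field, the number of documents that contain it. Note that a "terms" aggregation on a "text" field only works if "fielddata" is enabled for that field in the mapping; otherwise the request is rejected. Aggregations are normally run on "keyword" fields, where each bucket is a whole field value.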

    GET /_search
    {
      "query": {
        "match_all": {}
      },
      "aggs": {
        "word_count": {
          "terms": {
            "field": "content"
          }
        }
      }
    }
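
Conceptually, a "terms" aggregation builds one bucket per distinct term and counts the documents in each bucket. The Java sketch below does the same over a list of in-memory strings. It is a conceptual model only (Elasticsearch computes this per shard and returns only the top buckets), and the class name TermsAggSketch is made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Counts, for every distinct term, how many documents contain it.
class TermsAggSketch {

    static Map<String, Long> docCounts(List<String> docs) {
        Map<String, Long> counts = new TreeMap<>();
        for (String doc : docs) {
            // A set, so a term repeated inside one document is counted once.
            Set<String> terms = new HashSet<>(
                    Arrays.asList(doc.toLowerCase(Locale.ROOT).split("\\W+")));
            terms.remove(""); // split() yields an empty token for leading punctuation
            for (String term : terms) {
                counts.merge(term, 1L, Long::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints: {bar=1, baz=1, foo=2}
        System.out.println(docCounts(List.of("foo bar", "foo foo baz")));
    }
}
```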

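Those are some of Elasticsearch's advanced query techniques. The rest of this tutorial shows sample code that puts them to use.

10. Java Example

The sample code is as follows: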

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.apache.http.HttpHost;
    import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.support.WriteRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestClient;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.client.indices.CreateIndexRequest;
    import org.elasticsearch.common.text.Text;
    import org.elasticsearch.index.query.MatchQueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.index.query.RangeQueryBuilder;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.SearchHits;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
    import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;

    public class ElasticsearchDemo {

        private static final String INDEX = "my_index";

        public static void main(String[] args) throws IOException {
            // Create the client; try-with-resources closes it even if a request fails
            try (RestHighLevelClient client = new RestHighLevelClient(
                    RestClient.builder(new HttpHost("localhost", 9200, "http")))) {
                // Create the index and mapping
                createIndexAndMapping(client);
                // Insert a document
                insertDocument(client);

                // Match query
                MatchQueryBuilder matchQuery = QueryBuilders.matchQuery("content", "elasticsearch");
                printSearchResult(search(client, new SearchSourceBuilder().query(matchQuery)));

                // Match query with highlighting
                HighlightBuilder highlightBuilder = new HighlightBuilder()
                        .field(new HighlightBuilder.Field("content").preTags("<em>").postTags("</em>"));
                printSearchResultWithHighlight(search(client,
                        new SearchSourceBuilder().query(matchQuery).highlighter(highlightBuilder)));

                // Range query (gte/lte replace the older from/to methods)
                RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("publish_date")
                        .gte("2020-01-01")
                        .lte("2021-12-31");
                printSearchResult(search(client, new SearchSourceBuilder().query(rangeQuery)));

                // Sort by publish_date (ascending by default)
                printSearchResult(search(client,
                        new SearchSourceBuilder().query(matchQuery).sort("publish_date")));

                // Delete the index
                deleteIndex(client);
            }
        }

        // Runs one search against the demo index
        private static SearchResponse search(RestHighLevelClient client, SearchSourceBuilder source)
                throws IOException {
            SearchRequest request = new SearchRequest(INDEX).source(source);
            return client.search(request, RequestOptions.DEFAULT);
        }

        private static void createIndexAndMapping(RestHighLevelClient client) throws IOException {
            // Index settings
            Map<String, Object> settings = new HashMap<>();
            settings.put("number_of_shards", 1);
            settings.put("number_of_replicas", 0);
            // Field mappings
            Map<String, Object> properties = new HashMap<>();
            properties.put("title", Map.of("type", "text"));
            properties.put("content", Map.of("type", "text"));
            properties.put("publish_date", Map.of("type", "date"));
            Map<String, Object> mapping = new HashMap<>();
            mapping.put("properties", properties);
            CreateIndexRequest request = new CreateIndexRequest(INDEX)
                    .settings(settings)
                    .mapping(mapping);
            client.indices().create(request, RequestOptions.DEFAULT);
        }

        private static void insertDocument(RestHighLevelClient client) throws IOException {
            Map<String, Object> document = new HashMap<>();
            document.put("title", "Elasticsearch Guide");
            document.put("content", "This is a guide to Elasticsearch.");
            document.put("publish_date", "2021-03-01");
            IndexRequest request = new IndexRequest(INDEX)
                    .id("1")
                    .source(document)
                    // Refresh right away so the searches that follow can see the document
                    .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
            client.index(request, RequestOptions.DEFAULT);
        }

        private static void deleteIndex(RestHighLevelClient client) throws IOException {
            client.indices().delete(new DeleteIndexRequest(INDEX), RequestOptions.DEFAULT);
        }

        private static void printSearchResult(SearchResponse response) {
            SearchHits hits = response.getHits();
            System.out.println("Total hits: " + hits.getTotalHits().value);
            System.out.println("Hits:");
            for (SearchHit hit : hits) {
                Map<String, Object> source = hit.getSourceAsMap();
                System.out.println("Id: " + hit.getId());
                System.out.println("Score: " + hit.getScore());
                System.out.println("Title: " + source.get("title"));
                System.out.println("Content: " + source.get("content"));
                System.out.println("Publish date: " + source.get("publish_date"));
            }
        }

        private static void printSearchResultWithHighlight(SearchResponse response) {
            SearchHits hits = response.getHits();
            System.out.println("Total hits: " + hits.getTotalHits().value);
            System.out.println("Hits:");
            for (SearchHit hit : hits) {
                Map<String, Object> source = hit.getSourceAsMap();
                System.out.println("Id: " + hit.getId());
                System.out.println("Score: " + hit.getScore());
                System.out.println("Title: " + source.get("title"));
                HighlightField highlightField = hit.getHighlightFields().get("content");
                if (highlightField != null) {
                    // Join the highlighted fragments into one string
                    StringBuilder content = new StringBuilder();
                    for (Text fragment : highlightField.fragments()) {
                        content.append(fragment.string());
                    }
                    System.out.println("Content: " + content);
                } else {
                    // No highlight for this hit: fall back to the stored field
                    System.out.println("Content: " + source.get("content"));
                }
                System.out.println("Publish date: " + source.get("publish_date"));
            }
        }
    }

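Here we use the Elasticsearch High Level REST Client. Compared with the low-level client, it is easier to use and more object-oriented, which speeds up development. Note that this client has been deprecated since Elasticsearch 7.15 in favor of the Java API Client, so the code in this tutorial targets 7.x clusters.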

11. Using Spring Boot

First, add the required dependencies to the pom.xml file:

    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
    </dependency>
    <dependency>
        <groupId>org.elasticsearch.client</groupId>
        <artifactId>elasticsearch-rest-high-level-client</artifactId>
        <version>7.15.2</version>
    </dependency>

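Here, spring-boot-starter-data-elasticsearch is the base dependency Spring Boot provides for Elasticsearch integration, and elasticsearch-rest-high-level-client is the Elasticsearch High Level REST Client. Spring Boot also manages an Elasticsearch client version of its own, so the explicit version above should be kept in line with the one your Spring Boot release expects.

Next, create a Spring Boot main class with the following code: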

    import java.io.IOException;
    import java.util.HashMap;
    import java.util.Map;

    import org.elasticsearch.action.admin.indices.delete.DeleteIndexRequest;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.action.support.WriteRequest;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.client.indices.CreateIndexRequest;
    import org.elasticsearch.common.settings.Settings;
    // In clients older than 7.15 this class lives in org.elasticsearch.common.unit
    import org.elasticsearch.core.TimeValue;
    import org.elasticsearch.index.query.BoolQueryBuilder;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.fetch.subphase.highlight.HighlightBuilder;
    import org.elasticsearch.search.fetch.subphase.highlight.HighlightField;
    import org.elasticsearch.search.sort.SortOrder;
    import org.springframework.boot.CommandLineRunner;
    import org.springframework.boot.SpringApplication;
    import org.springframework.boot.autoconfigure.SpringBootApplication;
    import org.springframework.context.annotation.Bean;
    import org.springframework.data.elasticsearch.client.ClientConfiguration;
    import org.springframework.data.elasticsearch.client.RestClients;

    @SpringBootApplication
    public class ElasticsearchDemoApplication implements CommandLineRunner {

        public static void main(String[] args) {
            SpringApplication.run(ElasticsearchDemoApplication.class, args);
        }

        @Bean
        public RestHighLevelClient client() {
            // Connects to localhost:9200; Spring closes this bean on shutdown
            return RestClients.create(ClientConfiguration.localhost()).rest();
        }

        @Override
        public void run(String... args) throws Exception {
            // Returns the singleton bean defined above
            RestHighLevelClient client = client();
            try {
                createIndex(client);
                insertDocument(client);
                searchDocument(client);
                deleteIndex(client);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        private static void createIndex(RestHighLevelClient client) throws IOException {
            // Index settings
            Settings.Builder settings = Settings.builder()
                    .put("index.number_of_shards", 1)
                    .put("index.number_of_replicas", 0);
            // Field mappings
            Map<String, Object> properties = new HashMap<>();
            properties.put("title", Map.of("type", "text"));
            properties.put("content", Map.of("type", "text"));
            properties.put("publish_date", Map.of("type", "date"));
            Map<String, Object> mapping = new HashMap<>();
            mapping.put("properties", properties);
            CreateIndexRequest request = new CreateIndexRequest("my_index")
                    .settings(settings)
                    .mapping(mapping);
            client.indices().create(request, RequestOptions.DEFAULT);
        }

        private static void insertDocument(RestHighLevelClient client) throws IOException {
            // Same document as in the plain Java example
            Map<String, Object> document = new HashMap<>();
            document.put("title", "Elasticsearch Guide");
            document.put("content", "This is a guide to Elasticsearch.");
            document.put("publish_date", "2021-03-01");
            IndexRequest request = new IndexRequest("my_index")
                    .id("1")
                    .source(document)
                    // Refresh right away so the search that follows can see the document
                    .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
            client.index(request, RequestOptions.DEFAULT);
        }

        private static void searchDocument(RestHighLevelClient client) throws IOException {
            // Title must match; a content match only raises the score
            BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery()
                    .must(QueryBuilders.matchQuery("title", "Elasticsearch"))
                    .should(QueryBuilders.matchQuery("content", "guide"));
            SearchSourceBuilder sourceBuilder = new SearchSourceBuilder()
                    .query(boolQueryBuilder)
                    .sort("publish_date", SortOrder.DESC)
                    .from(0)
                    .size(10)
                    .timeout(TimeValue.timeValueSeconds(1))
                    // Return title and publish_date, leave out content
                    .fetchSource(new String[]{"title", "publish_date"}, new String[]{"content"});
            // Highlight the title using its own match query
            HighlightBuilder highlightBuilder = new HighlightBuilder()
                    .field(new HighlightBuilder.Field("title"))
                    .highlightQuery(QueryBuilders.matchQuery("title", "Elasticsearch"));
            sourceBuilder.highlighter(highlightBuilder);
            SearchRequest request = new SearchRequest("my_index").source(sourceBuilder);
            SearchResponse response = client.search(request, RequestOptions.DEFAULT);
            System.out.println("Total hits: " + response.getHits().getTotalHits().value);
            for (SearchHit hit : response.getHits().getHits()) {
                System.out.println("Title: " + hit.getSourceAsMap().get("title"));
                System.out.println("Publish date: " + hit.getSourceAsMap().get("publish_date"));
                // The highlight entry can be missing, so check before using it
                HighlightField titleHighlight = hit.getHighlightFields().get("title");
                if (titleHighlight != null && titleHighlight.fragments().length > 0) {
                    System.out.println("Highlighted title: " + titleHighlight.fragments()[0].string());
                }
                System.out.println("--------------------------");
            }
        }

        private static void deleteIndex(RestHighLevelClient client) throws IOException {
            // Deleting an index takes a DeleteIndexRequest, not a document-level DeleteRequest
            DeleteIndexRequest request = new DeleteIndexRequest("my_index");
            client.indices().delete(request, RequestOptions.DEFAULT);
        }
    }

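The main class ElasticsearchDemoApplication implements the CommandLineRunner interface so that these methods run when the application starts. The run method calls the methods that create the index, insert a document, search for documents, and delete the index. They do the same work as their counterparts in the plain Java example.

Next, run the application and check the output. In a terminal, enter the following command: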

mvn spring-boot:run

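With this Spring Boot setup, interacting with Elasticsearch becomes more convenient, because the connection setup and resource cleanup no longer have to be handled by hand. Spring Boot also brings other features such as auto-configuration and dependency injection, which let us concentrate on business logic instead of the plumbing around Elasticsearch.

Summary

Elasticsearch is a powerful search engine with many advanced query features. In practice, you can pick the query type that fits your needs and combine the advanced options of the query DSL to build more complex searches. This tutorial covered the basic query types and advanced query techniques of Elasticsearch together with matching code samples, and will hopefully help readers get a better grasp of its query capabilities.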
