Using Elasticsearch with RestHighLevelClient

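Overview

About this document

  • Everything below is based on Elasticsearch 7.x.
  • All examples run against the index_learn_test index.
  • All code examples are written in Java with RestHighLevelClient.
  • The index mapping (DSL) is: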
{
"index_learn_test" : {
 "mappings" : {
   "properties" : {
     "age" : {
       "type" : "keyword",
       "fields" : {
         "number" : {
           "type" : "integer"
         }
       }
     },
     "departmentId" : {
       "type" : "keyword"
     },
     "departmentIdLeve1" : {
       "type" : "keyword"
     },
     "departmentIdLeve2" : {
       "type" : "keyword"
     },
     "departmentIdLeve3" : {
       "type" : "keyword"
     },
     "departmentIdLeve4" : {
       "type" : "keyword"
     },
     "departmentIdLeve5" : {
       "type" : "keyword"
     },
     "departmentIdLeve6" : {
       "type" : "keyword"
     },
     "departmentIdLeve7" : {
       "type" : "keyword"
     },
     "departmentIds" : {
       "type" : "keyword"
     },
     "departmentJoin" : {
       "type" : "join",
       "eager_global_ordinals" : true,
       "relations" : {
         "department" : "user"
       }
     },
     "id" : {
       "type" : "long"
     },
     "name" : {
       "type" : "text",
       "fields" : {
         "keyword" : {
           "type" : "keyword"
         }
       }
     },
     "resume" : {
       "type" : "wildcard"
     },
     "sex" : {
       "type" : "keyword"
     }
   }
 }
}
}

Field types

Indices

List all indices and their disk usage

GET /_cat/indices?v

View the settings of an index (including defaults)

GET /index_learn_test/_settings?include_defaults=true

Create an index

PUT /index_learn_test
{
  "mappings": {
    "properties": {
      "id": {
        "type": "long"
      },
      "name": {
        "type": "text",
        "fields": {
          "keyword": {
            "type": "keyword"
          }
        }
      },
      "sex": {
        "type": "keyword"
      },
      "age": {
        "type": "keyword",
        "fields": {
          "number": {
            "type": "integer"
          }
        }
      }
    }
  }
}

The fields entry maps the same value under additional types (multi-fields). To use a sub-field, address it by path; for the name field, for example, that is name.keyword.

View the index mapping

GET /index_learn_test/_mapping

Delete an index

DELETE /index_learn_test

Add a field to an index

PUT /index_learn_test/_mapping
{
  "properties":{
    "departmentIds":{
      "type":"keyword"
    }
  }
}

Copy index data (reindex)

POST /_reindex
{
  "dest": {
    "index": "index_learn_test2"
  },
  "source": {
    "query": {
      "bool": {
        "must": [
          {
            "term": {
              "name": {
                "value": "正"
              }
            }
          }
        ]
      }
    },
    "index": "index_learn_test"
  } ,
  "max_docs":1
}

  • dest: the target index
  • source: the source index
  • query: filters which documents are copied
  • max_docs: the maximum number of documents to copy

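Create, update, delete

Making a write visible immediately

Writes in ES are not searchable right away: they become visible at the next refresh, which by default happens every 1s. If a change must be searchable as soon as the request returns, set a refresh policy as in the Java code below. Forcing a refresh on every write costs indexing throughput, so use it sparingly.
To see the current interval, run GET /index_learn_test/_settings?include_defaults=true and look at refresh_interval in the response.

Java:

UpdateRequest updateRequest = new UpdateRequest(getIndex(), userData.getId().toString())
        // Changes normally become searchable only after refresh_interval elapses, so force a refresh here
        .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);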

Insert

es:

PUT /index_learn_test/_doc/${id}

PUT /index_learn_test/_doc/12
{
  "id": 12,
  "age": 42,
  "sex": "女",
  "name": "厍振",
  "resume": "我是厍振,大家好!",
  "departmentId": "A"
}

Java code:

public void insert() throws IOException {
        List<DepartmentData> list = DepartmentUtil.getDepartment(DOC_PARENT_NAME);
        String[] sex = new String[]{"男", "女"};
        UserData userData = new UserData();
        userData.setId(12L);
        userData.setAge(RandomUtil.randomInt(19, 60));
        userData.setSex(sex[RandomUtil.randomInt(2)]);
        userData.setName(RandNameUtil.randName());
        userData.setResume("我是" + userData.getName() + ",大家好!");
        userData.setDepartmentId(list.get(RandomUtil.randomInt(list.size())).getDepartmentId());
        IndexRequest indexRequest = new IndexRequest(getIndex())
                .id(userData.getId().toString()).source(JsonUtils.toJsonString(userData), XContentType.JSON);
        restHighLevelClient.index(indexRequest,RequestOptions.DEFAULT);
    }

Update

To replace a whole document, index it again under the same id. To change only some fields, use a partial update:

es:

POST /index_learn_test/_update/12?retry_on_conflict=10
{
  "doc": {
    "age":12
  }
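}

  • retry_on_conflict: how many times to retry after a version conflict. Under concurrent updates, threads can read different versions of the same document, and a write based on a stale version fails.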
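To make the retry behaviour concrete, here is a minimal plain-Java sketch of optimistic concurrency with a bounded number of retries. It is an in-memory model only: VersionedDoc and RetryOnConflictSketch are hypothetical names made up for this illustration, not Elasticsearch classes, and the server-side logic may differ in detail.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.IntUnaryOperator;

// Hypothetical in-memory stand-in for a stored document and its version.
final class VersionedDoc {
    final long version;
    final int age;

    VersionedDoc(long version, int age) {
        this.version = version;
        this.age = age;
    }
}

class RetryOnConflictSketch {
    /**
     * Tries the update up to (1 + retryOnConflict) times. Each attempt reads the
     * current document and writes only if nobody else has replaced it in between.
     */
    static boolean update(AtomicReference<VersionedDoc> store, IntUnaryOperator change, int retryOnConflict) {
        for (int attempt = 0; attempt <= retryOnConflict; attempt++) {
            VersionedDoc current = store.get();
            VersionedDoc next = new VersionedDoc(current.version + 1, change.applyAsInt(current.age));
            if (store.compareAndSet(current, next)) {
                return true;  // no concurrent write: the update is accepted
            }
            // conflict: another writer got in first, so re-read and try again
        }
        return false;         // retries exhausted, comparable to a version-conflict error
    }
}
```

With retryOnConflict set to 0 a single conflict already fails the update; a higher value trades latency for a better chance of success under contention.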

Java code:

 public void update() throws IOException {
        UserData userData = new UserData();
        userData.setId(12L);
        userData.setAge(RandomUtil.randomInt(19, 60));
        UpdateRequest updateRequest = new UpdateRequest(getIndex(),userData.getId().toString())
                // Changes normally become searchable only after refresh_interval elapses, so force a refresh here
                .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE)
                .retryOnConflict(10)
                .doc(JsonUtils.toJsonString(userData), XContentType.JSON);
        restHighLevelClient.update(updateRequest,RequestOptions.DEFAULT);
    }

Delete

DELETE /index_learn_test/_doc/${id}

es:

DELETE /index_learn_test/_doc/12

Java code:

    public void delete() throws IOException {
        UserData userData = new UserData();
        userData.setId(12L);
        DeleteRequest request = new DeleteRequest(getIndex(),userData.getId().toString())
                .setRefreshPolicy(WriteRequest.RefreshPolicy.IMMEDIATE);
        restHighLevelClient.delete(request,RequestOptions.DEFAULT);
    }

Bulk operations (bulk)

es:

PUT /index_learn_test/_bulk
{"delete":{"_index":"index_learn_test","_id":"12"}}
{"update":{"_index":"index_learn_test","_id":"20"}}
{"doc":{"age":23}}
{"create":{"_index":"index_learn_test","_id":"30"}}
{"age":34,"id":30}
{"index":{"_index":"index_learn_test","_id":"40"}}
{"age":34,"id":30}

Java code:

public void bulk()  throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        DeleteRequest deleteRequest = new DeleteRequest(getIndex()).id("12");
        bulkRequest.add(deleteRequest);
        UpdateRequest updateRequest = new UpdateRequest(getIndex(), "20").doc(Collections.singletonMap("age", 30));
        bulkRequest.add(updateRequest);
        // ... remaining operations omitted
        restHighLevelClient.bulk(bulkRequest,RequestOptions.DEFAULT);
    }

Search

Java code for query conditions

Almost every query in this document can be run with the following template.

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
// Nearly every query condition can be built through the QueryBuilders class
QueryBuilder wildcardQueryBuilder = QueryBuilders.wildcardQuery("resume", "*我是王*,大家*");
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .query(wildcardQueryBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));

Scoring

The score measures how relevant a document is to the query; it is mainly used for ranking results.

Elapsed time

The took field of the search response, in milliseconds.

Returning the total hit count

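Avoid tracking the exact total unless you really need it, for the reasons below.
In the author's tests on a single shard holding 5 GB of data, ordinary queries were barely affected (about 20 ms),
while parent-child queries were hit much harder (about 800 ms).

  • true: returns the exact total. Every matching document has to be counted, so this is the slowest option.
  • An integer n >= 0 (maximum 2147483647): counts accurately up to n. If more documents match, the reported total is n and the response marks it as a lower bound. Only about n documents need to be visited, so the cost depends on n.
  • -1: no total is returned. This is the fastest option.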
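The three cases above can be summarised in a small plain-Java model. This is only a sketch of the reported total, not client code: TrackTotalHitsSketch, reportedTotal and the string format are hypothetical names chosen for this illustration, while eq and gte mirror the relation values found in a search response.

```java
// Plain-Java model of how the reported total depends on track_total_hits.
// Hypothetical helper for illustration; no Elasticsearch classes are involved.
class TrackTotalHitsSketch {
    static final int DISABLED = -1;                 // do not count at all
    static final int ACCURATE = Integer.MAX_VALUE;  // what "true" amounts to in this model

    /** Returns "eq N", "gte N" or "none" for a given number of real matches. */
    static String reportedTotal(long actualMatches, int trackTotalHits) {
        if (trackTotalHits == DISABLED) {
            return "none";
        }
        if (actualMatches > trackTotalHits) {
            // counting stopped at the threshold, so the value is only a lower bound
            return "gte " + trackTotalHits;
        }
        return "eq " + actualMatches;
    }
}
```

For example, 25000 matching documents with a threshold of 10000 are reported as a lower bound of 10000, while the same search with the exact setting reports all 25000.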

es:

GET /index_learn_test/_search
{
  "track_total_hits": true
}

Java code:

SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder.searchSource();
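searchSourceBuilder.trackTotalHits(true);

Returning only some fields

includes returns only the listed fields; excludes returns every field except the listed ones. When both are given they combine as AND: a field is returned only if it matches includes and does not match excludes. The example below therefore returns only age.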
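The AND relationship can be illustrated with a few lines of plain Java. This is a simplified model that assumes exact field names only (no wildcard patterns, no nested paths); SourceFilterSketch and filter are hypothetical names used for this illustration.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of _source filtering with exact field names only.
class SourceFilterSketch {
    static Map<String, Object> filter(Map<String, Object> source,
                                      List<String> includes,
                                      List<String> excludes) {
        Map<String, Object> result = new LinkedHashMap<>();
        for (Map.Entry<String, Object> field : source.entrySet()) {
            // an empty includes list means "everything is included"
            boolean included = includes.isEmpty() || includes.contains(field.getKey());
            boolean excluded = excludes.contains(field.getKey());
            if (included && !excluded) {
                result.put(field.getKey(), field.getValue());
            }
        }
        return result;
    }
}
```

Applied to a document with name, age and sex, includes of name and age together with excludes of name leaves only age.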

es:

GET /index_learn_test/_search
{
  "_source": {
    "includes": [
      "name",
      "age"
    ],
    "excludes": [
      "name"
    ]
  }
}

Java code:

@Test
public void _source() throws IOException {
    SearchRequest searchRequest = new SearchRequest(getIndex());
    searchRequest.source(SearchSourceBuilder
            .searchSource().fetchSource(new String[]{"name","age"}, new String[]{"name"}));
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
    log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}

Sorting

Using SQL as the reference: sort by age descending, then by sex ascending.

SQL:

select * from index_learn_test order by age desc, sex asc

es:

GET /index_learn_test/_search
{
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "sex": {
        "order": "asc"
      }
    }
  ]
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
                .searchSource()
                .sort("age",SortOrder.DESC)
                .sort("sex",SortOrder.ASC));

Exact-match search

Get a single document by id

GET /index_learn_test/_doc/${id}

GET /index_learn_test/_doc/20

Get multiple documents by id

es:

GET /index_learn_test/_search
{
  "query": {
    "ids": {
      "values": [
        "35",
        "333"
      ]
    }
  }
}

Single exact value: term (scored)

Similar to = in MySQL.

es:

GET /index_learn_test/_search
{
  "query": {
    "term": {
      "name.keyword": {
        "value": "王正年2"
      }
    }
  }
}

Multiple exact values: terms (scored)

Similar to IN in MySQL.

es:

GET /index_learn_test/_search
{
  "query": {
    "terms": {
      "age": [
        26,
        27
      ]
    }
  }
}

Non-exact search

wildcard (scored)

Similar to LIKE in MySQL.
This example relies on the field being mapped with the wildcard type, as resume is in the mapping above.

es:

GET /index_learn_test/_search
{
  "query": {
    "wildcard": {
      "resume": {
        "wildcard": "*我是王*,大家*"
      }
    }
  }
}

Java code:

 public void wildcard()  throws IOException {
     SearchRequest searchRequest = new SearchRequest(getIndex());
     searchRequest.source(SearchSourceBuilder
             .searchSource()
             .query(QueryBuilders.wildcardQuery("resume", "*我是王*,大家*")));
     SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
     log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
 }

match (scored)

Full-text search on analyzed tokens. To search for a phrase, use match_phrase instead.

es:

   GET /index_learn_test/_search
   {
     "query": {
       "match": {
         "introduce": "齐,今年"
       }
     }
   }

Java code:

 @Test
 public void match()  throws IOException {
     SearchRequest searchRequest = new SearchRequest(getIndex());
     QueryBuilder queryBuilder = QueryBuilders.matchQuery("introduce", "齐,今年");
     searchRequest.source(SearchSourceBuilder
             .searchSource()
             .query(queryBuilder));
     SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
     log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
 }

match_phrase (scored)

Phrase matching; slop sets how many positions apart the matched terms are allowed to be.

es:

GET /index_learn_test/_search
{
  "query": {
   "match_phrase": {
     "introduce":{
       "query": "齐,今年",
       "slop": 3
     }
   }
  }
}

Java code:

    @Test
    public void matchPhrase()  throws IOException {
        SearchRequest searchRequest = new SearchRequest(getIndex());
        // slop matches the value used in the ES request above
        QueryBuilder queryBuilder = QueryBuilders.matchPhraseQuery("introduce", "齐,今年").slop(3);
        searchRequest.source(SearchSourceBuilder
                .searchSource()
                .query(queryBuilder));
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
        log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
    }

Compound queries: bool

filter (AND, not scored)

Unless you need relevance scores for ranking, use filter.

es:

    GET /index_learn_test/_search
    {
      "query": {
        "bool": {
          "filter": [
            {
              "range": {
                "age": {
                  "gte": 50,
                  "lte": 60
                }
              }
            },
            {
              "term": {
                "departmentId": "F"
              }
            }
          ]
        }
      },
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ]
    }

Java code:

    @Test
    public void filter()  throws IOException {
        SearchRequest searchRequest = new SearchRequest(getIndex());
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.filter(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
        boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId","F"));
        searchRequest.source(SearchSourceBuilder
                .searchSource()
                .query(boolQueryBuilder));
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
        log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
    }

must (AND, scored, slower than filter)

Unless you need relevance scores for ranking, use filter instead.

es:

    GET /index_learn_test/_search
    {
      "query": {
        "bool": {
          "must": [
            {
              "range": {
                "age": {
                  "gte": 50,
                  "lte": 60,
                  "include_lower": true,
                  "include_upper": true
                }
              }
            },
            {
              "term": {
                "departmentId": "F"
              }
            }
          ]
        }
      },
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ]
    }

Java code:

    @Test
    public void must()  throws IOException {
        SearchRequest searchRequest = new SearchRequest(getIndex());
        BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
        boolQueryBuilder.must(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
        boolQueryBuilder.must(QueryBuilders.termQuery("departmentId","F"));
        searchRequest.source(SearchSourceBuilder
                .searchSource()
                .query(boolQueryBuilder));
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
        log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
    }

must_not (NOT, not scored)

es:


    GET /index_learn_test/_search
    {
      "query": {
        "bool": {
          "must_not": [
            {
              "range": {
                "age": {
                  "gte": 50,
                  "lte": 60,
                  "include_lower": true,
                  "include_upper": true
                }
              }
            },
            {
              "term": {
                "departmentId": "F"
              }
            }
          ]
        }
      },
      "sort": [
        {
          "age": {
            "order": "desc"
          }
        }
      ]
    }

Java code:

@Test
public void mustNot()  throws IOException {
    SearchRequest searchRequest = new SearchRequest(getIndex());
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.mustNot(QueryBuilders.rangeQuery("age").from(50, true).to(60,true));
    boolQueryBuilder.mustNot(QueryBuilders.termQuery("departmentId","F"));
    searchRequest.source(SearchSourceBuilder
            .searchSource()
            .query(boolQueryBuilder));
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
    log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}

should (OR)

Matching should clauses contribute to the score. When scoring is not needed, wrap the bool in a filter, as the complex example later in this document does.

es:

    GET /index_learn_test/_search
    {
      "query": {
        "bool": {
          "should": [
            {
              "range": {
                "age": {
                  "gte": 40,
                  "lte": 50,
                  "include_lower": true,
                  "include_upper": true
                }
              }
            },
            {
              "term": {
                "departmentId": "F"
              }
            }
          ]
        }
      },
      "sort": [
        {
          "departmentId": {
            "order": "asc"
          }
        }
      ]
    }

Java code:

@Test
public void should()  throws IOException {
    SearchRequest searchRequest = new SearchRequest(getIndex());
    BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
    boolQueryBuilder.should(QueryBuilders.rangeQuery("age").from(40, true).to(50,true));
    boolQueryBuilder.should(QueryBuilders.termQuery("departmentId","F"));
    searchRequest.source(SearchSourceBuilder
            .searchSource()
            .query(boolQueryBuilder));
    SearchResponse searchResponse = restHighLevelClient.search(searchRequest,RequestOptions.DEFAULT);
    log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
}

A more complex example

Using SQL as the reference:

SQL:

 select * from index_learn_test 
 where departmentId = 'E' 
 and (resume like '%冉晶菊%' or age = 30)

es:

GET /index_learn_test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "term": {
            "departmentId": "E"
          }
        },
        {
          "bool": {
            "should":[
              {
                "term":{
                  "age":30
                }
              },
              {
                "wildcard":{
                  "resume":"*冉晶菊*"
                }
              }
            ]
          }
        }
        
      ]
    }
  }, 
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

Java code:

 @Test
 public void complexQuery() throws IOException {
     SearchRequest searchRequest = new SearchRequest(getIndex());
     BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
     // departmentId = 'E' and (resume like '%冉晶菊%' or age = 30)
     boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId", "E"));
     BoolQueryBuilder shouldBool = QueryBuilders.boolQuery();
     shouldBool.should(QueryBuilders.termQuery("age", 30));
     shouldBool.should(QueryBuilders.wildcardQuery("resume", "*冉晶菊*"));
     boolQueryBuilder.filter(shouldBool);
     searchRequest.source(SearchSourceBuilder
             .searchSource()
             .query(boolQueryBuilder));
     SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
     log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));
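 }

Aggregations

A query placed alongside an aggregation filters the set of documents that the aggregation runs over.

Grouping

size limits how many buckets are returned. Before grouping, get a rough idea of how many distinct groups your data contains: an excessively large size can exhaust memory.

Group by a single field (terms)

Similar to group by field1 in SQL.

In order, _count sorts the buckets by their document count, while _key sorts them by the value of the grouping field. The aggregation name in the examples (a Chinese phrase meaning "pick any name you like") is arbitrary.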
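As a mental model of what this grouping computes, the following plain-Java sketch counts values, orders the buckets either by count (descending) or by key, and keeps the first size buckets. It is an in-memory illustration only; TermsAggSketch and terms are hypothetical names, and a real terms aggregation runs per shard and can return approximate counts.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// In-memory model of a terms aggregation: group, count, order, truncate.
class TermsAggSketch {
    static List<Map.Entry<String, Long>> terms(List<String> values, int size, boolean byCountDesc) {
        // group by value and count the documents in each bucket
        Map<String, Long> counts = new HashMap<>();
        for (String value : values) {
            counts.merge(value, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> buckets = new ArrayList<>(counts.entrySet());

        Comparator<Map.Entry<String, Long>> byKey = Map.Entry.comparingByKey();
        Comparator<Map.Entry<String, Long>> order = byCountDesc
                // "_count": "desc", with the key as a tiebreaker
                ? Map.Entry.<String, Long>comparingByValue().reversed().thenComparing(byKey)
                // "_key": "asc"
                : byKey;
        buckets.sort(order);

        // size caps the number of buckets returned
        return new ArrayList<>(buckets.subList(0, Math.min(size, buckets.size())));
    }
}
```

Note that size is applied after ordering, so a different order can return a different set of buckets, not just the same buckets in another sequence.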

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "terms": {
        "field": "departmentId",
        "size": 10,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

Java:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(AggregationBuilders.terms("这是一个名字随便取")
                .field("departmentId")
                .size(10)
                // false = descending by document count, i.e. "_count": "desc"
                .order(BucketOrder.count(false))));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

Group by multiple fields (composite)

Similar to group by field1, field2 in SQL.

Note that order here sorts the grouping keys themselves, not the aggregated results.

es:

GET /index_learn_test/_search
{
  "size": 0,
  "aggs": {
    "这是一个名字随便取": {
      "composite": {
        "size": 10,
        "sources": [
          {
            "groupByAge": {
              "terms": {
                "field": "age",
                "order": "asc"
              }
            }
          },
          {
            "groupBySex": {
              "terms": {
                "field": "sex",
                "order": "asc"
              }
            }
          }
        ]
      }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
TermsValuesSourceBuilder groupByAge = new TermsValuesSourceBuilder("groupByAge").order(SortOrder.ASC).field("age");
TermsValuesSourceBuilder groupBySex = new TermsValuesSourceBuilder("groupBySex").order(SortOrder.ASC).field("sex");
CompositeAggregationBuilder compositeAggregationBuilder = AggregationBuilders.composite("这是一个名字随便取", Lists.newArrayList(groupByAge, groupBySex));
compositeAggregationBuilder.size(10);
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(compositeAggregationBuilder));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

Distinct count (cardinality)

Similar to count(distinct field) in SQL. The result is approximate.

es:

GET /index_learn_test/_search
{
  "size": 0,
  "aggs": {
    "这是一个名字随便取": {
       "cardinality": {
         "field": "name.keyword"
       }
    }
  }
}

Java code:

 SearchRequest searchRequest = new SearchRequest(getIndex());
 searchRequest.source(SearchSourceBuilder
         .searchSource()
         .size(0)
         .aggregation(AggregationBuilders.cardinality("这是一个名字随便取").field("name.keyword")));
 SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
 log.info("查询结果:{}",searchResponse.toString());

Value count (value_count)

Counts the values of a field, skipping documents in which the field is missing; similar to count(field) in SQL.

es:


GET /index_learn_test/_search
{
  "query": {
    "term": {
      "departmentId": {
        "value": "E"
      }
    }
  }, 
  "size": 0,
  "aggs": {
    "这是一个名字随便取": {
       "value_count": {
         "field": "name.keyword"
       }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        // same filter as the ES request above
        .query(QueryBuilders.termQuery("departmentId", "E"))
        .size(0)
        .aggregation(AggregationBuilders.count("这是一个名字随便取").field("name.keyword")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}", searchResponse.toString());

Maximum (max)

The maximum value of a field, similar to max(field) in SQL. The field must be mapped as a numeric type, which is why the examples use the integer sub-field age.number.

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "max": {
        "field": "age.number"
      }
    }
  }
}

Java code:


 SearchRequest searchRequest = new SearchRequest(getIndex());
 searchRequest.source(SearchSourceBuilder
         .searchSource()
         .size(0)
         .aggregation(AggregationBuilders.max("这是一个名字随便取").field("age.number")));
 SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
 log.info("查询结果:{}",searchResponse.toString());

Minimum (min)

The minimum value of a field, similar to min(field) in SQL. The field must be mapped as a numeric type.

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "min": {
        "field": "age.number"
      }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(AggregationBuilders.min("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());

Average (avg)

The average value of a field, similar to avg(field) in SQL. The field must be mapped as a numeric type.

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "avg": {
        "field": "age.number"
      }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(AggregationBuilders.avg("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());

Sum (sum)

The sum of a field's values, similar to sum(field) in SQL. The field must be mapped as a numeric type.

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "sum": {
        "field": "age.number"
      }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(AggregationBuilders.sum("这是一个名字随便取").field("age.number")));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
log.info("查询结果:{}",searchResponse.toString());

Filtered count (filter)

Counts a chosen subset of the documents that the search already matched, similar to this SQL: select sum(age), count(case when a = 1 then 1 end) from table where age = 30

es:

GET /index_learn_test/_search
{
  "size": 0, 
  "aggs": {
    "这是指标名字随便取": {
      "filter": {
        "bool": {
          "filter":[
            {
              "term":{
                "departmentId":"F"
              }
            },
            {
              "term":{
                "age":30
              }
            },
            {
              "term":{
                "sex":"女"
              }
            }  
          ]
        }
      }
    }
  }
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
BoolQueryBuilder boolQueryBuilder = QueryBuilders.boolQuery();
boolQueryBuilder.filter(QueryBuilders.termQuery("departmentId", "F"));
boolQueryBuilder.filter(QueryBuilders.termQuery("age", "30"));
boolQueryBuilder.filter(QueryBuilders.termQuery("sex", "女"));
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(0)
        .aggregation(AggregationBuilders.filter("这是一个名字随便取",boolQueryBuilder)));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
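log.info("查询结果:{}",searchResponse.toString());

Pagination

Paginating ordinary searches

from + size (jumping to any page is allowed)

Suited to on-screen paging. You can jump straight to any page, but deep pages are slow, and from + size may not exceed the index's max_result_window setting, which defaults to 10000.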
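A client can check this limit before sending the request. The helper below is a hypothetical sketch (FromSizePagingSketch and fromForPage are names invented here, and the 1-based page number is an assumption about how the calling UI counts pages); it converts a page number into a from offset and rejects windows deeper than max_result_window.

```java
// Hypothetical client-side helper: page number -> from offset, with a depth check.
class FromSizePagingSketch {
    static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    /** Returns the from offset for a 1-based page, or throws if the window is too deep. */
    static int fromForPage(int page, int size, int maxResultWindow) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be >= 1");
        }
        long from = (long) (page - 1) * size;   // long arithmetic avoids int overflow
        if (from + size > maxResultWindow) {
            throw new IllegalArgumentException(
                    "from + size = " + (from + size) + " exceeds max_result_window = " + maxResultWindow);
        }
        return (int) from;
    }
}
```

With the default window and 10 hits per page, page 1000 is the last page this check accepts; anything deeper is a case for search_after or scroll.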

max_result_window:

// View the setting
GET /index_learn_test/_settings?include_defaults=true

// Change the setting
PUT index_learn_test/_settings
{
  "index":{
    "max_result_window":30000
  }
}

es:

GET /index_learn_test/_search
{
  "from": 0, 
  "size":  10,
  "sort": [
    {
      "id": {
        "order": "desc"
      }
    }
  ]
}

Java code:

SearchRequest searchRequest = new SearchRequest(getIndex());
searchRequest.source(SearchSourceBuilder
        .searchSource()
        .size(10).from(0)
        .sort("id", SortOrder.DESC));
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
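log.info("查询结果:{}", JsonUtils.toJsonString(searchResponse.getHits().getHits()));

Deep pagination with search_after (no jumping between pages)

Similar to select * from table where id > 123 order by id asc limit 10 in SQL.

  • The sort must include a field with unique values as a tiebreaker.
  • Jumping to an arbitrary page is not possible.
  • Each page reflects the current, real-time state of the index.
  • Save the sort values of the last hit on each page.
  • Send those sort values back as search_after, unchanged and in the same order as in the previous response.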
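The cursor idea behind these rules can be shown without a cluster. The sketch below pages through an in-memory list using the same sort as the example (age descending, then id descending as the unique tiebreaker). SearchAfterSketch, Doc and page are hypothetical names for this illustration; the real search_after works on the sort values returned with each hit.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// In-memory model of cursor-based paging in the style of search_after.
class SearchAfterSketch {
    // Only the two sort fields matter for the illustration.
    static final class Doc {
        final int age;
        final long id;

        Doc(int age, long id) {
            this.age = age;
            this.id = id;
        }
    }

    // age desc, then id desc; id is unique, so the order is total
    static final Comparator<Doc> ORDER =
            Comparator.comparingInt((Doc d) -> d.age).reversed()
                    .thenComparing(Comparator.comparingLong((Doc d) -> d.id).reversed());

    /** Returns up to size docs that sort strictly after the cursor (null means first page). */
    static List<Doc> page(List<Doc> all, Doc after, int size) {
        List<Doc> sorted = new ArrayList<>(all);
        sorted.sort(ORDER);
        List<Doc> result = new ArrayList<>();
        for (Doc doc : sorted) {
            if (after != null && ORDER.compare(doc, after) <= 0) {
                continue;  // at or before the cursor: already delivered on an earlier page
            }
            result.add(doc);
            if (result.size() == size) {
                break;
            }
        }
        return result;
    }
}
```

The last element of each page becomes the cursor for the next call. Without the unique tiebreaker, documents sharing the same age could be skipped or repeated at a page boundary.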

es:

GET /index_learn_test/_search
{
  "from": 0,
  "size": 10,
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    },
    {
      "id": {
        "order": "desc"
      }
    }
  ],
  "search_after": [
    "59",
    25335465
  ]
}

Java:


    @Test
    public void pageSearchAfter() throws IOException {
        // Fetch the first page, then use its last sort values to fetch the second page.
        pageSearchAfter(pageSearchAfter(null));
    }

    public Object[] pageSearchAfter(Object[] sort) throws IOException {
        SearchRequest searchRequest = new SearchRequest(getIndex());
        SearchSourceBuilder searchSourceBuilder = SearchSourceBuilder
                .searchSource()
                .size(10).from(0)
                .sort("age", SortOrder.DESC)
                .sort("id", SortOrder.DESC);
        // A null sort means "first page"; otherwise continue after the given sort values.
        if (Objects.nonNull(sort)) {
            searchSourceBuilder.searchAfter(sort);
        }
        searchRequest.source(searchSourceBuilder);
        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
        SearchHit[] hits = searchResponse.getHits().getHits();
        log.info("Search result: {}", JsonUtils.toJsonString(hits));
        // No hits left: there is no sort value to continue from.
        if (hits.length == 0) {
            return null;
        }
        // Return the sort values of the last hit; they become the next search_after.
        return hits[hits.length - 1].getSortValues();
    }
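
The cursor behaviour of search_after can be illustrated without a cluster. The sketch below is a simplified in-memory model, not the client API: the class SearchAfterSketch, the type Doc, and the sample data are all hypothetical, and age is modelled as a string because the mapping declares it as keyword. Each page contains only the documents that sort strictly after the last hit of the previous page.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SearchAfterSketch {
    // Hypothetical in-memory stand-in for an indexed document.
    static final class Doc {
        final String age; // keyword field, so it sorts as a string
        final long id;    // unique value used as the tiebreaker
        Doc(String age, long id) { this.age = age; this.id = id; }
    }

    // Same ordering as the query: age descending, then id descending.
    static final Comparator<Doc> ORDER =
            Comparator.comparing((Doc d) -> d.age).reversed()
                    .thenComparing(Comparator.comparingLong((Doc d) -> d.id).reversed());

    // Returns up to `size` docs that sort strictly after `cursor` (null means first page).
    static List<Doc> page(List<Doc> all, Doc cursor, int size) {
        List<Doc> sorted = new ArrayList<>(all);
        sorted.sort(ORDER);
        List<Doc> out = new ArrayList<>();
        for (Doc d : sorted) {
            // Skip everything at or before the cursor position.
            if (cursor != null && ORDER.compare(d, cursor) <= 0) continue;
            if (out.size() == size) break;
            out.add(d);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
                new Doc("59", 3), new Doc("30", 2), new Doc("59", 1),
                new Doc("18", 4), new Doc("30", 5));
        Doc cursor = null;
        while (true) {
            List<Doc> hits = page(docs, cursor, 2);
            if (hits.isEmpty()) break;           // no more pages
            for (Doc d : hits) System.out.println(d.age + " / " + d.id);
            cursor = hits.get(hits.size() - 1);  // last hit becomes the next cursor
        }
    }
}
```

Because two documents can share the same age, the unique id is what keeps the cursor unambiguous; without it, documents tied with the last hit could be skipped or repeated.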
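Deep pagination with scroll (for data that does not need to be real-time; no jumping to arbitrary pages)

Suitable for exports (though not recommended). This method is very resource-intensive, so release the scroll as soon as you are done with it.

  1. Get the scroll_id
  • 10m keeps the scroll_id alive for ten minutes
  • from must be 0
  • Save the _scroll_id from the response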
GET /index_learn_test/_search?scroll=10m
{
  "from": 0, 
  "size":  10,
  "sort": [
    {
      "id": {
        "order": "desc"
      }
    }
  ]
}
  2. Fetch the next page with the scroll_id
GET /_search/scroll
{
  "scroll_id" : "FGluY2x1……",
  "scroll": "10m"
}
  3. Release the resources with the scroll_id
DELETE _search/scroll/${scroll_id}


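Pagination after aggregation

from + size pagination (supports jumping to arbitrary pages)

Deep pagination is inefficient.

  • The composite size sets how many buckets are available for pagination; it should be no smaller than the from + size used further down, otherwise no data is returned
  • size must not be too large or the request fails; the default is 10 and the maximum is 65535
  • The sort in bucket_sort only orders the buckets inside that size window. If you need sorting and there is only one grouping field, use a terms aggregation instead: it sorts the buckets first and then returns the number of buckets given by size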
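
The interaction between the composite size and the bucket_sort from and size can be modelled in plain Java. This is a simplified sketch under stated assumptions, not the client API: the class BucketSortSketch and the sample counts are hypothetical, and each bucket is reduced to its document count. The composite step keeps the first compositeSize buckets in key order, and only those buckets are then sorted and sliced.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class BucketSortSketch {
    // counts: per-bucket document counts, listed in composite key order (sample data).
    static List<Integer> page(List<Integer> counts, int compositeSize, int from, int size) {
        // Step 1: composite returns only the first compositeSize buckets in key order.
        int limit = Math.min(compositeSize, counts.size());
        List<Integer> window = new ArrayList<>(counts.subList(0, limit));
        // Step 2: bucket_sort orders only the buckets in that window (count descending here).
        window.sort(Collections.reverseOrder());
        // Step 3: bucket_sort applies from and size to the sorted window.
        int start = Math.min(from, window.size());
        int end = Math.min(start + size, window.size());
        return new ArrayList<>(window.subList(start, end));
    }

    public static void main(String[] args) {
        List<Integer> counts = Arrays.asList(5, 1, 9, 3, 7, 2);
        // Window of 4 buckets: the bucket with count 7 is never considered for sorting.
        System.out.println(page(counts, 4, 0, 2));
        // from + size exceeds the window, so nothing is left to return.
        System.out.println(page(counts, 4, 4, 2));
        // Window covers all buckets, so the last page is available.
        System.out.println(page(counts, 6, 4, 2));
    }
}
```

In the first call the second-largest bucket overall is missing from the result, which is the reason a single-field grouping is better served by a terms aggregation that sorts before truncating.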

es:

GET /index_learn_test/_search
{
  "size": 0,
  "aggs": {
    "这是一个名字随便取": {
      "composite": {
        "size": 1000, 
        "sources": [
          {
            "groupByAge": {
              "terms": {
                "field": "age",
                "order": "asc"
              }
            }
          },
          {
            "groupBySex": {
              "terms": {
                "field": "sex",
                "order": "asc"
              }
            }
          }
        ]
      },
      "aggs": {
        "统计数量":{
          "value_count": {
            "field": "id"
          }
        },
        "聚合分页": {
          "bucket_sort": {
            "sort": [{"统计数量":{"order":"desc"}}],
            "from": 20,
            "size": 10
          }
        }
      }
    }
  }
}

Java code:

  SearchRequest searchRequest = new SearchRequest(getIndex());
  // Composite sources: group by age, then by sex, both in ascending key order
  TermsValuesSourceBuilder groupByAge = new TermsValuesSourceBuilder("groupByAge").order(SortOrder.ASC).field("age");
  TermsValuesSourceBuilder groupBySex = new TermsValuesSourceBuilder("groupBySex").order(SortOrder.ASC).field("sex");
  // The aggregation name is arbitrary
  CompositeAggregationBuilder compositeAggregationBuilder = AggregationBuilders.composite("这是一个名字随便取", Lists.newArrayList(groupByAge, groupBySex));
  // Number of buckets the composite aggregation returns
  compositeAggregationBuilder.size(1000);

  // Metric to sort by: the number of id values in each bucket
  AggregationBuilder sortField = AggregationBuilders.count("统计数量").field("id");
  compositeAggregationBuilder.subAggregation(sortField);

  // Sort the buckets by that metric in descending order
  List<FieldSortBuilder> fieldSortBuilders = new ArrayList<>();
  fieldSortBuilders.add(new FieldSortBuilder("统计数量").order(SortOrder.DESC));

  // Pagination: from + size must not exceed the composite size set above
  BucketSortPipelineAggregationBuilder pipelineAggregationBuilder = new BucketSortPipelineAggregationBuilder("聚合分页", fieldSortBuilders);
  pipelineAggregationBuilder.from(100);
  pipelineAggregationBuilder.size(10);
  compositeAggregationBuilder.subAggregation(pipelineAggregationBuilder);

  // size(0) suppresses the document hits so that only the aggregation is returned
  searchRequest.source(SearchSourceBuilder
          .searchSource()
          .size(0)
          .aggregation(compositeAggregationBuilder));
  SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
  log.info("Search result: {}", searchResponse.toString());

Miscellaneous

Check disk space usage

GET /_cat/allocation?v