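Aggregations are generally used for statistical analysis of data, similar to GROUP BY in MySQL.
Aggregations rest on two basic concepts: buckets and metrics.
A bucket groups documents according to some criterion, and each group is called a bucket. Grouping phones by brand, for example, yields a Xiaomi (小米) bucket and a Huawei (华为) bucket. Commonly used bucket aggregations: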
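Date Histogram Aggregation: groups by date interval; with an interval of one week, for example, documents are automatically grouped week by week
Histogram Aggregation: groups by numeric interval, analogous to the date version
Terms Aggregation: groups by term; documents whose term matches exactly fall into the same group
Range Aggregation: groups numeric or date values into explicitly specified ranges, each with a start and an end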
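As this shows, ES offers flexible grouping out of the box. A plain MySQL GROUP BY on a column corresponds to the Terms Aggregation; interval and range grouping, which in SQL require hand-written expressions, are built into ES.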
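Metrics are similar to MySQL functions such as AVG and MAX; they compute values such as the average or the maximum within each group.
Commonly used metric aggregations:
Avg Aggregation: average
Max Aggregation: maximum
Min Aggregation: minimum
Percentiles Aggregation: percentile values (for example the median or the 95th percentile)
Stats Aggregation: returns avg, max, min, sum, and count together
Sum Aggregation: sum
Top hits Aggregation: the top matching documents in each bucket
Value Count Aggregation: the number of values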
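To make the metric list more concrete, here is a minimal plain-Java analogy for what the Stats Aggregation returns. It does not call Elasticsearch; the class name `StatsMetricSketch` and the three prices are assumptions chosen for illustration.

```java
import java.util.DoubleSummaryStatistics;
import java.util.stream.DoubleStream;

class StatsMetricSketch {

    /** Computes count, min, max, average and sum in one pass, like a stats metric. */
    static DoubleSummaryStatistics stats(double... prices) {
        return DoubleStream.of(prices).summaryStatistics();
    }

    public static void main(String[] args) {
        // Hypothetical prices of the documents inside one bucket.
        DoubleSummaryStatistics s = stats(3500, 4500, 5500);
        System.out.printf("count=%d min=%.1f max=%.1f avg=%.1f sum=%.1f%n",
                s.getCount(), s.getMin(), s.getMax(), s.getAverage(), s.getSum());
    }
}
```

Avg, Max, Min, Sum, and Value Count each correspond to a single field of this summary.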
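Let's start with the simplest bucket, the terms bucket. In the query below, brand_aggs is a user-defined name for the bucket, terms selects the terms bucket type, and "field" : "brand" buckets documents by the brand field. Setting size to 0 means no search hits are returned. This also shows that paging does not affect aggregation results: aggregations are computed over every document that matches the query, so a paged search and its aggregation results can be returned in the same response.
The following query groups and counts documents by brand name: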
- GET /goods/_search
- {
- "size" : 0,
- "aggs" : {
- "brand_aggs" : {
- "terms" : {
- "field" : "brand"
- }
- }
- }
- }
Query result:
- {
- "took" : 3,
- "timed_out" : false,
- "_shards" : {
- "total" : 3,
- "successful" : 3,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 5,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- },
- "aggregations" : {
- "brand_aggs" : { //bucket name
- "doc_count_error_upper_bound" : 0,
- "sum_other_doc_count" : 0,
- "buckets" : [ //aggregation result
- {
- "key" : "华为", //brand name, since we grouped by brand
- "doc_count" : 3 //number of documents in this bucket
- },
- {
- "key" : "小米",
- "doc_count" : 2
- }
- ]
- }
- }
- }
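What the terms bucket computes can be sketched in plain Java, without calling Elasticsearch. The class name `TermsBucketSketch` and the sample brand values are assumptions; "Huawei" and "Xiaomi" stand in for the 华为 and 小米 keys of the response.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TermsBucketSketch {

    /** Counts documents per exact term value, like doc_count in a terms bucket. */
    static Map<String, Long> countByTerm(List<String> terms) {
        // Insertion order is kept here; ES itself sorts buckets by doc_count descending by default.
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (String term : terms) {
            buckets.merge(term, 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical brand field values of the five documents in the goods index.
        List<String> brands = Arrays.asList("Huawei", "Xiaomi", "Huawei", "Huawei", "Xiaomi");
        countByTerm(brands).forEach((brand, count) -> System.out.println(brand + " doc_count=" + count));
    }
}
```

Only values that match exactly share a bucket, which is why the brand field must not be split into several tokens for this grouping to be meaningful.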

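As you can see, the document count of each bucket is returned by default, without adding any metric. To get the average phone price per brand, a metric has to be added: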
- GET /goods/_search
- {
- "size" : 0,
- "aggs" : {
- "brand_aggs" : {
- "terms" : {
- "field" : "brand"
- },
- "aggs":{
- "avg_price": {
- "avg": {
- "field": "price"
- }
- }
- }
- }
- }
- }

Result:
- {
- "took" : 1,
- "timed_out" : false,
- "_shards" : {
- "total" : 3,
- "successful" : 3,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 5,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- },
- "aggregations" : {
- "brand_aggs" : {
- "doc_count_error_upper_bound" : 0,
- "sum_other_doc_count" : 0,
- "buckets" : [
- {
- "key" : "华为",
- "doc_count" : 3,
- "avg_price" : {
- "value" : 4500.0
- }
- },
- {
- "key" : "小米",
- "doc_count" : 2,
- "avg_price" : {
- "value" : 5000.0
- }
- }
- ]
- }
- }
- }
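The nesting of a metric inside a bucket can be pictured as "group first, then compute per group". The following plain-Java sketch shows that idea only; the `Goods` class and the five sample prices are assumptions, picked so that the averages agree with the response above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

class AvgPerBucketSketch {

    /** Minimal stand-in for a document in the goods index (hypothetical type). */
    static final class Goods {
        final String brand;
        final double price;

        Goods(String brand, double price) {
            this.brand = brand;
            this.price = price;
        }
    }

    /** Hypothetical sample data, not taken from the original index. */
    static List<Goods> sampleGoods() {
        return Arrays.asList(
                new Goods("Huawei", 3500), new Goods("Huawei", 4500), new Goods("Huawei", 5500),
                new Goods("Xiaomi", 4500), new Goods("Xiaomi", 5500));
    }

    /** Buckets by brand, then applies the avg metric inside each bucket. */
    static Map<String, Double> avgPriceByBrand(List<Goods> goods) {
        return goods.stream().collect(Collectors.groupingBy(
                g -> g.brand, TreeMap::new, Collectors.averagingDouble(g -> g.price)));
    }

    public static void main(String[] args) {
        avgPriceByBrand(sampleGoods())
                .forEach((brand, avg) -> System.out.println(brand + " avg_price=" + avg));
    }
}
```

The outer grouping plays the role of brand_aggs and the downstream collector plays the role of avg_price.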

Code implementation:
- public void testAggs() {
- AbstractAggregationBuilder aggregationBuilder = AggregationBuilders.terms("brand_aggs").field("brand"); //bucket by brand
- aggregationBuilder.subAggregation(AggregationBuilders.avg("avg_price").field("price")); //avg metric: average of price within each bucket
- NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
- .withPageable(PageRequest.of(0, 1)) //PageRequest requires a page size of at least 1
- .addAggregation(aggregationBuilder)
- .build();
- SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
- Terms brandTerms = goodsInfos.getAggregations().get("brand_aggs");
- brandTerms.getBuckets().forEach(bucket -> {
- System.out.println(bucket.getKey()); //brand name
- System.out.println(bucket.getDocCount()); //number of documents in the bucket
- ParsedAvg avgPrice = bucket.getAggregations().get("avg_price"); //average price of the bucket
- System.out.println(avgPrice.getValue());
- });
- }

The following example counts phones per price tier, using a step of 500:
- GET /goods/_search
- {
- "size":0,
- "aggs":{
- "price_histogram":{
- "histogram": {
- "field": "price",
- "interval": 500
- }
- }
- }
- }
Result:
- {
- "took" : 103,
- "timed_out" : false,
- "_shards" : {
- "total" : 3,
- "successful" : 3,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 5,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- },
- "aggregations" : {
- "price_histogram" : {
- "buckets" : [
- {
- "key" : 3500.0,
- "doc_count" : 1
- },
- {
- "key" : 4000.0,
- "doc_count" : 0
- },
- {
- "key" : 4500.0,
- "doc_count" : 2
- },
- {
- "key" : 5000.0,
- "doc_count" : 0
- },
- {
- "key" : 5500.0,
- "doc_count" : 2
- }
- ]
- }
- }
- }
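The bucket keys in this response follow from rounding each value down to a multiple of the interval, and empty buckets between the lowest and highest key are filled in because min_doc_count defaults to 0 for histograms. A minimal plain-Java sketch of that rule, assuming an offset of 0; the class name and the sample prices are assumptions chosen to be consistent with the counts above.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

class HistogramBucketSketch {

    /** Bucket key of a value: the value rounded down to a multiple of the interval. */
    static double bucketKey(double value, double interval) {
        return Math.floor(value / interval) * interval;
    }

    /** Counts values per bucket and fills empty buckets between the lowest and highest key. */
    static Map<Double, Long> histogram(double[] values, double interval) {
        // Index n stands for the bucket [n * interval, (n + 1) * interval).
        TreeMap<Long, Long> countsByIndex = new TreeMap<>();
        for (double v : values) {
            countsByIndex.merge((long) Math.floor(v / interval), 1L, Long::sum);
        }
        Map<Double, Long> buckets = new LinkedHashMap<>();
        if (countsByIndex.isEmpty()) {
            return buckets;
        }
        for (long i = countsByIndex.firstKey(); i <= countsByIndex.lastKey(); i++) {
            buckets.put(i * interval, countsByIndex.getOrDefault(i, 0L));
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical prices of the five documents in the goods index.
        double[] prices = {3500, 4500, 5500, 4500, 5500};
        histogram(prices, 500).forEach((key, count) -> System.out.println(key + " doc_count=" + count));
    }
}
```

Setting min_doc_count to 1 in the real query would drop the empty 4000 and 5000 buckets.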

Code:
- public void testHistogram() {
- AbstractAggregationBuilder aggregationBuilder = AggregationBuilders.histogram("price_histogram").field("price").interval(500); //one bucket per step of 500
- NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
- .withPageable(PageRequest.of(0, 1)) //PageRequest requires a page size of at least 1
- .addAggregation(aggregationBuilder)
- .build();
- SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
- ParsedHistogram priceHistogram = goodsInfos.getAggregations().get("price_histogram");
- priceHistogram.getBuckets().forEach(bucket -> {
- System.out.println(bucket.getKey()); //lower bound of the step
- System.out.println(bucket.getDocCount()); //number of documents in the bucket
- });
- }
The next example counts the phones priced from 4000 up to (but not including) 6000:
- GET /goods/_search
- {
- "size": 0,
- "aggs": {
- "price_range": {
- "range": {
- "field": "price",
- "ranges": [
- {
- "from": 4000,
- "to": 6000
- }
- ]
- }
- }
- }
- }

Result:
- {
- "took" : 1,
- "timed_out" : false,
- "_shards" : {
- "total" : 3,
- "successful" : 3,
- "skipped" : 0,
- "failed" : 0
- },
- "hits" : {
- "total" : {
- "value" : 5,
- "relation" : "eq"
- },
- "max_score" : null,
- "hits" : [ ]
- },
- "aggregations" : {
- "price_range" : {
- "buckets" : [
- {
- "key" : "4000.0-6000.0",
- "from" : 4000.0,
- "to" : 6000.0,
- "doc_count" : 4
- }
- ]
- }
- }
- }
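A range bucket includes its from value and excludes its to value. The plain-Java sketch below illustrates that boundary rule; the class name and the sample prices are assumptions chosen to be consistent with the count above.

```java
import java.util.Arrays;

class RangeBucketSketch {

    /** True when value lies in [from, to): from is inclusive, to is exclusive. */
    static boolean inRange(double value, double from, double to) {
        return value >= from && value < to;
    }

    /** Number of values that fall into the half-open range [from, to). */
    static long countInRange(double[] values, double from, double to) {
        return Arrays.stream(values).filter(v -> inRange(v, from, to)).count();
    }

    public static void main(String[] args) {
        // Hypothetical prices of the five documents in the goods index.
        double[] prices = {3500, 4500, 5500, 4500, 5500};
        System.out.println("4000.0-6000.0 doc_count=" + countInRange(prices, 4000, 6000));
    }
}
```

A phone priced at exactly 6000 would therefore not be counted in this bucket, while one priced at exactly 4000 would.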

Code:
- public void testRangeAggrs() {
- AbstractAggregationBuilder aggregationBuilder = AggregationBuilders.range("price_range").field("price").addRange(4000, 6000); //count documents in the range [4000, 6000)
- NativeSearchQuery nativeSearchQuery = new NativeSearchQueryBuilder()
- .withPageable(PageRequest.of(0, 1)) //PageRequest requires a page size of at least 1
- .addAggregation(aggregationBuilder)
- .build();
- SearchHits<GoodsInfo> goodsInfos = elasticsearchRestTemplate.search(nativeSearchQuery, GoodsInfo.class);
- ParsedRange priceRange = goodsInfos.getAggregations().get("price_range");
- priceRange.getBuckets().forEach(bucket -> {
- System.out.println(bucket.getKey()); //bucket key, e.g. 4000.0-6000.0
- System.out.println(bucket.getDocCount()); //number of documents in the bucket
- });
- }
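The last example uses a date histogram to count the documents of a cars index per month, based on the sold field. Note that the interval parameter is deprecated since Elasticsearch 7.2 in favor of calendar_interval and fixed_interval.
- GET /cars/_search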
- {
- "size":0,
- "aggs" : {
- "date" : {
- "date_histogram" : {
- "field" : "sold",
- "interval" : "1M",
- "format" : "yyyy-MM",
- "time_zone": "+08:00",
- "min_doc_count": 1
- }
- }
- }
- }
Result:
- "aggregations" : {
- "date" : {
- "buckets" : [
- {
- "key_as_string" : "2013-12",
- "key" : 1385859600000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-02",
- "key" : 1391216400000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-05",
- "key" : 1398906000000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-07",
- "key" : 1404176400000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-08",
- "key" : 1406854800000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-10",
- "key" : 1412125200000,
- "doc_count" : 1
- },
- {
- "key_as_string" : "2014-11",
- "key" : 1414803600000,
- "doc_count" : 2
- }
- ]
- }
- }
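How the time_zone and format parameters shape the month buckets can be sketched in plain Java with java.time. This is only an illustration of the grouping idea, not the ES implementation; the class name and the sample timestamps are assumptions and are not the data of the cars index.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.Map;
import java.util.TreeMap;

class MonthlyBucketSketch {

    private static final DateTimeFormatter KEY_FORMAT = DateTimeFormatter.ofPattern("yyyy-MM");

    /** Month key of a timestamp as seen in the given time zone offset. */
    static String monthKey(Instant instant, ZoneOffset zone) {
        return KEY_FORMAT.format(instant.atOffset(zone));
    }

    /** Counts timestamps per calendar month; months without documents never appear. */
    static Map<String, Long> countByMonth(Instant[] soldDates, ZoneOffset zone) {
        Map<String, Long> buckets = new TreeMap<>();
        for (Instant sold : soldDates) {
            buckets.merge(monthKey(sold, zone), 1L, Long::sum);
        }
        return buckets;
    }

    public static void main(String[] args) {
        // Hypothetical sale timestamps in UTC.
        Instant[] sold = {
            Instant.parse("2014-01-31T20:00:00Z"),
            Instant.parse("2014-02-10T08:00:00Z"),
            Instant.parse("2014-05-18T12:00:00Z")
        };
        countByMonth(sold, ZoneOffset.ofHours(8))
                .forEach((month, count) -> System.out.println(month + " doc_count=" + count));
    }
}
```

The first timestamp falls in January in UTC but in February at +08:00, so the time zone decides which bucket a document near a month boundary lands in. Because only months that occur are added to the map, the output resembles a query with min_doc_count set to 1.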
