ElasticSearch Query Study Notes, Chapter 4: cardinality, range, and extended_stats aggregations

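ElasticSearch Query Notes: Contents

  The commonly used queries cover a lot of ground, so the notes are split into several chapters:

  1. ElasticSearch Query Study Notes, Chapter 1: term, terms, match, and id queries

   These queries match on exact conditions, run fast, and are the most commonly used kinds:

  • term query
  • terms query
  • match_all query
  • match query
  • boolean match query
  • multi_match query
  • query by document id (single id)
  • query by document ids (multiple ids)
  2. ElasticSearch Query Study Notes, Chapter 2: prefix, fuzzy, wildcard, range, and regexp queries

  These queries match on looser conditions and are comparatively slow, so avoid them in real-time queries where possible. They do offer capabilities nothing else provides, though, so they remain necessary:

  • prefix query
  • fuzzy query
  • wildcard query
  • range query
  • regexp query
  3. ElasticSearch Query Study Notes, Chapter 3: scroll, delete-by-query, bool, boosting, filter, and highlight queries

  Miscellaneous commonly used ElasticSearch queries:

  • scroll query (deep paging)
  • delete-by-query
  • bool query
  • boosting query
  • filter query
  • highlight query
  4. ElasticSearch Query Study Notes, Chapter 4: cardinality, range, and extended_stats aggregations

  ES aggregation queries (Aggregations):

  • cardinality (distinct count) query
  • range (range statistics) query
  • extended_stats (extended statistics) query
  5. ElasticSearch Query Study Notes, Chapter 5: geo_distance, geo_bounding_box, and geo_polygon geo queries

  ES map-search (geo) queries:

  • geo_distance query
  • geo_bounding_box query
  • geo_polygon query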

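Java test project for the whole series

  The Java code for all chapters is available as the CSDN resource "ElasticSearch常用查询的Java实现" (Java implementations of common ElasticSearch queries); feel free to download it. The project layout is shown in the figure below.

[Figure: project directory layout]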

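Preface

  This chapter covers aggregation queries, which are very important. ES aggregations resemble MySQL's aggregate functions but are far more powerful, and ES offers many different ways to compute statistics. Only a few commonly used ones are demonstrated here; see the Aggregations section of the official documentation for more.
  The RESTful syntax of an ES aggregation query is as follows: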

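POST /index/_search   # Same as a normal query: usually POST or GET, followed by the index name and _search, separated by /
{
	"aggs":{       # Unlike a normal query, the body uses aggs (short for aggregations) instead of query
	    "agg_name":  # A name of your choosing for this aggregation; these notes usually use agg
	    {
	        "agg_type":  # The built-in aggregation type to use
	        {
	            "property":"value"   # The properties and values this aggregation type needs
	        }
	    }
	}
}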

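cardinality (distinct count) query

cardinality computes a distinct count, similar to MySQL's count(distinct field). First, the values of the specified field across the matching documents are de-duplicated; second, the number of distinct values that remain is counted.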
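
As a rough illustration of those two steps, here is a minimal sketch in plain Java that does not use Elasticsearch at all: it de-duplicates with a HashSet and counts what is left. The class name DistinctCountSketch and the hard-coded list are assumptions made for this example; the values stand in for the province fields of the ten hits in the sample response of this section, romanized (Hangzhou for 杭州, Beijing for 北京, Shanghai for 上海). Elasticsearch itself uses an approximate algorithm rather than an exact set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DistinctCountSketch {

    // Step 1: de-duplicate the values; step 2: count what is left.
    static long distinctCount(List<String> values) {
        Set<String> unique = new HashSet<>(values);
        return unique.size();
    }

    public static void main(String[] args) {
        // Hypothetical hard-coded copy of the province values of the ten sample hits
        List<String> provinces = Arrays.asList(
                "Hangzhou", "Beijing", "Hangzhou", "Shanghai", "Shanghai",
                "Shanghai", "Hangzhou", "Hangzhou", "Beijing", "Hangzhou");

        // Three distinct cities, matching "value" : 3 in the sample response
        System.out.println(distinctCount(provinces));
    }
}
```

An exact set like this needs memory proportional to the number of distinct values, which is why an approximate counter is the more practical choice on large indices.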

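  Requirement: use the province field to find out how many distinct cities the companies are located in, that is, how many cities remain after de-duplicating province.

  The RESTful request is:

# cardinality query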
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "cardinality": {
        "field": "province"
      }
    }
  }
}

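  Note: the strength of cardinality is speed; a count over hundreds of millions of records can finish within about a second. The price is that the result is approximate (the aggregation is based on the HyperLogLog++ algorithm). Counts below precision_threshold are expected to be close to exact, while above it the error grows (commonly quoted as within about 5%), so cardinality is not suitable where exact de-duplication is required. precision_threshold accepts a number between 0 and 40000; larger values are treated as 40000. The higher the threshold, the more memory is used. An example, which also sets "size": 0 so that no hits are returned: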

GET /myindex/_search
{
    "size" : 0,
    "aggs" : {
        "distinct_idCard" : {
            "cardinality" : {
              "field" : "idCard",
              "precision_threshold" : 100 
            }
        }
    }
}

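  The response to the cardinality request on sms-logs-index is shown below. The result sits in the aggregations object at the end of the response, at the same level as the outer hits object. The agg key inside it is the name we gave this aggregation, so it changes if the name changes; the value of 3 is the final result.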

{
  "took" : 14,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5478434123",
          "moblie" : 18056587445,
          "corpName" : "中威集团",
          "smsContent" : "中威集团,服务于你的身边!",
          "state" : "0",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.248.19.45",
          "replyTotal" : "4",
          "fee" : "20"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "24514635",
          "moblie" : 18545427895,
          "corpName" : "东东集团",
          "smsContent" : "数据驱动,AI推动,新零售模型让你的购买更心怡!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "北京",
          "ipAddr" : "10.254.19.45",
          "replyTotal" : "1",
          "fee" : "6000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "11",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-22",
          "senDate" : "2020-09-22",
          "longCode" : "458744536",
          "moblie" : 134625584654,
          "corpName" : "星雨文化传媒",
          "smsContent" : "魅力宣传,星雨传媒!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.289.19.45",
          "replyTotal" : "6",
          "fee" : "500"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5784320",
          "moblie" : 15236964578,
          "corpName" : "花花派",
          "smsContent" : "花开花落,魅力女性,买花选我!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.265.19.45",
          "replyTotal" : "1",
          "fee" : "0.1"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "87454120",
          "moblie" : 13625789645,
          "corpName" : "爱美化妆品有限公司",
          "smsContent" : "魅力,势不可挡,爱美爱美",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.258.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "33656412674",
          "moblie" : 18956451203,
          "corpName" : "华丽网集团",
          "smsContent" : "网络安全,华丽靠谱!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "上海",
          "ipAddr" : "10.215.254.45",
          "replyTotal" : "1",
          "fee" : "2000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "56412345",
          "moblie" : 17055452369,
          "corpName" : "万事Ok公司",
          "smsContent" : "万事Ok,找我没错!",
          "state" : "0",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "54784641",
          "moblie" : 15625584654,
          "corpName" : "勾股科技有限公司",
          "smsContent" : "智能算法,智慧生活,勾股科技!",
          "state" : "1",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "6",
          "fee" : "4000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "20165411010",
          "moblie" : 15248754897,
          "corpName" : "北京鑫鑫能源有限公司",
          "smsContent" : "欢迎使用新能源,让世界更环保",
          "state" : "1",
          "opratorId" : "2",
          "province" : "北京",
          "ipAddr" : "10.245.29.280",
          "replyTotal" : "0.6",
          "fee" : "0.5"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "89451254",
          "moblie" : 13028457893,
          "corpName" : "大兴建筑有限公司",
          "smsContent" : "我房建,你放心,大兴建筑!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "500"
        }
      }
    ]
  },
  "aggregations" : {
    "agg" : {
      "value" : 3
    }
  }
}

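  The Java code is:

    // Imports use the package names of the 7.x high-level REST client
    import java.io.IOException;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.search.aggregations.Aggregation;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.metrics.Cardinality;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.junit.Test;

    static RestHighLevelClient myClient= EsClient.getClient();  // client used to talk to ES
    String index="sms-logs-index";

    @Test
    public void cardinality() throws IOException
    {
        // 1. SearchRequest
        SearchRequest request=new SearchRequest(index);

        // 2. Aggregation definition
        SearchSourceBuilder builder=new SearchSourceBuilder();
        builder.aggregation(AggregationBuilders.cardinality("agg").field("province"));
        request.source(builder);

        // 3. Execute the query
        SearchResponse resp = myClient.search(request, RequestOptions.DEFAULT);

        // 4. Read the result; by default it is returned as the parent type Aggregation
        Aggregation agg = resp.getAggregations().get("agg");

        // Only the sub-interface Cardinality has getValue(); the parent interface Aggregation does not
        Cardinality myagg=(Cardinality) agg;

        // The two statements above can also be written as:  Cardinality agg = resp.getAggregations().get("agg");
        // Every aggregation query below needs a similar conversion, so keep it in mind!

        long value = myagg.getValue();
        System.out.println(value);
    }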

  The output of the Java code is shown in Figure 1.

Figure 1: Output of the Java cardinality implementation

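range (range statistics) query

Counts the documents that fall into each given range, for example how many documents have a field value between 0 and 100, between 100 and 200, and between 200 and 300.
Range statistics work on ordinary numeric values, on dates, and on IP addresses, using these aggregation types:
range, date_range, ip_range
The range aggregations take these parameters:
from: greater than or equal to (inclusive)
to: less than (exclusive)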
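
The half-open semantics of from and to can be sketched in plain Java, without Elasticsearch. This is only an illustration under stated assumptions: the class name RangeBucketSketch and the method countInRange are made up for this example, the 200 and 400 boundaries match the numeric example that follows, and the list holds the fee values of the ten hits visible in the sample response (the index itself holds 12 documents, so the real bucket counts are larger).

```java
import java.util.Arrays;
import java.util.List;

public class RangeBucketSketch {

    // Counts values v with from <= v < to; a null bound means "unbounded".
    static long countInRange(List<Double> values, Double from, Double to) {
        return values.stream()
                .filter(v -> from == null || v >= from)   // from is inclusive
                .filter(v -> to == null || v < to)        // to is exclusive
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical hard-coded copy of the fee values of the ten visible sample hits
        List<Double> fees = Arrays.asList(
                20.0, 6000.0, 500.0, 0.1, 200.0, 2000.0, 200.0, 4000.0, 0.5, 500.0);

        System.out.println("*-200:   " + countInRange(fees, null, 200.0));
        System.out.println("200-400: " + countInRange(fees, 200.0, 400.0));
        System.out.println("400-*:   " + countInRange(fees, 400.0, null));
    }
}
```

Because to is exclusive and from is inclusive, a fee of exactly 200 lands in the 200-400 bucket and never in two buckets at once.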

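  Numeric range: the requirement is to count, based on the fee field, how many companies have a fee below 200, at least 200 but below 400, and at least 400.

  The RESTful request is: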

POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "range": {
        "field": "fee",
        "ranges": [
          {
            "to": 200
          },
          {
            "from": 200,
            "to": 400
          },
          {
            "from": 400
          }
        ]
      }
    }
  }
}

# Result
{
  "took" : 131,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5478434123",
          "moblie" : 18056587445,
          "corpName" : "中威集团",
          "smsContent" : "中威集团,服务于你的身边!",
          "state" : "0",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.248.19.45",
          "replyTotal" : "4",
          "fee" : "20"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "24514635",
          "moblie" : 18545427895,
          "corpName" : "东东集团",
          "smsContent" : "数据驱动,AI推动,新零售模型让你的购买更心怡!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "北京",
          "ipAddr" : "10.254.19.45",
          "replyTotal" : "1",
          "fee" : "6000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "11",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-22",
          "senDate" : "2020-09-22",
          "longCode" : "458744536",
          "moblie" : 134625584654,
          "corpName" : "星雨文化传媒",
          "smsContent" : "魅力宣传,星雨传媒!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.289.19.45",
          "replyTotal" : "6",
          "fee" : "500"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5784320",
          "moblie" : 15236964578,
          "corpName" : "花花派",
          "smsContent" : "花开花落,魅力女性,买花选我!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.265.19.45",
          "replyTotal" : "1",
          "fee" : "0.1"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "87454120",
          "moblie" : 13625789645,
          "corpName" : "爱美化妆品有限公司",
          "smsContent" : "魅力,势不可挡,爱美爱美",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.258.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "33656412674",
          "moblie" : 18956451203,
          "corpName" : "华丽网集团",
          "smsContent" : "网络安全,华丽靠谱!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "上海",
          "ipAddr" : "10.215.254.45",
          "replyTotal" : "1",
          "fee" : "2000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "56412345",
          "moblie" : 17055452369,
          "corpName" : "万事Ok公司",
          "smsContent" : "万事Ok,找我没错!",
          "state" : "0",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "54784641",
          "moblie" : 15625584654,
          "corpName" : "勾股科技有限公司",
          "smsContent" : "智能算法,智慧生活,勾股科技!",
          "state" : "1",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "6",
          "fee" : "4000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "20165411010",
          "moblie" : 15248754897,
          "corpName" : "北京鑫鑫能源有限公司",
          "smsContent" : "欢迎使用新能源,让世界更环保",
          "state" : "1",
          "opratorId" : "2",
          "province" : "北京",
          "ipAddr" : "10.245.29.280",
          "replyTotal" : "0.6",
          "fee" : "0.5"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "89451254",
          "moblie" : 13028457893,
          "corpName" : "大兴建筑有限公司",
          "smsContent" : "我房建,你放心,大兴建筑!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "500"
        }
      }
    ]
  },
  "aggregations" : {
    "agg" : {
      "buckets" : [
        {
          "key" : "*-200.0",
          "to" : 200.0,
          "doc_count" : 4
        },
        {
          "key" : "200.0-400.0",
          "from" : 200.0,
          "to" : 400.0,
          "doc_count" : 2
        },
        {
          "key" : "400.0-*",
          "from" : 400.0,
          "doc_count" : 6
        }
      ]
    }
  }
}

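  Date range: the requirement is to count, based on the createDate field, how many companies have a creation date before 2020-09-17, and how many fall between 2020-09-17 (inclusive) and now (2020-10-12 at the time of writing).

  The RESTful request is:

# date range statistics with date_range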
POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "date_range": {
        "field": "createDate",
        "format": "yyyy-MM-dd", 
        "ranges": [
          {
            "to": "2020-09-17"
          },
          {
            "from": "2020-09-17",
            "to": "now"
          }
        ]
      }
    }
  }
}

  IP range: the requirement is to count, based on the ipAddr field, how many companies have an IP address below 10.126.2.8, and how many have one at or above 10.126.2.8.

POST /sms-logs-index/_search
{
  "aggs": {
    "agg": {
      "ip_range": {
        "field": "ipAddr",
        "ranges": [
          {
            "to": "10.126.2.8"
          },
          {
            "from": "10.126.2.8"
          }
        ]
      }
    }
  }
  
}

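  The Java code follows. Only the numeric range is implemented here; the other two work the same way, so try them yourself.

    // Imports use the package names of the 7.x high-level REST client
    import java.io.IOException;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.bucket.range.Range;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.junit.Test;

    static RestHighLevelClient myClient= EsClient.getClient();  // client used to talk to ES
    String index="sms-logs-index";

    @Test
    public void range() throws IOException
    {
        // 1. SearchRequest
        SearchRequest request=new SearchRequest(index);

        // 2. Aggregation definition: below 200, 200 up to 400, 400 and above
        SearchSourceBuilder builder=new SearchSourceBuilder();
        builder.aggregation(AggregationBuilders.range("agg").field("fee").addUnboundedTo(200).addRange(200,400).addUnboundedFrom(400));
        request.source(builder);

        // 3. Execute the query
        SearchResponse resp = myClient.search(request, RequestOptions.DEFAULT);

        // 4. Read the result as a Range and print each bucket
        Range agg = resp.getAggregations().get("agg");
        for (Range.Bucket bucket : agg.getBuckets()) {
            String key=bucket.getKeyAsString();
            Object from = bucket.getFrom();
            Object to = bucket.getTo();
            long docCount = bucket.getDocCount();

            System.out.println(String.format("key:%s,from:%s,to:%s,docCount:%s",key,from,to,docCount));
        }
    }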

  The output of the Java range implementation is shown in Figure 2.

Figure 2: Output of the Java range aggregation

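extended_stats (extended statistics) query

Computes statistics for a specified field: maximum, minimum, average, sum of squares, and so on.

  Requirement: compute the maximum, minimum, average, sum of squares, and so on for the fee field.

  The RESTful request is: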

POST /sms-logs-index/_search
{
  "aggs": {
    "aggE": {
      "extended_stats": {
        "field": "fee"
      }
    }
  }
}


  The response to the RESTful request is:

{
  "took" : 62,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 12,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "3",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5478434123",
          "moblie" : 18056587445,
          "corpName" : "中威集团",
          "smsContent" : "中威集团,服务于你的身边!",
          "state" : "0",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.248.19.45",
          "replyTotal" : "4",
          "fee" : "20"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "5",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "24514635",
          "moblie" : 18545427895,
          "corpName" : "东东集团",
          "smsContent" : "数据驱动,AI推动,新零售模型让你的购买更心怡!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "北京",
          "ipAddr" : "10.254.19.45",
          "replyTotal" : "1",
          "fee" : "6000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "11",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-22",
          "senDate" : "2020-09-22",
          "longCode" : "458744536",
          "moblie" : 134625584654,
          "corpName" : "星雨文化传媒",
          "smsContent" : "魅力宣传,星雨传媒!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "杭州",
          "ipAddr" : "10.289.19.45",
          "replyTotal" : "6",
          "fee" : "500"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "9",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "5784320",
          "moblie" : 15236964578,
          "corpName" : "花花派",
          "smsContent" : "花开花落,魅力女性,买花选我!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.265.19.45",
          "replyTotal" : "1",
          "fee" : "0.1"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "4",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "87454120",
          "moblie" : 13625789645,
          "corpName" : "爱美化妆品有限公司",
          "smsContent" : "魅力,势不可挡,爱美爱美",
          "state" : "1",
          "opratorId" : "1",
          "province" : "上海",
          "ipAddr" : "10.258.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "7",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "33656412674",
          "moblie" : 18956451203,
          "corpName" : "华丽网集团",
          "smsContent" : "网络安全,华丽靠谱!",
          "state" : "1",
          "opratorId" : "3",
          "province" : "上海",
          "ipAddr" : "10.215.254.45",
          "replyTotal" : "1",
          "fee" : "2000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "8",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "56412345",
          "moblie" : 17055452369,
          "corpName" : "万事Ok公司",
          "smsContent" : "万事Ok,找我没错!",
          "state" : "0",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "200"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "10",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "54784641",
          "moblie" : 15625584654,
          "corpName" : "勾股科技有限公司",
          "smsContent" : "智能算法,智慧生活,勾股科技!",
          "state" : "1",
          "opratorId" : "2",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "6",
          "fee" : "4000"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "2",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "20165411010",
          "moblie" : 15248754897,
          "corpName" : "北京鑫鑫能源有限公司",
          "smsContent" : "欢迎使用新能源,让世界更环保",
          "state" : "1",
          "opratorId" : "2",
          "province" : "北京",
          "ipAddr" : "10.245.29.280",
          "replyTotal" : "0.6",
          "fee" : "0.5"
        }
      },
      {
        "_index" : "sms-logs-index",
        "_type" : "_doc",
        "_id" : "6",
        "_score" : 1.0,
        "_source" : {
          "createDate" : "2020-09-16",
          "senDate" : "2020-09-16",
          "longCode" : "89451254",
          "moblie" : 13028457893,
          "corpName" : "大兴建筑有限公司",
          "smsContent" : "我房建,你放心,大兴建筑!",
          "state" : "1",
          "opratorId" : "1",
          "province" : "杭州",
          "ipAddr" : "10.215.19.45",
          "replyTotal" : "1",
          "fee" : "500"
        }
      }
    ]
  },
  "aggregations" : {
    "aggE" : {
      "count" : 12,
      "min" : 0.1,
      "max" : 6000.0,
      "avg" : 1160.0583333333334,
      "sum" : 13920.7,
      "sum_of_squares" : 5.6830400269999996E7,
      "variance" : 3390131.3524305555,
      "std_deviation" : 1841.2309340304262,
      "std_deviation_bounds" : {
        "upper" : 4842.520201394185,
        "lower" : -2522.403534727519
      }
    }
  }
}

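  The Java code is:

    // Imports use the package names of the 7.x high-level REST client
    import java.io.IOException;
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.search.aggregations.AggregationBuilders;
    import org.elasticsearch.search.aggregations.metrics.ExtendedStats;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.junit.Test;

    static RestHighLevelClient myClient= EsClient.getClient();  // client used to talk to ES
    String index="sms-logs-index";

    @Test
    public void extendedStats() throws IOException
    {
        // 1. SearchRequest
        SearchRequest request=new SearchRequest(index);

        // 2. Aggregation definition
        SearchSourceBuilder builder=new SearchSourceBuilder();
        builder.aggregation(AggregationBuilders.extendedStats("agg").field("fee"));
        request.source(builder);

        // 3. Execute the query
        SearchResponse resp = myClient.search(request, RequestOptions.DEFAULT);

        // 4. Read the result as ExtendedStats
        ExtendedStats agg=resp.getAggregations().get("agg");
        double max = agg.getMax();
        double min = agg.getMin();

        // The remaining statistics are read the same way and are not listed one by one

        System.out.println("max fee :  "+max+"  min fee :  "+min);

    }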

Figure 3: Output of the Java extended_stats aggregation

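ES offers far too many aggregation functions to list them all here; the rest is left for readers to work out by analogy. For more, see the Aggregations section of the official documentation.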
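The test above reads only max and min from the result. As a rough illustration of what the other extended_stats values mean, the sketch below recomputes them in plain Java, without Elasticsearch, over a made-up list of fee values. The class name ExtendedStatsSketch, the Stats holder and the sample numbers are assumptions for illustration only; the formulas follow the documented definitions (population variance, and bounds at avg plus or minus sigma times std_deviation, with sigma defaulting to 2).

```java
class ExtendedStatsSketch {

    /** Plain holder for the values an extended_stats result carries. */
    static final class Stats {
        long count;
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        double sum;
        double sumOfSquares;
        double avg;
        double variance;
        double stdDeviation;
        double upperBound;
        double lowerBound;
    }

    /** Computes the statistics in one pass; sigma controls the width of the bounds. */
    static Stats compute(double[] values, double sigma) {
        Stats s = new Stats();
        for (double v : values) {
            s.count++;
            s.min = Math.min(s.min, v);
            s.max = Math.max(s.max, v);
            s.sum += v;
            s.sumOfSquares += v * v;
        }
        if (s.count == 0) {
            return s; // nothing to average; the derived fields stay at 0
        }
        s.avg = s.sum / s.count;
        // Population variance: mean of the squares minus the square of the mean.
        // Clamp at 0 to guard against tiny negative values from rounding.
        s.variance = Math.max(0.0, s.sumOfSquares / s.count - s.avg * s.avg);
        s.stdDeviation = Math.sqrt(s.variance);
        s.upperBound = s.avg + sigma * s.stdDeviation;
        s.lowerBound = s.avg - sigma * s.stdDeviation;
        return s;
    }

    public static void main(String[] args) {
        // Hypothetical fee values, not taken from the sms-logs-index data
        double[] fees = {2, 4, 4, 4, 5, 5, 7, 9};
        Stats s = compute(fees, 2.0);
        System.out.println("count=" + s.count + " min=" + s.min + " max=" + s.max);
        System.out.println("sum=" + s.sum + " avg=" + s.avg + " sum_of_squares=" + s.sumOfSquares);
        System.out.println("variance=" + s.variance + " std_deviation=" + s.stdDeviation);
        System.out.println("bounds=[" + s.lowerBound + ", " + s.upperBound + "]");
    }
}
```

For these eight values the sketch gives avg 5, variance 4, std_deviation 2 and bounds of 1 and 9. On a real response the same quantities come from the getters on ExtendedStats, such as getAvg(), getSum(), getCount(), getSumOfSquares(), getVariance() and getStdDeviation().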

Aggregations on nested fields

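Sometimes documents contain a nested structure, and a statistic has to be computed over a field inside it. Take the following nested document as an example: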

    {
        "name": "内容1",
        "desc": "描述1",
        "channel": [
            {
                "name": "one",
                "num": 28
            },
            {
                "name": "two",
                "num": 32
            }
        ]
    }


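To make nested types easier to work with, Elasticsearch provides the nested aggregation. It requires the path of the nested field (here channel), and any aggregation on a field inside the nested object must refer to that field by its full path, for example channel.num: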

{
    "query": {
        "match_all": {
            "boost": 1.0
        }
    },
    "aggregations": {
        "test": {
            "nested": {
                "path": "channel"
            },
            "aggregations": {
                "sumChannel": {
                    "sum": {
                        "field": "channel.num"
                    }
                }
            }
        }
    }
}

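The result is shown below. Note that the sum over the nested data is not computed per document: it adds up the nested values across all matching documents. In the response below, sumChannel inside the test bucket is the sum of the four num values from the two documents.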

{
    "took": 133,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 2,
            "relation": "eq"
        },
        "max_score": 1,
        "hits": [
            {
                "_index": "test_field2",
                "_type": "_doc",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "name": "内容1",
                    "desc": "描述1",
                    "channel": [
                        {
                            "name": "one",
                            "num": 28
                        },
                        {
                            "name": "two",
                            "num": 32
                        }
                    ]
                }
            },
            {
                "_index": "test_field2",
                "_type": "_doc",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "name": "内容1",
                    "desc": "描述1",
                    "channel": [
                        {
                            "name": "one",
                            "num": 33
                        },
                        {
                            "name": "two",
                            "num": 44
                        }
                    ]
                }
            }
        ]
    },
    "aggregations": {
        "test": {
            "doc_count": 4,
            "sumChannel": {
                "value": 137
            }
        }
    }
}
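
To check the numbers in the response above by hand, the following plain-Java sketch rebuilds the two sample documents in memory, then counts and sums their channel entries the way the nested aggregation does. It does not call Elasticsearch; the class NestedSumSketch and its helper methods are hypothetical names used only for this illustration.

```java
import java.util.Arrays;
import java.util.List;

class NestedSumSketch {

    /** One entry of the nested "channel" array. */
    static final class Channel {
        final String name;
        final int num;

        Channel(String name, int num) {
            this.name = name;
            this.num = num;
        }
    }

    /** A top-level document; only the nested field matters here. */
    static final class Doc {
        final List<Channel> channel;

        Doc(Channel... channel) {
            this.channel = Arrays.asList(channel);
        }
    }

    /** The two documents shown in the sample response. */
    static List<Doc> sampleDocs() {
        return Arrays.asList(
                new Doc(new Channel("one", 28), new Channel("two", 32)),
                new Doc(new Channel("one", 33), new Channel("two", 44)));
    }

    /** Every nested object counts separately, which is why doc_count is 4 for 2 hits. */
    static long nestedDocCount(List<Doc> docs) {
        long count = 0;
        for (Doc doc : docs) {
            count += doc.channel.size();
        }
        return count;
    }

    /** Sums channel.num over all nested objects of all documents. */
    static long sumNum(List<Doc> docs) {
        long sum = 0;
        for (Doc doc : docs) {
            for (Channel c : doc.channel) {
                sum += c.num;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        System.out.println("doc_count=" + nestedDocCount(docs) + " sum=" + sumNum(docs));
    }
}
```

Running it reproduces a doc_count of 4 and a sum of 137 (28 + 32 + 33 + 44), matching the aggregations part of the response.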

The Java code is as follows:

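static RestHighLevelClient myClient = EsClient.getClient();  // client used to talk to ES
String index = "sms-logs-index";

public static void queryIndex() throws IOException {
    // Nested aggregation "test" on path "channel", with a sum sub-aggregation on channel.num
    NestedAggregationBuilder nested = AggregationBuilders.nested("test", "channel");
    nested.subAggregation(AggregationBuilders.sum("sumChannel").field("channel.num"));

    // match_all query: select every document.
    // No index is passed to SearchRequest here, so the request searches all indices.
    SearchRequest searchRequest = new SearchRequest();
    SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
    searchSourceBuilder.query(QueryBuilders.matchAllQuery());
    searchSourceBuilder.aggregation(nested);
    searchRequest.source(searchSourceBuilder);

    // Execute synchronously with the client obtained above
    SearchResponse searchResponse = myClient.search(searchRequest, RequestOptions.DEFAULT);
    Aggregations aggregations = searchResponse.getAggregations();

    // Read the nested bucket first, then the sum inside it
    Nested test = aggregations.get("test");
    Sum sumChannel = test.getAggregations().get("sumChannel");
    System.out.println("doc_count: " + test.getDocCount() + "  sum: " + sumChannel.getValue());
}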