Create the test data
# Create the index
PUT /book
{
  "settings": {},
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "name": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "price": {
        "type": "float"
      },
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}

# Insert the documents
PUT /book/_doc/1
{
  "name": "lucene",
  "description": "Lucene Core is a Java library providing powerful indexing and search features, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities. The PyLucene sub project provides Python bindings for Lucene Core. ",
  "price": 100.45,
  "timestamp": "2020-08-21 19:11:35"
}

PUT /book/_doc/2
{
  "name": "solr",
  "description": "Solr is highly scalable, providing fully fault tolerant distributed indexing, search and analytics. It exposes Lucenes features through easy to use JSON/HTTP interfaces or native clients for Java and other languages.",
  "price": 320.45,
  "timestamp": "2020-07-21 17:11:35"
}

PUT /book/_doc/3
{
  "name": "Hadoop",
  "description": "The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.",
  "price": 620.45,
  "timestamp": "2020-08-22 19:18:35"
}

PUT /book/_doc/4
{
  "name": "ElasticSearch",
  "description": "Elasticsearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口。Elasticsearch是用Java语言开发的,并作为Apache许可条款下的开放源码发布,是一种流行的企业级搜索引擎。Elasticsearch用于云计算中,能够达到实时搜索,稳定,可靠,快速,安装使用方便。官方客户端在Java、.NET(C#)、PHP、Python、Apache Groovy、Ruby和许多其他语言中都是可用的。根据DB-Engines的排名显示,Elasticsearch是最受欢迎的企业搜索引擎,其次是Apache Solr,也是基于Lucene。",
  "price": 999.99,
  "timestamp": "2020-08-15 10:11:35"
}
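Metric aggregations compute values such as the maximum, minimum, sum, and average over a set of documents.
Setting size to 0 suppresses the individual hits so that only the aggregation results are returned. Normally size is used together with from for pagination.
POST /book/_search
{
  "size": 0,
  "aggs": {
    "max_price": {
      "max": {
        "field": "price"
      }
    }
  }
}

POST /book/_search
{
  "size": 0,
  "aggs": {
    "sum_price": {
      "sum": {
        "field": "price"
      }
    }
  }
}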
Count the documents whose price is greater than 300
POST /book/_count
{
"query": {
"range": {
"price": {
"gt": 300
}
}
}
}
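To make the expected result concrete, here is a minimal Python sketch that applies the same "gt": 300 condition to the four sample prices inserted earlier. It runs locally and does not talk to Elasticsearch; the prices dict is simply the sample data copied by hand.

```python
# Sample prices from the four documents indexed above, keyed by document id.
prices = {1: 100.45, 2: 320.45, 3: 620.45, 4: 999.99}

# "gt": 300 keeps values strictly greater than 300.
matching_ids = [doc_id for doc_id, price in prices.items() if price > 300]
count = len(matching_ids)

print(count, matching_ids)
```

With the sample data, documents 2, 3, and 4 match, so the count is 3.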
Count the documents that have a value in the price field
POST /book/_search?size=0
{
"aggs": {
"price_count": {
"value_count": {
"field": "price"
}
}
}
}
cardinality counts the distinct values of a field, similar to COUNT(DISTINCT ...) in MySQL
POST /book/_search?size=0
{
"aggs": {
"_id_count": {
"cardinality": {
"field": "_id"
}
},
"price_count": {
"cardinality": {
"field": "price"
}
}
}
}
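As a rough local illustration of what a distinct count means, the following Python sketch counts unique _id and price values with a set. Note the assumption: this is an exact count, whereas Elasticsearch's cardinality aggregation is based on HyperLogLog++ and returns an approximate count on large data sets. The docs list is the sample data copied by hand.

```python
# Sample documents (only the fields relevant to the aggregation).
docs = [
    {"_id": "1", "price": 100.45},
    {"_id": "2", "price": 320.45},
    {"_id": "3", "price": 620.45},
    {"_id": "4", "price": 999.99},
]

# A set keeps each value once, so its length is the distinct count.
id_count = len({doc["_id"] for doc in docs})
price_count = len({doc["price"] for doc in docs})

print({"_id_count": id_count, "price_count": price_count})
```

Every sample document has a different id and a different price, so both counts are 4.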
stats returns count, max, min, avg, and sum in a single request
POST /book/_search?size=0
{
"aggs": {
"price_stats": {
"stats": {
"field": "price"
}
}
}
}
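The five values that stats reports can be reproduced locally with a short Python sketch over the sample prices. This is only an illustration of the arithmetic, not Elasticsearch code; because price is mapped as float, the values Elasticsearch returns may differ slightly in the trailing digits.

```python
# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]

total = sum(prices)
price_stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": total / len(prices),
    "sum": total,
}

print(price_stats)
```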
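extended_stats returns everything stats does, plus the sum of squares, the variance, the standard deviation, and the interval from the mean minus two standard deviations to the mean plus two standard deviations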
POST /book/_search?size=0
{
"aggs": {
"price_stats": {
"extended_stats": {
"field": "price"
}
}
}
}
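The extra fields can be sketched in Python as follows. The assumptions here are that the variance is the population variance and that the bounds use the default of two standard deviations (the sigma parameter), which matches the defaults of extended_stats as far as I know. The variable names are illustrative only.

```python
import math

# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]
sigma = 2  # assumed default width of the standard deviation bounds

count = len(prices)
avg = sum(prices) / count
sum_of_squares = sum(p * p for p in prices)

# Population variance: mean of the squared deviations from the average.
variance = sum((p - avg) ** 2 for p in prices) / count
std_deviation = math.sqrt(variance)

std_deviation_bounds = {
    "upper": avg + sigma * std_deviation,
    "lower": avg - sigma * std_deviation,
}

print(sum_of_squares, variance, std_deviation, std_deviation_bounds)
```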
percentiles returns the value at each requested percentile. When percents is omitted, the default percentiles are used.
POST /book/_search?size=0
{
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price"
      }
    }
  }
}

POST /book/_search?size=0
{
  "aggs": {
    "price_percents": {
      "percentiles": {
        "field": "price",
        "percents": [
          75,
          99,
          99.9
        ]
      }
    }
  }
}
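To show what a percentile is, here is a small Python sketch that computes it exactly with linear interpolation between sorted values. This is a swapped-in textbook method, not what Elasticsearch does internally: the percentiles aggregation uses an approximation algorithm (TDigest by default), so the numbers it returns may differ from this sketch. The percentile function is a hypothetical helper written for this example.

```python
def percentile(values, p):
    """Exact percentile with linear interpolation between neighbours."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    # Fractional position of the percentile in the sorted list.
    rank = (p / 100) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]

for p in (50, 75, 99, 99.9):
    print(p, percentile(prices, p))
```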
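percentile_ranks reports the percentage of documents whose value is less than or equal to a given value
The request below reports the percentage of documents with a price at or below 100 and at or below 200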
POST /book/_search?size=0
{
"aggs": {
"gge_perc_rank": {
"percentile_ranks": {
"field": "price",
"values": [
100,
200
]
}
}
}
}
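The definition can be checked by hand with a short Python sketch that computes the exact share of sample prices at or below each threshold. As an assumption to keep in mind, Elasticsearch computes percentile ranks approximately, so its output for such a tiny data set may not match these exact fractions. The percentile_rank function is a hypothetical helper for this example.

```python
def percentile_rank(values, threshold):
    """Exact percentage of values less than or equal to the threshold."""
    at_or_below = sum(1 for v in values if v <= threshold)
    return 100.0 * at_or_below / len(values)


# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]

ranks = {threshold: percentile_rank(prices, threshold) for threshold in (100, 200)}
print(ranks)
```

No sample price is at or below 100 (the cheapest is 100.45), and exactly one of the four is at or below 200, so the exact ranks are 0% and 25%.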
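Bucket aggregations are similar to GROUP BY in MySQL: documents that share a given characteristic are placed in the same bucket, each bucket is one group, and the response can contain several groups.
Group the documents by price range, then compute metrics over the documents in each bucket
POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 200 },
          { "from": 200, "to": 400 },
          { "from": 400, "to": 1000 }
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "count_price": {
          "value_count": {
            "field": "price"
          }
        }
      }
    }
  }
}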
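The grouping logic can be sketched locally in Python. Each range includes its from value and excludes its to value, which is how the range aggregation treats its bounds. The bucket keys and variable names below are illustrative, and the sketch works on the hand-copied sample prices rather than querying Elasticsearch.

```python
# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]

# Same ranges as the request: "from" is inclusive, "to" is exclusive.
ranges = [(0, 200), (200, 400), (400, 1000)]

buckets = {}
for low, high in ranges:
    members = [p for p in prices if low <= p < high]
    buckets[f"{low}-{high}"] = {
        "doc_count": len(members),
        # Sub-aggregations run only over the documents inside the bucket.
        "average_price": sum(members) / len(members) if members else None,
        "count_price": len(members),
    }

for key, bucket in buckets.items():
    print(key, bucket)
```

With the sample data, the first two buckets hold one document each and the last bucket holds two (620.45 and 999.99).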
bucket_selector filters the buckets after aggregation, similar to HAVING in MySQL
POST /book/_search
{
  "size": 0,
  "aggs": {
    "group_by_price": {
      "range": {
        "field": "price",
        "ranges": [
          { "from": 0, "to": 200 },
          { "from": 200, "to": 400 },
          { "from": 400, "to": 1000 }
        ]
      },
      "aggs": {
        "average_price": {
          "avg": {
            "field": "price"
          }
        },
        "count_price": {
          "value_count": {
            "field": "price"
          }
        },
        "having": {
          "bucket_selector": {
            "buckets_path": {
              "avg_price": "average_price"
            },
            "script": {
              "source": "params.avg_price >= 200"
            }
          }
        }
      }
    }
  }
}
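The effect of the filter can be sketched in Python: compute the per-bucket average first, then keep only the buckets whose average is at least 200, which mirrors the script condition params.avg_price >= 200. This is a local illustration over the hand-copied sample prices with illustrative names, not Elasticsearch code.

```python
# Prices of the four sample documents.
prices = [100.45, 320.45, 620.45, 999.99]
ranges = [(0, 200), (200, 400), (400, 1000)]

# Step 1: bucket the prices and compute the average of each bucket.
averages = {}
for low, high in ranges:
    members = [p for p in prices if low <= p < high]
    if members:
        averages[f"{low}-{high}"] = sum(members) / len(members)

# Step 2: keep only buckets that satisfy the HAVING-style condition.
selected = {key: avg for key, avg in averages.items() if avg >= 200}

print(sorted(selected))
```

The 0-200 bucket has an average of 100.45 and is dropped; the other two buckets remain.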