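Elasticsearch (ES): an open-source full-text search and analytics engine built on Apache Lucene(TM). It hides Lucene's complexity behind a simple RESTful API and also takes care of the distributed-systems work.
Lucene: a search engine library implemented in Java.
An Elasticsearch cluster can contain multiple indices (databases); each index can contain multiple types (tables); each type contains multiple documents (rows); and each document contains multiple fields (columns).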
Relational database | Database | Table | Row | Column |
Elasticsearch | Index | Type | Document | Field |
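Related concepts:
Analysis (tokenization): the process of converting text into a series of words (terms or tokens); it is used both when building and when querying the inverted index
Analyzer: the ES component dedicated to analysis, made up of three parts (character filters, a tokenizer, and token filters)
Built-in analyzers:
Inspecting the analysis output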
- POST /_analyze
- {
- "analyzer": "standard",
- "text": "The 2 QUICK Brown-Foxes jumped over the lazy dog's bone."
- }
Result
- {
- "tokens": [{
- "token": "the",
- "start_offset": 0,
- "end_offset": 3,
- "type": "<ALPHANUM>",
- "position": 0
- },
- {
- "token": "2",
- "start_offset": 4,
- "end_offset": 5,
- "type": "<NUM>",
- "position": 1
- },
- ...
- ]
- }
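See the official documentation for details.
Inverted index: after analyzing the data in a document, ES builds a mapping between terms and documents.
Composition: an inverted index consists of the list of distinct terms in the documents plus, for each term, the list of IDs of the documents that contain it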
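The composition described above can be sketched in a few lines of Python. This is only an illustration: the helper name `build_inverted_index` is hypothetical, and the "analysis" step is a naive lowercase-and-split rather than a real ES analyzer.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each distinct term to the sorted list of document IDs containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive stand-in for analysis: lowercase, then split on whitespace.
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

docs = {
    1: "The quick brown fox",
    2: "Brown fox brown dog",
}
inverted = build_inverted_index(docs)
print(inverted["brown"])  # [1, 2]
print(inverted["dog"])    # [2]
```

Looking up a word then amounts to a dictionary access on the term, which is why the terms must be normalized the same way at index time and at query time.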
Query process:
- GET /_search
- {}
-
- GET /_search
- {
- "query": {
- "match_all": {}
- }
- }
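Field details
Relevance: the result of ES's similarity algorithm (TF/IDF). The value is reported in the _score field and is computed from term frequency, inverse document frequency, and the field-length norm
- // Add "explain": true to the request to see how each score was computed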
- GET /_search
- {
- "explain":true,
- "query" : { "match" : { "name" : "John Smith" }}
- }
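To make the TF/IDF idea concrete, here is a simplified Python sketch. It uses the classic Lucene-style factors (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), norm = 1 / sqrt(fieldLength)) and multiplies them together; this is an assumption-laden simplification, not the exact formula ES runs, and recent ES versions default to BM25 instead. All function names are hypothetical.

```python
import math

def tf(freq):
    # Term frequency: more occurrences in the field means more relevant.
    return math.sqrt(freq)

def idf(num_docs, doc_freq):
    # Inverse document frequency: rarer terms weigh more.
    return 1 + math.log(num_docs / (doc_freq + 1))

def field_norm(num_terms):
    # Field-length norm: shorter fields weigh more.
    return 1 / math.sqrt(num_terms)

def score(term, doc_tokens, all_docs):
    freq = doc_tokens.count(term)
    if freq == 0:
        return 0.0
    doc_freq = sum(1 for d in all_docs if term in d)
    return tf(freq) * idf(len(all_docs), doc_freq) * field_norm(len(doc_tokens))

corpus = [
    ["john", "smith"],
    ["john", "doe", "and", "jane", "doe"],
    ["smith", "and", "wesson"],
]
scores = [score("john", d, corpus) for d in corpus]
print(scores)
```

Both of the first two documents contain "john" once, but the shorter field scores higher because of the length norm; the third document does not contain the term and scores 0.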
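Aspect | Query | Filter |
Question answered | Does this document match the query, and how relevant is it? | Does this document match the query? |
Relevance handling | Finds the matching documents, computes a relevance score for each of them, then sorts by score in descending order | Only filters out the matching documents; no scoring, TF/IDF information is ignored |
Performance | Slower: results are scored and sorted, and nothing is cached (the inverted index compensates) | Faster: no sorting, and frequently used filters are cached |
Example | Find the documents that best match "first blog" | Find blogs with level >= 2 and post_date equal to 2018-11-11 |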
- // query
- GET /_search
- {
- "query": {
- "match": {
- "desc": "four blog"
- }
- }
- }
-
- // filter
- GET /_search
- {
- "query": {
- "bool": {
- "filter": {
- "match": {
- "desc": "four blog"
- }
- }
- }
- }
- }
- // query
- GET /_search
- {
- "query": {
- "bool": {
- "must": [
- { "match": { "post_date": "2018-11-11" } },
- { "range": { "level": { "gte": 2 } } }
- ]
- }
- }
- }
- // filter
- GET /_search
- {
- "query": {
- "bool": {
- "must": {
- "match": { "post_date": "2018-11-11" }
- },
- "filter": {
- "range": { "level": { "gte": 2 } }
- }
- }
- }
- }
Insert test data
- POST /my_store/_bulk
- { "index": { "_id": 1 }}
- { "price" : 10, "productID" : "XHDK-A-1293-#fJ3" }
- { "index": { "_id": 2 }}
- { "price" : 20, "productID" : "KDKE-B-9947-#kL5" }
- { "index": { "_id": 3 }}
- { "price" : 30, "productID" : "JODL-X-1937-#pV7" }
- { "index": { "_id": 4 }}
- { "price" : 30, "productID" : "QQPX-R-3956-#aD8" }
View the index details
GET /my_store
Find all products with a price of 20
SQL: SELECT * FROM products WHERE price = 20
- GET /_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "term" : {
- "price" : 20
- }
- }
- }
- }
- }
- // constant_score turns the term query into a filter; the bool/filter request below is an equivalent form
- GET /_search
- {
- "query":{
- "bool": {
- "filter": {
- "term": {
- "price": 20
- }
- }
- }
- }
- }
Find the document whose productID is XHDK-A-1293-#fJ3
SQL: SELECT * FROM products WHERE productID = "XHDK-A-1293-#fJ3"
- GET /_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "term" : {
- "productID" : "XHDK-A-1293-#fJ3"
- }
- }
- }
- }
- }
-
- // Inspect how the field value is analyzed
- GET /my_store/_analyze
- {
- "field": "productID",
- "text": "XHDK-A-1293-#fJ3"
- }
Summary: term looks up the exact value "XHDK-A-1293-#fJ3" in the inverted index, but the index only holds "xhdk", "a", "1293" and "fj3", so nothing is found
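The mismatch can be reproduced with a small Python sketch. The function `rough_standard_analyze` is a hypothetical, very rough stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase); the real analyzer follows Unicode text segmentation rules and differs in edge cases.

```python
import re

def rough_standard_analyze(text):
    """Rough approximation: split on non-alphanumerics and lowercase each token."""
    return [tok.lower() for tok in re.findall(r"[A-Za-z0-9]+", text)]

# What ends up in the inverted index for the analyzed text field.
indexed_terms = set(rough_standard_analyze("XHDK-A-1293-#fJ3"))

def term_lookup(value):
    # A term query does not analyze its input; it looks up the exact value.
    return value in indexed_terms

print(sorted(indexed_terms))            # ['1293', 'a', 'fj3', 'xhdk']
print(term_lookup("XHDK-A-1293-#fJ3"))  # False
print(term_lookup("xhdk"))              # True
```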
Solution 1
- GET /_search
- {
- "query" : {
- "match_phrase" : {
- "productID" : "XHDK-A-1293-#fJ3"
- }
- }
- }
Solution 2
- // 1. Delete the index
- DELETE /my_store
- // 2. Map the productID field with the keyword analyzer
- PUT /my_store
- {
- "mappings": {
- "properties": {
- "price": {
- "type": "long"
- },
- "productID": {
- "type": "text",
- "analyzer": "keyword"
- }
- }
- }
- }
Find documents whose price is 20 or 30
- GET /my_store/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "terms" : {
- "price" : [20, 30]
- }
- }
- }
- }
- }
gt: >, lt: <, gte: >=, lte: <=
Find products with a price greater than or equal to 20 and less than 40
SQL: SELECT * FROM products WHERE price >= 20 AND price < 40
- GET /my_store/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "range" : {
- "price" : {
- "gte" : 20,
- "lt" : 40
- }
- }
- }
- }
- }
- }
Date range query; date math is supported, e.g. now or <date>||+1M
- GET /website/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "range" : {
- "post_date": {
- "gte" : "2020-01-01",
- "lt": "2020-09-09||+1h"
- }
- }
- }
- }
- }
- }
SQL: SELECT * FROM products WHERE (price = 20 OR productID = "XHDK-A-1293-#fJ3") AND (price != 30)
- GET /my_store/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "bool" : {
- "should" : [
- { "term" : {"price" : 20}},
- { "term" : {"productID" : "XHDK-A-1293-#fJ3"}}
- ],
- "must_not" : {
- "term" : {"price" : 30}
- }
- }
- }
- }
- }
- }
SQL: SELECT * FROM products WHERE productID = "KDKE-B-9947-#kL5" OR (productID = "JODL-X-1937-#pV7" AND price = 30)
- GET /my_store/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "bool" : {
- "should" : [
- { "term" : {"productID" : "KDKE-B-9947-#kL5"}},
- { "bool" : {
- "must" : [
- { "term" : {"productID" : "JODL-X-1937-#pV7"}},
- { "term" : {"price" : 30}}
- ]
- }}
- ]
- }
- }
- }
- }
- }
Insert test data
- POST /posts/_bulk
- { "index": { "_id": "1" }}
- { "tags" : ["search"] }
- { "index": { "_id": "2" }}
- { "tags" : ["search", "open_source"] }
- { "index": { "_id": "3" }}
- { "other_field" : "some data" }
- { "index": { "_id": "4" }}
- { "tags" : null }
- { "index": { "_id": "5" }}
- { "tags" : ["search", null] }
Exists query
SQL: SELECT tags FROM posts WHERE tags IS NOT NULL
- GET /posts/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "exists" : { "field" : "tags" }
- }
- }
- }
- }
Missing query
SQL: SELECT tags FROM posts WHERE tags IS NULL
- GET /posts/_search
- {
- "query" : {
- "constant_score" : {
- "filter" : {
- "bool": {
- "must_not":{"exists" : { "field" : "tags" }}
- }
- }
- }
- }
- }
Insert test data
- POST /my_index/_bulk
- { "index": { "_id": 1 }}
- { "title": "The quick brown fox" }
- { "index": { "_id": 2 }}
- { "title": "The quick brown fox jumps over the lazy dog" }
- { "index": { "_id": 3 }}
- { "title": "The quick brown fox jumps over the quick dog" }
- { "index": { "_id": 4 }}
- { "title": "Brown fox brown dog" }
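Single-term query
Execution steps:
- Check the field type
- Analyze the query string
- Run a term query, looking up the documents that contain quick in the inverted index
- Score and sort the documents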
- GET /my_index/_search
- {
- "query": {
- "match": {
- "title": "QUICK!"
- }
- }
- }
Check the field type and analyze the query string
- GET /_analyze
- {
- "text": "QUICK!"
- }
Run a term query, looking up the documents that contain quick in the inverted index
- GET /my_index/_search
- {
- "query": {
- "term": {
- "title": "quick"
- }
- }
- }
Multi-term query
- GET /my_index/_search
- {
- "query": {
- "match": {
- "title": "BROWN DOG!"
- }
- }
- }
-
- GET /my_index/_search
- {
- "query": {
- "bool": {
- "should": [
- {"term": {"title": "brown"}},
- {"term":{"title":"dog"}}
- ]
- }
- }
- }
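Score and sort the documents
Summary: the more terms a document matches, the more relevant it is and the higher it ranks
operator: changes how the terms are combined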
- GET /my_index/_search
- {
- "query": {
- "match": {
- "title": {
- "query": "BROWN DOG!",
- "operator": "and"
- }
- }
- }
- }
-
- GET /my_index/_search
- {
- "query": {
- "bool": {
- "must": [
- {"term": {"title": "brown"}},
- {"term":{"title":"dog"}}
- ]
- }
- }
- }
Find all documents that contain quick but not lazy; documents that also match the should clauses get a higher relevance score
- GET /my_index/_search
- {
- "query": {
- "bool": {
- "must": { "match": { "title": "quick" }},
- "must_not": { "match": { "title": "lazy" }},
- "should": [
- { "match": { "title": "brown" }},
- { "match": { "title": "dog" }}
- ]
- }
- }
- }
- GET /my_index/_search
- {
- "query": {
- "match_phrase": {
- "title": "quick brown fox"
- }
- }
- }
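Drawbacks of from/size paging:
- Inefficient. With from=5000 and size=100, for example, ES has to match and sort on every shard to collect 5000+100 hits from each, and then keeps only the last 100 of the merged result set.
- At most 10,000 hits can be paged through. The default limit is max_result_window=10000; when from + size > max_result_window, ES returns an error.
- Workaround: use scroll (cursor queries)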
- {
- "query": {
- "match_all": {}
- },
- "from": 0,
- "size": 1
- }
from: the offset of the first result to return, starting at 0
size: the number of results to return
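The cost of deep paging with from/size can be illustrated with a small simulation. This is a sketch under simplifying assumptions: each shard is just a list of (score, doc_id) tuples, and the function `search_page` is hypothetical, not an ES API.

```python
import heapq

def search_page(shards, from_, size):
    """Simulate from/size paging over several shards of (score, doc_id) hits."""
    candidates = []
    fetched = 0
    for shard in shards:
        # Every shard has to produce its own top (from + size) hits.
        top = heapq.nlargest(from_ + size, shard)
        fetched += len(top)
        candidates.extend(top)
    # The coordinating node merges everything and keeps only `size` hits.
    merged = sorted(candidates, reverse=True)
    return merged[from_:from_ + size], fetched

# Three shards with 1000 hits each; scores are unique across shards.
shards = [[(score, f"s{s}-d{score}") for score in range(s, 3000, 3)] for s in range(3)]
page, fetched = search_page(shards, from_=500, size=10)
print(len(page), fetched)  # 10 1530
```

Only 10 hits are returned, yet 1530 had to be collected and sorted; the overhead grows with both `from` and the number of shards.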
Sort by a field of the main document
- {
- "query": {
- "match_all": {
- }
- },
- "sort": [
- {
- "age": {
- "order": "desc"
- }
- }
- ]
- }
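Sort by a field of a nested document
Filter conditions in the main query do not remove the inner nested documents that fail them, so unless the sort clause carries its own filter, the sort still considers all nested documents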
- {
- "query": {
- "nested": {
- "path": "shgx",
- "query": {
- "range": {
- "shgx.age": {
- "lt": 50
- }
- }
- }
- }
- },
- "sort": [
- {
- "shgx.age": {
- "nested_path": "shgx",
- "order": "desc",
- "nested_filter": {
- "range": {
- "shgx.age": {
- "lt": 50
- }
- }
- }
- }
- }
- ]
- }
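Start a scroll (cursor) query
GET /host/_search?scroll=1m
scroll=1m keeps the scroll context alive for 1 minute; use a longer value when each batch is large. The response contains a _scroll_id field, which is used to fetch the following batches
Fetch the remaining batches in a loop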
- GET /_search/scroll
- {
- "scroll": "1m",
- "scroll_id": "<_scroll_id from the previous response>"
- }
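Using Python
- from elasticsearch import Elasticsearch
-
- es = Elasticsearch(['localhost:9200'])
- # Example query body; replace it with your own DSL
- dsl_body = {"query": {"match_all": {}}}
-
- # 1. Start the scroll (keyword arguments avoid relying on positional order)
- queryData = es.search(index="internal_isop_log", body=dsl_body, scroll='1m', size=1000)
-
- # Keep the first batch of hits and the scroll id
- hits_list = queryData["hits"]["hits"]
- scroll_id = queryData['_scroll_id']
-
- # 2. Keep fetching until a batch comes back empty
- while True:
-     res = es.scroll(body={'scroll': '1m', 'scroll_id': scroll_id})
-     batch = res["hits"]["hits"]
-     if not batch:
-         break
-     hits_list.extend(batch)
-     # The scroll id may change between calls, so always use the latest one
-     scroll_id = res['_scroll_id']
-
- # 3. Release the scroll context on the server
- es.clear_scroll(body={'scroll_id': scroll_id})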
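Create the index and make the postcode field use the keyword analyzer. Note: partial-match queries operate on the terms stored in the inverted index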
- PUT /address
- {
- "mappings": {
- "properties": {
- "postcode": {
- "type": "text",
- "analyzer": "keyword"
- }
- }
- }
- }
Import test data
- PUT /address/_bulk
- { "index": { "_id": 1 }}
- { "postcode": "W1V 3DG" }
- { "index": { "_id": 2 }}
- { "postcode": "W2F 8HW" }
- { "index": { "_id": 3 }}
- { "postcode": "W1F 7HW" }
- { "index": { "_id": 4 }}
- { "postcode": "WC1N 1LZ" }
- { "index": { "_id": 5 }}
- { "postcode": "SW5 0BE" }
Inverted index
Term | Doc IDs |
"SW5 0BE" | 5 |
"W1F 7HW" | 3 |
"W1V 3DG" | 1 |
"W2F 8HW" | 2 |
"WC1N 1LZ" | 4 |
Prefix query (prefix)
Match documents whose postcode field starts with "W1"
- GET /address/_search
- {
- "query": {
- "prefix": {
- "postcode": "W1"
- }
- }
- }
Wildcard query (wildcard)
- GET /address/_search
- {
- "query": {
- "wildcard": {
- "postcode": "W?F*HW"
- }
- }
- }
Regular expression query (regexp)
- GET /address/_search
- {
- "query": {
- "regexp": {
- "postcode": "W[0-9].+"
- }
- }
- }
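Because these three queries run against the terms of the inverted index, their behavior on the postcode data can be approximated in plain Python. This is a sketch only: `str.startswith`, `fnmatch` and `re.fullmatch` stand in for the prefix, wildcard and regexp queries (the helper names are hypothetical), and the Lucene regular expression syntax is not identical to Python's.

```python
import fnmatch
import re

# Term -> document ID, as in the inverted index table for the postcode field.
terms = {"SW5 0BE": 5, "W1F 7HW": 3, "W1V 3DG": 1, "W2F 8HW": 2, "WC1N 1LZ": 4}

def prefix(p):
    return sorted(doc for t, doc in terms.items() if t.startswith(p))

def wildcard(pattern):
    # ? matches one character, * matches any number of characters.
    return sorted(doc for t, doc in terms.items() if fnmatch.fnmatchcase(t, pattern))

def regexp(pattern):
    # The pattern has to match the whole term, hence fullmatch.
    return sorted(doc for t, doc in terms.items() if re.fullmatch(pattern, t))

print(prefix("W1"))        # [1, 3]
print(wildcard("W?F*HW"))  # [2, 3]
print(regexp("W[0-9].+"))  # [1, 2, 3]
```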
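Effect of leaving the analyzer unconfigured
Example: for a title field containing "Quick brown fox", the inverted index holds quick, brown and fox
{ "regexp": { "title": "br.*" }} | Matches |
{ "regexp": { "title": "Qu.*" }} | No match: the indexed term quick is lowercase |
{ "regexp": { "title": "quick br*" }} | No match: quick and brown are separate terms |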
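Besides search, Elasticsearch also provides aggregations for analyzing data in real time. Aggregations give an overview of the data by analyzing and summarizing the whole data set
Search/filtering and analysis run over the same data in a single request
Aggregations rest on two main concepts: buckets and metrics
Buckets: a collection of documents that meet a given criterion
Metrics: statistics computed over the documents in a bucket
When query and aggs appear together, the main query runs first, and only the documents it returns are passed on to aggs for aggregation
Pseudo-code structure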
- {
- "query": { ... },
- "size": 0,
- "aggs": {
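- "custom_name1": { // custom name of aggregation 1
- "bucket_type": { ... } // definition of bucket 1
- },
- "custom_name2": { // one aggs section can hold several aggregations
- "bucket_type": { ... }
- },
- "custom_name3": {
- "bucket_type": {
- .....
- },
- "aggs": { // aggs can be nested inside another aggregation
- "in_name": { // every aggregation needs a custom name first
- "bucket_type": { ... } // in_name's bucket works on the documents produced by custom_name3's bucket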
- }
- }
- }
- }
- }
Example: find the maximum age across all records
- POST /book1/_search?pretty
-
- {
- "size": 0,
- "aggs": {
- "maxage": {
- "max": {
- "field": "age"
- }
- }
- }
- }
Result
- {
- "took": 4,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "maxage": {
- "value": 54
- }
- }
- }
Example: compute the average age across all records, and the average with 10 added to each age
- POST /book1/_search?pretty
- {
- "size":0,
- "aggs": {
- "avg_age": {
- "avg": {
- "script": {
- "source": "doc.age.value"
- }
- }
- },
- "avg_age10": {
- "avg": {
- "script": {
- "source": "doc.age.value + 10"
- }
- }
- }
- }
- }
Result:
- {
- "took": 3,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "avg_age": {
- "value": 7.585365853658536
- },
- "avg_age10": {
- "value": 17.585365853658537
- }
- }
- }
Example: supply a value for missing fields. If none is supplied, documents that lack the field are ignored
- POST /book1/_search?pretty
-
- {
- "size":0,
- "aggs": {
- "sun_age": {
- "avg": {
- "field":"age",
- "missing":15
- }
- }
- }
- }
Result
- {
- "took": 12,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "sun_age": {
- "value": 12.847826086956522
- }
- }
- }
Count the documents that have a value for a field
- POST /book1/_search?size=0
- {
- "aggs":{
- "age_count":{
- "value_count":{
- "field":"age"
- }
-
- }
- }
- }
Result
- {
- "took": 1,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "age_count": {
- "value": 38
- }
- }
- }
Distinct count (cardinality)
- POST /book1/_search?size=0
- {
- "aggs":{
- "age_count":{
- "value_count":{
- "field":"age"
- }
-
- },
- "name_count":{
- "cardinality":{
- "field":"age"
- }
- }
- }
- }
Result
- {
- "took": 16,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "name_count": {
- "value": 11
- },
- "age_count": {
- "value": 38
- }
- }
- }
Compute the five values count, max, min, avg and sum
- POST /book1/_search?size=0
- {
- "aggs":{
- "age_count":{
- "stats":{
- "field":"age"
- }
-
- }
- }
- }
Result
- {
- "took": 12,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "age_count": {
- "count": 38,
- "min": 1,
- "max": 54,
- "avg": 12.394736842105264,
- "sum": 471
- }
- }
- }
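Extended statistics: four more results than stats, namely the sum of squares, the variance, the standard deviation, and the interval of the mean plus/minus two standard deviations.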
- POST /book1/_search?size=0
-
- {
- "aggs":{
- "age_stats":{
- "extended_stats":{
- "field":"age"
- }
-
- }
- }
- }
Result
- {
- "took": 8,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "age_stats": {
- "count": 38,
- "min": 1,
- "max": 54,
- "avg": 12.394736842105264,
- "sum": 471,
- "sum_of_squares": 11049,
- "variance": 137.13365650969527,
- "std_deviation": 11.710408041981085,
- "std_deviation_bounds": {
- "upper": 35.81555292606743,
- "lower": -11.026079241856905
- }
- }
- }
- }
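The extra values in this response can be re-derived from count, sum and sum_of_squares. The Python sketch below assumes the population variance (sum_of_squares / count - avg²) and bounds at two standard deviations; the function name `extended_stats` is hypothetical. With the numbers from the response above, it reproduces the reported figures up to floating-point rounding.

```python
import math

def extended_stats(count, total, sum_of_squares, sigma=2):
    """Derive avg, variance, std deviation and bounds from three running sums."""
    avg = total / count
    variance = sum_of_squares / count - avg * avg  # population variance
    std = math.sqrt(variance)
    return {
        "avg": avg,
        "variance": variance,
        "std_deviation": std,
        "std_deviation_bounds": {
            "upper": avg + sigma * std,
            "lower": avg - sigma * std,
        },
    }

stats = extended_stats(count=38, total=471, sum_of_squares=11049)
print(stats)
```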
Terms bucket: groups documents by the values of a field; there is one group per distinct value
Test data
- { "color": "red" }
- { "color": "green" }
- { "color": ["red", "blue"] }
DSL request
- {
- "query": {
- "match_all": {}
- },
- "size": 0,
- "aggs": {
- "my_name": {
- "terms": {
- "field": "color" // group by color
- }
- }
- }
- }
Result
- "aggregations": {
- "my_name": {
- "doc_count_error_upper_bound": 0,
- "sum_other_doc_count": 0,
- "buckets": [
- {
- "key": "blue",
- "doc_count": 1
- },
- {
- "key": "red",
- "doc_count": 2 // two documents have the color red, here {"color": "red"} and {"color": ["red", "blue"]}
- },
- {
- "key": "green",
- "doc_count": 1
- }
- ]
- }
- }
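The grouping performed by the terms bucket can be mimicked in Python with a counter. This is a sketch of the idea only; the helper `terms_buckets` is hypothetical and ignores sharding, bucket size limits and the error bounds that ES reports.

```python
from collections import Counter

def terms_buckets(docs, field):
    """Count documents per distinct value of `field`; list values count once per value."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        for v in set(values):
            counts[v] += 1
    # Highest doc_count first.
    return [{"key": k, "doc_count": n} for k, n in counts.most_common()]

docs = [
    {"color": "red"},
    {"color": "green"},
    {"color": ["red", "blue"]},
]
buckets = terms_buckets(docs, "color")
print(buckets)
```

A document with several values, such as `["red", "blue"]`, lands in several buckets, so the doc_count values can add up to more than the number of documents.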
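Filter bucket: aggregates over the documents that satisfy a filter
Note that the filter bucket is used exactly like a filter in the main query: both filter documents. The difference is that the filter bucket creates a new bucket of its own instead of being attached to the query. Since it is still an aggregation bucket, it can be nested with other buckets, but it does not depend on them
Test data: same as above
DSL request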
- {
- "query": {
- "match_all": {}
- },
- "size": 0,
- "aggs": {
- "my_name": {
- "filter": { // used like an ordinary filter, so bool nesting works here too
- "bool": {
- "must": {
- "terms": { // note: this is the terms query, not the terms bucket
- "color": [ "red", "blue" ]
- }
- }
- }
- }
- }
- }
- }
Result
- "aggregations": {
- "my_name": {
- "doc_count": 2 // number of documents in the filter bucket
- }
- }
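Filters bucket: aggregates over several filter groups at once
Example: count the documents whose name contains 'test' and those whose name contains '里', separately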
- POST book1/_search?size=0
- {
- "aggs":{
- "age_terms":{
- "filters":{
- "filters":{
- "test":{
- "match":{"name":"test"}
- },
- "china":{
- "match":{"name":"里"}
- }
- }
- }
- }
- }
- }
Result:
- {
- "took": 3,
- "timed_out": false,
- "_shards": {
- "total": 5,
- "successful": 5,
- "skipped": 0,
- "failed": 0
- },
- "hits": {
- "total": 41,
- "max_score": 0,
- "hits": []
- },
- "aggregations": {
- "age_terms": {
- "buckets": {
- "china": {
- "doc_count": 13
- },
- "test": {
- "doc_count": 5
- }
- }
- }
- }
- }
top_hits: returns the first few hits inside a bucket; the returned hits have exactly the same format as the hits of the main query
Parameters
Test data
- { "color": "red", "price": 100 }
- { "color": ["red", "blue"], "price": 1000 }
Group with a terms bucket, then use a top_hits bucket to return the 5 lowest-priced hits in each group
- {
- "query": {
- "match_all": {}
- },
- "size": 0,
- "aggs": {
- "my_name": {
- "terms": {
- "field": "color"
- },
- "aggs": {
- "my_top_hits": {
- "top_hits": {
- "size": 5,
- "sort": {
- "price": "asc"
- }
- }
- }
- }
- }
- }
- }
Result
- "aggregations": {
- "my_name": {
- "doc_count_error_upper_bound": 0,
- "sum_other_doc_count": 0,
- "buckets": [
- {
- "key": "red",
- "doc_count": 2, // number of documents with color red, as counted by the terms bucket
- "my_top_hits": {
- "hits": { // top_hits takes the red documents, sorts them by price ascending and keeps the first 5
- "total": 2,
- "max_score": null,
- "hits": [
- {
- "_score": null,
- "_source": { "color": "red", "price": 100 },
- "sort": [ 100 ]
- },
- {
- "_score": null,
- "_source": { "color": [ "red", "blue" ], "price": 1000 },
- "sort": [ 1000 ]
- }
- ]
- }
- }
- },
- {
- "key": "blue",
- "doc_count": 1, // number of documents with color blue, as counted by the terms bucket
- "my_top_hits": {
- "hits": { // hits found by top_hits
- "total": 1,
- "max_score": null,
- "hits": [
- {
- "_source": {
- "color": [ "red", "blue" ], "price": 1000 },
- "sort": [ 1000 ]
- }
- ]
- }
- }
- }
- ]
- }
- }
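The combination of a terms bucket with a nested top_hits can be sketched in Python as "group, then sort and truncate each group". The helper `top_hits_per_term` is hypothetical and only mirrors the shape of the computation, not the response format.

```python
from collections import defaultdict

def top_hits_per_term(docs, group_field, sort_field, size):
    """Group docs by each value of group_field, then keep the `size` lowest by sort_field."""
    groups = defaultdict(list)
    for doc in docs:
        value = doc[group_field]
        for key in (value if isinstance(value, list) else [value]):
            groups[key].append(doc)
    return {
        key: sorted(hits, key=lambda d: d[sort_field])[:size]
        for key, hits in groups.items()
    }

docs = [
    {"color": "red", "price": 100},
    {"color": ["red", "blue"], "price": 1000},
]
result = top_hits_per_term(docs, "color", "price", size=5)
print([d["price"] for d in result["red"]])   # [100, 1000]
print([d["price"] for d in result["blue"]])  # [1000]
```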
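Date histogram aggregation
Parameters
- time_zone: "+08:00" sets the time zone (UTC+8); if it is omitted, buckets are computed in UTC and the grouping boundaries are off for local time
- interval: the aggregation interval
- year (1y)
- quarter (1q)
- month (1M)
- week (1w)
- day (1d)
- hour (1h)
- minute (1m)
- second (1s)
- format: the format of the returned time
DSL request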
- {
- "query": {
- "match_all": {}
- },
- "size": 0,
- "aggs": {
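- // aggregation name of your choice
- "group_by_grabTime": {
- // date histogram aggregation provided by ES
- "date_histogram": {
- // field to group on; it must be of type date, in any date format
- "field": "@timestamp",
- // bucket width, 5 minutes here; the available units are listed above
- // (ES 7.2+ deprecates interval in favor of fixed_interval / calendar_interval)
- "interval": "5m",
- // time zone; "+08:00" yields UTC+8 times
- "time_zone": "+08:00",
- // output format; HH must be uppercase (24-hour clock), otherwise morning and afternoon cannot be told apart
- "format": "yyyy-MM-dd HH",
- // return empty buckets as well, with a doc_count of 0
- "min_doc_count": 0,
- // range over which empty buckets are filled in
- "extended_bounds": {
- "min": 1533556800000,
- "max": 1533806520000
- }
- },
- // sub-aggregation
- "aggs": {
- // name of your choice
- "group_by_status": {
- // terms bucket provided by ES
- "terms": {
- // field to aggregate on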
- "field": "LowStatusOfPrice"
- }
- }
- }
- }
- }
- }
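The bucketing step of a date histogram can be sketched in Python as flooring each timestamp to the start of its interval. This is a simplified illustration with a hypothetical helper `date_histogram`: it assumes a fixed-width interval in minutes and a whole-hour UTC offset (so flooring in UTC and in local time agree), and it does not fill empty buckets the way min_doc_count with extended_bounds does.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

def date_histogram(timestamps_ms, interval_minutes=5, utc_offset_hours=8):
    """Count epoch-millisecond timestamps per fixed-width bucket, keyed by local time."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    interval_ms = interval_minutes * 60 * 1000
    counts = Counter()
    for ts in timestamps_ms:
        bucket_start = ts - ts % interval_ms  # floor to the bucket boundary
        counts[bucket_start] += 1
    return {
        datetime.fromtimestamp(start / 1000, tz).strftime("%Y-%m-%d %H:%M"): n
        for start, n in sorted(counts.items())
    }

base = 1533556800000  # the extended_bounds minimum used in the request
histogram = date_histogram([base, base + 60_000, base + 5 * 60_000])
print(histogram)  # {'2018-08-06 20:00': 2, '2018-08-06 20:05': 1}
```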