Elasticsearch Advanced Topics

Aggregation queries - aggs

Preparing the data

Create the index:

PUT /employee
{
  "mappings": {
    "properties": {
      "id": {
        "type": "integer"
      },
      "name": {
        "type": "keyword"
      },
      "job": {
        "type": "keyword"
      },
      "age": {
        "type": "integer"
      },
      "gender": {
        "type": "keyword"
      }
    }
  }
}

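Insert the documents (the sal field is not declared in the mapping above, so Elasticsearch maps it dynamically as long):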

POST /employee/_bulk
{"index": {"_id": 1}}
{"id": 1, "name": "Bob", "job": "java", "age": 21, "sal": 8000, "gender": "female"}
{"index": {"_id": 2}}
{"id": 2, "name": "Rod", "job": "html", "age": 31, "sal": 18000, "gender": "female"}
{"index": {"_id": 3}}
{"id": 3, "name": "Gaving", "job": "java", "age": 24, "sal": 12000, "gender": "male"}
{"index": {"_id": 4}}
{"id": 4, "name": "King", "job": "dba", "age": 26, "sal": 15000, "gender": "female"}
{"index": {"_id": 5}}
{"id": 5, "name": "Jonhson", "job": "dba", "age": 29, "sal": 16000, "gender": "male"}
{"index": {"_id": 6}}
{"id": 6, "name": "Douge", "job": "java", "age": 41, "sal": 20000, "gender": "female"}
{"index": {"_id": 7}}
{"id": 7, "name": "cutting", "job": "dba", "age": 27, "sal": 7000, "gender": "male"}
{"index": {"_id": 8}}
{"id": 8, "name": "Bona", "job": "html", "age": 22, "sal": 14000, "gender": "female"}
{"index": {"_id": 9}}
{"id": 9, "name": "Shyon", "job": "dba", "age": 20, "sal": 19000, "gender": "female"}
{"index": {"_id": 10}}
{"id": 10, "name": "James", "job": "html", "age": 18, "sal": 22000, "gender": "male"}
{"index": {"_id": 11}}
{"id": 11, "name": "Golsling", "job": "java", "age": 32, "sal": 23000, "gender": "female"}
{"index": {"_id": 12}}
{"id": 12, "name": "Lily", "job": "java", "age": 24, "sal": 2000, "gender": "male"}
{"index": {"_id": 13}}
{"id": 13, "name": "Jack", "job": "html", "age": 23, "sal": 3000, "gender": "female"}
{"index": {"_id": 14}}
{"id": 14, "name": "Rose", "job": "java", "age": 36, "sal": 6000, "gender": "female"}
{"index": {"_id": 15}}
{"id": 15, "name": "Will", "job": "dba", "age": 38, "sal": 4500, "gender": "male"}
{"index": {"_id": 16}}
{"id": 16, "name": "smith", "job": "java", "age": 32, "sal": 23000, "gender": "male"}

Sum - sum

Query: the total of all employees' salaries

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "sum_sal": {
      "sum": {
        "field": "sal"
      }
    }
  }
}

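aggs: short for aggregations. Like query, it is a fixed keyword of the request body, and it marks the request as an aggregation query.

sum_sal: a name we choose ourselves; the value computed by this aggregation is returned under this name.

sum: a fixed keyword naming the aggregation type. It can be thought of as a function; Elasticsearch ships with many of them.

size: a search returns the matching documents along with the aggregation, and the aggregation result comes at the very bottom of the response. Setting size to 0 suppresses the documents so that the result is easy to read.

Query result:

[Image: image-20220222103935951, screenshot of the query result]

The result appears under aggregations at the bottom of the response: sum_sal = 212500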
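As a cross-check, the same total can be reproduced offline from the sample data. The sketch below is plain Python, not an Elasticsearch client call; the `response` dict is a hand-written, abbreviated stand-in for the JSON body of the search response, used only to show where the named aggregation sits.

```python
# Salaries ("sal") of the 16 sample employees, in _id order, copied from the bulk request.
salaries = [8000, 18000, 12000, 15000, 16000, 20000, 7000, 14000,
            19000, 22000, 23000, 2000, 3000, 6000, 4500, 23000]

# Hand-written, abbreviated stand-in for the search response body (an assumption
# for illustration; a real response carries more fields such as "took").
response = {
    "hits": {"hits": []},  # no documents, because the request sets "size": 0
    "aggregations": {"sum_sal": {"value": float(sum(salaries))}},
}

# The result is read back under the name chosen in the request.
sum_sal = response["aggregations"]["sum_sal"]["value"]
print(sum_sal)  # 212500.0
```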

Average - avg

Query: the average salary of the employees

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "avg_sal": {
      "avg": {
        "field": "sal"
      }
    }
  }
}

avg: takes the mean, just like the AVG() aggregate function in MySQL.

The query is simple enough that no result screenshot is included.

Distinct count - cardinality

Query: how many distinct jobs there are

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "cardi_job": {
      "cardinality": {
        "field": "job"
      }
    }
  }
}

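cardinality: removes duplicates and then counts (it does not sum), equivalent to count(distinct) in MySQL. Elasticsearch computes it with the HyperLogLog++ algorithm, so on high-cardinality fields the count is an approximation.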
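For the sample data, the distinct count can be checked by hand. The sketch below uses an exact Python set as a stand-in; it is not how Elasticsearch computes the value internally, but on a field with only a few distinct values the two should agree.

```python
# "job" values of the 16 sample employees, in _id order, copied from the bulk request.
jobs = ["java", "html", "java", "dba", "dba", "java", "dba", "html",
        "dba", "html", "java", "java", "html", "java", "dba", "java"]

# Exact equivalent of count(distinct job): deduplicate, then count.
cardi_job = len(set(jobs))
print(cardi_job)  # 3
```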

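Multiple aggregations in one request

Query: the maximum, minimum, and average of the average ticket price (AvgTicketPrice) across the sample flight data provided by Kibana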

GET /kibana_sample_data_flights/_search
{
  "size": 0, 
  "aggs": {
    "max_ticket_price": {
      "max": {
        "field": "AvgTicketPrice"
      }
    },
    "min_ticket_price": {
      "min": {
        "field": "AvgTicketPrice"
      }
    },
    "avg_ticket_price": {
      "avg": {
        "field": "AvgTicketPrice"
      }
    }
  }
}

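Kibana can load three official sample data sets as indices: e-commerce, logs, and flights.

[Image: image-20220222112016672]

PS: Why are some indices green and others yellow? Green means every primary and replica shard is allocated; yellow means all primary shards are allocated but at least one replica is not. See http://www.jwsblog.com/archives/59.html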

Summary statistics helper - stats

A single aggregation that returns the count, minimum, maximum, average, and sum in one request

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "sal_info": {
      "stats": {
        "field": "sal"
      }
    }
  }
}

Query result:

[Image: image-20220222113317131, screenshot of the query result]

Note: stats only works on numeric fields; it cannot be used on non-numeric fields.
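The five values that stats reports can be reproduced from the sample salaries. This is an offline Python sketch of the arithmetic, not an Elasticsearch call; the key names mirror those in the stats response.

```python
# Salaries ("sal") of the 16 sample employees, in _id order, copied from the bulk request.
salaries = [8000, 18000, 12000, 15000, 16000, 20000, 7000, 14000,
            19000, 22000, 23000, 2000, 3000, 6000, 4500, 23000]

# The five summary values, computed with Python built-ins.
sal_info = {
    "count": len(salaries),
    "min": min(salaries),
    "max": max(salaries),
    "avg": sum(salaries) / len(salaries),
    "sum": sum(salaries),
}
print(sal_info)
# {'count': 16, 'min': 2000, 'max': 23000, 'avg': 13281.25, 'sum': 212500}
```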

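Grouped counts - terms

Query: count flights by destination country (group, then count), roughly equivalent to SELECT DestCountry, COUNT(*) ... GROUP BY DestCountry in MySQL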

GET /kibana_sample_data_flights/_search
{
  "size": 0, 
  "aggs": {
    "count_dest_country": {
      "terms": {
        "field": "DestCountry",
        "size": 10,
        "order": {
          "_count": "desc"
        }
      }
    }
  }
}

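Note: terms in aggs is very different from terms in query. The terms aggregation groups documents into buckets by field value, whereas the terms query filters for documents whose field matches any of the given values.

Terms aggregation in Elasticsearch: https://www.cnblogs.com/xing901022/p/4947436.html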
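The bucketing that a terms aggregation performs can be sketched with `collections.Counter`. Because the flight sample data is not reproduced in this article, the sketch groups the employee sample data by job instead; that substitution is an assumption made purely for illustration.

```python
from collections import Counter

# "job" values of the 16 sample employees, in _id order, copied from the bulk request.
jobs = ["java", "html", "java", "dba", "dba", "java", "dba", "html",
        "dba", "html", "java", "java", "html", "java", "dba", "java"]

# One bucket per distinct value, largest bucket first, at most 10 buckets:
# the counterpart of "size": 10 with "order": {"_count": "desc"}.
buckets = Counter(jobs).most_common(10)
print(buckets)  # [('java', 7), ('dba', 5), ('html', 4)]
```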

Nested aggregation 1: the number of flights per destination country, broken down by destination weather

GET /kibana_sample_data_flights/_search
{
  "size": 0, 
  "aggs": {
    "count_dest_country": {
      "terms": {
        "field": "DestCountry",
        "order": {
          "_count": "desc"
        }
      },
      "aggs": {
        "weather_count": {
          "terms": {
            "field": "DestWeather"
          }
        }
      }
    }
  }
}

Nested aggregation 2: for each job, the gender breakdown and the salary statistics

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "job_info": {
      "terms": {
        "field": "job"
      },
      "aggs": {
        "gender_info": {
          "terms": {
            "field": "gender"
          },
          "aggs": {
            "sal_info": {
              "stats": {
                "field": "sal"
              }
            }
          }
        }
      }
    }
  }
}
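The nesting in the request above (job, then gender, then salary statistics) maps naturally onto nested dictionaries. The following is an offline Python sketch over the sample data, not an Elasticsearch call; the helper name `summarize` is made up for this example.

```python
from collections import defaultdict

# (job, gender, sal) of the 16 sample employees, in _id order, from the bulk request.
employees = [
    ("java", "female", 8000), ("html", "female", 18000), ("java", "male", 12000),
    ("dba", "female", 15000), ("dba", "male", 16000), ("java", "female", 20000),
    ("dba", "male", 7000), ("html", "female", 14000), ("dba", "female", 19000),
    ("html", "male", 22000), ("java", "female", 23000), ("java", "male", 2000),
    ("html", "female", 3000), ("java", "female", 6000), ("dba", "male", 4500),
    ("java", "male", 23000),
]

# Outer bucket: job. Inner bucket: gender. Leaf: the salaries in that bucket.
grouped = defaultdict(lambda: defaultdict(list))
for job, gender, sal in employees:
    grouped[job][gender].append(sal)

def summarize(values):
    """Hypothetical helper: the five values a stats aggregation reports."""
    return {"count": len(values), "min": min(values), "max": max(values),
            "avg": sum(values) / len(values), "sum": sum(values)}

job_info = {job: {gender: summarize(sals) for gender, sals in genders.items()}
            for job, genders in grouped.items()}
print(job_info["java"]["female"])
# {'count': 4, 'min': 6000, 'max': 23000, 'avg': 14250.0, 'sum': 57000}
```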

Top hits - top_hits

Query: the two oldest employees

Method 1:

GET /employee/_search
{
  "size": 2, 
  "sort": [
    {
      "age": {
        "order": "desc"
      }
    }
  ]
}

Method 2:

GET /employee/_search
{
  "size": 0, 
  "aggs": {
    "top_age": {
      "top_hits": {
        "size": 2,
        "sort": [
          {
            "age": {
              "order": "desc"
            }
          }
        ]
      }
    }
  }
}
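Both methods boil down to "sort by age descending and keep the first two". A minimal offline Python sketch of that logic over the sample data (not an Elasticsearch call):

```python
# (name, age) of the 16 sample employees, in _id order, copied from the bulk request.
employees = [
    ("Bob", 21), ("Rod", 31), ("Gaving", 24), ("King", 26), ("Jonhson", 29),
    ("Douge", 41), ("cutting", 27), ("Bona", 22), ("Shyon", 20), ("James", 18),
    ("Golsling", 32), ("Lily", 24), ("Jack", 23), ("Rose", 36), ("Will", 38),
    ("smith", 32),
]

# Sort by age, oldest first, and keep two entries ("size": 2).
top_age = sorted(employees, key=lambda e: e[1], reverse=True)[:2]
print(top_age)  # [('Douge', 41), ('Will', 38)]
```

The practical difference between the two methods is that top_hits is an aggregation, so it can also be nested under a bucket aggregation such as terms to return the top documents per bucket.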

Range buckets - range

GET employee/_search
{
  "size": 0,
  "aggs": {
    "sal_info": {
      "range": {
        "field": "sal",
        "ranges": [
          {
            "key": "0 <= sal < 5000",
            "from": 0,
            "to": 5000
          },
          {
            "key": "5000 <= sal < 10000",
            "from": 5000,
            "to": 10000
          },
          {
            "key": "10000 <= sal < 15000",
            "from": 10000,
            "to": 15000
          }
        ]
      }
    }
  }
}
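In the range aggregation, each bucket includes its from value and excludes its to value, so adjacent buckets should share a boundary (to of one equals from of the next) to avoid gaps. The sketch below applies that half-open rule to the sample salaries in plain Python; it is an offline illustration, not an Elasticsearch call.

```python
# Salaries ("sal") of the 16 sample employees, in _id order, copied from the bulk request.
salaries = [8000, 18000, 12000, 15000, 16000, 20000, 7000, 14000,
            19000, 22000, 23000, 2000, 3000, 6000, 4500, 23000]

# Half-open buckets: lower bound included, upper bound excluded.
ranges = [(0, 5000), (5000, 10000), (10000, 15000)]

sal_info = {
    f"{lo} <= sal < {hi}": sum(1 for s in salaries if lo <= s < hi)
    for lo, hi in ranges
}
print(sal_info)
# {'0 <= sal < 5000': 3, '5000 <= sal < 10000': 3, '10000 <= sal < 15000': 2}
```

A salary of exactly 15000 lands in none of these buckets, and neither does anything above it; a final range with only a from value would be needed to catch those.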

Search suggestions

When a misspelled word leaves a search with no results at all, we would like Elasticsearch to suggest a corrected search term.

GET /es_jd_goods/_search
{
  "suggest": {
    "title_suggestion": {
      "text": "elasticsearh",
      "term": {
        "field": "name",
        "suggest_mode": "popular"
      }
    }
  }
}

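suggest: the fixed keyword that introduces a suggestion request

text: the original search text we entered

suggest_mode takes one of three values: popular, missing, always

  1. popular suggests only terms that occur in more documents than the original term.
  2. missing suggests only when the term being searched for does not occur in the index (the default).
  3. always suggests matching terms regardless of the circumstances.

Note: in missing mode, suggestions are given only when the term has no entry in the inverted index. The example above uses popular, so a term that does exist can still receive suggestions that occur in more documents.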
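The idea behind a term suggestion is to look for indexed terms within a small edit distance of the input. The sketch below illustrates that idea in plain Python; it is not Elasticsearch's implementation, and the `term_doc_freq` table, the `suggest` function, and its ranking rule are assumptions made for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

# Hypothetical indexed terms and their document frequencies.
term_doc_freq = {"elasticsearch": 12, "elastic": 5, "search": 30}

def suggest(text: str, max_edits: int = 2) -> list:
    """Return indexed terms within max_edits of text, closest and most frequent first."""
    scored = [(edit_distance(text, term), -freq, term)
              for term, freq in term_doc_freq.items()]
    return [term for dist, _, term in sorted(scored) if 0 < dist <= max_edits]

print(suggest("elasticsearh"))  # ['elasticsearch']
```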

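Autocomplete and prefix search - suggest - prefix

Autocomplete is extremely performance-sensitive: every character the user types sends a request to look up matches.
For this reason Elasticsearch implements it with a different data structure (an in-memory finite state transducer, FST) rather than with the inverted index.

Note: the field must be mapped with the completion type, so the mapping has to be defined before the data is indexed into Elasticsearch.

PS: The type of an existing field cannot be changed in place in Elasticsearch. If a field's type has to change, the only option is to rebuild the index and reload the data, as shown below: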

# View the mapping of the current index and copy it
GET /es_jd_goods/_mapping
GET /es_jd_goods/_search

# Create a temporary index with the modified field type
PUT /es_jd_temp
{
  "mappings": {
    "properties": {
      "createTime": {
        "type": "long"
      },
      "id": {
        "type": "long"
      },
      "imgUrl": {
        "type": "keyword"
      },
      "modifyTime": {
        "type": "long"
      },
      "name": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "analyzer": "ik_max_word"
      },
      "price": {
        "type": "double"
      },
      "shopName": {
        "type": "completion"
      },
      "valid": {
        "type": "boolean"
      }
    }
  }
}

# View the mapping of the temporary index and copy it
GET /es_jd_temp/_mapping
GET /es_jd_temp/_search

# Copy the data into the temporary index
POST _reindex
{
  "source": {
    "index": "es_jd_goods"
  },
  "dest": {
    "index": "es_jd_temp"
  }
}

# Delete the original index
DELETE es_jd_goods

# Recreate the index
PUT /es_jd_goods
{
  "mappings": {
    "properties": {
      "createTime": {
        "type": "long"
      },
      "id": {
        "type": "long"
      },
      "imgUrl": {
        "type": "keyword"
      },
      "modifyTime": {
        "type": "long"
      },
      "name": {
        "type": "text",
        "term_vector": "with_positions_offsets",
        "analyzer": "ik_max_word"
      },
      "price": {
        "type": "double"
      },
      "shopName": {
        "type": "completion"
      },
      "valid": {
        "type": "boolean"
      }
    }
  }
}

GET /es_jd_goods/_mapping
GET /es_jd_goods/_search

# Copy the data from the temporary index back into the real index
POST _reindex
{
  "source": {
    "index": "es_jd_temp"
  },
  "dest": {
    "index": "es_jd_goods"
  }
}

# Delete the temporary index
DELETE es_jd_temp

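This is a very common business scenario: on sites such as Baidu or JD, typing just the first few characters brings up keywords you may be interested in below the search box.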

GET /es_jd_goods/_search
{
  "_source": [
    "shopName"
  ],
  "suggest": {
    "prefix_suggestion": {
      "prefix": "电子",
      "completion": {
        "field": "shopName",
        "skip_duplicates": true,
        "size": 10
      }
    }
  }
}
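To make the prefix lookup concrete, here is a small offline Python sketch. It uses a sorted list with binary search as a simple stand-in for the specialised structure Elasticsearch keeps for completion fields, so it shows the behaviour of the query rather than the real implementation. The shop names and the `complete` function are hypothetical.

```python
import bisect

# Hypothetical shopName values (one duplicate on purpose), kept sorted for binary search.
shop_names = sorted([
    "电子工业出版社", "电子科技大学出版社", "人民邮电出版社",
    "机械工业出版社", "电子工业出版社",
])

def complete(prefix: str, size: int = 10, skip_duplicates: bool = True) -> list:
    """Return up to `size` names that start with `prefix`, in sorted order."""
    results = []
    # Every name sharing the prefix sits in one contiguous run starting here.
    start = bisect.bisect_left(shop_names, prefix)
    for name in shop_names[start:]:
        if not name.startswith(prefix):
            break  # left the run of matching names
        if skip_duplicates and name in results:
            continue
        results.append(name)
        if len(results) == size:
            break
    return results

print(complete("电子"))  # ['电子工业出版社', '电子科技大学出版社']
```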

Highlighting

GET /es_jd_goods/_search
{
  "query": {
    "multi_match": {
      "query": "设计",
      "fields": ["name","shopName"]
    }
  },
  "highlight": {
    "pre_tags": "<span>",
    "post_tags": "</span>", 
    "fields": {
      "name": {}, 
      "shopName": {
        "pre_tags": "<em>",
        "post_tags": "</em>"
      }
    }
  }
}

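highlight: wraps the matched text in tags. It is meant for text fields (recent Elasticsearch versions can also highlight keyword fields).

pre_tags: the opening tag inserted before each highlighted fragment

post_tags: the closing tag inserted after each highlighted fragment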
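The effect of pre_tags and post_tags can be pictured with a plain string replacement. The sketch below is only an illustration of the output format, not how Elasticsearch's highlighters work internally (they operate on analyzed terms rather than raw substrings); the `highlight` function and the sample product name are made up for this example.

```python
def highlight(text: str, term: str, pre_tag: str = "<em>", post_tag: str = "</em>") -> str:
    """Wrap every occurrence of `term` in `text` with the given tags."""
    return text.replace(term, f"{pre_tag}{term}{post_tag}")

# Hypothetical product name, highlighted with the <span> tags used for the name field.
print(highlight("平面设计基础", "设计", "<span>", "</span>"))
# 平面<span>设计</span>基础
```

When no tags are supplied, the sketch falls back to <em> and </em>, which are also the tags Elasticsearch uses by default.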
