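Syntax
POST /wubigdata/_search
{
  "query": {
    "match_all": {}
  }
}
# query: the query object
# match_all: match every document
# Response fields
# took: time the query took, in milliseconds
# timed_out: whether the query timed out
# _shards: shard information
# hits: summary object for the search results
#   total: total number of matching documents
#   max_score: the highest score among the matching documents
#   hits: array of matching documents, one element per document
#     _index: index name
#     _type: document type
#     _id: document id
#     _score: document score
#     _source: the document's source data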
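match queries accept text, numbers, and dates, analyze the input, and build a boolean query from the resulting terms. The operator parameter sets how the terms are combined (or / and; the default is or).
A. or relationship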
POST /wudbes/_search
{
  "query": {
    "match": {
      "title": "小米电视"
    }
  }
}
# ------------------------------------------------------------------
{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 2,
      "relation" : "eq"
    },
    "max_score" : 1.2044649,
    "hits" : [
      {
        "_index" : "wudbes",
        "_type" : "_doc",
        "_id" : "_C9i54EBVssqw8qzOZhJ",
        "_score" : 1.2044649,
        "_source" : {
          "title" : "小米电视4A",
          "images" : "http://image.com/12479122.jpg",
          "price" : 4288
        }
      },
      {
        "_index" : "wudbes",
        "_type" : "_doc",
        "_id" : "_S9i54EBVssqw8qzR5gA",
        "_score" : 0.52354836,
        "_source" : {
          "title" : "小米手机",
          "images" : "http://image.com/12479622.jpg",
          "price" : 2699
        }
      }
    ]
  }
}
B. and relationship
For a more precise search, change the relationship to and, so that every term must match.
POST /wudbes/_search
{
  "query": {
    "match": {
      "title": {
        "query": "小米电视",
        "operator": "and"
      }
    }
  }
}
# ------------------------------------------------------------------
{
  "took" : 0,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.2044649,
    "hits" : [
      {
        "_index" : "wudbes",
        "_type" : "_doc",
        "_id" : "_C9i54EBVssqw8qzOZhJ",
        "_score" : 1.2044649,
        "_source" : {
          "title" : "小米电视4A",
          "images" : "http://image.com/12479122.jpg",
          "price" : 4288
        }
      }
    ]
  }
}
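The difference between the two operators can be modelled in a few lines of Python. This is only a toy model, not how Elasticsearch evaluates the query: the token lists are assumptions about what the analyzer might produce for these titles, and matches is a hypothetical helper.

```python
def matches(doc_tokens, query_tokens, operator="or"):
    """Toy model of a match query: compare analyzed query terms with a document's terms."""
    doc_terms = set(doc_tokens)
    hits = [term in doc_terms for term in query_tokens]
    # "or": at least one query term is present; "and": every query term is present.
    return any(hits) if operator == "or" else all(hits)


# Assumed analyzer output; the real tokens depend on the analyzer configured for `title`.
docs = {
    "小米电视4A": ["小米", "电视", "4a"],
    "小米手机": ["小米", "手机"],
}
query = ["小米", "电视"]

or_hits = [title for title, tokens in docs.items() if matches(tokens, query, "or")]
and_hits = [title for title, tokens in docs.items() if matches(tokens, query, "and")]
print(or_hits)
print(and_hits)
```

Under these assumed tokens, or keeps both titles because each contains 小米, while and keeps only the title that contains both terms, which agrees with the two responses shown above.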
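The match_phrase query runs a phrase search against a field. You can set the analyzer and the slop, which controls how many positions the terms may be moved apart.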
GET /wudbes/_search
{
  "query": {
    "match_phrase": {
      "title": {
        "query": "小米 4A",
        "slop": 2
      }
    }
  }
}
The query_string query is an advanced query that matches against a document's full text without naming a field. You can also restrict it to specific fields.
# Default search
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "2699"
    }
  }
}
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "2699",
      "default_field": "title"
    }
  }
}
# Logical operators
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "手机 OR 小米",
      "default_field": "title"
    }
  }
}
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "手机 AND 小米",
      "default_field": "title"
    }
  }
}
# Fuzzy search
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "大米~1",
      "default_field": "title"
    }
  }
}
# Multiple fields
GET /wudbes/_search
{
  "query": {
    "query_string": {
      "query": "2699",
      "fields": [
        "title",
        "price"
      ]
    }
  }
}
To search text across several fields, use multi_match. It extends match to run the same text query over multiple fields.
GET /wudbes/_search
{
  "query": {
    "multi_match": {
      "query": "2699",
      "fields": [
        "title",
        "price"
      ]
    }
  }
}
# A * wildcard in a field name matches multiple fields
GET /wudbes/_search
{
  "query": {
    "multi_match": {
      "query": "http://image.com/12479622.jpg",
      "fields": [
        "title",
        "ima*"
      ]
    }
  }
}
Term-level queries find documents by exact values in structured data.
Create the mapping and add data
Mapping
PUT /wudl_book
{
  "settings": {},
  "mappings": {
    "properties": {
      "description": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "name": {
        "type": "text",
        "analyzer": "ik_max_word"
      },
      "price": {
        "type": "float"
      },
      "timestamp": {
        "type": "date",
        "format": "yyyy-MM-dd HH:mm:ss||yyyy-MM-dd||epoch_millis"
      }
    }
  }
}
Add data
PUT /wudl_book/_doc/1
{
  "name": "lucene",
  "description": "Lucene Core is a Java library providing powerful indexing and search features, as well as spellchecking, hit highlighting and advanced analysis/tokenization capabilities. The PyLucene sub project provides Python bindings for Lucene Core. ",
  "price": 100.45,
  "timestamp": "2022-07-10 19:11:35"
}
PUT /wudl_book/_doc/2
{
  "name": "solr",
  "description": "Solr is highly scalable, providing fully fault tolerant distributed indexing, search and analytics. It exposes Lucenes features through easy to use JSON/HTTP interfaces or native clients for Java and other languages.",
  "price": 320.45,
  "timestamp": "2022-07-10 17:11:35"
}
PUT /wudl_book/_doc/3
{
  "name": "Hadoop",
  "description": "The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.",
  "price": 620.45,
  "timestamp": "2022-07-22 19:18:35"
}
PUT /wudl_book/_doc/4
{
  "name": "ElasticSearch",
  "description": "Elasticsearch是一个基于Lucene的搜索服务器。它提供了一个分布式多用户能力的全文搜索引擎,基于RESTful web接口。Elasticsearch是用Java语言开发的,并作为Apache许可条款下的开放源码发布,是一种流行的企业级搜索引擎。Elasticsearch用于云计算中,能够达到实时搜索,稳定,可靠,快速,安装使用方便。官方客户端在Java、.NET(C#)、PHP、Python、Apache Groovy、Ruby和许多其他语言中都是可用的。根据DB-Engines的排名显示,Elasticsearch是最受欢迎的企业搜索引擎,其次是Apache Solr,也是基于Lucene。",
  "price": 999.99,
  "timestamp": "2022-08-15 10:11:35"
}
The term query finds documents whose field contains the given term.
POST /wudl_book/_search
{
"query": {
"term": {
"name": "solr"
}
}
}
The terms query finds documents whose field contains any of the given terms.
GET /wudl_book/_search
{
"query": {
"terms": {
"name": [
"solr",
"elasticsearch"
]
}
}
}
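One point worth illustrating is that term and terms do not analyze the search value: it is compared as-is with the terms stored in the index. The sketch below is a toy model of that behaviour. The analyze function is a stand-in and an assumption (lowercase, split on whitespace); the real output depends on the analyzer configured for the field, and term_query and terms_query are hypothetical helper names.

```python
def analyze(text):
    """Stand-in analyzer (assumption): lowercase and split on whitespace."""
    return text.lower().split()


def term_query(indexed_terms, value):
    """term: the search value is compared as-is, without analysis."""
    return value in indexed_terms


def terms_query(indexed_terms, values):
    """terms: matches when any one of the values is an indexed term."""
    return any(value in indexed_terms for value in values)


indexed = analyze("ElasticSearch")            # ['elasticsearch']
print(term_query(indexed, "ElasticSearch"))   # False: the stored term is lowercase
print(term_query(indexed, "elasticsearch"))   # True
print(terms_query(indexed, ["solr", "elasticsearch"]))  # True
```

If the analyzer lowercases tokens as assumed here, this would explain why the terms example searches for "elasticsearch" in lowercase even though the document's name is "ElasticSearch".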
gte: greater than or equal to
gt: greater than
lte: less than or equal to
lt: less than
boost: query weight
GET /wudl_book/_search
{
  "query": {
    "range": {
      "price": {
        "gte": 10,
        "lte": 200,
        "boost": 2
      }
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "now-2d/d",
        "lt": "now/d"
      }
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "range": {
      "timestamp": {
        "gte": "18/08/2020",
        "lte": "2021",
        "format": "dd/MM/yyyy||yyyy"
      }
    }
  }
}
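The four bounds are easy to mix up, so here is a minimal Python sketch of their meaning for numeric values. in_range is a hypothetical helper, not part of any Elasticsearch client; the prices are the ones from the sample documents, and boost and date math are left out.

```python
def in_range(value, gte=None, gt=None, lte=None, lt=None):
    """Return True when value satisfies every bound that was supplied."""
    if gte is not None and not value >= gte:
        return False
    if gt is not None and not value > gt:
        return False
    if lte is not None and not value <= lte:
        return False
    if lt is not None and not value < lt:
        return False
    return True


prices = {"lucene": 100.45, "solr": 320.45, "Hadoop": 620.45, "ElasticSearch": 999.99}

# Same bounds as the first range query above: 10 <= price <= 200.
in_band = [name for name, price in prices.items() if in_range(price, gte=10, lte=200)]
print(in_band)
```

Only the inclusive bounds (gte, lte) accept a value that sits exactly on the boundary; gt and lt reject it.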
The exists query finds documents in which the given field is not null. It is equivalent to column IS NOT NULL in SQL.
GET /wudl_book/_search
{
"query": {
"exists": {
"field": "price"
}
}
}
GET /wudl_book/_search
{
"query": {
"prefix": {
"name": "so"
}
}
}
GET /wudl_book/_search
{
  "query": {
    "wildcard": {
      "name": "so*r"
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "wildcard": {
      "name": {
        "value": "lu*",
        "boost": 2
      }
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "regexp": {
      "name": "s.*"
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "regexp": {
      "name": {
        "value": "s.*",
        "boost": 1.2
      }
    }
  }
}
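The prefix, wildcard, and regexp patterns above are each matched against a whole indexed term. The sketch below imitates that with Python's standard library. It is only an illustration: the term list is an assumed lowercase view of the name field, and Python's fnmatch and re syntax is a stand-in for Lucene's pattern syntax, which is not identical.

```python
import re
from fnmatch import fnmatchcase

# Assumed lowercase terms for the `name` field of the four sample documents.
terms = ["lucene", "solr", "hadoop", "elasticsearch"]

prefix_hits = [t for t in terms if t.startswith("so")]        # prefix: "so"
wildcard_hits = [t for t in terms if fnmatchcase(t, "so*r")]  # wildcard: "so*r"
lu_hits = [t for t in terms if fnmatchcase(t, "lu*")]         # wildcard: "lu*"
regexp_hits = [t for t in terms if re.fullmatch(r"s.*", t)]   # regexp: "s.*"

# The pattern has to cover the whole term, so "s" alone does not match "solr".
partial = re.fullmatch(r"s", "solr")

print(prefix_hits, wildcard_hits, lu_hits, regexp_hits, partial)
```

re.fullmatch is used rather than re.search because the pattern must match the entire term, not just a part of it.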
GET /wudl_book/_search
{
  "query": {
    "fuzzy": {
      "name": "so"
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "so",
        "boost": 1,
        "fuzziness": 2
      }
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "sorl",
        "boost": 1,
        "fuzziness": 2
      }
    }
  }
}
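The fuzziness value is a maximum edit distance between the search value and an indexed term. The Python sketch below computes one common variant of that distance (optimal string alignment, where swapping two adjacent characters counts as a single edit) to show why "so" and "sorl" can both reach "solr" with fuzziness 2. osa_distance is a hypothetical helper written for illustration; it is not what Elasticsearch runs internally.

```python
def osa_distance(a, b):
    """Edit distance counting insert, delete, substitute, and adjacent swap as one edit each."""
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(
                d[i - 1][j] + 1,         # deletion
                d[i][j - 1] + 1,         # insertion
                d[i - 1][j - 1] + cost,  # substitution or match
            )
            # Adjacent transposition, e.g. "rl" <-> "lr".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]


print(osa_distance("so", "solr"))    # two insertions
print(osa_distance("sorl", "solr"))  # one transposition
```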
GET /wudl_book/_search
{
"query": {
"ids": {
"values": [
"1",
"3"
]
}
}
}
GET /wudl_book/_search
{
  "query": {
    "term": {
      "description": "solr"
    }
  }
}
GET /wudl_book/_search
{
  "query": {
    "constant_score": {
      "filter": {
        "term": {
          "description": "solr"
        }
      },
      "boost": 1.2
    }
  }
}
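The bool query combines multiple query clauses into a single query using boolean logic. Available keywords:
must: the clause must match
filter: the clause must match, but it runs in filter context and does not contribute to the score
should: or; the clause may match
must_not: the clause must not match; it runs in filter context and does not contribute to the score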
POST /wudl_book/_search
{
  "query": {
    "bool": {
      "should": {
        "match": {
          "description": "java"
        }
      },
      "filter": {
        "term": {
          "name": "solr"
        }
      },
      "must_not": {
        "range": {
          "price": {
            "gte": 200,
            "lte": 300
          }
        }
      },
      "minimum_should_match": 1,
      "boost": 1
    }
  }
}
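How the four clause types combine can be sketched as a small Python predicate. This is a toy model that only decides whether a document matches; it ignores scoring entirely. bool_matches is a hypothetical helper, the clauses are plain Python functions, and the documents are shortened versions of the sample data.

```python
def bool_matches(doc, must=(), filters=(), should=(), must_not=(), minimum_should_match=0):
    """Toy matcher for a bool query; each clause is a predicate taking a document."""
    if not all(clause(doc) for clause in must):
        return False
    if not all(clause(doc) for clause in filters):
        return False
    if any(clause(doc) for clause in must_not):
        return False
    matched_should = sum(1 for clause in should if clause(doc))
    return matched_should >= minimum_should_match


# Shortened sample documents (descriptions abbreviated for the example).
books = [
    {"name": "lucene", "price": 100.45, "description": "Lucene Core is a Java library"},
    {"name": "solr", "price": 320.45, "description": "native clients for Java and other languages"},
    {"name": "hadoop", "price": 620.45, "description": "The Apache Hadoop software library"},
]

# Same shape as the bool query above.
hits = [
    book["name"]
    for book in books
    if bool_matches(
        book,
        filters=[lambda doc: doc["name"] == "solr"],
        should=[lambda doc: "java" in doc["description"].lower()],
        must_not=[lambda doc: 200 <= doc["price"] <= 300],
        minimum_should_match=1,
    )
]
print(hits)
```

In this model the lucene document also mentions Java, but it is rejected by the filter clause before the should clause is counted.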
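Sorting by relevance score
By default, results are returned sorted by relevance, with the most relevant documents first. What relevance means and how it is calculated is explained later in this chapter; first, let's look at the sort parameter and how to use it.
To sort by relevance, relevance has to be expressed as a number. In Elasticsearch the relevance score is a floating-point number, returned in the search results as _score. The default sort order is _score descending. To sort by relevance score in ascending order instead: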
POST /wudl_book/_search
{
  "query": {
    "match": {
      "description": "solr"
    }
  }
}
POST /wudl_book/_search
{
  "query": {
    "match": {
      "description": "solr"
    }
  },
  "sort": [
    {
      "_score": {
        "order": "asc"
      }
    }
  ]
}
Sorting by field value
POST /wudl_book/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
Multi-level sorting
POST /wudl_book/_search
{
  "query": {
    "match_all": {}
  },
  "sort": [
    {
      "price": {
        "order": "desc"
      }
    },
    {
      "timestamp": {
        "order": "desc"
      }
    }
  ]
}
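With a multi-level sort, the second key only matters when the first key ties. The Python sketch below shows the same ordering rule on a few hypothetical records (the prices are adjusted from the sample data so that two of them tie).

```python
# Hypothetical records; two share a price so that the second sort key comes into play.
books = [
    {"id": 1, "price": 100.0, "timestamp": "2022-07-10 19:11:35"},
    {"id": 2, "price": 320.0, "timestamp": "2022-07-10 17:11:35"},
    {"id": 3, "price": 320.0, "timestamp": "2022-07-22 19:18:35"},
]

# Zero-padded "yyyy-MM-dd HH:mm:ss" strings sort chronologically as plain text,
# so a (price, timestamp) tuple sorted in reverse gives price desc, then timestamp desc.
ordered = sorted(books, key=lambda book: (book["price"], book["timestamp"]), reverse=True)
ordered_ids = [book["id"] for book in ordered]
print(ordered_ids)
```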
size: number of results per page
from: offset of the first result on the current page, int start = (pageNum - 1) * size
POST /wudl_book/_search
{
"query": {
"match_all": {}
},
"sort": [
{
"price": {
"order": "desc"
}
}
],
"size": 2,
"from": 2
}
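The from formula above can be wrapped in a small helper. A minimal Python sketch, assuming 1-based page numbers; page_params is a hypothetical name, and the function only builds the two request fields without sending anything.

```python
import json


def page_params(page_num, size):
    """Translate a 1-based page number into the from/size pair of a search request."""
    if page_num < 1 or size < 1:
        raise ValueError("page_num and size must both be at least 1")
    return {"from": (page_num - 1) * size, "size": size}


# Page 2 with 2 results per page gives the same from/size as the request above.
body = {
    "query": {"match_all": {}},
    "sort": [{"price": {"order": "desc"}}],
    **page_params(2, 2),
}
print(json.dumps(body, indent=2))
```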
POST /wudl_book/_search
{
  "query": {
    "match": {
      "name": "elasticsearch"
    }
  },
  "highlight": {
    "pre_tags": "<font color='pink'>",
    "post_tags": "</font>",
    "fields": [
      {
        "name": {}
      }
    ]
  }
}
POST /wudl_book/_search
{
  "query": {
    "match": {
      "name": "elasticsearch"
    }
  },
  "highlight": {
    "pre_tags": "<font color='pink'>",
    "post_tags": "</font>",
    "fields": [
      {
        "name": {}
      },
      {
        "description": {}
      }
    ]
  }
}
POST /wudl_book/_search
{
  "query": {
    "query_string": {
      "query": "elasticsearch"
    }
  },
  "highlight": {
    "pre_tags": "<font color='pink'>",
    "post_tags": "</font>",
    "fields": [
      {
        "name": {}
      },
      {
        "description": {}
      }
    ]
  }
}
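The effect of pre_tags and post_tags can be imitated in Python to see what a highlighted fragment looks like. This is a rough sketch only: it wraps case-insensitive substring matches, whereas Elasticsearch highlights the analyzed terms that matched the query. highlight is a hypothetical helper.

```python
import re


def highlight(text, term, pre_tag="<font color='pink'>", post_tag="</font>"):
    """Wrap every case-insensitive occurrence of term in the given tags."""
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    # A function replacement avoids any special handling of backslashes in the tags.
    return pattern.sub(lambda match: pre_tag + match.group(0) + post_tag, text)


print(highlight("ElasticSearch", "elasticsearch"))
```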
mget: batch retrieval
A single document is fetched with GET /test_index/_doc/1. Fetching many ids one request at a time costs too much network overhead.
GET /_mget
{
"docs": [
{
"_index": "wudl_book",
"_id": 1
},
{
"_index": "wudl_book",
"_id": 2
}
]
}
Batch retrieval within a single index:
GET /wudl_book/_mget
{
"docs": [
{
"_id": 2
},
{
"_id": 3
}
]
}
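When the ids come from application code, the docs array of both request forms above can be generated rather than written by hand. A minimal Python sketch; mget_body is a hypothetical helper that only builds the request body and does not talk to Elasticsearch.

```python
import json


def mget_body(ids, index=None):
    """Build an mget request body; include _index only when no index is given in the URL."""
    docs = []
    for doc_id in ids:
        entry = {"_id": doc_id}
        if index is not None:
            entry = {"_index": index, "_id": doc_id}
        docs.append(entry)
    return {"docs": docs}


# Body for GET /_mget (index named per document).
print(json.dumps(mget_body([1, 2], index="wudl_book")))
# Body for GET /wudl_book/_mget (index taken from the URL).
print(json.dumps(mget_body([2, 3])))
```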
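bulk: batch create, update, and delete
A bulk request performs a series of create, update, and delete operations in a single request, which reduces the number of network round trips.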
POST /_bulk
{ "delete": { "_index": "wudl_book", "_id": "1" } }
{ "create": { "_index": "wudl_book", "_id": "5" } }
{ "name": "test14", "price": 100.99 }
{ "update": { "_index": "wudl_book", "_id": "2" } }
{ "doc": { "name": "test" } }
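Parameters:
delete: deletes a document; it needs only the one JSON action line, with no request body line after it
create: a forced create, equivalent to PUT /index/_create/id (PUT /index/type/id/_create in older versions)
index: a regular put; it either creates the document or replaces it in full
update: performs a partial update
Format: a single JSON object must not contain line breaks, and adjacent JSON objects must be separated by a line break.
Isolation: operations do not affect one another. A failed operation returns its own error information.
In practice: do not make a single bulk request too large, or it piles up in memory and performance drops. A few thousand operations and a few MB per request works well.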
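A bulk request loads the data to be processed into memory, so the amount of data is limited. The optimal size is not a fixed number; it depends on your hardware, the size and complexity of your documents, and your indexing and search load.
A common recommendation is 1000-5000 documents and 5-15 MB per request. By default a request cannot exceed 100 MB; the limit can be changed in the Elasticsearch configuration file (elasticsearch.yml under the config directory).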
http.max_content_length: 10mb
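The format rules (one JSON object per line, a line break between objects) and the advice to keep batches small can be sketched in Python. bulk_body and chunks are hypothetical helpers and the batch size of 2 is an arbitrary value for the example; the code only builds the request text and sends nothing.

```python
import json


def bulk_body(operations):
    """Serialize (action, source) pairs as newline-delimited JSON; source is None for delete."""
    lines = []
    for action, source in operations:
        # json.dumps without indent keeps each object on a single line.
        lines.append(json.dumps(action, ensure_ascii=False))
        if source is not None:
            lines.append(json.dumps(source, ensure_ascii=False))
    # The body ends with a final line break.
    return "\n".join(lines) + "\n"


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


operations = [
    ({"delete": {"_index": "wudl_book", "_id": "1"}}, None),
    ({"create": {"_index": "wudl_book", "_id": "5"}}, {"name": "test14", "price": 100.99}),
    ({"update": {"_index": "wudl_book", "_id": "2"}}, {"doc": {"name": "test"}}),
]

body = bulk_body(operations)
print(body)

# Split a larger list of operations into smaller requests.
batches = [bulk_body(batch) for batch in chunks(operations, 2)]
print(len(batches))
```

Chunking by operation count is the simplest option; a size-based limit (for example, measuring len(body.encode("utf-8"))) would follow the same pattern.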
# Syntax before 5.0
POST /wudl_book/_search
{
  "query": {
    "filtered": {
      "query": {
        "match_all": {}
      },
      "filter": {
        "range": {
          "price": {
            "gte": 20,
            "lte": 1000
          }
        }
      }
    }
  }
}
# Syntax from 5.0 onward
POST /wudl_book/_search
{
  "query": {
    "bool": {
      "must": {
        "match_all": {}
      },
      "filter": {
        "range": {
          "price": {
            "gte": 200,
            "lte": 1000
          }
        }
      }
    }
  }
}