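Bookmark the official docs first (link), then read through other people's notes (note 1 link, note 2 link), and pretend I've got it.
Why has Elasticsearch (ES) become the first choice for full-text search? It can store, search, and analyze massive amounts of data quickly. All of that rests on its inverted index, illustrated with the records below: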
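For the stored records:
1 - 红海行动 (Operation Red Sea)
2 - 探索红海行动 (Exploring Operation Red Sea)
3 - 红海特别行动 (Red Sea Special Operation)
4 - 红海记录篇 (Red Sea Documentary)
5 - 特工红海特别探索 (Secret Agent Red Sea Special Exploration)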
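To make the idea concrete, here is a minimal plain-Java sketch of an inverted index over the five records above. It is not how ES implements the index internally, and the tokenization (红海, 行动, 探索, 特别, 记录篇, 特工) is my own assumption about how an analyzer might split the titles. The class and method names are made up for illustration.

```java
import java.util.*;

public class InvertedIndexSketch {

    /** Builds term -> sorted set of record ids from records that are already tokenized. */
    static Map<String, SortedSet<Integer>> build(Map<Integer, List<String>> tokenized) {
        Map<String, SortedSet<Integer>> index = new TreeMap<>();
        for (Map.Entry<Integer, List<String>> e : tokenized.entrySet()) {
            for (String term : e.getValue()) {
                index.computeIfAbsent(term, t -> new TreeSet<>()).add(e.getKey());
            }
        }
        return index;
    }

    /** Ranks record ids by the number of query terms they contain (a crude stand-in for relevance scoring). */
    static List<Integer> search(Map<String, SortedSet<Integer>> index, List<String> queryTerms) {
        Map<Integer, Integer> hits = new TreeMap<>();
        for (String term : queryTerms) {
            for (Integer id : index.getOrDefault(term, Collections.<Integer>emptySortedSet())) {
                hits.merge(id, 1, Integer::sum);
            }
        }
        List<Integer> ids = new ArrayList<>(hits.keySet());
        // List.sort is stable, so records with equal hit counts stay in id order
        ids.sort((a, b) -> Integer.compare(hits.get(b), hits.get(a)));
        return ids;
    }

    /** The five sample records, split by hand (assumed tokenization). */
    static Map<Integer, List<String>> sampleRecords() {
        Map<Integer, List<String>> m = new LinkedHashMap<>();
        m.put(1, Arrays.asList("红海", "行动"));
        m.put(2, Arrays.asList("探索", "红海", "行动"));
        m.put(3, Arrays.asList("红海", "特别", "行动"));
        m.put(4, Arrays.asList("红海", "记录篇"));
        m.put(5, Arrays.asList("特工", "红海", "特别", "探索"));
        return m;
    }

    public static void main(String[] args) {
        Map<String, SortedSet<Integer>> index = build(sampleRecords());
        index.forEach((term, ids) -> System.out.println(term + " -> " + ids));
        // Search for "红海特工行动", assumed to tokenize into three terms
        System.out.println(search(index, Arrays.asList("红海", "特工", "行动")));
    }
}
```

A lookup only touches the posting lists of the query terms, which is why search stays fast as the number of records grows. With this sample data, record 4 ranks last for the query above because it contains only one of the three terms.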
Installing Elasticsearch in Docker
(1) Pull the Elasticsearch and Kibana images
docker pull elasticsearch:7.4.2
docker pull kibana:7.4.2
(2) Configuration
mkdir -p /mydata/elasticsearch/config
mkdir -p /mydata/elasticsearch/data
echo "http.host: 0.0.0.0" >/mydata/elasticsearch/config/elasticsearch.yml
chmod -R 777 /mydata/elasticsearch/
(3) Start Elasticsearch
docker run --name elasticsearch -p 9200:9200 -p 9300:9300 \
-e "discovery.type=single-node" \
-e ES_JAVA_OPTS="-Xms64m -Xmx512m" \
-v /mydata/elasticsearch/config/elasticsearch.yml:/usr/share/elasticsearch/config/elasticsearch.yml \
-v /mydata/elasticsearch/data:/usr/share/elasticsearch/data \
-v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins \
-d elasticsearch:7.4.2
Make Elasticsearch start automatically on boot:
docker update elasticsearch --restart=always
(4) Start Kibana:
docker run --name kibana -e ELASTICSEARCH_HOSTS=http://192.168.1.100:9200 -p 5601:5601 -d kibana:7.4.2
Note that 192.168.1.100:9200 here is the address of the ES instance.
Make Kibana start automatically on boot:
docker update kibana --restart=always
(5) Test
Check the Elasticsearch version info: http://localhost:9200/
Open Kibana: http://192.168.1.100:5601/app/kibana
Create the index and add the field mapping
PUT gulimall_product
{
  "mappings": {
    "properties": {
      "attrs": {
        "type": "nested",
        "properties": {
          "attrId": { "type": "long" },
          "attrName": { "type": "keyword" },
          "attrValue": { "type": "keyword" }
        }
      },
      "brandId": { "type": "long" },
      "brandImg": { "type": "keyword" },
      "brandName": { "type": "keyword" },
      "catalogId": { "type": "long" },
      "catalogName": { "type": "keyword" },
      "hasStock": { "type": "boolean" },
      "hotScore": { "type": "long" },
      "saleCount": { "type": "long" },
      "skuId": { "type": "long" },
      "skuImg": { "type": "keyword" },
      "skuPrice": { "type": "keyword" },
      "skuTitle": { "type": "text", "analyzer": "ik_smart" },
      "spuId": { "type": "keyword" }
    }
  }
}
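Notice the "type": "nested" entry in the mapping. What happens without it? By default ES flattens arrays of objects: the values of each inner field are merged into one multi-valued field, so the link between an attrId and its attrValue inside a single object is lost.
To keep each object's fields together, the field type has to be nested. The example below is used to illustrate a query against a nested field.
Here is a sample attribute list for one product: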
"attrs" : [
  { "attrId" : 10, "attrName" : "上市年份", "attrValue" : "2020" },
  { "attrId" : 11, "attrName" : "品牌名", "attrValue" : "mate30" },
  { "attrId" : 12, "attrName" : "CPU", "attrValue" : "麒麟" },
  { "attrId" : 13, "attrName" : "屏幕刷新率", "attrValue" : "120HZ" }
],
Because attrs is mapped as nested, ES maintains the independence of each object in the array.
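The difference between the two behaviours can be modelled in a few lines of plain Java. This is only a sketch of the matching semantics, not of ES internals. The class `NestedVsFlattened` and its methods are hypothetical names, and the data is the sample attrs array above (without attrName, which is not needed here).

```java
import java.util.*;

public class NestedVsFlattened {

    /** One object of the attrs array. */
    static final class Attr {
        final long attrId;
        final String attrValue;

        Attr(long attrId, String attrValue) {
            this.attrId = attrId;
            this.attrValue = attrValue;
        }
    }

    /** Flattened view: each inner field becomes one multi-valued field, so the id/value pairing is lost. */
    static boolean matchesFlattened(List<Attr> attrs, long attrId, String attrValue) {
        Set<Long> ids = new HashSet<>();
        Set<String> values = new HashSet<>();
        for (Attr a : attrs) {
            ids.add(a.attrId);
            values.add(a.attrValue);
        }
        return ids.contains(attrId) && values.contains(attrValue);
    }

    /** Nested view: both conditions must hold inside the same object. */
    static boolean matchesNested(List<Attr> attrs, long attrId, String attrValue) {
        for (Attr a : attrs) {
            if (a.attrId == attrId && a.attrValue.equals(attrValue)) {
                return true;
            }
        }
        return false;
    }

    /** The sample product attributes from the text above. */
    static List<Attr> sample() {
        return Arrays.asList(
                new Attr(10, "2020"),
                new Attr(11, "mate30"),
                new Attr(12, "麒麟"),
                new Attr(13, "120HZ"));
    }

    public static void main(String[] args) {
        // "attribute 12 has value mate30" is false for this product,
        // yet the flattened view matches because 12 and mate30 both occur somewhere.
        System.out.println("flattened: " + matchesFlattened(sample(), 12, "mate30"));
        System.out.println("nested:    " + matchesNested(sample(), 12, "mate30"));
    }
}
```

The flattened check reports a match for a combination that no single attribute object contains, which is exactly the false positive the nested type is meant to prevent.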
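Now suppose we want products that match a set of attribute requirements, for example: attribute 12 must be 鲲鹏 (Kunpeng) and attribute 11 must be xiaomi.
One of the nested queries is expanded below. A nested query runs against each independent object in the attrs array named by path, so matching two independent objects takes two nested queries.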
"nested": { // nested query
  "path": "attrs",
  "query": {
    "bool": {
      "must": [ // filter by attribute id and attribute value
        {
          "term": {
            "attrs.attrId": {
              "value": "12"
            }
          }
        },
        {
          "terms": {
            "attrs.attrValue": [
              "鲲鹏"
            ]
          }
        }
      ]
    }
  }
}
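To show how the two requirements fit together, the following plain-Java sketch assembles a bool filter with one nested clause per attribute and prints the resulting DSL. It is a string-building illustration only: `NestedFilterDsl`, `nestedClause`, and `query` are hypothetical helpers, values are assumed to contain no characters that need JSON escaping, and attrId is written as a number because the mapping declares it as long.

```java
import java.util.*;

public class NestedFilterDsl {

    /** Renders one nested clause: attrs.attrId equals attrId AND attrs.attrValue is one of values. */
    static String nestedClause(long attrId, List<String> values) {
        StringJoiner quoted = new StringJoiner(", ", "[", "]");
        for (String v : values) {
            quoted.add("\"" + v + "\""); // assumption: no quotes or backslashes in v
        }
        return "{ \"nested\": { \"path\": \"attrs\", \"query\": { \"bool\": { \"must\": ["
                + " { \"term\": { \"attrs.attrId\": { \"value\": " + attrId + " } } },"
                + " { \"terms\": { \"attrs.attrValue\": " + quoted + " } }"
                + " ] } } } }";
    }

    /** Wraps one nested clause per required attribute in a bool filter. */
    static String query(Map<Long, List<String>> required) {
        StringJoiner clauses = new StringJoiner(", ");
        for (Map.Entry<Long, List<String>> e : required.entrySet()) {
            clauses.add(nestedClause(e.getKey(), e.getValue()));
        }
        return "{ \"query\": { \"bool\": { \"filter\": [ " + clauses + " ] } } }";
    }

    public static void main(String[] args) {
        Map<Long, List<String>> required = new LinkedHashMap<>();
        required.put(12L, Arrays.asList("鲲鹏"));
        required.put(11L, Arrays.asList("xiaomi"));
        // Paste the output after "GET gulimall_product/_search" in Kibana to try it
        System.out.println(query(required));
    }
}
```

Putting both conditions into a single nested clause would instead ask for one object whose attrId is 12 and 11 at the same time, which can never match.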
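ES ships with quite a few built-in analyzers, but they do not handle Chinese word segmentation well.
1. Install the ik analyzer
Download the release that matches your ES version from link and copy it into the ES plugins directory, that is, the directory mounted when ES was started: -v /mydata/elasticsearch/plugins:/usr/share/elasticsearch/plugins
2. Custom dictionary
The analyzer may not handle some popular Chinese slang well, so you can extend the dictionary yourself by editing /mydata/elasticsearch/plugins/ik/config/IKAnalyzer.cfg.xml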
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionary here -->
<entry key="ext_dict"></entry>
<!-- Configure your own extension stopword dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- Configure a remote extension dictionary here -->
<entry key="remote_ext_dict">http://192.168.1.100/es/fenci.txt</entry>
<!-- Configure a remote extension stopword dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
PS: http://192.168.1.100/es/fenci.txt is a static resource served by nginx. Once nginx is configured it can be accessed directly; more on that later.
For the Java client, choose Elasticsearch-Rest-Client (elasticsearch-rest-high-level-client) (link)
<properties>
<!-- Pin the ES version manually so it matches the ES server; the ES version managed by Spring Boot is a different one -->
<elasticsearch.version>7.4.2</elasticsearch.version>
</properties>
<!--ElasticSearch-->
<dependency>
<groupId>org.elasticsearch.client</groupId>
<artifactId>elasticsearch-rest-high-level-client</artifactId>
<!--<version>7.4.2</version>-->
</dependency>
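Configuration class that registers a RestHighLevelClient bean in the container:
import org.apache.http.HttpHost;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
public class ElasticSearchConfig {

    // Request options shared by every call to ES
    public static final RequestOptions COMMON_OPTIONS;

    static {
        RequestOptions.Builder builder = RequestOptions.DEFAULT.toBuilder();
        // builder.addHeader("Authorization", "Bearer " + TOKEN);
        // builder.setHttpAsyncResponseConsumerFactory(
        //         new HttpAsyncResponseConsumerFactory
        //                 .HeapBufferedResponseConsumerFactory(30 * 1024 * 1024 * 1024));
        COMMON_OPTIONS = builder.build();
    }

    @Bean
    public RestHighLevelClient esRestClient() {
        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(
                        new HttpHost("localhost", 9200, "http")));
        return client;
    }
}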
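Publishing a product uploads its attributes to ES, which lays the groundwork for the search service. Publishing is handled by the merchant-facing back-office system: the up method in the product service calls the search service, which saves the data to ES.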
com/atguigu/gulimall/search/impl/ProductSaveServiceImpl.java
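import com.alibaba.fastjson.JSON;
import lombok.extern.slf4j.Slf4j;
import org.elasticsearch.action.bulk.BulkItemResponse;
import org.elasticsearch.action.bulk.BulkRequest;
import org.elasticsearch.action.bulk.BulkResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.common.xcontent.XContentType;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;

import java.io.IOException;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
// SkuEsModel, EsConstant, ElasticSearchConfig and ProductSaveService come from the project's own packages

@Slf4j
@Service("productsaveservice")
public class ProductSaveServiceImpl implements ProductSaveService {

    @Autowired
    RestHighLevelClient esClient;

    /**
     * Indexes the given SKUs into ES in one bulk request.
     *
     * @return true if at least one item failed, false if everything was indexed
     */
    @Override
    public boolean productStatusUp(List<SkuEsModel> skuEsModelList) throws IOException {
        // 1. Index the documents into the product index in ES
        BulkRequest bulkRequest = new BulkRequest();
        // 2. Build one index request per SKU
        for (SkuEsModel skuEsModel : skuEsModelList) {
            IndexRequest indexRequest = new IndexRequest(EsConstant.PRODUCT_INDEX);
            // Use the SKU id as the document id
            indexRequest.id(skuEsModel.getSkuId().toString());
            indexRequest.source(JSON.toJSONString(skuEsModel), XContentType.JSON);
            bulkRequest.add(indexRequest);
        }
        // BulkRequest bulkRequest, RequestOptions options
        BulkResponse bulk = esClient.bulk(bulkRequest, ElasticSearchConfig.COMMON_OPTIONS);
        // TODO handle failures
        boolean hasFailures = bulk.hasFailures();
        if (hasFailures) {
            // Log only the ids of the items that actually failed
            List<String> failedIds = Arrays.stream(bulk.getItems())
                    .filter(BulkItemResponse::isFailed)
                    .map(BulkItemResponse::getId)
                    .collect(Collectors.toList());
            log.error("Failed to index products: {}", failedIds);
        }
        return hasFailures;
    }
}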
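For orientation, the bulk request built above corresponds to a newline-delimited body on the REST _bulk endpoint: one action line followed by one source line per document. The sketch below renders such a body in plain Java. The high-level client does this for you, so this is only an illustration. `BulkBodySketch` and `bulkBody` are hypothetical names, the sample documents are invented, and each source is assumed to be single-line JSON.

```java
import java.util.*;

public class BulkBodySketch {

    /** Renders an NDJSON bulk body: an "index" action line, then the document source, for every item. */
    static String bulkBody(String index, Map<String, String> jsonById) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : jsonById.entrySet()) {
            sb.append("{\"index\":{\"_index\":\"").append(index)
              .append("\",\"_id\":\"").append(e.getKey()).append("\"}}\n");
            // The source must not contain raw newlines; the body ends with a newline
            sb.append(e.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("1", "{\"skuId\":1,\"skuTitle\":\"demo phone\"}");   // invented sample data
        docs.put("2", "{\"skuId\":2,\"skuTitle\":\"demo laptop\"}");  // invented sample data
        System.out.print(bulkBody("gulimall_product", docs));
    }
}
```

Using the SKU id as `_id`, as the service does, means that publishing the same SKU again overwrites the existing document instead of creating a duplicate.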
The statement is rather long and hard to read when pasted here; it is easier on the eyes in Kibana.
/home/xu/PersonProjects/IdeaProjects/guimail/gulimall-search/src/main/resources/dsl.json
GET gulimall_product/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match": {
            "skuTitle": "华为" // search by keyword
          }
        }
      ],
      "filter": [
        {
          "term": {
            "catalogId": "225" // filter by category id
          }
        },
        {
          "terms": {
            "brandId": [ // brand ids
              "1",
              "5",
              "9"
            ]
          }
        },
        {
          "nested": { // nested query
            "path": "attrs",
            "query": {
              "bool": {
                "must": [ // filter by attribute id and attribute value
                  {
                    "term": {
                      "attrs.attrId": {
                        "value": "8"
                      }
                    }
                  },
                  {
                    "terms": {
                      "attrs.attrValue": [
                        "2019"
                      ]
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "term": { // in stock or not
            "hasStock": {
              "value": "false"
            }
          }
        },
        {
          "range": { // price range
            "skuPrice": {
              "gte": 0,
              "lte": 7000
            }
          }
        }
      ]
    }
  },
  "sort": [ // sorting
    {
      "skuPrice": {
        "order": "desc"
      }
    }
  ],
  "from": 0,
  "size": 4,
  "highlight": { // highlight the search keyword
    "fields": { "skuTitle": {} },
    "pre_tags": "<b style=color:red>",
    "post_tags": "</b>"
  },
  "aggs": {
    "brand_agg": { // aggregate by brand
      "terms": { "field": "brandId", "size": 10 },
      "aggs": {
        "brand_name_agg": { // brand name
          "terms": { "field": "brandName", "size": 10 }
        },
        "brand_img_agg": { // brand image
          "terms": { "field": "brandImg", "size": 10 }
        }
      }
    },
    "catalog_agg": { // category
      "terms": { "field": "catalogId", "size": 10 },
      "aggs": {
        "catalog_name_agg": { // category name
          "terms": { "field": "catalogName", "size": 10 }
        }
      }
    },
    "attr_agg": {
      "nested": { // nested aggregation
        "path": "attrs"
      },
      "aggs": { // attribute aggregation
        "attr_id_agg": {
          "terms": { "field": "attrs.attrId", "size": 10 },
          "aggs": {
            "attr_name_agg": { // attribute name
              "terms": { "field": "attrs.attrName", "size": 10 }
            },
            "attr_value_agg": { // attribute values
              "terms": { "field": "attrs.attrValue", "size": 10 }
            }
          }
        }
      }
    }
  }
}
There is too much code here, so I only show the part that translates the DSL into Java. For the handling of the query results, see the file.
/home/xu/PersonProjects/IdeaProjects/guimail/gulimall-search/src/main/java/com/atguigu/gulimall/search/impl/MallSearchServiceImpl.java
@Autowired
RestHighLevelClient esClient;
// 1. Prepare the search request
SearchRequest searchRequest = buildSearchRequest(param);
try {
    // 2. Execute the search request
    SearchResponse response = esClient.search(searchRequest, ElasticSearchConfig.COMMON_OPTIONS);
    ...
/**
 * Prepare the search request.
 * Covers fuzzy matching, filtering (by attribute, category, brand, price range, stock),
 * sorting, paging, highlighting, and aggregations.
 *
 * @return the assembled search request
 */
private SearchRequest buildSearchRequest(SearchParam param) {
    // Builds the DSL statement
    SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();

    /*
     * Fuzzy matching and filtering (by attribute, category, brand, price range, stock)
     */
    // 1. Build the bool query
    BoolQueryBuilder boolQuery = QueryBuilders.boolQuery();
    // 1.1 must: fuzzy matching
    if (!StringUtils.isEmpty(param.getKeyword())) {
        boolQuery.must(QueryBuilders.matchQuery("skuTitle", param.getKeyword()));
    }
    // 1.2 filter: by third-level category id
    if (param.getCatalog3Id() != null) {
        boolQuery.filter(QueryBuilders.termQuery("catalogId", param.getCatalog3Id()));
    }
    // 1.2 filter: by brand id
    if (param.getBrandId() != null && param.getBrandId().size() > 0) {
        boolQuery.filter(QueryBuilders.termsQuery("brandId", param.getBrandId()));
    }
    // 1.2 filter: by every requested attribute
    // ******* I don't understand the attrs=1_5寸:8寸 design
    if (param.getAttrs() != null && param.getAttrs().size() > 0) {
        for (String attr : param.getAttrs()) {
            // attrs=1_5寸:8寸&attrs=2_16G:8G
            BoolQueryBuilder nestedboolQuery = QueryBuilders.boolQuery();
            String[] s = attr.split("_");
            String attrId = s[0]; // attribute id to search for
            String[] attrValues = s[1].split(":");
            nestedboolQuery.must(QueryBuilders.termQuery("attrs.attrId", attrId));
            nestedboolQuery.must(QueryBuilders.termsQuery("attrs.attrValue", attrValues));
            // Every attribute needs its own nested query
            NestedQueryBuilder nestedQuery = QueryBuilders.nestedQuery("attrs", nestedboolQuery, ScoreMode.None);
            boolQuery.filter(nestedQuery);
        }
    }
    // 1.2 filter: by stock availability
    boolQuery.filter(QueryBuilders.termQuery("hasStock", param.getHasStock() == 1));
    // 1.2 filter: by price range
    /*
     * 1_500/_500/500_
     */
    if (!StringUtils.isEmpty(param.getSkuPrice())) {
        RangeQueryBuilder rangeQuery = QueryBuilders.rangeQuery("skuPrice");
        String[] s = param.getSkuPrice().split("_");
        if (param.getSkuPrice().startsWith("_")) {
            // "_500": split yields ["", "500"], so only the upper bound is set
            if (s.length == 2) {
                rangeQuery.lte(s[1]);
            }
        } else if (s.length == 2) {
            // "1_500": both bounds
            rangeQuery.gte(s[0]).lte(s[1]);
        } else if (s.length == 1) {
            // "500_": split drops the trailing empty string, so only the lower bound is set
            rangeQuery.gte(s[0]);
        }
        boolQuery.filter(rangeQuery);
    }
    // Wrap up all the conditions built so far
    sourceBuilder.query(boolQuery);

    /*
     * Sorting, paging, highlighting
     */
    // 2.1 Sorting
    if (!StringUtils.isEmpty(param.getSort())) {
        String sort = param.getSort();
        // sort=hotScore_asc/desc
        String[] s = sort.split("_");
        SortOrder order = s[1].equalsIgnoreCase("asc") ? SortOrder.ASC : SortOrder.DESC;
        sourceBuilder.sort(s[0], order);
    }
    // 2.2 Paging, pageSize: 5
    // pageNum:1 from 0 size:5 [0,1,2,3,4]
    // pageNum:2 from 5 size:5
    // from = (pageNum - 1) * size
    sourceBuilder.from((param.getPageNum() - 1) * EsConstant.PRODUCT_PAGESIZE);
    sourceBuilder.size(EsConstant.PRODUCT_PAGESIZE);
    // 2.3 Highlighting
    if (!StringUtils.isEmpty(param.getKeyword())) {
        HighlightBuilder builder = new HighlightBuilder();
        builder.field("skuTitle");
        builder.preTags("<b style='color:red'>");
        builder.postTags("</b>");
        sourceBuilder.highlighter(builder);
    }

    /*
     * Aggregations
     */
    // 1. Brand aggregation
    TermsAggregationBuilder brand_agg = AggregationBuilders.terms("brand_agg");
    brand_agg.field("brandId").size(50);
    // Sub-aggregations of the brand aggregation
    brand_agg.subAggregation(AggregationBuilders.terms("brand_name_agg").field("brandName").size(2));
    brand_agg.subAggregation(AggregationBuilders.terms("brand_img_agg").field("brandImg").size(2));
    // TODO 1. aggregate brand
    sourceBuilder.aggregation(brand_agg);
    // 2. Category aggregation
    TermsAggregationBuilder catalog_agg = AggregationBuilders.terms("catalog_agg").field("catalogId").size(20);
    catalog_agg.subAggregation(AggregationBuilders.terms("catalog_name_agg").field("catalogName").size(1));
    // TODO 2. aggregate catalog
    sourceBuilder.aggregation(catalog_agg);
    // 3. Attribute aggregation attr_agg
    NestedAggregationBuilder attr_agg = AggregationBuilders.nested("attr_agg", "attrs");
    // Collect every attrId present in the results
    TermsAggregationBuilder attr_id_agg = AggregationBuilders.terms("attr_id_agg").field("attrs.attrId");
    // Collect the name that belongs to the current attr_id
    attr_id_agg.subAggregation(AggregationBuilders.terms("attr_name_agg").field("attrs.attrName").size(1));
    // Collect the possible attrValue values of the current attr_id
    attr_id_agg.subAggregation(AggregationBuilders.terms("attr_value_agg").field("attrs.attrValue").size(50));
    attr_agg.subAggregation(attr_id_agg);
    // TODO 3. aggregate attr
    sourceBuilder.aggregation(attr_agg);

    String s = sourceBuilder.toString();
    System.out.println("Built DSL: " + s);
    SearchRequest searchRequest = new SearchRequest(new String[]{EsConstant.PRODUCT_INDEX}, sourceBuilder);
    return searchRequest;
}
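The string formats used by the request parameters are easier to follow in isolation. As far as I can tell from the code, an attrs entry encodes attrId_value1:value2, a price range is lower_upper with either side optional, and pages are 1-based. The sketch below parses these formats in plain Java without any ES classes. `SearchParamParsing` and its methods are hypothetical helpers, not part of the project.

```java
import java.util.*;

public class SearchParamParsing {

    /** Parses "1_5寸:8寸" into attribute id "1" and the accepted values ["5寸", "8寸"]. */
    static Map.Entry<String, List<String>> parseAttr(String attr) {
        String[] s = attr.split("_", 2); // limit 2: only the first "_" separates id and values
        return new AbstractMap.SimpleEntry<>(s[0], Arrays.asList(s[1].split(":")));
    }

    /**
     * Parses "1_500", "_500" or "500_" into {lower, upper}.
     * A null element means that bound is absent.
     */
    static String[] parsePriceRange(String price) {
        int sep = price.indexOf('_');
        if (sep < 0) {
            throw new IllegalArgumentException("expected lower_upper, got: " + price);
        }
        String lower = price.substring(0, sep);
        String upper = price.substring(sep + 1);
        return new String[]{lower.isEmpty() ? null : lower, upper.isEmpty() ? null : upper};
    }

    /** Offset of the first hit on a 1-based page: from = (pageNum - 1) * pageSize. */
    static int fromOffset(int pageNum, int pageSize) {
        return (pageNum - 1) * pageSize;
    }

    public static void main(String[] args) {
        Map.Entry<String, List<String>> attr = parseAttr("1_5寸:8寸");
        System.out.println(attr.getKey() + " -> " + attr.getValue());
        System.out.println(Arrays.toString(parsePriceRange("_500")));
        System.out.println(fromOffset(2, 5));
    }
}
```

Splitting on the position of the underscore rather than on the array length avoids the pitfall that String.split keeps a leading empty string ("_500" gives two elements) but drops a trailing one ("500_" gives a single element).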