
Implementing MySQL-style table joins in Elasticsearch with Parent-Child

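Suppose MySQL holds two tables, movies and ratings, which are related through movie_id.
I need to accomplish the following two tasks in Elasticsearch:

  • What is the average rating of the movie titled When Will I Be Loved?
  • What are the titles of the movies that have a rating greater than 5?

Both questions require joining movies and ratings before querying. Elasticsearch, however, cannot join two tables at query time with a condition such as movies.movie_id=ratings.movie_id. Its approach is the Parent-Child relationship, in which parent and child documents are declared in advance. For example, the row with movie_id 1 in the movies table becomes a parent document, and the rows with movie_id 1 in the ratings table become its child documents.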

Define the index structure and the parent-child relationship

DELETE /movies_ratings_index


PUT /movies_ratings_index
{
  "mappings": {
    "properties": {
      "movie_id": {"type": "keyword"},
      "movie_title": {"type": "keyword"}
    }
  }
}


PUT /movies_ratings_index/_mapping
{
  "properties": {
    "rating_score": {"type": "float"},
    "movie_id": {"type": "keyword"}
  }
}


# ratings is the join field; movie is the parent and rating is the child
PUT /movies_ratings_index/_mapping
{
  "properties": {
    "movie_id": {"type": "keyword"},
    "movie_title": {"type": "keyword"},
    "ratings": {
      "type": "join",
      "relations": {
        "movie": "rating"
      }
    }
  }
}
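
The three mapping requests above end up describing one set of fields, so they could also be sent as a single index-creation body. The following is a minimal Python sketch, assuming only the standard library; the helper name build_index_body is hypothetical, and the script only builds and prints the JSON body for PUT /movies_ratings_index without contacting a cluster.

```python
import json


def build_index_body():
    """Return a single mapping body equivalent to the three requests above.

    Hypothetical helper: the field names come from the article, the
    function name does not.
    """
    return {
        "mappings": {
            "properties": {
                "movie_id": {"type": "keyword"},
                "movie_title": {"type": "keyword"},
                "rating_score": {"type": "float"},
                # The join field: movie is the parent, rating is the child.
                "ratings": {
                    "type": "join",
                    "relations": {"movie": "rating"},
                },
            }
        }
    }


index_body = build_index_body()
# Paste the printed JSON as the body of PUT /movies_ratings_index.
print(json.dumps(index_body, indent=2))
```

Creating the index in one request avoids repeating movie_id and movie_title across several mapping updates.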

Import the data. Here I define two parent documents, with movie_id 1 and 2, and several child documents.


POST /movies_ratings_index/_doc/1
{
  "movie_id": "1",
  "movie_title": "When Will I Be Loved",
  "ratings": {
    "name": "movie"
  }
}

POST /movies_ratings_index/_doc/2
{
  "movie_id": "2",
  "movie_title": "When Will I Be Disdained",
  "ratings": {
    "name": "movie"
  }
}


POST /movies_ratings_index/_doc/3?routing=1
{
  "rating_score": 4.5,
  "movie_id": "1",
  "ratings": {
    "name": "rating",
    "parent": "1"
  }
}
POST /movies_ratings_index/_doc/4?routing=1
{
  "rating_score": 6.5,
  "movie_id": "1",
  "ratings": {
    "name": "rating",
    "parent": "1"
  }
}

POST /movies_ratings_index/_doc/5?routing=1
{
  "rating_score": 36.5,
  "movie_id": "1",
  "ratings": {
    "name": "rating",
    "parent": "1"
  }
}
POST /movies_ratings_index/_doc/6?routing=1
{
  "rating_score": 26.5,
  "movie_id": "1",
  "ratings": {
    "name": "rating",
    "parent": "1"
  }
}

POST /movies_ratings_index/_doc/7?routing=1
{
  "rating_score": 16.5,
  "movie_id": "1",
  "ratings": {
    "name": "rating",
    "parent": "1"
  }
}

POST /movies_ratings_index/_doc/8?routing=2
{
  "rating_score": 50,
  "movie_id": "2",
  "ratings": {
    "name": "rating",
    "parent": "2"
  }
}
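
The same documents could also be loaded in one request through the _bulk endpoint. The sketch below is an illustration under assumptions: it uses only the Python standard library, the helper name bulk_lines is hypothetical, and it assumes a recent Elasticsearch version in which the bulk action metadata accepts a routing key. It only prints the newline-delimited body and does not send it.

```python
import json

INDEX = "movies_ratings_index"

# (document id, title) for the parent documents in the article.
movies = [("1", "When Will I Be Loved"), ("2", "When Will I Be Disdained")]
# (document id, parent id, score) for the child documents in the article.
ratings = [
    ("3", "1", 4.5), ("4", "1", 6.5), ("5", "1", 36.5),
    ("6", "1", 26.5), ("7", "1", 16.5), ("8", "2", 50),
]


def bulk_lines(movies, ratings):
    """Build the NDJSON body for POST /_bulk (hypothetical helper)."""
    lines = []
    for movie_id, title in movies:
        lines.append({"index": {"_index": INDEX, "_id": movie_id}})
        lines.append({"movie_id": movie_id, "movie_title": title,
                      "ratings": {"name": "movie"}})
    for doc_id, parent_id, score in ratings:
        # A child must be routed by its parent id so both share a shard.
        lines.append({"index": {"_index": INDEX, "_id": doc_id,
                                "routing": parent_id}})
        lines.append({"rating_score": score, "movie_id": parent_id,
                      "ratings": {"name": "rating", "parent": parent_id}})
    # The bulk body must end with a newline.
    return "\n".join(json.dumps(line) for line in lines) + "\n"


payload = bulk_lines(movies, ratings)
print(payload)
```

Each document contributes an action line and a source line, so the eight documents produce sixteen lines.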

Question 1: use has_parent, because here we filter on the movie_title field of the parent documents.

# has_parent query
GET /movies_ratings_index/_search
{
  "query": {
    "has_parent": {
      "parent_type": "movie",
      "query": {
        "match": {
          "movie_title": "When Will I Be Loved"
        }
      }
    }
  },
  "aggs": {
    "avg_rating_score": {
      "avg": {
        "field": "rating_score"
      }
    }
  }
}
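
To check what this query should return, the same logic can be modelled in memory. This is a plain-Python sketch of the semantics only, not of how Elasticsearch executes the query: has_parent selects the child documents whose parent matches, and the avg aggregation then runs over those children. The data is the sample data from the article; the function name avg_rating_for_title is hypothetical.

```python
# Parent documents: document id -> movie_title.
movies = {"1": "When Will I Be Loved", "2": "When Will I Be Disdained"}
# Child documents: (parent id, rating_score).
ratings = [("1", 4.5), ("1", 6.5), ("1", 36.5), ("1", 26.5), ("1", 16.5),
           ("2", 50.0)]


def avg_rating_for_title(title):
    """Average the scores of all children whose parent has this title."""
    # Step 1 (has_parent): find the parents that match the title.
    parent_ids = {movie_id for movie_id, t in movies.items() if t == title}
    # Step 2: keep only the children that point at those parents.
    scores = [score for parent_id, score in ratings if parent_id in parent_ids]
    # Step 3 (avg aggregation): average the remaining scores.
    return sum(scores) / len(scores) if scores else None


print(avg_rating_for_title("When Will I Be Loved"))  # 90.5 / 5 = 18.1
```

For the sample data, the avg_rating_score value in the response should therefore be 18.1.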

Question 2: use has_child, because here we filter on the child documents.

# has_child query; the terms aggregation returns the 10 most frequent movie titles
GET /movies_ratings_index/_search
{
  "query": {
    "has_child": {
      "type": "rating",
      "query": {
        "range": {
          "rating_score": {
            "gt": 5
          }
        }
      }
    }
  },
  "aggs": {
    "movies_with_high_ratings": {
      "terms": {
        "field": "movie_title",
        "size": 10
      }
    }
  }
}
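
The has_child logic can be modelled in memory in the same way. This plain-Python sketch illustrates the semantics only: the query returns each parent document that has at least one matching child, so a movie appears once no matter how many of its ratings qualify. The threshold of 5 follows the question as stated; the function name titles_with_rating_above is hypothetical.

```python
# Parent documents: document id -> movie_title.
movies = {"1": "When Will I Be Loved", "2": "When Will I Be Disdained"}
# Child documents: (parent id, rating_score).
ratings = [("1", 4.5), ("1", 6.5), ("1", 36.5), ("1", 26.5), ("1", 16.5),
           ("2", 50.0)]


def titles_with_rating_above(threshold):
    """Return the titles of parents that have at least one child above threshold."""
    # Step 1 (range query on children): collect the parent ids of matching children.
    matching_parents = {parent_id for parent_id, score in ratings
                        if score > threshold}
    # Step 2 (has_child): each matching parent is returned exactly once.
    return sorted(movies[parent_id] for parent_id in matching_parents)


print(titles_with_rating_above(5))
```

With the sample data both movies qualify, because movie 1 has ratings such as 6.5 and movie 2 has a rating of 50. Since the hits are parent documents, each title bucket in the terms aggregation has a count of 1.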


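At this point it is clear that reproducing a MySQL-style table join in Elasticsearch requires the parent-child relationship to be defined in advance. Also, I have only covered the relationship between two tables here; how should joins across more tables be handled?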
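
One possible direction for the open question: the join field accepts more than one relation, so a parent can have several child types, and a child type can itself be a parent. The Python sketch below only builds and prints such a mapping. The relation names review and comment are hypothetical and do not appear in the article, and the sketch assumes a new index, since an index can hold only one join field.

```python
import json


def build_join_mapping():
    """Return a join mapping with several relations (hypothetical names)."""
    return {
        "properties": {
            "ratings": {
                "type": "join",
                "relations": {
                    # One parent type with two child types.
                    "movie": ["rating", "review"],
                    # A child type that is itself a parent (grandchild level).
                    "rating": "comment",
                },
            }
        }
    }


join_mapping = build_join_mapping()
print(json.dumps(join_mapping, indent=2))
```

Every document in such a family must still be indexed on the same shard, so a grandchild would be routed by the id of its top-level ancestor. The Elasticsearch documentation advises against deep multi-level relations because each level adds cost at query time, so denormalizing the data may be the better choice when many tables are involved.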
