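Table of Contents
- 1. Parent / Child
- 2. Parent-Child Relationships
- 3. Example 1
- 3.1 Setting the Mapping
- 3.2 Indexing Parent Documents
- 3.3 Indexing Child Documents
- 3.4 Queries
- 4. Example 2
- 4.1 Mapping Definition
- 4.2 Defining Parent Documents with the join Type
- 4.3 Defining Child Documents with the join Type
- 4.4 Restrictions of the join Type
- 4.5 Retrieving All Documents
- 4.6 Finding Child Documents by Parent
- 4.7 Finding Parent Documents by Child
- 4.8 Join Aggregations in Practice
- 5. Join One-to-Many in Practice
- 5.1 One-to-Many Definition
- 5.2 Multi-Level (One-to-Many-to-Many) Definition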
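1. Parent / Child
Limitations of object and nested fields
- Every update requires reindexing the whole document (the root object together with its nested objects)
Elasticsearch provides something similar to a join in a relational database. It is implemented with the join data type: declaring a Parent / Child relationship lets the two objects be stored as separate documents
- The parent document and the child document are two independent documents
- Updating a parent document does not require reindexing its child documents. Adding, changing, or deleting a child document affects neither the parent document nor the other child documents.
2. Parent-Child Relationships
Defining a parent-child relationship takes a few steps
- Set the mapping of the index
- Index the parent documents
- Index the child documents
- Query the documents as needed
3. Example 1
3.1 Setting the Mapping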
DELETE my_blogs
# Set the Parent/Child mapping
PUT my_blogs
{
"settings": {
"number_of_shards": 2
},
"mappings": {
"properties": {
"blog_comments_relation": {
"type": "join",
"relations": {
"blog": "comment"
}
},
"content": {
"type": "text"
},
"title": {
"type": "keyword"
}
}
}
}
3.2 Indexing Parent Documents
# Index a parent document
PUT my_blogs/_doc/blog1
{
"title":"Learning Elasticsearch",
"content":"learning ELK @ geektime",
"blog_comments_relation":{
"name":"blog"
}
}
# Index a parent document
PUT my_blogs/_doc/blog2
{
"title":"Learning Hadoop",
"content":"learning Hadoop",
"blog_comments_relation":{
"name":"blog"
}
}
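3.3 Indexing Child Documents
Parent and child documents must be stored on the same shard
- This keeps join queries efficient
When indexing a child document, the ID of its parent document must be specified
- Use the routing parameter to guarantee that the child is allocated to the same shard as its parent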
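To see why the routing value matters, here is a minimal Python sketch of shard selection. It assumes the simplified rule "hash the routing value, then take it modulo the number of shards" and uses zlib.crc32 as a stand-in hash; the hash function and formula that Elasticsearch really uses differ in detail. pick_shard and shard_for are hypothetical helper names, not part of any client library.

```python
import zlib
from typing import Optional

NUMBER_OF_SHARDS = 2  # same value as "number_of_shards" in the my_blogs mapping


def pick_shard(routing_value: str, number_of_shards: int = NUMBER_OF_SHARDS) -> int:
    """Simplified shard selection: hash the routing value, take the modulus.

    zlib.crc32 is only a stand-in for the hash Elasticsearch uses internally.
    """
    return zlib.crc32(routing_value.encode("utf-8")) % number_of_shards


def shard_for(doc_id: str, routing: Optional[str] = None) -> int:
    """A document is routed by its own ID unless a routing value is given."""
    return pick_shard(routing if routing is not None else doc_id)


parent_shard = shard_for("blog2")
child_shard = shard_for("comment3", routing="blog2")

# With routing=blog2 the child hashes the same string as its parent,
# so both always land on the same shard.
print(parent_shard == child_shard)  # True
```

Without the routing value, the child would be placed by hashing its own ID, which may or may not give the parent's shard. That is why the same routing value has to be supplied again when the child is read, updated, or deleted.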
# Index a child document
PUT my_blogs/_doc/comment1?routing=blog1
{
"comment":"I am learning ELK",
"username":"Jack",
"blog_comments_relation":{
"name":"comment",
"parent":"blog1"
}
}
# Index a child document
PUT my_blogs/_doc/comment2?routing=blog2
{
"comment":"I like Hadoop!!!!!",
"username":"Jack",
"blog_comments_relation":{
"name":"comment",
"parent":"blog2"
}
}
# Index a child document
PUT my_blogs/_doc/comment3?routing=blog2
{
"comment":"Hello Hadoop",
"username":"Bob",
"blog_comments_relation":{
"name":"comment",
"parent":"blog2"
}
}
3.4 Queries
# Query all documents
POST my_blogs/_search
{
}
Response:
...
# Get a parent document by its ID
GET my_blogs/_doc/blog2
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "blog2",
"_version" : 1,
"_seq_no" : 1,
"_primary_term" : 1,
"found" : true,
"_source" : {
"title" : "Learning Hadoop",
"content" : "learning Hadoop",
"blog_comments_relation" : {
"name" : "blog"
}
}
}
# parent_id query: returns the children of a given parent
POST my_blogs/_search
{
"query": {
"parent_id": {
"type": "comment",
"id": "blog2"
}
}
}
Response:
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 0.44183275,
"hits" : [
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment2",
"_score" : 0.44183275,
"_routing" : "blog2",
"_source" : {
"comment" : "I like Hadoop!!!!!",
"username" : "Jack",
"blog_comments_relation" : {
"name" : "comment",
"parent" : "blog2"
}
}
},
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment3",
"_score" : 0.44183275,
"_routing" : "blog2",
"_source" : {
"comment" : "Hello Hadoop",
"username" : "Bob",
"blog_comments_relation" : {
"name" : "comment",
"parent" : "blog2"
}
}
}
]
}
}
# has_child query: returns parent documents
POST my_blogs/_search
{
"query": {
"has_child": {
"type": "comment",
"query" : {
"match": {
"username" : "Jack"
}
}
}
}
}
Response:
{
"took" : 43,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "blog1",
"_score" : 1.0,
"_source" : {
"title" : "Learning Elasticsearch",
"content" : "learning ELK @ geektime",
"blog_comments_relation" : {
"name" : "blog"
}
}
},
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "blog2",
"_score" : 1.0,
"_source" : {
"title" : "Learning Hadoop",
"content" : "learning Hadoop",
"blog_comments_relation" : {
"name" : "blog"
}
}
}
]
}
}
# has_parent query: returns the related child documents
POST my_blogs/_search
{
"query": {
"has_parent": {
"parent_type": "blog",
"query" : {
"match": {
"title" : "Learning Hadoop"
}
}
}
}
}
Response:
{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 2,
"successful" : 2,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 2,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment2",
"_score" : 1.0,
"_routing" : "blog2",
"_source" : {
"comment" : "I like Hadoop!!!!!",
"username" : "Jack",
"blog_comments_relation" : {
"name" : "comment",
"parent" : "blog2"
}
}
},
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment3",
"_score" : 1.0,
"_routing" : "blog2",
"_source" : {
"comment" : "Hello Hadoop",
"username" : "Bob",
"blog_comments_relation" : {
"name" : "comment",
"parent" : "blog2"
}
}
}
]
}
}
# Access a child document by ID only (without routing, the lookup may go to a different shard)
GET my_blogs/_doc/comment3
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment3",
"found" : false
}
# Access a child document by ID and routing
GET my_blogs/_doc/comment3?routing=blog2
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment3",
"_version" : 1,
"_seq_no" : 5,
"_primary_term" : 1,
"_routing" : "blog2",
"found" : true,
"_source" : {
"comment" : "Hello Hadoop",
"username" : "Bob",
"blog_comments_relation" : {
"name" : "comment",
"parent" : "blog2"
}
}
}
# Update a child document
PUT my_blogs/_doc/comment3?routing=blog2
{
"comment": "Hello Hadoop??",
"blog_comments_relation": {
"name": "comment",
"parent": "blog2"
}
}
Response:
{
"_index" : "my_blogs",
"_type" : "_doc",
"_id" : "comment3",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 6,
"_primary_term" : 1
}
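4. Example 2
4.1 Mapping Definition
The mapping for the join type is shown below.
Key points
• 1) "my_join_field" is the name of the join field.
• 2) "question": ["answer", "comment"] means that question is the parent of both answer and comment.
PUT my_join_index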
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": [
"answer",
"comment"
]
}
}
}
}
}
4.2 Defining Parent Documents with the join Type
These documents belong to the parent type, "question".
PUT my_join_index/_doc/1?refresh
{
"text": "This is a question",
"my_join_field": "question"
}
PUT my_join_index/_doc/2?refresh
{
"text": "This is another question",
"my_join_field": "question"
}
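4.3 Defining Child Documents with the join Type
• The routing value is mandatory, because parent and child documents must be indexed on the same shard.
• "answer" is the join name of this child document.
• The parent document ID of this child document is 1.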
PUT my_join_index/_doc/3?routing=1&refresh
{
"text": "This is an answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
}
PUT my_join_index/_doc/4?routing=1&refresh
{
"text": "This is another answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
}
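The rules in this section can be wrapped in a small helper so that the routing value and the parent ID never drift apart. This is a minimal Python sketch that only builds the request path and body. child_index_request is a hypothetical name, and sending the request is left to whatever HTTP or Elasticsearch client is in use. It assumes a two-level hierarchy, where the parent is the root document and its ID can serve as the routing value.

```python
import json
from urllib.parse import quote


def child_index_request(index, doc_id, parent_id, relation, source,
                        join_field="my_join_field"):
    """Return (method, path, body) for indexing one child document.

    The parent ID doubles as the routing value, which holds only when
    the parent is the root of the hierarchy.
    """
    path = (
        f"/{index}/_doc/{quote(doc_id, safe='')}"
        f"?routing={quote(parent_id, safe='')}"
    )
    body = dict(source)  # copy, so the caller's dict is not modified
    body[join_field] = {"name": relation, "parent": parent_id}
    return "PUT", path, body


method, path, body = child_index_request(
    "my_join_index", "3", "1", "answer", {"text": "This is an answer"}
)
print(method, path)
print(json.dumps(body, indent=2))
```

Deriving both values from the single parent_id argument makes it impossible to index a child whose parent field and routing value disagree.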
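4.4 Restrictions of the join Type
- Only one join field mapping is allowed per index.
- Parent and child documents must be indexed on the same shard. This means the same routing value has to be provided when deleting, updating, or getting a child document.
- A parent can have multiple children, but each child can have only one parent.
- A new relation can be added to an existing join field.
- A child can be added to an existing relation name, but only if that name is already a parent.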
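The "only one parent" rule can be checked on a relations definition before it is sent to the cluster. The following Python sketch inverts a relations dictionary into a child-to-parent map and fails if any name is listed under two parents. child_to_parent is a hypothetical helper, and the check is a client-side illustration rather than the validation Elasticsearch itself performs.

```python
def child_to_parent(relations):
    """Invert a join `relations` mapping into {child: parent}.

    Raises ValueError if a relation name is listed under two parents.
    """
    parents = {}
    for parent, children in relations.items():
        # A single child may be written as a plain string instead of a list.
        if isinstance(children, str):
            children = [children]
        for child in children:
            if child in parents:
                raise ValueError(
                    f"{child!r} already has parent {parents[child]!r}, "
                    f"cannot also be a child of {parent!r}"
                )
            parents[child] = parent
    return parents


relations = {"question": ["answer", "comment"], "answer": "vote"}
print(child_to_parent(relations))
```

A definition such as {"question": "comment", "answer": "comment"} would raise, because comment would end up with two parents.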
4.5 Retrieving All Documents
GET my_join_index/_search
{
"query": {
"match_all": {}
},
"sort": ["_id"]
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 4,
"max_score": null,
"hits": [
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "1",
"_score": null,
"_source": {
"text": "This is a question",
"my_join_field": "question"
},
"sort": [
"1"
]
},
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "2",
"_score": null,
"_source": {
"text": "This is another question",
"my_join_field": "question"
},
"sort": [
"2"
]
},
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "3",
"_score": null,
"_routing": "1",
"_source": {
"text": "This is an answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
},
"sort": [
"3"
]
},
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "4",
"_score": null,
"_routing": "1",
"_source": {
"text": "This is another answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
},
"sort": [
"4"
]
}
]
}
}
4.6 Finding Child Documents by Parent
GET my_join_index/_search
{
"query": {
"has_parent" : {
"parent_type" : "question",
"query" : {
"match" : {
"text" : "This is"
}
}
}
}
}
Response:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 1,
"hits": [
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "3",
"_score": 1,
"_routing": "1",
"_source": {
"text": "This is an answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
}
},
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "4",
"_score": 1,
"_routing": "1",
"_source": {
"text": "This is another answer",
"my_join_field": {
"name": "answer",
"parent": "1"
}
}
}
]
}
}
4.7 Finding Parent Documents by Child
GET my_join_index/_search
{
"query": {
"has_child" : {
"type" : "answer",
"query" : {
"match" : {
"text" : "This is question"
}
}
}
}
}
Response:
{
"took": 0,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 1,
"max_score": 1,
"hits": [
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "1",
"_score": 1,
"_source": {
"text": "This is a question",
"my_join_field": "question"
}
}
]
}
}
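4.8 Join Aggregations in Practice
The request below does the following:
• 1) parent_id is a dedicated query type; here it retrieves the child documents of type answer that belong to the parent document with id=1.
• 2) The terms aggregation buckets the hits by parent ID, using the my_join_field#question field of the parent type question.
• 3) The script field reads the parent ID from that same field.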
GET my_join_index/_search
{
"query": {
"parent_id": {
"type": "answer",
"id": "1"
}
},
"aggs": {
"parents": {
"terms": {
"field": "my_join_field#question",
"size": 10
}
}
},
"script_fields": {
"parent": {
"script": {
"source": "doc['my_join_field#question']"
}
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 5,
"successful": 5,
"skipped": 0,
"failed": 0
},
"hits": {
"total": 2,
"max_score": 0.13353139,
"hits": [
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "3",
"_score": 0.13353139,
"_routing": "1",
"fields": {
"parent": [
"1"
]
}
},
{
"_index": "my_join_index",
"_type": "_doc",
"_id": "4",
"_score": 0.13353139,
"_routing": "1",
"fields": {
"parent": [
"1"
]
}
}
]
},
"aggregations": {
"parents": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "1",
"doc_count": 2
}
]
}
}
}
5. Join One-to-Many in Practice
5.1 One-to-Many Definition
The mapping below defines one parent, question, with multiple children, answer and comment.
PUT join_ext_index
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": ["answer", "comment"]
}
}
}
}
}
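5.2 Multi-Level (One-to-Many-to-Many) Definition
The mapping below defines the three-generation relationship shown in this diagram.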
   question
    /    \
   /      \
comment  answer
           |
           |
          vote
PUT join_multi_index
{
"mappings": {
"properties": {
"my_join_field": {
"type": "join",
"relations": {
"question": [
"answer",
"comment"
],
"answer": "vote"
}
}
}
}
}
PUT join_multi_index/_doc/3?routing=1&refresh
{
"text": "This is a vote",
"my_join_field": {
"name": "vote",
"parent": "2"
}
}
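- The grandchild document must be on the same shard as its parent and its grandparent, so it is routed with the grandparent's ID (routing=1 here)
- The grandchild's parent ID must point to its direct parent, the answer document (ID 2 here)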
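Put differently, the parent field always names the direct parent, while the routing value is the ID of the root ancestor. A minimal Python sketch of that lookup follows. root_routing is a hypothetical helper, and the parent_of table assumes that document 2 is an answer whose parent is question 1, which the request above implies but does not show.

```python
def root_routing(doc_id, parent_of):
    """Walk up the parent chain and return the ID of the root ancestor.

    `parent_of` maps a document ID to the ID of its direct parent;
    root documents do not appear as keys.
    """
    seen = set()
    current = doc_id
    while current in parent_of:
        if current in seen:
            raise ValueError(f"cycle detected at {current!r}")
        seen.add(current)
        current = parent_of[current]
    return current


# Assumed IDs: question 1 -> answer 2 -> vote 3.
parent_of = {"2": "1", "3": "2"}

print(root_routing("3", parent_of))  # 1
```

For the vote document this gives parent "2" and routing "1", matching the request shown in this section.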