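Overview
What is highlighting
Highlighting lets one or more fields be shown with emphasis in the search results, for example with the matched text in bold or in a color that differs from the surrounding text.
To highlight a field, Elasticsearch needs the field's actual content. If the field is stored (store set to true in the mapping), the stored value is used; otherwise the _source field is loaded automatically and the relevant field is extracted from it.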
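As a rough illustration of the store setting, the sketch below builds a mapping fragment as a plain Python dict that can be serialized and sent with any HTTP client. The helper name `stored_field_mapping` is hypothetical, the field name `address` is borrowed from the example later in this article, and the field type `text` is an assumption (older Elasticsearch versions use `string`). Where the `properties` object sits inside the full mapping body depends on the Elasticsearch version.

```python
import json


def stored_field_mapping(field, field_type="text"):
    """Build a `properties` fragment that stores `field` separately from _source."""
    # "store": True is the mapping setting named in the text above.
    return {"properties": {field: {"type": field_type, "store": True}}}


mapping = stored_field_mapping("address")
print(json.dumps(mapping, indent=2))
```

Without `store`, highlighting still works as long as _source is available, so this setting is an optimization choice rather than a prerequisite.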
Three highlighter types
Elasticsearch (ES) provides three highlighter types: the Lucene-based plain highlighter, the fast vector highlighter (fvh), and the posting highlighter.
Plain Highlighter
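The plain highlighter is the default choice and is implemented with the Lucene Highlighter. It tries to reflect the query matching logic as accurately as possible.
It is not fast when many fields are highlighted with a complex query. To reflect the query logic accurately, it builds a tiny in-memory index and re-runs the original query criteria through Lucene's query execution planner to get low-level match information for the current document. This is repeated for every field of every document that needs highlighting, so it can become a performance problem. In that case, consider switching to one of the other highlighter types.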
Fast Vector Highlighter
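If the field's mapping sets the term_vector parameter to with_positions_offsets, the fast vector highlighter replaces the plain highlighter as the default highlighter for that field.
Its main characteristics:
- It relies on term vectors, which increase the size of the index; of the two alternatives to the plain highlighter, the posting highlighter is the one that needs less disk space
- It breaks the text into fragments at sentence-like boundaries and highlights within them, which usually reads better
- It is faster than the plain highlighter because the text to be highlighted does not have to be analyzed again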
Posting Highlighter
If index_options is set to offsets in the mapping, the posting highlighter replaces the plain highlighter.
On large fields (over 1 MB) it performs better than the plain highlighter.
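The mapping requirements described above can be summarized in a small sketch. The helper name `field_mapping_for` and the keys `plain`, `fvh`, and `posting` are labels chosen only for this illustration, and the field type `text` is an assumption (older versions use `string`). The `term_vector` and `index_options` values are the ones named in the text.

```python
import json

# Extra mapping settings each highlighter relies on, per the descriptions above.
REQUIRED_SETTINGS = {
    "plain": {},                                       # no extra mapping setting
    "fvh": {"term_vector": "with_positions_offsets"},  # fast vector highlighter
    "posting": {"index_options": "offsets"},           # posting highlighter
}


def field_mapping_for(highlighter, field_type="text"):
    """Return a field mapping that lets `highlighter` be used on the field."""
    mapping = {"type": field_type}
    mapping.update(REQUIRED_SETTINGS[highlighter])
    return mapping


for name in REQUIRED_SETTINGS:
    print(name, json.dumps(field_mapping_for(name)))
```

Both settings take effect at index time, so changing them on an existing field generally means reindexing the data.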
Example
Find the records whose address contains mill or Court and highlight the matches.
The query is as follows:
GET /bank/_search
{
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "Court" } }
]
}
},
"highlight": {
"fields": {
"address": {}
}
}
}
The results are as follows:
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
},
"highlight" : {
"address" : [
"288 <em>Mill</em> Street"
]
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "18",
"_score" : 2.1248586,
"_source" : {
"account_number" : 18,
"balance" : 4180,
"firstname" : "Dale",
"lastname" : "Adams",
"age" : 33,
"gender" : "M",
"address" : "467 Hutchinson Court",
"employer" : "Boink",
"email" : "[email protected]",
"city" : "Orick",
"state" : "MD"
},
"highlight" : {
"address" : [
"467 Hutchinson <em>Court</em>"
]
}
}
Notice that the matched terms are automatically wrapped in <em></em> tags.
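On the client side, the highlighted terms can be pulled back out of the returned fragments. The following is a minimal sketch in Python: the helper name `highlighted_terms` is hypothetical, and the `hits` list is trimmed down by hand from the two results shown above to only the `highlight` part.

```python
import re


def highlighted_terms(fragment, pre_tag="<em>", post_tag="</em>"):
    """Return the substrings wrapped by the highlight tags in one fragment."""
    # Escape the tags so characters such as "<" and "/" are matched literally.
    pattern = re.escape(pre_tag) + r"(.*?)" + re.escape(post_tag)
    return re.findall(pattern, fragment)


# Hand-trimmed copies of the two hits shown above.
hits = [
    {"highlight": {"address": ["288 <em>Mill</em> Street"]}},
    {"highlight": {"address": ["467 Hutchinson <em>Court</em>"]}},
]

for hit in hits:
    for fragment in hit["highlight"]["address"]:
        print(highlighted_terms(fragment))
# prints ['Mill'] and then ['Court']
```

The tags are parameters so that the same helper still works if the default tags are changed.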
Custom highlight tags
The syntax is as follows:
"pre_tags": ["<tag1>"],
"post_tags": ["</tag1>"],
The query is as follows:
GET /bank/_search
{
"query": {
"bool": {
"should": [
{ "match": { "address": "mill" } },
{ "match": { "address": "Court" } }
]
}
},
"highlight": {
"pre_tags": ["<a>"],
"post_tags": ["</a>"],
"fields": {
"address": {}
}
}
}
The results are as follows:
{
"_index" : "bank",
"_type" : "account",
"_id" : "472",
"_score" : 5.4032025,
"_source" : {
"account_number" : 472,
"balance" : 25571,
"firstname" : "Lee",
"lastname" : "Long",
"age" : 32,
"gender" : "F",
"address" : "288 Mill Street",
"employer" : "Comverges",
"email" : "[email protected]",
"city" : "Movico",
"state" : "MT"
},
"highlight" : {
"address" : [
"288 <a>Mill</a> Street"
]
}
},
{
"_index" : "bank",
"_type" : "account",
"_id" : "18",
"_score" : 2.1248586,
"_source" : {
"account_number" : 18,
"balance" : 4180,
"firstname" : "Dale",
"lastname" : "Adams",
"age" : 33,
"gender" : "M",
"address" : "467 Hutchinson Court",
"employer" : "Boink",
"email" : "[email protected]",
"city" : "Orick",
"state" : "MD"
},
"highlight" : {
"address" : [
"467 Hutchinson <a>Court</a>"
]
}
}
Notice that the highlight tags have been replaced.
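If the same request is needed with different tags, the body can be generated instead of written by hand. This is only a sketch: `build_highlight_query` is a hypothetical helper, not part of any Elasticsearch client, and it simply reproduces the request body shown above as a Python dict.

```python
import json


def build_highlight_query(field, terms, pre_tag="<em>", post_tag="</em>"):
    """Build a bool/should match query on `field` with custom highlight tags."""
    return {
        "query": {
            "bool": {"should": [{"match": {field: term}} for term in terms]}
        },
        "highlight": {
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
            "fields": {field: {}},
        },
    }


body = build_highlight_query("address", ["mill", "Court"], "<a>", "</a>")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `GET /bank/_search` or sent as the request body by whichever HTTP client is in use.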