github: https://github.com/medcl/elasticsearch-analysis-ik
Installation
1. Check the Elasticsearch version first:
http://localhost:9200/
Then find the matching plugin release:
https://github.com/medcl/elasticsearch-analysis-ik/releases
2. Install the plugin
./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip
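The plugin version in the URL has to match the running Elasticsearch version. As a minimal sketch, the two steps above could be tied together as follows. The helper names es_version_from_json and ik_plugin_url are hypothetical, and the URL builder assumes the release for your version follows the same pattern as the v6.3.0 link above, which may not hold for every release.

```shell
# Hypothetical helpers; the names are illustrative, not part of Elasticsearch or IK.

# Extract the version number from the JSON returned by http://localhost:9200/
# (read from stdin). Assumes the usual "number" : "x.y.z" field is present.
es_version_from_json() {
  sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Build a release URL for the given version, assuming the same URL pattern
# as the v6.3.0 download link.
ik_plugin_url() {
  version="$1"
  printf '%s\n' "https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v${version}/elasticsearch-analysis-ik-${version}.zip"
}

# Usage (needs a running node, so it is left commented out):
# version=$(curl -s http://localhost:9200/ | es_version_from_json)
# ./bin/elasticsearch-plugin install "$(ik_plugin_url "$version")"
```

It is worth checking on the releases page that the generated URL exists before running the install.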
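3. Restart Elasticsearch
4. Test the tokenizer: create an index named website, then run a sample text through ik_max_word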
curl -X PUT 'localhost:9200/website'
curl -XGET "http://localhost:9200/website/_analyze" -H 'Content-Type: application/json' -d'
{
"text":"中華人民共和國國歌","tokenizer": "ik_max_word"
}'
Response:
{
"tokens": [
{
"token": "中華人民共和國",
"start_offset": 0,
"end_offset": 7,
"type": "CN_WORD",
"position": 0
},
{
"token": "中華人民",
"start_offset": 0,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
},
{
"token": "中華",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 2
},
{
"token": "華人",
"start_offset": 1,
"end_offset": 3,
"type": "CN_WORD",
"position": 3
},
{
"token": "人民共和國",
"start_offset": 2,
"end_offset": 7,
"type": "CN_WORD",
"position": 4
},
{
"token": "人民",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 5
},
{
"token": "共和國",
"start_offset": 4,
"end_offset": 7,
"type": "CN_WORD",
"position": 6
},
{
"token": "共和",
"start_offset": 4,
"end_offset": 6,
"type": "CN_WORD",
"position": 7
},
{
"token": "國",
"start_offset": 6,
"end_offset": 7,
"type": "CN_CHAR",
"position": 8
},
{
"token": "國歌",
"start_offset": 7,
"end_offset": 9,
"type": "CN_WORD",
"position": 9
}
]
}
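If the install fails, the plugin can be installed manually:
Unpack the downloaded package, copy its contents into plugins/ik under the Elasticsearch directory, then restart the service.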
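The manual route could look roughly like this. It is a sketch under assumptions: install_ik_manually is a hypothetical helper, the zip has already been downloaded from the releases page, and the paths in the usage lines are examples only.

```shell
# Hypothetical helper: unpack a downloaded IK release zip into plugins/ik.
# Arguments: $1 = Elasticsearch home directory, $2 = path to the zip file.
install_ik_manually() {
  es_home="$1"
  zip_file="$2"
  plugin_dir="${es_home}/plugins/ik"
  # Create the plugin directory if it does not exist yet
  mkdir -p "$plugin_dir" || return 1
  # -o overwrites existing files, -d selects the target directory
  unzip -o "$zip_file" -d "$plugin_dir"
}

# Usage (example paths, adjust to your setup):
# install_ik_manually /usr/local/elasticsearch ./elasticsearch-analysis-ik-6.3.0.zip
# Then restart Elasticsearch so the plugin is loaded.
```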
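ik_max_word: splits the text at the finest granularity, emitting every word it recognizes, including overlapping ones (as in the response above).
ik_smart: splits the text at the coarsest granularity, producing fewer and longer tokens.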
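To see the difference, the same text can be sent to both tokenizers. The sketch below assumes the website index from step 4 still exists. analyze_body is a hypothetical helper, and it does no JSON escaping, so the text must not contain double quotes or backslashes.

```shell
# Hypothetical helper: print an _analyze request body for a tokenizer and a text.
# Assumes the text contains no double quotes or backslashes (no JSON escaping).
analyze_body() {
  printf '{"tokenizer": "%s", "text": "%s"}\n' "$1" "$2"
}

# Usage against the "website" index (needs a running node, so commented out):
# for t in ik_max_word ik_smart; do
#   curl -XGET "http://localhost:9200/website/_analyze" \
#     -H 'Content-Type: application/json' \
#     -d "$(analyze_body "$t" "中華人民共和國國歌")"
# done
```

Comparing the two responses side by side shows how many fewer tokens ik_smart returns for the same input.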
Reference: "Installing and using the IK analyzer on Elasticsearch 5.x"