
Installing the elasticsearch-analysis-ik Chinese word segmentation plugin for Elasticsearch

github: https://github.com/medcl/elasticsearch-analysis-ik

Installation

1. Check which Elasticsearch version you are running (the response includes the version number):

http://localhost:9200/

Then find the plugin release that matches that version:

https://github.com/medcl/elasticsearch-analysis-ik/releases
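
The version lookup can be scripted. The following is a minimal sketch, not part of the plugin's tooling: the helper names es_version_from_json and ik_release_url are made up for illustration, and the URL builder assumes every release follows the same naming pattern as the 6.3.0 URL used in step 2.

```shell
# Print the "number" field from the JSON returned by http://localhost:9200/
# (the JSON is read from stdin).
es_version_from_json() {
    sed -n 's/.*"number"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Build the release download URL for a given version.
# Assumption: all releases are named like the 6.3.0 one.
ik_release_url() {
    printf 'https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v%s/elasticsearch-analysis-ik-%s.zip\n' "$1" "$1"
}

# Usage against a running node (not executed here):
#   version=$(curl -s http://localhost:9200/ | es_version_from_json)
#   ./bin/elasticsearch-plugin install "$(ik_release_url "$version")"
```

It is still worth confirming on the releases page that a build for your exact version exists before installing.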

2. Install the plugin (replace 6.3.0 with your own version):

./bin/elasticsearch-plugin install https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v6.3.0/elasticsearch-analysis-ik-6.3.0.zip      

3. Restart Elasticsearch
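
After the restart it can help to confirm that the plugin was actually picked up. A small sketch follows; has_ik_plugin is a hypothetical helper, and it assumes the plugin registers itself under the name "analysis-ik".

```shell
# Succeeds if the plugin listing read from stdin mentions analysis-ik.
# Assumption: the plugin's registered name is "analysis-ik".
has_ik_plugin() {
    grep -q 'analysis-ik'
}

# Usage (not executed here):
#   ./bin/elasticsearch-plugin list | has_ik_plugin && echo "ik installed"
#   curl -s 'http://localhost:9200/_cat/plugins' | has_ik_plugin && echo "ik loaded"
```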

4. Test the tokenizer. Create an index, then send some text to its _analyze endpoint:

curl -X PUT 'localhost:9200/website'

curl -XGET "http://localhost:9200/website/_analyze" -H 'Content-Type: application/json' -d'
{
   "text":"中華人民共和國國歌","tokenizer": "ik_max_word"
}'      

Response:

{
    "tokens": [
        {
            "token": "中華人民共和國",
            "start_offset": 0,
            "end_offset": 7,
            "type": "CN_WORD",
            "position": 0
        },
        {
            "token": "中華人民",
            "start_offset": 0,
            "end_offset": 4,
            "type": "CN_WORD",
            "position": 1
        },
        {
            "token": "中華",
            "start_offset": 0,
            "end_offset": 2,
            "type": "CN_WORD",
            "position": 2
        },
        {
            "token": "華人",
            "start_offset": 1,
            "end_offset": 3,
            "type": "CN_WORD",
            "position": 3
        },
        {
            "token": "人民共和國",
            "start_offset": 2,
            "end_offset": 7,
            "type": "CN_WORD",
            "position": 4
        },
        {
            "token": "人民",
            "start_offset": 2,
            "end_offset": 4,
            "type": "CN_WORD",
            "position": 5
        },
        {
            "token": "共和國",
            "start_offset": 4,
            "end_offset": 7,
            "type": "CN_WORD",
            "position": 6
        },
        {
            "token": "共和",
            "start_offset": 4,
            "end_offset": 6,
            "type": "CN_WORD",
            "position": 7
        },
        {
            "token": "國",
            "start_offset": 6,
            "end_offset": 7,
            "type": "CN_CHAR",
            "position": 8
        },
        {
            "token": "國歌",
            "start_offset": 7,
            "end_offset": 9,
            "type": "CN_WORD",
            "position": 9
        }
    ]
}      

If the installation fails, you can install the plugin manually instead:

Unzip the downloaded package, copy its contents into the plugins/ik directory under the Elasticsearch home, and restart the service.
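
The manual route could look roughly like this. It is a sketch under assumptions: install_ik_manually is a hypothetical helper, the unzip tool is assumed to be installed, and the archive layout may differ between releases, so check that plugin-descriptor.properties ends up directly under plugins/ik.

```shell
# install_ik_manually ES_HOME ZIP_FILE
# Unpacks the plugin archive into ES_HOME/plugins/ik.
install_ik_manually() {
    es_home=$1
    zip_file=$2
    # Refuse to create anything if the archive is missing.
    if [ ! -f "$zip_file" ]; then
        echo "archive not found: $zip_file" >&2
        return 1
    fi
    mkdir -p "$es_home/plugins/ik" || return 1
    # -o overwrites files left over from an earlier attempt.
    unzip -q -o "$zip_file" -d "$es_home/plugins/ik"
}

# Usage (not executed here), run from the Elasticsearch home directory:
#   install_ik_manually . elasticsearch-analysis-ik-6.3.0.zip
```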

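ik_max_word: splits the text at the finest granularity, emitting every word it can find, including overlapping ones (as in the response above).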

ik_smart: splits the text at the coarsest granularity.
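
To see the difference between the two tokenizers, the request from step 4 can be run once with each and reduced to a plain token list. This is a sketch only: analyze and tokens_from_json are hypothetical helpers, and the text is pasted into the JSON body as-is, so it is assumed to contain no double quotes or backslashes.

```shell
# Print one token per line from an _analyze response read on stdin.
tokens_from_json() {
    grep -o '"token"[[:space:]]*:[[:space:]]*"[^"]*"' | cut -d'"' -f4
}

# analyze TOKENIZER TEXT
# Runs the _analyze request from step 4 with the given tokenizer.
# Assumption: TEXT needs no JSON escaping.
analyze() {
    curl -s -XGET 'http://localhost:9200/website/_analyze' \
        -H 'Content-Type: application/json' \
        -d "{\"text\":\"$2\",\"tokenizer\":\"$1\"}"
}

# Usage (not executed here):
#   analyze ik_max_word '中華人民共和國國歌' | tokens_from_json
#   analyze ik_smart    '中華人民共和國國歌' | tokens_from_json
```

Comparing the two lists side by side shows how much overlap ik_max_word produces for the same input.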

Reference: "Installing and using the IK analyzer with Elasticsearch 5.x"
