
E-commerce project: Elasticsearch basics and problems solved

Basics

Check whether an index exists: es.indices.exists(index=name)

Delete an index: es.indices.delete(index=name, ignore=[400, 404])

Create an index: es.indices.create(index=name, body=wsku_index_body)
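The three index calls above are often combined into a "drop and recreate" helper. The following is a minimal sketch, assuming `es` is an elasticsearch-py client (or any object exposing the same `indices.exists`, `indices.delete` and `indices.create` methods used above). The name `recreate_index` is made up for illustration. It checks for existence first rather than relying on `ignore=[400, 404]`, which may be handled differently across client versions.

```python
def recreate_index(es, name, body):
    """Delete index `name` if it exists, then create it with `body`.

    `es` is assumed to be an elasticsearch-py style client.
    Returns True when an existing index was removed first.
    """
    existed = bool(es.indices.exists(index=name))
    if existed:
        es.indices.delete(index=name)
    es.indices.create(index=name, body=body)
    return existed
```

A call would look like `recreate_index(es, name, wsku_index_body)`.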

Delete documents by query:

query = {'query': {'match': {'level': 'warning'}}}

es.delete_by_query(index='api_log', body=query)

Basic search: GET /api_log/_search?q=critical

GET /api_log/_search
{
    "query": {
        "bool": {
            "must": [{
                "match_phrase": {
                    "url": {
                        "query": "/api/v1.0/reseller/sms/active"
                    }
                }
            },
                {
                    "range": {
                        "now": {
                            "gte": "2019-11-23"
                        }
                    }
                }
            ]
        }
    }
}
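The same request can be built from Python so that the URL and start date become parameters. This is a sketch only: the helper name `url_since_query` and its `time_field` parameter are assumptions, and the field names `url` and `now` are taken from the request above.

```python
def url_since_query(url, since, time_field="now"):
    """Match an exact URL phrase and keep entries dated on or after `since`."""
    return {
        "query": {
            "bool": {
                "must": [
                    {"match_phrase": {"url": {"query": url}}},
                    {"range": {time_field: {"gte": since}}},
                ]
            }
        }
    }


body = url_since_query("/api/v1.0/reseller/sms/active", "2019-11-23")
```

With a client at hand, the body would typically be passed as `es.search(index='api_log', body=body)`.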
           

Boost the relevance of matches in the summary field by a factor of three:

"fields": ["title", "summary^3"]
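The `field^boost` strings can be generated from a mapping of field names to weights. The sketch below assumes the `fields` list belongs to a `multi_match` query; the helper names `boosted_fields` and `multi_match_body` are hypothetical.

```python
def boosted_fields(boosts):
    """Turn {"title": 1, "summary": 3} into ["title", "summary^3"].

    A boost of 1 is the default, so it is left implicit.
    """
    return [name if boost == 1 else f"{name}^{boost}" for name, boost in boosts.items()]


def multi_match_body(text, boosts):
    """Build a multi_match search body with per-field boosts."""
    return {"query": {"multi_match": {"query": text, "fields": boosted_fields(boosts)}}}
```

For example, `boosted_fields({"title": 1, "summary": 3})` produces the list shown above.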

must is equivalent to AND, should to OR, and must_not to AND NOT.

"bool": {
    "should": [
        { "match": { "request": "order_id" }},
        { "match": { "request": "order_ids" }}
    ],
    "must": { "match": { "request": "1" }}
}
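One caveat when reading should as OR: in a bool query that also contains a must (or filter) clause, the should clauses are optional by default and only affect scoring, unless `minimum_should_match` is set. The sketch below builds a bool clause that requires at least one should match; the helper name `bool_query` and its `require_should` flag are assumptions for illustration.

```python
def bool_query(must=(), should=(), must_not=(), require_should=True):
    """Build a bool clause; optionally force at least one `should` to match."""
    clauses = {}
    if must:
        clauses["must"] = list(must)
    if should:
        clauses["should"] = list(should)
        if require_should:
            # Without this, `should` only influences scoring once `must` is present.
            clauses["minimum_should_match"] = 1
    if must_not:
        clauses["must_not"] = list(must_not)
    return {"bool": clauses}


order_query = bool_query(
    must=[{"match": {"request": "1"}}],
    should=[
        {"match": {"request": "order_id"}},
        {"match": {"request": "order_ids"}},
    ],
)
```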
           

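"fuzziness": "AUTO" adds fuzzy matching that tolerates spelling mistakes, which is useful for product search. Regular expressions and wildcards such as ., * and [a-z] are supported as well.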
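To make the effect of AUTO concrete, the sketch below mirrors the documented default rule (terms of up to 2 characters must match exactly, 3 to 5 characters allow one edit, longer terms allow two) and computes an edit distance in which swapping two adjacent characters counts as a single edit. It is an illustration in plain Python, not the algorithm Elasticsearch runs internally; all function names are made up.

```python
def auto_fuzziness(term):
    """Allowed edits for a term under the default AUTO rule."""
    n = len(term)
    if n <= 2:
        return 0
    return 1 if n <= 5 else 2


def edit_distance(a, b):
    """Edit distance where an adjacent transposition counts as one edit."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i] + [0] * len(b)
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            # Adjacent characters swapped: "ae" vs "ea".
            if i > 1 and j > 1 and ca == b[j - 2] and a[i - 2] == cb:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def is_fuzzy_match(term, candidate):
    """True when `candidate` is within the AUTO edit budget of `term`."""
    return edit_distance(term, candidate) <= auto_fuzziness(term)
```

Under these assumptions a misspelling such as `saerch` is one edit away from `search`, well within the budget for a six-character term.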

"type": "phrase",

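A phrase query requires every term of the query string to be present in the document, in the same order as in the query string and adjacent to one another. "slop" controls how far apart the terms may be. Use it for exact lookups.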

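"query": "(saerch~1 algorithm~1) AND (grant ingersoll) OR (tom morton)" searches for several strings at once; ~1 allows a fuzzy match with an edit distance of 1, which is why the misspelled saerch still matches search. The simple_query_string query replaces AND / OR / NOT with + / | / -, which makes it a simpler fit when users type into a search box.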
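As an illustration of the operator mapping, the naive sketch below rewrites AND / OR / NOT into the + / | / - form and wraps the result in a `simple_query_string` body. It is a plain text substitution (it would also rewrite these words inside quoted phrases), and both function names are hypothetical.

```python
import re


def to_simple_query(text):
    """Naively rewrite AND / OR / NOT into + / | / - operators."""
    text = re.sub(r"\s+AND\s+", " + ", text)
    text = re.sub(r"\s+OR\s+", " | ", text)
    text = re.sub(r"\bNOT\s+", "-", text)
    return text


def search_box_body(user_input, fields):
    """Build a simple_query_string body for text typed into a search box."""
    return {
        "query": {
            "simple_query_string": {
                "query": to_simple_query(user_input),
                "fields": list(fields),
            }
        }
    }
```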

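"fields": ["_all", "summary^2"] boosts the summary field by a factor of two (the _all field only exists in older Elasticsearch versions).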

Term query and terms (multi-term) query

"query": { "term" : { "publisher": "manning" } }

Sorting

"sort": [ { "publish_date": {"order":"desc"}}, { "title": { "order": "desc" }} ]

Range

"query": { "range" : { "publish_date": { "gte": "2015-01-01", "lte": "2015-12-31" } } }

Filter

"filter": { "range" : { "num_reviews": { "gte": 20 } } }

Query that restricts the returned fields with _source

POST /pro_product/_search
{
    "_source": ["wcate_id"],
    "query": {
        "bool": {
            "must": [{"terms": {"wcate_id": [6239,6240,6241,6242,6243,6244,6245,6246,6247,6248,6249]}}]
        }
    },
    "sort": [
        {"priority": "asc"}, {"wcate_id": "desc"}
    ],
    "size": 20,
    "search_after": [1, 12802]
}
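The `search_after` values in the request above are the sort values of the last hit on the previous page. A minimal sketch of deriving the next request from a response follows; it assumes the usual search response shape in which each hit carries a `sort` array, and the helper name `next_page_body` is made up.

```python
def next_page_body(body, response):
    """Return the request body for the following page, or None when no hits remain."""
    hits = response["hits"]["hits"]
    if not hits:
        return None
    following = dict(body)  # shallow copy so the original request is untouched
    following["search_after"] = hits[-1]["sort"]
    return following
```

Paging this way is only stable when the sort keys identify a document uniquely; if priority and wcate_id can repeat across documents, adding a unique tiebreaker field to the sort is worth considering.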

OR / AND within a field

title:(quick OR brown)

Exact match

author:"John Smith"

Where any of the fields book.title, book.content or book.date contains quick or brown:

book.\*:(quick brown)
           

Field exists:

_exists_:title
           

Single-character wildcard ? and multi-character wildcard *

qu?ck bro*
           

Regular expression patterns can be embedded in the query string by wrapping them in forward slashes ("/"):

name:/joh?n(ath[oa]n)/
           

Ranges:

date:[2012-01-01 TO 2012-12-31]
           
count:[10 TO *]
           
age:(>=10 AND <20)
age:(+>=10 +<20)
           

The + and - operators: + marks a term that must be present, - a term that must not be present.

Steps for changing a field type in Elasticsearch

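text: analyzed into tokens and then indexed; does not support aggregations by default

       supports fuzzy and exact queries (exact matching applies to individual tokens)

keyword: not analyzed, indexed as-is; supports aggregations

       supports fuzzy and exact queries (exact matching applies to the whole value)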

1. Retrieve the structure of the existing index

GET /pro_api_log

2. Create the new index

PUT /pro_api_log2
{
    "settings": {
        "index": {
            "max_result_window": 30000,
            "analysis": {
                "analyzer": {
                    "custom_standard": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "char_filter": ["filter_char_filter"],
                        "filter": "lowercase"
                    }
                },
                "char_filter": {
                    "filter_char_filter": {
                        "type": "mapping",
                        "mappings": [
                            "· => xxDOT1xx",
                            "+ => xxPLUSxx",
                            "- => xxMINUSxx",
                            "\" => xxQUOTATIONxx",
                            "( => xxLEFTBRACKET1xx",
                            ") => xxRIGHTBRACKET1xx",
                            "& => xxANDxx",
                            "| => xxVERTICALxx",
                            "—=> xxUNDERLINExx",
                            "/=> xxSLASHxx",
                            "!=> xxEXCLAxx",
                            "•=> xxDOT2xx",
                            "【=>xxLEFTBRACKET2xx",
                            "】 => xxRIGHTBRACKET2xx",
                            "`=>xxapostrophexx",
                            ".=>xxDOT3xx",
                            "#=>xxhashtagxx",
                            ",=>xxcommaxx"
                        ]
                    }
                }
            },
            "number_of_shards": 3,
            "number_of_replicas": 1
        }
    },
    "mappings": {
        "wemore": {
            "properties": {
                "now": { "type": "date", "index": true },
                "ip": { "type": "keyword", "index": true },
                "name": { "type": "keyword", "index": true },
                "request": { "type": "text", "index": true },
                "response": { "type": "text", "index": true },
                "level": { "type": "keyword", "index": true },
                "url": { "type": "keyword", "index": true },
                "method": { "type": "keyword", "index": true },
                "exception": { "type": "text", "index": true }
            }
        }
    }
}

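3. Copy the old data into the new index. This can take a long time and the request may hit a network timeout, but the copy can still be followed by querying the new index and watching the document count grow.

POST _reindex
{
  "source": {
    "index": "pro_api_log"
  },
  "dest": {
    "index": "pro_api_log2",
    "version_type": "external" # keep source versions: create missing documents, update those whose version in the destination is older
  }
}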

4. Check the reindex progress; watch total, updated and created

GET _tasks?detailed=true&actions=*reindex 

5. Delete the old index

DELETE /pro_api_log

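6. Recreate the index under its original name with the changed field types; the rest of the body is the same as above. If documents are still being inserted while the old index is deleted, the insert creates a default index first, so the index ends up with the old field definitions. Make sure no writes interfere during this step.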

PUT /pro_api_log

7. Copy the data back

POST _reindex
{
  "source": {
    "index": "pro_api_log2"
  },
  "dest": {
    "index": "pro_api_log",
    "version_type": "external"
  }
}
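The whole procedure can be sketched in Python as follows. This assumes `es` is an elasticsearch-py style client offering `indices.create`, `indices.delete` and `reindex`; the function names are hypothetical, and the sketch runs each reindex as a blocking call, so a real migration of a large index would additionally need to deal with request timeouts and pause writers before the delete. `reindex_progress` is a rough estimate from the `status` object reported by the tasks API, using the total, created and updated counters mentioned in step 4.

```python
def reindex_body(source, dest):
    """Body for POST _reindex, copying `source` into `dest`."""
    return {
        "source": {"index": source},
        "dest": {"index": dest, "version_type": "external"},
    }


def change_field_type(es, index, tmp_index, new_body):
    """Copy `index` to `tmp_index`, recreate `index` with `new_body`, copy back."""
    es.indices.create(index=tmp_index, body=new_body)   # step 2
    es.reindex(body=reindex_body(index, tmp_index))     # step 3
    es.indices.delete(index=index)                      # step 5
    es.indices.create(index=index, body=new_body)       # step 6
    es.reindex(body=reindex_body(tmp_index, index))     # step 7


def reindex_progress(status):
    """Rough fraction of documents copied so far (created + updated over total)."""
    total = status.get("total", 0)
    if not total:
        return 0.0
    return (status.get("created", 0) + status.get("updated", 0)) / total
```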

Problems solved

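1. Logs other than the API logs (our own printed logs) could not be inserted into Elasticsearch

Our own log entries do not carry the full set of fields. Testing showed that Elasticsearch accepts documents with missing fields; every field may be absent. Kibana, however, sorts by the configured time field now, so a document without that field does not show up in queries.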

POST /test_api_log/_doc/
{
    "ip": "666.666.666.666",
    "now": "2019-12-02T02:40:02.685213"
}
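One way to make sure hand-written log entries always carry the time field is to stamp them just before indexing. The sketch below is an assumption about how this could be done, not the project's actual code; `with_timestamp` and its `clock` parameter are hypothetical names.

```python
from datetime import datetime


def with_timestamp(doc, field="now", clock=datetime.now):
    """Return `doc` with `field` set to an ISO 8601 timestamp if it is missing."""
    if field in doc:
        return doc
    stamped = dict(doc)  # leave the caller's dict untouched
    # datetime.now is local time without a zone, like the sample document above.
    stamped[field] = clock().isoformat()
    return stamped
```

The `clock` parameter exists only so the timestamp can be fixed in tests.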

2. Purge logs periodically, removing entries older than one month

POST /pro_api_log/_delete_by_query
{
    "query": {
        "range": {
            "now": {
                "lt": "now-30d/d"
            }
        }
    }
}
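For a scheduled job, the same request can be issued through the Python client. A minimal sketch, assuming `es` exposes `delete_by_query(index=..., body=...)` as used in the basics section; the function names and the `days` parameter are made up for illustration.

```python
def purge_body(days=30, time_field="now"):
    """Delete-by-query body matching entries older than `days` days."""
    if days <= 0:
        raise ValueError("days must be positive")
    # "now-30d/d" is date math: 30 days ago, rounded down to the start of the day.
    return {"query": {"range": {time_field: {"lt": f"now-{days}d/d"}}}}


def purge_old_logs(es, index, days=30):
    """Remove old log entries from `index` and return the client's response."""
    return es.delete_by_query(index=index, body=purge_body(days))
```

A cron job or similar scheduler could then call `purge_old_logs(es, 'pro_api_log')` once a day.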

3. Troubleshooting: inspect how a request body is tokenized

POST /pro_api_log/_analyze
{
  "analyzer": "standard",
  "text": """{"cod_charge":0,"product_id":[33005,33006,33007,33008,33009,33010],"COD":1}"""
}

Regular expression query

{
    "regexp": {
        "request": "[0-9|,]*8697[0-9|,]*"
    }
}
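A regexp query is matched against whole indexed terms, so the pattern has to cover the entire token. The sketch below approximates that with Python's `re.fullmatch`; Lucene's regular expression syntax differs from Python's in places, but this pattern only uses a character class and `*`, which behave the same in both. It assumes the analyzer keeps a comma-separated list of IDs together as one token, which is worth confirming with the _analyze request above. `STRICT_PATTERN` is a hypothetical tighter variant, not something taken from the original.

```python
import re

PATTERN = "[0-9|,]*8697[0-9|,]*"
# Hypothetical stricter variant: 8697 must be a complete list element.
STRICT_PATTERN = "(.*,)?8697(,.*)?"


def term_matches(pattern, term):
    """True when `pattern` matches the whole term, as a regexp query requires."""
    return re.fullmatch(pattern, term) is not None


print(term_matches(PATTERN, "33005,8697,33007"))   # True
print(term_matches(PATTERN, "33005,33006"))        # False
print(term_matches(PATTERN, "186970"))             # True: 8697 inside a longer number
print(term_matches(STRICT_PATTERN, "186970"))      # False
```

The third case shows that the original pattern also matches longer numbers that merely contain 8697, which may or may not be acceptable for the search at hand.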
