
Notes on the Structure of the Elasticsearch DSL

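For anyone new to Elasticsearch (commonly abbreviated ES), the query DSL can be hard to read: it is not obvious why clauses nest the way they do, and hand-written queries keep failing. This post offers one way to read the DSL; experienced users can skip it.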

Official documentation:

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl.html

https://www.elastic.co/guide/en/elasticsearch/reference/current/search.html

Basic structure

{
    QUERY_NAME: {
        ARGUMENT: VALUE,
        ARGUMENT: VALUE,...
    }
}
or
{
    QUERY_NAME: {
        FIELD_NAME: {
            ARGUMENT: VALUE,
            ARGUMENT: VALUE,...
        }
    }
}
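
To make the two shapes concrete, here is a minimal sketch that writes one query of each shape as a Python dict and serializes it to the JSON that Elasticsearch expects. The field name "title", the search text, and the helper query_name are made up for illustration.

```python
import json

# Shape 1: QUERY_NAME -> arguments (match_all takes its arguments directly).
match_all_query = {"match_all": {"boost": 1.2}}

# Shape 2: QUERY_NAME -> FIELD_NAME -> arguments (match is scoped to one field).
# "title" and the search text are hypothetical.
match_query = {"match": {"title": {"query": "quick brown fox", "operator": "and"}}}


def query_name(clause):
    """Return the single QUERY_NAME key of a query clause."""
    (name,) = clause.keys()  # raises ValueError if the clause has more than one key
    return name


print(query_name(match_all_query))
print(json.dumps(match_query, indent=2))
```

Reading a clause from the outside in (query name first, then field, then arguments) is usually enough to tell which of the two shapes it follows.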


GET /_search?pretty
{"explain":true,
 "version":true,
 "from":0,
 "size":10,
 "stored_fields":["field1","field2"],
 "query":{QUERY_CLAUSE},
 "facets":{FACETS_CLAUSE},
 "aggs":{AGGREGATIONS_CLAUSE},
 "_source": {},
 "script_fields":{},
 "post_filter":{},
 "highlight":{},
 "track_scores":true,
 "sort":{SORT_CLAUSE}
}
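
As a rough illustration of how the top-level keys fit together, the sketch below assembles a search body in Python. The helper build_search_body, the field names, and the paging scheme are assumptions for this example, not part of any Elasticsearch client.

```python
import json


def build_search_body(query, page=0, page_size=10, source_fields=None, sort=None):
    """Assemble a search request body from a query clause and paging options."""
    body = {
        "from": page * page_size,  # offset of the first hit to return
        "size": page_size,         # number of hits per page
        "query": query,
    }
    if source_fields is not None:
        body["_source"] = source_fields
    if sort is not None:
        body["sort"] = sort
    return body


body = build_search_body(
    {"match": {"title": "elasticsearch"}},       # hypothetical field and text
    page=2,
    source_fields=["title", "created_at"],
    sort=[{"created_at": {"order": "desc"}}],
)
print(json.dumps(body, indent=2))
```

The resulting dict can be serialized and sent as the body of GET /_search; keys that are not needed are simply left out.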
           

See also:

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-request-stored-fields.html

Possible clauses inside the top-level query

{
  term:{TERM_QUERY}
| terms:{TERMS_QUERY}
| terms_set:{TERMS_SET_QUERY}
| range:{RANGE_QUERY}
| prefix:{PREFIX_QUERY}
| exists:{EXISTS_QUERY}
| wildcard:{WILDCARD_QUERY}
| match_all:{MATCH_ALL_QUERY}
| match_none:{MATCH_NONE_QUERY}
| match:{MATCH_QUERY}
| match_phrase:{MATCH_PHRASE_QUERY}
| match_phrase_prefix:{MATCH_PHRASE_PREFIX_QUERY}
| multi_match:{MULTI_MATCH_QUERY}
| common:{COMMON_TERM_QUERY}
| query_string:{QUERY_STRING_QUERY}
| simple_query_string:{SIMPLE_QUERY_STRING_QUERY}
| bool:{BOOLEAN_QUERY}
| dis_max:{DISMAX_QUERY}
| constant_score:{CONSTANT_SCORE_QUERY}
| nested:{NESTED_QUERY}
| has_child:{HAS_CHILD_QUERY}    -- returns parent documents whose child documents match the query
| has_parent:{HAS_PARENT_QUERY}  -- returns child documents whose parent documents match the query
| parent_id:{PARENT_ID_QUERY}
| boosting:{BOOSTING_QUERY}
| function_score:{FUNCTION_SCORE_QUERY}
| fuzzy:{FUZZY_QUERY}
| regexp:{REGEXP_QUERY}
| type:{TYPE_QUERY}
| ids:{IDS_QUERY}
| more_like_this:{MORE_LIKE_THIS_QUERY}
| percolate:{PERCOLATE_QUERY}
| span queries
}

Each query type is described in detail in the official Query DSL documentation linked above.

FILTER_CLAUSE combinations
{ query:{QUERY_CLAUSE}
| term:{TERM_FILTER}
| range:{RANGE_FILTER}
| prefix:{PREFIX_FILTER}
| wildcard:{WILDCARD_FILTER}
| bool:{BOOLEAN_FILTER}
| constant_score:{CONSTANT_SCORE_QUERY}
}
           
FACETS_CLAUSE combinations (facets are legacy syntax, removed in Elasticsearch 2.0 in favor of aggregations)
{ $FACET_NAME_1:
 {filter_clause},
 $FACET_NAME_2:
 {filter_clause},
 ...
}
           
BOOLEAN_FILTER combinations
{
   must:
    {FILTER_CLAUSE}
  | [{filter_clause},...],
   should:
    {FILTER_CLAUSE}
  | [{filter_clause},...],
   must_not:
    {FILTER_CLAUSE}
  | [{filter_clause},...],
   minimum_should_match
}
           
BOOLEAN_QUERY combinations
{
      must:{QUERY_CLAUSE}|[{QUERY_CLAUSE},...],
    filter:{QUERY_CLAUSE}|[{QUERY_CLAUSE},...],
    should:{QUERY_CLAUSE}|[{QUERY_CLAUSE},...],
  must_not:{QUERY_CLAUSE}|[{QUERY_CLAUSE},...],
     boost:FLOAT,
   minimum_should_match
}
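
The grammar above can be sketched as a small Python builder. This is only an illustration: bool_query and as_list are hypothetical helpers, and the field names are invented. Note that Elasticsearch expects the key must_not, with an underscore.

```python
import json


def as_list(clauses):
    """Accept a single clause or a list of clauses, as the grammar allows both."""
    if clauses is None:
        return []
    return clauses if isinstance(clauses, list) else [clauses]


def bool_query(must=None, should=None, must_not=None,
               minimum_should_match=None, boost=None):
    """Build a bool query, omitting any occurrence type that has no clauses."""
    body = {}
    for key, clauses in (("must", must), ("should", should), ("must_not", must_not)):
        normalized = as_list(clauses)
        if normalized:
            body[key] = normalized
    if minimum_should_match is not None:
        body["minimum_should_match"] = minimum_should_match
    if boost is not None:
        body["boost"] = boost
    return {"bool": body}


query = bool_query(
    must={"match": {"title": "elasticsearch"}},
    should=[{"term": {"tag": "dsl"}}, {"term": {"tag": "tutorial"}}],
    must_not={"range": {"views": {"lt": 10}}},
    minimum_should_match=1,
)
print(json.dumps(query, indent=2))
```

Because every clause inside must, should, and must_not is itself a QUERY_CLAUSE, a bool query can be nested inside another bool query to express more complex logic.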
           
Aggregations
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
"aggregations[aggs]" : {
    "<aggregation_name>" : {
        "<aggregation_type>" : {
            <aggregation_body>
        }
        [,"meta" : {  [<meta_data_body>] } ]?
        [,"aggregations" : { [<sub_aggregation>]+ } ]?
    }
    [,"<aggregation_name_2>" : { ... } ]*
}
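
As a minimal sketch of this template, the Python snippet below fills in one named aggregation with a meta block and one sub-aggregation, then walks the nesting. The aggregation names, field names, and the helper aggregation_paths are assumptions made for this example.

```python
import json

# One named aggregation ("by_country") of type "terms", with optional
# metadata and a nested sub-aggregation ("avg_age") of type "avg".
aggs = {
    "by_country": {
        "terms": {"field": "country"},
        "meta": {"owner": "reporting"},
        "aggs": {
            "avg_age": {"avg": {"field": "age"}},
        },
    }
}


def aggregation_paths(aggs, prefix=""):
    """List every aggregation name, joining nested names with '>'."""
    paths = []
    for name, body in aggs.items():
        path = f"{prefix}>{name}" if prefix else name
        paths.append(path)
        # Sub-aggregations may live under either "aggs" or "aggregations".
        sub = body.get("aggs") or body.get("aggregations") or {}
        paths.extend(aggregation_paths(sub, path))
    return paths


print(aggregation_paths(aggs))
print(json.dumps({"size": 0, "aggs": aggs}, indent=2))
```

The chosen aggregation names come back as keys in the response, so descriptive names make the results easier to read.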
           
When to use a query and when to use a filter

As a general rule, use query clauses for full-text search or for any condition that should affect the relevance score, and use filters for everything else.

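In general, because a filter does not compute a relevance score at search time and its results can be cached, it is more efficient than a query.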
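
A minimal sketch of that rule, assuming a bool query whose must clause carries the full-text condition and whose filter clause carries the yes/no conditions. The helper with_filters and all field names are hypothetical.

```python
import json


def with_filters(scoring_query, filters):
    """Wrap a scoring query and non-scoring filters in a single bool query."""
    return {
        "query": {
            "bool": {
                "must": [scoring_query],   # contributes to the relevance score
                "filter": list(filters),   # only includes or excludes documents
            }
        }
    }


search_body = with_filters(
    {"match": {"title": "search engine"}},
    [
        {"term": {"status": "published"}},
        {"range": {"publish_date": {"gte": "2018-01-01"}}},
    ],
)
print(json.dumps(search_body, indent=2))
```

Exact-value and range conditions such as status or publish_date do not need a score, so placing them under filter keeps scoring work limited to the match clause.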

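Parent-child relationship examples

Assume branch is the parent document type of employee.

Finding parents by their children

The has_child query and filter find parent documents based on the contents of their children.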

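For example, find all branches that have an employee whose name matches "Alice Smith". The score_mode parameter controls how the scores of the matching children are combined into the parent's score: none (the default), max, min, avg, or sum.

GET /company/branch/_search
{
  "query": {
    "has_child": {
      "type":       "employee",
      "score_mode": "max",
      "query": {
        "match": {
          "name": "Alice Smith"
        }
      }
    }
  }
}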

Finding children by their parents

The has_parent query returns children based on the data of their parents.

GET /company/employee/_search
{
  "query": {
    "has_parent": {
      "type": "branch", 
      "query": {
        "match": {
          "country": "UK"
        }
      }
    }
  }
}
           

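The results of a has_parent filter are not cached; the usual caching mechanism applies only to the filter inside the has_parent filter.

Parent-child supports the children aggregation. In the Elasticsearch version these examples were written for, a corresponding parent aggregation is not supported.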

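Determine the employees' favorite hobbies by country. The outer terms aggregation runs on country, a field of the parent (branch) document; the children aggregation joins the parent documents to their associated children of type employee; the inner terms aggregation runs on the hobby field of the employee child document. Setting "size" to 0 returns only the aggregation results and replaces the older search_type=count parameter.

GET /company/branch/_search
{
  "size": 0,
  "aggs": {
    "country": {
      "terms": {
        "field": "country"
      },
      "aggs": {
        "employees": {
          "children": {
            "type": "employee"
          },
          "aggs": {
            "hobby": {
              "terms": {
                "field": "employee.hobby"
              }
            }
          }
        }
      }
    }
  }
}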
           
Using es-sql

http://localhost:9200/_sql/_explain?sql=select member_id as a from label_20180205 where mobile = '13627182930'

http://localhost:9200/_sql?sql=select * from indexName limit 10
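
The SQL text in these URLs contains spaces, quotes, and an asterisk, so it is safer to percent-encode it before sending the request. The sketch below only builds the URL; es_sql_url is a hypothetical helper, and the host and endpoints are taken from the examples above.

```python
from urllib.parse import quote, unquote


def es_sql_url(sql, base="http://localhost:9200", explain=False):
    """Build an es-sql request URL with the SQL statement percent-encoded."""
    path = "/_sql/_explain" if explain else "/_sql"
    return f"{base}{path}?sql={quote(sql, safe='')}"


url = es_sql_url("select * from indexName limit 10")
explain_url = es_sql_url(
    "select member_id as a from label_20180205 where mobile = '13627182930'",
    explain=True,
)
print(url)
print(explain_url)
```

The _explain variant returns the DSL that the SQL statement translates to, which makes it a convenient way to compare a familiar SQL query with its DSL equivalent.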

For details, see: https://github.com/NLPchina/elasticsearch-sql/wiki
