
Druid query error: not enough space to execute the query

Today, while running a groupBy query against Druid, I hit the following error:

{
    "error": "Resource limit exceeded",
    "errorMessage": "Not enough dictionary space to execute this query. Try increasing druid.query.groupBy.maxMergingDictionarySize or enable disk spilling by setting druid.query.groupBy.maxOnDiskStorage to a positive number.",
    "errorClass": "io.druid.query.ResourceLimitExceededException",
    "host": "ubuntu:8083"
}

Searching online, I found the cause: the query result is too large to fit in memory.

The suggested fix: "For this you need to increase the buffer sizes on all historical and realtime nodes and broker nodes."

That means changing the configuration on every machine in the Druid cluster, which I was in no position to do at the time.
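For reference, the error message itself names the two relevant knobs. If you do control the cluster, settings along these lines in runtime.properties on the historical and broker nodes would apply; the values below are purely illustrative, not tuned recommendations:

# Raise the per-query merging dictionary limit (bytes); value is illustrative:
druid.query.groupBy.maxMergingDictionarySize=200000000

# Or let groupBy merge state spill to disk (bytes; 0 disables spilling):
druid.query.groupBy.maxOnDiskStorage=1000000000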

Then I thought about how to shrink Druid's result set: since a full day's data wouldn't return, I tried querying only six hours and looked at the printed results.

That's when I realized the problem wasn't that the whole result set was too large, but that a few malformed posid values were. So I used a regular expression in the filter to keep only posid values made up entirely of digits, and the problem was solved:

"filter": { "type":"regex", "dimension":"posid", "pattern":"^[0-9]*$" }

Below is a complete example of the request:

curl -X POST \
  http://ip:port/druid/v2/ \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "aggregations":[
        {
            "fieldName":"request_num",
            "name":"request_num",
            "type":"longSum"
        },
        {
            "fieldName":"response_num",
            "name":"response_num",
            "type":"longSum"
        }
    ],
    "context":{

    },
    "dataSource":"table_name",
    "dimensions":[
        "posid"
    ],
    "filter": {
	    "type":"regex",
	    "dimension":"posid",
	    "pattern":"^[0-9]*$"
	   },
    "granularity": "month",
    "intervals":"2018-05-01T00:00:00.000Z/2018-06-01T00:00:00.000Z",
    "queryType":"groupBy"
}'| python -m json.tool
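The trailing pipe to python -m json.tool just pretty-prints the JSON response. The same query can also be sent from Python directly; here is a minimal sketch using the requests library, with the same placeholder ip:port as in the curl example:

import json
import requests

# Same groupBy query as the curl example above; "ip:port" is a placeholder.
query = {
    "queryType": "groupBy",
    "dataSource": "table_name",
    "granularity": "month",
    "intervals": "2018-05-01T00:00:00.000Z/2018-06-01T00:00:00.000Z",
    "dimensions": ["posid"],
    "filter": {"type": "regex", "dimension": "posid", "pattern": "^[0-9]*$"},
    "aggregations": [
        {"type": "longSum", "name": "request_num", "fieldName": "request_num"},
        {"type": "longSum", "name": "response_num", "fieldName": "response_num"},
    ],
}

resp = requests.post("http://ip:port/druid/v2/", json=query)
resp.raise_for_status()
print(json.dumps(resp.json(), indent=2, ensure_ascii=False))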