
Druid query error: not enough space to execute the query

Today, while running a groupBy query against Druid, I got the following error:

{
    "error": "Resource limit exceeded",
    "errorMessage": "Not enough dictionary space to execute this query. Try increasing druid.query.groupBy.maxMergingDictionarySize or enable disk spilling by setting druid.query.groupBy.maxOnDiskStorage to a positive number.",
    "errorClass": "io.druid.query.ResourceLimitExceededException",
    "host": "ubuntu:8083"
}

Searching online, I found the cause: the query result set doesn't fit in memory.

The suggested fix was: "For this you need to increase the buffer sizes on all historical and realtime nodes and broker nodes."

That means changing the configuration on every Druid machine, which I can't do at the moment.
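
For anyone who does have access to the cluster configuration, the error message itself names the two relevant settings. A minimal sketch of what that change might look like in the runtime.properties of the broker and historical nodes (the byte values below are placeholders, not recommendations):

# Raise the in-memory limit for the groupBy merging dictionary (bytes)
druid.query.groupBy.maxMergingDictionarySize=200000000
# Or let groupBy spill merge results to disk once the memory limit is hit (bytes)
druid.query.groupBy.maxOnDiskStorage=1000000000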

So I started thinking about how to shrink the Druid result set: since a full day's data couldn't be queried, I tried six hours instead and looked at the printed result.

That's when I realized the problem wasn't the size of the whole result set, but a handful of malformed posid values that were far too large. So I added a regex to the filter to keep only posid values consisting entirely of digits, and the problem was solved:

"filter": { "type":"regex", "dimension":"posid", "pattern":"^[0-9]*$" }

Here is a complete request for reference:

curl -X POST \
  http://ip:port/druid/v2/ \
  -H 'cache-control: no-cache' \
  -H 'content-type: application/json' \
  -d '{
    "aggregations":[
        {
            "fieldName":"request_num",
            "name":"request_num",
            "type":"longSum"
        },
        {
            "fieldName":"response_num",
            "name":"response_num",
            "type":"longSum"
        }
    ],
    "context":{

    },
    "dataSource":"table_name",
    "dimensions":[
        "posid"
    ],
    "filter": {
	    "type":"regex",
	    "dimension":"posid",
	    "pattern":"^[0-9]*$"
	   },
    "granularity": "month",
    "intervals":"2018-05-01T00:00:00.000Z/2018-06-01T00:00:00.000Z",
    "queryType":"groupBy"
}'| python -m json.tool
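
The same query can of course be sent from a script instead of curl. A minimal sketch using the requests library, assuming the same placeholder ip:port and datasource as above:

import json
import requests

# Same broker endpoint as in the curl example (ip:port is a placeholder)
url = "http://ip:port/druid/v2/"

query = {
    "queryType": "groupBy",
    "dataSource": "table_name",
    "granularity": "month",
    "intervals": "2018-05-01T00:00:00.000Z/2018-06-01T00:00:00.000Z",
    "dimensions": ["posid"],
    "filter": {"type": "regex", "dimension": "posid", "pattern": "^[0-9]*$"},
    "aggregations": [
        {"type": "longSum", "name": "request_num", "fieldName": "request_num"},
        {"type": "longSum", "name": "response_num", "fieldName": "response_num"},
    ],
}

resp = requests.post(url, json=query)
print(json.dumps(resp.json(), indent=2, ensure_ascii=False))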
           
