Log Service Data Transformation Best Practices: Processing Text in Specific Formats

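The cases in this section come from requirements raised in real support tickets. For each one, we start from the ticket requirement and then walk through the transformation orchestration (the solution) to show how LOG DSL rules can be combined to meet it.

Scenario: Converting a non-standard JSON object to JSON and expanding it

The collected dict data needs to be expanded through its nested levels. The solution is to first convert the dict data to JSON and then expand it with the e_json function.

Raw log

The log collected in the console is in dict format, as shown below: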

content: {
    'referer': '-',
    'request': 'GET /phpMyAdmin',
    'status': 404,
    'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
    },
    'data-2': {
        'up_adde': '-',
        'up_host': '-'
    }
}

LOG DSL orchestration

1. First, convert the content data above to JSON format:

e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))

After this step, the log looks like this:

content: {
    'referer': '-',
    'request': 'GET /phpMyAdmin',
    'status': 404,
    'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
    },
    'data-2': {
        'up_adde': '-',
        'up_host': '-'
    }
}
content_json:  {
    "referer": "-",
    "request": "GET /phpMyAdmin",
    "status": 404,
    "data-1": {
        "aaa": "Mozilla",
        "bbb": "asde"
    },
    "data-2": {
        "up_adde": "-",
        "up_host": "-"
    }
}
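
To see why the quote replacement is enough for this particular log, the step can be reproduced locally in plain Python. This is only a sketch of the idea, not how the DSL functions are implemented; the `content` string below is the sample log flattened onto one line.

```python
import json

# Raw field value as collected: a Python-style dict literal with single quotes.
content = (
    "{'referer': '-', 'request': 'GET /phpMyAdmin', 'status': 404, "
    "'data-1': {'aaa': 'Mozilla', 'bbb': 'asde'}, "
    "'data-2': {'up_adde': '-', 'up_host': '-'}}"
)

# Same idea as str_replace(..., "'", '"'): swap every single quote for a double quote.
content_json = content.replace("'", '"')

# The result now parses as standard JSON.
parsed = json.loads(content_json)
print(parsed["data-1"]["aaa"])
```

Note that a blanket replacement like this only works when no key or value contains an apostrophe or a double quote of its own, which holds for the sample log here.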

2. Expand the normalized content_json data. For example, to expand only the first level, set the depth parameter of e_json to 1:

e_json("content_json",depth=1,fmt='full')

The expanded log is then:

content_json.data-1:  {"aaa": "Mozilla", "bbb": "asde"}
content_json.data-2:  {"up_adde": "-", "up_host": "-"}
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404

If depth is set to 2, the expanded log is:

content_json.data-1.aaa:  Mozilla
content_json.data-1.bbb:  asde
content_json.data-2.up_adde:  -
content_json.data-2.up_host:  -
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404
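
The effect of depth on the two outputs above can be imitated with a small recursive flattener. The helper name `expand_json` is hypothetical and written only for this illustration; it mimics the key naming seen in the sample output (full key path joined with dots) and is not the implementation of e_json.

```python
import json

content_json = (
    '{"referer": "-", "request": "GET /phpMyAdmin", "status": 404, '
    '"data-1": {"aaa": "Mozilla", "bbb": "asde"}, '
    '"data-2": {"up_adde": "-", "up_host": "-"}}'
)

def expand_json(field, value, depth):
    """Hypothetical helper: flatten JSON text up to `depth` levels."""
    out = {}

    def walk(prefix, node, level):
        if isinstance(node, dict) and level < depth:
            # Still within the depth limit: descend and extend the key path.
            for key, child in node.items():
                walk(f"{prefix}.{key}", child, level + 1)
        elif isinstance(node, (dict, list)):
            # Depth limit reached: keep the nested part as JSON text.
            out[prefix] = json.dumps(node)
        else:
            # Scalars are stored as plain strings for this illustration.
            out[prefix] = str(node)

    walk(field, json.loads(value), 0)
    return out

for key, val in sorted(expand_json("content_json", content_json, 1).items()):
    print(f"{key}:  {val}")
```

Calling the helper with a depth of 2 instead descends one more level, so `data-1` and `data-2` are replaced by their individual leaf keys.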

3. Putting it together, the LOG DSL rules can be written as follows:

e_set("content_json",str_replace(ct_str(v("content")),"'",'"'))
e_json("content_json",depth=2,fmt='full')

Transformed data

The transformed data below was produced with depth set to 2:

content:  {
    'referer': '-',
    'request': 'GET /phpMyAdmin',
    'status': 404,
    'data-1': {
        'aaa': 'Mozilla',
        'bbb': 'asde'
    },
    'data-2': {
        'up_adde': '-',
        'up_host': '-'
    }
}
content_json:  {
    "referer": "-",
    "request": "GET /phpMyAdmin",
    "status": 404,
    "data-1": {
        "aaa": "Mozilla",
        "bbb": "asde"
    },
    "data-2": {
        "up_adde": "-",
        "up_host": "-"
    }
}
content_json.data-1.aaa:  Mozilla
content_json.data-1.bbb:  asde
content_json.data-2.up_adde:  -
content_json.data-2.up_host:  -
content_json.referer:  -
content_json.request:  GET /phpMyAdmin
content_json.status:  404

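Scenario: Converting text in other formats to JSON and expanding it

Data in some non-standard JSON-like formats can also be expanded by combining several rules.

The raw log is collected in the following format: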

content : {
    "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
    }, "node" => {
        "name" => "tw5"
    }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
    }, "container" => {
        "name" => "crm-learning-follow"
    }, "namespace" => "testing1"
}

1. First, convert the log to JSON. The str_logtash_config_normalize function can be used for the conversion:

e_set("normalize_data",str_logtash_config_normalize(v("content")))

2. Then expand it with the e_json function:

e_json("normalize_data",depth=1,fmt='full')

3. Combined, the LOG DSL rules are:

e_set("normalize_data",str_logtash_config_normalize(v("content")))
e_json("normalize_data",depth=1,fmt='full')

Transformed data:

content : {
    "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
    }, "node" => {
        "name" => "tw5"
    }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
    }, "container" => {
        "name" => "crm-learning-follow"
    }, "namespace" => "testing1"
}
normalize_data:  {
    "pod": {
        "name": "crm-learning-follow-7bc48f8b6b-m6kgb"
    },
    "node": {
        "name": "tw5"
    },
    "labels": {
        "pod-template-hash": "7bc48f8b6b",
        "app": "crm-learning-follow"
    },
    "container": {
        "name": "crm-learning-follow"
    },
    "namespace": "testing1"
}
normalize_data.container.container:  {"name": "crm-learning-follow"}
normalize_data.labels.labels:  {"pod-template-hash": "7bc48f8b6b", "app": "crm-learning-follow"}
normalize_data.namespace:  testing1
normalize_data.node.node:  {"name": "tw5"}
normalize_data.pod.pod:  {"name": "crm-learning-follow-7bc48f8b6b-m6kgb"}
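
The normalization step above amounts to turning the `=>` separators into JSON colons. A rough local stand-in, assuming the input has no `=>` inside any string value, might look like the following; how str_logtash_config_normalize itself is implemented is not shown in this post, and it may well handle more cases.

```python
import json

# Sample log value using "=>" between keys and values.
content = '''{
    "pod" => {
        "name" => "crm-learning-follow-7bc48f8b6b-m6kgb"
    }, "node" => {
        "name" => "tw5"
    }, "labels" => {
        "pod-template-hash" => "7bc48f8b6b", "app" => "crm-learning-follow"
    }, "container" => {
        "name" => "crm-learning-follow"
    }, "namespace" => "testing1"
}'''

# Naive normalization (an assumption for illustration): "=>" becomes ":".
normalize_data = content.replace("=>", ":")

# The normalized text is valid JSON and can be parsed or expanded further.
parsed = json.loads(normalize_data)
print(parsed["labels"]["app"])
```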

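Scenario: Converting specially encoded text

In real working environments you will sooner or later run into hexadecimal escape sequences that have to be decoded before they can be read. The str_hex_escape_encode function can be used to convert such hexadecimal escapes.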

content : "\xe4\xbd\xa0\xe5\xa5\xbd"

e_set("hex_encode",str_hex_escape_encode(v("content")))

content : "\xe4\xbd\xa0\xe5\xa5\xbd"
hex_encode : "你好"
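
For readers who want to check what those escapes contain, the same conversion can be done in plain Python. This is a sketch of the underlying idea (the escapes are UTF-8 bytes written out as text), not the implementation of the DSL function.

```python
# The field holds the literal characters backslash, x, e, 4, ... (not real bytes).
content = r"\xe4\xbd\xa0\xe5\xa5\xbd"

# Step 1: interpret each \xNN escape as one byte value.
raw_bytes = content.encode("ascii").decode("unicode_escape").encode("latin-1")

# Step 2: decode the recovered bytes as UTF-8 text.
decoded = raw_bytes.decode("utf-8")
print(decoded)  # 你好
```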

Scenario: Expanding XML fields

Test log

Other data types also turn up from time to time, such as XML. To expand XML data, use the xml_to_json function.

str : <?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

e_set("str_json",xml_to_json(v("str")))

Transformed log

str : <?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>
str_json :{
    "data": {
        "country": [{
            "@name": "Liechtenstein",
            "rank": "1",
            "year": "2008",
            "gdppc": "141100",
            "neighbor": [{
                "@name": "Austria",
                "@direction": "E"
            }, {
                "@name": "Switzerland",
                "@direction": "W"
            }]
        }, {
            "@name": "Singapore",
            "rank": "4",
            "year": "2011",
            "gdppc": "59900",
            "neighbor": {
                "@name": "Malaysia",
                "@direction": "N"
            }
        }, {
            "@name": "Panama",
            "rank": "68",
            "year": "2011",
            "gdppc": "13600",
            "neighbor": [{
                "@name": "Costa Rica",
                "@direction": "W"
            }, {
                "@name": "Colombia",
                "@direction": "E"
            }]
        }]
    }
}
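
The shape of the JSON above (attributes prefixed with `@`, repeated child tags collected into a list, a single child kept as an object) can be reproduced with Python's standard library. The `element_to_dict` helper below is hypothetical and written only to illustrate that mapping on a shortened copy of the sample; it is not the implementation of xml_to_json.

```python
import json
import xml.etree.ElementTree as ET

xml_str = '''<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <neighbor name="Malaysia" direction="N"/>
    </country>
</data>'''

def element_to_dict(elem):
    """Hypothetical helper: map an element to a dict, or to a str for plain leaves."""
    # Attributes come first and are prefixed with "@".
    node = {f"@{name}": value for name, value in elem.attrib.items()}
    for child in elem:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tag: collect all occurrences in a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (elem.text or "").strip()
    if not node:
        return text  # no attributes and no children: just the text
    if text:
        node["#text"] = text  # mixed content; key name is an assumption
    return node

root = ET.fromstring(xml_str)
result = {root.tag: element_to_dict(root)}
print(json.dumps(result, indent=4))
```

As in the transformed log, Singapore's single neighbor stays an object while Liechtenstein's two neighbors become a list, so downstream rules need to handle both shapes.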
