Python Data Processing: Converting a JSON File to a TXT File

2023-06-25 11:28:41

1. Read the JSON data line by line into a list

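import json

def loaddata(path):
    jsonlist = []
    with open(path, encoding='utf-8') as file:
        for line in file:  # the file holds one JSON object per line
            if not line.strip():  # skip blank lines, e.g. a trailing empty line
                continue
            record = json.loads(line)  # parse the line into a dict
            # Serialize the dict back to a string; ensure_ascii=False keeps
            # Chinese characters readable instead of \uXXXX escapes
            line = json.dumps(record, ensure_ascii=False)
            jsonlist.append(line)
    return jsonlist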
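
As a rough illustration of what this step produces, the sketch below runs the same logic on a small temporary file and shows the effect of ensure_ascii. The sample records and the temporary file are assumptions made up for this example, and loaddata is repeated only so the block runs on its own.

```python
import json
import os
import tempfile


def loaddata(path):
    """Read a file with one JSON object per line; return normalized JSON strings."""
    jsonlist = []
    with open(path, encoding="utf-8") as file:
        for line in file:
            if not line.strip():  # skip blank lines
                continue
            record = json.loads(line)
            jsonlist.append(json.dumps(record, ensure_ascii=False))
    return jsonlist


# Hypothetical sample input: two records, the second with irregular spacing.
sample_lines = ['{"text": "北京是中国的首都", "id": 1}', '{"text": "hello",   "id": 2}']
with tempfile.NamedTemporaryFile("w", suffix=".json", encoding="utf-8", delete=False) as tmp:
    tmp.write("\n".join(sample_lines) + "\n")
    sample_path = tmp.name

loaded = loaddata(sample_path)
os.remove(sample_path)

for item in loaded:
    print(item)  # each item is a str, not a dict

# With the default ensure_ascii=True, non-ASCII characters are escaped.
escaped = json.dumps({"text": "北京"})
readable = json.dumps({"text": "北京"}, ensure_ascii=False)
print(escaped)
print(readable)
```

Note that the list holds strings rather than dictionaries, so each element can be written to a text file as one line without further conversion.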

2. Split the list into a training set and a test set

def datasplit(data, num_train):
    train = data[:num_train]  # first num_train records form the training set
    dev = data[num_train:]  # the remaining records form the test (dev) set
    return train, dev
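
A small sketch of how the split might be called. The 80/20 ratio and the placeholder records are assumptions for illustration; the original only takes a fixed count. The split is positional, so the data is not shuffled.

```python
def datasplit(data, num_train):
    train = data[:num_train]
    dev = data[num_train:]
    return train, dev


# Hypothetical data: ten placeholder records.
records = [f"record-{i}" for i in range(10)]

# Assumed 80/20 ratio, converted to the count that datasplit expects.
num_train = int(len(records) * 0.8)
train, dev = datasplit(records, num_train)
print(len(train), len(dev))

# A count larger than the list leaves the second part empty.
all_train, empty_dev = datasplit(records, 100)
print(len(all_train), len(empty_dev))
```

If the source file is ordered (for example by topic), shuffling the list before splitting may be worth considering.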

3. Convert the JSON dataset format

import json
from tqdm import tqdm

def json_conversion(file_path):
    json_sentences = []
    with open(file=file_path, encoding='utf-8') as f:
        data = f.readlines()  # read all lines
        for record in tqdm(data, total=len(data)):  # iterate over each line
            entity_relation_list = []
            record = record.strip('\n')  # drop the trailing newline
            record = json.loads(record)  # parse into a dict
            text = record['text']
            spo_list = record['spo_list']
            for spo in spo_list:  # iterate over the list of triples
                predicate = spo['predicate']
                obj = spo['object']['@value']  # `obj` avoids shadowing the built-in `object`
                subject = spo['subject']
                entity_relation = {"em1Text": subject, "em2Text": obj, "label": predicate}
                entity_relation_list.append(entity_relation)
            json_text = {"sentText": text, "relationMentions": entity_relation_list}  # rebuild the record in the target layout
            json_text = json.dumps(json_text, ensure_ascii=False)  # serialize to a string; ensure_ascii=False keeps Chinese characters unescaped
            json_sentences.append(json_text)
    return json_sentences
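
To make the input and output layouts concrete, here is a minimal sketch of the same mapping applied to a single record. The helper name convert_record and the sample sentence are hypothetical, introduced only for this example; the field names come from the function above.

```python
import json


def convert_record(record):
    """Map one record with 'text' and 'spo_list' to the sentText/relationMentions layout."""
    mentions = [
        {
            "em1Text": spo["subject"],
            "em2Text": spo["object"]["@value"],
            "label": spo["predicate"],
        }
        for spo in record["spo_list"]
    ]
    return {"sentText": record["text"], "relationMentions": mentions}


# Hypothetical input line in the source layout (one subject-predicate-object triple).
sample_line = json.dumps(
    {
        "text": "阿司匹林用于治疗头痛",
        "spo_list": [
            {"subject": "阿司匹林", "predicate": "治疗", "object": {"@value": "头痛"}}
        ],
    },
    ensure_ascii=False,
)

converted = convert_record(json.loads(sample_line))
converted_line = json.dumps(converted, ensure_ascii=False)
print(converted_line)
```

A record whose spo_list is empty yields an empty relationMentions list, while a record missing the 'text' or 'spo_list' key raises KeyError, which matches the behavior of the loop above.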

4. Write the list data to a .txt file

from tqdm import tqdm

def write_data(path, data):
    with open(path, "w", encoding="utf-8") as fw:
        for ele in tqdm(data, total=len(data)):
            fw.write(str(ele) + "\n")  # one record per line
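
The four steps can be chained roughly as follows. This is a sketch under stated assumptions rather than the original author's script: the file names, the three made-up records, and the split count of 2 are all hypothetical, and the tqdm progress bars are left out so the block runs with the standard library alone.

```python
import json
import os
import shutil
import tempfile


def json_conversion(file_path):
    """Convert each source line to a sentText/relationMentions JSON string."""
    sentences = []
    with open(file_path, encoding="utf-8") as f:
        for raw in f:
            if not raw.strip():
                continue
            record = json.loads(raw)
            mentions = [
                {"em1Text": spo["subject"], "em2Text": spo["object"]["@value"], "label": spo["predicate"]}
                for spo in record["spo_list"]
            ]
            converted = {"sentText": record["text"], "relationMentions": mentions}
            sentences.append(json.dumps(converted, ensure_ascii=False))
    return sentences


def datasplit(data, num_train):
    return data[:num_train], data[num_train:]


def write_data(path, data):
    with open(path, "w", encoding="utf-8") as fw:
        for ele in data:
            fw.write(str(ele) + "\n")


workdir = tempfile.mkdtemp()
source_path = os.path.join(workdir, "source.json")  # hypothetical file names
train_path = os.path.join(workdir, "train.txt")
dev_path = os.path.join(workdir, "dev.txt")

# Write three made-up source records, one JSON object per line.
with open(source_path, "w", encoding="utf-8") as f:
    for i in range(3):
        rec = {
            "text": f"sentence {i}",
            "spo_list": [{"subject": f"s{i}", "predicate": "rel", "object": {"@value": f"o{i}"}}],
        }
        f.write(json.dumps(rec, ensure_ascii=False) + "\n")

sentences = json_conversion(source_path)
train, dev = datasplit(sentences, 2)
write_data(train_path, train)
write_data(dev_path, dev)

# Read the results back to confirm one JSON string per line.
with open(train_path, encoding="utf-8") as f:
    train_lines = f.read().splitlines()
with open(dev_path, encoding="utf-8") as f:
    dev_lines = f.read().splitlines()
shutil.rmtree(workdir)

print(len(train_lines), len(dev_lines))
```

Each line of the resulting .txt files is itself valid JSON, so a downstream reader can parse them with json.loads line by line.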
