
Using Python to break text onto new lines at the keyword 48

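Yesterday I was processing a batch of serial-port data saved as text. There were many modes, each with a lot of data; the data comes in groups of ten bytes, and every group starts with 48. To save effort, the text can be processed with Python so that a line break is inserted before every 48.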

Raw data, for example: 48 54 05 00 03 00 66 18 00 22 48 54 05 00 03 00 66 18 00 22 48 54 05 00 03 00 66 18 00 22 48 54 04 00 00 00 00 00 00 A2 48 54 05 00 03 00 66 18 00 22 48 54 05 00 03 00 66 18 00 22 48 54 05 00 03 00 66 18 00 22 48 54 04 00 02 00 66 18 00 22

Many files like this had been saved.
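Because the data is described as fixed groups of ten bytes, one alternative to searching for the text 48 is to cut the stream every ten values. The sketch below is only an illustration under that assumption (Python 3; the names `split_frames`, `frame_len` and `header` are made up for this example). Unlike a plain text replacement, it does not break a group when a data byte inside it happens to be 48.

```python
def split_frames(hex_text, frame_len=10, header="48"):
    """Group whitespace-separated hex bytes into fixed-length frames,
    one frame per output line."""
    tokens = hex_text.split()
    frames = [tokens[i:i + frame_len] for i in range(0, len(tokens), frame_len)]
    for frame in frames:
        # Assumption: every frame starts with the header byte (48 here).
        if frame[0] != header:
            raise ValueError("frame does not start with %s: %s"
                             % (header, " ".join(frame)))
    return "\n".join(" ".join(frame) for frame in frames)


sample = "48 54 05 00 03 00 66 18 00 22 48 54 04 00 00 00 00 00 00 A2"
print(split_frames(sample))
```

If the stream ever loses a byte, the header check fails loudly instead of producing misaligned lines without warning.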

1. Process all files in a folder; suitable for large batches of files

#!/usr/bin/env python
#-*- coding: utf-8 -*-
#filename:exchange_newline_all.py

import re,os


def newline(file_name):  # conversion function
    f = open(file_name,'r')
    content = f.readlines()
    f.close()

    f = open(file_name,'w')
    for line in content:
        f.writelines(re.sub('48','\n48',line))  # insert a line break before each 48

    f.close()


while True:
    path = raw_input(r'INPUT THE FILEPATH(eg F:\sscom\dir):')  # folder to process
    print path
    l = os.listdir(path)  # list the files in the folder
    for name in l:  # process each file in turn
        file_x = os.path.join(path, name)
        newline(file_x)
        print "FINISHED,OK"

Output:

>>> ================================ RESTART ================================
>>> 
INPUT THE FILEPATH(eg F:\sscom\dir):F:\tmp\1307\sscom\tmp2\mode_test
F:\tmp\1307\sscom\tmp2\mode_test
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
FINISHED,OK
INPUT THE FILEPATH(eg F:\sscom\dir):      

After the script has run, the file content is:

48 54 05 00 03 00 66 18 00 22

48 54 05 00 03 00 66 18 00 22

48 54 05 00 03 00 66 18 00 22

48 54 04 00 00 00 00 00 00 A2

48 54 05 00 03 00 66 18 00 22

48 54 05 00 03 00 66 18 00 22

48 54 05 00 03 00 66 18 00 22

48 54 04 00 02 00 66 18 00 22

This makes the data much easier to work with.
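For readers on Python 3, the folder script might be sketched as follows. This is not the original script. The function names `break_before_keyword` and `process_folder` are hypothetical, and two behaviours are added as assumptions: anything that is not a .txt file is skipped, and the whitespace in front of each 48 is replaced by the line break so that no blank first line or trailing space is produced.

```python
import re
from pathlib import Path


def break_before_keyword(text, keyword="48"):
    """Start a new line before each standalone occurrence of keyword."""
    pattern = r"\s*\b" + re.escape(keyword) + r"\b"
    # The whitespace in front of the keyword is swallowed by the match;
    # lstrip() drops the empty first line created by a match at the start.
    return re.sub(pattern, lambda m: "\n" + keyword, text).lstrip()


def process_folder(folder):
    """Rewrite every .txt file in folder in place; return the names handled."""
    done = []
    for path in sorted(Path(folder).iterdir()):
        # Assumption: only plain-text files with a .txt/.TXT suffix are wanted.
        if path.is_file() and path.suffix.lower() == ".txt":
            path.write_text(break_before_keyword(path.read_text()))
            done.append(path.name)
    return done
```

A call such as `process_folder(r"F:\sscom\dir")` would rewrite the files in that folder. Because the existing whitespace before each 48 is consumed, running it twice on the same folder leaves the files unchanged, whereas a plain `re.sub('48', '\n48', ...)` adds another blank line on every run.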

2. Process a single file

#!/usr/bin/env python
#-*- coding: utf-8 -*-
#filename:exchange_newline2.py

import re,os



def newline(file_name):
    f = open(file_name,'r')
    content = f.readlines()
    f.close()

    f = open(file_name,'w')
    for line in content:
        f.writelines(re.sub('48','\n48',line))

    f.close()




while True:  # process a single file
    path = raw_input('INPUT THE FILEPATH(eg F:\sscom\crt.TXT):')
    file_path = r'%s' % path
    newline(file_path)
    print "FINISHED,OK"      

Output:

>>> ================================ RESTART ================================
>>> 
INPUT THE FILEPATH(eg F:\sscom\crt.TXT):F:\tmp\1307\sscom\tmp2\mode_test\custom.TXT
FINISHED,OK
INPUT THE FILEPATH(eg F:\sscom\crt.TXT):      

3. Pass the file path on the command line and process a single file

#!/usr/bin/env python
#-*- coding: utf-8 -*-
#filename:exchange_newline1.py

import re,os
from sys import argv


def newline(file_name):
    f = open(file_name,'r')
    content = f.readlines()
    f.close()

    f = open(file_name,'w')
    for line in content:
        f.writelines(re.sub('48','\n48',line))

    f.close()


script, path = argv
file_path = r'%s' % path
newline(file_path)      

Output:

D:\>cd Python\tmp

D:\Python\tmp>python exchange_newline1.py F:\tmp\1307\sscom\tmp2\mode_test\custo
m.TXT

D:\Python\tmp>      
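Unpacking `argv` directly raises a ValueError when the path is left out. One possible variation, sketched here for Python 3 with the standard `argparse` module, prints a usage message instead. The `--keyword` option and the `build_parser` and `main` names are assumptions added for illustration; the replacement itself mirrors the original `newline` function.

```python
import argparse
import re


def build_parser():
    parser = argparse.ArgumentParser(
        description="Insert a line break before every keyword in a text file.")
    parser.add_argument("path", help="text file to rewrite in place")
    # Hypothetical option: the original script always uses 48.
    parser.add_argument("--keyword", default="48",
                        help="text that should start each new line")
    return parser


def newline(file_name, keyword="48"):
    """Same replacement as the original: a line break before each keyword."""
    with open(file_name, "r") as f:
        content = f.read()
    with open(file_name, "w") as f:
        f.write(re.sub(re.escape(keyword), lambda m: "\n" + keyword, content))


def main(argv=None):
    # argv=None makes argparse read sys.argv[1:], as a script normally would.
    args = build_parser().parse_args(argv)
    newline(args.path, args.keyword)
    return 0
```

To use this as a script, call `main()` under the usual `if __name__ == "__main__":` guard and run it the same way as above, with the file path as the only required argument.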

This covers most needs, although only txt files are supported; Office files are not.