Reading a TSV file in Python: extracting data from a TSV file with Python

I have a TSV file that looks like this:

A B C D D=1;E=2

S D F G H=2;B=4

I'd like to write the contents to another TSV file like this:

A B C D D 1

A B C D E 2

S D F G H 2

S D F G B 4

I'd really appreciate any help or hints on how to split column 5 as shown.

Solution

If you are positively sure you only have tabs and semicolons, then you can use split.

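    with open('/tmp/test.tsv') as infile, open('/tmp/test2.tsv', 'w') as outfile:
        for line in infile:
            tsplit = line.split("\t")
            # Everything except the last column is copied to each output row.
            firstcolumns = tsplit[:-1]
            # The last column holds semicolon-separated key=value items.
            lastitems = tsplit[-1].strip().split(";")
            for item in lastitems:
                # Each key=value item becomes two extra columns.
                allcolumns = firstcolumns + item.split("=")
                outfile.write("\t".join(allcolumns) + "\n")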

(Updated to make it easier to compare with the other answer.)

This will work regardless of the number of semicolon-separated items you have in the last column. However, it is sensitive to small changes in the format: for example, any spaces added around the tabs, semicolons or equals signs are carried over into the output.
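If stray spaces are a concern, one possible variation is to strip each field and skip empty items. The following is only a minimal sketch, not part of the original answer: the function name split_last_column and the in-memory sample data are made up for illustration, and it assumes the fields themselves never contain tabs, semicolons or quote characters. It uses the standard csv module for reading and writing the tab-separated rows.

```python
import csv
import io


def split_last_column(infile, outfile):
    """Copy TSV rows, writing one output row per key=value item in the last column.

    Hypothetical helper for illustration; whitespace around fields is stripped.
    """
    reader = csv.reader(infile, delimiter="\t")
    writer = csv.writer(outfile, delimiter="\t", lineterminator="\n")
    for row in reader:
        if not row:
            continue  # skip blank lines
        first_columns = [cell.strip() for cell in row[:-1]]
        for item in row[-1].split(";"):
            if not item.strip():
                continue  # ignore empty items, e.g. after a trailing ';'
            # partition splits on the first '=' only
            key, _, value = item.partition("=")
            writer.writerow(first_columns + [key.strip(), value.strip()])


# In-memory demonstration with stray spaces and a trailing semicolon.
source = io.StringIO("A\tB\tC\tD\tD = 1; E=2\nS\tD\tF\tG\tH=2;B=4;\n")
result = io.StringIO()
split_last_column(source, result)
print(result.getvalue(), end="")
```

With real files, the two StringIO objects would be replaced by open('/tmp/test.tsv', newline='') and open('/tmp/test2.tsv', 'w', newline=''), as the csv module documentation recommends opening files with newline=''.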