
Regular expressions: using re.findall (repost)

http://www.cnblogs.com/xieshengsen/p/6727064.html

Basic usage of re.findall (returns every substring of string that matches pattern, as a list)
Syntax:

findall(pattern, string, flags=0)
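The flags argument in the signature above is optional and defaults to 0. As a minimal sketch of what it changes, the example below runs the same pattern with and without re.IGNORECASE; the sample text and variable names are made up for illustration and are not from the original post.

```python
import re

# Hypothetical sample text with two spellings of the scheme.
text = "HTTPS://docs.python.org and https://pypi.org"

# Default flags=0: matching is case-sensitive.
case_sensitive = re.findall(r"https", text)

# re.IGNORECASE lets the same pattern match both spellings.
case_insensitive = re.findall(r"https", text, flags=re.IGNORECASE)

print(case_sensitive)    # ['https']
print(case_insensitive)  # ['HTTPS', 'https']
```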

import re

Python's re.findall method returns the matching substrings as a list.

# print (help(re.findall))
# print (dir(re.findall))

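findall finds all matches. The r prefix marks the literal that follows as a raw string, so backslashes reach the regex engine unchanged; this is the usual way to write a pattern.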

regular_v1 = re.findall(r"docs", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v1)

# ['docs']

The ^ symbol anchors the match to the start of the string, so "^https" is returned only if the string begins with https.

regular_v2 = re.findall(r"^https", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v2)

# ['https']

  

The $ symbol anchors the match to the end of the string, so "html$" is returned only if the string ends with html.

regular_v3 = re.findall(r"html$", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v3)

# ['html']
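To see what the ^ and $ anchors rule out, here is a small sketch using the same URL with substrings that do occur in it, but not at the required position. The variable names are illustrative only.

```python
import re

url = "https://docs.python.org/3/whatsnew/3.6.html"

# "docs" occurs in the URL, but not at the start, so ^docs finds nothing.
at_start = re.findall(r"^docs", url)

# "python" occurs in the URL, but not at the end, so python$ finds nothing.
at_end = re.findall(r"python$", url)

print(at_start)  # []
print(at_end)    # []
```

When nothing matches, findall returns an empty list and does not raise an error.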

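[...] matches any one of the characters inside the brackets. Note that a comma inside the brackets is itself a literal character, not a separator, so [t,w] also matches ","; [tw] is the tighter form.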

regular_v4 = re.findall(r"[t,w]h", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v4)

# ['th', 'wh']
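The comma in "[t,w]h" is part of the character class, which the URL above happens not to expose. A minimal sketch with a made-up sample string shows the difference between the two spellings:

```python
import re

# Hypothetical sample containing ",h" to expose the literal comma.
sample = "th,h wh"

# The comma is one of the allowed characters, so ",h" matches too.
with_comma = re.findall(r"[t,w]h", sample)

# Without the comma, only "th" and "wh" match.
without_comma = re.findall(r"[tw]h", sample)

print(with_comma)     # ['th', ',h', 'wh']
print(without_comma)  # ['th', 'wh']
```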

"\d" is the regex syntax for a single digit from 0 to 9; the matches are returned as a list.

regular_v5 = re.findall(r"\d", "https://docs.python.org/3/whatsnew/3.6.html")
regular_v6 = re.findall(r"\d\d\d", "https://docs.python.org/3/whatsnew/3.6.html/1234")
print(regular_v5)
# ['3', '3', '6']
print(regular_v6)

# ['123']
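The "\d\d\d" result contains 123 but not 234 because findall scans left to right and reports non-overlapping matches. A short sketch with a made-up digit string makes this visible:

```python
import re

# Hypothetical input: seven digits in a row.
digits = "1234567"

# Each match consumes three digits; the scan resumes after the match,
# so overlapping candidates such as "234" are never reported and the
# trailing "7" is left over.
triples = re.findall(r"\d\d\d", digits)

print(triples)  # ['123', '456']
```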

Lowercase "\d" matches the digits 0-9; uppercase "\D" matches everything except digits, so all non-digit characters are returned.

regular_v7 = re.findall(r"\D", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v7)

# ['h', 't', 't', 'p', 's', ':', '/', '/', 'd', 'o', 'c', 's', '.', 'p', 'y', 't', 'h', 'o', 'n', '.', 'o', 'r', 'g', '/', '/', 'w', 'h', 'a', 't', 's', 'n', 'e', 'w', '/', '.', '.', 'h', 't', 'm', 'l']

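"\w" matches lowercase a to z, uppercase A to Z, the digits 0 to 9, and the underscore. For str patterns in Python 3 it also matches other Unicode word characters unless re.ASCII is set.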

regular_v8 = re.findall(r"\w", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v8)

# ['h', 't', 't', 'p', 's', 'd', 'o', 'c', 's', 'p', 'y', 't', 'h', 'o', 'n', 'o', 'r', 'g', '3', 'w', 'h', 'a', 't', 's', 'n', 'e', 'w', '3', '6', 'h', 't', 'm', 'l']
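The URL above contains only ASCII letters and digits, so it does not show everything "\w" accepts. The sketch below, using a made-up sample string, checks the underscore and a non-ASCII letter, and uses the re.ASCII flag to restrict "\w" to [a-zA-Z0-9_].

```python
import re

# Hypothetical sample: an underscore, an accented letter (U+00E9), digits.
sample = "user_name \u00e9 42"

# Default for str patterns: Unicode word characters and "_" all match.
unicode_words = re.findall(r"\w", sample)

# With re.ASCII, only a-z, A-Z, 0-9 and "_" count as word characters.
ascii_words = re.findall(r"\w", sample, flags=re.ASCII)

print("_" in unicode_words)       # True
print("\u00e9" in unicode_words)  # True
print("\u00e9" in ascii_words)    # False
```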

"\W" matches any character that is not a letter, digit, or underscore, i.e. the special symbols.

regular_v9 = re.findall(r"\W", "https://docs.python.org/3/whatsnew/3.6.html")
print(regular_v9)

# [':', '/', '/', '.', '.', '/', '/', '/', '.', '.']