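A Roundup of Handy Python Function Tricks

Python has many handy functions, and using them well can shorten code considerably. This post records common tricks for quick reference when they come up again, and it will be updated over time.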

Reading a txt file

import numpy as np

data = np.genfromtxt('./sonar.txt', delimiter=',', usecols=np.arange(0, 60))

Use NumPy's genfromtxt to read a txt file.

delimiter

The separator between fields.

usecols

The columns to read.
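To see how delimiter and usecols interact without needing a file on disk, here is a minimal sketch that reads from an in-memory string. The sample text and the choice of two columns are assumptions made up for illustration.

```python
import io
import numpy as np

# Hypothetical three-column CSV text standing in for a file on disk.
text = "1.0,2.0,3.0\n4.0,5.0,6.0\n"

# delimiter=',' splits each line on commas; usecols=(0, 1) keeps only the first two columns.
data = np.genfromtxt(io.StringIO(text), delimiter=',', usecols=(0, 1))
print(data.shape)
```

The third column is dropped at read time, so the resulting array has two rows and two columns.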

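Generating uniformly distributed random numbers

Generate a (2, 2) matrix of values drawn from the uniform distribution on [0, 1). For normally distributed values, use np.random.normal instead.

u = np.random.uniform(0, 1, (2, 2))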

Generating random numbers without repeats

Generate k distinct random integers in [0, 60):

import random

Index = random.sample(range(0, 60), k)

Returning the most frequent element of a list

cx = max(label_list, key=label_list.count)
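The one-liner above calls label_list.count once per element, which gets slow on long lists. As an alternative sketch, collections.Counter counts everything in a single pass. The sample list here is made up for illustration.

```python
from collections import Counter

label_list = [1, 2, 2, 3, 2, 1]  # hypothetical sample data

# most_common(1) returns a list holding the single (value, count) pair
# with the highest count.
value, count = Counter(label_list).most_common(1)[0]
print(value, count)
```

Both approaches pick the same element here. When several values tie for the highest count, they may disagree on which one is returned.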

Returning the positions of the nonzero elements of an array

nozero_index = np.nonzero(arr)  # arr is a placeholder for the array to inspect
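As a small sketch of what np.nonzero returns, the example below uses a made-up 2-D array. The result is a tuple with one index array per dimension, which can be used directly to pull out the nonzero values.

```python
import numpy as np

arr = np.array([[0, 3, 0],
                [4, 0, 5]])  # hypothetical sample array

# One index array per axis: row indices and column indices of nonzero entries.
rows, cols = np.nonzero(arr)

# Indexing with both arrays returns the nonzero values themselves.
values = arr[rows, cols]
print(rows, cols, values)
```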

For more practical examples of this function, see:

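Drawing a scatter plot

Import the library:

import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ['SimHei']  # font for Chinese text; only needed if labels contain Chinese
plt.figure(1)
plt.scatter(x0[:, 0], x0[:, 1], c='r', marker='o', label='Class 1')  # scatter draws the points
plt.scatter(x1[:, 0], x1[:, 1], c='g', marker='o', label='Class 2')
plt.xlabel('x-axis label')
plt.ylabel('y-axis label')
plt.title('Figure title')
plt.legend(loc=2)  # put the legend in the upper-left corner
plt.savefig('./output_name')  # export and save the figure
plt.show()  # display the figure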

A little-known point about shallow and deep copies

Without nesting, copy() is enough.

With nesting, you must use copy.deepcopy(variable).
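A short sketch of why nesting matters: a shallow copy creates a new outer list but shares the inner lists, while deepcopy duplicates them too. The variable names are illustrative only.

```python
import copy

nested = [[1, 2], [3, 4]]

shallow = copy.copy(nested)    # new outer list, inner lists are shared
deep = copy.deepcopy(nested)   # inner lists are copied as well

nested[0][0] = 99              # modify an inner list of the original

print(shallow[0][0])  # the shallow copy sees the change
print(deep[0][0])     # the deep copy does not
```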

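Computing Euclidean distance

This comes up often. There are two ways to do it: write it by hand, or call NumPy's np.linalg.norm.

I prefer the hand-written version because the result is easier to control.

# Euclidean distance between two 2-D points x and y
def dist_cal(x, y):
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5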
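For comparison, here is a sketch of the NumPy route using np.linalg.norm on the difference vector. The helper name dist_np is made up for this example. Unlike the hand-written version, it works for points of any dimension.

```python
import numpy as np

def dist_cal(x, y):
    # Hand-written 2-D version, as above.
    return ((x[0] - y[0]) ** 2 + (x[1] - y[1]) ** 2) ** 0.5

def dist_np(x, y):
    # The norm of the difference vector is the Euclidean distance.
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.linalg.norm(diff))

print(dist_cal((0, 0), (3, 4)), dist_np((0, 0), (3, 4)))
```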

Shuffling with shuffle

Shuffles a sequence in place, breaking up its fixed order.

np.random.shuffle(rand_ch)

Cumulative sum

Commonly used in roulette-wheel selection to build the running-sum sequence.

q = p.cumsum()

For example, if p is 1, 2, 3, then q is 1, 3, 6.
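To show where cumsum fits into roulette-wheel selection, here is a minimal sketch. The function name roulette_pick is hypothetical, and the random draw r is passed in explicitly so the behavior is reproducible. In practice r would come from something like np.random.rand().

```python
import numpy as np

def roulette_pick(weights, r):
    """Return the index selected by a draw r in [0, 1)."""
    p = np.asarray(weights, dtype=float)
    q = (p / p.sum()).cumsum()  # cumulative probabilities, ending near 1.0
    # Index of the first cumulative value strictly greater than r.
    idx = int(np.searchsorted(q, r, side='right'))
    return min(idx, len(q) - 1)  # guard against floating-point rounding at the top end

# With weights 1, 2, 3 the cumulative probabilities are about 1/6, 1/2, 1.
print(roulette_pick([1, 2, 3], 0.1), roulette_pick([1, 2, 3], 0.4), roulette_pick([1, 2, 3], 0.9))
```

Larger weights own a wider slice of [0, 1), so they are picked more often.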

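Generating random floats and integers

Generate a random float:

np.random.rand()

Generate a random integer:

np.random.randint(low, high)

np.random.rand() takes no range and returns a float in [0, 1). np.random.randint requires a range: it returns an integer in [low, high), or in [0, low) when only one argument is given.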
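The following sketch spells out the ranges of the two calls. The specific bounds are arbitrary values chosen for illustration.

```python
import numpy as np

x = np.random.rand()           # one float in [0, 1)
m = np.random.rand(2, 3)       # 2x3 array of floats in [0, 1)
n = np.random.randint(5)       # one integer in [0, 5)
k = np.random.randint(10, 20)  # one integer in [10, 20), upper bound excluded

print(x, m.shape, n, k)
```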

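Finding the indices of the elements of ind_a equal to 1

index = np.argwhere(np.asarray(ind_a) == 1)  # asarray makes this work when ind_a is a plain list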
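One detail worth knowing is the shape of the result. The sketch below, on a made-up list, shows that np.argwhere returns one row per match, while np.flatnonzero gives a plain 1-D index array that is often more convenient for 1-D data.

```python
import numpy as np

ind_a = [0, 1, 1, 0, 1]  # hypothetical sample data
arr = np.asarray(ind_a)

index = np.argwhere(arr == 1)    # shape (3, 1): one row per matching position
flat = np.flatnonzero(arr == 1)  # shape (3,): the same positions as a 1-D array

print(index.shape, flat)
```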

Unzipping with zip(*...)

Given location = [(x1, y1), (x2, y2)],

separate the x and y values as follows:

x, y = zip(*location)
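A concrete run with made-up coordinates shows that the results are tuples, and that zipping them again restores the original pairs.

```python
location = [(1, 10), (2, 20), (3, 30)]  # hypothetical sample points

# The * operator passes each pair as a separate argument, so zip regroups by position.
x, y = zip(*location)

# Zipping the separated sequences again rebuilds the original pairs.
restored = list(zip(x, y))
print(x, y)
```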

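Using one sequence as indices into another

Very handy and quite neat.

ls = [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]  # list of values
index = [2, 3, 6]  # list of indices
[ls[i] for i in index]  # picks the elements at positions 2, 3 and 6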

Reversing part of an array

Reverse with [::-1].

row = [24, 20, 10, 22, 21, 4, 27, 6, 25, 1, 0, 28, 2, 17, 14, 7, 12, 16, 8, 23, 9, 3, 13, 11,
       19, 18, 26, 5, 15]
a = np.array([row, row])  # two identical rows
a[0, 1:4] = a[0, 1:4][::-1]

Result: 20, 10, 22 in a[0] becomes 22, 10, 20.

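Dividing every number in a list by a value

Dividing the list directly raises an error. A neat workaround, here dividing every number by 10:

my_list = [x / 10 for x in my_list]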
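The sketch below reproduces the error and then applies the list comprehension. The sample values are made up.

```python
my_list = [10, 20, 35]  # hypothetical sample data

# A plain list does not support division by a number.
try:
    my_list / 10
    raised = False
except TypeError:
    raised = True

# The comprehension divides element by element and builds a new list.
scaled = [x / 10 for x in my_list]
print(raised, scaled)
```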

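Sorting several lists together

I ran into this problem: two lists whose elements correspond one to one. When one list is sorted, the other must be reordered the same way so the pairing is preserved.

Here is the actual scenario:

One list stores article titles and the other stores publication dates; both are sorted together by date:

title_list = ['Article 1 title', 'Article 2']
time_List = ['2021-2-12', '2020-3-18']

title_time = zip(title_list, time_List)
sorted_title_time = sorted(title_time, key=lambda x: x[1])  # sort the pairs by the date string
result = zip(*sorted_title_time)
title_list, time_List = [list(x) for x in result]
print(title_list)
print(time_List)

Main idea: pack the two lists with zip, sort, then unpack with zip(*...).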
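One caveat: date strings without zero padding, such as '2021-10-1' and '2021-2-12', do not sort chronologically as plain text. A possible refinement, sketched here under the assumption that all dates follow the year-month-day pattern, is to parse them with datetime in the sort key. The sample data and the helper name parse_day are made up.

```python
from datetime import datetime

title_list = ['post A', 'post B', 'post C']          # hypothetical titles
time_list = ['2021-10-1', '2021-2-12', '2020-3-18']  # hypothetical dates

def parse_day(s):
    # Assumes every date string is year-month-day.
    return datetime.strptime(s, '%Y-%m-%d')

# Sort the (title, date) pairs by the parsed date, then unpack into two lists.
pairs = sorted(zip(title_list, time_list), key=lambda pair: parse_day(pair[1]))
sorted_titles, sorted_times = (list(t) for t in zip(*pairs))
print(sorted_titles)
print(sorted_times)
```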

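Skipping exceptions and continuing

I ran into this need while practicing web scraping. To deter crawlers, some sites insert abnormal values into otherwise regular page data, so an ordinary crawler raises an error when it hits one and all the earlier work is lost.

To prevent that, catch the exception and skip past it:

for item in List:
    try:
        ...  # the work to do for each item
    except Exception:
        continue  # skip this item and move on to the next one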
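Silently swallowing every exception can hide real bugs, so one possible variation is to catch a narrower exception type and record what was skipped. The sketch below uses int() parsing as a stand-in for the scraping step. The function name parse_all and the sample data are assumptions.

```python
def parse_all(items):
    """Convert each item to int, collecting failures instead of stopping."""
    results, failures = [], []
    for item in items:
        try:
            results.append(int(item))
        except ValueError as exc:
            # Record the bad item and its error, then carry on with the next one.
            failures.append((item, str(exc)))
            continue
    return results, failures

results, failures = parse_all(['1', '2', 'oops', '4'])
print(results, len(failures))
```

Keeping the failures list makes it possible to inspect or retry the skipped items afterwards.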

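Extracting substrings (using links as the example)

Substring extraction is a routine task. The scenario here: extract every web link from a string.

The function below can be called directly.

# Extract the article links from the text of <a> tags
def split_link(string):
    start_string = 'http'
    end_string = '.html'
    links = []
    start = string.find(start_string)
    # as long as start is not -1, an 'http' was found
    while start != -1:
        # find where the link ends
        end = string.find(end_string, start)
        if end == -1:
            break  # no closing '.html', so stop
        # slice end = start of the end string + its length
        links.append(string[start:end + len(end_string)])
        # look for the next start; -1 ends the loop
        start = string.find(start_string, end)
    return links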
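For comparison, the same extraction can be sketched with a regular expression. This assumes, like the function above, that every link starts with 'http' and ends with '.html'. The function name find_links and the sample HTML are made up.

```python
import re

def find_links(text):
    # Non-greedy match from 'http' up to the nearest '.html'.
    return re.findall(r'http.*?\.html', text)

html = ('<a href="http://example.com/a.html">A</a> '
        '<a href="https://example.com/b.html">B</a>')
links = find_links(html)
print(links)
```

For real pages, an HTML parser is usually more robust than string matching. This sketch only mirrors the simple pattern used above.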

Getting today's date (year, month, day)

import time
print(time.strftime("%Y-%m-%d"))

Converting the dtype of an array

To convert the element type of a NumPy array, use

astype

For example, to convert to floating point:

X.astype(float)
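Two behaviors are easy to miss: astype returns a new array and leaves the original untouched, and converting floats to int truncates toward zero rather than rounding. A small sketch with made-up values:

```python
import numpy as np

X = np.array([1.7, -2.3, 3.0])  # hypothetical float data

as_int = X.astype(int)           # truncates toward zero
as_float = as_int.astype(float)  # back to float, but the fractions are gone

print(as_int, as_float, X)
```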

Chinese text in Matplotlib

To display Chinese in legends and labels, add these settings globally:

import matplotlib.pyplot as plt

plt.rcParams['font.sans-serif'] = ["SimHei"]
plt.rcParams["axes.unicode_minus"] = False

Showing two Matplotlib subplots side by side

Use

subplot

to control the subplot position and

figsize

to adjust the overall figure size.

plt.figure(figsize=(20, 15))
plt.subplot(2, 2, 1)
for i in range(len(label_pred)):
    plt.scatter(smile['smile'][i][0], smile['smile'][i][1], color=colors[label_pred[i]])
plt.title("Original data")  # set the title once, outside the loop
plt.subplot(2, 2, 2)
for i in range(len(y)):
    plt.scatter(smile['smile'][i][0], smile['smile'][i][1], color=colors[y[i]])
plt.title("Clustered data")

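Matplotlib subplots side by side / saving the combined figure

This is written slightly differently from the version above.

# Draw, save and display
fig, ax = plt.subplots(1, 3, figsize=(20, 20))
ax[0].imshow(img)
ax[0].set_title("Subplot title 1")
ax[1].imshow(out_img)
ax[1].set_title("Subplot title 2")
ax[2].imshow(out_img2)
ax[2].set_title("Subplot title 3")
fig.savefig(r"combined_figure.png")  # save first so the file is written before the window is closed
plt.show()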

Timing a program

import time

begin_time = time.time()
# the code to time goes here
end_time = time.time()
print("Elapsed time: {} seconds".format(end_time - begin_time))
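For short measurements, time.perf_counter is generally a better clock than time.time because it is monotonic and has higher resolution. Here is a sketch of a small reusable helper. The name timed is made up for this example.

```python
import time

def timed(func, *args, **kwargs):
    """Call func and return (result, elapsed_seconds)."""
    begin = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - begin
    return result, elapsed

# Time a simple built-in call as a stand-in for the real workload.
total, seconds = timed(sum, range(100000))
print("Elapsed time: {:.6f} seconds".format(seconds))
```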

Drawing and saving a simple line chart

import matplotlib.pyplot as plt

# Draw a line chart, save it, then display it
def plot_pic(x, y):
    plt.plot(x, y, linewidth=1, color="orange", marker="o")
    plt.xlabel("num_bits")
    plt.ylabel("ACC (%)")
    plt.savefig("./result.png")
    plt.show()

Writing results to a txt file

with open(r'./result.txt', mode='a', encoding='utf-8') as f:
    f.write(str(reward) + "\n")

Getting the index of the maximum in each row of a matrix

# Get the index of the largest value in each row
y_pred = []
for row in y_test:
    y = np.argmax(row)
    y_pred.append(y)
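When y_test is a 2-D NumPy array, the loop can be replaced by a single call with axis=1. The sample scores below are made up for illustration.

```python
import numpy as np

# Hypothetical class scores: one row per sample, one column per class.
y_test = np.array([[0.1, 0.7, 0.2],
                   [0.8, 0.1, 0.1],
                   [0.2, 0.3, 0.5]])

# axis=1 searches along each row and returns the column index of its maximum.
y_pred = np.argmax(y_test, axis=1)
print(y_pred)
```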

Batch-editing a txt file

Create a new txt file, read the original file line by line, and add information to every line.

# Open the source file read-only and the new file for writing; both are closed automatically
with open(r'D:\Desktop\ailab\task6\pa\pa_result.txt', 'r') as f, \
        open(r'D:\Desktop\ailab\task6\pa\submission.txt', 'w') as ff:
    for line in f:
        line_new = line.replace('\n', '')  # strip the newline
        line_new = 'cat_12_test/' + line_new + '\n'
        print(line_new)
        ff.write(line_new)  # write to the new file
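The same idea can be wrapped in a function and tried out safely in a temporary directory. In this sketch the function name add_prefix and the file names are assumptions. Only the 'cat_12_test/' prefix comes from the snippet above.

```python
import os
import tempfile

def add_prefix(src_path, dst_path, prefix):
    """Copy src to dst, putting prefix in front of every line."""
    with open(src_path, 'r', encoding='utf-8') as src, \
            open(dst_path, 'w', encoding='utf-8') as dst:
        for line in src:
            dst.write(prefix + line.rstrip('\n') + '\n')

# Try it on throwaway files so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    src_path = os.path.join(tmp, 'in.txt')
    dst_path = os.path.join(tmp, 'out.txt')
    with open(src_path, 'w', encoding='utf-8') as f:
        f.write('a.jpg\nb.jpg\n')
    add_prefix(src_path, dst_path, 'cat_12_test/')
    with open(dst_path, encoding='utf-8') as f:
        output = f.read()

print(output)
```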

Swapping channels (reordering axes)

print(img.shape)  # (3, 320, 640)
print(img.transpose((1, 2, 0)).shape)  # (320, 640, 3)
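To make the axis order explicit, here is a sketch on a small made-up array that converts channel-first (C, H, W) to channel-last (H, W, C) and back. Note that transpose returns a view of the same data, not a copy.

```python
import numpy as np

img = np.zeros((3, 4, 5))  # hypothetical small image in (C, H, W) order
img[2, 1, 3] = 7           # mark one pixel in channel 2

hwc = img.transpose((1, 2, 0))  # (C, H, W) -> (H, W, C)
chw = hwc.transpose((2, 0, 1))  # (H, W, C) -> (C, H, W)

print(hwc.shape, chw.shape)
print(hwc[1, 3, 2])  # the marked pixel, now addressed as (row, column, channel)
```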