2023-08-07 13:02:11

Speeding up PyTorch training with autocast half precision

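Prerequisites

PyTorch 1.6 or later

How do you use autocast?

Following the approach given in the official documentation:

How do you use automatic mixed precision (AMP) in PyTorch?

Answer: autocast + GradScaler.

1. autocast

You need the autocast class from the torch.cuda.amp module. Using it is simple: wrap the forward pass and the loss computation in the autocast context, and leave the backward pass outside it.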

import torch.optim as optim
from torch.cuda.amp import autocast

# Create the model; its parameters default to torch.FloatTensor (float32)
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

for input, target in data:
    optimizer.zero_grad()

    # Enable autocast for the forward pass (model + loss)
    with autocast():
        output = model(input)
        loss = loss_fn(output, target)

    # The backward pass runs outside the autocast context
    loss.backward()
    optimizer.step()

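2. GradScaler

GradScaler is the gradient scaling module. It multiplies the loss by a scale factor before the backward pass so that small gradients do not underflow to zero in float16, and it unscales the gradients again before the optimizer step. Instantiate one GradScaler object before training starts.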
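To see why scaling helps, here is a minimal sketch that uses only the Python standard library, not PyTorch. It rounds a value to float16 with the `struct` module's half-precision format. The helper `to_float16` and the gradient value `1e-8` are made up for illustration; the scale `2**16` matches GradScaler's default `init_scale`.

```python
import struct


def to_float16(x: float) -> float:
    """Round a Python float to the nearest float16 value (hypothetical helper)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


grad = 1e-8          # illustrative gradient, below float16's smallest subnormal
scale = 2.0 ** 16    # same value as GradScaler's default init_scale

unscaled = to_float16(grad)        # underflows to zero in float16
scaled = to_float16(grad * scale)  # large enough to be represented in float16
recovered = scaled / scale         # unscaling happens in higher precision

print(unscaled, scaled, recovered)
```

Without scaling the gradient is lost entirely. With scaling it survives the float16 round trip and can be recovered to within float16's relative precision.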

The classic way to use AMP in PyTorch is therefore:

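import torch.optim as optim
from torch.cuda.amp import autocast, GradScaler

# Create the model; its parameters default to torch.FloatTensor (float32)
model = Net().cuda()
optimizer = optim.SGD(model.parameters(), ...)

# Instantiate a GradScaler once, before training starts
scaler = GradScaler()

for epoch in epochs:
    for input, target in data:
        optimizer.zero_grad()

        # Enable autocast for the forward pass (model + loss)
        with autocast():
            output = model(input)
            loss = loss_fn(output, target)

        # Scale the loss, then backpropagate to get scaled gradients
        scaler.scale(loss).backward()
        # Unscale the gradients and step; the step is skipped if they contain inf/NaN
        scaler.step(optimizer)
        # Update the scale factor for the next iteration
        scaler.update()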
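The scale factor is not fixed. A step with inf/NaN gradients is skipped and the scale is reduced, and a long run of finite steps raises it again. The following is a toy model of that policy in plain Python, not PyTorch's implementation. The class `ToyGradScaler` and its method `step_and_update` are hypothetical names; the default values mirror GradScaler's documented defaults (`init_scale=65536.0`, `growth_factor=2.0`, `backoff_factor=0.5`, `growth_interval=2000`).

```python
import math


class ToyGradScaler:
    """Toy model of dynamic loss scaling; not PyTorch's implementation."""

    def __init__(self, init_scale=2.0 ** 16, growth_factor=2.0,
                 backoff_factor=0.5, growth_interval=2000):
        self.scale = init_scale
        self.growth_factor = growth_factor
        self.backoff_factor = backoff_factor
        self.growth_interval = growth_interval
        self._good_steps = 0  # consecutive steps with finite gradients

    def step_and_update(self, scaled_grads):
        """Return unscaled gradients, or None if the step must be skipped."""
        if any(math.isinf(g) or math.isnan(g) for g in scaled_grads):
            # Overflow: skip the optimizer step and shrink the scale.
            self.scale *= self.backoff_factor
            self._good_steps = 0
            return None
        unscaled = [g / self.scale for g in scaled_grads]
        self._good_steps += 1
        if self._good_steps == self.growth_interval:
            # Enough finite steps in a row: try a larger scale.
            self.scale *= self.growth_factor
            self._good_steps = 0
        return unscaled


scaler = ToyGradScaler(growth_interval=2)  # short interval, for the demo only
skipped = scaler.step_and_update([float("inf"), 1.0])  # overflow step
scale_after_overflow = scaler.scale
grads = scaler.step_and_update([scale_after_overflow * 0.5])  # finite step 1
scaler.step_and_update([1.0])                                 # finite step 2
scale_after_growth = scaler.scale
```

In the real API this bookkeeping is split between `scaler.step(optimizer)`, which skips the step on overflow, and `scaler.update()`, which adjusts the scale.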

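3. nn.DataParallel

For single-GPU training the code above is enough. In my own tests on a 2080 Ti it cut GPU memory use by at least one third. As for speed...

For multi-GPU training this alone is not enough: every result inside forward turns out to still be float32. What to do?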

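import torch.nn as nn
from torch.cuda.amp import autocast

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()

    def forward(self, input_data_c1):
        # Enter autocast inside forward so it is active in each replica's thread
        with autocast():
            # code
            ...
        return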

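Just run the code inside forward within an autocast block! In PyTorch 1.6 the autocast state is thread-local, and nn.DataParallel runs each replica's forward pass in its own thread, so an autocast context entered in the training loop does not reach the replicas. Later releases may propagate the state automatically; check the documentation for your version.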
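The reason this placement matters is that the autocast state is kept per thread. The sketch below imitates that with the standard library only. `toy_autocast`, `is_enabled`, and `run_in_thread` are hypothetical stand-ins, not PyTorch APIs, and the worker thread plays the role of a DataParallel replica.

```python
import threading
from contextlib import contextmanager

_state = threading.local()  # stand-in for a per-thread autocast flag


def is_enabled() -> bool:
    return getattr(_state, "enabled", False)


@contextmanager
def toy_autocast():
    previous = is_enabled()
    _state.enabled = True
    try:
        yield
    finally:
        _state.enabled = previous


def run_in_thread(fn):
    """Run fn in a new thread and return its result."""
    result = []
    worker = threading.Thread(target=lambda: result.append(fn()))
    worker.start()
    worker.join()
    return result[0]


def forward_plain():
    # Relies on the caller having enabled the flag.
    return is_enabled()


def forward_wrapped():
    # Enables the flag inside the thread that runs forward.
    with toy_autocast():
        return is_enabled()


with toy_autocast():
    seen_plain = run_in_thread(forward_plain)      # flag was set in another thread
    seen_wrapped = run_in_thread(forward_wrapped)  # flag set inside the worker

print(seen_plain, seen_wrapped)
```

The worker only sees the flag when the context manager is entered inside the function it runs, which is the same reason for putting `with autocast():` inside `forward`.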

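Operations that are autocast automatically

In the following operations, tensors are automatically converted to the half-precision floating-point type torch.HalfTensor (float16):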

__matmul__

addbmm

addmm

addmv

addr

baddbmm

bmm

chain_matmul

conv1d

conv2d

conv3d

conv_transpose1d

conv_transpose2d

conv_transpose3d

linear

matmul

mm

mv

prelu
