PyTorch Source Code Walkthrough: torchvision.transforms

2023-07-29 00:09:57

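Anyone who uses PyTorch will be familiar with torchvision.transforms. I use PyTorch too, but I never really mastered transforms. Image preprocessing matters a great deal for network performance, so I decided to read through the PyTorch source carefully.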

Structure of Transforms

Transforms are common image transforms. They can be chained together using Compose.

Transforms consists of five parts:

Transforms on PIL Image
Transforms on torch.Tensor
Conversion Transforms
Generic Transforms
Functional Transforms

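Of these, the first two parts, Transforms on PIL Image and Transforms on torch.*Tensor, are used most often. Both are covered in detail below, after a quick look at Compose.

Compose

Compose is a class that chains all the transforms together.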

class torchvision.transforms.Compose(transforms)

Usage, which should look familiar:

transforms.Compose([
    transforms.CenterCrop(10),
    transforms.ToTensor(),
])
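
To make the chaining concrete, here is a minimal pure-Python sketch of the idea behind Compose: store a list of callables and apply them in order, passing each result to the next. MiniCompose, add_one and double are hypothetical names used only for illustration; this is not torchvision's actual source.

```python
class MiniCompose:
    """Illustrative stand-in for Compose (hypothetical, not torchvision code)."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img):
        # Apply each transform in order, feeding the output to the next one.
        for t in self.transforms:
            img = t(img)
        return img


def add_one(x):
    return x + 1


def double(x):
    return x * 2


pipeline = MiniCompose([add_one, double])
result = pipeline(3)  # computes (3 + 1) * 2
print(result)
```

Order matters: swapping the two steps computes 3 * 2 + 1 instead. In the same way, a real pipeline has to place transforms that expect a PIL image before ToTensor and those that expect a tensor after it.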

Transforms on PIL Image

Function	Purpose	Usage
CenterCrop(size)	Crops the given PIL image at the center to the given size.	CenterCrop(160)
Grayscale(num_output_channels)	Converts the image to grayscale.	Grayscale(1) or Grayscale(3)
RandomCrop(size)	Crops the given PIL image at a random location to the given size. Often used during training as a form of data augmentation.	RandomCrop(224)
RandomHorizontalFlip(p)	Horizontally flips the given image randomly with the given probability p. Also often used during training.	RandomHorizontalFlip(0.5)
Resize(size)	Resizes the input PIL image to the given size. Typically used for validation and test sets.	Resize(224)
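
The geometry behind the crop and flip transforms is simple enough to sketch in plain Python. The helpers below (center_crop_box, random_crop_box and maybe_flip_row are hypothetical names, not torchvision functions) only compute crop boxes and mirror one row of pixels. They assume a square crop that fits inside the image, and they return boxes in the (left, upper, right, lower) order that PIL's Image.crop accepts.

```python
import random


def center_crop_box(width, height, size):
    """Return (left, top, right, bottom) of a size x size crop centered in the image."""
    left = int(round((width - size) / 2.0))
    top = int(round((height - size) / 2.0))
    return (left, top, left + size, top + size)


def random_crop_box(width, height, size, rng=random):
    """Return a size x size crop box at a uniformly random valid location."""
    left = rng.randint(0, width - size)  # randint is inclusive on both ends
    top = rng.randint(0, height - size)
    return (left, top, left + size, top + size)


def maybe_flip_row(row, p=0.5, rng=random):
    """Mirror one row of pixels horizontally with probability p."""
    return row[::-1] if rng.random() < p else row


center_box = center_crop_box(200, 100, 50)
random_box = random_crop_box(200, 100, 50, random.Random(0))
print(center_box, random_box)
```

The center crop is deterministic, which is why it suits evaluation. The random crop and flip give a different view of the same image on each call, which is what makes them useful as training-time augmentation.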

Transforms on torch.Tensor

Only one function operates on Tensor data:

Normalize(mean,std)

Normalize is used in almost every pipeline. Note that it only works on Tensor data, not on PIL images, so apply transforms.ToTensor() before transforms.Normalize().

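Usage:

transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5])

For each channel, this subtracts the mean from every pixel and divides by the standard deviation (not the variance). It is typically used to normalize images. The mean and std values need care: ToTensor has already scaled the pixels to [0, 1], so both must be given on that scale. Values such as [127.5, 127.5, 127.5] and [128, 128, 128] only suit data that is still in [0, 255].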
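
The per-channel arithmetic can be checked with a few lines of plain Python. This is a sketch of the formula (value - mean) / std only, not torchvision's implementation; the normalize helper and the sample values are made up for illustration.

```python
def normalize(channels, mean, std):
    """Per channel: (value - mean) / std.

    `channels` is a list holding one list of pixel values per channel.
    """
    return [
        [(v - m) / s for v in values]
        for values, m, s in zip(channels, mean, std)
    ]


# Data already scaled to [0, 1]: mean and std are given on that scale.
unit_scale = normalize([[0.0, 0.5, 1.0]], mean=[0.5], std=[0.5])

# Data still in [0, 255]: values like 127.5 and 128 fit this scale instead.
byte_scale = normalize([[0.0, 255.0]], mean=[127.5], std=[128.0])

print(unit_scale)
print(byte_scale)
```

Both choices map the data to roughly [-1, 1], but only when the mean and std match the scale of the input. Mixing the two scales shifts every value far outside that range.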

Conversion Transforms

This part contains two transforms:

class torchvision.transforms.ToPILImage(mode=None)

Converts a Tensor to a PIL Image. Rarely used.

class torchvision.transforms.ToTensor

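Converts a PIL Image to a Tensor. Used very often; it comes up whenever you train on a GPU, because the model consumes tensors rather than PIL images.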

Converts a PIL Image or numpy.ndarray (H x W x C) in the range [0, 255] to a torch.FloatTensor of shape (C x H x W) in the range [0.0, 1.0].

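Two things need attention here: the value range and the channel order. The value range, which changes from [0, 255] to [0, 1], affects the Normalize step that follows. The channel order changes from height x width x channels to channels x height x width, for example from 224x224x3 to 3x224x224.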
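
Both effects, the rescaling and the reordering of axes, can be illustrated on a tiny image stored as nested Python lists. The function to_chw_unit_range is a hypothetical helper written for this sketch; it mimics the documented behavior for 8-bit input and is not how ToTensor is implemented.

```python
def to_chw_unit_range(img_hwc):
    """Convert a nested list img[h][w][c] with values in [0, 255]
    to out[c][h][w] with values in [0.0, 1.0]."""
    height = len(img_hwc)
    width = len(img_hwc[0])
    channels = len(img_hwc[0][0])
    return [
        [[img_hwc[h][w][c] / 255.0 for w in range(width)] for h in range(height)]
        for c in range(channels)
    ]


# A tiny RGB image with H=2, W=1, C=3.
image = [[[255, 0, 0]], [[0, 255, 51]]]
chw = to_chw_unit_range(image)

# Sizes of the three axes after conversion: channels, height, width.
print(len(chw), len(chw[0]), len(chw[0][0]))
```

After the conversion the outermost index selects the channel, so chw[0] holds the whole red plane of the image.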
