Introduction

torch and numpy

Similarity: tensors are similar to NumPy ndarrays.

Difference: tensors can also run computations on a GPU.

Changing dimensions:

Tensor.view

reshape

Attributes:

x.size()

x.shape

y.shape
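
As a minimal sketch of the notes above (the tensor `x` and the shapes are made up for illustration), this shows `view` and `reshape` changing dimensions, and `size()` and `shape` reporting the same thing:

```python
import torch

x = torch.arange(12)      # 1-D tensor holding 0..11
a = x.view(3, 4)          # reinterpret as 3x4; shares storage with x
b = x.reshape(2, -1)      # the -1 dimension is inferred as 6

# x.size() and x.shape return the same torch.Size
same = x.size() == x.shape

# view does not copy: a write through `a` is visible in `x`
a[0, 0] = 100
print(a.shape, b.shape, same, x[0].item())
```

`view` requires a compatible memory layout, while `reshape` falls back to a copy when a view is not possible.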

PyTorch and Lua Torch are built on the same underlying Torch libraries.

Official 60-Minute Blitz

1. Tensors

Initialization

import torch
x = torch.empty(5, 3)     # uninitialized values; torch.Size([5, 3])
x = torch.rand(5, 3)
x = torch.zeros(5, 3, dtype=torch.long)
x = torch.tensor([1, 2])  # torch.Size([2])
y = torch.ones_like(x)
x.copy_(y)                # any operation ending in _ modifies the tensor in place
x = torch.randn(4, 4)
y = x.view(-1, 8)         # torch.Size([2, 8]); the -1 dimension is inferred from the others
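
The trailing-underscore convention can be checked with a small sketch (the values are arbitrary): `add` returns a new tensor, while `add_` overwrites its receiver.

```python
import torch

x = torch.ones(2, 2)
y = torch.full((2, 2), 3.0)

z = x.add(y)          # out-of-place: returns a new tensor, x is unchanged
x_before = x.clone()  # snapshot of x before the in-place call
x.add_(y)             # in-place: x itself now holds x + y
print(x_before, x, z)
```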

Conversion

# torch tensor to NumPy array
a = torch.ones(5)
b = a.numpy()
# NumPy array to torch tensor
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a)

A torch tensor on the CPU and its NumPy array share the same underlying memory, so modifying one changes the other.
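
A short sketch of the memory sharing, assuming CPU tensors. The copy made with `torch.tensor` is added here only for contrast and is not part of the original notes.

```python
import numpy as np
import torch

a = torch.ones(5)
b = a.numpy()            # b is a NumPy view of a's memory
a.add_(1)                # in-place update of the tensor also changes b

c = np.ones(5)
d = torch.from_numpy(c)  # d shares memory with c
np.add(c, 1, out=c)      # in-place update of the array also changes d

e = torch.tensor(c)      # torch.tensor copies the data, so e is independent
c[0] = 99.0              # visible in d, not in e
print(b, d, e)
```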

CUDA tensors

x = torch.randn(5,3)
if torch.cuda.is_available():
    device = torch.device("cuda")           # a CUDA device object
    y = torch.ones_like(x,device=device)    # create a tensor directly on the GPU
    x = x.to(device)                        # or just use .to("cuda") to move the tensor to the GPU
    z = x + y
    print(z)
    print(z.to("cpu",torch.double))         # .to can also change the dtype at the same time
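
If the code should also run on machines without a GPU, one common pattern is to pick the device once and create tensors on it. This is a sketch of that pattern, not something the notes above prescribe:

```python
import torch

# Fall back to the CPU when CUDA is not available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.randn(5, 3, device=device)  # created directly on the chosen device
y = torch.ones_like(x)                # ones_like inherits x's device and dtype
z = (x + y).to("cpu", torch.double)   # move to the CPU and convert the dtype in one call
print(z.device, z.dtype)
```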

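2. Autograd

x = torch.ones(2,2,requires_grad=True) # track the computation history of this tensor
y = x + 2                              # y is the result of an operation, so it has a grad_fn
print(y.grad_fn)
z = y * y * 3
out = z.mean()
print(z, out)
out.backward()                         # compute all gradients; they accumulate into the .grad attribute of each leaf tensor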
print(x.grad)
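
The printed gradient can be checked by hand. With out = (1/4) * sum(3 * (x_i + 2)^2), the derivative is d(out)/dx_i = 1.5 * (x_i + 2), which is 4.5 when x_i = 1. The sketch below recomputes the same expression and compares it with autograd:

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
out = (3 * (x + 2) ** 2).mean()    # same value as z.mean() with z = y * y * 3
out.backward()

# Hand-derived gradient: 1.5 * (x_i + 2)
expected = 1.5 * (x.detach() + 2)
print(out.item(), x.grad, expected)
```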

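To stop a tensor from tracking history, call .detach() to detach it from the computation history. To prevent history tracking (and the memory it uses) for a whole block of code, wrap the code in a

with torch.no_grad():

block.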
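
Both mechanisms can be seen in a few lines (variable names are illustrative):

```python
import torch

x = torch.ones(2, 2, requires_grad=True)
y = x * 2                  # tracked: y has a grad_fn

with torch.no_grad():
    w = x * 2              # computed without recording history

d = y.detach()             # same data as y, but cut off from the graph
print(y.requires_grad, w.requires_grad, d.requires_grad)
```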

3. Neural networks

import torch
import torch.nn as nn
import torch.nn.functional as F

class Net(nn.Module):

    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 5x5 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        self.conv2 = nn.Conv2d(6, 16, 5)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)

Net(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (fc1): Linear(in_features=400, out_features=120, bias=True)
  (fc2): Linear(in_features=120, out_features=84, bias=True)
  (fc3): Linear(in_features=84, out_features=10, bias=True)
)

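A model must define the forward function, which may use any Tensor operation. The backward function (which computes the gradients) is created automatically by autograd. net.parameters() returns the learnable parameters (weights) of the model.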

params = list(net.parameters())
print(len(params))       # 10
print(params[0].size())  # conv1's .weight torch.Size([6, 1, 5, 5])

Test with a random 32×32 input. Note: this network (LeNet) expects an input size of 32×32. To train it on the MNIST dataset, resize the images to 32×32.

input = torch.randn(1, 1, 32, 32)
out = net(input)
print(out) #tensor([[ 0.1120,  0.0713,  0.1014, -0.0696, -0.1210,  0.0084, -0.0206,  0.1366,-0.0455, -0.0036]], grad_fn=<AddmmBackward>)
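
The 32×32 requirement follows from fc1 expecting 16 * 5 * 5 input features. A small sketch of the size arithmetic, assuming stride-1 unpadded 5x5 convolutions and 2x2 pooling as in the Net above (the helper names are hypothetical):

```python
def conv_out(size, kernel, stride=1, padding=0):
    # Output side length of a convolution along one spatial dimension
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, window):
    # Output side length of max pooling whose stride equals the window
    return size // window

def lenet_feature_side(side):
    # conv1 + pool, then conv2 + pool, as in Net.forward
    side = pool_out(conv_out(side, 5), 2)
    side = pool_out(conv_out(side, 5), 2)
    return side

print(lenet_feature_side(32))  # 32 -> 28 -> 14 -> 10 -> 5
print(lenet_feature_side(28))  # 28 -> 24 -> 12 -> 8 -> 4
```

A 28×28 MNIST image would therefore produce 16 * 4 * 4 = 256 features, which does not match fc1's 400 inputs.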

Zero the gradient buffers of all parameters, then backpropagate with random gradients:

net.zero_grad()
out.backward(torch.randn(1, 10))

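Concept review:

torch.nn only supports mini-batches. The entire torch.nn package expects inputs that are a mini-batch of samples, not a single sample. For example, nn.Conv2d takes a 4-D tensor of nSamples x nChannels x Height x Width. If you have a single sample, just use input.unsqueeze(0) to add a fake batch dimension.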
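
A minimal sketch of adding the batch dimension (the layer sizes mirror conv1 above; the sample is random). Some recent PyTorch releases also accept unbatched input for certain layers, but the explicit batch dimension is the form the notes describe.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(1, 6, 5)
single = torch.randn(1, 32, 32)   # one sample: nChannels x Height x Width
batch = single.unsqueeze(0)       # 1 x 1 x 32 x 32: a mini-batch of one
out = conv(batch)
print(batch.shape, out.shape)
```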

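torch.Tensor: a multi-dimensional array that supports autograd operations such as backward(), and also holds the gradient with respect to the tensor.

nn.Module: a neural network module. A convenient way to encapsulate parameters, with helpers for moving them to the GPU, exporting, loading, etc.

nn.Parameter: a kind of tensor that is automatically registered as a parameter when it is assigned as an attribute of a Module.

autograd.Function: implements the forward and backward definitions of an autograd operation. Every Tensor operation creates at least one Function node, which connects to the functions that created the Tensor and encodes its history.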
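
The automatic registration of nn.Parameter can be observed with a toy module (the `Scale` class is hypothetical and exists only for this illustration):

```python
import torch
import torch.nn as nn

class Scale(nn.Module):
    def __init__(self):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(3))  # registered automatically
        self.offset = torch.zeros(3)               # plain tensor: not registered

    def forward(self, x):
        return x * self.weight + self.offset

m = Scale()
names = [name for name, _ in m.named_parameters()]
print(names)
```

Only `weight` shows up in `named_parameters()`, so only `weight` would be updated by an optimizer built from `m.parameters()`.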

Loss function

A loss function takes an (output, target) pair as input and computes a value that estimates how far the network's output is from the target.

output = net(input)
target = torch.randn(10)         # a dummy target, for example
target = target.view(1, -1)      # give target the same shape as output
criterion = nn.MSELoss()
loss = criterion(output, target) # e.g. tensor(0.8109, grad_fn=<MseLossBackward>)
print(loss)

Now, if you follow loss in the backward direction using its .grad_fn attribute, you will see a graph of computations that looks like this:
input -> conv2d -> relu -> maxpool2d -> conv2d -> relu -> maxpool2d
      -> view -> linear -> relu -> linear -> relu -> linear
      -> MSELoss
      -> loss

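When we call loss.backward(), the whole graph is differentiated with respect to the loss, and every tensor in the graph that has requires_grad=True gets its .grad tensor accumulated with the gradient.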

print(loss.grad_fn)  # MSELoss
print(loss.grad_fn.next_functions[0][0])  # Linear
print(loss.grad_fn.next_functions[0][0].next_functions[0][0])  # ReLU

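Backpropagation

Call loss.backward() to backpropagate the error. Before calling it, clear the existing gradients, otherwise the new gradients are accumulated into the existing ones. We now call loss.backward() and look at the gradient of conv1's bias before and after the backward pass.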

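net.zero_grad()     # clear the gradient buffers of all parameters

print('conv1.bias.grad before backward') # conv1.bias.grad before backward
print(net.conv1.bias.grad)               # tensor([0., 0., 0., 0., 0., 0.]) (None on recent PyTorch, where zero_grad() sets grads to None)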

loss.backward()

print('conv1.bias.grad after backward')  # conv1.bias.grad after backward
print(net.conv1.bias.grad)               # tensor([ 0.0051,  0.0042,  0.0026,  0.0152, -0.0040, -0.0036])
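
The accumulation behaviour is easy to see on a single tensor. In this sketch (a toy `w`, not the network above) every backward pass contributes a gradient of 3 per element:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

(w * 3).sum().backward()
first = w.grad.clone()        # tensor([3., 3.])

(w * 3).sum().backward()      # no zeroing: the new gradient is added to the old one
accumulated = w.grad.clone()  # tensor([6., 6.])

w.grad.zero_()                # reset, serving the same purpose as net.zero_grad()
(w * 3).sum().backward()
after_reset = w.grad.clone()  # tensor([3., 3.])
print(first, accumulated, after_reset)
```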

Updating the weights

The simplest update rule used in practice is stochastic gradient descent (SGD):

weight = weight - learning_rate * gradient

We can implement this rule with a few lines of Python:

learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
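
The same rule can also be written without touching .data, by doing the update inside torch.no_grad(). This is an alternative sketch on a toy nn.Linear layer, not the form used in the notes above:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
layer = nn.Linear(3, 1)
loss = layer(torch.randn(4, 3)).pow(2).mean()
loss.backward()

learning_rate = 0.01
before = [p.detach().clone() for p in layer.parameters()]

with torch.no_grad():                    # the update itself must not be tracked
    for p in layer.parameters():
        p -= learning_rate * p.grad      # weight = weight - learning_rate * gradient
```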

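However, when working with neural networks you will often want other update rules such as SGD, Nesterov-SGD, Adam, RMSProp, etc. PyTorch provides the torch.optim package, which implements these rules. Using it is very simple: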

import torch.optim as optim

# create your optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)

# in your training loop:
optimizer.zero_grad()   # zero the gradient buffers
output = net(input)
loss = criterion(output, target)
loss.backward()
optimizer.step()    # Does the update
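
Put into an actual loop, the optimizer steps above look roughly like this. The sketch uses a toy linear model and random data (both assumptions made for brevity) so that it runs in well under a second:

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(3, 1)            # toy stand-in for net
inputs = torch.randn(8, 3)         # random data, for illustration only
targets = torch.randn(8, 1)
criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

losses = []
for step in range(100):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(inputs), targets)
    loss.backward()                # compute fresh gradients
    optimizer.step()               # apply the SGD update
    losses.append(loss.item())

print(losses[0], losses[-1])
```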

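4. Training a classifier

1. Load and normalize the CIFAR10 training and test sets using torchvision

2. Define a convolutional neural network

3. Define a loss function

4. Train the network on the training set

5. Test the network on the test set

See first_classfication_net_cpu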
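
The five steps can be sketched end to end as follows. To stay self-contained, this sketch replaces the CIFAR10 download with random tensors of the same shape (3x32x32 images, 10 classes), and `TinyNet` is a hypothetical network, so the accuracy it reports is meaningless. It is not the contents of first_classfication_net_cpu.

```python
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Step 1 (stand-in): random "images" and labels instead of torchvision's CIFAR10
images = torch.randn(64, 3, 32, 32)
labels = torch.randint(0, 10, (64,))
train_loader = DataLoader(TensorDataset(images[:48], labels[:48]),
                          batch_size=16, shuffle=True)
test_loader = DataLoader(TensorDataset(images[48:], labels[48:]), batch_size=16)

# Step 2: a small convolutional network
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 6, 5)        # 32 -> 28
        self.pool = nn.MaxPool2d(2)           # 28 -> 14
        self.fc = nn.Linear(6 * 14 * 14, 10)

    def forward(self, x):
        x = self.pool(torch.relu(self.conv(x)))
        return self.fc(torch.flatten(x, 1))

net = TinyNet()

# Step 3: loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

# Step 4: train on the training set
for epoch in range(2):
    for batch_images, batch_labels in train_loader:
        optimizer.zero_grad()
        loss = criterion(net(batch_images), batch_labels)
        loss.backward()
        optimizer.step()

# Step 5: evaluate on the test set
correct = total = 0
with torch.no_grad():
    for batch_images, batch_labels in test_loader:
        predicted = net(batch_images).argmax(dim=1)
        correct += (predicted == batch_labels).sum().item()
        total += batch_labels.size(0)
accuracy = correct / total
print(f"accuracy on {total} held-out samples: {accuracy:.2f}")
```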

5. Data parallelism

import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader

# Parameters and DataLoaders
input_size = 5
output_size = 2
batch_size = 30
data_size = 100
# Device
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Create a dummy (random) dataset
class RandomDataset(Dataset):
    def __init__(self, size, length):
        self.len = length
        self.data = torch.randn(length, size)
    def __getitem__(self, index):
        return self.data[index]
    def __len__(self):
        return self.len

rand_loader = DataLoader(dataset=RandomDataset(input_size, data_size),
                         batch_size=batch_size, shuffle=True)
class Model(nn.Module):
    # Our model
    def __init__(self, input_size, output_size):
        super(Model, self).__init__()
        self.fc = nn.Linear(input_size, output_size)
    def forward(self, input):
        output = self.fc(input)
        print("\tIn Model: input size", input.size(),
              "output size", output.size())
        return output

model = Model(input_size, output_size)
if torch.cuda.device_count() > 1:
    print("Let's use", torch.cuda.device_count(), "GPUs!")
    # dim = 0 [30, xxx] -> [10, ...], [10, ...], [10, ...] on 3 GPUs
    model = nn.DataParallel(model)
model.to(device)
for data in rand_loader:
    input = data.to(device)
    output = model(input)
    print("Outside: input size", input.size(),
          "output_size", output.size())
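
The sizes printed by this script can be anticipated on a CPU-only machine. The sketch below uses torch.chunk to illustrate how a batch is split along dim 0, as the comment in the code describes. The GPU count of 3 is a hypothetical value, and torch.chunk is used here as an illustration of the split, not as a claim about DataParallel's internals.

```python
import torch

data_size, batch_size, num_gpus = 100, 30, 3   # num_gpus is hypothetical

# Batch sizes produced by the DataLoader: three full batches and one partial batch
batch_sizes = [min(batch_size, data_size - start)
               for start in range(0, data_size, batch_size)]

# Splitting a full batch of 30 along dim 0 into 3 pieces
full_chunks = [c.size(0) for c in torch.chunk(torch.randn(30, 5), num_gpus, dim=0)]

# Splitting the final batch of 10: the pieces are no longer equal
last_chunks = [c.size(0) for c in torch.chunk(torch.randn(10, 5), num_gpus, dim=0)]

print(batch_sizes, full_chunks, last_chunks)
```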
