
Linear Regression with Stochastic Gradient Descent (SGD)

For the principle of gradient descent, see: Gradient Descent

Plain batch gradient descent (BGD) is simple and brute-force, but it adjusts the weights slowly, because every update has to wait for a full pass over the data.

If, instead of waiting until all the data has been processed before adjusting w, we adjust w as soon as a portion of the data (one batch_size worth) has been computed, i.e. we update the weights during the training pass,

then this becomes stochastic gradient descent (strictly speaking, mini-batch gradient descent).

Its main advantages are:

* faster convergence,

* the noisy per-batch updates can help reduce overfitting.
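
The core change relative to BGD can be sketched in a few lines (an illustrative sketch only; sgd_epoch and grad_fn are hypothetical names, not functions from the code below): split the data into batches and apply the weight update after every batch rather than once per full pass.

# Sketch of the mini-batch update idea (illustrative; grad_fn is a hypothetical gradient function)
def sgd_epoch(x, y, w, rate, batch, grad_fn):
    """One epoch: update w after every batch instead of once per pass over the data."""
    for begin in range(0, len(x), batch):
        end = begin + batch
        grad = grad_fn(x[begin:end], y[begin:end], w)    # gradient computed on this batch only
        w = w - rate * grad                              # immediate weight update
    return w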

The updated code is as follows:

'''
Stochastic (mini-batch) gradient descent
Improvement over BGD: update the weights after every batch instead of after the full pass
'''
import numpy as np


print(__doc__)

sample = 10        # number of training samples
num_input = 5      # number of input features

# Generate the training data
np.random.seed(0)
normalRand = np.random.normal(0, 0.1, sample)        # 10 noise terms with mean 0 and std 0.1 (b)
weight = [1, 2, -3, -4, 5]                           # 1 * 5 true weights (example values)
x_train = np.random.random((sample, num_input))      # x data (10 * 5)
y_train = np.zeros((sample,))                        # y data (10 * 1)


for i in range(len(x_train)):
    total = 0
    for j in range(len(x_train[i])):
        total += weight[j] * x_train[i, j]
    y_train[i] = total + normalRand[i]               # y = w . x + noise


# Training
np.random.seed(0)
weight = np.random.random(num_input + 1)             # num_input weights plus one bias term
rate = 0.01                                          # learning rate (example value)
batch = 4                                            # batch size (example value)

def train(x_train, y_train):
    global weight, rate

    # Forward pass: predictions and squared-error loss on this batch
    predictY = np.zeros(len(x_train))
    for i in range(len(x_train)):
        predictY[i] = np.dot(x_train[i], weight[:num_input]) + weight[num_input]
    loss = 0
    for i in range(len(x_train)):
        loss += (predictY[i] - y_train[i]) ** 2

    # Gradient step for each weight w_i
    for i in range(len(weight) - 1):
        grade = 0
        for j in range(len(x_train)):
            grade += 2 * (predictY[j] - y_train[j]) * x_train[j, i]
        weight[i] = weight[i] - rate * grade

    # Gradient step for the bias term
    grade = 0
    for j in range(len(x_train)):
        grade += 2 * (predictY[j] - y_train[j])
    weight[num_input] = weight[num_input] - rate * grade

    return loss


for epoch in range(100):                              # number of epochs (example value)
    begin = 0
    while begin < len(x_train):
        end = begin + batch
        if end > len(x_train):
            end = len(x_train)

        # Update the weights using only this batch of the data
        loss = train(x_train[begin:end], y_train[begin:end])

        begin = end

        print("epoch: %d - loss: %f" % (epoch, loss))   # print the epoch and the loss
print(weight)
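
For reference, the same per-batch update can also be written with vectorized NumPy operations instead of explicit Python loops. This is only a sketch reusing the variables defined above (x_train, y_train, num_input, rate, batch), not part of the original listing.

# Vectorized version of the same mini-batch update (sketch; same data and hyperparameters as above)
w = np.random.random(num_input)                      # weights
b = 0.0                                              # bias kept as a separate scalar for clarity

for epoch in range(100):
    for begin in range(0, len(x_train), batch):
        xb = x_train[begin:begin + batch]
        yb = y_train[begin:begin + batch]
        pred = xb @ w + b                            # predictions for this batch
        err = pred - yb
        w -= rate * 2 * (xb.T @ err)                 # gradient of the batch squared error w.r.t. w
        b -= rate * 2 * err.sum()                    # gradient w.r.t. b
print(w, b)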