In this experiment we build a BP (back-propagation) neural network with two hidden layers and use it to fit an equation with a nonlinear quadratic relationship. We visualize the learned fitting curve, feed a couple of arbitrary input values through the network to obtain predictions, and finally give some key tips.
The source code is as follows:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

plotdata = {"batchsize": [], "loss": []}

def moving_average(a, w=11):
    # Smooth the recorded losses with a simple moving average of window w
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx-w):idx])/w for idx, val in enumerate(a)]

# Generate simulated data with a quadratic relationship plus noise
train_X = np.linspace(-1, 1, 100)[:, np.newaxis]
train_Y = train_X*train_X + 5 * train_X + np.random.randn(*train_X.shape) * 0.3

# Subplot 1 shows the simulated data points
plt.figure(12)
plt.subplot(221)
plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.legend()

# Build the model
# Placeholders
X = tf.placeholder("float", [None, 1])
Y = tf.placeholder("float", [None, 1])

# Model parameters
W1 = tf.Variable(tf.random_normal([1, 10]), name="weight1")
b1 = tf.Variable(tf.zeros([1, 10]), name="bias1")
W2 = tf.Variable(tf.random_normal([10, 6]), name="weight2")
b2 = tf.Variable(tf.zeros([1, 6]), name="bias2")
W3 = tf.Variable(tf.random_normal([6, 1]), name="weight3")
b3 = tf.Variable(tf.zeros([1]), name="bias3")

# Forward structure: 1 -> 10 -> 6 -> 1 with ReLU activations
z1 = tf.matmul(X, W1) + b1
z2 = tf.nn.relu(z1)
z3 = tf.matmul(z2, W2) + b2
z4 = tf.nn.relu(z3)
z5 = tf.matmul(z4, W3) + b3

# Backward optimization: mean squared error, minimized by gradient descent
cost = tf.reduce_mean(tf.square(Y - z5))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variables
init = tf.global_variables_initializer()

# Training parameters
training_epochs = 5000
display_step = 2

# Launch the session
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs + 1):
        sess.run(optimizer, feed_dict={X: train_X, Y: train_Y})
        # Display training details
        if epoch % display_step == 0:
            loss = sess.run(cost, feed_dict={X: train_X, Y: train_Y})
            print("Epoch:", epoch, "cost=", loss)
            if not np.isnan(loss):
                plotdata["batchsize"].append(epoch)
                plotdata["loss"].append(loss)
    print(" Finish")

    # Subplot 2 shows the original data and the fitted curve
    plt.subplot(222)
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(z5, feed_dict={X: train_X}), label='Fitted line')
    plt.legend()

    # Subplot 3 shows the smoothed training loss
    plotdata["avgloss"] = moving_average(plotdata["loss"])
    plt.subplot(212)
    plt.plot(plotdata["batchsize"], plotdata["avgloss"], 'b--')
    plt.xlabel('Minibatch number')
    plt.ylabel('Loss')
    plt.title('Minibatch run vs Training loss')
    plt.show()

    # Predict on new inputs
    a = [[0.2], [0.3]]
    print("x=[[0.2],[0.3]],z5=", sess.run(z5, feed_dict={X: a}))
The output of a run is as follows:

The result is excellent: the network fits the relationship very well. A few points about the example above deserve further explanation:
- Input nodes can be defined in a dictionary and then accessed through that dictionary:
input = {
    'X': tf.placeholder("float", [None, 1]),
    'Y': tf.placeholder("float", [None, 1])
}
sess.run(optimizer, feed_dict={input['X']: train_X, input['Y']: train_Y})
Defining the input nodes directly as individual variables is not recommended.
- Variables can also be defined in a dictionary; for example, the code above can be rewritten as:
parameter = {
    'W1': tf.Variable(tf.random_normal([1, 10]), name="weight1"),
    'b1': tf.Variable(tf.zeros([1, 10]), name="bias1"),
    'W2': tf.Variable(tf.random_normal([10, 6]), name="weight2"),
    'b2': tf.Variable(tf.zeros([1, 6]), name="bias2"),
    'W3': tf.Variable(tf.random_normal([6, 1]), name="weight3"),
    'b3': tf.Variable(tf.zeros([1]), name="bias3")
}
z1 = tf.matmul(X, parameter['W1']) + parameter['b1']
Using the code above, we now practice saving and loading the model. The code is as follows:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

plotdata = {"batchsize": [], "loss": []}

def moving_average(a, w=11):
    # Smooth the recorded losses with a simple moving average of window w
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx-w):idx])/w for idx, val in enumerate(a)]

# Generate simulated data with a quadratic relationship plus noise
train_X = np.linspace(-1, 1, 100)[:, np.newaxis]
train_Y = train_X*train_X + 5 * train_X + np.random.randn(*train_X.shape) * 0.3

# Subplot 1 shows the simulated data points
plt.figure(12)
plt.subplot(221)
plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.legend()

# Build the model
# Placeholders defined in a dictionary
input = {'X': tf.placeholder("float", [None, 1]),
         'Y': tf.placeholder("float", [None, 1])}
# X = tf.placeholder("float",[None,1])
# Y = tf.placeholder("float",[None,1])

# Model parameters defined in a dictionary
parameter = {'W1': tf.Variable(tf.random_normal([1, 10]), name="weight1"),
             'b1': tf.Variable(tf.zeros([1, 10]), name="bias1"),
             'W2': tf.Variable(tf.random_normal([10, 6]), name="weight2"),
             'b2': tf.Variable(tf.zeros([1, 6]), name="bias2"),
             'W3': tf.Variable(tf.random_normal([6, 1]), name="weight3"),
             'b3': tf.Variable(tf.zeros([1]), name="bias3")}
# W1 = tf.Variable(tf.random_normal([1,10]), name="weight1")
# b1 = tf.Variable(tf.zeros([1,10]), name="bias1")
# W2 = tf.Variable(tf.random_normal([10,6]), name="weight2")
# b2 = tf.Variable(tf.zeros([1,6]), name="bias2")
# W3 = tf.Variable(tf.random_normal([6,1]), name="weight3")
# b3 = tf.Variable(tf.zeros([1]), name="bias3")

# Forward structure
z1 = tf.matmul(input['X'], parameter['W1']) + parameter['b1']
z2 = tf.nn.relu(z1)
z3 = tf.matmul(z2, parameter['W2']) + parameter['b2']
z4 = tf.nn.relu(z3)
z5 = tf.matmul(z4, parameter['W3']) + parameter['b3']

# Backward optimization: mean squared error, minimized by gradient descent
cost = tf.reduce_mean(tf.square(input['Y'] - z5))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variables
init = tf.global_variables_initializer()

# Training parameters
training_epochs = 5000
display_step = 2

# Create the saver
saver = tf.train.Saver()
savedir = "model/"  # checkpoint directory (must already exist)

# Launch the session
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs + 1):
        sess.run(optimizer, feed_dict={input['X']: train_X, input['Y']: train_Y})
        # Display training details
        if epoch % display_step == 0:
            loss = sess.run(cost, feed_dict={input['X']: train_X, input['Y']: train_Y})
            print("Epoch:", epoch, "cost=", loss)
            if not np.isnan(loss):
                plotdata["batchsize"].append(epoch)
                plotdata["loss"].append(loss)
    print(" Finish")

    # Save the trained model
    saver.save(sess, savedir + "mymodel.cpkt")

    # Subplot 2 shows the original data and the fitted curve
    plt.subplot(222)
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(z5, feed_dict={input['X']: train_X}), label='Fitted line')
    plt.legend()

    # Subplot 3 shows the smoothed training loss
    plotdata["avgloss"] = moving_average(plotdata["loss"])
    plt.subplot(212)
    plt.plot(plotdata["batchsize"], plotdata["avgloss"], 'b--')
    plt.xlabel('Minibatch number')
    plt.ylabel('Loss')
    plt.title('Minibatch run vs Training loss')
    plt.show()

# Prediction: load the saved model in another session and test it
a = [[0.2], [0.3]]
with tf.Session() as sess2:
    # sess2.run(tf.global_variables_initializer()) is optional here, because
    # restore() below loads the saved parameters, which serves as initialization
    saver.restore(sess2, "model/mymodel.cpkt")
    print("x=[[0.2],[0.3]],z5=", sess2.run(z5, feed_dict={input['X']: a}))
Running this generates the saved model files under the model/ directory.
The loading code above does not make use of the checkpoint file, which is not very smart: the user still has to find and specify a particular model file. In many projects the user should not need to do that; instead, the saved model can be located through the checkpoint file. For example:
# -*- coding: utf-8 -*-
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

plotdata = {"batchsize": [], "loss": []}

def moving_average(a, w=11):
    # Smooth the recorded losses with a simple moving average of window w
    if len(a) < w:
        return a[:]
    return [val if idx < w else sum(a[(idx-w):idx])/w for idx, val in enumerate(a)]

# Generate simulated data with a quadratic relationship plus noise
train_X = np.linspace(-1, 1, 100)[:, np.newaxis]
train_Y = train_X*train_X + 5 * train_X + np.random.randn(*train_X.shape) * 0.3

# Subplot 1 shows the simulated data points
plt.figure(12)
plt.subplot(221)
plt.plot(train_X, train_Y, 'ro', label='Original data')
plt.legend()

# Build the model
# Placeholders defined in a dictionary
input = {'X': tf.placeholder("float", [None, 1]),
         'Y': tf.placeholder("float", [None, 1])}
# X = tf.placeholder("float",[None,1])
# Y = tf.placeholder("float",[None,1])

# Model parameters defined in a dictionary
parameter = {'W1': tf.Variable(tf.random_normal([1, 10]), name="weight1"),
             'b1': tf.Variable(tf.zeros([1, 10]), name="bias1"),
             'W2': tf.Variable(tf.random_normal([10, 6]), name="weight2"),
             'b2': tf.Variable(tf.zeros([1, 6]), name="bias2"),
             'W3': tf.Variable(tf.random_normal([6, 1]), name="weight3"),
             'b3': tf.Variable(tf.zeros([1]), name="bias3")}
# W1 = tf.Variable(tf.random_normal([1,10]), name="weight1")
# b1 = tf.Variable(tf.zeros([1,10]), name="bias1")
# W2 = tf.Variable(tf.random_normal([10,6]), name="weight2")
# b2 = tf.Variable(tf.zeros([1,6]), name="bias2")
# W3 = tf.Variable(tf.random_normal([6,1]), name="weight3")
# b3 = tf.Variable(tf.zeros([1]), name="bias3")

# Forward structure
z1 = tf.matmul(input['X'], parameter['W1']) + parameter['b1']
z2 = tf.nn.relu(z1)
z3 = tf.matmul(z2, parameter['W2']) + parameter['b2']
z4 = tf.nn.relu(z3)
z5 = tf.matmul(z4, parameter['W3']) + parameter['b3']

# Backward optimization: mean squared error, minimized by gradient descent
cost = tf.reduce_mean(tf.square(input['Y'] - z5))
learning_rate = 0.01
optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

# Initialize variables
init = tf.global_variables_initializer()

# Training parameters
training_epochs = 5000
display_step = 2

# Create the saver; keep only the most recent model file
saver = tf.train.Saver(max_to_keep=1)
savedir = "model/"  # checkpoint directory (must already exist)

# Launch the session
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs + 1):
        sess.run(optimizer, feed_dict={input['X']: train_X, input['Y']: train_Y})
        # Save a checkpoint every epoch; the epoch number is appended to the file name
        saver.save(sess, savedir + "mymodel.cpkt", global_step=epoch)
        # Display training details
        if epoch % display_step == 0:
            loss = sess.run(cost, feed_dict={input['X']: train_X, input['Y']: train_Y})
            print("Epoch:", epoch, "cost=", loss)
            if not np.isnan(loss):
                plotdata["batchsize"].append(epoch)
                plotdata["loss"].append(loss)
    print(" Finish")

    # Subplot 2 shows the original data and the fitted curve
    plt.subplot(222)
    plt.plot(train_X, train_Y, 'ro', label='Original data')
    plt.plot(train_X, sess.run(z5, feed_dict={input['X']: train_X}), label='Fitted line')
    plt.legend()

    # Subplot 3 shows the smoothed training loss
    plotdata["avgloss"] = moving_average(plotdata["loss"])
    plt.subplot(212)
    plt.plot(plotdata["batchsize"], plotdata["avgloss"], 'b--')
    plt.xlabel('Minibatch number')
    plt.ylabel('Loss')
    plt.title('Minibatch run vs Training loss')
    plt.show()

# Prediction: load the saved model in another session and test it
a = [[0.2], [0.3]]
load = 5000
with tf.Session() as sess2:
    # sess2.run(tf.global_variables_initializer()) is optional here, because
    # restore() below loads the saved parameters, which serves as initialization
    # saver.restore(sess2, "model/mymodel.cpkt")
    saver.restore(sess2, "model/mymodel.cpkt-" + str(load))
    print("x=[[0.2],[0.3]],z5=", sess2.run(z5, feed_dict={input['X']: a}))

# Load the saved model through the checkpoint state file
with tf.Session() as sess3:
    ckpt = tf.train.get_checkpoint_state(savedir)
    if ckpt and ckpt.model_checkpoint_path:
        saver.restore(sess3, ckpt.model_checkpoint_path)
        print("x=[[0.2],[0.3]],z5=", sess3.run(z5, feed_dict={input['X']: a}))

# Load the most recently saved model through the checkpoint file
with tf.Session() as sess4:
    ckpt = tf.train.latest_checkpoint(savedir)
    if ckpt is not None:
        saver.restore(sess4, ckpt)
        print("x=[[0.2],[0.3]],z5=", sess4.run(z5, feed_dict={input['X']: a}))
Normally the two checkpoint-based ways of loading the model parameters above produce the same result. The reason is that no matter how many model files the user saves, they are all recorded in a single checkpoint file, and the number of model files to keep is set by the max_to_keep argument, for example:
saver = tf.train.Saver(max_to_keep=3)
The checkpoint defaults to the most recently saved model and ignores the earlier ones, so the two checkpoint-based loaders above load the same model and naturally print the same test result. The three saved models are shown in the figure. The checkpoint file can also be queried directly, as sketched below.
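As a minimal sketch (assuming the model/ directory produced by the training code above), tf.train.get_checkpoint_state exposes not only the latest model path but every path still recorded in the checkpoint file:

# Minimal sketch: list all model paths recorded in the checkpoint file.
# Assumes the "model/" directory produced by the training code above.
import tensorflow as tf

ckpt = tf.train.get_checkpoint_state("model/")
if ckpt:
    print("latest:", ckpt.model_checkpoint_path)    # the model restored by default
    for path in ckpt.all_model_checkpoint_paths:    # every model file still kept
        print("kept:", path)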
Next, why do the variables above need names for their corresponding ops, and different names at that, such as weight1, bias1 and so on? Names matter a great deal: through a name we can access the variable we want and then operate on it. For example:
- Displaying the contents of a saved model
The relevant function differs slightly between TensorFlow versions; this article was tested with version 1.7.0. Example code:
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python.tools import inspect_checkpoint as chkp

# Print the names and values of all variables in the checkpoint
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-5000", all_tensor_names='', tensor_name='', all_tensors=True)
# Print the value of a variable with a given name
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-5000", all_tensor_names='', tensor_name='weight1', all_tensors=False)
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-5000", all_tensor_names='', tensor_name='bias1', all_tensors=False)
The output is shown in the figure below.
Conversely, if the ops of different variables are given the same name, the system automatically numbers the same-named ops, for example:
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python.tools import inspect_checkpoint as chkp

# Print the names and values of all variables in the checkpoint
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='', all_tensors=True)
# Print the value of a variable with a given name
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='weight', all_tensors=False)
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='bias', all_tensors=False)
The result is:
Note that once all the same-named variables have been numbered, the actual variable names have changed, so when you ask for the value of a given name you actually get the first variable's value, because only its name is left unchanged. You can also inspect a variable's op name through its name attribute, as in the sketch below.
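As a minimal standalone sketch of this automatic renaming (not part of the training scripts above):

# Minimal sketch: two variables given the same name are numbered automatically.
import tensorflow as tf

w_a = tf.Variable(1.0, name="weight")
w_b = tf.Variable(2.0, name="weight")  # clashes with the first name

print(w_a.name)  # weight:0   -- keeps the original name
print(w_b.name)  # weight_1:0 -- automatically renamed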
- Saving variables by name
Variables can be saved under explicitly specified names; note that if the names get mixed up, the values stored under those names get mixed up as well, for example:
# Save only these two variables, and their names are deliberately swapped
saver = tf.train.Saver({'weight': parameter['b2'], 'bias': parameter['W1']})
# -*- coding: utf-8 -*-
import tensorflow as tf
from tensorflow.python.tools import inspect_checkpoint as chkp

# Print the names and values of all variables in the checkpoint
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='', all_tensors=True)
# Print the value of a variable with a given name
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='weight', all_tensors=False)
chkp.print_tensors_in_checkpoint_file("model/mymodel.cpkt-50", all_tensor_names='', tensor_name='bias', all_tensors=False)
The result now is:
In this way the model saves the parameters exactly as we intend; just take care not to mix up the variables and the names that correspond to them, as in the corrected sketch below.
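For contrast, here is a minimal standalone sketch in which the keys actually match the variables they store. The variable shapes and the model/ directory are borrowed from the code above; the checkpoint prefix named_vars.cpkt is hypothetical:

# Minimal sketch: save two variables under names that match them,
# then restore them into a second session from the same checkpoint prefix.
import tensorflow as tf

W1 = tf.Variable(tf.random_normal([1, 10]), name="weight1")
b2 = tf.Variable(tf.zeros([1, 6]), name="bias2")

# Keys are the names written into the checkpoint; values are the matching variables.
saver = tf.train.Saver({'weight1': W1, 'bias2': b2})

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    saver.save(sess, "model/named_vars.cpkt")  # the model/ directory must already exist

with tf.Session() as sess2:
    saver.restore(sess2, "model/named_vars.cpkt")  # only W1 and b2 are restored
    print(sess2.run(W1))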