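PS: 1. This post is a memo of my own TF learning process. I am very much a beginner, so it will inevitably contain some mistakes; please point them out, thanks.
2. The code in this post comes from the official getting-started tutorial published by Google TensorFlow.
3. This post uses tf.keras to build and train the model.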
1. Saving model parameters during training (checkpoints)
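This post uses ModelCheckpoint, one of the callbacks provided by keras, to save the model's weights during training. It then builds a fresh, untrained model, which scores about 10.5% accuracy on the test set. After the saved weights are loaded and the model is evaluated again on the same test set, accuracy is about 87.2%.
A callback is a set of functions applied at given stages of training. You can use callbacks to inspect the internal state and statistics of the model while it trains.
ModelCheckpoint lets you save the model both during training and at the end of it.
Reference: https://keras.io/zh/callbacks/#_1u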
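To make the idea of a callback concrete, here is a minimal pure-Python sketch of a training loop that calls a hook at the end of every epoch. It does not use Keras at all: PrintingCheckpoint, run_training, and the fake loss values are hypothetical names invented for illustration, and the real ModelCheckpoint does considerably more.

```python
class PrintingCheckpoint:
    """Toy stand-in for a checkpoint callback (hypothetical, not the Keras class)."""

    def __init__(self, filepath):
        self.filepath = filepath
        self.saved = []  # records (epoch, loss) pairs instead of writing files

    def on_epoch_end(self, epoch, logs):
        # A real callback would write the model's weights to self.filepath here.
        self.saved.append((epoch, logs["loss"]))
        print("Epoch {:05d}: saving model to {}".format(epoch, self.filepath))


def run_training(epochs, callbacks):
    """Toy training loop: the 'loss' simply halves every epoch."""
    loss = 1.0
    for epoch in range(1, epochs + 1):
        loss = loss / 2  # pretend one epoch of training happened
        for cb in callbacks:
            cb.on_epoch_end(epoch, {"loss": loss})
    return loss


checkpoint = PrintingCheckpoint("training_1/cp.ckpt")
final_loss = run_training(epochs=3, callbacks=[checkpoint])
```

The training loop knows nothing about saving; it only promises to call on_epoch_end, which is what lets you plug different behaviours into the same loop.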
import os
import tensorflow as tf
from tensorflow import keras
print(tf.version.VERSION)
# Load the dataset (training set and test set)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Use only the first 1000 samples
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
# Normalize: flatten each image to 784 values and scale pixels from 0-255 to 0-1
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        # Fully connected layer, relu activation, input dimension 784 (28*28)
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        # Dropout to reduce overfitting and improve generalization;
        # input units are dropped at random with probability 0.2
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# Create a basic model instance
model = create_model()
# Show the model's architecture
model.summary()
# Save the model during training (as checkpoints)
# Path and name to save under
checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create a callback that saves the model's weights
# ModelCheckpoint: saves the model after every epoch
# filepath: where to write the checkpoint
# save_weights_only=True: save only the weights, not the whole model
# verbose=1: print a message on every save
# period (not used here): interval, in epochs, between checkpoints
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
# Train the model with the new callback
model.fit(train_images,
          train_labels,
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # pass the callback to training
# Create a new, untrained model instance
model = create_model()
# Evaluate the untrained model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Untrained model, accuracy: {:5.2f}%".format(100*acc))
# Load the weights
model.load_weights(checkpoint_path)
# Re-evaluate the model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dropout (Dropout) (None, 512) 0
_________________________________________________________________
dense_1 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
Epoch 1/10
1/32 [..............................] - ETA: 0s - loss: 2.3353 - accuracy: 0.1562
Epoch 00001: saving model to training_1/cp.ckpt
32/32 [==============================] - 1s 18ms/step - loss: 1.2178 - accuracy: 0.6480 - val_loss: 0.7362 - val_accuracy: 0.7850
Epoch 2/10
1/32 [..............................] - ETA: 0s - loss: 0.3494 - accuracy: 0.9375
Epoch 00002: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 15ms/step - loss: 0.4409 - accuracy: 0.8740 - val_loss: 0.5288 - val_accuracy: 0.8410
Epoch 3/10
14/32 [============>.................] - ETA: 0s - loss: 0.3190 - accuracy: 0.9219
Epoch 00003: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 15ms/step - loss: 0.2917 - accuracy: 0.9300 - val_loss: 0.4958 - val_accuracy: 0.8490
Epoch 4/10
1/32 [..............................] - ETA: 0s - loss: 0.1519 - accuracy: 0.9688
Epoch 00004: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 13ms/step - loss: 0.2089 - accuracy: 0.9540 - val_loss: 0.4435 - val_accuracy: 0.8530
Epoch 5/10
1/32 [..............................] - ETA: 0s - loss: 0.0919 - accuracy: 1.0000
Epoch 00005: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 13ms/step - loss: 0.1563 - accuracy: 0.9630 - val_loss: 0.4257 - val_accuracy: 0.8560
Epoch 6/10
30/32 [===========================>..] - ETA: 0s - loss: 0.1299 - accuracy: 0.9760
Epoch 00006: saving model to training_1/cp.ckpt
32/32 [==============================] - 1s 17ms/step - loss: 0.1316 - accuracy: 0.9760 - val_loss: 0.4221 - val_accuracy: 0.8630
Epoch 7/10
31/32 [============================>.] - ETA: 0s - loss: 0.0900 - accuracy: 0.9829
Epoch 00007: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 14ms/step - loss: 0.0896 - accuracy: 0.9830 - val_loss: 0.4172 - val_accuracy: 0.8740
Epoch 8/10
30/32 [===========================>..] - ETA: 0s - loss: 0.0660 - accuracy: 0.9917
Epoch 00008: saving model to training_1/cp.ckpt
32/32 [==============================] - 1s 18ms/step - loss: 0.0658 - accuracy: 0.9920 - val_loss: 0.4227 - val_accuracy: 0.8680
Epoch 9/10
31/32 [============================>.] - ETA: 0s - loss: 0.0495 - accuracy: 0.9980
Epoch 00009: saving model to training_1/cp.ckpt
32/32 [==============================] - 0s 15ms/step - loss: 0.0494 - accuracy: 0.9980 - val_loss: 0.4176 - val_accuracy: 0.8650
Epoch 10/10
1/32 [..............................] - ETA: 0s - loss: 0.0210 - accuracy: 1.0000
Epoch 00010: saving model to training_1/cp.ckpt
32/32 [==============================] - 1s 17ms/step - loss: 0.0382 - accuracy: 0.9970 - val_loss: 0.4103 - val_accuracy: 0.8720
Evaluating the untrained model
32/32 - 0s - loss: 2.3249 - accuracy: 0.1050
Untrained model, accuracy: 10.50%
Re-evaluating the model after loading the weights
32/32 - 0s - loss: 0.4103 - accuracy: 0.8720
Restored model, accuracy: 87.20%
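The numbers in the summary and the progress bars can be checked by hand. The sketch below recomputes the parameter counts of the two Dense layers and the 32 steps per epoch. It assumes the Keras default batch size of 32, which the script does not set explicitly, and dense_params is a helper name made up here.

```python
import math


def dense_params(inputs, units):
    # One weight per (input, unit) pair plus one bias per unit
    return inputs * units + units


hidden = dense_params(784, 512)   # first Dense layer
output = dense_params(512, 10)    # final Dense layer
total = hidden + output           # Dropout adds no parameters

# 1000 samples with an assumed batch size of 32
steps_per_epoch = math.ceil(1000 / 32)

print(hidden, output, total, steps_per_epoch)
```

The untrained model's accuracy of about 10% is also what you would expect from guessing among 10 digit classes.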
2. Saving checkpoints at a fixed frequency
You can also save multiple uniquely named checkpoints at a fixed interval of epochs.
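The unique names come from ordinary Python string formatting: the file path contains an {epoch:04d} placeholder that is filled in with the epoch number, zero-padded to four digits. A minimal illustration, using the same path pattern as the script in this section:

```python
import os

checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"

# Fill the placeholder for a few epoch numbers
names = [checkpoint_path.format(epoch=e) for e in (0, 5, 50)]
print(names)

# The directory part is the same for every checkpoint
checkpoint_dir = os.path.dirname(checkpoint_path)
print(checkpoint_dir)
```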
import os
import tensorflow as tf
from tensorflow import keras
print(tf.version.VERSION)
# Load the dataset (training set and test set)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Use only the first 1000 samples
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
# Normalize: flatten each image to 784 values and scale pixels from 0-255 to 0-1
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        # Fully connected layer, relu activation, input dimension 784 (28*28)
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        # Dropout to reduce overfitting and improve generalization;
        # input units are dropped at random with probability 0.2
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# Create a basic model instance
model = create_model()
# Show the model's architecture
model.summary()
# Include the epoch in the file name (using `str.format`)
checkpoint_path = "training_2/cp-{epoch:04d}.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create a callback that saves the model's weights every 5 epochs
# (`period` is deprecated in later TensorFlow releases in favor of `save_freq`)
cp_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_path,
    verbose=1,
    save_weights_only=True,
    period=5)
# Save the initial weights using the `checkpoint_path` format
model.save_weights(checkpoint_path.format(epoch=0))
# Train the model with the new callback
model.fit(train_images,
          train_labels,
          epochs=50,
          callbacks=[cp_callback],
          validation_data=(test_images, test_labels),
          verbose=0)
# Now look at the generated checkpoints and pick the latest one;
# latest_checkpoint() does this for us
latest = tf.train.latest_checkpoint(checkpoint_dir)
print('latest:', latest)
# To use a file saved at a different checkpoint, refer to it by name directly, for example:
first_checkpoint = 'training_2/cp-0000.ckpt'
# Verify the saved checkpoint
# Create a new model instance
model = create_model()
# Load the previously saved weights
model.load_weights(latest)
# Re-evaluate the model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)  # show the result
print("Restored model, accuracy: {:5.2f}%".format(100*acc))  # print the accuracy
Output:
Epoch 00005: saving model to training_2/cp-0005.ckpt
Epoch 00010: saving model to training_2/cp-0010.ckpt
Epoch 00015: saving model to training_2/cp-0015.ckpt
Epoch 00020: saving model to training_2/cp-0020.ckpt
Epoch 00025: saving model to training_2/cp-0025.ckpt
Epoch 00030: saving model to training_2/cp-0030.ckpt
Epoch 00035: saving model to training_2/cp-0035.ckpt
Epoch 00040: saving model to training_2/cp-0040.ckpt
Epoch 00045: saving model to training_2/cp-0045.ckpt
Epoch 00050: saving model to training_2/cp-0050.ckpt
latest: training_2\cp-0050.ckpt
32/32 - 0s - loss: 0.4845 - accuracy: 0.8740
Restored model, accuracy: 87.40%
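If you want a checkpoint other than the latest one, you can work out which epochs are available from the file names. The sketch below is plain Python operating on a list of file names. It is not how tf.train.latest_checkpoint works internally, available_epochs is a hypothetical helper, and the directory listing is an assumed example rather than captured output.

```python
import re


def available_epochs(filenames):
    """Return the sorted epoch numbers found in names like cp-0005.ckpt.index."""
    pattern = re.compile(r"cp-(\d{4})\.ckpt\.index$")
    epochs = set()
    for name in filenames:
        match = pattern.search(name)
        if match:
            epochs.add(int(match.group(1)))
    return sorted(epochs)


# Assumed example of what os.listdir("training_2") might return
listing = [
    "checkpoint",
    "cp-0000.ckpt.index",
    "cp-0000.ckpt.data-00000-of-00001",
    "cp-0005.ckpt.index",
    "cp-0005.ckpt.data-00000-of-00001",
    "cp-0010.ckpt.index",
    "cp-0010.ckpt.data-00000-of-00001",
]

epochs = available_epochs(listing)
# Build the prefix for the second available checkpoint
chosen = "training_2/cp-{:04d}.ckpt".format(epochs[1])
print(epochs, chosen)
```

The resulting prefix (without the .index or .data suffix) is the form that model.load_weights is given in the script above.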
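Saved files
The code above stores the weights in a collection of checkpoint-formatted files that contain only the trained weights in a binary format. Checkpoints contain:
One or more shards that contain the model's weights.
An index file that indicates which weights are stored in which shard.
If you train a model on a single machine, you will have one shard with the suffix .data-00000-of-00001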
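As an illustration of the naming scheme just described, this sketch builds the file names you would expect for a given checkpoint prefix and shard count. checkpoint_files is a hypothetical helper, and the five-digit zero padding is inferred from the .data-00000-of-00001 suffix shown above.

```python
def checkpoint_files(prefix, num_shards=1):
    """List the index file and the shard files for a checkpoint prefix."""
    files = [prefix + ".index"]
    for shard in range(num_shards):
        files.append("{}.data-{:05d}-of-{:05d}".format(prefix, shard, num_shards))
    return files


single = checkpoint_files("training_1/cp.ckpt")
print(single)
```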
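3. Manually saving weights
You have seen how to load weights into a model. Saving them manually is just as simple with the Model.save_weights method. By default, tf.keras, and save_weights in particular, uses the TensorFlow checkpoint format with a .ckpt extension (saving in HDF5 with a .h5 extension is covered in the guide on saving and serializing models).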
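The choice between the two formats is driven by the file name. The helper below only mirrors the rule as stated in this post (.h5 means HDF5, anything else is treated as a TensorFlow checkpoint). guess_weights_format is hypothetical, and the library's own check may recognize more extensions than this.

```python
def guess_weights_format(filepath):
    """Simplified rule from this post: '.h5' -> HDF5, otherwise TF checkpoint."""
    if filepath.endswith(".h5"):
        return "h5"
    return "tf"


for path in ("./checkpoints/my_checkpoint", "training_1/cp.ckpt", "my_model.h5"):
    print(path, "->", guess_weights_format(path))
```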
import os
import tensorflow as tf
from tensorflow import keras
print(tf.version.VERSION)
# Load the dataset (training set and test set)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Use only the first 1000 samples
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
# Normalize: flatten each image to 784 values and scale pixels from 0-255 to 0-1
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        # Fully connected layer, relu activation, input dimension 784 (28*28)
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        # Dropout to reduce overfitting and improve generalization;
        # input units are dropped at random with probability 0.2
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# Create a basic model instance
model = create_model()
# Show the model's architecture
model.summary()
# Save the model during training (as checkpoints)
# Path and name to save under
checkpoint_path = "training_1/cp.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create a callback that saves the model's weights
# ModelCheckpoint: saves the model after every epoch
# filepath: where to write the checkpoint
# save_weights_only=True: save only the weights, not the whole model
# verbose=1: print a message on every save
# period (not used here): interval, in epochs, between checkpoints
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
# Train the model with the new callback
model.fit(train_images,
          train_labels,
          epochs=10,
          validation_data=(test_images, test_labels),
          callbacks=[cp_callback])  # pass the callback to training
# Save the weights manually (saves all layer weights)
model.save_weights('./checkpoints/my_checkpoint')
# Create a new model instance
model = create_model()
# Restore the weights
model.load_weights('./checkpoints/my_checkpoint')
# Evaluate the model
loss, acc = model.evaluate(test_images, test_labels, verbose=2)
print("Restored model, accuracy: {:5.2f}%".format(100*acc))
Output
4. Saving the whole model in HDF5 format
The code here uses early stopping to prevent overfitting.
import os
import tensorflow as tf
from tensorflow import keras
print(tf.version.VERSION)
# Load the dataset (training set and test set)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Use only the first 1000 samples
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
# Normalize: flatten each image to 784 values and scale pixels from 0-255 to 0-1
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        # Fully connected layer, relu activation, input dimension 784 (28*28)
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        # Dropout to reduce overfitting and improve generalization;
        # input units are dropped at random with probability 0.2
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# Save in HDF5 format
# Create and train a new model instance
model = create_model()
# The EarlyStopping callback checks a training condition at the end of every epoch
# and stops training automatically once the monitored validation value stops improving.
# patience is the number of epochs without improvement to wait before stopping.
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
# Train the model again, with log output enabled (verbose=1)
model.fit(train_images, train_labels, epochs=50,
          validation_split=0.2, verbose=1, callbacks=[early_stop])
#model.fit(train_images, train_labels, epochs=20)
# Save the entire model to an HDF5 file.
# The '.h5' extension indicates that the model should be saved as HDF5.
model.save('my_model.h5')
# Recreate the exact same model, including its weights and optimizer
new_model = tf.keras.models.load_model('my_model.h5')
# Show the network architecture
new_model.summary()
# Check its accuracy
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print('Restored model, accuracy: {:5.2f}%'.format(100*acc))
Output:
Epoch 16/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0124 - accuracy: 1.0000 - val_loss: 0.5265 - val_accuracy: 0.8650
Epoch 17/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0124 - accuracy: 1.0000 - val_loss: 0.5186 - val_accuracy: 0.8750
Epoch 18/50
25/25 [==============================] - 0s 2ms/step - loss: 0.0110 - accuracy: 1.0000 - val_loss: 0.5418 - val_accuracy: 0.8750
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dropout (Dropout) (None, 512) 0
_________________________________________________________________
dense_1 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
32/32 - 0s - loss: 0.4602 - accuracy: 0.8680
Restored model, accuracy: 86.80%
Reference: an introduction to HDF5 data files
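The patience rule used for early stopping in this section can be modelled in a few lines of plain Python. This is a simplified sketch, not the Keras implementation: the real EarlyStopping has extra options (such as min_delta) that are ignored here, stopping_epoch is a hypothetical helper, and the loss values are made up.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: the best value is reached at epoch 3
losses = [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9]
stopped = stopping_epoch(losses, patience=3)
never = stopping_epoch(losses, patience=10)
print(stopped, never)
```

In the log above training stopped at epoch 18 of 50 with patience=10; under this simplified rule that would correspond to the last improvement in val_loss happening at epoch 8.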
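5. Saving the whole model in SavedModel format
The SavedModel format is another way to serialize models. Models saved in this format can be restored with tf.keras.models.load_model and are compatible with TensorFlow Serving. The SavedModel guide goes into detail about how to serve and inspect a SavedModel. The section below walks through the steps to save and restore the model.
The SavedModel format is a directory containing a protobuf binary and a TensorFlow checkpoint.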
import os
import tensorflow as tf
from tensorflow import keras
print(tf.version.VERSION)
# Load the dataset (training set and test set)
(train_images, train_labels), (test_images, test_labels) = tf.keras.datasets.mnist.load_data()
# Use only the first 1000 samples
train_labels = train_labels[:1000]
test_labels = test_labels[:1000]
# Normalize: flatten each image to 784 values and scale pixels from 0-255 to 0-1
train_images = train_images[:1000].reshape(-1, 28 * 28) / 255.0
test_images = test_images[:1000].reshape(-1, 28 * 28) / 255.0
# Define a simple sequential model
def create_model():
    model = tf.keras.models.Sequential([
        # Fully connected layer, relu activation, input dimension 784 (28*28)
        keras.layers.Dense(512, activation='relu', input_shape=(784,)),
        # Dropout to reduce overfitting and improve generalization;
        # input units are dropped at random with probability 0.2
        keras.layers.Dropout(0.2),
        keras.layers.Dense(10)
    ])
    model.compile(optimizer='adam',
                  loss=tf.losses.SparseCategoricalCrossentropy(from_logits=True),
                  metrics=['accuracy'])
    return model
# Save in SavedModel format
# Create and train a new model instance.
model = create_model()
model.fit(train_images, train_labels, epochs=5)
# Save the entire model as a SavedModel.
model.save('saved_model/my_model')
# Reload a fresh Keras model from the saved model:
new_model = tf.keras.models.load_model('saved_model/my_model')
# Check its architecture
new_model.summary()
# Evaluate the restored model
loss, acc = new_model.evaluate(test_images, test_labels, verbose=2)
print('Restored model, accuracy: {:5.2f}%'.format(100*acc))
print(new_model.predict(test_images).shape)
Output:
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense (Dense) (None, 512) 401920
_________________________________________________________________
dropout (Dropout) (None, 512) 0
_________________________________________________________________
dense_1 (Dense) (None, 10) 5130
=================================================================
Total params: 407,050
Trainable params: 407,050
Non-trainable params: 0
_________________________________________________________________
32/32 - 0s - loss: 0.4319 - accuracy: 0.8610
Restored model, accuracy: 86.10%
(1000, 10)
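The (1000, 10) array returned by predict holds one row of 10 raw scores (logits) per test image, because the last Dense layer has no activation and the loss is configured with from_logits=True. Here is a minimal pure-Python sketch of turning one such row into probabilities with softmax; the logit values are made up for illustration.

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability before exponentiating
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for a single image (10 classes, digits 0-9)
row = [-1.2, 0.3, 4.1, 0.0, -0.7, 1.5, -2.0, 0.9, 0.2, -0.4]
probs = softmax(row)
predicted_digit = probs.index(max(probs))
print(predicted_digit, round(sum(probs), 6))
```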
Resulting file structure
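One way to see the generated file structure for yourself is to walk the output directory. The sketch below uses only the standard library. list_tree is a hypothetical helper, and the demo builds a throwaway directory with placeholder files rather than a real SavedModel.

```python
import os
import tempfile


def list_tree(root):
    """Return the relative paths of all files under root, sorted."""
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            paths.append(os.path.relpath(full, root).replace(os.sep, "/"))
    return sorted(paths)


# Demo on a temporary directory with placeholder files
with tempfile.TemporaryDirectory() as tmp:
    os.makedirs(os.path.join(tmp, "variables"))
    for rel in ("saved_model.pb", "variables/variables.index"):
        with open(os.path.join(tmp, *rel.split("/")), "w") as f:
            f.write("placeholder")
    demo = list_tree(tmp)

print(demo)
```

After running the script in this section, calling list_tree('saved_model/my_model') would list what was actually written.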