Keras: basic structure and functionality

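1. The relationship and differences between Keras and TensorFlow:

TensorFlow, Theano, and Keras are all deep learning frameworks. TensorFlow and Theano are more flexible but also harder to learn; at their core they are automatic differentiation engines.
Keras is a high-level interface on top of TensorFlow or Theano (Keras is the front end, TensorFlow or Theano is the back end). It is still quite flexible and considerably easier to learn. You can think of Keras as an API that wraps TensorFlow.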

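2. Basic functionality: Sequential, Model, and more

Chinese documentation: http://keras-cn.readthedocs.io/en/latest/

Official documentation: https://keras.io/

These notes are based mainly on Keras 2.0.

Part 0. Introduction to Keras and basic model saving

The overview was drawn as mind maps to make it easier to browse and understand.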

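1. Keras network structure

[Figure: mind map of the Keras network structure]

2. Keras network configuration

[Figure: mind map of the Keras network configuration]

Callbacks are arguably the essence of Keras.

3. Keras preprocessing utilities

[Figure: mind map of the Keras preprocessing utilities]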

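4. Extracting a model's configuration

# Extract the configuration
config = model.get_config()  # returns the architecture as a Python dict (comparable to the network definition in a Caffe prototxt; weights are not included)
model = Model.from_config(config)  # rebuild a model from the config
# or, for Sequential:
model = Sequential.from_config(config)  # builds a fresh model with the same architecture, handy for further training or fine-tuning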

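5. Inspecting a model (including its weights)

# 1. Print a summary of the model
model.summary()

# 2. Return the model as a JSON string. It contains only the architecture, not the weights.
#    The architecture can be rebuilt from the JSON string:
from keras.models import model_from_json

json_string = model.to_json()
model = model_from_json(json_string)

# 3. model.to_yaml works like model.to_json: the model can be rebuilt from the generated YAML string
from keras.models import model_from_yaml

yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)

# 4. Accessing weights
model.get_layer()      # returns a layer object, looked up by layer name or index
model.get_weights()    # returns the model's weights as a list of numpy arrays
model.set_weights()    # loads weights from a list of numpy arrays; the shapes must match those returned by model.get_weights()

# Inspect the layers of the model
model.layers  # list of the layer objects that make up the model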

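6. Saving and loading models

model.save_weights(filepath)
# Saves the model weights to the given path as an HDF5 file (.h5 extension)

model.load_weights(filepath, by_name=False)
# Loads weights from an HDF5 file into the current model. By default the architecture is expected to be unchanged.
# To load weights into a different model that shares some layers, set by_name=True; only layers with matching names get their weights loaded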

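7. Limiting how much GPU memory Keras uses

This section comes from a collection of notes on sharing multiple GPUs among multiple users with Theano/TensorFlow (see: Limit the resource usage for tensorflow backend · Issue #1538 · fchollet/keras · GitHub).

Keras often ends up occupying all of the GPU memory. You can adjust this by reconfiguring the GPU options of the backend session.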

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session
config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3
set_session(tf.Session(config=config))

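Note that, according to the source of this tip, the memory fraction set in code or configuration is not a guarantee: if the program needs more memory at run time, it may still go beyond that threshold. In other words, a run on a large dataset can still end up using more GPU memory. The limit above mainly avoids wasting GPU memory when running on small datasets. (Added February 20, 2017)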

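8. A more disciplined way to train and save models

from keras.callbacks import ModelCheckpoint

filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'
# Save a checkpoint only when val_loss reaches a new minimum
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
# fit model (the training data is reused as validation data here purely for demonstration)
model.fit(x, y, epochs=20, verbose=2, callbacks=[checkpoint], validation_data=(x, y))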

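With save_best_only enabled, the training log contains lines like this:

ETA: 3s - loss: 0.5820Epoch 00017: val_loss did not improve

The model is saved only when val_loss improves (that is, decreases); if it does not improve, nothing is saved.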
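To make this behaviour concrete, here is a minimal pure-Python sketch of how the filename template is filled in and how the "save only on improvement" rule plays out. It does not call Keras: the `history` values are made up for illustration, and the loop is a simplified stand-in for what the callback does, not the library's actual implementation.

```python
# Same filename template as in the ModelCheckpoint example
filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'

# Hypothetical per-epoch logs, for illustration only
history = [
    {'loss': 0.70, 'val_loss': 0.65},
    {'loss': 0.60, 'val_loss': 0.66},  # worse than epoch 1: not saved
    {'loss': 0.55, 'val_loss': 0.60},  # new best: saved
]

best = float('inf')  # mode='min': a lower val_loss is better
saved_files = []
for epoch, logs in enumerate(history, start=1):
    if logs['val_loss'] < best:
        best = logs['val_loss']
        saved_files.append(filepath.format(epoch=epoch, **logs))

print(saved_files)
# ['model-ep001-loss0.700-val_loss0.650.h5', 'model-ep003-loss0.550-val_loss0.600.h5']
```

Because the epoch number and the losses are part of the filename, each improvement produces a new file rather than overwriting the previous one.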

9. Using TensorBoard with Keras

from keras.callbacks import TensorBoard, ModelCheckpoint, EarlyStopping

# model, gen, model_save_path and EPOCHS are assumed to be defined earlier in the script
RUN = RUN + 1 if 'RUN' in locals() else 1   # locals() returns a dict of all local variables in the current scope

LOG_DIR = model_save_path + '/training_logs/run{}'.format(RUN)
LOG_FILE_PATH = LOG_DIR + '/checkpoint-{epoch:02d}-{val_loss:.4f}.hdf5'   # where the log files and the saved model files are stored

tensorboard = TensorBoard(log_dir=LOG_DIR, write_images=True)
checkpoint = ModelCheckpoint(filepath=LOG_FILE_PATH, monitor='val_loss', verbose=1, save_best_only=True)
early_stopping = EarlyStopping(monitor='val_loss', patience=5, verbose=1)

history = model.fit_generator(generator=gen.generate(True), steps_per_epoch=int(gen.train_batches / 4),
                              validation_data=gen.generate(False), validation_steps=int(gen.val_batches / 4),
                              epochs=EPOCHS, verbose=1, callbacks=[tensorboard, checkpoint, early_stopping])

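All of these take effect through the callbacks:

EarlyStopping

(1) patience: once early stopping is triggered (for example, the loss did not decrease compared with the previous epoch), training stops after patience epochs without improvement.

(2) mode: one of 'auto', 'min', 'max'. In min mode, training stops when the monitored value stops decreasing. In max mode, training stops when the monitored value stops increasing.

ModelCheckpoint

(1) save_best_only: when set to True, only the model that performs best on the validation set is saved

(2) mode: one of 'auto', 'min', 'max'. With save_best_only=True it determines how the best model is judged. For example, when monitoring val_acc the mode should be max; when monitoring val_loss the mode should be min. In auto mode the criterion is inferred from the name of the monitored value.

(3) save_weights_only: if True, only the model weights are saved; otherwise the whole model is saved (including the architecture, configuration, and so on)

(4) period: number of epochs between checkpoints

TensorBoard

write_images: whether to visualize the model weights as images

For everything else, see the Keras Chinese documentation.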
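The patience rule is easier to follow with a small example. The following is a minimal pure-Python sketch under the assumption of min mode; `stopped_epoch` and the loss values are hypothetical names and numbers chosen for illustration, and the function only mimics the counting logic described above rather than reproducing the Keras callback.

```python
def stopped_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float('inf')
    wait = 0  # epochs since the last improvement
    for epoch, val_loss in enumerate(val_losses, start=1):
        if val_loss < best:  # min mode: improvement means a lower value
            best = val_loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


losses = [0.9, 0.8, 0.85, 0.82, 0.81, 0.7]  # made-up validation losses
print(stopped_epoch(losses, patience=2))  # 4
print(stopped_epoch(losses, patience=5))  # None
```

With patience=2 the run stops at epoch 4 and never sees the improvement at epoch 6, which is why a larger patience is often paired with save_best_only checkpoints.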

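Part 1. The Sequential model

The Sequential model is a simplified version of the functional model: the simplest linear structure, running in order from start to finish with no branches.

Basic components of the Sequential model

A typical workflow needs:

1. model.add: add layers;
2. model.compile: configure how the model is trained (optimizer, loss, and metrics for backpropagation);
3. model.fit: set the training parameters and train;
4. model evaluation;
5. model prediction.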

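1. add: adding layers (comparable to train_val.prototxt)

add(self, layer)

# For example:
model.add(Dense(32, activation='relu', input_dim=100))
model.add(Dropout(0.25))

add takes a single layer. In a Sequential model you can also call model.add(other_model) to include another model. The functional API handles this differently; see the functional model section.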

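2. compile: training configuration (comparable to the solver.prototxt file)

compile(self, optimizer, loss, metrics=None, sample_weight_mode=None, **kwargs)

where:

optimizer: string (name of a predefined optimizer) or an optimizer object; see the optimizers documentation

loss: string (name of a predefined loss function) or an objective function; see the losses documentation

metrics: list of metrics used to evaluate the network during training and testing; the typical usage is metrics=['accuracy']

sample_weight_mode: if you need timestep-wise sample weighting (a 2D weight matrix), set this to "temporal".

The default is "None", meaning sample-wise weighting (1D weights). The explanation of fit below has more on this.

kwargs: ignore this when using the TensorFlow backend; with the Theano backend, the values in kwargs are passed on to K.function

Note:

A model must be compiled before use; otherwise calling fit or evaluate raises an exception.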

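3. fit: training parameters and training (comparable to train.sh plus part of solver.prototxt)

fit(self, x, y, batch_size=32, epochs=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

This function trains the model for the given number of epochs. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has multiple inputs, x should be a list whose elements are the numpy arrays for the individual inputs
y: labels, a numpy array
batch_size: integer, the number of samples per batch for gradient descent. During training, each batch triggers one gradient update, moving the objective one optimization step forward.
epochs: integer, the number of training epochs; each epoch is one full pass over the training set.
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch
callbacks: list whose elements are keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate points during training; see the callbacks documentation
validation_split: float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set takes no part in training; at the end of each epoch the model's metrics (loss, accuracy, and so on) are evaluated on it. Note that the split is taken before shuffling, so if your data is ordered you need to shuffle it manually before setting validation_split; otherwise the validation set may be unrepresentative.
validation_data: tuple of the form (X, y) to use as the validation set. This parameter overrides validation_split.
shuffle: boolean or string, usually a boolean, indicating whether to shuffle the input samples during training. The string "batch" is a special case for handling HDF5 data; it shuffles the data in batch-sized chunks.
class_weight: dictionary mapping classes to weights, used to scale the loss function during training (training only)
sample_weight: numpy array of weights, used to scale the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight them one to one, or, for temporal data, a matrix of shape (samples, sequence_length) to give a different weight to every timestep of every sample. In that case, make sure to pass sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful for resuming a previous training run.

fit returns a History object. Its History.history attribute records how the loss and the other metrics change from epoch to epoch, including the validation metrics when a validation set is used.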
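The warning about validation_split and ordered data can be illustrated with a small pure-Python sketch. It assumes, as described above, that the last fraction of the samples is held out first and only the remaining training part is shuffled; `split_then_shuffle` is a hypothetical helper written for this illustration, not a Keras function.

```python
import random


def split_then_shuffle(samples, validation_split, seed=0):
    """Hold out the last fraction as validation data, then shuffle the rest."""
    split_at = int(len(samples) * (1.0 - validation_split))
    train, val = samples[:split_at], samples[split_at:]
    train = list(train)                 # copy so the caller's list is untouched
    random.Random(seed).shuffle(train)  # only the training part is shuffled
    return train, val


# Labels sorted by class: eight samples of class 0 followed by two of class 1
labels = [0] * 8 + [1] * 2
train, val = split_then_shuffle(labels, validation_split=0.2)
print(sorted(train), val)  # [0, 0, 0, 0, 0, 0, 0, 0] [1, 1]
```

Here the training set contains no sample of class 1 at all, and the validation set contains nothing else, which is exactly the situation that shuffling the data beforehand avoids.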

Note:

Do not confuse this with fit_generator, described later; the two receive their x/y input in different ways.

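4. evaluate: model evaluation

evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

This function computes the model's loss on some input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays
y: labels, a numpy array
batch_size: integer, same meaning as the parameter of the same name in fit
verbose: same meaning as in fit, but it can only be 0 or 1
sample_weight: numpy array, same meaning as the parameter of the same name in fit

The function returns a scalar test loss (if the model has no other metrics) or a list of scalars (if the model has additional metrics). model.metrics_names gives the meaning of each value in the list.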
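Since the return value is either a bare scalar or a list, it can help to pair it with the metric names. Below is a minimal pure-Python sketch; `label_scores` is a hypothetical helper, and the names and numbers stand in for what model.metrics_names and model.evaluate might return.

```python
def label_scores(names, values):
    """Pair metric names with evaluate() results, scalar or list."""
    if not isinstance(values, (list, tuple)):
        values = [values]  # a lone scalar means the loss only
    return dict(zip(names, values))


# Hypothetical values, for illustration only
print(label_scores(['loss', 'acc'], [0.35, 0.91]))  # {'loss': 0.35, 'acc': 0.91}
print(label_scores(['loss'], 0.35))                 # {'loss': 0.35}
```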

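Unless stated otherwise, the parameters of the functions below have the same meaning as the parameters of the same name in fit.

Unless stated otherwise, the verbose parameter of the functions below (where present) can only be 0 or 1.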

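5. predict: model prediction

predict(self, x, batch_size=32, verbose=0)
predict_classes(self, x, batch_size=32, verbose=1)
predict_proba(self, x, batch_size=32, verbose=1)

predict computes the outputs for the input data batch by batch and returns the predictions as a numpy array.

predict_classes: generates class predictions for the input data, batch by batch;

predict_proba: generates, batch by batch, the probability of the input data belonging to each class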
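The relationship between class probabilities and class predictions can be sketched in a few lines of pure Python. The helper names and the probability values below are made up for illustration; the sketch assumes the usual convention of taking the index of the largest probability for a softmax output and thresholding at 0.5 for a single sigmoid output.

```python
def to_classes(probabilities):
    """Multi-class case: index of the largest probability in each row."""
    return [max(range(len(row)), key=row.__getitem__) for row in probabilities]


def to_binary_classes(probabilities, threshold=0.5):
    """Single sigmoid output: compare against a threshold."""
    return [int(row[0] > threshold) for row in probabilities]


softmax_out = [[0.1, 0.7, 0.2], [0.8, 0.1, 0.1]]  # hypothetical predict_proba output
sigmoid_out = [[0.3], [0.9]]
print(to_classes(softmax_out))         # [1, 0]
print(to_binary_classes(sigmoid_out))  # [0, 1]
```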

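6. on_batch: running and checking a single batch

train_on_batch(self, x, y, class_weight=None, sample_weight=None)
test_on_batch(self, x, y, sample_weight=None)
predict_on_batch(self, x)

train_on_batch: performs one parameter update on a single batch of data. It returns the training loss as a scalar or a list of scalars, the same as evaluate.
test_on_batch: evaluates the model on a single batch of samples; the return value is the same as for evaluate
predict_on_batch: runs the model on a single batch of samples and returns the model's predictions for that batch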

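7. fit_generator

# Uses a Python generator to produce the data batch by batch and trains on it.
# The generator runs in parallel with the model for efficiency.
# For example, this lets you do real-time data augmentation on the CPU while the model trains on the GPU
# Reference: http://keras-cn.readthedocs.io/en/latest/models/sequential/

With this function, image classification training becomes very simple.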

fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)

# Example:
def generate_arrays_from_file(path):
    while True:  # the generator is expected to loop over the data indefinitely
        with open(path) as f:
            for line in f:
                # create Numpy arrays of input data
                # and labels, from each line in the file
                x, y = process_line(line)  # process_line is a user-defined helper
                yield (x, y)

model.fit_generator(generate_arrays_from_file('/my_file.txt'),
                    steps_per_epoch=10000, epochs=10)

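Two related helper functions:

evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)
predict_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False, verbose=0)

evaluate_generator: evaluates the model using a generator as the data source. The generator should return data of the same kind that test_on_batch accepts. The parameters have the same meaning as the parameters of the same name in fit_generator; steps is the number of batches to draw from the generator.

predict_generator: generates predictions using a generator as the data source. The generator should return data of the same kind that predict_on_batch accepts. The parameters have the same meaning as the parameters of the same name in fit_generator; steps is the number of batches to draw from the generator.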
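The interplay between an endless generator and the steps arguments can be sketched without Keras. In this minimal pure-Python example, `batch_generator` is a hypothetical generator in the same spirit as the file-based one above, and the step count is derived from the dataset size and batch size so that one "epoch" covers every sample once.

```python
import math


def batch_generator(samples, batch_size):
    """Yield batches forever, restarting from the beginning after each pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


samples = list(range(10))  # toy dataset
batch_size = 4
steps_per_epoch = math.ceil(len(samples) / batch_size)  # batches needed for one pass

gen = batch_generator(samples, batch_size)
one_epoch = [next(gen) for _ in range(steps_per_epoch)]
print(steps_per_epoch, one_epoch)  # 3 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because the generator never terminates on its own, the steps value is what tells the training or evaluation loop when an epoch is over.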

Example 1: simple binary classification

For a single-input model with 2 classes (binary classification):

from keras.models import Sequential
from keras.layers import Dense, Activation

# Build the model
model= Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
# Dense(32) is a fully-connected layer with 32 hidden units.
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])

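where:

Sequential() instantiates the model class;

Dense is a fully connected layer. Here it has 32 hidden units, followed by a relu activation, and takes 100-dimensional input;

model.add appends a new layer to the model;

compile sets the training options, much like solver.prototxt in Caffe.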

# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))

# Train the model, iterating on the data in batches of 32 samples
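model.fit(data, labels, epochs=10, batch_size=32)

I once ran into the error below because of a version mismatch: Keras 1.2 uses nb_epoch, whereas Keras 2.0 uses epochs=10. Passing epochs to Keras 1.2 fails with:

error:
    TypeError: Received unknown keyword arguments: {'epochs': 10}

where:

One epoch is one full pass over the training set, that is, (number of samples / batch_size) iterations. Ten epochs means going through the training set ten times.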
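The arithmetic for this example, worked out as a small sketch. The numbers come from the dummy data above (1000 samples, batches of 32, 10 epochs); the variable names are chosen here for illustration, and the last, smaller batch is assumed to count as an iteration too.

```python
import math

num_samples = 1000  # rows in the dummy data
batch_size = 32
epochs = 10

iterations_per_epoch = math.ceil(num_samples / batch_size)  # 31 full batches plus one partial batch
total_updates = iterations_per_epoch * epochs
print(iterations_per_epoch, total_updates)  # 32 320
```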

Example 2: multi-class classification with a VGG-like convolutional network

import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD

# Generate dummy data
x_train = np.random.random((100, 100, 100, 3))
# 100 images, each 100x100x3
y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
# one-hot labels of shape (100, 10)
x_test = np.random.random((20, 100, 100, 3))
y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)
# one-hot labels of shape (20, 10)

model = Sequential()
# input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
# this applies 32 convolution filters of size 3x3 each.
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
model.add(Conv2D(32, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))

sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)

model.fit(x_train, y_train, batch_size=32, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=32)

This is a standard Sequential network trained on labeled data.
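It can be instructive to trace how the spatial size shrinks through the network above. The sketch below does the arithmetic in pure Python, assuming the default 'valid' padding with stride 1 for the 3x3 convolutions and non-overlapping 2x2 max pooling; `conv_valid` and `pool` are helper names made up for this illustration.

```python
def conv_valid(size, kernel=3):
    """Output size of a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output size of non-overlapping max pooling."""
    return size // window


size = 100
size = conv_valid(conv_valid(size))  # two 3x3 convolutions: 100 -> 98 -> 96
size = pool(size)                    # 96 -> 48
size = conv_valid(conv_valid(size))  # 48 -> 46 -> 44
size = pool(size)                    # 44 -> 22
flattened = size * size * 64         # 64 filters in the last convolution
print(size, flattened)               # 22 30976
```

So the Flatten layer hands a 30976-dimensional vector to Dense(256), which is where most of this model's parameters live.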

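Note:

One very important point for a beginner like me: what does this step do?

keras.utils.to_categorical

In multi-class problems in particular, I used to assume the labels could be passed as a single column of shape (100,). With the categorical_crossentropy loss, Keras does not accept that, so this extra step is needed to convert the labels into the one-hot format Keras expects (shape (100, 10) in this example).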
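What the conversion produces can be shown with a small pure-Python sketch. `one_hot` is a hypothetical stand-in written for illustration; it mirrors the idea of to_categorical (one row per label, with a 1 in the column of the class) but is not the Keras implementation and returns plain lists rather than a numpy array.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows


print(one_hot([2, 0, 1], num_classes=3))
# [[0.0, 0.0, 1.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
```

Each row has the same width as the final softmax layer, which is what lets the loss compare predictions and labels element by element.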

Example 3: sequence classification with an LSTM

from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import Embedding
from keras.layers import LSTM

model = Sequential()
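# max_features is the vocabulary size; like x_train and y_train, it is assumed to be defined elsewhere
model.add(Embedding(max_features, output_dim=256))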
model.add(LSTM(128))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

model.fit(x_train, y_train, batch_size=16, epochs=10)
score = model.evaluate(x_test, y_test, batch_size=16)

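Part 2. The functional Model

From the Keras Chinese documentation: http://keras-cn.readthedocs.io/en/latest/

The functional model is more complex than the Sequential model but more capable: it can take several inputs, at the same time or at different stages, and produce outputs at different stages.

In short, whenever your model is not a single straight path like VGG, or it needs more than one output, you should choose the functional model.

Differences:

The code is written in a completely different style.

Basic attributes and training workflow of the functional model

A typical workflow needs:

1. model.layers: the layer information;

2. model.compile: configure how the model is trained (optimizer, loss, and metrics for backpropagation);

3. model.fit: set the training parameters and train;

4. evaluate: model evaluation;

5. predict: model prediction.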

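1. Commonly used Model attributes

model.layers: the layers that make up the model graph
model.inputs: list of the model's input tensors
model.outputs: list of the model's output tensors

2. compile: training configuration (comparable to solver.prototxt)

compile(self, optimizer, loss, metrics=None, loss_weights=None, sample_weight_mode=None)

This function compiles the model for training. Its parameters are:

optimizer: the optimizer, given as the name of a predefined optimizer or as an optimizer object; see the optimizers documentation

loss: the loss function, given as the name of a predefined loss function or as an objective function; see the losses documentation

metrics: list of metrics used to evaluate the model during training and testing; the typical usage is metrics=['accuracy']. To specify different metrics for different outputs of a multi-output model, pass a dictionary instead, for example metrics={'output_a': 'accuracy'}

sample_weight_mode: if you need timestep-wise sample weighting (a 2D weight matrix), set this to "temporal". The default is "None", meaning sample-wise weighting (1D weights).

If the model has multiple outputs, you can pass a dictionary or list of sample_weight_mode values to this parameter. The explanation of fit below has more on this.

[Tip] If you only load a model and call predict on it, you do not need to compile it. In Keras, compile mainly configures the loss function and the optimizer, which serve training. predict compiles the symbolic function it needs internally (by calling _make_predict_function).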

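3. fit: training parameters and training

fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

This function trains the model. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has multiple inputs, x should be a list whose elements are the numpy arrays for the individual inputs. If every input of the model has a name, you can pass a dictionary mapping input names to their data.
y: labels, a numpy array. If the model has multiple outputs, you can pass a list of numpy arrays. If the model's outputs have names, you can pass a dictionary mapping output names to their labels.
batch_size: integer, the number of samples per batch for gradient descent. During training, each batch triggers one gradient update, moving the objective one optimization step forward.
epochs: integer, the number of training epochs; the training data is traversed epochs times. (In Keras 1.x this parameter was called nb_epoch; variables starting with nb meant "number of".)
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch
callbacks: list whose elements are keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate points during training; see the callbacks documentation
validation_split: float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set takes no part in training; at the end of each epoch the model's metrics (loss, accuracy, and so on) are evaluated on it. Note that the split is taken before shuffling, so if your data is ordered you need to shuffle it manually before setting validation_split; otherwise the validation set may be unrepresentative.
validation_data: tuple of the form (X, y) or (X, y, sample_weights) to use as the validation set. This parameter overrides validation_split.
shuffle: boolean, whether to shuffle the input samples before each epoch during training.
class_weight: dictionary mapping classes to weights, used to scale the loss function during training (training only). When the training data is imbalanced (some classes have very few samples), this makes the loss function pay more attention to the under-represented classes.
sample_weight: numpy array of weights, used to scale the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight them one to one, or, for temporal data, a matrix of shape (samples, sequence_length) to give a different weight to every timestep of every sample. In that case, make sure to pass sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful for resuming a previous training run.

An error is raised when the input data does not match what the model expects.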
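The effect of class_weight on the loss can be sketched with plain arithmetic. In the minimal pure-Python example below, every sample's loss is multiplied by the weight of its class before averaging; the helper name, the per-sample losses, and the weights are hypothetical, and the exact normalization inside Keras may differ from this simplified mean.

```python
def weighted_mean_loss(losses, labels, class_weight):
    """Scale each sample's loss by the weight of its class, then average."""
    weights = [class_weight[label] for label in labels]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


losses = [0.2, 0.2, 0.2, 1.0]  # made-up per-sample losses
labels = [0, 0, 0, 1]          # class 1 is the rare class

unweighted = weighted_mean_loss(losses, labels, {0: 1.0, 1: 1.0})
weighted = weighted_mean_loss(losses, labels, {0: 1.0, 1: 3.0})
print(round(unweighted, 3), round(weighted, 3))  # 0.4 0.9
```

With the rare class weighted three times as heavily, its single sample contributes most of the total loss, so the gradient updates are pulled toward getting that class right.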

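4. evaluate: model evaluation

evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

This function computes the model's loss on some input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays
y: labels, a numpy array
batch_size: integer, same meaning as the parameter of the same name in fit
verbose: same meaning as in fit, but it can only be 0 or 1
sample_weight: numpy array, same meaning as the parameter of the same name in fit

Unless stated otherwise, the parameters of the functions below have the same meaning as the parameters of the same name in fit.

Unless stated otherwise, the verbose parameter of the functions below (where present) can only be 0 or 1.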

5. predict: model prediction

predict(self, x, batch_size=32, verbose=0)

This function computes the outputs for the input data batch by batch and returns the predictions as a numpy array.

Checking the model: on_batch

train_on_batch(self, x, y, class_weight=None, sample_weight=None)
test_on_batch(self, x, y, sample_weight=None)
predict_on_batch(self, x)

train_on_batch: performs one parameter update on a single batch of data. It returns the training loss as a scalar or a list of scalars, the same as evaluate.

test_on_batch: evaluates the model on a single batch of samples; the return value is the same as for evaluate;

predict_on_batch: runs the model on a single batch of samples and returns the model's predictions for that batch

The _generator methods

fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)
evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)

Example 1: a simple fully connected network

from keras.layers import Input, Dense
from keras.models import Model

# This returns a tensor
inputs = Input(shape=(784,))

# a layer instance is callable on a tensor, and returns a tensor
x = Dense(64, activation='relu')(inputs)
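# takes inputs as input and returns x
# the trailing (inputs) is the tensor the layer is applied to
x = Dense(64, activation='relu')(x)
# takes x as input and returns a new x
predictions = Dense(10, activation='softmax')(x)
# takes x as input and returns the class probabilities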

# This creates a model that includes
# the Input layer and three Dense layers
model = Model(inputs=inputs, outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels)  # starts training

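where:

The structure is clearly very different from the Sequential model. In x = Dense(64, activation='relu')(inputs), (inputs) is the input tensor and x is the output tensor.

model = Model(inputs=inputs, outputs=predictions) is the signature line of the functional API. Both arguments also accept lists, so a model can, for example, take two inputs and produce two outputs.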

Example 2: video processing

x = Input(shape=(784,))
# This works, and returns the 10-way softmax we defined above.
y = model(x)
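# model holds its weights; calling it on x reuses them and returns the result, which is useful for fine-tuning

# Classification applied to video or real-time processing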
from keras.layers import TimeDistributed

# Input tensor for sequences of 20 timesteps,
# each containing a 784-dimensional vector
input_sequences = Input(shape=(20, 784))
# 20 timesteps, each a 784-dimensional vector

# This applies our previous model to every timestep in the input sequences.
# the output of the previous model was a 10-way softmax,
# so the output of the layer below will be a sequence of 20 vectors of size 10.
processed_sequences = TimeDistributed(model)(input_sequences)
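# model has already been trained

where:

model has already been trained and is reused here for transfer learning;

TimeDistributed applies the model to every timestep of a sequence, which makes it suitable for frame-by-frame (real-time) prediction;

In TimeDistributed(model)(input_sequences), input_sequences is the sequence input and model is the trained model.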
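Conceptually, wrapping a model in TimeDistributed means "apply the same model, with the same weights, to each timestep independently". The pure-Python sketch below illustrates only that idea: `toy_model` and `time_distributed` are hypothetical stand-ins written for this example and have nothing to do with how Keras implements the wrapper.

```python
def toy_model(vector):
    """Stand-in for a trained model: maps one vector to one output."""
    return sum(vector)


def time_distributed(model):
    """Wrap `model` so it is applied independently to every timestep."""
    def apply(sequence):
        return [model(step) for step in sequence]
    return apply


sequence = [[1, 2], [3, 4], [5, 6]]  # 3 timesteps, 2 features each
print(time_distributed(toy_model)(sequence))  # [3, 7, 11]
```

The output keeps the time axis: one result per timestep, just as the Keras example yields 20 vectors of size 10 for 20 input frames.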

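Example 3: two inputs, two outputs: LSTM sequence prediction

This is a good example: it shows that the strength of Model lies in its flexibility, which gives the programmer a great deal of freedom.

Inputs:

the news text; the time associated with the news text

Outputs:

a prediction from the news text alone; a prediction from the news text combined with the associated time

[Figure: diagram of the two-input, two-output model]

Model 1: an LSTM on the news text only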

from keras.layers import Input, Embedding, LSTM, Dense
from keras.models import Model

# Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
# Note that we can name any layer by passing it a "name" argument.
main_input = Input(shape=(100,), dtype='int32', name='main_input')
# a sequence of 100 word indices

# This embedding layer will encode the input sequence
# into a sequence of dense 512-dimensional vectors.
x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
# Embedding layer: encodes each of the 100 word indices as a 512-dimensional vector; 10000 is the vocabulary size


# A LSTM will transform the vector sequence into a single vector,
# containing information about the entire sequence
lstm_out = LSTM(32)(x)
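# 32 is the number of LSTM units, i.e. the dimensionality of the output vector

# Next, we insert an auxiliary loss so that the LSTM and Embedding layers can train smoothly even when the main loss is high.

auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
# Model 1: a prediction made from the sequence above only
# After that, we concatenate the LSTM output with the additional input data and feed the result into the rest of the model: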

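Combined model: news text plus the associated time data

# Model 2: the combined model
import keras

auxiliary_input = Input(shape=(5,), name='aux_input')  # an additional 5-dimensional input
x = keras.layers.concatenate([lstm_out, auxiliary_input])   # concatenate the LSTM output with the auxiliary input


# We stack a deep densely-connected network on top
# (the body of the combined model)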
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
x = Dense(64, activation='relu')(x)
# And finally we add the main logistic regression layer
main_output = Dense(1, activation='sigmoid', name='main_output')(x)


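# Finally, we define the complete model with 2 inputs and 2 outputs:
model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
# The model is now defined; the next step is to compile it.
# We give the auxiliary loss a weight of 0.2. The keyword arguments loss_weights and loss can be used to set different loss weights or loss functions for the different outputs.
# Both arguments accept a Python list or dictionary. Here we pass a single loss function to loss, so it is applied to all outputs.

where: Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output]) is the core of the example,

with two inputs and two outputs.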

# Training option 1: two outputs, one shared loss function
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[1., 0.2])
# Once compiled, we train the model by passing the training data and the targets:

model.fit([headline_data, additional_data], [labels, labels],
          epochs=50, batch_size=32)

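# Training option 2: two outputs, two loss functions
# Because the inputs and outputs are named (we passed the "name" argument when defining them), we can also compile and train the model like this: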
model.compile(optimizer='rmsprop',
              loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
              loss_weights={'main_output': 1., 'aux_output': 0.2})

# And trained it via:
model.fit({'main_input': headline_data, 'aux_input': additional_data},
          {'main_output': labels, 'aux_output': labels},
          epochs=50, batch_size=32)

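Because the model has two inputs and two outputs, each output can be given its own training settings (loss function and loss weight).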
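The role of loss_weights can be made concrete with a small calculation: the quantity being minimized is the weighted sum of the per-output losses. The sketch below uses the same weights as the example (1.0 for main_output, 0.2 for aux_output); the loss values themselves and the `total_loss` helper are made up for illustration.

```python
def total_loss(losses, loss_weights):
    """Weighted sum of the per-output losses."""
    return sum(loss_weights[name] * value for name, value in losses.items())


losses = {'main_output': 0.5, 'aux_output': 1.0}        # hypothetical loss values
loss_weights = {'main_output': 1.0, 'aux_output': 0.2}  # weights from the example
combined = total_loss(losses, loss_weights)
print(round(combined, 3))  # 0.7
```

Even though the auxiliary loss is twice as large here, it contributes less than half as much as the main loss, so it guides the LSTM without dominating training.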

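Example 4: shared layers (correspondence and similarity)

A single layer is applied to two branches, so both branches use the same weights.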

import keras
from keras.layers import Input, LSTM, Dense
from keras.models import Model

tweet_a = Input(shape=(140, 256))
tweet_b = Input(shape=(140, 256))
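# To share a layer across different inputs, instantiate the layer once and then call it as many times as needed
# Each input is 140 words, each word a 256-dimensional word vector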

# This layer can take as input a matrix
# and will return a vector of size 64
shared_lstm = LSTM(64)
# returns a vector of size 64

# When we reuse the same layer instance
# multiple times, the weights of the layer
# are also being reused
# (it is effectively *the same* layer)
encoded_a = shared_lstm(tweet_a)
encoded_b = shared_lstm(tweet_b)

# We can then concatenate the two vectors:
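# Join the two results.
# axis=-1 concatenates along the last (feature) axis, so two vectors of size 64 become one vector of size 128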
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)

# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# The 1 is the number of output units: a single sigmoid probability

# We define a trainable model linking the
# tweet inputs to the predictions
model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)

model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)
# Train the model; it can then be used for prediction
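The key point of layer sharing is that one layer object means one set of weights, no matter how many times it is called. The toy pure-Python sketch below illustrates just that point; `ToyLayer` is a hypothetical class with a single scalar weight and is in no way a Keras layer.

```python
class ToyLayer:
    """Stand-in for a layer: holds one weight and can be called on inputs."""

    def __init__(self, weight):
        self.weight = weight

    def __call__(self, x):
        return self.weight * x


shared = ToyLayer(weight=2.0)
out_a = shared(1.0)  # first call
out_b = shared(3.0)  # second call reuses the same weight

shared.weight = 10.0  # a "training update" changes the single shared weight
print(out_a, out_b, shared(1.0), shared(3.0))  # 2.0 6.0 10.0 30.0
```

After the update, both call sites see the new weight. Creating two separate ToyLayer objects instead would give two independent weights, which is the difference between LSTM(64) called twice and two LSTM(64) instances.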

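Example 5: retrieving the outputs of layer nodes

from keras.layers import Input, LSTM, Conv2D

# 1. A layer with a single node
a = Input(shape=(140, 256))
lstm = LSTM(32)
encoded_a = lstm(a)
assert lstm.output == encoded_a
# lstm.output returns the layer's output tensor, which is encoded_a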

# 2. A layer with multiple nodes
a = Input(shape=(140, 256))
b = Input(shape=(140, 256))

lstm = LSTM(32)
encoded_a = lstm(a)
encoded_b = lstm(b)

assert lstm.get_output_at(0) == encoded_a
assert lstm.get_output_at(1) == encoded_b

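# 3. Nodes of a layer applied to images
# The same holds for input_shape and output_shape: if a layer has only one node,
# or all of its nodes have the same input or output shape,
# then input_shape and output_shape are unambiguous and return a single value.
# However, if you apply the same Conv2D layer to data of shape (3, 32, 32)
# and then to data of shape (3, 64, 64), the layer has several input and output shapes,
# and you need to pass the node index explicitly to say which one you want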
a = Input(shape=(3, 32, 32))
b = Input(shape=(3, 64, 64))

conv = Conv2D(16, (3, 3), padding='same')
conved_a = conv(a)

# Only one input so far, the following will work:
assert conv.input_shape == (None, 3, 32, 32)

conved_b = conv(b)
# now the `.input_shape` property wouldn't work, but this does:
assert conv.get_input_shape_at(0) == (None, 3, 32, 32)
assert conv.get_input_shape_at(1) == (None, 3, 64, 64)

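Example 6: a visual question answering model

# This model maps a natural-language question and an image to feature vectors,
# concatenates the two, and trains a logistic regression layer on top to pick one answer from a set of candidates.
import keras
from keras.layers import Conv2D, MaxPooling2D, Flatten
from keras.layers import Input, LSTM, Embedding, Dense
from keras.models import Model, Sequential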

# First, let's define a vision model using a Sequential model.
# This model will encode an image into a vector.
vision_model = Sequential()
# input_shape=(3, 224, 224) assumes the channels_first image data format
vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(3, 224, 224)))
vision_model.add(Conv2D(64, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(128, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(Conv2D(256, (3, 3), activation='relu'))
vision_model.add(MaxPooling2D((2, 2)))
vision_model.add(Flatten())

# Now let's get a tensor with the output of our vision model:
image_input = Input(shape=(3, 224, 224))
encoded_image = vision_model(image_input)

# Next, let's define a language model to encode the question into a vector.
# Each question will be at most 100 word long,
# and we will index words as integers from 1 to 9999.
question_input = Input(shape=(100,), dtype='int32')
embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
encoded_question = LSTM(256)(embedded_question)

# Let's concatenate the question vector and the image vector:
merged = keras.layers.concatenate([encoded_question, encoded_image])

# And let's train a logistic regression over 1000 words on top:
output = Dense(1000, activation='softmax')(merged)

# This is our final model:
vqa_model = Model(inputs=[image_input, question_input], outputs=output)

# The next stage would be training this model on actual data.

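Extension 1: loading no-top weights for fine-tuning

If you need to load weights into a different architecture that shares some layers, for example for fine-tuning or transfer learning, you can load the weights by layer name:

model.load_weights('my_model_weights.h5', by_name=True)

For example:

Suppose the original model is:

model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1"))
model.add(Dense(3, name="dense_2"))
...
model.save_weights(fname)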

# new model
model = Sequential()
model.add(Dense(2, input_dim=3, name="dense_1"))  # will be loaded
model.add(Dense(10, name="new_dense"))  # will not be loaded

# load weights from first model; will only affect the first layer, dense_1.
model.load_weights(fname, by_name=True)
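The name-matching rule can be illustrated with plain dictionaries. In this minimal pure-Python sketch, each "model" is a dict from layer name to a list of weights; `load_by_name` and the weight values are hypothetical and only mimic the matching behaviour described above, not the HDF5 loading that Keras performs.

```python
def load_by_name(model_weights, saved_weights):
    """Copy saved weights into layers whose names match; leave the rest alone."""
    loaded = []
    for name in model_weights:
        if name in saved_weights:
            model_weights[name] = saved_weights[name]
            loaded.append(name)
    return loaded


saved = {'dense_1': [0.1, 0.2], 'dense_2': [0.3]}        # weights of the first model
new_model = {'dense_1': [0.0, 0.0], 'new_dense': [9.9]}  # freshly initialized new model

print(load_by_name(new_model, saved))  # ['dense_1']
print(new_model)                       # {'dense_1': [0.1, 0.2], 'new_dense': [9.9]}
```

Only dense_1 is overwritten; new_dense keeps its fresh initialization and dense_2 from the saved file is simply ignored, which mirrors the "will be loaded" and "will not be loaded" comments in the example.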

Reposted from:

https://blog.csdn.net/sinat_26917383/article/details/72857454?locationNum=1&fps=1
