
【TensorFlow】Classifying the CIFAR-10 Dataset with a Convolutional Neural Network

I. Introduction to CIFAR-10

The CIFAR-10 dataset consists of 60,000 32*32 color images in 10 classes, with 6,000 images per class. 50,000 of them form the training set and 10,000 the test set.

The training set is split evenly into 5 batches of 10,000 images each. Download link: http://www.cs.toronto.edu/~kriz/cifar.html

Since this article uses Python 3.6, download the "CIFAR-10 python version".


The downloaded files:

[Figure: contents of the extracted cifar-10-batches-py directory]

II. Overall Convolution Pipeline

  1. The raw training set: 50,000 3-channel (RGB) images of size 32*32.
  2. To reduce computation, each image is preprocessed down to a single 24*24 grayscale channel; with better hardware this step is optional.
  3. The input (?, 24, 24, 1) passes through conv1 + pool1, producing (?, 12, 12, 64).
  4. The output of step 3 passes through conv2 + pool2, producing (?, 6, 6, 64).
  5. The (?, 6, 6, 64) output is flattened into a one-dimensional array of shape (?, 6*6*64).
  6. The output of step 5 passes through FC1, producing (?, 1024).
  7. The output of step 6 passes through FC2, producing (?, 10).
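The shape bookkeeping in steps 3 to 7 can be checked with a few lines of arithmetic (hypothetical helpers, not part of the original code): a 'SAME'-padded, stride-1 convolution keeps the spatial size, and each 2x2, stride-2 pool halves it.

```python
def conv_same(h, w, c_out):
    # 'SAME' padding, stride 1: spatial dims unchanged, channels become c_out
    return h, w, c_out

def pool2(h, w, c):
    # 2x2 max pool, stride 2, 'SAME' padding: ceiling division of each spatial dim
    return -(-h // 2), -(-w // 2), c

shape = (24, 24, 1)                                  # preprocessed grayscale input
shape = pool2(*conv_same(shape[0], shape[1], 64))    # conv1 + pool1
assert shape == (12, 12, 64)
shape = pool2(*conv_same(shape[0], shape[1], 64))    # conv2 + pool2
assert shape == (6, 6, 64)
flat = shape[0] * shape[1] * shape[2]                # flattened input for FC1
assert flat == 6 * 6 * 64 == 2304
print(shape, flat)
```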

III. Code Implementation

import pickle

def unpickle(file):
    # Load one CIFAR-10 batch file into a dict with 'data' and 'labels' keys
    with open(file, 'rb') as fo:
        batch = pickle.load(fo, encoding='latin1')
    return batch
           

1. Image preprocessing:

# Image preprocessing
import numpy as np

def clean(data):
    # Preprocess 32*32 RGB --> 24*24 grayscale, which speeds up training
    imgs = data.reshape(data.shape[0], 3, 32, 32)
    # Average the 3 color channels to get grayscale
    grayscale_imgs = imgs.mean(1)
    # Center-crop from 32*32 down to 24*24
    cropped_imgs = grayscale_imgs[:, 4:28, 4:28]
    img_data = cropped_imgs.reshape(data.shape[0], -1)
    img_size = np.shape(img_data)[1]
    # Per-image standardization: subtract the mean, divide by the
    # standard deviation (lower-bounded to avoid division by ~0)
    means = np.mean(img_data, axis=1)
    meansT = means.reshape(len(means), 1)
    stds = np.std(img_data, axis=1)
    stdsT = stds.reshape(len(stds), 1)
    adj_stds = np.maximum(stdsT, 1.0 / np.sqrt(img_size))
    normalized = (img_data - meansT) / adj_stds
    return normalized
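The standardization at the end of clean mirrors what tf.image.per_image_standardization does: each image ends up with zero mean and, unless it is nearly constant, unit variance. A minimal numpy check of just that step (random data, not real CIFAR images):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((1, 24 * 24))                  # one fake flattened grayscale image
mean = img.mean(axis=1, keepdims=True)
# Lower-bound the std by 1/sqrt(number of pixels), as in clean()
std = np.maximum(img.std(axis=1, keepdims=True), 1.0 / np.sqrt(img.shape[1]))
norm = (img - mean) / std
assert abs(norm.mean()) < 1e-9                  # zero mean
assert abs(norm.std() - 1.0) < 1e-6             # unit variance (std above the floor)
```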
           

2. Reading the data:

# Read the data
def read_data(directory):
    names = unpickle('{}/batches.meta'.format(directory))['label_names']
    print('names', names)
    
    data, labels = [], []
    # Five training batches: data_batch_1 ... data_batch_5
    for i in range(1, 6):
        filename = '{}/data_batch_{}'.format(directory, i)
        batch_data = unpickle(filename)
        if len(data) > 0:
            # Append this batch's data and labels to what we have so far
            data = np.vstack((data, batch_data['data']))
            labels = np.hstack((labels, batch_data['labels']))
        else:
            data = batch_data['data']
            labels = batch_data['labels']
            
    print(np.shape(data), np.shape(labels))
    
    data = clean(data)
    data = data.astype(np.float32)
    return names, data, labels
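The batch-stacking pattern above can be tried with dummy batches (hypothetical shapes, no dataset files needed): np.vstack stacks the image rows, while np.hstack concatenates the 1-D label lists.

```python
import numpy as np

# Two fake batches standing in for data_batch_1 and data_batch_2
batches = [
    {'data': np.zeros((3, 3072)), 'labels': [0, 1, 2]},
    {'data': np.ones((2, 3072)), 'labels': [3, 4]},
]

data, labels = [], []
for batch in batches:
    if len(data) > 0:
        data = np.vstack((data, batch['data']))       # rows pile up: (3+2, 3072)
        labels = np.hstack((labels, batch['labels'])) # labels concatenate
    else:
        data = batch['data']
        labels = batch['labels']

assert data.shape == (5, 3072)
assert list(labels) == [0, 1, 2, 3, 4]
```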
           

3. Displaying the data:

%matplotlib inline
import matplotlib.pyplot as plt
import random
# For what random.seed() / random.sample() do, see https://blog.csdn.net/duanlianvip/article/details/95214866
random.seed(1)

names, data, labels = read_data('./cifar-10-batches-py')

# Display the data
def show_some_examples(names, data, labels):
    # Create a figure window
    plt.figure()
    # 4 rows by 4 columns of subplots
    rows, cols = 4, 4
    # Sample 16 random image indices from the 50,000 training images
    random_idxs = random.sample(range(len(data)), rows * cols)
    for i in range(rows * cols):
        # Select subplot i+1 in the 4x4 grid
        plt.subplot(rows, cols, i + 1)
        j = random_idxs[i]
        # Title each image with its class name
        plt.title(names[labels[j]])
        # Reshape the flattened 576 values back into a 24x24 image
        img = np.reshape(data[j, :], (24, 24))
        # Draw in grayscale
        plt.imshow(img, cmap='Greys_r')
        # Hide the axes
        plt.axis('off')
    # Auto-adjust the subplot layout
    plt.tight_layout()
    plt.savefig('cifar_examples.png')

show_some_examples(names, data, labels)
           

Output:

[Figure: a 4x4 grid of randomly sampled 24x24 grayscale training images, saved as cifar_examples.png]

4. Inspecting the intermediate convolution, ReLU, and pooling results:

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

names, data, labels = read_data('./cifar-10-batches-py')

# Visualize convolution outputs
def show_conv_results(data, filename=None):
    plt.figure()
    # 4 rows by 8 columns of subplots
    rows, cols = 4, 8
    for i in range(np.shape(data)[3]):
        # Feature map i of the first image: all height x width values
        img = data[0, :, :, i]
        plt.subplot(rows, cols, i + 1)
        plt.imshow(img, cmap='Greys_r', interpolation='none')
        plt.axis('off')
    if filename:
        plt.savefig(filename)
    else:
        plt.show()

# Visualize the weight filters
def show_weights(W, filename=None):
    plt.figure()
    # 4 rows by 8 columns of subplots
    rows, cols = 4, 8
    for i in range(np.shape(W)[3]):
        # 5x5 filter i, shown for input channel 0
        img = W[:, :, 0, i]
        plt.subplot(rows, cols, i + 1)
        plt.imshow(img, cmap='Greys_r', interpolation='none')
        plt.axis('off')
    if filename:
        plt.savefig(filename)
    else:
        plt.show()
           

Output:

names ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
(50000, 3072) (50000,)
           

4.1. View the image at index 4:

raw_data = data[4, :]
raw_img = np.reshape(raw_data, (24, 24))
plt.figure()
plt.imshow(raw_img, cmap='Greys_r')
plt.show()
           

Output:

[Figure: the 24x24 grayscale image at index 4]

4.2. Trace the image at index 4 through the intermediate processing steps

# Reshape the flat 576-vector into NHWC form: (1, 24, 24, 1)
x = tf.reshape(raw_data, shape=[-1, 24, 24, 1])
# 32 random 5x5 filters over the single input channel, plus biases
W = tf.Variable(tf.random_normal([5, 5, 1, 32]))
b = tf.Variable(tf.random_normal([32]))

conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
conv_with_b = tf.nn.bias_add(conv, b)
conv_out = tf.nn.relu(conv_with_b)

# 2x2 max pooling with stride 2 halves the spatial size: 24x24 --> 12x12
k = 2
maxpool = tf.nn.max_pool(conv_out, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')
           
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    
    W_val = sess.run(W)
    print('weights:')
    show_weights(W_val)
    
    conv_val = sess.run(conv)
    print('convolution results:')
    print(np.shape(conv_val))
    show_conv_results(conv_val)
    
    conv_out_val = sess.run(conv_out)
    print('convolution with bias and relu:')
    print(np.shape(conv_out_val))
    show_conv_results(conv_out_val)
    
    maxpool_val = sess.run(maxpool)
    print('maxpool after all the convolutions:')
    print(np.shape(maxpool_val))
    show_conv_results(maxpool_val)
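The 2x2, stride-2 max pooling used above can be sketched in plain numpy (a hand-rolled stand-in for tf.nn.max_pool, shown on a toy 4x4 feature map):

```python
import numpy as np

def max_pool_2x2(feature_map):
    # 2x2 max pooling with stride 2 on a single-channel (H, W) map;
    # assumes H and W are even, as with the 24x24 inputs here
    h, w = feature_map.shape
    return feature_map.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fm = np.array([[1, 2, 0, 1],
               [3, 0, 1, 2],
               [5, 1, 9, 0],
               [0, 2, 4, 4]], dtype=float)
pooled = max_pool_2x2(fm)
print(pooled)   # each non-overlapping 2x2 window collapses to its maximum
assert pooled.tolist() == [[3.0, 2.0], [5.0, 9.0]]
```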
           

Visualization of the 32 W matrices:

[Figure: the 32 random 5x5 filters]

The image at index 4 after convolution:

[Figure: the 32 convolution feature maps]

The image at index 4 after convolution, bias, and ReLU:

[Figure: the 32 feature maps after bias and ReLU]

The image at index 4 after convolution, bias, ReLU, and pooling:

[Figure: the 32 pooled 12x12 feature maps]

5. Building the full network model

# Build the full network model
x = tf.placeholder(tf.float32, [None, 24 * 24])
y = tf.placeholder(tf.float32, [None, len(names)])
# conv1: 5x5 filters, 1 input channel, 64 output channels
W1 = tf.Variable(tf.random_normal([5, 5, 1, 64]))
b1 = tf.Variable(tf.random_normal([64]))
# conv2: 5x5 filters, 64 input channels, 64 output channels
W2 = tf.Variable(tf.random_normal([5, 5, 64, 64]))
b2 = tf.Variable(tf.random_normal([64]))
# FC1: the flattened 6*6*64 pooled features map to 1024 units
W3 = tf.Variable(tf.random_normal([6*6*64, 1024]))
b3 = tf.Variable(tf.random_normal([1024]))
# FC2 (output layer): 1024 units map to the 10 class scores
W_out = tf.Variable(tf.random_normal([1024, len(names)]))
b_out = tf.Variable(tf.random_normal([len(names)]))
           
def conv_layer(x, W, b):
    conv = tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    conv_with_b = tf.nn.bias_add(conv, b)
    conv_out = tf.nn.relu(conv_with_b)
    return conv_out

def maxpool_layer(conv, k=2):
    return tf.nn.max_pool(conv, ksize=[1, k, k, 1], strides=[1, k, k, 1], padding='SAME')
           
def model():
    x_reshaped = tf.reshape(x, shape=[-1, 24, 24, 1])
    
    conv_out1 = conv_layer(x_reshaped, W1, b1)
    maxpool_out1 = maxpool_layer(conv_out1)
    # Local response normalization (LRN) creates competition among nearby
    # neuron activations: relatively large responses become larger and smaller
    # ones are suppressed, which improves the model's ability to generalize.
    # Details: http://blog.csdn.net/banana1006034246/article/details/75204013
    norm1 = tf.nn.lrn(maxpool_out1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
    conv_out2 = conv_layer(norm1, W2, b2)
    norm2 = tf.nn.lrn(conv_out2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
    maxpool_out2 = maxpool_layer(norm2)
    
    maxpool_reshaped = tf.reshape(maxpool_out2, [-1, W3.get_shape().as_list()[0]])
    local = tf.add(tf.matmul(maxpool_reshaped, W3), b3)
    local_out = tf.nn.relu(local)
    
    out = tf.add(tf.matmul(local_out, W_out), b_out)
    return out
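tf.nn.lrn divides each activation by a factor built from the squared activations of neighboring channels. A minimal numpy rendering of the formula (a sketch, using the same depth_radius=4 and hyperparameters as the calls above):

```python
import numpy as np

def lrn(x, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75):
    # x has shape (N, H, W, C); each channel c is divided by
    # (bias + alpha * sum of squares over channels [c-r, c+r]) ** beta
    out = np.empty_like(x)
    c = x.shape[3]
    for i in range(c):
        lo, hi = max(0, i - depth_radius), min(c, i + depth_radius + 1)
        norm = (bias + alpha * (x[..., lo:hi] ** 2).sum(axis=3)) ** beta
        out[..., i] = x[..., i] / norm
    return out

x = np.ones((1, 2, 2, 8), dtype=np.float64)
y = lrn(x)
# For channel 4 with radius 4, the window covers all 8 channels of ones:
expected = 1.0 / (1.0 + (0.001 / 9.0) * 8) ** 0.75
assert abs(y[0, 0, 0, 4] - expected) < 1e-12
```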
           
# Start with a learning rate of 0.001; adjust this parameter to tune training
learning_rate = 0.001
model_op = model()

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=model_op, labels=y))

train_op = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)

correct_pred = tf.equal(tf.argmax(model_op, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct_pred, tf.float32))
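The accuracy op compares the argmax over the logits with the argmax over the one-hot labels. The same computation in numpy terms (toy logits with hypothetical values):

```python
import numpy as np

logits = np.array([[2.0, 0.1, 0.3],    # predicts class 0
                   [0.2, 0.1, 3.0],    # predicts class 2
                   [0.5, 1.5, 0.0]])   # predicts class 1
onehot = np.array([[1, 0, 0],          # true class 0 -> correct
                   [0, 1, 0],          # true class 1 -> wrong
                   [0, 1, 0]])         # true class 1 -> correct
correct = np.argmax(logits, axis=1) == np.argmax(onehot, axis=1)
accuracy = correct.astype(np.float32).mean()
print(accuracy)   # 2 of 3 predictions correct
```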
           
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Convert each integer label into a one-hot vector of 10 values
    # For details on one_hot, see https://blog.csdn.net/duanlianvip/article/details/95184391
    onehot_labels = tf.one_hot(labels, len(names), axis=-1)
    onehot_vals = sess.run(onehot_labels)
    batch_size = 64
    print('batch size', batch_size)
    # 1000 epochs; each epoch runs once over all 50,000 training images
    # in mini-batches of 64
    for j in range(0, 1000):
        avg_accuracy_val = 0.
        batch_count = 0.
        for i in range(0, len(data), batch_size):
            batch_data = data[i:i+batch_size, :]
            batch_onehot_vals = onehot_vals[i:i+batch_size, :]
            _, accuracy_val = sess.run([train_op, accuracy], feed_dict={x: batch_data, y: batch_onehot_vals})
            avg_accuracy_val += accuracy_val
            batch_count += 1.
        avg_accuracy_val /= batch_count
        print('Epoch {}. Avg accuracy {}'.format(j, avg_accuracy_val))
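The one-hot conversion and the mini-batch slicing used in the loop can both be shown with numpy alone (np.eye standing in for tf.one_hot, on a tiny made-up label array):

```python
import numpy as np

labels = np.array([3, 0, 9, 3])
num_classes = 10
onehot = np.eye(num_classes)[labels]          # shape (4, 10), a single 1 per row
assert onehot.shape == (4, 10)
assert onehot[0].tolist().index(1.0) == 3     # label 3 -> hot at position 3

# Mini-batch slicing as in the training loop (batch_size of 2 here)
batch_size = 2
batches = [onehot[i:i + batch_size] for i in range(0, len(labels), batch_size)]
assert [b.shape[0] for b in batches] == [2, 2]
```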
           

Partial output:

batch size 64
Epoch 0. Avg accuracy 0.2292399296675192
Epoch 1. Avg accuracy 0.28380754475703324
Epoch 2. Avg accuracy 0.30570652173913043
Epoch 3. Avg accuracy 0.32149136828644503
Epoch 4. Avg accuracy 0.33320012787723785
Epoch 5. Avg accuracy 0.34800591432225064
Epoch 6. Avg accuracy 0.3562180306905371
Epoch 7. Avg accuracy 0.3654092071611253
Epoch 8. Avg accuracy 0.3759191176470588
Epoch 9. Avg accuracy 0.3850703324808184
Epoch 10. Avg accuracy 0.39695891943734013
Epoch 11. Avg accuracy 0.3998960997442455
Epoch 12. Avg accuracy 0.40025575447570333
Epoch 13. Avg accuracy 0.40988650895140666
Epoch 14. Avg accuracy 0.4163003516624041
Epoch 15. Avg accuracy 0.42451246803069054
Epoch 16. Avg accuracy 0.4205163043478261
Epoch 17. Avg accuracy 0.4351622442455243
Epoch 18. Avg accuracy 0.4343430306905371
Epoch 19. Avg accuracy 0.44027733375959077
Epoch 20. Avg accuracy 0.4500079923273657
Epoch 21. Avg accuracy 0.44888906649616367
Epoch 22. Avg accuracy 0.4554028132992327
Epoch 23. Avg accuracy 0.46117726982097185
Epoch 24. Avg accuracy 0.45818014705882354
           

The output above covers only the first 24 epochs, but the accuracy trend shows the model converging steadily; after the full 1000 epochs, the accuracy reported by this loop (measured on the training batches) can exceed 90%.

My machine is a ThinkPad X270 with no GPU, so training is slow: these 24 epochs alone took more than two hours.

Because learning_rate is set to a small value, accuracy changes little between adjacent epochs; you can try different learning rates to see how they affect the model's accuracy and rate of convergence.
