Deep Learning: Keras Basics, a Primer (I)

1. About Keras

1) Introduction

Keras is a deep learning framework built on top of Theano/TensorFlow and written in pure Python.

Keras is a high-level neural network API that supports rapid experimentation, so you can turn ideas into results quickly. Consider Keras first if you need:

a) Easy and rapid prototyping (Keras is highly modular, minimalist, and extensible)

b) Support for CNNs and RNNs, or a combination of both

c) Seamless switching between CPU and GPU

2) Design principles

a) User-friendliness: Keras is an API designed for humans, not machines. The user experience is always the first and central consideration. Keras follows best practices for reducing cognitive load: it provides a consistent and concise API that greatly reduces the work needed for common use cases, and it gives clear, actionable feedback when errors occur.

b) Modularity: A model can be understood as a sequence of layers or a graph of operations on data. Fully configurable modules can be combined freely at minimal cost. Specifically, network layers, loss functions, optimizers, initialization schemes, activation functions, and regularization methods are all independent modules that you can use to build your own models.

c) Extensibility: New modules are easy to add; just write a new class or function modeled on the existing ones. This ease of creating new modules makes Keras well suited to advanced research work.

d) Working with Python: Keras has no separate model configuration file format (Caffe, by contrast, does). Models are described in Python code, which makes them more compact, easier to debug, and easier to extend.

2. The modular structure of Keras

3. Use Keras to build a neural network

4. Main concepts

1) Symbolic calculation

Keras uses Theano or TensorFlow as its underlying library; these are known as Keras backends. Both Theano and TensorFlow are "symbolic" libraries. A symbolic computation first defines the variables and then builds a "computational graph" that specifies how the variables are related. No values are computed until data is fed into the graph.

A symbolic computation is also called a data flow graph. In the original figure (a static diagram, used because the GIF does not open reliably), data flows along the black arrows.
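To make the define-first, run-later idea concrete, here is a toy sketch in plain Python. It is not how Theano or TensorFlow are implemented; the names Node, placeholder, and run are hypothetical and exist only for this illustration.

```python
# Toy "symbolic" computation: first describe the graph, then feed in values.
# Node, placeholder and run are hypothetical names used only in this sketch.

class Node:
    """One vertex of the computational graph."""

    def __init__(self, op, inputs=(), name=None):
        self.op = op          # "placeholder", "add" or "mul"
        self.inputs = inputs  # upstream nodes whose values flow into this one
        self.name = name

    def __add__(self, other):
        return Node("add", (self, other))

    def __mul__(self, other):
        return Node("mul", (self, other))


def placeholder(name):
    """A variable with no value yet; the value is supplied at run time."""
    return Node("placeholder", name=name)


def run(node, feed):
    """Evaluate a node by recursively evaluating the nodes it depends on."""
    if node.op == "placeholder":
        return feed[node.name]
    left, right = (run(n, feed) for n in node.inputs)
    return left + right if node.op == "add" else left * right


# Step 1: define the variables and how they relate (no arithmetic happens yet).
x = placeholder("x")
y = placeholder("y")
z = (x + y) * y

# Step 2: push concrete data through the graph.
result = run(z, {"x": 2, "y": 3})
print(result)  # (2 + 3) * 3 = 15
```

The same graph z can be evaluated again with different inputs, which is the point of separating the definition from the computation.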

2) Tensors

Tensors can be seen as a natural generalization of vectors and matrices, and they can represent a wide range of data. The order of a tensor is also called its number of dimensions.

A 0th-order tensor is a scalar, that is, a single number.

A 1st-order tensor is a vector, that is, an ordered set of numbers.

A 2nd-order tensor is a matrix, that is, an ordered set of vectors.

A 3rd-order tensor is a cube, that is, a stack of matrices.

4th-order tensors and higher follow the same pattern.

Key point: understanding "dimension"

A list of length 10 holds 10 numbers along its single axis. Counting the entries, it can be called 10-dimensional (a 10-dimensional vector); counting the axes, it has only 1 dimension (a 1st-order tensor). Keeping these two meanings apart helps with the dimension problems that come up when computing in Keras or in neural networks generally.
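The tensor orders and the two meanings of "dimension" can be checked with NumPy arrays. This is only an illustrative sketch; the array contents and sizes are arbitrary choices, not something taken from the example later in this article.

```python
import numpy as np

scalar = np.array(5.0)        # 0th-order tensor: a single number
vector = np.arange(10)        # 1st-order tensor holding 10 numbers
matrix = np.zeros((3, 4))     # 2nd-order tensor
cube = np.zeros((2, 3, 4))    # 3rd-order tensor: 2 stacked 3x4 matrices

for name, t in [("scalar", scalar), ("vector", vector),
                ("matrix", matrix), ("cube", cube)]:
    print(name, "order:", t.ndim, "shape:", t.shape)

# The two meanings of "dimension" for the list of length 10:
n_axes = vector.ndim         # number of axes: this is a 1st-order tensor
n_entries = vector.shape[0]  # entries along that axis: a "10-dimensional" vector
print(n_axes, n_entries)
```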

3) Data format (data_format)

There are currently two main conventions for laying out image tensors:

a) th mode or channels_first mode, which Theano and Caffe use.

b) tf mode or channels_last mode, which TensorFlow uses.

The following example shows the difference between the two modes.

For 100 RGB (3-channel) color images of size 16×32 (16 high, 32 wide):

th representation: (100, 3, 16, 32)

tf representation: (100, 16, 32, 3)

The only difference is the position of the channel count, 3.
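Converting between the two layouts only moves the channel axis. The sketch below uses NumPy with random values standing in for real images; the variable names are illustrative.

```python
import numpy as np

# 100 RGB images, 16 high and 32 wide, stored channels_first ("th" ordering).
images_th = np.random.rand(100, 3, 16, 32)

# Move the channel axis to the end to get channels_last ("tf" ordering).
images_tf = np.transpose(images_th, (0, 2, 3, 1))

print(images_th.shape)  # (100, 3, 16, 32)
print(images_tf.shape)  # (100, 16, 32, 3)

# The same pixel is addressed with the channel index in a different position.
same_pixel = images_th[5, 2, 7, 9] == images_tf[5, 7, 9, 2]
print(same_pixel)
```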

4) Model

There are two types of models in Keras: the sequential model (Sequential) and the functional model (Model). The functional model is more widely applicable; the sequential model is a special case of it.

a) Sequential model (Sequential): single input and single output, one path from start to end. Layers are connected only to their neighbors, with no cross-layer connections. This model compiles quickly and is relatively simple to use.

b) Functional model (Model): multiple inputs and multiple outputs, with arbitrary connections between layers. This model compiles more slowly.

5. First example

This article uses an example common in introductions to neural networks: recognizing handwritten digits.

Before the code, here are some concepts from the example, to make it easier to follow.

PS: Possibly because of version differences, the parameters listed on the official website differ from those in the example. The website lists fewer parameters, and some parameters are supported while others are not. This example therefore drops the unsupported parameters and introduces only the ones it actually uses.

1) Dense(500, input_shape=(784,))

a) Dense is one of the commonly used (core) network layers.

b) 500 is the output dimension. The full output shape is (*, 500), where * stands for an arbitrary number of samples. Only the dimension is given as a parameter; the number of outputs is determined by the input. In other words, the output of Dense is an N×500 matrix.

c) input_shape=(784,) says that each input sample has 784 dimensions (28×28, explained later). The full input shape is (*, 784), that is, N samples of 784 dimensions each.
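The shape bookkeeping can be sketched with NumPy. Keras documents Dense as computing activation(dot(input, kernel) + bias); the snippet below reproduces only the dot-plus-bias part, with randomly initialized weights. The variable names and the choice of N = 4 are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# N is arbitrary; 784 and 500 come from Dense(500, input_shape=(784,)).
n_samples, input_dim, units = 4, 784, 500

x = rng.random((n_samples, input_dim))                    # input, shape (N, 784)
kernel = rng.standard_normal((input_dim, units)) * 0.01   # weights, shape (784, 500)
bias = np.zeros(units)                                    # one bias per output unit

output = x @ kernel + bias   # shape (N, 500), before any activation is applied
print(output.shape)
```

Changing n_samples changes only the first axis of the output, which is why the layer needs to be told the dimension but not the number of samples.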

2) Activation('tanh')

a) Activation: the activation layer

b) 'tanh': the activation function to apply

3) Dropout(0.5)

During training, each time the parameters are updated, a given fraction of the input neurons (here 0.5) is randomly disconnected to prevent overfitting.
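As a rough illustration of what happens during training, the sketch below zeroes a random half of the entries and rescales the survivors by 1/(1 - rate), which is the scaling the Keras Dropout layer is documented to apply. The helper name dropout is hypothetical, and this is not Keras's own implementation.

```python
import numpy as np


def dropout(x, rate, rng):
    """Zero a random fraction `rate` of entries and rescale the rest (training mode)."""
    keep_mask = rng.random(x.shape) >= rate   # True for units that stay connected
    return x * keep_mask / (1.0 - rate)       # rescale so the expected value is unchanged


rng = np.random.default_rng(0)
activations = np.ones((2, 10))
dropped = dropout(activations, rate=0.5, rng=rng)
print(dropped)  # a mix of 0.0 and 2.0; the pattern changes on every call
```

At prediction time no units are dropped, so the layer passes its input through unchanged.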

4) Datasets

The dataset contains 60,000 training images and 10,000 test images, each 28×28 pixels, together with their target digits. Following the data format above strictly, the training set with TensorFlow as the backend would have shape (60000, 28, 28, 1), since MNIST images are grayscale and have a single channel. However, mnist.load_data() returns the images without a channel axis, so the training set has shape (60000, 28, 28). It might then seem that input_shape=(784,) should be input_shape=(28, 28), but that is not correct in this example: the data must be converted to (60000, 784). Why is the conversion needed?

As shown in the figure above, the training set of shape (60000, 28, 28) is like a cube, while the input layer, seen from this perspective, is a plane. A cube of data cannot flow directly into a planar input layer, so the transformation shown by the yellow arrow has to be applied before the data enters the input layer. How the input layer handles the data after the conversion from 28×28 to 784 need not concern us here (curious readers can study the source code).

In addition, most inputs in Keras take the form (nb_samples, input_dim), that is, (number of samples, input dimension).
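The conversion from a cube to a plane is a plain reshape. The sketch below uses a small random batch with the same 28×28 layout as MNIST so that it runs without downloading the dataset; the batch size of 6 is an arbitrary assumption.

```python
import numpy as np

# Stand-in for the MNIST training images: random values in the same 28x28 layout.
images = np.random.randint(0, 256, size=(6, 28, 28), dtype=np.uint8)

# (nb_samples, 28, 28) -> (nb_samples, 784)
flat = images.reshape(images.shape[0], -1)

print(images.shape)  # (6, 28, 28)
print(flat.shape)    # (6, 784)

# Row-major flattening: the pixel at row r, column c lands at index r * 28 + c.
r, c = 3, 5
pixel_matches = flat[0, r * 28 + c] == images[0, r, c]
print(pixel_matches)
```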

5) Sample code

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation
from keras.optimizers import SGD
from keras.datasets import mnist
import numpy

# Step 1: choose the model
model = Sequential()
# Step 2: build the network layers
model.add(Dense(500, input_shape=(784,)))  # input layer, 28*28 = 784
model.add(Activation('tanh'))  # the activation function is tanh
model.add(Dropout(0.5))  # 50% dropout

model.add(Dense(500))  # hidden layer with 500 units
model.add(Activation('tanh'))
model.add(Dropout(0.5))

model.add(Dense(10))  # there are 10 output classes, so the output dimension is 10
model.add(Activation('softmax'))  # the last layer uses softmax as its activation function

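# Step 3: compile
# Optimizer with its learning rate (lr) and other settings.
# Note: newer Keras releases name this argument learning_rate instead of lr.
sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
model.compile(loss='categorical_crossentropy', optimizer=sgd)  # use cross-entropy as the loss function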

# Step 4: train
# Some parameters of .fit:
#   batch_size: number of samples in each batch the data is split into
#   epochs: number of passes over the training data
#   shuffle: whether to shuffle the data before training
#   validation_split: fraction of the training data held out for validation
#   verbose: logging mode. 0: silent, 1: progress bar, 2: one line per epoch
(X_train, y_train), (X_test, y_test) = mnist.load_data()  # load MNIST with the Keras helper (needs a network connection the first time)
# The MNIST images have shape (num, 28, 28), so flatten the last two axes into 784 dimensions
X_train = X_train.reshape(X_train.shape[0], X_train.shape[1] * X_train.shape[2])
X_test = X_test.reshape(X_test.shape[0], X_test.shape[1] * X_test.shape[2])
# One-hot encode the labels: row i is 1 in column y[i] and 0 elsewhere
Y_train = (numpy.arange(10) == y_train[:, None]).astype(int)
Y_test = (numpy.arange(10) == y_test[:, None]).astype(int)

model.fit(X_train, Y_train, batch_size=200, epochs=50, shuffle=True, verbose=0, validation_split=0.3)

# Step 5: output
print("test set")
scores = model.evaluate(X_test, Y_test, batch_size=200, verbose=0)
print("")
print("The test loss is %f" % scores)
result = model.predict(X_test, batch_size=200, verbose=0)

result_max = numpy.argmax(result, axis=1)  # predicted class for each test image
test_max = numpy.argmax(Y_test, axis=1)    # true class for each test image

result_bool = numpy.equal(result_max, test_max)
true_num = numpy.sum(result_bool)  # number of correct predictions
print("")
print("The accuracy of the model is %f" % (true_num / len(result_bool)))
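
The label encoding and the accuracy computation in the sample code both rely on NumPy broadcasting and argmax. The short sketch below isolates those two steps on three made-up labels; the predicted array is hypothetical and stands in for the output of model.predict.

```python
import numpy

y = numpy.array([3, 0, 9])  # integer class labels

# y[:, None] has shape (3, 1); comparing it with arange(10), of shape (10,),
# broadcasts to a (3, 10) boolean table that is True only where column == label.
Y = (numpy.arange(10) == y[:, None]).astype(int)
print(Y)

# argmax along axis 1 recovers the original labels from the one-hot rows.
recovered = numpy.argmax(Y, axis=1)
print(recovered)  # [3 0 9]

# Accuracy is the fraction of positions where prediction and label agree.
predicted = numpy.array([3, 1, 9])  # hypothetical predictions, one of them wrong
accuracy = numpy.mean(predicted == recovered)
print(accuracy)
```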
