
Deep Learning Introductory Notes (3): Perceptron

Author: AI Knowledge Base

This chapter is easy to follow, with few difficult mathematical formulas, making it suitable for readers with an average math background who are starting from zero. The article also implements the perceptron in native Python, building understanding from the underlying principles up; the basics of Python are all you need to follow along.

The perceptron was proposed by the American scholar Frank Rosenblatt in 1957. Why study an algorithm this old now? Because the perceptron is the algorithm at the origin of neural networks (deep learning). Understanding how perceptrons are constructed therefore teaches ideas that carry over directly to neural networks and deep learning.

Article table of contents

  • 1. What is a perceptron
  • 2. Simple logic circuits
  • 2.1 AND gates
  • 2.2 NAND and OR gates
  • 3. Implementation of perceptron
  • 3.1 Simple code implementation
  • 3.2 Import weights and biases
  • 3.3 Code implementation using weights and biases
  • 4. Limitations of perceptrons
  • 4.1 XOR Gate
  • 4.2 Linearity and nonlinearity
  • 5. Multilayer perceptron
  • 5.1 Combinations of existing gates
  • 5.2 Implementation of XOR gates
  • 6. Summary


1. What is a perceptron

The perceptron receives multiple input signals and outputs one signal. The "signal" here can be imagined as something that "flows", like an electric current or a river. Just as a current flows through a wire and carries electrons forward, the perceptron's signal also forms a flow that carries information forward. Unlike an actual current, however, a perceptron's signal takes only two values: "flow / no flow" (1/0). Here, 0 corresponds to "no signal" and 1 corresponds to "transmit a signal".

Figure 2-1 shows an example of a perceptron that receives two input signals. x1 and x2 are the input signals, y is the output signal, and w1 and w2 are the weights (w is the first letter of weight). The circles (○) in the figure are called "neurons" or "nodes". When the input signals are sent to the neuron, each is multiplied by its fixed weight (w1*x1 and w2*x2). The neuron sums the incoming signals and outputs 1 only when this sum exceeds a certain limit, which is described as "the neuron being activated". This limit is called the threshold and is denoted by the symbol θ.

[Figure 2-1: A perceptron with two inputs]

That is all there is to the perceptron's operating principle! Described mathematically (Equation 2.1):

    y = 0 if (w1*x1 + w2*x2 ≤ θ)
    y = 1 if (w1*x1 + w2*x2 > θ)        (Equation 2.1)

2. Simple logic circuits

2.1 AND gates

An AND gate is a gate circuit with two inputs and one output. Figure 2-2 shows its truth table, the table of correspondences between input signals and output signals. As the truth table shows, the AND gate outputs 1 only when both inputs are 1, and 0 otherwise.

Consider representing this AND gate with a perceptron. All we need to do is determine values of w1, w2, and θ that satisfy the truth table in Figure 2-2. So, what values will produce a perceptron that meets these conditions?

In fact, there are infinitely many parameter choices that satisfy the conditions of Figure 2-2. For example, (w1, w2, θ) = (0.5, 0.5, 0.7) works, and so do (0.5, 0.5, 0.8) and (1.0, 1.0, 1.0). With any of these settings, the weighted sum of the signals exceeds the threshold θ only when both x1 and x2 are 1.
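As a quick sanity check (a minimal sketch, not part of the original text), we can evaluate all three parameter sets against the four possible inputs:

# Verify that each (w1, w2, theta) candidate reproduces the AND truth table
for w1, w2, theta in [(0.5, 0.5, 0.7), (0.5, 0.5, 0.8), (1.0, 1.0, 1.0)]:
    outputs = [int(w1*x1 + w2*x2 > theta) for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]]
    print(outputs)  # [0, 0, 0, 1] each time, matching the AND gate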

2.2 NAND and OR gates

A NAND gate inverts the output of an AND gate. As shown in Figure 2-3, it outputs 0 only when x1 and x2 are both 1, and 1 otherwise. So what parameter combinations realize a NAND gate?

[Figure 2-3: Truth table of the NAND gate]

To represent the NAND gate, a combination such as (w1, w2, θ) = (-0.5, -0.5, -0.7) works (and infinitely many other combinations exist as well). In fact, simply negating the signs of all the AND gate's parameter values turns it into a NAND gate.

Next, look at the OR gate shown in Figure 2-4. An OR gate is a logic circuit that outputs 1 as long as at least one input signal is 1. So let's think: what parameters should be set for this OR gate?

As shown above, we now know that AND, NAND, and OR logic circuits can all be represented with perceptrons. The important point is that the perceptron has exactly the same structure in all three cases.

In fact, the three gates differ only in the values of their parameters (weights and threshold). In other words, a perceptron of the same structure can turn into an AND, NAND, or OR gate simply by adjusting its parameter values, like a versatile actor playing different roles.

3. Implementation of perceptron

3.1 Simple code implementation

Now, let's implement these logic circuits in Python. First, define an AND function that receives parameters x1 and x2.

def AND(x1, x2):
    w1, w2, theta = 0.5, 0.5, 0.7
    tmp = x1*w1 + x2*w2
    if tmp <= theta:
        return 0
    elif tmp > theta:
        return 1

Initialize the parameters w1, w2, and theta inside the function; return 1 when the weighted sum of the inputs exceeds the threshold, and 0 otherwise.
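A quick check against the truth table of Figure 2-2 (a minimal sketch):

print(AND(0, 0))  # 0
print(AND(1, 0))  # 0
print(AND(0, 1))  # 0
print(AND(1, 1))  # 1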

3.2 Import weights and biases

The AND gate implementation above is straightforward and easy to understand, but with an eye toward what comes later, we will switch to a different implementation. First, replace θ in equation (2.1) with -b, which gives equation (2.2) describing the same perceptron behavior.

    y = 0 if (b + w1*x1 + w2*x2 ≤ 0)
    y = 1 if (b + w1*x1 + w2*x2 > 0)        (Equation 2.2)

Although the symbols differ, equations (2.1) and (2.2) express exactly the same thing. Here, b is called the bias, and w1 and w2 are called weights. As equation (2.2) shows, the perceptron computes the products of the input signals and the weights, adds the bias, and outputs 1 if this value is greater than 0 and 0 otherwise.

Next, we use NumPy to implement the perceptron in the form of equation (2.2), confirming each step in Python's interpreter.

>>> import numpy as np
>>> x = np.array([0, 1])      # input
>>> w = np.array([0.5, 0.5])  # weights
>>> b = -0.7                  # bias
>>> w*x
array([ 0. ,  0.5])
>>> np.sum(w*x)
0.5
>>> np.sum(w*x) + b
-0.19999999999999996  # approximately -0.2 (floating-point rounding error)

As this example shows, when two NumPy arrays have the same number of elements, multiplying them multiplies their elements pairwise, so w*x computes [0, 1] * [0.5, 0.5] = [0, 0.5]. np.sum(w*x) then sums the resulting elements. Finally, adding the bias to this weighted sum completes the calculation of equation (2.2).

3.3 Code implementation using weights and biases

Here is the Python implementation of the AND gate using weights and a bias:

def AND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([0.5, 0.5])
    b = -0.7
    tmp = np.sum(w*x) + b
    if tmp <= 0:
        return 0
    else:
        return 1

Here, -θ is renamed the bias b, but note that the bias and the weights w1 and w2 play different roles. Specifically, w1 and w2 are parameters controlling the importance of each input signal, while the bias adjusts how easily the neuron is activated (how easily it outputs 1). For example, if b is -0.1, the neuron is activated whenever the weighted sum of the inputs exceeds 0.1; but if b is -20.0, the weighted sum must exceed 20.0 for the neuron to activate. The value of the bias thus determines how easily the neuron fires. We call w1 and w2 weights and b the bias here, but depending on the context, the parameters b, w1, and w2 are sometimes collectively referred to as weights.
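A tiny numeric illustration of this point (a sketch, using the AND weights from above):

x1, x2 = 1, 1        # both inputs on
w1, w2 = 0.5, 0.5
print(w1*x1 + w2*x2 + (-0.1) > 0)   # True: weighted sum 1.0 exceeds 0.1, the neuron fires
print(w1*x1 + w2*x2 + (-20.0) > 0)  # False: 1.0 does not exceed 20.0, the neuron stays off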

Next, let's implement the NAND and OR gates.

def NAND(x1, x2):
    x = np.array([x1, x2])
    w = np.array([-0.5, -0.5])  # only the weights and bias differ from AND!
    b = 0.7
    tmp = np.sum(w*x) + b
    if tmp <= 0:
        return 0
    else:
        return 1

def OR(x1, x2):
    x = np.array([x1, x2])
    w = np.array([0.5, 0.5])  # only the weights and bias differ from AND!
    b = -0.2
    tmp = np.sum(w*x) + b
    if tmp <= 0:
        return 0
    else:
        return 1
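Both functions can be checked against their truth tables in one loop (a minimal sketch):

for xs in [(0, 0), (1, 0), (0, 1), (1, 1)]:
    print(xs, NAND(*xs), OR(*xs))
# NAND outputs 1, 1, 1, 0 and OR outputs 0, 1, 1, 1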

4. Limitations of perceptrons

At this point, we know that perceptrons can realize the AND, NAND, and OR logic circuits. Now let's consider the XOR gate.

4.1 XOR Gate

An XOR gate is also called an exclusive-OR circuit. As shown in Figure 2-5, it outputs 1 only when exactly one of x1 and x2 is 1 ("exclusive" means rejecting the other). So, what weight parameters would let a perceptron realize this XOR gate?

[Figure 2-5: Truth table of the XOR gate]

In fact, the perceptron introduced so far cannot realize this XOR gate. Why can a perceptron implement AND and OR gates but not an XOR gate? Let's think about why by drawing a picture.

First, let's visualize the behavior of the OR gate. With parameters (b, w1, w2) = (-0.5, 1.0, 1.0), the perceptron satisfies the truth table of Figure 2-4. This perceptron can be written as equation (2.3):

    y = 0 if (-0.5 + x1 + x2 ≤ 0)
    y = 1 if (-0.5 + x1 + x2 > 0)        (Equation 2.3)

The perceptron of equation (2.3) divides the plane into two regions separated by the line -0.5 + x1 + x2 = 0. One region outputs 1 and the other outputs 0, as shown in Figure 2-6.

[Figure 2-6: The straight line -0.5 + x1 + x2 = 0 separating the outputs of the OR gate]

The OR gate outputs 0 when (x1, x2) = (0, 0), and 1 when (x1, x2) is (0, 1), (1, 0), or (1, 1). In Figure 2-6, ○ represents 0 and △ represents 1. To realize an OR gate, we need a straight line separating the ○ from the △ in Figure 2-6, and the line above does separate the four points correctly.
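Evaluating equation (2.3) at the four input points confirms this (a quick sketch):

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print((x1, x2), int(-0.5 + x1 + x2 > 0))
# prints 0 for (0, 0) and 1 for the other three points: the OR truth table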

So what happens with the XOR gate? Can a single straight line divide the ○ and △ of Figure 2-7 the way it did for the OR gate?

[Figure 2-7: The outputs of the XOR gate plotted in the (x1, x2) plane]

However you try, no straight line can separate the ○ from the △ in Figure 2-7. It is simply impossible to separate them with a straight line.

4.2 Linearity and nonlinearity

The ○ and △ in Figure 2-7 cannot be separated by a straight line, but if we drop the restriction to "straight lines", it becomes possible. For example, we can separate the ○ and △ with a curved boundary, as shown in Figure 2-8.

[Figure 2-8: A curved boundary separating the XOR points]

The limitation of the perceptron is that it can only represent regions divided by a straight line; it cannot represent a curved boundary like the one in Figure 2-8. A region divided by a curve, as in Figure 2-8, is called a nonlinear region, while a region divided by a straight line is called a linear region. The terms linear and nonlinear are common in machine learning, and you can picture them as the straight line of Figure 2-6 and the curve of Figure 2-8.
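To make the limitation concrete, here is a small brute-force check (a hypothetical sketch, not from the original text): it scans a grid of (b, w1, w2) values and finds that no single-layer perceptron on the grid reproduces the XOR truth table, while many reproduce OR.

import itertools
import numpy as np

def fits(table, b, w1, w2):
    # True if a single-layer perceptron with these parameters matches the truth table
    return all(int(b + w1*x1 + w2*x2 > 0) == y for (x1, x2), y in table.items())

XOR_TABLE = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}
OR_TABLE  = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 1}

grid = np.arange(-1.0, 1.01, 0.1)
params = list(itertools.product(grid, repeat=3))
print(sum(fits(XOR_TABLE, *p) for p in params))  # 0: no parameter setting works
print(sum(fits(OR_TABLE, *p) for p in params))   # a large count: many settings work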

5. Multilayer perceptron

It is a pity that the perceptron cannot represent the XOR gate, but there is no need to be pessimistic. The great thing about perceptrons is that layers can be stacked (representing the XOR gate through stacked layers is the point of this section). Let's set aside what stacking means for now and look at the XOR problem from another angle.

5.1 Combinations of existing gates

There are many ways to build an XOR gate; one is to combine the AND, NAND, and OR gates we made earlier. Figure 2-9 shows the circuit symbols for the AND, NAND, and OR gates; the small circle at the output of the NAND symbol means that the output is inverted.

[Figure 2-9: Circuit symbols for the AND, NAND, and OR gates]

So, think about it: how should the AND, NAND, and OR gates be wired to implement an XOR gate? As a hint, Figure 2-10 asks you to replace each "?" with an AND, NAND, or OR gate so that the whole circuit realizes XOR.

[Figure 2-10: Fill in each "?" with an AND, NAND, or OR gate to realize the XOR gate]

The XOR gate can be implemented by the configuration shown in Figure 2-11. Here, x1 and x2 are the input signals and y is the output signal. x1 and x2 feed the NAND and OR gates, and the outputs of those two gates feed the AND gate.

[Figure 2-11: The XOR gate built from a NAND gate, an OR gate, and an AND gate]

Now, let's confirm that the configuration of Figure 2-11 really implements an XOR gate. Writing s1 for the output of the NAND gate and s2 for the output of the OR gate, we fill in the truth table. The result is Figure 2-12: looking at x1, x2, and y, the output is indeed that of an XOR gate.

[Figure 2-12: Truth table of the XOR circuit, including the intermediate signals s1 and s2]

5.2 Implementation of XOR gates

Let's implement the XOR gate of Figure 2-11 in Python. Using the AND, NAND, and OR functions defined earlier, it can be implemented (easily) as follows.

def XOR(x1, x2):
    s1 = NAND(x1, x2)
    s2 = OR(x1, x2)
    y = AND(s1, s2)
    return y
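Checking XOR against its truth table (a quick sketch):

print(XOR(0, 0))  # 0
print(XOR(1, 0))  # 1
print(XOR(0, 1))  # 1
print(XOR(1, 1))  # 0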

With this, the XOR gate implementation is complete. Below we draw this XOR gate in perceptron notation (explicitly showing the neurons); the result is Figure 2-13.

As shown in Figure 2-13, the XOR gate is a multilayer neural network. Here, the leftmost column is called layer 0, the middle column is called layer 1, and the rightmost column is called layer 2.

The perceptron of Figure 2-13 has a different shape from the AND and OR perceptrons shown earlier (Figure 2-1). Whereas the AND and OR gates are single-layer perceptrons, the XOR gate is a 2-layer perceptron. A perceptron with multiple stacked layers is also called a multilayer perceptron.

[Figure 2-13: The XOR gate drawn as a 2-layer perceptron]

In the 2-layer perceptron shown in Fig. 2-13, signals are transmitted and received between neurons in layer 0 and layer 1, and then signals are transmitted and received between layers 1 and 2, as shown below.

The two neurons in layer 0 receive the input signals and send signals to the neurons in layer 1.

The neurons in layer 1 send signals to the neuron in layer 2, and the layer-2 neuron outputs y.
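The same two-stage flow can be written as one explicit layered computation (a hypothetical vectorized sketch reusing the NAND, OR, and AND parameters from earlier; the function name xor_forward is ours, not from the original text):

import numpy as np

def step(z):
    # Step activation: 1 where z > 0, else 0
    return (z > 0).astype(int)

def xor_forward(x1, x2):
    x = np.array([x1, x2])         # layer 0: the two input neurons
    W1 = np.array([[-0.5, -0.5],   # row 1: NAND weights
                   [ 0.5,  0.5]])  # row 2: OR weights
    b1 = np.array([0.7, -0.2])
    s = step(W1 @ x + b1)          # layer 1: s[0] = NAND output, s[1] = OR output
    return int(0.5*s[0] + 0.5*s[1] - 0.7 > 0)  # layer 2: AND of s1 and s2

for xs in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(xs, xor_forward(*xs))  # 0, 1, 1, 0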

6. Summary

What is learned in this chapter

  • A perceptron is an algorithm with inputs and outputs: given an input, it outputs a determined value.
  • The perceptron sets weights and biases as parameters.
  • Perceptrons can represent logic circuits such as AND and OR gates.
  • XOR gates cannot be represented by a single-layer perceptron.
  • A 2-layer perceptron can represent an XOR gate.
  • A single-layer perceptron can only represent a linear space, while a multilayer perceptron can represent a nonlinear space.