
Week 4: Convolutional Neural Networks (CNN)

1. Forward Propagation

1.1 Input Layer

Each image is presented to us as an m*n two-dimensional array, but we now flatten it: the two dimensions are unrolled into one. Instead of each column of the original image sitting below the previous one, the next column is appended directly to the end of the previous column, so the data of every image becomes a single one-dimensional vector.

Countless flattened images are then used for training: the training runs t times, each run selecting batch images (batch columns), and each image is trained epoch times.
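As a minimal sketch of this flattening step (the method name flatten and the column-major order are assumptions for illustration, not taken from the source code):

	/**
	 *********************** 
	 * Flatten an m*n image into a one-dimensional vector, appending each
	 * column right after the previous one. A hypothetical helper.
	 *********************** 
	 */
	public static double[] flatten(double[][] paraImage) {
		int m = paraImage.length;
		int n = paraImage[0].length;
		double[] resultVector = new double[m * n];
		for (int j = 0; j < n; j++) {
			for (int i = 0; i < m; i++) {
				// Column j starts at offset j * m in the flat vector.
				resultVector[j * m + i] = paraImage[i][j];
			} // Of for i
		} // Of for j
		return resultVector;
	}// Of flatten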

1.2 Convolution Layer

The figure below uses a stride of 2; note that the code that follows slides the kernel with a stride of 1.

[Figure: convolution with a stride of 2]

The convolution is expressed in code as follows:

/**
	 *********************** 
	 * Compute the convolution output according to the output of the last layer.
	 * 
	 * @param paraLastLayer
	 *            the last layer.
	 * @param paraLayer
	 *            the current layer.
	 *********************** 
	 */
	private void setConvolutionOutput(final CnnLayer paraLayer, final CnnLayer paraLastLayer) {
		// int mapNum = paraLayer.getOutMapNum();
		final int lastMapNum = paraLastLayer.getOutMapNum();

		// Attention: paraLayer.getOutMapNum() may not be right.
		for (int j = 0; j < paraLayer.getOutMapNum(); j++) {// The current layer has paraLayer.getOutMapNum() maps (2D matrices).
			double[][] tempSumMatrix = null;
			for (int i = 0; i < lastMapNum; i++) {// The previous layer has lastMapNum maps (2D matrices).
				double[][] lastMap = paraLastLayer.getMap(i);
				double[][] kernel = paraLayer.getKernel(i, j);
				if (tempSumMatrix == null) {
					// On the first map.
					tempSumMatrix = MathUtils.convnValid(lastMap, kernel);
				} else {
					// Sum up convolution maps
					tempSumMatrix = MathUtils.matrixOp(MathUtils.convnValid(lastMap, kernel),
							tempSumMatrix, null, null, MathUtils.plus);
				} // Of if
			} // Of for i

			// Activation.
			final double bias = paraLayer.getBias(j);
			tempSumMatrix = MathUtils.matrixOp(tempSumMatrix, new Operator() {
				private static final long serialVersionUID = 2469461972825890810L;

				@Override
				public double process(double value) {
					return MathUtils.sigmod(value + bias);// Sigmoid activation: 1 / (1 + e^(-z)).
				}

			});

			paraLayer.setMapValue(j, tempSumMatrix);
		} // Of for j
	}// Of setConvolutionOutput
           

MathUtils.convnValid:

/**
	 *********************** 
	 * Convolution operation: slide the kernel over the given matrix and sum
	 * the products to obtain the result matrix. It is used in the forward pass.
	 *********************** 
	 */
	public static double[][] convnValid(final double[][] matrix, double[][] kernel) {
		// kernel = rot180(kernel);
		int m = matrix.length;
		int n = matrix[0].length;
		final int km = kernel.length;
		final int kn = kernel[0].length;
		int kns = n - kn + 1;
		final int kms = m - km + 1;
		final double[][] outMatrix = new double[kms][kns];

		for (int i = 0; i < kms; i++) {
			for (int j = 0; j < kns; j++) {
				double sum = 0.0;
				for (int ki = 0; ki < km; ki++) {
					for (int kj = 0; kj < kn; kj++)
						sum += matrix[i + ki][j + kj] * kernel[ki][kj];// The stride is 1.
				}
				outMatrix[i][j] = sum;

			}
		}
		return outMatrix;
	}// Of convnValid
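For instance, convolving a 3*3 matrix with a 2*2 kernel yields a (3-2+1)*(3-2+1) = 2*2 output. A small usage example with arbitrary values:

	double[][] tempMatrix = { { 1, 2, 3 }, { 4, 5, 6 }, { 7, 8, 9 } };
	double[][] tempKernel = { { 1, 0 }, { 0, 1 } };
	double[][] tempOut = MathUtils.convnValid(tempMatrix, tempKernel);
	// tempOut is 2*2: tempOut[0][0] = 1 + 5 = 6, tempOut[0][1] = 2 + 6 = 8,
	// tempOut[1][0] = 4 + 8 = 12, tempOut[1][1] = 5 + 9 = 14.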
           

Note: the result of convolving the previous layer's output is passed through an activation function; the activation chosen here is the sigmoid, sigmoid(z) = 1 / (1 + e^(-z)).

1.3 Pooling Layer

1.3.1 Average Pooling

The code uses average pooling.

[Figure: average pooling]

/**
	 *********************** 
	 * Compute the sampling (average pooling) output according to the output
	 * of the last layer.
	 * 
	 * @param paraLastLayer
	 *            the last layer.
	 * @param paraLayer
	 *            the current layer.
	 *********************** 
	 */
	private void setSampOutput(final CnnLayer paraLayer, final CnnLayer paraLastLayer) {
		// int tempLastMapNum = paraLastLayer.getOutMapNum();

		// Attention: paraLayer.outMapNum may not be right.
		for (int i = 0; i < paraLayer.outMapNum; i++) {
			double[][] lastMap = paraLastLayer.getMap(i);
			Size scaleSize = paraLayer.getScaleSize();
			double[][] sampMatrix = MathUtils.scaleMatrix(lastMap, scaleSize);
			paraLayer.setMapValue(i, sampMatrix);
		} // Of for i
	}// Of setSampOutput
           

MathUtils.scaleMatrix:

/**
	 *********************** 
	 * Scale the matrix.
	 *********************** 
	 */
	public static double[][] scaleMatrix(final double[][] matrix, final Size scale) {
		int m = matrix.length;
		int n = matrix[0].length;
		final int sm = m / scale.width;
		final int sn = n / scale.height;
		final double[][] outMatrix = new double[sm][sn];
		if (sm * scale.width != m || sn * scale.height != n)
			throw new RuntimeException("scale matrix");
		final int size = scale.width * scale.height;
		for (int i = 0; i < sm; i++) {
			for (int j = 0; j < sn; j++) {
				double sum = 0.0;
				for (int si = i * scale.width; si < (i + 1) * scale.width; si++) {
					for (int sj = j * scale.height; sj < (j + 1) * scale.height; sj++) {
						sum += matrix[si][sj];
					} // Of for sj
				} // Of for si
				// Average pooling: each output entry is the mean of its region.
				outMatrix[i][j] = sum / size;
			} // Of for j
		} // Of for i
		return outMatrix;
	}// Of scaleMatrix
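For example, average pooling a 4*4 matrix with a 2*2 scale produces a 2*2 matrix whose entries are the means of the four regions. A small usage example with arbitrary values (assuming Size exposes a (width, height) constructor):

	double[][] tempMatrix = { { 1, 2, 3, 4 }, { 5, 6, 7, 8 },
			{ 9, 10, 11, 12 }, { 13, 14, 15, 16 } };
	double[][] tempOut = MathUtils.scaleMatrix(tempMatrix, new Size(2, 2));
	// tempOut is 2*2: tempOut[0][0] = (1+2+5+6)/4 = 3.5,
	// tempOut[0][1] = (3+4+7+8)/4 = 5.5, tempOut[1][0] = (9+10+13+14)/4 = 11.5,
	// tempOut[1][1] = (11+12+15+16)/4 = 13.5.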
           

1.3.2 Max Pooling

[Figure: max pooling]
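The code in this post only implements average pooling; a max-pooling counterpart to scaleMatrix might look like the following sketch (the method name maxScaleMatrix is an assumption for illustration, not part of the source code):

	/**
	 *********************** 
	 * Scale the matrix with max pooling: take the maximum of each region.
	 * A hypothetical counterpart to scaleMatrix.
	 *********************** 
	 */
	public static double[][] maxScaleMatrix(final double[][] matrix, final Size scale) {
		int m = matrix.length;
		int n = matrix[0].length;
		final int sm = m / scale.width;
		final int sn = n / scale.height;
		final double[][] outMatrix = new double[sm][sn];
		for (int i = 0; i < sm; i++) {
			for (int j = 0; j < sn; j++) {
				double max = Double.NEGATIVE_INFINITY;
				for (int si = i * scale.width; si < (i + 1) * scale.width; si++) {
					for (int sj = j * scale.height; sj < (j + 1) * scale.height; sj++) {
						max = Math.max(max, matrix[si][sj]);
					} // Of for sj
				} // Of for si
				outMatrix[i][j] = max;
			} // Of for j
		} // Of for i
		return outMatrix;
	}// Of maxScaleMatrix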

1.4 Output Layer

The output layer also uses convolution: each of the previous layer's 2D arrays is convolved with a kernel of exactly the same size as that array, so every output map is a single value; the i-th array of the previous layer and the j-th array of the output layer have their own kernel.

In the code, the output layer consists of 10 such one-value maps. With a one-hot label, if position O_i holds 1 (and the other positions hold 0), then the class assigned among 0~9 is O_i (0 <= O_i <= 9).
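At prediction time the class can be read off as the index of the largest output value. A minimal sketch (the method name predictClass and the assumption that each output map is 1*1 are for illustration):

	/**
	 *********************** 
	 * Pick the index of the largest value among the 10 one-value output
	 * maps (argmax over classes 0..9). A hypothetical helper.
	 *********************** 
	 */
	public static int predictClass(final CnnLayer paraOutputLayer) {
		int tempBest = 0;
		double tempBestValue = paraOutputLayer.getMap(0)[0][0];
		for (int i = 1; i < paraOutputLayer.getOutMapNum(); i++) {
			double tempValue = paraOutputLayer.getMap(i)[0][0];
			if (tempValue > tempBestValue) {
				tempBestValue = tempValue;
				tempBest = i;
			} // Of if
		} // Of for i
		return tempBest;
	}// Of predictClass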

2. Back Propagation

2.1 Back Propagation through the Pooling Layer

2.1.1 Back Propagation through Average Pooling

The code uses average pooling: each pooled value is the mean of a scale.width * scale.height region, so during back propagation the delta of that value is spread evenly back over the region, each input position receiving delta / size. A sketch follows the figure.

[Figure: back propagation through average pooling]
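A minimal sketch of this even redistribution, reusing the region conventions of scaleMatrix above (the method name kroneckerExpand is an assumption; the actual source may organize this step differently):

	/**
	 *********************** 
	 * Spread each pooled delta evenly back over its region. A sketch for
	 * illustration, not taken verbatim from the source code.
	 *********************** 
	 */
	public static double[][] kroneckerExpand(final double[][] delta, final Size scale) {
		int m = delta.length;
		int n = delta[0].length;
		final int size = scale.width * scale.height;
		final double[][] outMatrix = new double[m * scale.width][n * scale.height];
		for (int i = 0; i < m; i++) {
			for (int j = 0; j < n; j++) {
				// Every position of the region gets an equal share of the delta.
				for (int si = i * scale.width; si < (i + 1) * scale.width; si++) {
					for (int sj = j * scale.height; sj < (j + 1) * scale.height; sj++) {
						outMatrix[si][sj] = delta[i][j] / size;
					} // Of for sj
				} // Of for si
			} // Of for j
		} // Of for i
		return outMatrix;
	}// Of kroneckerExpand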

2.1.2 Back Propagation through Max Pooling

[Figure: back propagation through max pooling]
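For max pooling, only the position that produced the region's maximum in the forward pass receives the delta; the other positions get zero. A hypothetical sketch, not part of the source code:

	/**
	 *********************** 
	 * Route each pooled delta back to the position that held the region's
	 * maximum in the forward pass. A sketch for illustration.
	 *********************** 
	 */
	public static double[][] maxPoolBackward(final double[][] input,
			final double[][] delta, final Size scale) {
		final double[][] outMatrix = new double[input.length][input[0].length];
		for (int i = 0; i < delta.length; i++) {
			for (int j = 0; j < delta[0].length; j++) {
				int tempMaxI = i * scale.width, tempMaxJ = j * scale.height;
				for (int si = i * scale.width; si < (i + 1) * scale.width; si++) {
					for (int sj = j * scale.height; sj < (j + 1) * scale.height; sj++) {
						if (input[si][sj] > input[tempMaxI][tempMaxJ]) {
							tempMaxI = si;
							tempMaxJ = sj;
						} // Of if
					} // Of for sj
				} // Of for si
				outMatrix[tempMaxI][tempMaxJ] = delta[i][j];
			} // Of for j
		} // Of for i
		return outMatrix;
	}// Of maxPoolBackward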

2.2 Back Propagation through the Convolution Layer
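In brief, the delta passed back to a convolution layer's input is the "full" convolution of the output delta with the kernel rotated by 180 degrees, multiplied elementwise by the derivative of the activation (the rot180 operation is hinted at by the commented-out line inside convnValid). A sketch of that full convolution, reusing convnValid above (the method name convnFull is an assumption, not taken verbatim from the source code):

	/**
	 *********************** 
	 * Full convolution: pad the delta with (km-1) rows and (kn-1) columns
	 * of zeros on every side, then run the valid convolution with the
	 * 180-degree-rotated kernel. A sketch for illustration.
	 *********************** 
	 */
	public static double[][] convnFull(final double[][] delta, final double[][] kernel) {
		final int m = delta.length;
		final int n = delta[0].length;
		final int km = kernel.length;
		final int kn = kernel[0].length;
		final double[][] padded = new double[m + 2 * (km - 1)][n + 2 * (kn - 1)];
		for (int i = 0; i < m; i++) {
			for (int j = 0; j < n; j++) {
				padded[i + km - 1][j + kn - 1] = delta[i][j];
			} // Of for j
		} // Of for i

		// Rotate the kernel by 180 degrees.
		final double[][] rotated = new double[km][kn];
		for (int i = 0; i < km; i++) {
			for (int j = 0; j < kn; j++) {
				rotated[i][j] = kernel[km - 1 - i][kn - 1 - j];
			} // Of for j
		} // Of for i

		return convnValid(padded, rotated); // The result is (m+km-1) x (n+kn-1).
	}// Of convnFull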

3. References

  • Source code: https://blog.csdn.net/minfanphd/article/details/116976111
  • Derivation of CNN back propagation (CNN反向传播推导)
  • Convolutional neural networks, part six: the computation of CNN back propagation (卷积神经网络六之CNN反向传播计算过程)
  • Implementing a convolutional neural network in Python: forward and back propagation of the convolution layer, with Python code (【Python实现卷积神经网络】:卷积层的正向传播与反向传播+python实现代码)
  • CNN notes: an intuitive explanation of convolutional neural networks (CNN笔记:通俗理解卷积神经网络)
