Principles of training a multi-layer neural network using backpropagation

Original article: http://galaxy.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html


This article describes the training process of a multi-layer neural network using the backpropagation algorithm. To illustrate the process, it uses a three-layer neural network with two inputs and one output, shown in the figure below:

Figure: three-layer neural network with two inputs and one output

Each neuron is composed of two units. The first unit adds the products of the weight coefficients and the input signals. The second unit realises a nonlinear function, called the neuron activation function. Signal e is the adder output signal, and y = f(e) is the output signal of the nonlinear element. Signal y is also the output signal of the neuron.

Figure: composition of a neuron
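The two units of a neuron can be sketched in a few lines of Python. This is only an illustration: the text does not say which activation function f is used, so the logistic sigmoid below is an assumption, and the names `sigmoid` and `neuron_output` are hypothetical.

```python
import math

def sigmoid(e):
    """Logistic activation function (an assumed choice for f)."""
    return 1.0 / (1.0 + math.exp(-e))

def neuron_output(weights, inputs, f=sigmoid):
    """Return (e, y) for one neuron."""
    # First unit: the adder sums the products of weights and input signals.
    e = sum(w * x for w, x in zip(weights, inputs))
    # Second unit: the nonlinear element computes y = f(e).
    y = f(e)
    return e, y

# Example with two inputs and made-up weights.
e, y = neuron_output([0.5, -0.5], [1.0, 1.0])
```

Returning both e and y is a deliberate choice: e is needed again later, when the derivative df(e)/de enters the weight update.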

To train the neural network we need a training data set. The training data set consists of input signals (x1 and x2), each paired with a corresponding target (desired output) z. Network training is an iterative process. In each iteration the weight coefficients of the nodes are modified using new data from the training data set. The modification is calculated using the algorithm described below.

Each training step starts by applying both input signals from the training set to the network. After this stage we can determine the output signal values for each neuron in each network layer. The figures below illustrate how the signal propagates through the network. Symbols w(xm)n represent the weights of connections between network input xm and neuron n in the input layer. Symbols yn represent the output signal of neuron n.



Propagation of signals through the hidden layer. Symbols wmn represent the weights of connections between the output of neuron m and the input of neuron n in the next layer.


Propagation of signals through the output layer.
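The forward propagation described above can be sketched as follows. The layer sizes (2 inputs, then 3, 2 and 1 neurons), the weight values and the sigmoid activation are all assumptions made for illustration. Note that the code stores weights as `weights[n][m]` (receiving neuron first), which is the reverse of the index order in the wmn notation.

```python
import math

def sigmoid(e):
    """Logistic activation function (an assumed choice for f)."""
    return 1.0 / (1.0 + math.exp(-e))

def layer_forward(weights, inputs):
    """Outputs of one layer; weights[n][m] connects input m to neuron n."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]

def forward(layers, x):
    """Propagate x through all layers; return the signals of every layer."""
    activations = [list(x)]          # the network inputs come first
    for weights in layers:
        activations.append(layer_forward(weights, activations[-1]))
    return activations

# Hypothetical weights: 2 inputs -> 3 neurons -> 2 neurons -> 1 output neuron.
layers = [
    [[0.1, 0.2], [0.3, -0.1], [-0.2, 0.4]],
    [[0.5, -0.3, 0.2], [0.1, 0.1, -0.4]],
    [[0.7, -0.6]],
]
activations = forward(layers, [1.0, 0.5])
y = activations[-1][0]               # output signal of the network
```

Keeping the signals of every layer, not just the final y, is what later makes the weight update possible, since each neuron's update uses its own input signals.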

In the next step of the algorithm, the output signal of the network y is compared with the desired output value (the target), which is found in the training data set. The difference is called the error signal δ of the output layer neuron.
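Written as a formula, with z the target and y the network output as defined above, the error signal is presumably the following (the sign convention, target minus output, is an assumption consistent with a weight update that adds a term proportional to δ):

```latex
\delta = z - y
```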

It is impossible to compute the error signal for internal neurons directly, because the desired output values of these neurons are unknown. For many years no effective method for training multi-layer networks was known. Only in the mid-1980s was the backpropagation algorithm worked out. The idea is to propagate the error signal δ (computed in a single training step) back to all neurons whose output signals were inputs to the neuron in question.


The weight coefficients wmn used to propagate errors back are equal to those used when computing the output value. Only the direction of data flow is changed (signals are propagated from the output to the inputs, one layer after another). This technique is used for all network layers. If the propagated errors come from several neurons, they are added. The illustration is below:
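The backward step described above amounts to a weighted sum: the error signal of neuron m is the sum of wmn times δn over all neurons n that it feeds. The sketch below follows that description literally. The function name and the weight values are hypothetical, and the code indexes weights as `weights[n][m]`. In the standard gradient-descent derivation, each propagated term is additionally scaled by the activation derivative of the downstream neuron; that factor is left out here to match the text.

```python
def backpropagate_errors(weights, deltas_next):
    """Error signals of one layer from those of the next layer.

    weights[n][m] is the weight from neuron m (this layer) to neuron n
    (next layer), the same value used in the forward pass.
    """
    n_this = len(weights[0])
    return [
        # Errors arriving from several next-layer neurons are added.
        sum(weights[n][m] * deltas_next[n] for n in range(len(weights)))
        for m in range(n_this)
    ]

# Hypothetical example: 3 neurons feeding 2 neurons in the next layer.
w = [[0.5, -0.3, 0.2], [0.1, 0.1, -0.4]]
deltas = backpropagate_errors(w, [0.2, -0.1])
```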



When the error signal for each neuron has been computed, the weight coefficients of each neuron's input node may be modified. In the formulas below, df(e)/de represents the derivative of the activation function of the neuron whose weights are being modified.
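The update formulas themselves are missing from this copy of the article. From the quantities the text names (η, δ, df(e)/de and the input signal), the rule is presumably w' = w + η · δ · df(e)/de · x, where x is the signal on the connection being updated. The sketch below implements that assumed rule for a single neuron, with the sigmoid and its derivative as an assumed activation function.

```python
import math

def sigmoid(e):
    """Logistic activation function (an assumed choice for f)."""
    return 1.0 / (1.0 + math.exp(-e))

def sigmoid_derivative(e):
    """df(e)/de for the logistic function: f(e) * (1 - f(e))."""
    s = sigmoid(e)
    return s * (1.0 - s)

def update_weights(weights, inputs, delta, e, eta):
    """Assumed rule: w' = w + eta * delta * df(e)/de * x for each input x."""
    step = eta * delta * sigmoid_derivative(e)
    return [w + step * x for w, x in zip(weights, inputs)]

# One neuron with adder output e = 0, error signal 0.4 and eta = 0.5.
new_w = update_weights([0.5, -0.5], [1.0, 2.0], delta=0.4, e=0.0, eta=0.5)
```

In this example df(e)/de is 0.25 at e = 0, so every weight moves by 0.05 times its input signal.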






Coefficient η affects the network's training speed. There are a few techniques for selecting this parameter. The first method is to start the training process with a large value of the parameter and decrease it gradually as the weight coefficients become established. The second, more complicated, method starts training with a small parameter value. The parameter is increased as training advances and then decreased again in the final stage. Starting the training process with a low parameter value makes it possible to determine the signs of the weight coefficients.
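The two strategies for choosing η can be sketched as schedules that map the training step to a value of η. The text gives no formulas, so the function names, the shapes of the curves (inverse decay and linear ramps) and every default value below are assumptions chosen only for illustration.

```python
def decaying_eta(step, eta_start=0.5, decay=0.01):
    """First method: start large, then decrease gradually."""
    return eta_start / (1.0 + decay * step)

def rise_then_fall_eta(step, total_steps, eta_min=0.01, eta_max=0.5,
                       rise_fraction=0.3):
    """Second method: start small, increase, then decrease again."""
    rise_steps = max(1, int(total_steps * rise_fraction))
    if step < rise_steps:
        # Early phase: linear increase from eta_min towards eta_max.
        return eta_min + (eta_max - eta_min) * step / rise_steps
    # Final phase: linear decrease from eta_max back to eta_min.
    fall_steps = max(1, total_steps - rise_steps)
    return eta_max - (eta_max - eta_min) * (step - rise_steps) / fall_steps

# Value of eta at a few points of a hypothetical 100-step training run.
schedule = [rise_then_fall_eta(s, 100) for s in (0, 30, 100)]
```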

