||In a cycle through all the data (with several iterations per row), the network is fed with the patterns.
- A row is loaded.
- The row is divided into 3 groups: input columns, output columns, and unused columns. The division follows the parameters of the network and the user's choice of which columns to use.
After the division there are:
- input data
- desired outputs for training
It is assumed that all the data are integers or doubles.
The function BP_network.TraceForward implements these steps of feeding the network with a pattern:
- The input layer has only one, normalized input, so the potential and the output of an input node are the same value as the input value (no sigmoid function is applied).
- For all subsequent layers and all their nodes:
- Compute the potential of the node. It starts at the threshold; then the output of every node from the previous layer (the sigmoid of its potential), multiplied by the weight of the edge between the nodes, is added.
- Compute the output of the node. It is the sigmoid of the potential: 1/(1+Math.Exp(-x*Par)), where "x" is the potential and "Par" is the slope of the sigmoid function.
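The forward pass described above can be sketched as follows. This is a minimal illustration in Python (the original code is C# in BP_network.TraceForward); the node structure and function names here are hypothetical.

```python
import math

def sigmoid(x, par):
    """Sigmoid with a slope parameter: 1 / (1 + exp(-x * par))."""
    return 1.0 / (1.0 + math.exp(-x * par))

def trace_forward(inputs, layers, par):
    """Feed one pattern through the network.

    layers: one entry per non-input layer; each entry is a list of
    nodes, each node a dict with 'threshold' and 'weights' (one
    weight per node of the previous layer).
    Returns the outputs of every layer, input layer first.
    """
    # Input layer: the output equals the normalized input, no sigmoid.
    outputs = [list(inputs)]
    for layer in layers:
        prev = outputs[-1]
        layer_out = []
        for node in layer:
            # The potential starts at the threshold, then the weighted
            # outputs of all previous-layer nodes are added.
            potential = node['threshold']
            for y, w in zip(prev, node['weights']):
                potential += y * w
            # The node's output is the sigmoid of its potential.
            layer_out.append(sigmoid(potential, par))
        outputs.append(layer_out)
    return outputs
```

With zero weights and a zero threshold the potential is 0, so the node output is sigmoid(0) = 0.5 regardless of the inputs.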
|Calculating the error of this pattern
||To obtain the total error and train the network correctly, it is necessary to calculate the error at every step.
The error of a pattern is the sum, over all output nodes, of the squared difference between the desired output and the real output of the node.
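As a sketch (Python; the original is C#, and the function name here is hypothetical), the pattern error is:

```python
def pattern_error(real, desired):
    """Sum of squared differences (desired - real) over the output nodes."""
    return sum((d - r) ** 2 for r, d in zip(real, desired))
```

For example, with real outputs [0.2, 0.8] and desired outputs [0.0, 1.0] the pattern error is 0.04 + 0.04 = 0.08.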
||Back propagation is the heart of training this network. It is implemented in the function BP_network.BackProp() and, because it is not trivial, it is divided into several steps. We have already defined the learning parameter, the learning moment, and the slope of the sigmoid function.
We use the derivative of the sigmoid function = SigmOfPot*(1-SigmOfPot)*Slope,
where SigmOfPot = 1/(1+Math.Exp(-x*Par)) with the potential as "x" (Par is the slope of the sigmoid function).
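The derivative identity can be written out as a small sketch (Python, hypothetical names; the original is C#):

```python
import math

def sigmoid(x, par):
    """SigmOfPot = 1 / (1 + exp(-x * par))."""
    return 1.0 / (1.0 + math.exp(-x * par))

def sigmoid_derivative(x, par):
    """d/dx of the sigmoid: SigmOfPot * (1 - SigmOfPot) * Slope."""
    s = sigmoid(x, par)
    return s * (1.0 - s) * par
```

At x = 0 with slope 1 the sigmoid is 0.5, so the derivative is 0.5 * 0.5 * 1 = 0.25.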
First we have to calculate the temporary parameter Delta:
- for the output layer (BP_Network.CalcDeltaOut() ): Delta = -(output - desired output) * derivative of the sigmoid function
- for each node (indexed as i) in a hidden layer: Delta = Sum(Delta(higher(j))*weight(ij))*y(i)*(1-y(i))*slopePar
- Delta(higher(j)) = the Delta already calculated for node "j" in the next layer
- weight(ij) = weight of the edge from node "i" to node "j"
- y(i) = output of node "i" = SigmOfPot mentioned above
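The two Delta formulas above can be sketched as follows (Python, hypothetical names; the original is C#). Note that the output Delta uses the derivative expressed through the node's output y = SigmOfPot.

```python
def delta_output(output, desired, slope):
    """Delta for an output node:
    -(output - desired) * derivative, with the derivative written
    as output * (1 - output) * slope."""
    return -(output - desired) * output * (1.0 - output) * slope

def delta_hidden(y_i, deltas_next, weights_to_next, slope):
    """Delta for hidden node i:
    Sum_j(Delta(j) * weight(ij)) * y(i) * (1 - y(i)) * slope,
    where Delta(j) are the already-calculated Deltas of the next layer."""
    s = sum(d * w for d, w in zip(deltas_next, weights_to_next))
    return s * y_i * (1.0 - y_i) * slope
```

For an output node with output 0.5, desired output 1.0, and slope 1, Delta = -(0.5 - 1.0) * 0.5 * 0.5 = 0.125.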
Then calculate the new thresholds of the nodes for all layers (except the input layer, whose nodes do not have thresholds):
- the old threshold is incremented by (calculated Delta * learning parameter)
Calculate the new weights of the edges to the next layer:
- the weight is incremented by
Learning parameter * Delta calculated in the next layer * output of this layer +
+ Learning moment * (weight - old weight)
- the old weight is stored in the node's weight_old parameter so that the learning moment is applied correctly
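The update rules above can be sketched as follows (Python, hypothetical names; the original is C#). The weight update returns both the new weight and the value to store in weight_old.

```python
def update_threshold(threshold, delta, learning_param):
    """The old threshold is incremented by Delta * learning parameter."""
    return threshold + delta * learning_param

def update_weight(weight, weight_old, delta_next, output,
                  learning_param, learning_moment):
    """Weight update with a momentum term:
    w += learning_param * Delta(next layer) * output(this layer)
         + learning_moment * (w - w_old).
    Returns (new_weight, new_weight_old): the current weight becomes
    weight_old so the learning moment is applied correctly next time."""
    new_weight = (weight
                  + learning_param * delta_next * output
                  + learning_moment * (weight - weight_old))
    return new_weight, weight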
||The total error is a global parameter of the network. It is incremented for each presented pattern according to the difference between the desired output and the real output of the fed network.
At the end, the total error is multiplied by 1/2.
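The accumulation of the total error can be sketched as follows (Python, hypothetical names; the original keeps the total error as a field of the network object):

```python
def total_error(patterns):
    """Accumulate the squared-error sums of all patterns,
    then multiply the total by 1/2 at the end.

    patterns: iterable of (real_outputs, desired_outputs) pairs."""
    err = 0.0
    for real, desired in patterns:
        # Per-pattern error: sum of squared differences over output nodes.
        err += sum((d - r) ** 2 for r, d in zip(real, desired))
    return 0.5 * err
```

For example, two patterns with errors 1.0 and 0.0 give a total error of 0.5 * (1.0 + 0.0) = 0.5.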