
The softmax of three arbitrary values x, y and z is:

softmax(x) = e^x / (e^x + e^y + e^z)
softmax(y) = e^y / (e^x + e^y + e^z)
softmax(z) = e^z / (e^x + e^y + e^z)
where e is Euler’s number, approximately 2.718282. So, for the DNN in Figure 1, the final output values are:
output[0] = e^0.5628 / (e^0.5628 + e^0.5823 + e^0.6017) = 0.3269
output[1] = e^0.5823 / (e^0.5628 + e^0.5823 + e^0.6017) = 0.3333
output[2] = e^0.6017 / (e^0.5628 + e^0.5823 + e^0.6017) = 0.3398
The purpose of the softmax activation function is to coerce the output values to sum to 1.0 so that they can be interpreted as probabilities and mapped to a categorical value. In this example, because the third output value is the largest, whatever categorical value was encoded as (0, 0, 1) would be the predicted category for inputs = (1.0, 2.0).
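Softmax isn't shown as code in the demo excerpt, but a minimal C# sketch is easy to write. The Softmax method name and the max-subtraction normalization here are my additions, not part of the demo program:

static double[] Softmax(double[] values)
{
  // Subtracting the largest value before calling Math.Exp is a
  // standard trick to avoid arithmetic overflow for large inputs;
  // it doesn't change the mathematical result.
  double max = values[0];
  for (int i = 1; i < values.Length; ++i)
    if (values[i] > max) max = values[i];

  double sum = 0.0;
  double[] result = new double[values.Length];
  for (int i = 0; i < values.Length; ++i) {
    result[i] = Math.Exp(values[i] - max);
    sum += result[i];
  }
  for (int i = 0; i < values.Length; ++i)
    result[i] /= sum;  // scaled values sum to 1.0
  return result;
}

Calling Softmax(new double[] { 0.5628, 0.5823, 0.6017 }) returns (0.3269, 0.3333, 0.3398), matching the hand calculation above.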
Implementing a DeepNet Class
To create the demo program, I launched Visual Studio and selected the C# Console Application template and named it DeepNetInputOutput. I used Visual Studio 2015, but the demo has no significant .NET dependencies, so any version of Visual Studio will work.
After the template code loaded, in the Solution Explorer window, I right-clicked on file Program.cs and renamed it to the more descriptive DeepNetInputOutputProgram.cs and allowed Visual Studio to automatically rename class Program for me. At the top of the editor window, I deleted all unnecessary using statements, leaving just the one that references the System namespace.
I implemented the demo DNN as a class named DeepNet. The class definition begins with:
public class DeepNet {
  public static Random rnd;
  public int nInput;
  public int[] nHidden;
  public int nOutput;
  public int nLayers;
...
All class members are declared with public scope for simplicity. The static Random object member named rnd is used by the DeepNet class to initialize weights and biases to small random values (which are then overwritten with values 0.01 to 0.37). Members nInput and nOutput are the number of input and output nodes. Array member nHidden holds the number of nodes in each hidden layer, so the number of hidden layers is given by the Length property of the array, which is stored into member nLayers for convenience. The class definition continues:
public double[] iNodes;
public double[][] hNodes;
public double[] oNodes;

Array members iNodes and oNodes hold the input and output values, as you'd expect. Array-of-arrays member hNodes holds the hidden node values. An alternative design is to store all nodes in a single array-of-arrays structure nnNodes, where in the demo nnNodes[0] is an array of input node values and nnNodes[4] is an array of output node values.

The node-to-node weights are stored using these data structures:

public double[][] ihWeights;
public double[][][] hhWeights;
public double[][] hoWeights;

Member ihWeights is an array-of-arrays-style matrix that holds the input-to-first-hidden-layer weights. Member hoWeights is an array-of-arrays-style matrix that holds the weights connecting the last-hidden-layer nodes to the output nodes. Member hhWeights is an array where each cell points to an array-of-arrays matrix that holds the hidden-to-hidden weights. For example, hhWeights[0][3][1] holds the weight connecting hidden node [3] in hidden layer [0] to hidden node [1] in hidden layer [0+1]. These data structures are the heart of the DNN input-output mechanism and are a bit tricky. A conceptual diagram of them is shown in Figure 4.

[Figure 4: Weights and Biases Data Structures]
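The demo's memory-allocation code isn't shown in this excerpt, but a plausible sketch of how hhWeights might be allocated, assuming the constructor has already set nHidden and nLayers, looks like this:

hhWeights = new double[nLayers - 1][][];
for (int h = 0; h < nLayers - 1; ++h) {
  // matrix h connects hidden layer h to hidden layer h+1
  hhWeights[h] = new double[nHidden[h]][];
  for (int j = 0; j < nHidden[h]; ++j)
    hhWeights[h][j] = new double[nHidden[h + 1]];
}

For the 2-(4-2-2)-3 demo network, nHidden is { 4, 2, 2 }, so this creates two matrices, with dimensions 4x2 and 2x2.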
The last two class members hold the hidden node biases and the output node biases:
public double[][] hBiases;
public double[] oBiases;
As much as any software system I work with, DNNs have many alternative data structure designs, and having a sketch of these data structures is essential when writing input-output code.
Computing the Number of Weights and Biases
To set the weights and biases values, it's necessary to know how many weights and biases there are. The demo program implements the static method NumWeights to calculate and return this number. Recall that the 2-(4-2-2)-3 demo network has (2 * 4) + (4 * 2) + (2 * 2) + (2 * 3) = 26 weights and 4 + 2 + 2 + 3 = 11 biases. The key code in method NumWeights, which calculates the number of input-to-hidden, hidden-to-hidden and hidden-to-output weights, is:
int ihWts = numInput * numHidden[0];
int hhWts = 0;
for (int j = 0; j < numHidden.Length - 1; ++j) {
  int rows = numHidden[j];
  int cols = numHidden[j + 1];
  hhWts += rows * cols;
}
int hoWts = numHidden[numHidden.Length - 1] * numOutput;
Instead of returning the total number of weights and biases as method NumWeights does, you might want to consider returning the number of weights and biases separately, in a two-cell integer array.
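A sketch of that alternative, using a hypothetical name NumWeightsBiases that isn't part of the demo program, might look like:

public static int[] NumWeightsBiases(int numInput, int[] numHidden, int numOutput)
{
  int numWts = numInput * numHidden[0];  // input-to-hidden
  for (int j = 0; j < numHidden.Length - 1; ++j)
    numWts += numHidden[j] * numHidden[j + 1];  // hidden-to-hidden
  numWts += numHidden[numHidden.Length - 1] * numOutput;  // hidden-to-output

  int numBiases = numOutput;  // one bias per hidden and output node
  for (int j = 0; j < numHidden.Length; ++j)
    numBiases += numHidden[j];

  return new int[] { numWts, numBiases };  // [0] = weights, [1] = biases
}

For the demo network this returns { 26, 11 }.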
Setting Weights and Biases
A non-demo DNN typically initializes all weights and biases to small random values. The demo program sets the 26 weights to 0.01 through 0.26, and the biases to 0.27 through 0.37, using class method SetWeights. The definition begins with:
public void SetWeights(double[] wts) {
  int nw = NumWeights(this.nInput, this.nHidden, this.nOutput);
  if (wts.Length != nw)
    throw new Exception("Bad wts[] length in SetWeights()");
  int ptr = 0;
...
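The rest of SetWeights isn't shown here, but the idea is that ptr walks sequentially through the wts array, copying values into each weight and bias structure in turn. A sketch of the first copy loop, for the input-to-hidden weights, assuming the structures are filled in declaration order, is:

for (int i = 0; i < nInput; ++i)
  for (int j = 0; j < nHidden[0]; ++j)
    ihWeights[i][j] = wts[ptr++];  // next unused value from wts

Analogous nested loops would then fill hhWeights, hoWeights, hBiases and oBiases.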