Page 66 - MSDN Magazine, April 2018
The demo defines a MatCopy function that can duplicate a matrix. MatCopy can be called:
float[][] B = MatCopy(A);
Here, B is a new, independent matrix with the same values as A. Be careful of code like:
float[][] B = A;
This makes B a second reference to the same matrix as A rather than a copy, so any change made through either name is visible through the other. This may be the behavior you want, but usually it isn't.
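The body of MatCopy isn't listed here, so the following is only a minimal sketch of what such a function might look like, assuming the array-of-arrays representation used throughout. The class name CopySketch is hypothetical and not part of the demo.

```csharp
using System;

static class CopySketch  // Hypothetical container class, not from the demo
{
  // Assumed shape of MatCopy: allocate fresh rows and copy every value,
  // so the result shares no storage with the source matrix
  public static float[][] MatCopy(float[][] m)
  {
    float[][] result = new float[m.Length][];
    for (int i = 0; i < m.Length; ++i)
    {
      result[i] = new float[m[i].Length];
      for (int j = 0; j < m[i].Length; ++j)
        result[i][j] = m[i][j];
    }
    return result;
  }
}
```

With this sketch, changing A[0][0] after float[][] B = CopySketch.MatCopy(A) leaves B[0][0] untouched, whereas a variable assigned with float[][] R = A sees the change. Note that A.Clone() wouldn't be enough for an array-of-arrays, because it copies only the outer array and the rows would still be shared.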
Matrix Element-Wise Operations
An LSTM cell implementation uses several element-wise functions on matrices, where each value in a matrix is used or modified. For example, function MatTanh is defined:
static float[][] MatTanh(float[][] m)
{
  int rows = m.Length; int cols = m[0].Length;
  float[][] result = MatCreate(rows, cols);
  for (int i = 0; i < rows; ++i) // Each row
    for (int j = 0; j < cols; ++j) // Each col
      result[i][j] = Tanh(m[i][j]);
  return result;
}
The function traverses its input matrix m and applies the hyperbolic tangent (tanh) to each value. Helper function Tanh is defined:
static float Tanh(float x)
{
  if (x < -10.0) return -1.0f;
  else if (x > 10.0) return 1.0f;
  return (float)(Math.Tanh(x));
}
Figure 4 Demo Program Structure
The demo also defines a MatSigmoid function that’s exactly like MatTanh except logistic-sigmoid is applied to each value. The logistic-sigmoid function is closely related to tanh and returns a value between 0.0 and 1.0 instead of between -1.0 and +1.0.
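MatSigmoid isn't listed here, so the following is a minimal sketch rather than the demo's code. It assumes the same traversal pattern as MatTanh. The class name SigmoidSketch and the scalar helper Sigmoid are hypothetical, and the sketch omits any input clipping of the kind the Tanh helper uses.

```csharp
using System;

static class SigmoidSketch  // Hypothetical container class, not from the demo
{
  // Logistic sigmoid: 1 / (1 + e^-x), which lies between 0.0 and 1.0
  public static float Sigmoid(float x)
  {
    return (float)(1.0 / (1.0 + Math.Exp(-x)));
  }

  // Assumed shape of MatSigmoid: apply Sigmoid to each value
  public static float[][] MatSigmoid(float[][] m)
  {
    int rows = m.Length; int cols = m[0].Length;
    float[][] result = new float[rows][];
    for (int i = 0; i < rows; ++i)
    {
      result[i] = new float[cols];
      for (int j = 0; j < cols; ++j)
        result[i][j] = Sigmoid(m[i][j]);
    }
    return result;
  }
}
```

The close relationship to tanh can be stated exactly: sigmoid(x) = (1 + tanh(x / 2)) / 2, so the sigmoid is a shifted and rescaled tanh.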
The demo defines a function MatSum that adds the values in two matrices of the same shape. If you look at math equation (1) in Figure 3, you’ll see that an LSTM adds three matrices. The demo overloads MatSum to work with two or three matrices.
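The MatSum overloads aren't listed here. The following is one plausible sketch, assuming all arguments have the same shape and that no shape checking is done. The class name SumSketch is hypothetical.

```csharp
using System;

static class SumSketch  // Hypothetical container class, not from the demo
{
  // Element-wise sum of two matrices of the same shape
  public static float[][] MatSum(float[][] a, float[][] b)
  {
    int rows = a.Length; int cols = a[0].Length;
    float[][] result = new float[rows][];
    for (int i = 0; i < rows; ++i)
    {
      result[i] = new float[cols];
      for (int j = 0; j < cols; ++j)
        result[i][j] = a[i][j] + b[i][j];
    }
    return result;
  }

  // Overload for three matrices, done in one pass to avoid a temporary
  public static float[][] MatSum(float[][] a, float[][] b, float[][] c)
  {
    int rows = a.Length; int cols = a[0].Length;
    float[][] result = new float[rows][];
    for (int i = 0; i < rows; ++i)
    {
      result[i] = new float[cols];
      for (int j = 0; j < cols; ++j)
        result[i][j] = a[i][j] + b[i][j] + c[i][j];
    }
    return result;
  }
}
```

The three-argument overload could also be written as MatSum(MatSum(a, b), c). The single-pass version simply avoids allocating an intermediate matrix.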
Function MatHada multiplies corresponding values in two matrices that have the same shape:
static float[][] MatHada(float[][] a, float[][] b)
{
  int rows = a.Length; int cols = a[0].Length;
  float[][] result = MatCreate(rows, cols);
  for (int i = 0; i < rows; ++i)
    for (int j = 0; j < cols; ++j)
      result[i][j] = a[i][j] * b[i][j];
  return result;
}
Element-wise multiplication is sometimes called the Hadamard function (or Hadamard product). In the math equations (4) and (5) in Figure 3, the Hadamard function is indicated by the open dot symbol. Don’t confuse the element-wise Hadamard function with matrix multiplication, which is a very different function.
Matrix Multiplication
If you haven’t seen matrix multiplication before, the operation is not at all obvious. Suppose A is a 3x2 matrix:
1.0, 2.0
3.0, 4.0
5.0, 6.0
And suppose B is a 2x4 matrix:
10.0, 11.0, 12.0, 13.0
14.0, 15.0, 16.0, 17.0
The result of C = AB (multiplying A and B) is a 3x4 matrix:
 38.0   41.0   44.0   47.0
 86.0   93.0  100.0  107.0
134.0  145.0  156.0  167.0
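The demo implements matrix multiplication as function MatProd. Each value in the result is the sum of the products of a row of A with a column of B; for example, the 38.0 at the upper left is (1.0 * 10.0) + (2.0 * 14.0). Note that when using C#, for very large matrices you can use the Parallel.For method in the Task Parallel Library.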
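MatProd isn't listed here, so the following is a minimal sketch of the standard row-by-column algorithm rather than the demo's code. The class name ProdSketch, the shape check, and the MatProdParallel variant are assumptions added for illustration. The parallel variant uses Parallel.For over the rows of the result.

```csharp
using System;
using System.Threading.Tasks;

static class ProdSketch  // Hypothetical container class, not from the demo
{
  // result[i][j] = sum over k of a[i][k] * b[k][j]
  public static float[][] MatProd(float[][] a, float[][] b)
  {
    int aRows = a.Length; int aCols = a[0].Length;
    int bRows = b.Length; int bCols = b[0].Length;
    if (aCols != bRows)  // Assumed check; columns of a must match rows of b
      throw new ArgumentException("Non-conformable matrices");
    float[][] result = new float[aRows][];
    for (int i = 0; i < aRows; ++i)
    {
      result[i] = new float[bCols];
      for (int j = 0; j < bCols; ++j)
        for (int k = 0; k < aCols; ++k)
          result[i][j] += a[i][k] * b[k][j];
    }
    return result;
  }

  // Same computation, with the rows of the result computed in parallel.
  // Each iteration writes only to its own row, so no locking is needed.
  public static float[][] MatProdParallel(float[][] a, float[][] b)
  {
    int aRows = a.Length; int aCols = a[0].Length;
    int bRows = b.Length; int bCols = b[0].Length;
    if (aCols != bRows)
      throw new ArgumentException("Non-conformable matrices");
    float[][] result = new float[aRows][];
    Parallel.For(0, aRows, i =>
    {
      result[i] = new float[bCols];
      for (int j = 0; j < bCols; ++j)
        for (int k = 0; k < aCols; ++k)
          result[i][j] += a[i][k] * b[k][j];
    });
    return result;
  }
}
```

Called with the 3x2 matrix A and the 2x4 matrix B shown above, both functions produce the 3x4 matrix C. For matrices as small as those in an LSTM cell demo, the overhead of Parallel.For is likely to outweigh any gain, so the serial version is the sensible default.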
To summarize, implementing an LSTM cell using C# requires several helper functions that create and operate on matrices (arrays-of-arrays) and vectors (matrices with one column). The demo code defines functions MatCreate, MatFromArray, MatCopy(m), MatSig(m), MatTanh(m), MatHada(a, b), MatSum(a, b), MatSum(a, b, c) and MatProd(a, b). Although not essential for creating an LSTM cell, it’s useful to have a function to display a C# matrix. The demo defines a MatPrint function.
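Neither MatCreate nor MatPrint is listed here. The sketch below shows one way they might be written. The MatCreate signature follows the calls in MatTanh and MatHada. For MatPrint, the meaning of the second and third arguments is an assumption (number of decimals, and whether to print a trailing blank line), and MatToString and the class name HelperSketch are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text;

static class HelperSketch  // Hypothetical container class, not from the demo
{
  // Allocate a rows-by-cols array-of-arrays; all values start at 0.0
  public static float[][] MatCreate(int rows, int cols)
  {
    float[][] result = new float[rows][];
    for (int i = 0; i < rows; ++i)
      result[i] = new float[cols];
    return result;
  }

  // Hypothetical helper: format a matrix with dec decimals, one row per line
  public static string MatToString(float[][] m, int dec)
  {
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < m.Length; ++i)
    {
      for (int j = 0; j < m[i].Length; ++j)
        sb.Append(m[i][j].ToString("F" + dec, CultureInfo.InvariantCulture) + " ");
      sb.Append(Environment.NewLine);
    }
    return sb.ToString();
  }

  // Assumed parameter meanings: dec = decimals, nl = trailing blank line
  public static void MatPrint(float[][] m, int dec, bool nl)
  {
    Console.Write(MatToString(m, dec));
    if (nl) Console.WriteLine("");
  }
}
```

Separating the formatting from the printing is a design choice made only for this sketch, because it makes the output easy to check without capturing the console.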
Implementing and Calling LSTM Input-Output
The code for function ComputeOutputs is presented in Figure 5. The function has 15 parameters, but 12 of them are essentially the same.
Vector xt is the input, such as (1.0, 2.0). Vectors h_prev and c_prev are the previous output vector and previous cell state. The four W matrices are the gate weights associated with input values, where f is a forget gate, i is an input gate and o is an output gate. The four U matrices are the weights associated with cell output. The four b vectors are biases.
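To see how the 15 parameters fit together, it can help to write out the widely used LSTM formulation in terms of them. The equations below are that standard formulation, not a reproduction of Figure 3, so the notation in the figure may differ. Here h_{t-1} and c_{t-1} correspond to h_prev and c_prev, \sigma is the logistic-sigmoid function and \circ is the element-wise Hadamard function.

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) \\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) \\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) \\
c_t &= f_t \circ c_{t-1} + i_t \circ \tanh(W_c x_t + U_c h_{t-1} + b_c) \\
h_t &= o_t \circ \tanh(c_t)
\end{aligned}
```

Each of the f, i, o and c lines uses one W matrix, one U matrix and one b vector in exactly the same way, which is why 12 of the 15 parameters are essentially the same.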
using System;
namespace LSTM_IO
{
  class LSTM_IO_Program
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Begin LSTM IO demo");
      // Set up inputs
      // Set up weights and biases
      float[][] ht, ct; // Outputs, new state
      float[][][] result;
      result = ComputeOutputs(xt, h_prev, c_prev,
        Wf, Wi, Wo, Wc, Uf, Ui, Uo, Uc, bf, bi, bo, bc);
      ht = result[0]; // Outputs
      ct = result[1]; // New state
      Console.WriteLine("Output is:");
      MatPrint(ht, 4, true);
      Console.WriteLine("New cell state is:");
      MatPrint(ct, 4, true);
      // Set up new inputs
      // Call ComputeOutputs again
      Console.WriteLine("End LSTM demo");
    }
    static float[][][] ComputeOutputs(float[][] xt,
      float[][] h_prev, float[][] c_prev,
      float[][] Wf, float[][] Wi, float[][] Wo, float[][] Wc,
      float[][] Uf, float[][] Ui, float[][] Uo, float[][] Uc,
      float[][] bf, float[][] bi, float[][] bo, float[][] bc)
    {.. }
    // Helper matrix functions defined here
  }
}