Page 62 - MSDN Magazine, June 2017

primary input patterns back. The point is the RBM has deduced that the data can be placed into one of three buckets. The specific bit patterns aren’t important.
Yet another interpretation of this behavior is that an RBM acts as an auto-encoder. And it’s also possible to chain several RBMs together to create a prediction system called a deep belief network (DBN). In fact, this is arguably the most common use of RBMs.
Implementing a Restricted Boltzmann Machine
Once you understand how RBMs work, they’re actually quite simple. But coding a demo program is a bit more complex than you might expect. There are many design possibilities for an RBM. Take a look at the demo run in Figure 3.
The demo illustrates the film preference example from the previous sections of this article, so there are six visible nodes. The demo program defines a Machine object with 10 member fields. The first six fields in the class definition are:
public class Machine {
  public Random rnd;
  public int numVisible;
  public int numHidden;
  public int[] visValues;
  public double[] visProbs;
  public double[] visBiases;
...
All fields are declared with public scope for simplicity. The Random object is used when converting a node probability to a concrete zero or one value. Variables numVisible and numHidden (OK, OK, I know they're objects) hold the number of visible and hidden nodes, respectively. Integer array visValues holds the zero or one values of the visible nodes. Note that you can use a Boolean type if you wish. Double array visBiases holds the bias values associated with each visible node. Double array visProbs holds visible node probabilities. Note that the visProbs array isn't strictly necessary because node values can be computed on the fly; however, storing the probability values is useful if you want to examine the behavior of the RBM at runtime.
The other four Machine class fields are:
public int[] hidValues;
public double[] hidProbs;
public double[] hidBiases;
public double[][] vhWeights;
Arrays hidValues, hidBiases, and hidProbs are the hidden node values, associated bias values, and node probabilities, respectively. The vhWeights object is an array-of-arrays style matrix where the row index corresponds to a visible node and the column index corresponds to a hidden node.
The key class method computes the values of the hidden nodes using values in a parameter that corresponds to the visible nodes. That method’s definition begins with:
public int[] HiddenFromVis(int[] visibles) {
  int[] result = new int[numHidden];
  ...
Next, the calculations of the hidden-layer nodes are done node by node:
for (int h = 0; h < numHidden; ++h) {
  double sum = 0.0;
  for (int v = 0; v < numVisible; ++v)
    sum += visibles[v] * vhWeights[v][h];
  sum += hidBiases[h];              // Add the hidden bias
  double probActiv = LogSig(sum);   // Compute prob
  double pr = rnd.NextDouble();     // Determine 0/1
  if (probActiv > pr) result[h] = 1;
  else result[h] = 0;
}
The code mirrors the input-output mechanism explained earlier. Function LogSig is a private helper function, because the Microsoft .NET Framework doesn't have a built-in logistic sigmoid function (at least not one I'm aware of).
The key method concludes by returning the computed hidden node values to the caller:
  ...
  return result;
}
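One plausible implementation of the LogSig helper looks like the following. This is a sketch, not the demo's exact code: the plus-or-minus 20 clamping cutoffs are a common convention to guard against overflow, and the method is shown as public static here only so it can be called standalone.

```csharp
using System;

public static class RbmMath
{
  // Logistic sigmoid: 1 / (1 + e^-x), clamped for extreme inputs
  // so Math.Exp never overflows.
  public static double LogSig(double x)
  {
    if (x < -20.0) return 0.0;       // Effectively zero
    else if (x > 20.0) return 1.0;   // Effectively one
    else return 1.0 / (1.0 + Math.Exp(-x));
  }
}
```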
The rest of the demo code implements the CD-1 training algorithm as described earlier. The code isn't trivial, but if you examine it carefully, you should be able to make the connections between RBM concepts and implementation.
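The core CD-1 weight update can be sketched in isolation. Given visible and hidden states (v, h) sampled from a training vector, and states (v', h') sampled from its reconstruction, each weight moves in the direction of the "positive" statistics minus the "negative" statistics. This standalone function is an illustrative sketch, not the demo's actual code, and the full algorithm updates the visible and hidden bias arrays in the same spirit:

```csharp
using System;

public static class Cd1
{
  // One CD-1 weight update. v,h are 0/1 states sampled from the data;
  // vPrime,hPrime are 0/1 states sampled from the reconstruction.
  // Each weight moves by lr * (v[i]*h[j] - vPrime[i]*hPrime[j]).
  public static void UpdateWeights(double[][] w, int[] v, int[] h,
    int[] vPrime, int[] hPrime, double lr)
  {
    for (int i = 0; i < w.Length; ++i)
      for (int j = 0; j < w[i].Length; ++j)
        w[i][j] += lr * (v[i] * h[j] - vPrime[i] * hPrime[j]);
  }
}
```

Intuitively, the update strengthens weights between node pairs that co-activate on real data and weakens weights between pairs that co-activate on the model's own reconstructions.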
Wrapping Up
Restricted Boltzmann machines are simple and complicated at the same time. The RBM input-output mechanism is deterministic in how it computes node probabilities, but probabilistic (sometimes called stochastic) in how it converts those probabilities to concrete node values; either way, it's relatively easy to understand and implement. The more difficult aspect of RBMs, in my opinion, is understanding how they can be useful.
As a standalone software component, an RBM can act as a lossy compression machine to reduce the number of bits needed to represent some data, or can act as a probabilistic factor analysis component that identifies core concepts in a data set. When concatenated, RBMs can create a deep neural network structure called a "deep belief network" that can make predictions.
RBMs were invented in 1986 by my Microsoft colleague Paul Smolensky, but gained increased attention relatively recently when the CD-1 training algorithm was devised by researcher and Microsoft collaborator Geoffrey Hinton. Much of the information presented in this article is based on personal conversations with Smolensky, and the 2010 research paper, "A Practical Guide to Training Restricted Boltzmann Machines," by Hinton.
Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products including Internet Explorer and Bing. Dr. McCaffrey can be reached at jammc@microsoft.com.
Thanks to the following Microsoft technical experts who reviewed this article: Ani Anirudh, Qiuyuan Huang and Paul Smolensky
Test Run



























































