Page 57 - MSDN Magazine, June 2019

double[] eTerms = new double[nc];
for (int k = 0; k < nc; ++k) {
  double v = 1.0;
  for (int j = 0; j < nx; ++j) {
    v *= (double)(jointCts[j][k]) / (yCts[k] + nx);
  }
  v *= (double)(yCts[k]) / N;
  eTerms[k] = v;
}
Notice that v is the product of nx+1 fractions that are all less than 1.0. This isn’t a problem for the demo program because there are only nx = 3 predictor values and none of the fraction terms can ever be less than 1/27 = 0.0370. But in some situations, you could run into arithmetic trouble by multiplying many very small values.
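To see why multiplying many small fractions is risky, here is a minimal Python sketch (the count of 100 factors and the value 1e-5 are assumed illustrative numbers, not values from the demo program):

```python
import math

# Illustration: multiplying many fractions much less than 1.0 can
# underflow to 0.0, while summing their logs stays representable.
small = [1e-5] * 100   # 100 small fractions

direct = 1.0
for f in small:
    direct *= f        # true product is 1e-500, below double range

log_sum = sum(math.log(f) for f in small)  # finite, about -1151.3

print(direct)   # 0.0 because of floating-point underflow
print(log_sum)
```

The direct product silently collapses to 0.0, while the sum of logs remains a perfectly ordinary double, which is exactly the situation the log trick described next is designed to handle.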
A technique for avoiding arithmetic errors when computing the evidence terms is the log trick, which takes advantage of the facts that log(A * B) = log(A) + log(B) and log(A / B) = log(A) - log(B). By computing the log of each evidence term, and then applying the exp function to the result, you can use addition and subtraction of many small values instead of multiplication and division. One possible refactoring using the log trick is:
double[] eTerms = new double[nc];
for (int k = 0; k < nc; ++k) {
  double v = 0.0;
  for (int j = 0; j < nx; ++j) {
    v += Math.Log(jointCts[j][k]) - Math.Log(yCts[k] + nx);
  }
  v += Math.Log(yCts[k]) - Math.Log(N);
  eTerms[k] = Math.Exp(v);
}
Generating the Predicted Class
After the evidence terms have been computed, the demo program sums them and uses them to compute pseudo-probabilities. I call these values pseudo-probabilities because, although they sum to 1.0, they don’t really represent the likelihood of a result from many repeated sampling experiments. However, you can cautiously interpret pseudo-probabilities as mild forms of confidence. For example, pseudo-probabilities of (0.97, 0.03) suggest class 0 with a bit more strength than (0.56, 0.44).
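The normalization step can be sketched in a few lines of Python (the evidence values 0.0027 and 0.0063 are assumed example numbers, not output from the demo run):

```python
# Assumed example evidence terms for class 0 and class 1.
e_terms = [0.0027, 0.0063]

# Dividing each evidence term by the sum of all terms yields
# pseudo-probabilities that sum to 1.0.
total = sum(e_terms)
pseudo_probs = [e / total for e in e_terms]

print(pseudo_probs)
```

With these assumed inputs, the pseudo-probabilities work out to roughly (0.30, 0.70), which would suggest class 1 with moderate confidence.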
The predicted class is generated by calling the program-defined ArgMax function, which returns the index of the largest value in a numeric array. For example, if an array holds values (0.20, 0.50, 0.90, 0.10) then ArgMax returns 2.
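The article's ArgMax is program-defined, so here is a minimal Python sketch of the same idea (the name arg_max and the plain-list signature are my own):

```python
def arg_max(values):
    """Return the index of the largest value in a numeric list."""
    best_idx = 0
    for i in range(1, len(values)):
        if values[i] > values[best_idx]:
            best_idx = i
    return best_idx

print(arg_max([0.20, 0.50, 0.90, 0.10]))  # 2
```

Note that when two values tie for the maximum, this version returns the first (lowest) index.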
Wrapping Up
The demo program presented in this article performs binary classification because there are only two class values. The program logic can also be used without modification for multiclass classification. The predictor values in the demo program are all categorical. If your data has numeric values, such as a weight variable with values like 3.78 and 9.21, you can apply the technique presented in this article by binning the numeric data into categories such as light, medium and heavy.

Figure 4 File BayesData.txt Contents

Aqua,Small,Twisted,1
Blue,Small,Pointed,0
Dune,Large,Rounded,0
Dune,Small,Rounded,1
Cyan,Large,Rounded,0
Aqua,Small,Rounded,1
Aqua,Small,Rounded,0
Cyan,Small,Pointed,1
Cyan,Small,Pointed,1
Dune,Small,Rounded,1
Dune,Small,Rounded,0
Dune,Small,Rounded,1
Dune,Small,Rounded,1
Cyan,Small,Pointed,1
Dune,Small,Rounded,1
Dune,Large,Rounded,0
Cyan,Small,Twisted,1
Blue,Small,Rounded,0
Aqua,Small,Pointed,1
Aqua,Small,Pointed,1
Dune,Small,Twisted,0
Blue,Small,Rounded,0
Dune,Small,Rounded,0
Blue,Small,Twisted,0
Dune,Small,Rounded,0
Aqua,Large,Pointed,1
Dune,Large,Rounded,0
Dune,Small,Rounded,0
Dune,Small,Rounded,0
Cyan,Large,Rounded,0
Dune,Small,Twisted,0
Dune,Large,Twisted,0
Dune,Small,Rounded,0
Dune,Small,Rounded,0
Dune,Large,Rounded,0
Aqua,Large,Rounded,1
Aqua,Small,Rounded,0
Aqua,Small,Rounded,1
Dune,Small,Rounded,0
Blue,Small,Rounded,0
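The binning idea can be sketched as a small Python function. The article names the categories (light, medium, heavy) but not the boundaries, so the cut points 5.0 and 8.0 below are assumed values for illustration:

```python
def to_category(weight):
    """Bin a numeric weight into an assumed categorical value."""
    if weight < 5.0:       # assumed cut point
        return "light"
    elif weight < 8.0:     # assumed cut point
        return "medium"
    return "heavy"

print(to_category(3.78))  # light
print(to_category(9.21))  # heavy
```

After binning, each numeric predictor is treated exactly like the color, size and texture predictors, so the counting logic in the demo program needs no changes.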
There are several other forms of naive Bayes classification in addition to the type presented in this article. One form uses the same underlying math principles as those used by the demo program, but can handle data where all the predictor values are numeric. However, you must make assumptions about the math properties of the data, such as that the data follows a normal (Gaussian) distribution with a certain mean and standard deviation.
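As a rough sketch of that Gaussian idea, a count ratio is replaced by a normal probability density evaluated at the numeric predictor value. The mean 5.0 and standard deviation 2.0 below are assumed example values, not statistics from the demo data:

```python
import math

def gaussian_pdf(x, mean, sd):
    """Normal (Gaussian) probability density at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

# Assumed example: a numeric predictor with mean 5.0 and sd 2.0
# for some class; the density plays the role of a count ratio.
print(gaussian_pdf(5.0, 5.0, 2.0))  # peak density, about 0.1995
```

Values near the class mean get a high density and therefore contribute strong evidence for that class, just as a large joint count does in the categorical version.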
Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several key Microsoft products including Azure and Bing. Dr. McCaffrey can be reached at jamccaff@microsoft.com.
Thanks to the following Microsoft technical experts for reviewing this article: Chris Lee, Ricky Loynd, Kirk Olynyk

































































































