simplicity. The Main method has all the control logic. There are two helper methods, Kernel and Accuracy.
To code the demo program, I launched Visual Studio and created a new C# console application named KernelPerceptron. I used Visual Studio 2015, but the demo program has no significant .NET Framework dependencies so any recent version will work.
After the template code loaded into the editor window, I right-clicked on file Program.cs in the Solution Explorer window and renamed the file to KernelPerceptronProgram.cs, then allowed Visual Studio to automatically rename class Program for me. At the top of the template-generated code, I deleted all unnecessary using statements, leaving just the one that references the top-level System namespace.
The Main method sets up the training and test data like so:
int numFeatures = 2;
double[][] trainData = new double[21][];
trainData[0] = new double[] { 2.0, 3.0, -1 };
...
trainData[20] = new double[] { 5.0, 6.0, +1 };
int numTrain = trainData.Length;

double[][] testData = new double[4][];
testData[0] = new double[] { 2.0, 4.0, -1 };
...
testData[3] = new double[] { 5.5, 5.5, +1 };
The demo uses two predictor variables (also called features in ML terminology) for simplicity, but kernel perceptrons can handle any number of predictor variables. The data is hardcoded but in a non-demo scenario you'd likely load the data from a text file. The demo uses -1 and +1 to represent the two possible classes. This encoding is typical for perceptrons, but classes can be encoded as 0 and 1 instead (though this encoding would require some changes to the code logic).
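For example, a sketch of reading such data from a comma-delimited text file might look like this; the LoadData helper and its file format are my assumptions, not part of the demo:

static double[][] LoadData(string fileName, int numItems, int numCols)
{
  // Assumes one item per line, values separated by commas,
  // with the class label (-1 or +1) in the last column
  double[][] result = new double[numItems][];
  string[] lines = System.IO.File.ReadAllLines(fileName);
  for (int i = 0; i < numItems; ++i) {
    string[] tokens = lines[i].Split(',');
    result[i] = new double[numCols];
    for (int j = 0; j < numCols; ++j)
      result[i][j] = double.Parse(tokens[j]);
  }
  return result;
}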
Training is prepared with these statements:
int[] a = new int[numTrain]; // "Wrong counters"
double sigma = 1.5; // For the kernel function
int maxEpoch = 10;
int epoch = 0;
Array a holds the wrong counters for each training item, and sigma is the free parameter for the RBF kernel function. The value of sigma and the maxEpoch loop control were determined by trial and error. Next, all possible kernel function values are pre-calculated and stored into a matrix:
double[][] kernelMatrix = new double[numTrain][];
for (int i = 0; i < kernelMatrix.Length; ++i)
  kernelMatrix[i] = new double[numTrain];
for (int i = 0; i < numTrain; ++i) {
  for (int j = 0; j < numTrain; ++j) {
    double k = Kernel(trainData[i], trainData[j], sigma);
    kernelMatrix[i][j] = kernelMatrix[j][i] = k;
  }
}
The idea is that during training, the kernel similarity between all pairs of training items will have to be used several times, so it makes sense to pre-calculate these values.
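The Kernel function itself isn't shown above, but a minimal sketch consistent with how the demo calls it might look like the following. I'm assuming the standard RBF definition K(v1, v2) = exp(-||v1 - v2||^2 / (2 * sigma^2)), and that the class label stored in the last cell of each training item is excluded from the distance calculation:

static double Kernel(double[] v1, double[] v2, double sigma)
{
  // Assumes the last cell of v1 holds the class label,
  // so it's excluded from the squared distance
  double num = 0.0;
  for (int i = 0; i < v1.Length - 1; ++i)
    num += (v1[i] - v2[i]) * (v1[i] - v2[i]);
  double denom = 2.0 * sigma * sigma;
  return Math.Exp(-num / denom);
}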
The training loop is:
while (epoch < maxEpoch) {
  for (int i = 0; i < numTrain; ++i) {
    // Get "desired" correct class into di
    for (int j = 0; j < numTrain; ++j) {
      // Get "other" desired class into dj
      // Compute y = weighted sum of products
    }
    if ((di == -1 && y >= 0.0) || (di == 1 && y <= 0.0))
      ++a[i]; // Increment wrong counter
  }
  ++epoch;
}
The desired, correct class (-1 or +1) is pulled from the current training item with this code:
int di = (int)trainData[i][numFeatures];
I use di here to stand for desired value. Two other common variable names are t (for target value) and y (which just stands for general output). The inner nested for loop that calculates the weighted sum of kernel values is the heart of the kernel perceptron learning algorithm:
double y = 0.0; // Weighted sum of kernel results
for (int j = 0; j < numTrain; ++j) {
  int dj = (int)trainData[j][numFeatures];
  double kern = kernelMatrix[i][j];
  y += a[j] * dj * kern;
}
The demo code calculates the kernel for all pairs of training items. But when the associated wrong counter a[j] is 0, the product term a[j] * dj * kern is 0. Therefore, an important optional optimization when the number of training items is large is to skip the kernel calculation whenever a[j] is 0. Equivalently, training items that have a wrong counter value of 0 can be removed entirely.
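For example, the inner loop shown earlier could be modified along these lines (a sketch that reuses the demo's variable names):

for (int j = 0; j < numTrain; ++j) {
  if (a[j] == 0) continue; // Term would be 0; skip it
  int dj = (int)trainData[j][numFeatures];
  y += a[j] * dj * kernelMatrix[i][j];
}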
I don’t compute an explicit predicted class because it’s easier to check if the predicted class is wrong directly:
if ((di == -1 && y >= 0.0) || (di == 1 && y <= 0.0))
  ++a[i]; // Wrong counter for curr data
You have to be careful to check y >= 0.0 or y <= 0.0 rather than y > 0.0 or y < 0.0 because the first time through the training loop, all wrong counter values in the a array are zero, so the weighted sum of products will be 0.0. With strict inequalities, no item would ever be counted as wrong on that first pass and no counter would ever be incremented, so training would never get started.
After the training loop terminates, the kernel perceptron is effectively defined by the training data, the wrong-counter array and the RBF kernel parameter sigma.
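To make that concrete, here's a sketch of how those three pieces could classify a new input x. The Predict helper name is mine, not part of the demo, and mapping a weighted sum of exactly zero to class -1 is an arbitrary tie-break:

static int Predict(double[] x, double[][] trainData,
  int[] a, double sigma)
{
  // x holds just the feature values, no class label
  int numFeatures = x.Length;
  double y = 0.0; // Weighted sum of kernel results
  for (int j = 0; j < trainData.Length; ++j) {
    int dj = (int)trainData[j][numFeatures];
    y += a[j] * dj * Kernel(trainData[j], x, sigma);
  }
  return (y > 0.0) ? +1 : -1; // Zero maps to -1 (arbitrary)
}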
Making Predictions
Helper function Accuracy makes predictions. The method's definition starts with:
static double Accuracy(double[][] data, int[] a,
  double[][] trainData, double sigma, bool verbose)
{
  int numFeatures = data[0].Length - 1;
  double[] x = new double[numFeatures];
  int numCorrect = 0;
  int numWrong = 0;
...
The parameter named data holds an array-of-arrays style matrix of data to evaluate. Parameter array a holds the wrong counter values generated by training.
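For reference, calls to Accuracy might look like the following; passing false and true for the verbose flag is my assumption about how the demo reports results:

double trainAcc = Accuracy(trainData, a, trainData, sigma, false);
double testAcc = Accuracy(testData, a, trainData, sigma, true);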
Inside the body of the function, the key code is exactly like the training code, except instead of incrementing a wrong counter for