Page 58 - MSDN Magazine, April 2017
related. Suppose there are just four training data items: td[0] = (2.0, 3.0, -1), td[1] = (3.0, 2.0, +1), td[2] = (3.0, 5.0, +1), td[3] = (4.0, 3.0, -1). And suppose the trained kernel perceptron model gave you wrong counter values of a = (1, 2, 3, 4). (I use “a” for the wrong counters because in research literature they’re usually given the symbol Greek alpha, which resembles lowercase English “a.”)
To predict the class of new data item x = (3.0, 6.0), you compute the weighted sum (by a-value and training item correct class) of the kernel function applied to the new data item and each of the four training items:
K(td[0], x) = 0.1083
K(td[1], x) = 0.0285
K(td[2], x) = 0.8007
K(td[3], x) = 0.1083

sum = (1)(0.1083)(-1) + (2)(0.0285)(+1) + (3)(0.8007)(+1) + (4)(0.1083)(-1)
    = +1.9175

prediction = sign(sum)
           = +1
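The arithmetic above can be sketched in a few lines of Python (used here instead of the demo's C# to keep the sketch short). The kernel values and wrong counters are the example numbers from the text, not a trained model:

```python
# Weighted-sum prediction for x = (3.0, 6.0) using the example values above.
a = [1, 2, 3, 4]                       # hypothetical wrong counters
y = [-1, +1, +1, -1]                   # correct classes of the 4 training items
k = [0.1083, 0.0285, 0.8007, 0.1083]   # K(td[i], x) values from the text

s = sum(a[i] * k[i] * y[i] for i in range(4))
prediction = 1 if s >= 0.0 else -1
print(s, prediction)  # approximately +1.9176 with these rounded K values; +1
```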
Stated somewhat loosely, the kernel perceptron looks at the similarity of the data item to be classified and all training data items, and aggregates the similarity values—using the wrong counters as weights—into a single value that indicates predicted class.
Behind the scenes, the kernel perceptron uses a function called a radial basis function (RBF) kernel.
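A minimal sketch of an RBF kernel, in Python rather than the demo's C#. Assuming the common form exp(-||d1 - d2||² / (2 * sigma²)), sigma = 1.5 reproduces the K values shown earlier to about four decimals (the text appears to truncate the last digit rather than round):

```python
import math

def kernel(d1, d2, sigma):
    # RBF kernel: similarity is 1.0 for identical items and decays
    # toward 0.0 as the squared Euclidean distance grows.
    dist_sq = sum((p - q) ** 2 for p, q in zip(d1, d2))
    return math.exp(-dist_sq / (2.0 * sigma * sigma))

x = (3.0, 6.0)
print(kernel((2.0, 3.0), x, 1.5))  # approx 0.1084 (0.1083 in the text)
print(kernel((3.0, 5.0), x, 1.5))  # approx 0.8007
```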
So, to make a kernel perceptron prediction you need the training data and the associated wrong counter values. When you train an ordinary perceptron, you use an iterative process. In each iteration, you use the current values of the weights and bias to calculate a predicted class. If the predicted class is incorrect (doesn’t match the class in the training data), you adjust the weights a bit so that the predicted class is closer to the known correct class value.
In a kernel perceptron, you use a similar iterative training process, but instead of adjusting weight values when a calculated class is wrong, you increment the wrong counter for the current training item. Quite remarkable! The math proof of why this works is stunningly beautiful and can be found in the Wikipedia entry for kernel perceptrons.
Expressed in high-level pseudo-code, the kernel perceptron training algorithm is:
loop a few times
  for each training item, i
    sum = 0.0
    for each training item, j
      sum += a[j] * K(td[i], td[j], sigma) * y[j]
    end-for
    pred = sign(sum)
    if pred is wrong, increment a[i]
  end-for
end-loop
The two parameters needed in the training algorithm are the number of times to iterate and the value of sigma for the kernel function. Both of these values must be determined by a bit of trial and error. The demo program iterates 10 times and uses 1.5 for sigma.
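The training pseudo-code can be sketched concretely as follows (Python rather than the demo's C#). The RBF kernel form and the parameter values maxEpoch = 10 and sigma = 1.5 come from the article; the four-item data set is just the small example used earlier:

```python
import math

def kernel(d1, d2, sigma):
    # RBF kernel over the two feature values (class label excluded)
    dist_sq = sum((p - q) ** 2 for p, q in zip(d1, d2))
    return math.exp(-dist_sq / (2.0 * sigma * sigma))

def train(train_data, max_epoch, sigma):
    # Each row is (x0, x1, class) where class is -1 or +1.
    n = len(train_data)
    a = [0] * n  # wrong counters, one per training item
    for epoch in range(max_epoch):
        for i in range(n):
            s = 0.0
            for j in range(n):
                k = kernel(train_data[i][:2], train_data[j][:2], sigma)
                s += a[j] * k * train_data[j][2]
            pred = 1 if s >= 0.0 else -1
            if pred != train_data[i][2]:
                a[i] += 1  # increment the wrong counter, not a weight
    return a

td = [(2.0, 3.0, -1), (3.0, 2.0, +1), (3.0, 5.0, +1), (4.0, 3.0, -1)]
a = train(td, max_epoch=10, sigma=1.5)
print(a)
```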
Note that the ordinary perceptron training algorithm uses training data to generate the weights and bias values, and then conceptually discards the training data. The kernel perceptron training algorithm generates wrong counter values that are mathematically related to weights, but the algorithm must keep the training data in order to make predictions.
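That retained-data requirement shows up directly in a prediction sketch (Python; the kernel form, data set and wrong counters are the earlier example values):

```python
import math

def kernel(d1, d2, sigma):
    # RBF kernel over the two feature values (class label excluded)
    dist_sq = sum((p - q) ** 2 for p, q in zip(d1, d2))
    return math.exp(-dist_sq / (2.0 * sigma * sigma))

def predict(x, train_data, a, sigma):
    # The training data is part of the model: prediction needs every
    # stored item, its class label, and its wrong counter.
    s = 0.0
    for i, row in enumerate(train_data):
        s += a[i] * kernel(row[:2], x, sigma) * row[2]
    return 1 if s >= 0.0 else -1

td = [(2.0, 3.0, -1), (3.0, 2.0, +1), (3.0, 5.0, +1), (4.0, 3.0, -1)]
a = [1, 2, 3, 4]  # the example wrong counters from earlier
print(predict((3.0, 6.0), td, a, sigma=1.5))  # +1, matching the worked example
```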
The Demo Program Structure
The overall structure of the demo program, with a few minor edits to save space, is presented in Figure 3. I used a static method style rather than an object-oriented programming style for simplicity.
Figure 3 Kernel Perceptron Demo Program Structure
using System;
namespace KernelPerceptron
{
  class KernelPerceptronProgram
  {
    static void Main(string[] args)
    {
      Console.WriteLine("Begin demo ");
      int numFeatures = 2;
      Console.WriteLine("Goal is classification(-1/+1) ");
      Console.WriteLine("Setting up 21 training items ");

      double[][] trainData = new double[21][];
      trainData[0] = new double[] { 2.0, 3.0, -1 };
      trainData[1] = new double[] { 2.0, 5.0, -1 };
      . . .
      trainData[20] = new double[] { 5.0, 6.0, +1 };
      int numTrain = trainData.Length;

      double[][] testData = new double[4][];
      testData[0] = new double[] { 2.0, 4.0, -1 };
      . . .
      testData[3] = new double[] { 5.5, 5.5, +1 };

      int[] a = new int[trainData.Length];
      int maxEpoch = 10;
      int epoch = 0;
      double sigma = 1.5;  // for the kernel function

      Console.WriteLine("Starting train, sigma = 1.5 ");
      . . .
      Console.WriteLine("Training complete ");

      double trainAcc = Accuracy(trainData, a,
        trainData, sigma, false);  // silent
      Console.WriteLine("Accuracy = " + trainAcc.ToString("F4"));

      Console.WriteLine("Analyzing test data: ");
      double testAcc = Accuracy(testData, a,
        trainData, sigma, true);  // verbose

      Console.WriteLine("End kernel perceptron demo ");
      Console.ReadLine();
    } // Main

    static double Kernel(double[] d1, double[] d2, double sigma) { . . }

    static double Accuracy(double[][] data, int[] a,
      double[][] trainData, double sigma, bool verbose) { . . }
  } // Program
} // ns