Page 73 - MSDN Magazine, November 2017

while (iter < maxIter) {
  Shuffle(indices);
  for (int idx = 0; idx < indices.Length; ++idx) {
    int i = indices[idx];
The Shuffle method is a helper that scrambles the order of the training items using the Fisher-Yates mini-algorithm. The target class label and the predicted probability of the current training item are computed like so:
    double sum = 0.0;
    for (int j = 0; j < alphas.Length-1; ++j)
      sum += alphas[j] * kernelMatrix[i][j];
    sum += alphas[alphas.Length - 1];
    double y = 1.0 / (1.0 + Math.Exp(-sum));
    double t = trainData[i][numFeatures];
Notice that this design assumes that the class label is in the last cell of a training data array. Next, the alphas and the bias value are updated:
    for (int j = 0; j < alphas.Length - 1; ++j)
      alphas[j] = alphas[j] + (eta * (t - y) * kernelMatrix[i][j]);
    alphas[alphas.Length-1] = alphas[alphas.Length - 1] + (eta * (t - y)) * 1;
  }
  ++iter;
} // While (train)
Updating the bias value uses a dummy value of 1 in place of the kernel similarity value, just to make the symmetry of the relationship clear. Of course, you can remove the multiplication by 1 because it has no effect. After training, a few of the alpha values and the bias value are displayed, as shown in Figure 1.
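The training fragments above can be pulled together into one self-contained sketch. This is an illustration under stated assumptions, not the demo's actual code: the class name KlrTrainSketch, the TrainEpoch method, the separate targets array (the demo keeps the label in the last cell of each training item) and the Random parameter are all hypothetical, and the Shuffle shown is a standard Fisher-Yates implementation that may differ in detail from the demo's helper.

```csharp
using System;

public static class KlrTrainSketch
{
  // Fisher-Yates shuffle: swap each cell with a randomly
  // chosen cell at or after it.
  public static void Shuffle(int[] indices, Random rnd)
  {
    for (int i = 0; i < indices.Length - 1; ++i)
    {
      int r = rnd.Next(i, indices.Length); // i <= r < Length
      int tmp = indices[i];
      indices[i] = indices[r];
      indices[r] = tmp;
    }
  }

  // One pass over all training items, visited in shuffled order.
  // alphas holds one weight per training item, plus the bias
  // in its last cell. targets is a hypothetical array of 0/1 labels.
  public static void TrainEpoch(double[][] kernelMatrix,
    double[] targets, double[] alphas, double eta,
    int[] indices, Random rnd)
  {
    Shuffle(indices, rnd);
    int n = alphas.Length - 1; // number of training items
    for (int idx = 0; idx < indices.Length; ++idx)
    {
      int i = indices[idx];
      double sum = alphas[n]; // start with the bias
      for (int j = 0; j < n; ++j)
        sum += alphas[j] * kernelMatrix[i][j];
      double y = 1.0 / (1.0 + Math.Exp(-sum)); // predicted probability
      double t = targets[i]; // target label, 0 or 1
      for (int j = 0; j < n; ++j)
        alphas[j] += eta * (t - y) * kernelMatrix[i][j];
      alphas[n] += eta * (t - y); // bias update, implicit multiply by 1
    }
  }
}
```

Calling TrainEpoch inside a loop that runs maxIter times would play the role of the while loop shown earlier. Passing the Random object in, rather than creating it inside Shuffle, is a design choice made here so that a fixed seed gives reproducible runs.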
The demo program concludes by computing the classification accuracy of the trained KLR model on the training and test data:
double accTrain = Accuracy(trainData, trainData, alphas, sigma, false);
Console.WriteLine("accuracy = " + accTrain.ToString("F4") + "\n");
double accTest = Accuracy(testData, trainData, alphas, sigma, true); // Verbose
The Boolean argument passed to method Accuracy indicates whether to compute in verbose mode (with diagnostic messages) or silent mode.
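To make the accuracy computation concrete, here is a minimal sketch of what a method with the same parameter list could look like. It is not the demo's implementation. It assumes a radial basis function kernel of the form exp(-||v1 - v2||^2 / (2 * sigma^2)), a 0.5 decision threshold, and the class label in the last cell of every item; the class name KlrAccuracySketch and the Kernel helper are hypothetical.

```csharp
using System;

public static class KlrAccuracySketch
{
  // Assumed RBF kernel over the first numFeatures cells of two items.
  public static double Kernel(double[] v1, double[] v2,
    int numFeatures, double sigma)
  {
    double sumSq = 0.0;
    for (int k = 0; k < numFeatures; ++k)
      sumSq += (v1[k] - v2[k]) * (v1[k] - v2[k]);
    return Math.Exp(-sumSq / (2.0 * sigma * sigma));
  }

  // Fraction of items in data whose predicted class matches the
  // label stored in the last cell of the item.
  public static double Accuracy(double[][] data, double[][] trainData,
    double[] alphas, double sigma, bool verbose)
  {
    int numFeatures = trainData[0].Length - 1;
    int numCorrect = 0;
    for (int i = 0; i < data.Length; ++i)
    {
      double sum = alphas[alphas.Length - 1]; // start with the bias
      for (int j = 0; j < trainData.Length; ++j)
        sum += alphas[j] *
          Kernel(data[i], trainData[j], numFeatures, sigma);
      double y = 1.0 / (1.0 + Math.Exp(-sum)); // predicted probability
      int predicted = (y > 0.5) ? 1 : 0;
      int actual = (int)data[i][numFeatures];
      if (predicted == actual) ++numCorrect;
      if (verbose)
        Console.WriteLine("item " + i + ": p = " + y.ToString("F4") +
          ", predicted = " + predicted + ", actual = " + actual);
    }
    return (numCorrect * 1.0) / data.Length;
  }
}
```

Note that this sketch computes kernel values on the fly against every training item, which is why the training data must be passed in even when scoring the test data.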
Wrapping Up
Kernel logistic regression isn’t used very often, at least among my colleagues. Its major advantage is simplicity. The major disadvantage of KLR is that it doesn’t scale well to large data sets because you either have to precompute all item-to-item kernel similarity values and save them, or you must keep all training data and then compute all similarity values on the fly for every prediction. For example, a full kernel matrix for 100,000 training items holds 10 billion values, which is about 80GB when stored as type double.
KLR is designed for binary classification. It’s possible to extend KLR to handle classification problems with three or more class values, but in my opinion, there are better alternatives to use, in particular a single hidden layer feed-forward neural network. KLR has some similarities to the K nearest neighbors (K-NN) classification algorithm, and also to support vector machine (SVM) classification.
Dr. James McCaffrey works for Microsoft Research in Redmond, Wash. He has worked on several Microsoft products, including Internet Explorer and Bing. Dr. McCaffrey can be reached at jamccaff@microsoft.com.
Thanks to the following Microsoft technical experts who reviewed this article: Chris Lee and Adith Swaminathan.
STATEMENT OF OWNERSHIP, MANAGEMENT AND CIRCULATION
1. Publication Title: MSDN Magazine
2. Publication Number: 1528-4859
3. Filing Date: September 30, 2017
4. Frequency of Issue: Monthly with a special issue in December.
5. Number of Issues Published Annually: 13
6. Annual Subscription Price: US $35, International $60
7. Complete Mailing Address of Known Office of Publication: 9201 Oakdale Ave., Ste. 101, Chatsworth, CA 91311
8. Complete Mailing Address of the Headquarters of General Business Offices of the Publisher: Same as above.
9. Full Name and Complete Mailing Address of Publisher, Editor, and Managing Editor:
Henry Allain, President, 4 Venture, Suite 150, Irvine, CA 92618
Michael Desmond, Editor-in-Chief, 8251 Greensboro Drive, Suite 510, McLean, VA 22102
Wendy Hernandez, Group Managing Editor, 4 Venture, Ste. 150, Irvine, CA 92618
10. Owner(s): 1105 Media, Inc. dba: 101 Communications LLC, 9201 Oakdale Ave, Ste. 101, Chatsworth, CA 91311. Listing of shareholders in 1105 Media, Inc.
11. Known Bondholders, Mortgagees, and Other Security Holders Owning or Holding 1 Percent or more of the Total Amount of Bonds, Mortgages or Other Securities:
Nautic Partners V, L.P., 50 Kennedy Plaza, 12th Flr., Providence, RI 02903
Kennedy Plaza Partners III, LLC, 50 Kennedy Plaza, 12th Flr., Providence, RI 02903
Alta Communications IX, L.P., 1000 Winter Street, South Entrance, Suite 3500, Waltham, MA 02451
Alta Communications IX, B-L.P., 1000 Winter Street, South Entrance, Suite 3500, Waltham, MA 02451
Alta Communications IX, Associates LLC, 1000 Winter Street, South Entrance, Suite 3500, Waltham, MA 02451
12. The tax status has not changed during the preceding 12 months.
13. Publication Title: MSDN Magazine
14. Issue date for Circulation Data Below: September 2017
15. Extent & Nature of Circulation (first figure: Average No. Copies Each Month During Preceding 12 Months; second figure: No. Copies of Single Issue Published Nearest to Filing Date):
a. Total Number of Copies (Net Press Run): 82,029 / 90,229
b. Legitimate Paid and/or Requested Distribution
1. Outside County Paid/Requested Mail Subscriptions Stated on PS Form 3541: 63,580 / 65,559
2. In-County Paid/Requested Mail Subscriptions Stated on PS Form 3541: 0 / 0
3. Sales Through Dealers and Carriers, Street Vendors, Counter Sales, and Other Paid or Requested Distribution Outside USPS®: 3,315 / 2,903
4. Requested Copies Distributed by Other Mail Classes Through the USPS®: 0 / 0
c. Total Paid and/or Requested Circulation: 66,895 / 68,462
d. Nonrequested Distribution
1. Outside County Nonrequested Copies Stated on PS Form 3541: 13,636 / 11,944
2. In-County Nonrequested Copies Distribution Stated on PS Form 3541: 0 / 0
3. Nonrequested Copies Distribution Through the USPS by Other Classes of Mail: 0 / 0
4. Nonrequested Copies Distributed Outside the Mail: 1,383 / 9,708
e. Total Nonrequested Distribution: 15,019 / 21,652
f. Total Distribution: 81,914 / 90,114
g. Copies not Distributed: 115 / 115
h. Total: 82,029 / 90,229
i. Percent Paid and/or Requested Circulation: 81.66% / 75.97%
16. Electronic Copy Circulation
a. Requested and Paid Electronic Copies
b. Total Requested and Paid Print Copies (Line 15c) + Requested/Paid Electronic Copies
c. Total Requested Copy Distribution (Line 15f) + Requested/Paid Electronic Copies (Line 16a)
d. Percent Paid and/or Requested Circulation (Both Print & Electronic Copies) (16b divided by 16c x 100)
[x] I certify that 50% of all my distributed copies (electronic and paid print) are legitimate request or paid copies.
17. Publication of Statement of Ownership for a Requester Publication is required and will be printed in the November 2017 issue of this publication.
18. I certify that all information furnished on this form is true and complete: Peter B. Weller, Manager, Print Production