Page 34 - MSDN Magazine, February 2018

MACHINE LEARNING
Deep Neural Network Classifiers Using CNTK
James McCaffrey
The Microsoft Cognitive Toolkit (CNTK) library is a powerful set of functions that allows you to create machine learning (ML) prediction systems. I provided an introduction to version 2 in the July 2017 issue (msdn.com/magazine/mt784662). In this article, I explain how to use CNTK to make a deep neural network classifier. A good way to see where this article is headed is to take a look at the screenshot in Figure 1.
The CNTK library is written in C++ for performance reasons, but the most common way to call into the library functions is to use the CNTK Python language API. I invoked the demo program by issuing the following command in an ordinary Windows 10 command shell:
> python seeds_dnn.py
The goal of the demo program is to create a deep neural network that can predict the variety of a wheat seed. Behind the scenes, the demo program uses a set of training data that looks like this:
|properties 15.26 14.84 ... 5.22 |variety 1 0 0
|properties 14.88 14.57 ... 4.95 |variety 1 0 0
...
|properties 17.63 15.98 ... 6.06 |variety 0 1 0
...
|properties 10.59 12.41 ... 4.79 |variety 0 0 1
The training data has 150 items. Each line represents one of three varieties of wheat seed: “Kama,” “Rosa” or “Canadian.” The first seven numeric values on each line are the predictor values, often called attributes or features in machine learning terminology. The predictors are seed area, perimeter, compactness, length, width, asymmetry coefficient, and groove length. The item-to-predict (often called the class or the label) fills the last three columns and is encoded as 1 0 0 for Kama, 0 1 0 for Rosa, and 0 0 1 for Canadian.
The demo program also uses a test data set of 60 items, 20 of each seed variety. The test data has the same format as the training data.
The demo program creates a 7-(4-4-4)-3 deep neural network. The network is illustrated in Figure 2. There are seven input nodes (one for each predictor value), three hidden layers, each of which has four processing nodes, and three output nodes that correspond to the three possible encoded wheat seed varieties.
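The shape of a 7-(4-4-4)-3 network can be sketched without CNTK at all. The plain-Python example below builds the four weight layers, counts the weights and biases, and pushes one input through the network. Several details are assumptions made only for illustration and are not taken from the text above: tanh activation on the hidden layers, softmax on the output layer, small uniform random initial weights, zero biases, the helper names, and the placeholder input values.

```python
import math
import random

LAYER_SIZES = [7, 4, 4, 4, 3]  # 7 inputs, three hidden layers of 4, 3 outputs

def make_layer(n_in, n_out, rng):
    """Return (weights, biases): small random weights (assumed), zero biases."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def softmax(values):
    """Scale raw outputs so they are positive and sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def forward(x, layers):
    """Feed x through every layer: tanh on hidden layers, softmax on the output."""
    for i, (weights, biases) in enumerate(layers):
        sums = [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, biases)]
        is_output = (i == len(layers) - 1)
        x = softmax(sums) if is_output else [math.tanh(s) for s in sums]
    return x

rng = random.Random(1)
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]

# (7*4 + 4) + (4*4 + 4) + (4*4 + 4) + (4*3 + 3) = 87 weights and biases
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
print(n_params)

# Placeholder input: seven predictor values, one per input node.
probs = forward([15.26, 14.84, 0.87, 5.76, 3.31, 2.22, 5.22], layers)
print(probs)
```

With untrained random weights the three output values are close to equal; training is what moves them toward one of the encoded varieties.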
The demo program trains the network using 5000 batches of 10 items each, using the stochastic gradient descent (SGD) algorithm. After the prediction model has been trained, it’s applied to the 60-item test data set. The model achieved 78.33 percent accuracy, meaning it correctly predicted 47 of the 60 test items.
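The numbers in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted above; the variable names are hypothetical, and the "passes" figure is a rough equivalent that assumes the batches draw evenly from the 150 training items.

```python
train_items = 150
test_items = 60
batches = 5000
batch_size = 10
correct = 47

items_processed = batches * batch_size   # training items seen in total
passes = items_processed / train_items   # rough number of passes over the training data
accuracy = 100.0 * correct / test_items  # percentage of test items predicted correctly

print(items_processed)
print(round(passes, 1))
print(f"{accuracy:.2f} percent")
```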
This article discusses:
• Installing and using CNTK to create a deep neural network
• Understanding the data
• The deep neural network demo program
• Creating the network and the model
• Training the network and saving the trained model
Technologies discussed:
Microsoft Cognitive Toolkit, Anaconda Python distribution
Code download available at:
msdn.com/magazine/0218magcode