Page 54 - MSDN Magazine, October 2019
Test Run: Neural Binary Classification Using PyTorch
James McCaffrey
The goal of a binary classification problem is to make a prediction where the result can be one of just two possible categorical values. For example, you might want to predict the sex (male or female) of a person based on their age, annual income and so on. Somewhat surprisingly, binary classification problems require a slightly different set of techniques than classification problems where the value to predict can be one of three or more possible values.
There are many different binary classification algorithms. In this article I’ll demonstrate how to perform binary classification using a deep neural network with the PyTorch code library. The best way to understand where this article is headed is to take a look at the demo program in Figure 1.
The demo program creates a prediction model on the Banknote Authentication dataset. The problem is to predict whether a banknote (think dollar bill or euro) is authentic or a forgery, based on four predictor variables. The demo loads a training subset into memory, then creates a 4-(8-8)-1 deep neural network.
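To make the 4-(8-8)-1 notation concrete, here is a minimal sketch in plain Python rather than PyTorch: four input nodes, two hidden layers of eight nodes each, and one output node. The tanh hidden activation, the logistic sigmoid output, the weight range and the sample input values are all assumptions made for illustration, and this is not the demo program's code.

```python
import math
import random

# 4 inputs, two hidden layers of 8 nodes each, 1 output node
LAYER_SIZES = [4, 8, 8, 1]

def count_parameters(sizes):
    """Total weights plus biases for a fully connected network."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

def init_layers(sizes, seed=0):
    """Create small random weights and zero biases (assumed initialization)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers

def forward(layers, x):
    """Feed x through the network: tanh on hidden layers, sigmoid on output
    (both activations are assumptions, not taken from the demo)."""
    for i, (weights, biases) in enumerate(layers):
        sums = [sum(w * v for w, v in zip(row, x)) + b
                for row, b in zip(weights, biases)]
        is_output = (i == len(layers) - 1)
        x = [1.0 / (1.0 + math.exp(-s)) if is_output else math.tanh(s)
             for s in sums]
    return x[0]  # single output node, a value strictly between 0 and 1

layers = init_layers(LAYER_SIZES)
print(count_parameters(LAYER_SIZES))  # (4*8+8) + (8*8+8) + (8*1+1) = 121
p = forward(layers, [0.5, -1.0, 2.0, 0.1])  # hypothetical predictor values
print(p)
```

Because the single output node passes through a sigmoid, the result can be read as a probability, which is what makes a one-node output layer sufficient for a two-class problem.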
After training for 100 iterations, the resulting model scores 98.18 percent accuracy on a held-out test dataset. The demo concludes by making a prediction for a hypothetical, previously unseen banknote. The probability that the unknown item is a forgery is only 0.0215, so the conclusion is that the banknote is authentic.
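The step from an output probability to a class label, and from labels to an accuracy figure, can be sketched as follows. The 0.5 cutoff, the encoding of forgery as 1 and authentic as 0, and the sample probabilities and labels are assumptions for illustration; only the 0.0215 value comes from the demo run described above.

```python
def classify(p_forgery, threshold=0.5):
    """Map a forgery probability to a label (threshold is an assumption)."""
    return "forgery" if p_forgery >= threshold else "authentic"

def accuracy(probs, labels, threshold=0.5):
    """Fraction of items whose thresholded prediction matches the label.
    Labels are assumed to be encoded as 1 = forgery, 0 = authentic."""
    correct = sum(1 for p, y in zip(probs, labels)
                  if (p >= threshold) == (y == 1))
    return correct / len(labels)

# The probability reported for the unknown banknote in the demo run
print(classify(0.0215))  # authentic

# Hypothetical model outputs and true labels, made up for this example
sample_probs = [0.02, 0.97, 0.40, 0.61]
sample_labels = [0, 1, 0, 0]
print(accuracy(sample_probs, sample_labels))  # 0.75
```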
This article assumes you have intermediate or better programming skills with a C-family language and a basic familiarity with machine learning, but doesn’t assume you know anything about binary classification using PyTorch. All of the demo code is presented in this article. The code and the two data files used by the demo are available in the accompanying download. All normal error checking has been removed to keep the main ideas as clear as possible.
Installing PyTorch
PyTorch is a relatively low-level code library for creating neural networks. It’s roughly similar in terms of functionality to TensorFlow and CNTK. PyTorch is written in C++, but has a Python language API for easier programming.
Figure 1 Binary Classification Using PyTorch
Installing PyTorch involves two main steps. First, you install Python and several required auxiliary packages, such as NumPy and SciPy. Second, you install PyTorch as a Python add-on package. Although it’s possible to install Python and the packages required to run PyTorch separately, in most cases it’s much better to install a Python distribution. A distribution is a collection of code libraries containing the base Python interpreter and additional packages that are compatible with each other. For my demo, I installed the Anaconda3 5.2.0 distribution, which contains Python 3.6.5.
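One way to confirm what an installation actually provides is a short check like the sketch below, which reports the interpreter version and whether each named package can be found. The function name check_environment and the default package list are hypothetical choices for this example; the check uses only the Python standard library and doesn't import the packages themselves.

```python
import importlib.util
import sys

def check_environment(packages=("numpy", "scipy", "torch")):
    """Return the interpreter version string and, for each package name,
    whether Python can locate it on the current import path."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    found = {name: importlib.util.find_spec(name) is not None
             for name in packages}
    return version, found

version, found = check_environment()
print("Python", version)
for name, is_present in found.items():
    print(name, "found" if is_present else "missing")
```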
After installing Anaconda, I went to the pytorch.org Web site and selected the options for the Windows OS, pip installer, Python 3.6 and no-GPU version. This gave me a URL that pointed to the corresponding .whl (pronounced “wheel”) file, which I downloaded to my local machine. I downloaded PyTorch version 1.0.0. (If you’re
Code download available at msdn.com/magazine/1019magcode.