Page 56 - MSDN Magazine, June 2018
min-max normalization on the six predictor values and on the hull resistance values. I dropped the raw data into an Excel spreadsheet and, for each column, I computed the max and min values. Then, for each column, I replaced every value v with (v - min) / (max - min). For example, the minimum prismatic coefficient value is 0.53 and the maximum value is 0.60. The first value in the column is 0.568 and it’s normalized to (0.568 - 0.53) / (0.60 - 0.53) = 0.038 / 0.07 = 0.5429.
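The per-column calculation described above can be sketched in a few lines of Python. This is only an illustration, not the spreadsheet procedure itself: the function name min_max_normalize and the three-value sample column are assumptions made for the example (only 0.53, 0.568 and 0.60 come from the text).

```python
def min_max_normalize(values):
  """Scale each value v to (v - min) / (max - min)."""
  lo, hi = min(values), max(values)
  if hi == lo:
    raise ValueError("all values are equal; min-max is undefined")
  return [(v - lo) / (hi - lo) for v in values]

# Hypothetical column holding just the minimum, the first value, and the
# maximum of the prismatic coefficient column mentioned in the text.
column = [0.53, 0.568, 0.60]
normalized = min_max_normalize(column)
print(["%0.4f" % v for v in normalized])
```

The middle value should agree with the hand calculation of 0.5429, and the column minimum and maximum map to 0 and 1.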
After normalizing, I inserted tags |predictors and |resistance into the Excel spreadsheet so the data can be easily read by a CNTK data reader object. Then I exported the data as a tab-delimited file. The resulting data looks like:
|predictors 0.540000 0.542857 . . |resistance 0.001602
|predictors 0.540000 0.542857 . . |resistance 0.004166
...
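If you would rather generate the tagged lines with a script than with a spreadsheet, a sketch along these lines may help. The function name format_ctf_line is hypothetical, the row contains only two predictors to keep it short, and separating values with single spaces instead of tabs is an assumption.

```python
def format_ctf_line(predictors, resistance):
  """Build one tagged line: |predictors v1 v2 ... |resistance r."""
  pred_text = " ".join("%0.6f" % v for v in predictors)
  return "|predictors %s |resistance %0.6f" % (pred_text, resistance)

# Hypothetical two-predictor row; the real data has six predictors per line.
line = format_ctf_line([0.54, 0.542857], 0.001602)
print(line)
```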
Alternatives to min-max normalization include z-score normalization and order-magnitude normalization.
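For comparison, here is a rough sketch of the two alternatives. Both function names are hypothetical and the sample values are made up. The z-score version assumes the population standard deviation, and the order-magnitude version reflects one common reading of the term (dividing a column by a power of 10), not a definition given in this article.

```python
import math
import statistics

def z_score_normalize(values):
  """Replace each v with (v - mean) / standard deviation."""
  mean = statistics.mean(values)
  sd = statistics.pstdev(values)  # population std dev; an assumption
  if sd == 0.0:
    raise ValueError("standard deviation is zero")
  return [(v - mean) / sd for v in values]

def order_magnitude_normalize(values):
  """Divide by the smallest power of 10 >= the largest magnitude."""
  biggest = max(abs(v) for v in values)
  if biggest == 0.0:
    return list(values)
  scale = 10.0 ** math.ceil(math.log10(biggest))
  return [v / scale for v in values]

sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # made-up values
z = z_score_normalize(sample)
om = order_magnitude_normalize([0.11, 27.5, 62.42])  # made-up values
print(z)
print(om)
```

Unlike min-max normalization, z-score values are not confined to the range 0 to 1, which may matter if you want the predictors and the target on a common scale.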
The Demo Program
The complete demo program, with a few minor edits to save space, is presented in Figure 3. All normal error checking has been removed. I indent with two space characters instead of the usual four as a matter of personal preference and to save space. Note that the ‘\’ character is used by Python for line continuation.
Installing CNTK can be a bit tricky. First you install the Anaconda distribution of Python, which contains the necessary Python interpreter, required packages such as NumPy and SciPy, plus useful utilities such as pip. I used Anaconda3 4.1.1 64-bit, which has Python 3.5. After installing Anaconda, you install CNTK as a Python package, not a standalone system, using the pip utility. From an ordinary shell, the command I used was:
Because CNTK is young and under continuous development, it’s a good idea to display the version that’s being used (2.4 in this case). The number of input nodes is determined by the structure of the data set. For a regression problem, the number of output nodes is always set to 1. The number of hidden layers and the number of processing nodes in each hidden layer are free parameters—they must be determined by trial and error.
The demo program uses all 308 items for training. An alternative approach is to split the data set into a training set (typically 80 percent of the data) and a test set (the remaining 20 percent). After training, you can compute loss and accuracy metrics of the model on the test data set to verify that the metrics are similar to those on the training data.
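The demo doesn't split the data, but an 80-20 split could be sketched as follows. The function name train_test_split, the seed value, and the use of integers as stand-ins for the 308 data lines are all assumptions made for illustration.

```python
import random

def train_test_split(items, train_fraction=0.8, seed=1):
  """Shuffle a copy of items, then cut it into train and test lists."""
  shuffled = list(items)
  random.Random(seed).shuffle(shuffled)
  n_train = int(train_fraction * len(shuffled))
  return shuffled[:n_train], shuffled[n_train:]

rows = list(range(308))  # stand-ins for the 308 data lines
train_rows, test_rows = train_test_split(rows)
print(len(train_rows), len(test_rows))
```

Shuffling before cutting matters if the file is ordered in some way, because otherwise the test set might not be representative of the training set.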
Creating the Neural Network Model
The demo sets up CNTK objects to hold the predictor and true hull resistance values:
X = C.ops.input_variable(input_dim, np.float32)
Y = C.ops.input_variable(output_dim)
CNTK uses 32-bit values by default because 64-bit precision is rarely needed. The name of the input_variable function can be a bit confusing if you’re new to CNTK. Here, the “input_” refers to the fact that the return objects hold values that come from the input data (that correspond to both input and output of the neural network).
The neural network is created with these statements:
print("Creating a 6-(5-5)-1 NN")
with C.layers.default_options():
  hLayer1 = C.layers.Dense(hidden_dim,
    activation=C.ops.tanh, name='hidLayer1')(X)
  hLayer2 = C.layers.Dense(hidden_dim,
    activation=C.ops.tanh, name='hidLayer2')(hLayer1)
  oLayer = C.layers.Dense(output_dim,
    activation=None, name='outLayer')(hLayer2)
model = C.ops.alias(oLayer)  # alias
There’s quite a bit going on here. The Python “with” statement can be used to pass a set of common parameter values to multiple functions. In this case, the neural network weight and bias values are initialized using CNTK default values. Neural networks are highly sensitive to initial weight and bias values, so supplying non-default values is one of the first things to try when your neural network fails to learn, which is a painfully common situation.
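To make the 6-(5-5)-1 architecture concrete, the following plain-Python sketch computes what three stacked dense layers do: a weighted sum plus bias per node, with tanh on the two hidden layers and no activation on the output node. It's not CNTK code. The helper names dense and make_layer, the input vector, and the small uniform random initialization are all assumptions, and the initialization in particular is not the CNTK default.

```python
import math
import random

def dense(inputs, weights, biases, activation=None):
  """One fully connected layer: out[j] = act(sum_i in[i]*w[i][j] + b[j])."""
  outputs = []
  for j in range(len(biases)):
    total = biases[j]
    for i in range(len(inputs)):
      total += inputs[i] * weights[i][j]
    outputs.append(activation(total) if activation else total)
  return outputs

def make_layer(n_in, n_out, rng):
  """Small random weights and zero biases; NOT the CNTK default initializer."""
  weights = [[rng.uniform(-0.01, 0.01) for _ in range(n_out)]
             for _ in range(n_in)]
  return weights, [0.0] * n_out

rng = random.Random(0)
w1, b1 = make_layer(6, 5, rng)  # input -> hidden layer 1
w2, b2 = make_layer(5, 5, rng)  # hidden layer 1 -> hidden layer 2
w3, b3 = make_layer(5, 1, rng)  # hidden layer 2 -> output

x = [0.54, 0.542857, 0.5, 0.5, 0.5, 0.5]  # hypothetical normalized predictors
h1 = dense(x, w1, b1, math.tanh)
h2 = dense(h1, w2, b2, math.tanh)
y = dense(h2, w3, b3)  # no activation on the output node for regression

n_params = sum(len(w) * len(w[0]) + len(b)
               for w, b in [(w1, b1), (w2, b2), (w3, b3)])
print(len(y), n_params)
```

Counting weights and biases this way gives (6 * 5 + 5) + (5 * 5 + 5) + (5 * 1 + 1) = 71 values that training must determine.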
>pip install https://cntk.ai/PythonWheel/CPU-Only/cntk-2.4-cp35-cp35m-win_amd64.whl
The hydro_reg.py demo has one helper function, create_reader. You can consider create_reader as boilerplate for a CNTK regression problem. The only thing you’ll need to change in most scenarios is the tag names in the data file.
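To see why the tag names matter, it can help to look at how a tagged line breaks apart. The sketch below is not the CNTK reader and doesn't use any CNTK calls. The function name parse_tagged_line is hypothetical, and the sample line has only two predictor values to keep it short.

```python
def parse_tagged_line(line):
  """Split '|tag v1 v2 ... |tag2 v' into {tag: [floats]}."""
  fields = {}
  for chunk in line.split("|"):
    tokens = chunk.split()  # splits on spaces or tabs
    if not tokens:
      continue  # text before the first '|' is empty
    fields[tokens[0]] = [float(t) for t in tokens[1:]]
  return fields

row = parse_tagged_line("|predictors 0.540000 0.542857 |resistance 0.001602")
print(row["predictors"], row["resistance"])
```

If the tags in the data file were renamed, the keys here would change accordingly, which mirrors the one change a reader function typically needs.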
All control logic is in a single main function. The code begins:
def main():
  print("Begin yacht hull regression \n")
  print("Using CNTK version = " + \
    str(C.__version__) + "\n")

  input_dim = 6  # center of buoyancy, etc.
  hidden_dim = 5
  output_dim = 1  # residuary resistance
  train_file = ".\\Data\\hydro_data_cntk.txt"
  ...
Figure 2 Partial Yacht Hull Data (chart titled "Yacht Hull HydroDynamics": residuary resistance plotted against prismatic coefficient)