
business’ value proposition. Classifying data, time-series projection, sequence modeling and even sequence-to-sequence construction are all areas where modern ML may lead to competitive advantage. Many developers, for instance, face the problem of “time-series prediction,” in which they must reason from large amounts of data that have some structure but are very noisy or have many factors contributing to the ups and downs.
A historically important time-series prediction problem is tide prediction. Predicting the tides was, of course, crucially important in the Ages of Sail and Steam, both for merchants and the military. One of the most important artifacts of the early days of computing is Lord Kelvin’s tide-predicting machine, described in Charles Petzold’s work-in-progress “Computer of the Tides” (bit.ly/2yEhaxk) as “a magnificent assemblage of brass and wood that stands as tall as a person, as gorgeous as it is mysterious.” The timing of tides is primarily dictated by the geometry of the earth, moon and sun, and by the complex flooding and draining of bays. The forces are so complex that modern accurate tide prediction uses more than 30 site-specific harmonic components, whose values are derived from hundreds of tide gauges spread across the globe. A complete cycle of the system takes 19 years.
Predicting tides is a reasonably difficult task for a modern ML approach. Given 200 historical readings of water level taken every three hours, how accurately can the tide be predicted up to 300 hours into the future?
Modeling the Problem
As chance would have it, I have several thousand lines of tide-predicting F# code in one of my “finish someday” projects, and it was trivial for me to generate data based on a real harbor (which I’ll call “Contoso Harbor” so that no one is tempted to use this code for navigation). To make the task both more difficult and more true-to-life, I added random noise to the training and validation sets (normally distributed, with a standard deviation of 1.5 inches).
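The article’s generator is several thousand lines of F# and isn’t reproduced here, but the idea is easy to sketch. The following Python fragment is an illustration of the approach, not the author’s code; the harmonic constituents are invented stand-ins rather than Contoso Harbor’s real ones. It sums a few sinusoidal components, adds the normally distributed noise described above, and slices the resulting series into the 200-reading input windows and 100-reading target windows that the model trains on:

import numpy as np

def synthesize_tide(num_readings, interval_hours=3.0, noise_inches=1.5):
    # Hypothetical constituents (period in hours, amplitude in inches, phase
    # in radians); real tide prediction uses 30+ site-specific components.
    constituents = [(12.42, 24.0, 0.3),  # M2: principal lunar semidiurnal
                    (12.00, 8.0, 1.1),   # S2: principal solar semidiurnal
                    (25.82, 6.5, 2.0)]   # O1: lunar diurnal
    t = np.arange(num_readings) * interval_hours
    level = sum(a * np.sin(2.0 * np.pi * t / period + phase)
                for period, a, phase in constituents)
    # Normally distributed noise, standard deviation 1.5 inches, as in the text
    return level + np.random.normal(0.0, noise_inches, num_readings)

def make_windows(series, lookback_length=200, prediction_length=100):
    # Each training example maps 200 consecutive readings to the next 100
    X, y = [], []
    for i in range(len(series) - lookback_length - prediction_length + 1):
        X.append(series[i:i + lookback_length])
        y.append(series[i + lookback_length:i + lookback_length + prediction_length])
    # Keras LSTMs expect inputs shaped (samples, timesteps, features)
    return np.array(X)[..., np.newaxis], np.array(y)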
There are many deep-learning libraries available to developers. The nitty-gritty of deep learning involves lots of parallel multiplications and sums over very large arrays of floating-point numbers.
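To see why that matters, consider what one dense layer does in a forward pass: a matrix multiply and an addition over a batch of inputs. This toy NumPy fragment (shapes chosen arbitrarily for illustration) shows the operation that, repeated across many layers and thousands of training steps, dominates the cost of deep learning:

import numpy as np

x = np.random.rand(32, 200).astype(np.float32)   # a batch of 32 input windows
W = np.random.rand(200, 128).astype(np.float32)  # one layer's weights
b = np.zeros(128, dtype=np.float32)              # that layer's biases
y = x @ W + b  # roughly 800,000 multiply-adds for this one layer alone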
GPU support noticeably speeds up even trivial ML projects, and low-level performance is an area where the various large projects compete with each other, just as with graphics shaders. (Interestingly, ML doesn’t generally require high precision, and the emerging field of Tensor Processing Units [TPUs] will probably trade word size for increased parallelism and power efficiency.)
However, given an efficient low-level foundation, most non-research-level ML architectures can be described using much higher-level abstractions. Some libraries, such as Keras, provide those abstractions on top of various low-level libraries; other libraries, such as Microsoft’s Cognitive Toolkit, provide both high-level and low-level abstractions.
While there are several interchange formats striving to gain mindshare, at the moment there’s considerable lock-in to the library you choose for training. If you train in TensorFlow, you most likely have to run inference in TensorFlow; if you train in Caffe, you most likely have to run inference in Caffe.
Classical neural networks do not have any “memory” of their previous inputs and outputs. This is a serious shortcoming when it comes to time-series prediction! Recurrent Neural Networks (RNNs), as their name implies, combine their current input with previous results that are looped back as additional inputs. This allows RNNs to recognize patterns in sequential data. The Long Short-Term Memory (LSTM) cell, developed by Sepp Hochreiter and Jürgen Schmidhuber in 1997, is a form of RNN that uses internal “gates” to selectively amplify (remember) or damp (forget) these recurrent connections. Although LSTMs are somewhat old-fashioned and do not train as fast as modern variants, they have the distinct advantage of being widely supported. An LSTM cell is at the core of my model.
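The gates are easier to see in code than in prose. Here is a from-scratch sketch of a single LSTM time step, using the standard textbook formulation rather than code from the article. The forget gate f damps the cell’s accumulated state, while the input gate i decides how strongly to remember the new candidate values:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # One multiply computes pre-activations for all four gates, stacked:
    # input (i), forget (f), output (o) and candidate (g)
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:n])       # how much of the new candidate to remember
    f = sigmoid(z[n:2*n])     # how much of the old cell state to forget
    o = sigmoid(z[2*n:3*n])   # how much of the cell state to output
    g = np.tanh(z[3*n:4*n])   # candidate values from the current input
    c = f * c_prev + i * g    # the cell's updated internal memory
    h = o * np.tanh(c)        # the output looped back at the next step
    return h, c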
Because TensorFlow is necessary for deployment on Android, I chose Keras on top of TensorFlow to develop the model for this project. The Keras high-level description of my model looks like this:

from keras.models import Sequential
from keras.layers import LSTM, Dense, Dropout, Activation

def build_model(lookback_length, input_feature_count,
                lstm_layer_output_dimensions, prediction_length, dropout_pct):
    model = Sequential()
    # A single LSTM layer reads the (lookback_length x input_feature_count) window
    model.add(LSTM(lstm_layer_output_dimensions,
                   input_shape=(lookback_length, input_feature_count)))
    model.add(Dropout(dropout_pct))      # randomly drop activations to curb overfitting
    model.add(Dense(prediction_length))  # one output per predicted future reading
    model.add(Activation('linear'))      # unbounded outputs, suited to regression
    model.compile(loss='mse', optimizer='rmsprop')
    return model
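For concreteness, Figure 1’s dimensions suggest an instantiation along these lines (200 input readings of a single feature, 128 LSTM cells, 100 predicted readings); the dropout percentage is a placeholder, since its value isn’t given at this point in the article:

# Dimensions as drawn in Figure 1; dropout_pct is an assumed placeholder
model = build_model(lookback_length=200,
                    input_feature_count=1,
                    lstm_layer_output_dimensions=128,
                    prediction_length=100,
                    dropout_pct=0.2)
model.summary()  # prints layer shapes and parameter counts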
Figure 1 Schematic of the Tide-Prediction Neural Network (inputs tide t-199 through tide t feed a layer of LSTM cells middle 0 through middle 127, which produces the outputs tide t+1 through tide t+99)