37. Neural network regression

Neural networks have become very popular recently due to the advent of high performance GPU algorithms for their application. Modern applications of neural networks often use very large networks, but in this sample we will demonstrate the possibilities using a network with a single hidden layer.

The general idea of a neural network is shown in the picture below:


Each circle represents a neuron, and the output of the neuron is calculated as shown below. In simple terms, the output of a neuron is the weighted average of its inputs, passed through what is known as an activation function. The function shown in the picture is known as a logistic or sigmoid function.


import numpy
import matplotlib.pyplot as plt
%matplotlib inline

We will build our own network in this example to regress a one dimensional function. This means we will only have one input (\(u\)), and one output (\(y\)), but we can choose the number of hidden neurons. The more hidden neurons we have, the more curves we can handle in the fit function. 3 appears to be enough to do the general fits we do here without the training taking very long. There is no general rule for choosing the number of hidden neurons - use as few as possible to capture your behaviour as each new hidden neuron adds lots of weights which all have to be found.

Nhidden = 3

We need to create weights between each input neuron and each hidden neuron as well as each hidden neuron and each output neuron.

w_in_hidden = numpy.random.rand(Nhidden)
w_hidden_out = numpy.random.rand(Nhidden)

We also need a bias for the hidden layer and the output layer

bias_hidden = numpy.random.rand()
bias_output = numpy.random.rand()

We will use a sigmoidal activation function:

def sigmoid(i):
    expi = numpy.exp(-i)
    return ((1 - expi)/(1 + expi))
x = numpy.linspace(-5, 5)
plt.plot(x, sigmoid(x))
[<matplotlib.lines.Line2D at 0x1a1ca307b8>]

To calculate the output of a neuron, we take the weighted sum of its inputs and apply the activation function. We can do this all simulateously with numpy arrays:

def network_output(u, w_in_hidden, w_hidden_out, bias_hidden, bias_output):
    h = sigmoid(w_in_hidden*u + bias_hidden)
    y = sigmoid((w_hidden_out*h + bias_output).sum())

    return y
network_output(0.1, w_in_hidden, w_hidden_out, bias_hidden, bias_output)

Let’s find the weights and bias to regress a function. Due to later decisions about the final activation function which is limited to be between -1 and 1, we will limit our function to be in that range.

known_u = numpy.linspace(-1, 1)
known_y = numpy.sin(known_u*numpy.pi)*0.8 + numpy.random.randn(len(known_u))*0.05
plt.scatter(known_u, known_y)
<matplotlib.collections.PathCollection at 0x1a1cb4c898>
import scipy.optimize

Since we’re going to use optimisation functions which take an array, we need to be able to our parameters into a single array and unpack them again.

def pack(w_in_hidden, w_hidden_out, bias_hidden, bias_output):
    return numpy.concatenate([w_in_hidden,

def unpack(parameters):
    parts = numpy.split(parameters, [Nhidden, 2*Nhidden, 2*Nhidden + 1])
    return parts
p0 = pack(w_in_hidden, w_hidden_out, bias_hidden, bias_output)
def predict(parameters, us):
    w_in_hidden, w_hidden_out, bias_hidden, bias_output = unpack(parameters)
    return numpy.array([network_output(u, w_in_hidden, w_hidden_out, bias_hidden, bias_output) for u in us])
def plotfit(predictions):
    plt.scatter(known_u, known_y, alpha=0.4)
    plt.plot(known_u, predictions)
plotfit(predict(p0, known_u))
def errorfunction(parameters):
    return known_y - predict(parameters, known_u)
result = scipy.optimize.least_squares(errorfunction, p0)
plotfit(predict(result.x, known_u))

37.1. Scikit-learn

As I’ve mentioned before, you’re probably better off using a library for things like this. The Scikit-Learn library has neural network regression built in. It is part of the standard Anaconda install.

import sklearn
import sklearn.neural_network
net = sklearn.neural_network.MLPRegressor(hidden_layer_sizes=Nhidden,
                                          solver='lbfgs', max_iter=1000, learning_rate_init=0.001)
observations = numpy.atleast_2d(known_u).T
net.fit(observations, known_y)
MLPRegressor(activation='tanh', alpha=0.0001, batch_size='auto', beta_1=0.9,
       beta_2=0.999, early_stopping=False, epsilon=1e-08,
       hidden_layer_sizes=4, learning_rate='constant',
       learning_rate_init=0.001, max_iter=1000, momentum=0.9,
       nesterovs_momentum=True, power_t=0.5, random_state=None,
       shuffle=True, solver='lbfgs', tol=0.0001, validation_fraction=0.1,
       verbose=False, warm_start=False)

37.2. Keras

The Keras library offers additinal flexibility, but is not installed by default in Anaconda.

import keras
/Users/alchemyst/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
model = keras.models.Sequential()
model.add(keras.layers.Dense(Nhidden, input_shape=(1,), activation='tanh'))
model.add(keras.layers.Dense(1, activation='tanh'))
model.compile(optimizer='rmsprop', loss='mse')
model.fit(known_u, known_y, epochs=10000, verbose=False)
<keras.callbacks.History at 0x1a1c6d5048>