Neural network introduction | Part 2

🧠 Forward propagation and Cost function

Pierre Portal
3 min read · Nov 6, 2018

In a previous note I explained how we can build an L = 2 neural network for binary classification. Now that we have the structure of the model, we can feed it our training data set and compute the error between our predicted value ŷ and the correct value y.

The formula for forward propagation in an L = 2 network is the following:

z[l] = W[l] a[l-1] + b[l]
a[l] = g(z[l])

for l = 1,…,L, and where a[0] = x

In the previous post we built the model using only two input variables, x1 and x2, as an example. In reality we will feed the network a set of m training examples, so our training data will be an m × 2 matrix X, with one training example (x1, x2) per row.

Because our W1 matrix is 3 × 2, we have to transpose our matrix X (giving a 2 × m matrix) if we want to take a matrix product.

The product of the weight matrix W1 and the transposed matrix X.T, W1·X.T, results in a 3 × m matrix. Adding our bias vector b1 (broadcast across the m columns) gives us the 3 × m matrix Z1.

g(Z1) becomes the matrix A1, W2A1 + b2 gives us Z2, and g(Z2) finally gives us A2. (More details in the previous post.)

A[2] = our prediction ŷ
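As a quick sanity check of these shapes, here is a minimal NumPy sketch of the full forward pass. The sigmoid used for g and the random placeholder values are assumptions for illustration only; they are not the parameters from the previous post.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

m = 5                          # number of training examples
X  = np.random.randn(m, 2)     # training data, m × 2
W1 = np.random.randn(3, 2)     # hidden layer weights, 3 × 2
b1 = np.zeros((3, 1))          # hidden layer bias, 3 × 1
W2 = np.random.randn(1, 3)     # output layer weights, 1 × 3
b2 = np.zeros((1, 1))          # output layer bias, 1 × 1

Z1 = np.dot(W1, X.T) + b1      # 3 × m
A1 = sigmoid(Z1)               # 3 × m
Z2 = np.dot(W2, A1) + b2       # 1 × m
A2 = sigmoid(Z2)               # 1 × m, our prediction ŷ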

Performing forward propagation with our training data X and these parameters will give us a very bad prediction ŷ when we compare it to the actual values y of our training set. We have to calculate the loss (error), and our parameters will then be adjusted to minimise its value.

To compute our loss we will implement a cost function J(𝜃). For binary classification we need a cross-entropy loss function:

J(𝜃) = -(1/m) Σ i=1..m [ y(i) log(ŷ(i)) + (1 - y(i)) log(1 - ŷ(i)) ]

Cross-entropy loss, or log loss, measures the performance of a classification model whose output is a probability value between 0 and 1. Cross-entropy loss increases as the predicted probability diverges from the actual label. So predicting a probability of .012 when the actual observation label is 1 would be bad and result in a high loss value. A perfect model would have a log loss of 0.
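To make this concrete, take a single example with label y = 1: a prediction ŷ = 0.012 gives a loss of -log(0.012) ≈ 4.42, while a confident correct prediction such as ŷ = 0.95 (a value chosen here just for illustration) gives -log(0.95) ≈ 0.05.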

To implement the cross-entropy loss in Python:

import numpy as np

def compute_cost(A2, Y):
    # Cross-entropy cost, averaged over the m training examples.
    # A2 holds the predicted probabilities ŷ, Y the true labels, both of shape (1, m).
    m = Y.shape[1]
    cost = -np.sum(np.multiply(np.log(A2), Y)
                   + np.multiply(np.log(1 - A2), 1 - Y)) / m
    return cost
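For example, with three hypothetical predictions and their labels (reusing the import and the function above):

A2 = np.array([[0.9, 0.2, 0.7]])   # predicted probabilities ŷ, shape (1, 3)
Y  = np.array([[1,   0,   1]])     # true labels y, shape (1, 3)
print(compute_cost(A2, Y))         # ≈ 0.23

The first and third examples are predicted confidently and correctly, so the averaged cost stays low.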

