### Neural Networks Tutorial

Artificial Neural Networks (ANNs) are *the* hype in Computer Science, and I have jumped on the bandwagon.
I will surely write about how I want to apply ANNs to the field of communication networks in some later post, but for now I want to present what I have gathered about these lovely little thingies, and show it in a way that would have been very useful to me when I first started looking into ANNs:

**I will showcase a brief, hands-on example of how ANNs work, and provide some TensorFlow / Keras code to back up my claims.**

The examples are based on **Andrew Ng’s course on Machine Learning on coursera.org**, which I can certainly recommend. The Python code is written by me and uses `tensorflow v1.14` and its built-in Keras library.

To keep things short, I won’t introduce everything about ANNs; you can easily find info on this on thousands of other sites, online courses and books.
Instead, I walk through the practical example of *learning logic operators through ANNs* from the above-mentioned online course.

# Learning the logical `AND`

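You probably know the `AND` operator. Its truth table fully defines its function (output 1 only if both inputs are 1):

| $x_1$ | $x_2$ | $x_1$ AND $x_2$ |
|-------|-------|-----------------|
| 0     | 0     | 0               |
| 0     | 1     | 0               |
| 1     | 0     | 0               |
| 1     | 1     | 1               |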

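We can approximate the `AND` function with a single-layer Neural Network. Trained on the truth table above for a number of epochs, it could learn weights such as a bias of $-30$ and input weights of $20$ and $20$. With the *Sigmoid function*
$$ g(z) = \frac{1}{1 + e^{-z}}, \qquad z = \Theta^T x $$
as activation function, the ANN is capable of learning the `AND` function, as its prediction becomes
$$ h_\Theta(x) = g(-30 + 20x_1 + 20x_2) $$
which leads to

| $x_1$ | $x_2$ | $z$   | $h_\Theta(x) = g(z)$ |
|-------|-------|-------|----------------------|
| 0     | 0     | $-30$ | $\approx 0$          |
| 0     | 1     | $-10$ | $\approx 0$          |
| 1     | 0     | $-10$ | $\approx 0$          |
| 1     | 1     | $10$  | $\approx 1$          |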
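These hand-picked weights can be checked numerically before any training happens. The following is a minimal NumPy sketch; the helper names `sigmoid` and `predict_and` are my own for illustration and do not come from the course or from the Keras code further down.

```python
import numpy as np


def sigmoid(z):
    """Logistic function g(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + np.exp(-z))


def predict_and(x1, x2, bias=-30.0, w1=20.0, w2=20.0):
    """Hypothesis h(x) = g(bias + w1*x1 + w2*x2) with the weights from the text."""
    return sigmoid(bias + w1 * x1 + w2 * x2)


inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
hand_picked = [float(predict_and(x1, x2)) for x1, x2 in inputs]

# Print each input pair with its raw sigmoid output and the rounded class.
for (x1, x2), h in zip(inputs, hand_picked):
    print((x1, x2), "->", h, "rounds to", int(round(h)))
```

Only the input `(1, 1)` pushes $z$ above zero, so it is the only one whose output rounds to 1.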

## And now in Python

```
import tensorflow.compat.v1 as tf
from tensorflow.keras import layers
import numpy as np


class LossHistory(tf.keras.callbacks.Callback):
    # Not defined in the original listing; assumed here to be a small
    # callback that records the training loss after every epoch.
    def on_train_begin(self, logs=None):
        self.losses = []

    def on_epoch_end(self, epoch, logs=None):
        self.losses.append((logs or {}).get('loss'))


def construct_single_layer_ann(learning_rate, num_inputs, num_outputs):
    model = tf.keras.Sequential()
    model.add(layers.Dense(units=num_outputs, activation='sigmoid', input_shape=(num_inputs,)))
    # Gradient Descent + Mean Squared Error loss function.
    model.compile(optimizer=tf.train.GradientDescentOptimizer(learning_rate), loss='mse', metrics=['accuracy'])
    return model


def get_training_data_AND():
    # One row per input pair, one label per row.
    data = np.array([(0, 0), (0, 1), (1, 0), (1, 1)], dtype=np.float32)
    labels = np.array([0, 0, 0, 1], dtype=np.float32)
    return data, labels


def print_layer_weights(model, num_layer):
    # get_weights() returns [kernel, bias] for a Dense layer.
    layer_weights, bias_weights = model.layers[num_layer].get_weights()
    print("Learned weights are", end=' ')
    for i in range(len(layer_weights)):
        for j in range(len(layer_weights[i])):
            # Weight from input i+1 to unit j of this layer.
            print("n_" + str(num_layer+1) + "," + str(i+1) + "_" + str(j) + "=" + str(layer_weights[i][j]), end=' ')
    for i in range(len(bias_weights)):
        print("b_" + str(num_layer+1) + "," + str(i) + "=" + str(bias_weights[i]), end=' ')
    print()


if __name__ == "__main__":
    learning_rate = 10.0
    num_epochs = 50
    verbose = 0
    # Logical AND model.
    model_AND = construct_single_layer_ann(learning_rate, 2, 1)
    # Provide perfect training data.
    data_AND, labels_AND = get_training_data_AND()
    # Fit the model to the data.
    history = LossHistory()
    model_AND.fit(data_AND, labels_AND, epochs=num_epochs, verbose=verbose, callbacks=[history])
    # Print weights and loss.
    print("Logical AND ANN:\n================")
    print_layer_weights(model_AND, 0)
    print("Final loss is " + str(history.losses[-1]))
    predictions = model_AND.predict(data_AND)
    for i in range(len(predictions)):
        inputs = tuple(int(v) for v in data_AND[i])
        print(str(inputs) + " -> " + str(predictions[i][0]) + " ≈ " + str(round(predictions[i][0])))
    print()
```

which outputs

```
Logical AND ANN:
================
Learned weights are n_1,1_0=3.8320987 n_1,2_0=3.8320732 b_1,0=-5.8527803
Final loss is 0.012041996
(0, 0) -> 0.002863679 ≈ 0.0
(0, 1) -> 0.117045894 ≈ 0.0
(1, 0) -> 0.11704853 ≈ 0.0
(1, 1) -> 0.85953003 ≈ 1.0
```

We can see that instead of learning the weights -30, 20, 20, the network learns roughly -5.8, 3.8, 3.8. The *ratio* of bias to input weight is almost the same, though:
$$\frac{-30}{20} = -1.5 \approx{} \frac{-5.8}{3.8} = -1.53$$
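To see what this similar ratio means in practice, the sketch below plugs both weight sets into the Sigmoid by hand and prints where each one places its decision boundary. The learned values are copied from the printed output above rather than read from the model, and `neuron_output` is a hypothetical helper name of mine.

```python
import numpy as np


def neuron_output(x, weights, bias):
    """Single sigmoid neuron: g(bias + weights . x)."""
    z = bias + np.dot(weights, x)
    return 1.0 / (1.0 + np.exp(-z))


# Weights copied from the printed training output (assumed, not read from the model).
learned_w = np.array([3.8320987, 3.8320732])
learned_b = -5.8527803
# Hand-picked weights from the course example.
textbook_w = np.array([20.0, 20.0])
textbook_b = -30.0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    learned = float(neuron_output(np.array(x), learned_w, learned_b))
    textbook = float(neuron_output(np.array(x), textbook_w, textbook_b))
    print(x, "learned:", round(learned, 4), "textbook:", round(textbook, 4))

# The output crosses 0.5 where bias + w * (x1 + x2) = 0, i.e. x1 + x2 = -bias / w.
print("learned boundary:  x1 + x2 =", -learned_b / learned_w.mean())
print("textbook boundary: x1 + x2 =", -textbook_b / textbook_w.mean())
```

Both boundaries lie between 1 and 2, so only the input `(1, 1)` ends up on the positive side. The larger weights merely make the Sigmoid steeper, which is why the hand-picked weights give outputs much closer to 0 and 1.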

This concludes this very brief introduction. The practical example of the `AND` function seemed intuitive to me. All an ANN does is adapt its weights until the loss (here, the mean squared error between prediction and target) is adequately small. Since the Sigmoid function is non-linear, networks that stack several such layers can approximate non-linear functions; a single layer like the one above is limited to linearly separable problems such as `AND`. And Keras gives us a way of implementing this in very few lines of code. :)