Neural networks form the basis of many modern AI systems, giving machines a way to recognize patterns and solve complex problems. In simple terms, a neural network is made up of layers of interconnected nodes, loosely modelled on the neurons of the human brain. This layered structure lets the same basic design be applied to a wide range of tasks.
The branch of AI that builds on neural networks with many layers is known as Deep Learning.
Structure of Neural Networks
A typical neural network consists of three kinds of layers:
Input Layer: This layer receives the data.
Hidden Layer(s): These layers perform a series of computations on the data, using the net input function (a weighted combination of the inputs) followed by an activation function.
Output Layer: This layer produces the result or prediction from the computations passed on by the hidden layers.
In order to distinguish these systems from real brains, we call them Artificial Neural Networks (ANNs).
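The flow of data through the three kinds of layers can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: the layer sizes, the weight and bias values, the choice of a sigmoid activation, and the names `layer_forward` and `forward` are all assumptions made for the example.

```python
import math

def sigmoid(z):
    """A common activation function that squashes any number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, biases):
    """Compute the activations of one layer.

    `weights` holds one list of weights per node in the layer.
    """
    outputs = []
    for node_weights, bias in zip(weights, biases):
        # Net input: weighted combination of the inputs, minus the node's bias
        net_input = sum(w * x for w, x in zip(node_weights, inputs)) - bias
        outputs.append(sigmoid(net_input))
    return outputs

# Hypothetical network: 3 inputs -> 2 hidden nodes -> 1 output node
hidden_weights = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_biases = [0.0, 0.1]
output_weights = [[1.0, -1.0]]
output_biases = [-0.2]

def forward(inputs):
    """Input layer -> hidden layer -> output layer."""
    hidden = layer_forward(inputs, hidden_weights, hidden_biases)
    return layer_forward(hidden, output_weights, output_biases)

prediction = forward([1.0, 0.0, 1.0])
print(prediction)  # a single value between 0 and 1
```

The input layer is simply the list passed to `forward`; each later layer applies the same weighted-sum-plus-activation step to the output of the layer before it.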
Nodes as Linear Regression Models
In essence, each node or neuron in the neural network works much like a linear regression model. In that context:
Linear Regression: A statistical method that predicts an outcome as a weighted sum of input values, with the weights fitted to historical data.
Weights: Define how much each input will influence the output.
Bias: A constant that shifts the node’s output, acting like a threshold that the weighted inputs must overcome.
Take a practical example from cybersecurity: phishing email detection.
Assume we are designing a neural network to classify emails as either “phishing” or “not phishing” based on several factors:
- Presence of suspicious links (x1)
- Unusual sender address (x2)
- Spelling and grammar errors (x3)
Each factor is assigned a weight based on its importance in identifying phishing emails:
- Suspicious links (x1) might be weighted at 4.
- Unusual sender address (x2) could be weighted at 3.
- Spelling and grammar errors (x3) might be weighted at 2.
We use these weights to calculate the output of the node:
Output = (x1×weight1) + (x2×weight2) + (x3×weight3) − bias
For instance, consider an email with the following characteristics:
- Suspicious links: 1 (present)
- Unusual sender address: 1 (yes)
- Spelling and grammar errors: 0 (none)
And the weights:
- Suspicious links weight: 4
- Unusual sender address weight: 3
- Spelling and grammar errors weight: 2
With a bias of 3, the calculation would be:
Output = (1×4) + (1×3) + (0×2) − 3
Output = 4 + 3 + 0 − 3
Output = 4
If the output exceeds a chosen threshold, say 2, the email is classified as phishing; here 4 is greater than 2, so this email would be flagged. Changing the weights or the bias changes how the network classifies emails with different features.
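The worked example above can be reproduced in a few lines of Python. The weights, bias, and threshold are the ones used in the text; the function names `node_output` and `is_phishing` and the second "clean" email are assumptions added for illustration.

```python
def node_output(features, weights, bias):
    """Weighted sum of the features minus the bias, as in the formula above."""
    return sum(x * w for x, w in zip(features, weights)) - bias

def is_phishing(features, weights, bias, threshold):
    """Classify as phishing when the node output exceeds the threshold."""
    return node_output(features, weights, bias) > threshold

# Values taken from the example in the text
weights = [4, 3, 2]   # suspicious links, unusual sender, spelling errors
bias = 3
threshold = 2

email = [1, 1, 0]     # links present, unusual sender, no spelling errors
score = node_output(email, weights, bias)
print(score, is_phishing(email, weights, bias, threshold))  # 4 True

# Hypothetical second email: only spelling errors are present
clean_email = [0, 0, 1]
clean_score = node_output(clean_email, weights, bias)
print(clean_score, is_phishing(clean_email, weights, bias, threshold))  # -1 False
```

Raising the bias or the threshold makes the node harder to trigger, while raising a weight makes the corresponding feature count for more.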
Training Neural Networks
Neural networks become more accurate through training on labelled data, an approach known as supervised learning. This involves:
Training Data: Data with known outcomes from which the model can learn.
Cost Function: A metric that measures how far the model’s predictions are from the known outcomes.
Gradient Descent: An optimization method that repeatedly adjusts the weights and biases to reduce the cost function, and with it the prediction error.
As training progresses, the model iteratively adjusts its parameters to fit the data better, which is how its predictive performance gradually improves.
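To make the training loop concrete, here is a small sketch of gradient descent on a single node with one input. Everything in it is an assumption chosen for illustration: the toy dataset (whose outcomes follow y = 2x − 1), the mean squared error cost, the learning rate, and the number of steps. The node uses the same "weighted input minus bias" form as the phishing example.

```python
# Hypothetical labelled training data: the known outcomes follow y = 2*x - 1
inputs = [0.0, 1.0, 2.0, 3.0]
targets = [-1.0, 1.0, 3.0, 5.0]

def predict(x, weight, bias):
    """Single node with one input: weighted input minus bias."""
    return x * weight - bias

def cost(weight, bias):
    """Mean squared error between predictions and known outcomes."""
    errors = [predict(x, weight, bias) - y for x, y in zip(inputs, targets)]
    return sum(e * e for e in errors) / len(errors)

def train(learning_rate=0.05, steps=2000):
    """Gradient descent: repeatedly nudge weight and bias downhill on the cost."""
    weight, bias = 0.0, 0.0
    n = len(inputs)
    for _ in range(steps):
        errors = [predict(x, weight, bias) - y for x, y in zip(inputs, targets)]
        # Partial derivatives of the mean squared error
        grad_weight = 2.0 * sum(e * x for e, x in zip(errors, inputs)) / n
        grad_bias = -2.0 * sum(errors) / n  # negative because bias is subtracted
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias

initial_cost = cost(0.0, 0.0)
weight, bias = train()
final_cost = cost(weight, bias)
# The learned parameters should approach weight 2 and bias 1
print(weight, bias, initial_cost, final_cost)
```

Real networks apply the same idea to many weights at once, with the gradients computed layer by layer rather than written out by hand.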
Types of Neural Networks
Beyond the basic feedforward neural network described above, there are other variants:
Convolutional Neural Networks (CNNs): Designed to recognize spatial patterns, they are useful in applications involving image processing (such as image-based phishing detection) and malware analysis.
Recurrent Neural Networks (RNNs): These contain feedback loops, which makes them useful for processing sequences and time-series data (such as threat intelligence feeds and anomaly detection in logs).
Each type of neural network was developed with particular kinds of data and tasks in mind, which is why different architectures suit different problems.
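The core operation behind each variant can be sketched very roughly in Python. This is a simplified, one-dimensional illustration under stated assumptions, not how real CNN or RNN layers are implemented: the function names `conv1d` and `rnn_states`, the kernel, the weights, and the `tanh` activation are all chosen for the example.

```python
import math

def conv1d(signal, kernel):
    """Slide a small kernel along the signal, taking a weighted sum at each
    position. The same few weights are reused everywhere, which is the core
    idea of a convolutional layer."""
    k = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def rnn_states(sequence, w_in, w_rec, h0=0.0):
    """Process a sequence one item at a time, feeding the previous state back
    in. This feedback loop is what gives a recurrent network its memory."""
    h = h0
    states = []
    for x in sequence:
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

# Hypothetical edge-detecting kernel: responds where the signal changes
edges = conv1d([0, 0, 1, 1, 0], [-1, 1])
print(edges)  # [0, 1, 0, -1]

# Hypothetical sequence: one spike followed by silence
states = rnn_states([1.0, 0.0, 0.0], w_in=1.0, w_rec=0.5)
print(states)  # later states stay non-zero: the spike is remembered
```

The convolution reacts only where the pattern appears, regardless of position, while the recurrent state keeps a fading trace of earlier inputs, which is why the two designs suit images and sequences respectively.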
Understanding these basics provides the background for more advanced concepts and applications relating to neural networks and deep learning.