“I learned very early the difference between knowing the name of something and knowing something.” –Richard Feynman
Terms like deep learning and neural networks have been tossed around a lot lately, but few people outside of Google and MIT can explain simply what they are, how they work or why they’re used.
It’s no wonder: deep learning gets into some pretty deep calculus. It also requires an enormous amount of data and a seemingly endless amount of repetition.
But machine learning is a fascinating concept. The idea that you can teach a machine to recognize an object and then generate a human-like sentence describing that object within different environments is beyond incredible.
If you use Siri or Alexa, you’re experiencing the benefits of neural networks. The movie recommendations Netflix and Amazon give you are the result of neural networks. When your credit card company alerts you about unusual activity on your card, or Facebook recognizes your face in an image, there is a neural network behind it.
Neural networks affect and will continue to affect so many aspects of our lives that having a basic understanding of how they work is essential, especially in business. In a future where more and more jobs will be automated, possessing a skill like this will pay dividends.
Search engine optimization and digital marketing, in particular, are two fields currently feeling the reverberating effects of neural networks. Voice search has given rise to long tail keywords and more specific search results. Google’s search algorithms are refined regularly, elevating quality long form content while punishing spam, repetition and low-quality websites.
While deep learning and neural networks seem complex (and they are very complex), understanding the basics of how they work is easier than it sounds.
Part computer science, part neuroscience, neural nets are at the most basic level just layers of artificial neurons, not as complex as human neurons but based on similar principles.
You feed a neural net data and it generates a result. Give it an input and it gives you an output. Feed it enough data and it learns to beat professionals at ancient Chinese board games, recognize cats on YouTube or drive a car.
A Brief History
This is pushing the limits of my competence, but over the next 2,000 words or so I’m going to try to explain how neural networks function at their most basic level. I’ll also cover the common terminology, some history, the different types of neural networks and learning algorithms.
Before we go down the proverbial rabbit hole let’s cover some history and basic terminology.
Neural networks are the basis of deep learning, which is a type of machine learning, which is a subfield of artificial intelligence. Some neural nets use supervised learning, which means that the learning is guided by labeled data the machine is fed. Others use unsupervised learning, in which the machine sorts unlabeled data into groups or categories on its own, and others use reinforcement learning, which uses rewards or punishments to guide learning. At the moment, deep neural nets are the most promising avenue in the quest for true artificial intelligence.
The concept of neural networks has been around since the 1940s, when two researchers at the University of Chicago, Warren McCulloch and Walter Pitts, published a paper titled “A Logical Calculus of the Ideas Immanent in Nervous Activity.” The seminal paper is still referenced today.
Neural networks were initially one of the most prominently researched fields of computer science. The first functional neural net was created in 1957 by Frank Rosenblatt, a research psychologist at Cornell. Called a Perceptron, it had a very simple learning algorithm and only a few layers of neurons.
In 1969 two researchers from MIT, Marvin Minsky and Seymour Papert, published a book, Perceptrons, focusing on the drawbacks of perceptrons, arguing that they were error prone and advocating the use of fledgling programming languages that would generate more exact computational results. The book is often cited as the reason for a halt in research, or what’s known as the “AI Winter.”
There was a resurgence of interest in the 1980s, but it wasn’t until the last decade that the field took off again thanks to an exponential growth in computing power. Researchers like Paul Werbos and Geoff Hinton (who went on to work for Google) kept the field alive when computing speed couldn’t keep up with expectations. They helped lay the framework for the field’s recent successes. Hinton, a cognitive psychologist and computer scientist, has developed a number of machine learning algorithms and, along with Werbos, helped pioneer the use of backpropagation, which we’ll get into shortly. But first let’s take a look at the structure of a neuron.
Neurons, Real and Artificial
Our brains are packed full of neurons, tens of billions of them, crisscrossed in a tangled web of activity, buzzing with electricity and chemical signals.
At its most basic, a neuron is a cell that transfers information. It receives an electrical impulse (input), and if that impulse exceeds a certain value it fires off an electrical or chemical signal (output) to communicate with other neurons, which eventually do something like move the muscles in your arm or trigger a memory from your childhood.
Our knowledge of the brain is still fairly limited but we know for sure that learning and memory are interconnected. Some neuroscientists believe that memories reside in the connections between networks of neurons, while others think memories exist within actual brain cells.
Memories, both short term and long term, inform how we learn and also help us express the knowledge we’ve acquired. There are two types of long-term memory, declarative and non-declarative. Declarative or explicit memory is fact based, while non-declarative memory is implicit and procedural. Researchers believe the closer they can get to understanding the process behind learning and memory, the better chance they have of creating true artificial intelligence.
To develop a clear understanding of how neural networks operate we need to understand the anatomy of a human neuron.
A neuron consists of three major components: a cell body, an axon and dendrites. The dendrites receive signals from other neurons and send them to the cell body, which relays the signal to the axon, which forwards it on to the synapse, the connection or space between cells. A spike in electrical activity in the axon causes a charge to be injected into the post-synaptic neuron: the synapse receives the electrical charge, which causes a chemical (a neurotransmitter) to be released. Synapses adapt using locally available signals. Each neuron receives inputs from other neurons, and cortical neurons use spikes in electricity to communicate.
An artificial neuron like the one below is just a mathematical representation of a human neuron, although millions of times less complex.
How a Neural Net Works
If you like linear algebra and calculus then this part is for you. If you don’t, keep reading anyway; I’ll try to make it as simple as possible.
Artificial neurons are mathematical models of human neurons. The electrical and chemical impulses are represented by numbers. Notice that none of the terms used for a biological neuron are represented in an artificial neuron, but the two work in basically the same way.
Neural nets are made up of thousands and sometimes millions of artificial neurons. A simple network like the one pictured below is feed forward, meaning that information flows in one direction.
This is a very basic neural network with only three layers: an input layer, a hidden layer and an output layer. Networks can have an enormous number of inputs and hidden layers. What makes a deep neural network deep is having more than one hidden layer.
There are three nodes in the input layer. Each node receives an input value which is multiplied by a weight.
What is a weight? A weight is a number that sets how strongly an input influences a neuron. It is often paired with a bias, a separate number added to the weighted sum, although the two are not the same thing. The first time a network is fed data, during the training phase, the weights and biases are assigned randomly. The target output is 1 for the correct answer and 0 for everything else (in reality the numbers used are closer to 0.9 and 0.1). The weights are adjusted based on the accuracy of the output. Each input is multiplied by its weight, the results are added together, and if the sum exceeds a certain value the neuron fires. This all-or-nothing rule is called a step function, which is one kind of activation function. Many neural nets use a sigmoid function instead, which is better suited for backpropagation: its S-shaped curve takes any real value and turns it into a number between 0 and 1.
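A single artificial neuron of the kind described above can be sketched in a few lines of Python. This is only an illustration: the input values, the weights and the separate `bias` term are made-up numbers, and the function names are my own rather than anything from a particular library.

```python
import math

def step(z, threshold=0.0):
    """Fire (return 1) only if the weighted input exceeds the threshold."""
    return 1 if z > threshold else 0

def sigmoid(z):
    """Squash any real number into the range 0 to 1 along an S-shaped curve."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation):
    """Multiply each input by its weight, add everything up, then activate."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(z)

# Hypothetical values, chosen only for illustration.
inputs = [0.5, 0.1, 0.9]
weights = [0.4, -0.6, 0.2]
bias = 0.0

fired = neuron(inputs, weights, bias, step)      # all-or-nothing: 0 or 1
smooth = neuron(inputs, weights, bias, sigmoid)  # graded: between 0 and 1
print(fired, round(smooth, 3))
```

The step version can only say yes or no, while the sigmoid version says how strongly, which is what makes small weight adjustments measurable during training.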
A backpropagation algorithm is a way of working out how much each weight contributed to the error in the outputs you get. You work backwards from the output layer with a ton of complicated math (mostly the chain rule from calculus) to determine the new weights for the hidden layer. You repeat this process over and over until you get the desired result.
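To make the "working backwards" part less mysterious, here is a rough sketch of one backpropagation step for a tiny network with two inputs, two hidden neurons and one output. I am assuming sigmoid neurons, no bias terms, and an error measured as half the squared difference between output and target; the starting weights are arbitrary numbers, not values from any real system.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Run the inputs through the hidden layer, then the single output neuron."""
    hidden = [sigmoid(sum(xi * wi for xi, wi in zip(x, ws))) for ws in w_hidden]
    output = sigmoid(sum(h * w for h, w in zip(hidden, w_out)))
    return hidden, output

def backprop(x, target, w_hidden, w_out):
    """Gradient of 0.5 * (output - target)**2 with respect to every weight."""
    hidden, output = forward(x, w_hidden, w_out)
    # Error signal at the output neuron (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Send the error signal backwards through the output weights.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out

def train_step(x, target, w_hidden, w_out, lr=0.5):
    """Nudge every weight a small step against its gradient."""
    grad_hidden, grad_out = backprop(x, target, w_hidden, w_out)
    new_hidden = [[w - lr * g for w, g in zip(ws, gs)]
                  for ws, gs in zip(w_hidden, grad_hidden)]
    new_out = [w - lr * g for w, g in zip(w_out, grad_out)]
    return new_hidden, new_out

# Arbitrary example values.
x, target = [1.0, 0.5], 1.0
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.5, -0.6]

_, before = forward(x, w_hidden, w_out)
new_hidden, new_out = train_step(x, target, w_hidden, w_out)
_, after = forward(x, new_hidden, new_out)
print(f"output before: {before:.3f}, after one update: {after:.3f}")
```

One update only moves the output slightly toward the target, which is why the process has to be repeated so many times.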
This is where the endless repetition comes in. To increase the accuracy of the neural net at whatever you’re trying to get it to do, you have to keep feeding it data. To determine whether your network is learning, you establish a cost function and adjust the weights accordingly. A cost function is just a way of checking the network’s work: it measures how far the outputs are from the correct answers, so a lower cost means a more accurate network.
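Here is a small sketch of that loop in Python. To keep it short I am assuming a single sigmoid neuron rather than a full network, mean squared error as the cost function, and four labeled examples of logical OR as the training data. All of those are choices made for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(x, weights, bias):
    return sigmoid(sum(xi * wi for xi, wi in zip(x, weights)) + bias)

def cost(data, weights, bias):
    """Mean squared error: the average squared gap between output and target."""
    return sum((predict(x, weights, bias) - t) ** 2 for x, t in data) / len(data)

def train(data, epochs=2000, lr=1.0):
    """Repeat: measure the cost, work out the gradients, adjust the weights."""
    weights, bias = [0.0, 0.0], 0.0
    history = []
    for _ in range(epochs):
        history.append(cost(data, weights, bias))
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, t in data:
            y = predict(x, weights, bias)
            # Derivative of the cost with respect to this example's weighted sum.
            delta = 2.0 * (y - t) * y * (1.0 - y) / len(data)
            grad_w = [g + delta * xi for g, xi in zip(grad_w, x)]
            grad_b += delta
        weights = [w - lr * g for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b
    history.append(cost(data, weights, bias))
    return weights, bias, history

# Labeled examples of logical OR: (inputs, correct answer).
OR_DATA = [([0.0, 0.0], 0.0), ([0.0, 1.0], 1.0),
           ([1.0, 0.0], 1.0), ([1.0, 1.0], 1.0)]

weights, bias, history = train(OR_DATA)
print(f"cost before: {history[0]:.3f}, cost after: {history[-1]:.3f}")
```

Watching the cost fall from one pass to the next is the practical signal that the network is learning rather than just churning.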
Wash. Rinse. Repeat.
Types of Neural Nets
It would be extremely difficult to provide a definitive list of all the different types of neural networks; new variations are invented all the time. For the purpose of simplicity, I’ll list the three most common or generalized categories of neural networks: convolutional, feed forward and recurrent. Remember, this is not a definitive list; within each of these categories are many subcategories and combinations. I’ve included an infographic below from the Asimov Institute that lists examples of the many different types of neural networks.
Convolutional Neural Networks
Convolutional neural networks are used for machine vision, image recognition and image analysis. They are based on the visual cortex of the brain. They reduce the number of parameters required in a typical network and improve generalization. The term convolutional comes from an operation in mathematical analysis in which two functions are combined to generate a third function.
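The core operation can be sketched in plain Python. The tiny "image" and the two-number filter below are invented for illustration. The sketch slides the filter across the image without flipping it, which as far as I know is how most deep learning libraries implement what they call convolution.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image, summing element-wise products.

    Only positions where the kernel fits entirely inside the image are used.
    """
    k_rows, k_cols = len(kernel), len(kernel[0])
    out_rows = len(image) - k_rows + 1
    out_cols = len(image[0]) - k_cols + 1
    output = []
    for r in range(out_rows):
        row = []
        for c in range(out_cols):
            total = 0
            for i in range(k_rows):
                for j in range(k_cols):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output

# A made-up 3x4 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# A made-up filter that responds where brightness jumps from left to right.
kernel = [[-1, 1]]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The same two filter values are reused at every position in the image, which is where the savings in parameters come from.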
Facebook most likely uses CNN technology in its facial recognition software. This is assumed because CNNs were pioneered by Yann LeCun, the director of Facebook’s AI group. Convolutional neural nets are currently the most promising type of image recognition software.
Feed Forward Neural Network
A feed forward network is the simplest type of neural network. Like the one described above, it is based on the perceptron, and all information flows in one direction, from the input layer to the hidden layer to the output layer.
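As a rough sketch, a forward pass through a network with three inputs, two hidden neurons and one output might look like this in Python. The layer sizes, weights and biases are all invented for the example, and I am assuming sigmoid neurons throughout.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: every neuron takes a weighted sum of all the inputs."""
    return [
        sigmoid(sum(x * w for x, w in zip(inputs, neuron_weights)) + b)
        for neuron_weights, b in zip(weights, biases)
    ]

def feed_forward(inputs, network):
    """Pass values through each layer in order; nothing flows backwards."""
    activations = inputs
    for weights, biases in network:
        activations = layer(activations, weights, biases)
    return activations

# Hypothetical network: (weights, biases) for each layer after the input layer.
network = [
    ([[0.2, -0.4, 0.1], [0.7, 0.3, -0.5]], [0.0, 0.1]),  # hidden: 2 neurons
    ([[0.6, -0.3]], [0.05]),                             # output: 1 neuron
]

output = feed_forward([1.0, 0.5, -1.0], network)
print(output)
```

Each layer only ever sees the output of the layer before it, which is all that "feed forward" means.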
Recurrent Neural Network
Recurrent neural networks are extremely versatile. They are used for everything from handwriting and voice recognition to self-driving cars and document classification. With built-in feedback loops that allow them to adapt to changing data sets, they are also great for forecasting markets and weather.
RNNs have a different structure than feed forward networks: in their simplest form they have a single recurrent layer whose output is fed back in as an input at the next step. That makes them better for sequential data. Google uses an RNN for its speech recognition software.
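The feedback loop is easier to see in code. Below is a minimal sketch of a recurrent unit with a single number as its memory. The tanh activation and the weight values are my own assumptions for illustration, not a description of any production system.

```python
import math

def rnn_step(x, h_prev, w_in, w_rec, bias):
    """Combine the new input with the previous state to get the new state."""
    return math.tanh(w_in * x + w_rec * h_prev + bias)

def run_rnn(sequence, w_in=1.0, w_rec=0.5, bias=0.0):
    """Feed a sequence in one item at a time, carrying the state forward."""
    h = 0.0  # the network's memory starts out empty
    states = []
    for x in sequence:
        h = rnn_step(x, h, w_in, w_rec, bias)
        states.append(h)
    return states

# The same values in a different order leave the network in a different state.
early = run_rnn([1.0, 0.0, 0.0])
late = run_rnn([0.0, 0.0, 1.0])
print([round(h, 3) for h in early])
print([round(h, 3) for h in late])
```

Because the state left over from earlier steps feeds into later ones, the order of the inputs matters, which is exactly what you want for speech, text and other sequences.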
Below is a fairly exhaustive list, created by the Asimov Institute, of the different types of neural networks.
A normal computer program can perform a sequence of steps very fast but it needs an exact set of instructions. Richard Feynman uses the analogy of an incredibly fast but stupid file clerk. If you want something done you have to explain it in painstaking detail, in a language the clerk understands with absolutely no syntax errors.
Move three steps forward, turn right, move five steps forward, turn left, move two steps forward, turn right, move two steps forward, stop, grab the handle of the file cabinet, open the file cabinet, get the file.
The human mind is capable of performing rapid parallel computations, but it can also adapt to unforeseen conditions and make judgements based on an incomplete set of facts, all while using very little energy. One of the goals of machine learning is to replicate this type of parallel thinking and reduce the number of instructions you need to give a machine. Instead of giving it an exhaustive set of instructions like you would a normal computer program, in machine learning you provide a generalized set of instructions.
Calculate the shortest distance to the file cabinet. Get a specific file. Repeat these instructions until you get it right.
Analyze the test results of millions of cancer patients and find a pattern that predicts a higher likelihood of cancer.
Determine if the recent climate data is an anomaly or part of a larger trend.
Identify youth most at risk of dropping out of high school and recommend resources that have generated the best outcomes in past instances.
Teaching machines to think could solve major real-world issues, but it could also create unfathomable problems. Neural nets have emerged as the most promising new type of machine learning, and in a rapidly changing world they seem to hold the path to real AI. Every day they creep further into our lives. Understanding their uses and having some familiarity with their inner workings is important for too many reasons to list. As the famous Norwegian diplomat and historian Christian Lous Lange once said, “technology is a useful servant but a dangerous master.”
If you’re interested in neural networks and AI research, here are a few resources I used to write this blog that you might enjoy:
Geoffrey Hinton’s class Neural Networks for Machine Learning
A free class from MIT, Deep Learning for Self-Driving Cars
Nick Bostrom’s book Superintelligence: Paths, Dangers, Strategies
Google’s Research Blog
Pedro Domingos’s book The Master Algorithm
Richard Feynman’s Lecture on Computer Heuristics