Article by Johanna Pingel, MathWorks product marketing manager.
Deep learning is getting lots of attention lately, and for good reason. It’s making a big impact in areas such as computer vision and natural language processing. It’s a key technology behind driverless cars and behind voice control in consumer devices such as phones and hands-free speakers.
In deep learning, a computer model learns to perform classification tasks directly from images, text, or sound. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. Most deep learning methods use neural network architectures, which is why deep learning models are often referred to as deep neural networks.
The term “deep” usually refers to the number of hidden layers in the neural network. Traditional neural networks contain only 2-3 hidden layers, while deep networks can have as many as 150. One of the most popular types of deep neural network is the convolutional neural network (CNN or ConvNet). A CNN convolves learned features with input data and uses 2D convolutional layers, which makes this architecture well suited to processing 2D data such as images.
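To make the idea of convolving features with input data concrete, here is a minimal sketch of a single 2D convolution in plain Python. The function name `conv2d`, the toy image, and the kernel values are assumptions made for illustration. In a real CNN the kernel weights are learned during training rather than picked by hand, and a framework would run this on optimised hardware.

```python
def conv2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map.

    Like most deep learning frameworks, this computes cross-correlation:
    the kernel is not flipped before sliding.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(kh):
                for dj in range(kw):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-picked vertical-edge kernel (an assumption; a CNN would learn its weights).
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0], [0, 2, 0]]
```

The feature map responds only in the middle column, where the image changes from dark to bright. A convolutional layer applies many such kernels at once and stacks the resulting feature maps.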
Using an image example, a fully trained deep learning model will be able to automatically identify objects in images, even if it has never seen those exact images before. Ever wondered how certain websites can identify specific people in photos that were just uploaded? That’s deep learning at work.
Many of the techniques used in deep learning today have been around for decades. For example, deep learning has been used to recognise handwritten postal codes in the mail service since the 1990s.
Why has deep learning surged in popularity recently?
The main reason is accuracy. Deep learning models can achieve state-of-the-art accuracy, sometimes exceeding human-level performance. In addition, two main factors have made these advances possible:
Deep learning requires large amounts of labelled data. For example, driverless car development requires millions of images and thousands of hours of video. Labelled data sets of this size have only recently become widely available.
Deep learning requires substantial computing power. High-performance GPUs have a parallel architecture that is efficient for deep learning. When combined with clusters or cloud computing, this enables development teams to reduce training time for a deep learning network from weeks to hours or less.
Deep learning and machine learning both offer ways to train models and classify data. Let’s compare these two approaches to see which scenarios suit each.
Using a standard machine learning approach, we would need to manually select the relevant features of an image, such as edges or corners, to train the machine learning model. The model then references these features when analysing and classifying new objects.
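The manual workflow just described can be sketched in a few lines of plain Python. Everything here is a simplified assumption for illustration: the two hand-picked features (mean brightness and horizontal edge strength), the nearest-centroid classifier, and the tiny 3x3 "images" are stand-ins, not a recommended pipeline.

```python
def extract_features(image):
    """Return hand-picked features: (mean brightness, mean horizontal edge strength)."""
    pixels = [p for row in image for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    diffs = [abs(row[j + 1] - row[j]) for row in image for j in range(len(row) - 1)]
    edge_strength = sum(diffs) / len(diffs)
    return (mean_brightness, edge_strength)


def train(images, labels):
    """Average the feature vectors of each class to get one centroid per label."""
    grouped = {}
    for image, label in zip(images, labels):
        grouped.setdefault(label, []).append(extract_features(image))
    return {
        label: tuple(sum(col) / len(feats) for col in zip(*feats))
        for label, feats in grouped.items()
    }


def classify(image, centroids):
    """Assign the label whose centroid is closest to the new image's features."""
    features = extract_features(image)

    def sq_dist(label):
        return sum((a - b) ** 2 for a, b in zip(features, centroids[label]))

    return min(centroids, key=sq_dist)


# Toy training set: two flat images and two striped images.
flat_dark = [[0.2] * 3 for _ in range(3)]
flat_bright = [[0.8] * 3 for _ in range(3)]
stripes_a = [[0.0, 1.0, 0.0] for _ in range(3)]
stripes_b = [[1.0, 0.0, 1.0] for _ in range(3)]

centroids = train(
    [flat_dark, flat_bright, stripes_a, stripes_b],
    ["flat", "flat", "striped", "striped"],
)

# New images the model has not seen before.
new_flat = [[0.5] * 3 for _ in range(3)]
new_striped = [[0.9, 0.1, 0.9] for _ in range(3)]
predictions = [classify(new_flat, centroids), classify(new_striped, centroids)]
print(predictions)  # ['flat', 'striped']
```

The model never sees raw pixels, only the features we chose to compute. If we had picked features that do not separate the classes, no classifier could recover the difference.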
With a deep learning workflow, relevant features are automatically extracted from images. In addition, deep learning performs “end-to-end learning” – where a network is given raw data and a task to perform, such as classification, and it learns how to do this automatically.
Another key difference is that deep learning algorithms scale with data, whereas shallow learning converges. Shallow learning refers to machine learning methods whose performance plateaus at a certain level, even when more examples and training data are added. Deep learning models, by contrast, often continue to improve as the amount of training data grows.
When choosing between machine learning and deep learning, we should ask ourselves whether we have a high-performance GPU and lots of labelled data. If either is missing, we will probably have better luck with machine learning than with deep learning.
This is because deep learning is generally more complex, so we need at least a few thousand images to get reliable results. We will also need a high-performance GPU so the model spends less time analysing all those images.
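As a rough illustration, that rule of thumb could be written as a small helper. The function name `suggest_approach` and the threshold of 2000 images are assumptions: the text says only "at least a few thousand images", so the exact number is a placeholder to adjust for your own problem.

```python
# Assumed stand-in for "a few thousand images"; not a figure from the article.
MIN_LABELLED_IMAGES = 2000


def suggest_approach(num_labelled_images, has_high_performance_gpu):
    """Suggest deep learning only when both data and compute are available."""
    if num_labelled_images >= MIN_LABELLED_IMAGES and has_high_performance_gpu:
        return "deep learning"
    return "machine learning"


print(suggest_approach(50_000, True))   # deep learning
print(suggest_approach(50_000, False))  # machine learning
print(suggest_approach(300, True))      # machine learning
```

This is a starting point for discussion rather than a hard rule. In practice the decision also depends on the task and on how much training time we can afford.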
If we select machine learning, we have the option of training our model with many different classifiers. We might also know which features to extract to produce the best results. Plus, with machine learning, we have the flexibility to combine approaches: we can try different classifiers and features to see which arrangement works best for the data.
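Trying different combinations can be as simple as a loop over feature extractors and classifiers, scoring each pair on held-out examples. The sketch below uses only the Python standard library. The two features, the two classifiers, and the toy data are all assumptions chosen to keep the example small.

```python
from itertools import product


# Two hypothetical feature extractors for small greyscale images (lists of rows).
def brightness(image):
    pixels = [p for row in image for p in row]
    return [sum(pixels) / len(pixels)]


def edge_strength(image):
    diffs = [abs(row[j + 1] - row[j]) for row in image for j in range(len(row) - 1)]
    return [sum(diffs) / len(diffs)]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Two simple classifiers sharing the signature (train_X, train_y, x) -> label.
def nearest_neighbour(train_X, train_y, x):
    best = min(range(len(train_X)), key=lambda i: sq_dist(train_X[i], x))
    return train_y[best]


def nearest_centroid(train_X, train_y, x):
    centroids = {}
    for label in sorted(set(train_y)):
        members = [f for f, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]
    return min(centroids, key=lambda label: sq_dist(centroids[label], x))


def accuracy(extract, classifier, train, test):
    """Fraction of held-out test images that one feature/classifier pair labels correctly."""
    train_X = [extract(img) for img, _ in train]
    train_y = [label for _, label in train]
    correct = sum(
        classifier(train_X, train_y, extract(img)) == label for img, label in test
    )
    return correct / len(test)


def flat(value):
    return [[value] * 3 for _ in range(3)]


def striped(a, b):
    return [[a, b, a] for _ in range(3)]


train_set = [(flat(0.2), "flat"), (flat(0.8), "flat"),
             (striped(0.0, 1.0), "striped"), (striped(1.0, 0.0), "striped")]
test_set = [(flat(0.5), "flat"), (flat(0.3), "flat"),
            (striped(0.9, 0.1), "striped"), (striped(0.1, 0.9), "striped")]

# Score every feature/classifier combination on the held-out test set.
results = {}
for extract, classifier in product([brightness, edge_strength],
                                   [nearest_neighbour, nearest_centroid]):
    results[(extract.__name__, classifier.__name__)] = accuracy(
        extract, classifier, train_set, test_set)

for combo, score in sorted(results.items(), key=lambda item: -item[1]):
    print(combo, score)
```

On this toy data, the choice of feature matters more than the choice of classifier. Edge strength separates flat from striped images cleanly, while brightness does not.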
So, in general, deep learning is more computationally intensive, while machine learning techniques are often simpler to apply.
Deep learning applications are used in many industries, from automated driving to medical devices.
Automated Driving: Automotive researchers are using deep learning to automatically detect objects such as stop signs and traffic lights. In addition, deep learning is used to detect pedestrians, which helps decrease accidents.
Industrial Automation: Deep learning is helping to improve worker safety around heavy machinery by automatically detecting when people or objects are within an unsafe distance of machines.
Electronics: Deep learning is being used in automated hearing and speech translation. For example, home assistance devices that respond to your voice and know your preferences are powered by deep learning applications.
Deep learning often seems inaccessible to non-experts, but by exploring common deep learning workflows, engineers and scientists can now quickly and easily apply deep learning to their applications.
As deep learning becomes more ubiquitous, we will continue to see innovation and evolution in applications that were previously considered impossible across fields like computer vision, natural language processing, and robotics.