In this article, I will discuss transfer learning: what it is and why it's important. The article is aimed at beginner to intermediate deep learning practitioners (some understanding of deep learning is required), so it won't be too detailed. If you are looking for more detail, you can follow the link given at the bottom of this article.
What is Transfer Learning?
Suppose your boss asks you to create a model that classifies different cars. You gather tons of car images, resize them to one shape, pass them to the model, and optimize it by trying different activation functions, loss functions, optimizers, and so on. Training takes you a week. You did your best and ended up with a model with 98%+ accuracy, but now your boss says they need a model that can classify trucks as well. Ugh! Now you have to repeat it all again, even though there's no big difference between a car and a truck. This is where transfer learning comes into the picture.
As the name suggests, transfer learning is the idea of transferring what model "A" has learned to model "B", provided that the task of model "B" is related to the task of model "A". It is a technique for reusing a pre-trained model on a new problem: the knowledge gained on the previous task helps the new model generalize. That means that to create model "B" (classifying different trucks), which is similar to model "A" (classifying different cars), we don't have to start from scratch. We can use model "A" as the starting point for classifying trucks as well. How? We will discuss that below.
How does it work?
If you think about it, training a deep learning model is essentially a process of tuning weights until they fit the task. In a computer vision problem, the neural network first learns edges in its early layers, then shapes in its middle layers, and finally larger, task-specific object parts in its later layers, which you can see in the example given below. This progression is much the same for almost all computer vision tasks. So, in our car and truck example, the edges and shapes won't differ much between cars and trucks; the difference lies mainly in the larger parts.
So, in transfer learning, the early and middle layers are frozen, meaning we don't retrain them and instead reuse the weights they have already learned. We retrain only the last layer of the model (this is not true in all cases; sometimes the middle layers need to be retrained as well). This saves a lot of computation power and training time.
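To make the freezing idea concrete, here is a minimal sketch in plain Python. It is not how a real framework does it, and everything in it is an assumption made for illustration: the `frozen_weights` values stand in for the pre-trained early layers of model "A", the toy data stands in for the new task, and the helper names (`features`, `predict`, `train_head`) are hypothetical. The only point is that training updates the last layer while the frozen weights stay untouched.

```python
import math

def features(x, frozen_weights):
    """Frozen 'early layers': a fixed linear map followed by tanh."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)))
            for row in frozen_weights]

def predict(x, frozen_weights, head):
    """Trainable 'last layer': a logistic unit on top of the frozen features."""
    h = features(x, frozen_weights)
    z = sum(w * hi for w, hi in zip(head["w"], h)) + head["b"]
    return 1.0 / (1.0 + math.exp(-z)), h

def train_head(data, frozen_weights, head, lr=0.5, epochs=200):
    """Gradient descent on the head only; frozen_weights are never modified."""
    for _ in range(epochs):
        for x, y in data:
            p, h = predict(x, frozen_weights, head)
            err = p - y  # gradient of the log-loss with respect to the logit
            head["w"] = [w - lr * err * hi for w, hi in zip(head["w"], h)]
            head["b"] -= lr * err
    return head

# Hypothetical "pre-trained" weights standing in for model A's early layers.
frozen_weights = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
frozen_copy = [row[:] for row in frozen_weights]  # kept to verify nothing changed

# Toy data for the new task (assumed): label 1 when both inputs are positive.
data = [([1.0, 0.5], 1), ([-1.0, -0.5], 0), ([0.8, 0.3], 1), ([-0.7, -0.9], 0)]

# The new last layer starts from scratch and is the only part that is trained.
head = {"w": [0.0, 0.0, 0.0], "b": 0.0}
train_head(data, frozen_weights, head)

accuracy = sum((predict(x, frozen_weights, head)[0] > 0.5) == bool(y)
               for x, y in data) / len(data)
print("accuracy on the toy data:", accuracy)
print("frozen weights unchanged:", frozen_weights == frozen_copy)
```

In a framework such as PyTorch, the same idea is usually expressed by turning off gradients for the frozen layers' parameters (`requires_grad = False`) and handing only the remaining parameters to the optimizer.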
Benefits of Transfer Learning
Suppose you want to build a model that learns complex features, but you have very little data (say, 1,000 images) and no high-end system. Trained from scratch, such a model usually requires millions of images and a lot of computation power, so you would not be able to build it. With transfer learning, you can create such a model without millions of images or high-end processors. As discussed earlier, you can take a pre-trained model (it must be related to your task) that has already been trained on millions of images and get a good model with just thousands of images.
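A rough way to see where the savings come from is to count how many parameters are actually retrained. The sketch below assumes a small fully connected network with made-up layer sizes (real pre-trained vision models are convolutional and far larger), and the helper `dense_params` is a hypothetical name. It only illustrates that when just the last layer is retrained, a tiny fraction of the weights has to be learned from the new data.

```python
def dense_params(n_in, n_out):
    """Number of weights plus biases in one fully connected layer."""
    return n_in * n_out + n_out

# Hypothetical layer sizes, chosen only for illustration.
layer_sizes = [4096, 1024, 256, 10]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]

total = sum(per_layer)
trainable = per_layer[-1]   # only the last layer is retrained
frozen = total - trainable  # everything else keeps its pre-trained weights

print(f"total parameters:       {total:,}")
print(f"frozen parameters:      {frozen:,}")
print(f"trainable parameters:   {trainable:,}")
print(f"share that is retrained: {trainable / total:.4%}")
```

Fewer trainable parameters generally means fewer gradient updates to compute and less data needed to fit them, which is consistent with the benefits described above, although the exact savings depend on the architecture and on how many layers you choose to unfreeze.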
According to Andrew Ng (co-founder and former head of Google Brain, and former Chief Scientist at Baidu), transfer learning will become a key driver of machine learning success in industry. As he said during his widely popular NIPS 2016 tutorial:
❝After supervised learning, transfer learning will be the next driver of ML commercial success.❞ - Andrew Ng
Demis Hassabis (CEO of DeepMind) also says: ❝Transfer Learning is the key to general Intelligence.❞
When should you use it?
Transfer learning is mostly used for computer vision and natural language processing tasks, where it tends to perform well, but it is not limited to them. As for when you should use it, there is no single answer, but it is a good fit in the following situations: you don't have enough labeled data but need a model that learns complex features, you don't have enough computation power, or both. And obviously, you first need to check whether a pre-trained model related to your task is available.
Apart from these guidelines, you can also look for three possible benefits when using transfer learning.
Examples of pre-trained networks
Some of the most popular pre-trained networks are:
- Word2vec, etc.
Data Scientist with 3+ years of experience in building data-intensive applications in diverse industries. Proficient in predictive modeling, computer vision, natural language processing, data visualization etc. Aside from being a data scientist, I am also a blogger and photographer.