Transfer learning is a method in which the weights of a pre-trained network are used as the starting point when training a new machine learning model. The weights of the pre-trained network are often frozen (i.e. not updated via backpropagation) and its classification head is removed; what remains becomes the "backbone" of the new model. A trainable "head" is usually added after the backbone so the new model can learn a specific task (e.g. classifying a desired number of object classes).
Transfer learning explained visually.
Transfer learning works best when the backbone has been trained on a dataset similar to the domain in which the new model will be deployed. Frequently, we see large CNNs such as VGG-16 or ResNets pre-trained on ImageNet or the COCO dataset. Since these are large-scale image classification and object detection datasets, the pre-trained networks act as general-purpose feature extractors and can be reused for other tasks. This method is especially helpful if you have minimal hardware or training resources and only a very small dataset.
The idea for the project below comes from the Udacity course Intro To Deep Learning With Pytorch, and it allowed me to practice transfer learning by classifying cats and dogs.
Loading pre-trained network
Freezing backbone and defining classifier head
Notice that with transfer learning we did not need to train for long.
My friends and I received an honourable mention award for creating a site that identifies skin, breast, and colorectal cancer. Transfer learning was the key to achieving high enough accuracy on this task. VGG-16 is frequently used in the academic literature for cancer detection, which is why we chose it; our use of ResNet50 was more experimental.
Deer detection visualization. Attribution: aUToronto
In Year 1 of the Autodrive Challenge II, the day before the Dynamic Obstacle Challenge (where our perception system was required to identify pedestrians, vehicles, and deer), we realized that the colour-segmentation approach we were using for deer detection was failing at the competition due to a change in the deer mannequin being used. Brian Cheong and I were the representatives from the Perception 2D OD team at the time, so this became a crucial moment for us. We collectively took and labelled over 2,000 images of the new deer and spent the whole night training a brand-new model. The key to doing this in such a short period of time was transfer learning! We had previously spent months fine-tuning a pedestrian and vehicle detector. By freezing the backbone of this network and training a classifier head on the new data, we achieved incredible performance and won the challenge. A demo video of the deer being detected at the competition is available above. Read more about our challenges and success at the competition.