RarePlanes soar higher: Self-supervised pretraining for resource constrained and synthetic datasets

2022
Self-supervised pretraining has advanced the capabilities of many computer vision tasks without requiring additional labels. One drawback is that this technique requires extensive datasets and computational resources. The need for large pretraining datasets has often precluded the use of smaller, more niche datasets. Recently, a pretraining method has been developed that uses several stages of training, aligning each subsequent pretraining step to a dataset that more closely resembles the target labelled data. This Hierarchical PreTraining (HPT) allows small datasets that differ significantly from generalized pretraining datasets (e.g., ImageNet) to build on successive knowledge transfers of increasingly focused training. However, there remain computer vision domains in which data is sufficiently difficult to acquire that using synthetic data to augment training has become a common convention. This paper examines how Remote Sensing Imagery (RSI) datasets, both with and without synthetic augmentation, still benefit from HPT despite belonging to a niche domain. Through a series of experiments that isolate individual training parameters, we show the fine balance that must be maintained when pretraining with these small datasets. We also demonstrate how these techniques lead to model improvements over existing baselines, with and without synthetic data. Given that HPT provides a straightforward process to increase performance, and that synthetic data is a growing resource for dataset augmentation, these combined methods can enhance a wide variety of current and future computer vision tasks.