Self-supervised pretraining for large-scale point clouds
Pretraining on large unlabeled datasets has been shown to improve downstream performance on many computer vision tasks, such as 2D object detection and video classification. However, for large-scale 3D scenes, such as outdoor LiDAR point clouds, pretraining is not widely used. Because of the distinctive characteristics of large 3D point clouds, 2D pretraining frameworks tend not to generalize well to this domain. In this paper, we propose a new self-supervised pretraining method that targets large-scale 3D scenes. We pretrain commonly used point-based and voxel-based model architectures and report their transfer learning performance on 3D object detection and semantic segmentation. We demonstrate the effectiveness of our approach on both dense indoor 3D point clouds and sparse outdoor LiDAR point clouds.
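The abstract does not detail the pretraining objective itself. As a purely illustrative sketch of the kind of self-supervised contrastive (InfoNCE-style) loss commonly used in point-cloud pretraining — an assumption for illustration, not the method proposed in this paper — one can treat features of corresponding points in two augmented views of a scene as positive pairs and all other pairs as negatives:

```python
# Illustrative sketch of a generic contrastive (InfoNCE) pretraining
# objective on point features. This is NOT the paper's specific method;
# it only shows the general self-supervised setup.
import numpy as np

rng = np.random.default_rng(0)

def l2_normalize(x, axis=-1, eps=1e-8):
    # Project features onto the unit sphere so dot products are cosines.
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)

def info_nce_loss(feats_a, feats_b, temperature=0.07):
    """Contrastive loss: matched points across two augmented views are
    positives; all other cross-view pairs in the batch are negatives."""
    a = l2_normalize(feats_a)
    b = l2_normalize(feats_b)
    logits = a @ b.T / temperature               # (N, N) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # positives on the diagonal

# Toy example: features of N corresponding points from two augmented views,
# modeled as a shared signal plus small per-view noise.
N, D = 128, 32
shared = rng.normal(size=(N, D))
view_a = shared + 0.1 * rng.normal(size=(N, D))
view_b = shared + 0.1 * rng.normal(size=(N, D))
print(info_nce_loss(view_a, view_b))
```

Minimizing such a loss pulls features of corresponding points together across views while pushing apart features of unrelated points, which is what allows the pretrained backbone to transfer to detection and segmentation.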