Unstructured object matching using co-salient region segmentation
Unstructured object matching is a less-explored and very challenging topic in the scientific literature. It covers matching scenarios where the context, appearance and geometrical integrity of the objects to be matched change drastically from one image to another (e.g. a pair of pyjamas that is folded in one image and worn by a person in the other), making it impossible to determine a transformation that aligns the matched regions. Traditional approaches such as keypoint-based feature matching perform poorly in this setting due to the high variability in viewpoint, scene context and background, and the large number of degrees of freedom in the objects' structural configurations. In this paper we propose a deep learning framework consisting of a twin-network matching approach that leverages a co-salient region segmentation task and a cosine-similarity based region descriptor pairing technique. The relevance of the proposed framework is demonstrated on a novel use case consisting of image pairs with various objects used by children. Additionally, we evaluate on Human3.6M and Market-1501, two datasets with humans depicting various appearances and kinematic configurations captured against different backgrounds.
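To make the cosine-similarity pairing step concrete, the following is a minimal sketch (not the authors' exact implementation) of how region descriptors from two images could be paired: pairwise cosine similarities are computed between the two descriptor sets, and regions that are mutual nearest neighbours above a similarity threshold are accepted as matches. The function names, the mutual-nearest-neighbour rule and the threshold value are illustrative assumptions.

```python
import numpy as np

def cosine_similarity_matrix(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Pairwise cosine similarity between rows of a (n x d) and b (m x d)."""
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T  # (n x m) similarity matrix

def mutual_nearest_pairs(sim: np.ndarray, threshold: float = 0.5) -> list:
    """Keep (i, j) pairs that are each other's best match and exceed the threshold.

    The mutual check and the threshold are illustrative choices, not taken
    from the paper.
    """
    best_b_for_a = sim.argmax(axis=1)  # best column for each row
    best_a_for_b = sim.argmax(axis=0)  # best row for each column
    pairs = []
    for i, j in enumerate(best_b_for_a):
        if best_a_for_b[j] == i and sim[i, j] >= threshold:
            pairs.append((i, int(j)))
    return pairs
```

For example, with three orthogonal descriptors in one image and a permuted copy of them in the other, the procedure recovers the permutation as the set of matched region pairs.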