- CVPR 2022 (2022). Vision-language representation learning largely benefits from image-text alignment through contrastive losses (e.g., InfoNCE loss). The success of this alignment strategy is attributed to its capability in maximizing the mutual information (MI) between an image and its matched text. However, simply performing cross-modal alignment (CMA) ignores data potential within each modality, which may result in degraded
- CVPR 2022 (2022). Aligning signals from different modalities is an important step in vision-language representation learning as it affects the performance of later stages such as cross-modality fusion. Since image and text typically reside in different regions of the feature space, directly aligning them at instance level is challenging especially when features are still evolving during training. In this paper, we propose
- CVPR 2022 Workshop on New Trends in Image Restoration and Enhancement and Challenges (2022). Deep learning model inference on embedded devices is challenging due to the limited availability of computation resources. A popular alternative is to perform model inference on the cloud, which requires transmitting images from the embedded device to the cloud. Image compression techniques are commonly employed in such cloud-based architectures to reduce transmission latency over low bandwidth networks
- CVPR 2022 (2022). Fashion image retrieval based on a query pair of reference image and natural language feedback is a challenging task that requires models to assess fashion related information from visual and textual modalities simultaneously. We propose a new vision-language transformer based model, FashionVLP, that brings the prior knowledge contained in large image-text corpora to the domain of fashion image retrieval
- CVPR 2022 (2022). We introduce Amazon Berkeley Objects (ABO), a new large-scale dataset designed to help bridge the gap between real and virtual 3D worlds. ABO contains product catalog images, metadata, and artist-created 3D models with complex geometries and physically-based materials that correspond to real, household objects. We derive challenging benchmarks that exploit the unique properties of ABO and measure the current
Related content
- June 07, 2021: Scientists discuss the challenges in developing a system that can accurately estimate body fat percentage and create personalized 3D avatars of users from smartphone photos.
- May 12, 2021: ICCV workshop hosted by Amazon Prime Air and AWS will announce results of challenge to detect airborne obstacles.
- April 21, 2021: Polito is one of the featured speakers at the first virtual Amazon Web Services Machine Learning Summit on June 2.
- April 05, 2021: Amazon Machine Learning Research Award recipient utilizes a combination of people and machine learning models to illuminate the planet's incredible biodiversity.
- March 08, 2021: The combination of a new loss metric and a module that identifies high-importance image regions improves compression.