Computer vision

Helping devices see and understand our visual world.

Leveling down in computer vision: Pareto inefficiencies in fair deep classifiers

Dominik Zietlow, Michael Lohaus, Guha Balakrishnan, Matthaus Kleindessner, Francesco Locatello, Bernhard Schölkopf, Chris Russell

CVPR 2022

2022

Algorithmic fairness is frequently motivated in terms of a trade-off in which overall performance is decreased so as to improve performance on disadvantaged groups where the algorithm would otherwise be less accurate. Contrary to this, we find that applying existing fairness approaches to computer vision improves fairness by degrading the performance of classifiers across all groups (with increased degradation

One-stage object referring with gaze estimation

Jianhang Chen, Xu Zhang, Yue (Rex) Wu, Shalini Ghosh, Pradeep Natarajan, Shih-Fu Chang, Jan Allebach

CVPR 2022 Workshop on Gaze Estimation and Prediction in the Wild

2022

The classic object referring task aims at localizing the referred object in the image and requires a reference image and a natural language description as inputs. Given that a gaze signal can be easily obtained by a modern human-computer interaction system with a camera, and that humans tend to look at an object when referring to it, we propose a novel gaze-assisted object referring framework. The

ASD-transformer: Efficient active speaker detection using self and multimodal transformers

Gourav Datta, Tyler Etchart, Vivek Yadav, Varsha Hedau, Pradeep Natarajan, Shih-Fu Chang

ICASSP 2022

2022

Multimodal active speaker detection (ASD) methods assign a speaking/not-speaking label to each individual in a video clip. ASD is critical for applications such as natural human-computer interaction, speaker diarization, and video reframing. Recent work has shown the success of transformers in multimodal settings; we therefore propose a novel framework that leverages modern transformer and concatenation mechanisms

TubeR: Tubelet transformer for video action detection

Jiaojiao Zhao, Yanyi Zhang, Xinyu (Arthur) Li, Hao Chen, Bing Shuai, Mingze Xu, Chunhui Liu, Kaustav Kundu, Yuanjun Xiong, Davide Modolo, Ivan Marsic, Cees G.M. Snoek, Joe Tighe

CVPR 2022

2022

We propose TubeR: a simple solution for spatio-temporal video action detection. Unlike existing methods that depend on either an off-line actor detector or hand-designed actor-positional hypotheses such as proposals or anchors, we propose to directly detect an action tubelet in a video by simultaneously performing action localization and recognition from a single representation. TubeR learns a set

Privacy preserving visual question answering

Cristian-Paul Bara, Qing Ping, Abhinav Mathur, Govind Thattai, Rohith MV, Gaurav Sukhatme

AAAI 2022 Workshop on Privacy-Preserving Artificial Intelligence

2022

We introduce a novel privacy-preserving methodology for performing Visual Question Answering on the edge. Our method constructs a symbolic representation of the visual scene, using a low-complexity computer vision model that jointly predicts classes, attributes and predicates. This symbolic representation is non-differentiable, which means it cannot be used to recover the original image, thereby keeping

Former Amazon intern Karsten Roth wins EMVA young professional award

Staff writer

June 23, 2022

EMVA Young Professional Award honors “outstanding and innovative work of a student or a young professional in the field of machine vision or image processing.”

Prime Video's work on 3-D scene reconstruction, image representation

Raffay Hamid

June 22, 2022

CVPR papers examine the recovery of 3-D information from camera movement and learning general representations from weakly annotated data.

Olga Moskvyak’s journey into the world of science

Mariana Lenharo

June 21, 2022

How she moved across the world to discover a passion for (and a career in) machine learning.

Anton van den Hengel’s journey from intellectual property law to computer vision pioneer

Sean O'Neill

June 20, 2022

Amazon’s director of applied science in Adelaide, Australia, believes the economic value of computer vision has “gone through the roof”.

CVPR: Understanding images means understanding the world

Larry Hardesty

June 16, 2022

Senior principal scientist Aleix M. Martinez on why computer vision research has only begun to scratch the surface.

Richard Zhang wins 2022 CHCCS Achievement Award

Staff writer

June 2, 2022

The Amazon Scholar received the award for his seminal and sustained contributions to the fields of computer graphics and visual computing.
