Privacy preserving visual question answering

Cristian-Paul Bara; Qing Ping; Abhinav Mathur; Govind Thattai; Rohith MV; Gaurav Sukhatme

2022

We introduce a novel privacy-preserving methodology for performing Visual Question Answering (VQA) on the edge. Our method constructs a symbolic representation of the visual scene using a low-complexity computer vision model that jointly predicts classes, attributes, and predicates. Because this symbolic representation is non-differentiable, it cannot be used to recover the original image, which keeps the original image private. Our proposed hybrid solution uses a vision model that is more than 25 times smaller than current state-of-the-art (SOTA) vision models (Anderson et al. 2018) and 100 times smaller than end-to-end SOTA VQA models (Jiang et al. 2020). We report a detailed error analysis and discuss the trade-offs of using a distilled vision model and a symbolic representation of the visual scene.
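To make the idea of a symbolic scene representation concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes the vision model's output can be reduced to object classes, per-object attributes, and (subject, predicate, object) triples. All names here (`SceneObject`, `Relation`, `SymbolicScene`, `query_attribute`, `related`) are hypothetical and do not come from the paper. Only these symbols, never pixels, would need to leave the edge device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SceneObject:
    """One detected object: a class label plus a set of attribute strings."""
    obj_id: int
    label: str
    attributes: frozenset = frozenset()


@dataclass(frozen=True)
class Relation:
    """A (subject, predicate, object) triple between two detected objects."""
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class SymbolicScene:
    """Hypothetical symbolic scene: discrete symbols only, no image data."""
    objects: list
    relations: list

    def query_attribute(self, label, candidates):
        """Return which candidate attributes are held by objects of this class."""
        found = set()
        for obj in self.objects:
            if obj.label == label:
                found |= obj.attributes & set(candidates)
        return sorted(found)

    def related(self, subject_label, predicate):
        """Return labels of objects that the subject class relates to."""
        by_id = {o.obj_id: o for o in self.objects}
        return sorted(
            by_id[r.object_id].label
            for r in self.relations
            if r.predicate == predicate
            and by_id[r.subject_id].label == subject_label
        )


# Illustrative scene (made-up data): a small brown dog on a red sofa.
scene = SymbolicScene(
    objects=[
        SceneObject(0, "dog", frozenset({"brown", "small"})),
        SceneObject(1, "sofa", frozenset({"red"})),
    ],
    relations=[Relation(0, "on", 1)],
)

# "What color is the dog?" reduced to a lookup over symbols.
print(scene.query_attribute("dog", ["brown", "black", "white"]))
# "What is the dog on?"
print(scene.related("dog", "on"))
```

In this sketch, question answering is a lookup over discrete symbols, which illustrates why a downstream reasoner would not need access to the image itself. How the paper maps natural-language questions onto such a representation is not specified in the abstract and is not modeled here.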
