Robotics

Physics-informed neural controlled differential equations for long horizon multi-agent motion forecasting

Shounak Sural, Charles Kekeh, Wenliang Liu, Federico Pecora, Mouhacine Benosman

NeurIPS 2025 Workshop on Machine Learning and the Physical Sciences

2025

Long-horizon motion forecasting for multiple autonomous robots is challenging due to non-linear agent interactions, compounding prediction errors, and continuous-time evolution of dynamics. Learnt dynamics of such a system can be useful in various applications such as travel time prediction, prediction-guided planning and surrogate simulation. In this work, we aim to develop an efficient trajectory forecasting

Machine learning

Attribute-based object grounding and robot grasp detection with spatial reasoning

Houjian Yu, Zheming Zhou, Min Sun, Omid Alizadeh, Yuyin Sun, Cheng-Hao Kuo, Arnie Sen, Changhyun Choi

IEEE-RAS Humanoids 2025

2025

Enabling robots to grasp objects specified through natural language is essential for effective human–robot interaction, yet it remains a significant challenge. Existing approaches often struggle with open-form language expressions and typically assume unambiguous target objects without duplicates. Moreover, they frequently rely on costly, dense pixel-wise annotations for both object grounding and grasp

Computer vision

Is the house ready for sleeptime? Generating and evaluating situational queries for embodied question answering

Vishnu Sashank Dorbala, Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Reza Ghanadan, Dinesh Manocha

IROS 2025

2025

We present and tackle the problem of Embodied Question Answering (EQA) with Situational Queries (S-EQA) in a household environment. Unlike prior EQA work tackling simple queries that directly reference target objects and properties ('What is the color of the car?'), situational queries (such as 'Is the house ready for sleeptime?') are challenging as they require the agent to correctly identify multiple

Robotics

POp-GS: Next best view in 3D-Gaussian splatting with P-Optimality

Joey Wilson, Marcelino Almeida, Sachit Mahajan, Martin Labrie, Maani Ghaffari, Omid Alizadeh, Min Sun, Cheng-Hao Kuo, Arnab Sen

CVPR 2025

2025

In this paper, we present a novel algorithm for quantifying uncertainty and information gained within 3D Gaussian Splatting (3D-GS) through P-Optimality. While 3D-GS has proven to be a useful world model with high-quality rasterizations, it does not natively quantify uncertainty or information, posing a challenge for real-world applications such as 3D-GS SLAM. We propose to quantify information gain in

Computer vision

HomeEmergency - Using audio to find and respond to emergencies in the home

James F. Mullen Jr, Dhruva Kumar, Tony Qi, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha, Richard Kim

IEEE Robotics and Automation Letters 2025

2025

In the United States alone, accidental home deaths exceed 128,000 per year. Our work aims to enable home robots that respond to emergency scenarios in the home, preventing injuries and deaths. We introduce a new dataset of household emergencies based in the ThreeDWorld simulator. Each scenario in our dataset begins with an instantaneous or periodic sound, which may or may not be an emergency. The agent must

Robotics

Scalable multi-robot task allocation and coordination under signal temporal logic specifications

Wenliang Liu, Nathalie Majcherczyk, Federico Pecora

ICRA 2025

2025

Motion planning with simple objectives, such as collision-avoidance and goal-reaching, can be solved efficiently using modern planners. However, the complexity of the allowed tasks for these planners is limited. On the other hand, signal temporal logic (STL) can specify complex requirements, but STL-based motion planning and control algorithms often face scalability issues, especially in large multi-robot

Robotics

Mastering robot manipulation with multimodal prompts through pretraining and multi-task fine-tuning

Jiachen Li, Qiaozi (QZ) Gao, Michael Johnston, Xiaofeng Gao, Xuehai He, Hangjie Shi, Suhaila Shakiah, Reza Ghanadan, William Yang Wang

ICML 2024

2024

Prompt-based learning has been demonstrated as a compelling paradigm contributing to the tremendous success of large language models (LLMs). Inspired by their success in language tasks, existing research has leveraged LLMs in embodied instruction following and task planning. In this work, we tackle the problem of training a robot to understand multimodal prompts, interleaving vision signals with text descriptions

Conversational AI

Tabletop transparent scene reconstruction via epipolar-guided optical flow with monocular depth completion prior

Xiaotong Chen, Zheming Zhou, Zhuo Deng, Omid Alizadeh, Min Sun, Cheng-Hao Kuo, Arnie Sen

IEEE-RAS Humanoids 2023

2023

Reconstructing transparent objects using affordable RGB-D cameras is a persistent challenge in robotic perception due to inconsistent appearances across views in the RGB domain and inaccurate depth readings in each single view. We introduce a two-stage pipeline for reconstructing transparent objects tailored for mobile platforms. In the first stage, off-the-shelf monocular object segmentation and depth completion

Computer vision

Pick planning strategies for large-scale package manipulation

Shuai Li, Azarakhsh Keipour, Kevin Jamieson, Nicolas Hudson, Sicong Zhao, Charles Swan, Kostas Bekris

IROS 2023 Workshop on Learning Meets Model-based Methods for Manipulation and Grasping

2023

Automating warehouse operations can reduce logistics overhead costs, ultimately driving down the final price for consumers, increasing the speed of delivery, and enhancing the resiliency to market fluctuations. This extended abstract showcases large-scale package manipulation from unstructured piles in Amazon Robotics’ Robot Induction (Robin) fleet, which is used for picking and singulating up to 6 million

Machine learning