Customer-obsessed science
Research areas
-
June 8, 2026, 7 min read
Four approaches can dramatically improve the performance and trustworthiness of AI agents in operational environments.
-
May 26, 2026, 5 min read
Featured news
-
EUSIPCO 2026 (2026)
We present VCNAC, a variable-channel neural audio codec. Our approach features a single encoder and decoder parametrization that enables native inference for different channel setups, from mono speech to cinematic 5.1 channel surround audio. Channel compatibility objectives ensure that multichannel content maintains perceptual quality when decoded to fewer channels. The shared representation enables training
-
2026
Multimodal Large Language Models (MLLMs) have recently demonstrated promising capabilities in multimodal coding tasks such as chart-to-code generation. However, existing methods primarily rely on supervised fine-tuning (SFT), which requires the model to learn code patterns through chart-code pairs but does not expose the model to a code execution environment. Moreover, while self-correction through execution
-
VLDB 2026 (2026)
Accurate optimizer statistics are fundamental to query and ML-prediction performance in modern database systems, yet maintaining them poses a significant challenge for large-scale data warehouses. Traditional statistics collection relies on full table scans, which become prohibitively expensive as tables grow to billions of rows and beyond. This creates a critical tension: statistics must be kept current
-
CVPR 2026 Workshop on Personalization in Generative AI (2026)
Makeup transfer models enable fun augmented reality (AR) experiences as well as virtual try-on (VTO) for online makeup shopping. While recent state-of-the-art diffusion-based solutions such as Stable-Makeup [45] dramatically improve the accuracy and realism of makeup transfer, they still face limitations in identity and skin color preservation, making production-level VTO for makeup shopping unrealistic
-
2026
Inferring rigid-body physical states and properties from monocular videos is a fundamental step toward physics-based perception and simulation. Existing approaches assume specific underlying physical systems, object types, and camera poses, which are unable to generalize to complex real-world settings. We introduce ∆YNAMICS, a vision-language framework that uses language as a unified representation of rigid-body
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.