Customer-obsessed science


Research areas
- February 27, 2025: Prototype is the first realization of a scalable, hardware-efficient quantum computing architecture based on bosonic quantum error correction.
Featured news
- ICMLT 2025: Applications of reinforcement learning (RL) in real-world scenarios are often limited by poor generalizability across different environments. Contextual RL offers a principled solution to this issue by capturing environmental heterogeneity through observable contextual variables. However, directly applying Contextual RL may not achieve optimal results when contexts exhibit high randomness and variance…
- International Journal of Research in Marketing, 2025: In 2020, Amazon launched the Climate Pledge Friendly (CPF) program to make it easy for customers to discover and shop for products with sustainability certifications. In this paper, we measure the causal impact of products qualifying for CPF on consumer purchase behavior. Using a dataset of about 45,000 products spanning three categories and a Difference-in-Differences identification strategy, we show that…
- QECC-Synth: A layout synthesizer for quantum error correction codes on sparse hardware architectures (ASPLOS 2025): Quantum Error Correction (QEC) codes are essential for achieving fault-tolerant quantum computing (FTQC). However, their implementation faces significant challenges due to the disparity between the required dense qubit connectivity and sparse hardware architectures. Current approaches often either underutilize QEC circuit features or focus on manual designs tailored to specific codes and architectures, limiting…
- 2025: Hybrid models that combine the language modeling capabilities of Attention layers with the efficiency of Recurrent layers (e.g., State Space Models) have gained traction for practically supporting long contexts in Large Language Model serving. Yet the unique properties of these models complicate the use of complementary efficiency optimizations such as prefix caching, which skips redundant computations across…
- 2025: Training Deep Neural Networks (DNNs) with billions of parameters generally involves pipeline-parallel (PP) execution. Unfortunately, PP model training can use GPUs inefficiently, especially at large scale, due to idle GPU time caused by pipeline bubbles, which often account for 15–30% of the training job's GPU allocation and can exceed 60%. To improve the GPU utilization of PP model training, this paper describes…
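As a rough, back-of-envelope illustration of those bubble percentages (not the paper's own analysis or method), the standard first-order model for a synchronous, GPipe-style schedule estimates the bubble fraction as (p − 1) / (m + p − 1) for p pipeline stages and m micro-batches per step. A minimal Python sketch under that assumption:

```python
# Back-of-envelope estimate of pipeline-bubble overhead for a synchronous,
# GPipe-style schedule. Illustrative only; the paper above may analyze
# different schedules or report measured (not modeled) figures.

def bubble_fraction(num_stages: int, num_microbatches: int) -> float:
    """First-order estimate of the fraction of a step lost to pipeline bubbles."""
    return (num_stages - 1) / (num_microbatches + num_stages - 1)

if __name__ == "__main__":
    # Plenty of micro-batches per stage keeps bubbles in the 15-30% range ...
    print(f"p=8,  m=32 -> {bubble_fraction(8, 32):.0%} bubble time")   # ~18%
    # ... while deep pipelines with few micro-batches can exceed 60%.
    print(f"p=16, m=8  -> {bubble_fraction(16, 8):.0%} bubble time")   # ~65%
```

The fraction shrinks as the number of micro-batches per stage grows and rises with pipeline depth, which is why deep pipelines at large scale can leave well over half of their GPUs idle.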
Academia
Whether you're a faculty member or a student, there are a number of ways you can engage with Amazon.