Customer-obsessed science
-
May 15, 2026 · 5 min read
A new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 2026 · 16 min read
-
April 15, 2026 · 8 min read
Featured news
-
CVPR 2026 EarthVision Workshop, 2026
Building outline extraction from remote sensing imagery traditionally relies on segmentation or detection followed by post-processing to derive polygonal geometries. Despite advances in sequential prediction methods [2, 20], end-to-end extraction remains challenging, often missing buildings or requiring additional refinement steps. In this work, we reformulate building outline extraction as next-coordinate
-
ICLR 2026 Workshop on AI with Recursive Self-Improvement, 2026
Foundation-model upgrades frequently break deployed prompt-based systems: target models differ in chat-template conventions, multimodal interfaces, context limits, and structured-output reliability. We study cross-model prompt adaptation: given a prompt program validated on a source model, produce a target-model prompt that preserves a semantic contract and an interface contract under bounded regression
-
2026
We present a systematic method for pruning edges from causal graphs by leveraging tiered knowledge. We characterize conditions under which edges can be removed from a causal graph while preserving the identifiability of (conditional) causal effects. This result enables causal identification on simplified graphs that are substantially smaller than the original graphs. The approach is particularly valuable
-
2026
Gradient orthogonalization is a simple strategy that shows great utility in speeding up gradient descent. The Muon optimizer (Jordan et al., 2024b) combines gradient orthogonalization with first-order momentum and achieves significant improvement in data efficiency over Adam/AdamW (Loshchilov & Hutter, 2019a) for language model training. However, when using model parallelism, gradient orthogonalization
-
2026
Scaling the number of parameters and the size of training data has proven to be an effective strategy for improving large language model (LLM) performance. Yet, as these models grow increasingly powerful and widely deployed, the cost of inference has become a pressing concern. Despite its importance, the tradeoff between model accuracy and inference efficiency remains underexplored. In this work, we examine
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.