Customer-obsessed science


Research areas
- July 29, 2025: New cost-to-serve-software metric that accounts for the full software development lifecycle helps determine which software development innovations provide quantifiable value.
Featured news
- 2024 Conference on Digital Experimentation @ MIT (CODE@MIT), 2024: There are different reasons why experimenters may want to randomize their experiment at a region level. In some cases, treatments cannot be turned on or off at the individual level, therefore requiring randomization at a group level, for which regions can be a good candidate. In other cases, experimenters may worry about network effects or other types of spillovers within a geographic area, and opt to randomize
- Representation learning is a fundamental aspect of modern artificial intelligence, driving substantial improvements across diverse applications. While self-supervised contrastive learning has led to significant advancements in fields like computer vision and natural language processing, its adaptation to tabular data presents unique challenges. Traditional approaches often prioritize optimizing model architecture
- 2024: Various types of learning rate (LR) schedulers are being used for training or fine-tuning of Large Language Models today. In practice, several mid-flight changes are required in the LR schedule either manually, or with careful choices around warmup steps, peak LR, type of decay and restarts. To study this further, we consider the effect of switching the learning rate at a predetermined time during training
- 2024: Large Language Models (LLMs) have shown impressive capabilities but also a concerning tendency to hallucinate. This paper presents REFCHECKER, a framework that introduces claim-triplets to represent claims in LLM responses, aiming to detect fine-grained hallucinations. In REFCHECKER, an extractor generates claim-triplets from a response, which are then evaluated by a checker against a reference. We delineate
- ACM SoCC 2024, 2024: There has been a growing demand for making modern cloud-based data analytics systems cost-effective and easy to use. AI-powered intelligent resource scaling is one such effort, aiming at automating scaling decisions for serverless offerings like Amazon Redshift Serverless. The foundation of intelligent resource scaling lies in the ability to forecast query workloads and their resource consumption accurately
Academia
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.