Customer-obsessed science
Research areas
-
May 15, 20265 min readA new scaling law that relates particular architectural choices to loss helps identify models that improve throughput by up to 47% with no loss of accuracy.
-
May 14, 202616 min read
-
-
April 15, 20268 min read
Featured news
-
EMNLP 20232023In the burgeoning field of natural-language processing, Neural Topic Models (NTMs) and Large Language Models (LLMs) have emerged as areas of significant research interest. Despite this, NTMs have predominantly leveraged contextual embeddings from LLMs, neglecting the potential benefits of harnessing the overall structure. Our study addresses this gap by introducing a novel framework named Diffusion-Enhanced
-
EMNLP 20232023Continual Federated Learning (CFL) combines Federated Learning (FL), the decentralized learning of a central model on a number of client devices that may not communicate their data, and Continual Learning (CL), the learning of a model from a continual stream of data without keeping the entire history. In CL, the main challenge is forgetting what was learned from past data. While replay-based algorithms
-
ACM 2023 Symposium on Operating Systems Principles (SOSP)2023Large deep learning models have recently garnered substantial attention from both academia and industry. Nonetheless, frequent failures are observed during large model training due to large-scale resources involved and extended training time. Existing solutions have significant failure recovery costs due to the severe restriction imposed by the bandwidth of remote storage in which they store checkpoints
-
EMNLP 20232023Large language models (LLMs) encode vast amounts of world knowledge. However, since these models are trained on large swaths of internet data, they are at risk of inordinately capturing information about dominant groups. This imbalance can propagate into generated language. In this work, we study and operationalise a form of geographical erasure, wherein language models underpredict certain countries. We
-
EMNLP 20232023Modern ML systems ingest data aggregated from diverse sources, such as synthetic, human-annotated, and live customer traffic. Understanding which examples are important to the performance of a learning algorithm is crucial for efficient model training. Recently, a growing body of literature has given rise to various “influence scores,” which use training artifacts such as model confidence or checkpointed
Collaborations
View allWhether you're a faculty member or student, there are number of ways you can engage with Amazon.
View all