Customer-obsessed science
-
February 2, 2026 (10 min read)
Every NFL game generates millions of tracking data points from 22 RFID-equipped players. Seventy-five machine learning models running on AWS process that data in under a second, transforming football into a sport where every movement is measured, modeled, and instantly analyzed.
-
January 13, 2026 (7 min read)
-
January 8, 2026 (4 min read)
-
December 29, 2025 (6 min read)
Featured news
-
2024
State-of-the-art speech models may exhibit suboptimal performance in specific population subgroups. Detecting these challenging subgroups is crucial to enhance model robustness and fairness. Traditional methods for subgroup identification typically rely on demographic information such as age, gender, and origin. However, collecting such sensitive data at deployment time can be impractical or unfeasible
-
ECIR 2024, 2024
We investigate the integration of Large Language Models (LLMs) into query encoders to improve dense retrieval without increasing latency and cost, by circumventing the dependency on LLMs at inference time. SoftQE incorporates knowledge from LLMs by mapping embeddings of input queries to those of the LLM-expanded queries. While improvements over various strong baselines on in-domain MS-MARCO metrics are
-
AAAI 2024 Workshop on Learnable Optimization (LEANOPT-24), 2024
Given a network, allocating resources at the cluster level, rather than at each node, enhances efficiency in resource allocation and usage. In this paper, we study the problem of finding fully connected disjoint clusters to minimize the intra-cluster distances and maximize the number of nodes assigned to the clusters, while also ensuring that no two nodes within a cluster exceed a threshold distance. While
-
2024
Foundation models (FMs) learn from large volumes of unlabeled data to demonstrate superior performance across a wide range of tasks. However, FMs developed for biomedical domains have largely remained unimodal, i.e., independently trained and used for tasks on protein sequences alone, small-molecule structures alone, or clinical data alone. To overcome this limitation, we present BioBRIDGE, a parameter-efficient
-
EACL 2024 Workshop on Linguistic Annotation, 2024
Recent developments in active learning algorithms for NLP tasks show promising results in terms of reducing labelling complexity. In this paper we extend this effort to imbalanced datasets; we bridge between the active learning approach of obtaining diverse and informative examples, and the heuristic of class balancing used in imbalanced datasets. We develop a novel tune-free weighting technique that can
Collaborations
Whether you're a faculty member or student, there are a number of ways you can engage with Amazon.