- 2024: Adapting large language models (LLMs) to unseen tasks with in-context training samples, without fine-tuning, remains an important research problem. To learn a robust LLM that adapts well to unseen tasks, multiple meta-training approaches have been proposed, such as MetaICL and MetaICT, which involve meta-training pre-trained LLMs on a wide variety of diverse tasks. These meta-training approaches essentially …
- 2024: Following the introduction of Adam, several novel adaptive optimizers for deep learning have been proposed. These optimizers typically excel in some tasks but may not outperform Adam uniformly across all tasks. In this work, we introduce Meta-Adaptive Optimizers (MADA), a unified optimizer framework that can generalize several known optimizers and dynamically learn the most suitable one during training.
- 2024: A key challenge in contrastive learning is generating negative samples from a large sample set to contrast with positive samples, in order to learn better encodings of the data. These negative samples often follow a softmax distribution that is dynamically updated during training. However, sampling from this distribution is non-trivial due to the high computational cost of computing the partition …
- 2024: Selecting appropriate thresholds for anomaly detection in online, unsupervised settings is a challenging task, especially in the presence of data distribution shifts. Addressing these challenges is critical in many practical large-scale systems, such as infrastructure monitoring and network intrusion detection. This paper proposes an algorithm that connects online thresholding with constructing confidence …
- 2024: Comparing two samples of data, we observe a change in the distribution of an outcome variable. In the presence of multiple explanatory variables, how much of the change can be explained by each possible cause? We develop a new estimation strategy that, given a causal model, combines regression and re-weighting methods to quantify the contribution of each causal mechanism. Our proposed methodology is multiply …
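The MADA abstract above describes learning the most suitable optimizer during training. As a loose illustration of that idea (this is a toy sketch, not MADA's actual algorithm — the blending rule, step sizes, and function names here are all assumptions for illustration), one can blend an SGD step with an Adam step via a mixing weight and adapt that weight toward whichever pure update currently reduces the loss more:

```python
import numpy as np

def interpolated_optimizer(x0, loss_fn, grad_fn, steps=300, lr=0.05):
    """Toy sketch of a meta-adaptive optimizer: blend an SGD step with
    an Adam-style step via a weight w in [0, 1], and nudge w toward
    whichever pure update lowers the loss more at each iteration.
    (Illustrative assumption, not the MADA paper's method.)"""
    x = np.asarray(x0, dtype=float)
    m = np.zeros_like(x)   # first-moment estimate (Adam)
    v = np.zeros_like(x)   # second-moment estimate (Adam)
    w, b1, b2, eps = 0.5, 0.9, 0.999, 1e-8
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = b1 * m + (1 - b1) * g
        v = b2 * v + (1 - b2) * g * g
        sgd = g
        adam = (m / (1 - b1 ** t)) / (np.sqrt(v / (1 - b2 ** t)) + eps)
        # Probe both pure updates; shift w toward the better one.
        if loss_fn(x - lr * adam) < loss_fn(x - lr * sgd):
            w = min(1.0, w + 0.02)
        else:
            w = max(0.0, w - 0.02)
        x = x - lr * (w * adam + (1 - w) * sgd)
    return x, w
```

On a simple quadratic such as `loss_fn = lambda z: float(np.sum(z * z))` with `grad_fn = lambda z: 2 * z`, the mixing weight drifts toward plain SGD as the iterate nears the optimum, since the normalized Adam step overshoots there.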
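The contrastive-learning abstract above refers to sampling negatives from a softmax distribution over candidates. A minimal numpy sketch of that generic setup (an assumption for illustration, not the paper's method — `sample_negatives`, the dot-product similarity, and the temperature are all placeholders) makes the cost it mentions concrete: the normalizer computed below is exactly the partition function, which requires scoring every candidate:

```python
import numpy as np

def sample_negatives(anchor, candidates, k, temperature=1.0, rng=None):
    """Draw k negatives from a softmax over anchor-candidate similarities.

    Generic sketch of softmax-based negative sampling for contrastive
    learning. Computing the normalizer below touches every candidate,
    which is the partition-function cost that becomes prohibitive when
    the candidate set is large.
    """
    rng = rng or np.random.default_rng()
    # Similarity of the anchor to every candidate (dot product here).
    logits = candidates @ anchor / temperature
    logits -= logits.max()          # subtract max for numerical stability
    probs = np.exp(logits)
    probs /= probs.sum()            # the partition function (O(N) work)
    idx = rng.choice(len(candidates), size=k, replace=False, p=probs)
    return candidates[idx], idx
```

In practice the encoder updates at every training step, so this distribution shifts continuously and the O(N) normalization would have to be redone each time — the motivation for the approximate samplers the abstract alludes to.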