Search - Amazon Science

Learning fair and transferable representations with theoretical guarantees

Luca Oneto, Michele Donini, Andreas Maurer, Massimiliano Pontil

DSAA 2020

2020

Developing learning methods which do not discriminate against subgroups in the population is the central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to satisfy prescribed fairness constraints. This allows the same representation to be reused in other contexts (tasks) without discriminating against subgroups. In this work we measure fairness according to demographic

Machine learning

Learning fair and transferable representations

Luca Oneto, Michele Donini, Massimiliano Pontil, Andreas Maurer

NeurIPS 2019 Workshop on Human-Centric Machine Learning

2019

Developing learning methods which do not discriminate against subgroups in the population is a central goal of algorithmic fairness. One way to reach this goal is by modifying the data representation in order to meet certain fairness constraints. In this work we measure fairness according to demographic parity. This requires the probability of the possible model decisions to be independent of the sensitive information

Machine learning

Learn to transfer learn by studying task manifolds

Sebastian Flennerhag, Pablo Garcia Moreno, Neil Lawrence, Andreas Damianou

ICLR 2019

2018

In complex transfer learning scenarios new tasks might not be tightly linked to previous tasks. Approaches that transfer information contained only in the final parameters of a source model will therefore struggle. Instead, transfer learning at a higher level of abstraction is needed. We propose Leap, a framework that achieves this by transferring knowledge across learning processes. We associate each task

Machine learning

Scalable Hyperparameter Transfer Learning

Valerio Perrone, Rodolphe Jenatton, Matthias Seeger, Cédric Archambeau

NeurIPS 2018

2018

Bayesian optimization (BO) is a model-based approach for gradient-free black-box function optimization, such as hyperparameter optimization. Typically, BO relies on conventional Gaussian process (GP) regression, whose algorithmic complexity is cubic in the number of evaluations. As a result, GP-based BO cannot leverage large numbers of past function evaluations, for example, to warm-start related BO runs

Machine learning

TANDA: Transfer and adapt pre-trained transformer models for answer sentence selection

Siddhant Garg, Thuy Vu, Alessandro Moschitti

2020

We propose TANDA, an effective technique for fine-tuning pre-trained Transformer models for natural language tasks. Specifically, we first transfer a pre-trained model into a model for a general task by fine-tuning it with a large and high quality dataset. We then perform a second fine-tuning step to adapt the transferred model to the target domain. We demonstrate the benefits of our approach for answer

Conversational AI

Hyperparameter transfer learning with adaptive complexity

Samuel Horváth, Aaron Klein, Peter Richtárik, Cédric Archambeau

AISTATS 2021

2021

Bayesian optimization (BO) is a sample efficient approach to automatically tune the hyperparameters of machine learning models. In practice, one frequently has to solve similar hyperparameter tuning problems sequentially. For example, one might have to tune a type of neural network learned across a series of different classification problems. Recent work on multi-task BO exploits knowledge gained from previous

Machine learning

Transferring knowledge across learning processes

Sebastian Flennerhag, Pablo Garcia Moreno, Neil Lawrence, Andreas Damianou

ICLR 2019

2019

In complex transfer learning scenarios new tasks might not be tightly linked to previous tasks. Approaches that transfer information contained only in the final parameters of a source model will therefore struggle. Instead, transfer learning at a higher level of abstraction is needed. We propose Leap, a framework that achieves this by transferring knowledge across learning processes. We associate each task

Machine learning

REPAINT: Knowledge transfer in deep reinforcement learning

Yunzhe Tao, Sahika Genc, Jonathan Chung, Tao Sun, Sunil Mallya

ICML 2021

2021

Accelerating learning processes for complex tasks by leveraging previously learned tasks has been one of the most challenging problems in reinforcement learning, especially when the similarity between source and target tasks is low. This work proposes the REPresentation And INstance Transfer (REPAINT) algorithm for knowledge transfer in deep reinforcement learning. REPAINT not only transfers the representation

Machine learning

Conversation style transfer using few-shot learning

Shamik Roy, Raphael Shu, Nikolaos Pappas, Elman Mansimov, Yi Zhang, Saab Mansour, Dan Roth

IJCNLP-AACL 2023

2023

Conventional text style transfer approaches focus on sentence-level style transfer without considering contextual information, and the style is described with attributes (e.g., formality). When applying style transfer in conversations such as task-oriented dialogues, existing approaches suffer from these limitations as context can play an important role and the style attributes are often difficult to define

Conversational AI

Learning search spaces for Bayesian optimization: Another view of hyperparameter transfer learning

Valerio Perrone, Huibin Shen, Matthias Seeger, Cédric Archambeau, Rodolphe Jenatton

NeurIPS 2019

2019

Bayesian optimization (BO) is a successful methodology to optimize black-box functions that are expensive to evaluate. While traditional methods optimize each black-box function in isolation, there has been recent interest in speeding up BO by transferring knowledge across multiple related black-box functions. In this work, we introduce a method to automatically design the BO search space by relying on

Machine learning

A simple transfer-learning extension of Hyperband

Lazar Valkov, Rodolphe Jenatton, Fela Winkelmolen, Cédric Archambeau

NeurIPS 2018

2018

Hyperband has become a popular method to tune the hyperparameters (HPs) of expensive machine learning models, whose performance depends on the amount of resources allocated for training. While Hyperband is conceptually simple, combining random search with a successive halving technique to reallocate resources to the most promising HPs, it often outperforms standard Bayesian optimization when solutions with

Machine learning

Language-informed transfer learning for embodied household activities

Yuqian Jiang, Qiaozi (QZ) Gao, Govind Thattai, Gaurav Sukhatme

AAAI 2023 Workshop on Artificial Intelligence for User-Centric Assistance for at Home Tasks

2023

For service robots to become general-purpose in everyday household environments, they need not only a large library of primitive skills, but also the ability to quickly learn novel tasks specified by users. Fine-tuning neural networks on a variety of downstream tasks has been successful in many vision and language domains, but research is still limited on transfer learning between diverse long-horizon tasks

Machine learning

Transfer learning, reinforcement learning for adaptive control optimization under distribution shift

Pankaj Rajak, Wojciech Kowalinski, Fei Wang

NeurIPS 2023 Workshop on Distribution Shifts (DistShifts)

2023

Many control systems rely on a pipeline of machine learning models and handcoded rules to make decisions. However, due to changes in the operating environment, these rules require constant tuning to maintain optimal system performance. Reinforcement learning (RL) can automate the online optimization of rules based on incoming data. However, RL requires extensive training data and exploration, which limits

Machine learning

Transfer Learning for Neural Semantic Parsing

Xing Fan, Emilio Monti, Lambert Mathias, Markus Dreyer

ACL 2017

2017

The goal of semantic parsing is to map natural language to a machine interpretable meaning representation language (MRL). One of the constraints that limits full exploration of deep learning technologies for semantic parsing is the lack of sufficient annotated training data. In this paper, we propose using sequence-to-sequence in a multi-task setup for semantic parsing with a focus on transfer learning

Conversational AI

Dynamic transfer learning for named entity recognition

Parminder Bhatia, Kristjan Arumae, Busra Celikkaya

AAAI 2019 Workshop on Health Intelligence

2019

State-of-the-art named entity recognition (NER) systems have been improving continuously using neural architectures over the past several years. However, many tasks including NER require large sets of annotated data to achieve such performance. In particular, we focus on NER from clinical notes, which is one of the most fundamental and critical problems for medical text analysis. Our work centers on effectively

Information and knowledge management

A quantile-based approach for hyperparameter transfer learning

David Salinas, Huibin Shen, Valerio Perrone

NeurIPS 2019 Workshop on Metalearning, ICML 2020

2020

Bayesian optimization (BO) is a popular methodology to tune the hyperparameters of expensive black-box functions. Despite its success, standard BO focuses on a single task at a time and is not designed to leverage information from related functions, such as the performance metric of the same algorithm tuned across multiple datasets. In this work, we introduce a novel approach to achieve transfer learning

Machine learning

Transfer learning for e-commerce query product type prediction

Anna Tigunova, Thomas Ricatte, Ghadir Eraisha

CIKM 2024 Workshop on Data-Centric AI

2024

Getting a good understanding of the customer intent is essential in e-commerce search engines. In particular, associating the correct product type to a search query plays a vital role in surfacing correct products to the customers. Query product type classification (Q2PT) is a particularly challenging task because search queries are short and ambiguous, the number of existing product categories is extremely

Conversational AI

Assaying out-of-distribution generalization in transfer learning

Florian Wenzel, Andrea Dittadi, Peter Gehler, Carl-Johann Simon-Gabriel, Max Horn, Dominik Zietlow, David Kernert, Chris Russell, Thomas Brox, Bernt Schiele, Bernhard Schölkopf, Francesco Locatello

NeurIPS 2022

2022

Since out-of-distribution generalization is a generally ill-posed problem, various proxy targets (e.g., calibration, adversarial robustness, algorithmic corruptions, invariance across shifts) were studied across different research programs resulting in different recommendations. While sharing the same aspirational goal, these approaches have never been tested under the same experimental conditions on real

Computer vision

Limitations of knowledge distillation for zero-shot transfer learning

Saleh Soltan, Haidar Khan, Wael Hamza

EMNLP 2021 Workshop on Simple and Efficient Natural Language Processing (SustaiNLP)

2021

Pretrained transformer-based encoders such as BERT have been demonstrated to achieve state-of-the-art performance on numerous NLP tasks. Despite their success, BERT style encoders are large in size and have high latency during inference (especially on CPU machines) which make them unappealing for many online applications. Recently introduced compression and distillation methods have provided effective ways

Conversational AI

Mutual information guided distillation for transfer learning

Sung-soo Ahn, Shell Hu, Zhenwen Dai, Andreas Damianou, Neil Lawrence

NeurIPS 2018

2018

We consider the teacher-student framework for knowledge transfer, where the goal is to improve learning of a “student” neural network, given a “teacher” neural network pretrained on the same or a similar task. The majority of existing approaches for distilling knowledge from a teacher network to a student network rely on matching either activations or handcrafted features from the teacher network. Instead

Machine learning
