Learning computational tasks from single examples

New “meta-learning” approach improves on the state of the art in “one-shot” learning.

In the past decade, deep-learning systems have proven remarkably successful at many artificial-intelligence tasks, but their applications tend to be narrow. A computer vision system trained to recognize cats and dogs, for instance, would need significant retraining to start recognizing sharks and sea turtles.

Meta-learning is a paradigm intended to turn machine learning systems into generalists. A meta-learning model is trained on a range of related tasks, but it learns not only how to perform those tasks but also how to learn to perform them. The idea is that it could then be adapted to new tasks with only a handful of labeled training examples, drastically reducing the need for labor-intensive data annotation.

At the (virtual) International Conference on Learning Representations, we will present an approach that improves performance on meta-learning tasks without increasing the data annotation requirements. The key idea is to adapt the meta-learning procedure so that it can leverage small sets of unlabeled data, in addition to the traditional labeled examples.

Meta-learning
In meta-learning, a machine learning model learns how to learn. During meta-training, the model is trained on a group of related tasks, using data from “support sets”, and tested using data from “query sets”. The query sets are labeled, so the model can assess how effectively it's learning. During meta-testing, the model is again trained on a group of support sets, but it's evaluated on its ability to classify unlabeled query data.
Credit: Stacy Reilly

The intuition is that even without labels, these extra data still contain a lot of useful information. Suppose, for instance, that a meta-learning system trained on images of terrestrial animals (such as cats and dogs) is being adapted to recognize aquatic animals. Unlabeled images of aquatic animals (i.e., images that don’t indicate whether an animal is a shark or a sea turtle) still tell the model something about the learning task, such as the lighting conditions and background colors typical of underwater photos.

In experiments, we compared models trained through our approach to 16 different baselines on an object recognition meta-learning task. We found that our approach improved performance on one-shot learning, or learning a new object classification task from only a single labeled example, by 11% to 16%, depending on the architectures of the underlying neural networks.

Meta-learning

In conventional machine learning, a model is fed a body of labeled data and learns to correlate data features with the labels. Then it’s fed a separate body of test data and evaluated on how well it predicts the labels for that data. For evaluation purposes, the system designers have access to the test-data labels, but the model itself doesn’t.

Meta-learning adds another layer of complexity. During meta-training — the analogue of conventional training — the model learns to perform a range of related tasks. Each task has its own sets of training data and test data, and the model sees both. That is, part of its meta-training is learning how particular ways of responding to training data tend to affect its performance on test data.

During meta-testing, it is again trained on a range of tasks. These are related to but not identical to the tasks it saw during meta-training — recognizing aquatic animals, for instance, as opposed to terrestrial animals. Again, for each task, the model sees both training data and test data. But whereas, during meta-training, the test data were labeled, during meta-testing, the labels are unknown and must be predicted.

The terminology can get a bit confusing, so meta-learning researchers typically refer to the meta-learning “training” sets as support sets and the meta-learning “test” sets as query sets. During meta-training, the learning algorithm has access to the labels for both the support sets and the query sets, and it uses them to produce a global model. During meta-testing, it has access only to the labels for the support sets, which it uses to adapt the global model to each of the new tasks.
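The support/query split can be made concrete with a small sketch. The function `make_episode`, its parameters, and the toy string "images" below are hypothetical names chosen for illustration, not part of the authors' code; the only point is that query labels are visible during meta-training and hidden during meta-testing.

```python
import random

def make_episode(pool, n_way, k_shot, n_query, query_labeled, rng):
    """Sample one task: a support set and a query set.

    pool maps class name -> list of examples (hypothetical toy data).
    Query labels are replaced by None when query_labeled is False.
    """
    classes = rng.sample(sorted(pool), n_way)
    support, query = [], []
    for label in classes:
        picked = rng.sample(pool[label], k_shot + n_query)
        support += [(x, label) for x in picked[:k_shot]]
        query += [(x, label if query_labeled else None) for x in picked[k_shot:]]
    return support, query

# Toy stand-ins for image collections (assumed data, not from the article).
terrestrial = {name: [f"{name}_{i}" for i in range(5)] for name in ("cat", "dog")}
aquatic = {name: [f"{name}_{i}" for i in range(5)] for name in ("shark", "sea_turtle")}

rng = random.Random(0)
# Meta-training: both support and query labels are available.
train_support, train_query = make_episode(terrestrial, 2, 1, 3, True, rng)
# Meta-testing: only the support labels are available.
test_support, test_query = make_episode(aquatic, 2, 1, 3, False, rng)
```

With `k_shot=1`, each task supplies a single labeled example per class, which is the one-shot setting discussed in this post.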

Our approach has two key innovations. First, during meta-training, we do not learn a single global model. Instead, we train an auxiliary neural network to produce a local model for each task, based on the corresponding support set. Second and more important, during meta-training we also train a second auxiliary network to leverage the unlabeled data of the query sets. Then, during meta-testing, we can use the query sets to fine-tune the local models, improving performance.

Leveraging unlabeled data

A machine learning system is governed by a set of parameters, and in meta-learning, meta-training optimizes them for a particular family of tasks — such as recognizing animals. During meta-testing or operational deployment, the model uses a handful of training examples to optimize those parameters for a new task.

A particular set of parameter values defines a point in a multidimensional space, and adaptation to a new task can be thought of as searching the space for the point representing the optimal new settings.

Meta-learning parameter space
In traditional meta-learning (left), the result of training is a model (φ) that can be adapted to a new set of related tasks (1 – 4). Adaptation involves searching for the optimal settings (θ1 – θ4) of the model parameters, based on a small set of labeled data (dl1 – dl4). Our system (right), by contrast, uses the labeled data and the available unlabeled data (x1 – x4) to better approximate those settings.

A traditional meta-learning system might begin its search at the point defined by the global model (φ in the figure above); this is the initialization step. Then, using the labeled data of the support set, it would work its way toward the settings that correspond to the new task; this is the adaptation step.
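As a rough illustration of the initialization and adaptation steps, the sketch below starts from a global parameter and takes a few gradient steps on a one-example support set. The one-parameter model `y = w * x`, the squared-error loss, and the learning rate are assumptions made to keep the example small; they are not details from the paper.

```python
def adapt(phi, support, lr=0.1, steps=20):
    """Initialization: start at the global parameter phi.
    Adaptation: follow the support-set gradient for a few steps."""
    w = phi
    for _ in range(steps):
        # Gradient of the mean squared error for the toy model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in support) / len(support)
        w -= lr * grad
    return w

phi = 1.0                  # global model produced by meta-training (assumed value)
support = [(1.0, 3.0)]     # a single labeled example; the new task has w = 3
theta = adapt(phi, support)
```

After adaptation, `theta` lies much closer to the task's setting (3.0) than the starting point `phi` does.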

With our approach, by contrast, the initialization network selects a starting search location on the basis of the data in the support set (θ01(dl1) – θ04(dl4) in the figure above). Then it works its way toward the optimal settings using the unlabeled data of the query set (x1 – x4, above). More precisely, the second auxiliary neural network estimates the gradient implied by the query set data.

In the same way that the parameter settings of a machine learning model can be interpreted as a point in a representational space, so can the parameter settings and the resulting error rate on a particular data set. The multidimensional graph that results is like a topographical map, with depressions that represent low error rates and peaks that represent high error rates. In this context, machine learning is a matter of identifying the slope of a depression (a gradient) and moving down it, toward a region of low error.
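In symbols, the downhill step described above is the standard gradient descent update. The notation is an assumption for illustration rather than the paper's: θ_t is the parameter vector after t steps, L is the error on a data set, and η > 0 is a step size.

```latex
\theta_{t+1} = \theta_t - \eta \, \nabla_{\theta} L(\theta_t)
```

The gradient points uphill, so subtracting it moves the parameters toward lower error, provided η is small enough.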

This is how many machine learning systems learn, but typically, they have the advantage of knowing, from training data labels, what the true error rate is for a given set of system parameters. In our case, because we’re relying on the unlabeled query set data, we can only guess at the true gradients.

That’s where the second auxiliary neural network comes in: it infers gradients from query set data. The system as a whole then uses the inferred gradients to fine-tune the initial parameter settings supplied by the first neural network.
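The division of labor between the two auxiliary networks can be sketched with hand-written stand-ins. This is not the authors' method: in the real system both components are learned neural networks, whereas here `init_from_support` simply averages the labeled examples per class and `estimated_gradient` is a clustering heuristic. The 1-D features, class names, and step size are likewise assumptions.

```python
def init_from_support(support):
    """Stand-in for the initialization network: one prototype per class,
    computed from the labeled support examples (1-D features)."""
    by_class = {}
    for x, label in support:
        by_class.setdefault(label, []).append(x)
    return {label: sum(xs) / len(xs) for label, xs in by_class.items()}

def estimated_gradient(protos, unlabeled):
    """Stand-in for the gradient network: a clustering heuristic, not a
    learned model. Each prototype's gradient points away from the mean of
    the unlabeled examples nearest to it, so a descent step moves toward them."""
    grads = {}
    for label, p in protos.items():
        near = [x for x in unlabeled
                if min(protos, key=lambda c: abs(x - protos[c])) == label]
        grads[label] = (p - sum(near) / len(near)) if near else 0.0
    return grads

def fine_tune(protos, unlabeled, lr=0.5, steps=10):
    """Refine the initial parameters using only unlabeled query data."""
    for _ in range(steps):
        grads = estimated_gradient(protos, unlabeled)
        protos = {label: p - lr * grads[label] for label, p in protos.items()}
    return protos

# One labeled example per class; the toy class centers are 0.0 and 10.0.
support = [(2.0, "shark"), (7.0, "sea turtle")]
unlabeled_query = [-0.5, 0.0, 0.5, 9.5, 10.0, 10.5]

start = init_from_support(support)
tuned = fine_tune(start, unlabeled_query)
```

In this toy setup the single labeled examples sit off-center, and the unlabeled query points pull each prototype toward its cluster, which is the intuition behind using unlabeled data at meta-test time.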

The approach can be explained and justified through connections to two topics in theoretical machine learning, namely empirical Bayes and the information bottleneck. These theoretical developments are beyond the scope of this blog post, but the interested reader can consult the full manuscript.

The associated software code has also been open-sourced as part of the Xfer repository.

Although our system beat all 16 baselines on the task of one-shot learning, there were several baseline systems that outperformed it on five-shot learning, or learning with five examples per new task. The approaches used by those baselines are complementary to our approach, and we believe that combining approaches could yield even lower error rates. Going forward, that’s one of several extensions of this work that we will be pursuing.
