Amazon presents new method for "debugging" machine learning models

Synthetic data produced by perturbing test inputs identify error classes and provide additional data for retraining.

Machine learning models are usually tested on held-out data from their training sets. But when data distributions in the real world diverge from those in the test set, such tests can overstate model performance. And models can still fail in surprising ways — for instance, systematically mishandling inputs that don’t seem any more challenging than others that the model handles smoothly.

At the NeurIPS 2021 Workshop on Explainable AI Approaches for Debugging and Diagnosis (XAI4Debugging), my colleagues and I will present Defuse, a method that automatically discovers model errors on particular types of input and provides a way to correct them.

Defuse procedure.gif
Examples of three types of misclassifications identified by Defuse. In the distillation step, similar images are grouped together to represent classes of error. In the correction step, the model has been re-trained to handle those classes correctly.

Given a trained image classification model (a classifier), Defuse generates realistic-looking new images that are variations on test-set inputs that the classifier mishandles. Then Defuse sorts misclassified images into high-level “model bugs” — groups of similar images that consistently cause errors. Users can then identify scenarios under which their models would fail and train more-robust models.

During retraining, we mix the artificially generated error-causing data with the model’s original training data, in a ratio controlled by a parameter, λ. In experiments, we show that for each of three sample datasets, there is a value of λ that enables consistently high performance on the input classes identified by Defuse while largely preserving performance on the original data.

Lambda sweep.png
Performance on the original test set (blue) and on the error classes identified by Defuse (orange) for different values of λ.

To spur further research on this problem, we have publicly released the code for Defuse on GitHub.

Data augmentation

To generate our synthetic images, we train a variational autoencoder (VAE) on the classifier’s training data. A VAE is a model trained to output the same data that it takes as input, but in-between, it produces a vector representation that captures salient properties of the input. That vector representation defines a latent space in which proximity indicates similarity of data.

Once the VAE is trained, we use its latent space to generate new image data. In the image below, we show how the VAE can interpolate between two test images — which define points in the latent space — to generate new related images.

Image interpolations.gif
Defuse uses the latent space of a trained VAE to generate new data.

The VAE’s latent representations are the basis of Defuse’s three main steps: identification, distillation, and correction.

Identification: First, Defuse encodes all the images from the training data using the trained VAE. Perturbing a latent code with small amounts of noise should cause the decoded instances to have small but semantically meaningful differences from the original instance. The perturbations are assigned the same label as the original instance, and Defuse saves instances that the classifier misclassifies. In the figure above, the red number in the upper right-hand corner of each input image is the classifier’s prediction.

Distillation: Next, a clustering model groups together the latent codes of the images from the previous step to diagnose misclassification regions. In the example above, Defuse groups together generated images of the digit 8 that are incorrectly classified as 3. Defuse uses a Gaussian mixture model with a Dirichlet process prior because the number of clusters is unknown ahead of time.

Below is a low-dimensional (t-distributed stochastic neighbor embedding, or t-SNE) visualization of latent codes obtained from one of the three datasets we used in our experiments. The blue circles are the latent codes of images in the training set; the black dots are the latent codes of generated images that were identified as classifier mistakes. The red dots are the three decoded latent codes, where the red number in the upper left-hand corner is the classifier label.

Latent-space projection.png
A low-dimensional (t-SNE) visualization of the latent space encodings of images in a dataset (blue dots) and generated new samples (black dots).

Correction: The clusters created in the distillation step are then annotated by labelers. Defuse runs the correction step using both the annotator-labeled data and the original training data, mixed according the parameter λ.

Experiments

We conducted experiments on three public benchmark datasets, assessing accuracy on both the misclassification region test data and the original test set after performing correction. We compared Defuse to finetuning only on the images from the identification step that are labeled as classifier mistakes by annotators. We expect this baseline to be reasonable because related works that focus on robustness to classic adversarial attacks demonstrate the effectiveness of tuning directly on the adversarial examples. Overall, these results indicate that the correction step in Defuse is highly effective at correcting the errors discovered during identification and distillation.

Misclassification samples.png
Samples from three misclassification regions for each of our three test datasets.

Novelty of the misclassified instances generated with Defuse: We expect Defuse to find novel model misclassifications beyond those revealed by the available data. To test this hypothesis, we compared the errors proposed by Defuse (the misclassification region data) and the misclassified training data.

We chose 10 images from the misclassification regions and found their nearest neighbors in the misclassified training data. In the figure below, we can see that the data in the misclassification regions reveal different types of errors than are found in the training set.

training set.png
We compare samples from the misclassification regions with their nearest neighbors in the training-set data. We see that the misclassification regions reveal novel sources of model error not found in the misclassified training data.

To learn more about Defuse check out the paper and the repository.

Acknowledgements: Krishnaram Kenthapadi

Related content

US, NY, New York
We are seeking a Robotics/AI Motor Control Scientist to develop cutting-edge machine learning algorithms for motor control systems in robots. In this role, you will focus on creating and optimizing intelligent motor control strategies to enable robots to perform complex, whole-body tasks. Your contributions will be essential in advancing robotics by enabling fluid, reliable, and safe interactions between robots and their environments. Key job responsibilities - Develop controllers that leverage reinforcement learning, imitation learning, or other advanced AI techniques to achieve natural, robust, and adaptive motor behaviors - Collaborate with multi-disciplinary teams to integrate motor control systems with robotic hardware, ensuring alignment with real-world constraints such as actuator dynamics and energy efficiency - Use simulation and real-world testing to refine and validate control algorithms - Stay updated on advancements in robotics, AI, and control systems to apply advanced techniques to robotic motion challenges - Lead technical projects from conception through production deployment - Mentor junior scientists and engineers - Bridge research initiatives with practical engineering implementation About the team Fauna Robotics, an Amazon company, is building capable, safe, and genuinely delightful robots for everyday life. Our goal is simple: make robots people actually want to live and interact with in everyday human spaces. We believe that future won’t arrive until building for robotics becomes far more accessible. Today, too much effort is spent reinventing the fundamentals. We’re changing that by developing tightly integrated hardware and software systems that make it faster, safer, and more intuitive to create real-world robotic products. Our work spans the full stack: mechanical design, control systems, dynamic modeling, and intelligent software. The focus is not just functionality, but experience. We’re building robots that feel responsive, expressive, and genuinely useful. At Fauna, you’ll work at the frontier of this space, helping define how robots move, manipulate, and interact with people in natural environments. It’s an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you. an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists in the Forecasting, Macroeconomics & Finance field document, interpret and forecast Amazon business dynamics. This track is well suited for economists adept at combining times-series statistical methods with strong economic analysis and intuition. This track could be a good fit for candidates with research experience in: macroeconometrics and/or empirical macroeconomics; international macroeconomics; time-series econometrics; forecasting; financial econometrics and/or empirical finance; and the use of micro and panel data to improve and validate traditional aggregate models. Economists at Amazon are expected to work directly with our senior management and scientists from other fields on key business problems faced across Amazon, including retail, cloud computing, third party merchants, search, Kindle, streaming video, and operations. The Forecasting, Macroeconomics & Finance field utilizes methods at the frontier of economics to develop formal models to understand the past and the present, predict the future, and identify relevant risks and opportunities. For example, we analyze the internal and external drivers of growth and profitability and how these drivers interact with the customer experience in the short, medium and long-term. We build econometric models of dynamic systems, using our world class data tools, formalizing problems using rigorous science to solve business issues and further delight customers.