Amazon SageMaker autopilot: a white box AutoML solution at scale

Piali Das; Nikita Ivkin; Tanya Bansal; Laurence Rouesnel; Philip Gautier; Zohar Karnin; Leo Dirac; Andre Perunicic; Iaroslav Shcherbatyi; Aida Zolic

2020

We present Amazon SageMaker Autopilot, a fully managed system that provides an automatic machine learning (AutoML) solution. Given a tabular dataset and the name of the target column, Autopilot identifies the problem type, analyzes the data, and produces a diverse set of complete ML pipelines, which are tuned to generate a leaderboard of candidate models that the customer can choose from. This diversity lets users balance competing needs, such as model accuracy versus latency. By exposing not only the final models but also the way they are trained, that is, the pipelines, we allow users to customize the generated training pipelines, thus catering to the needs of users with different levels of expertise. This trait is crucial for users and is the main novelty of Autopilot: it provides a solution that, on the one hand, is not fully black-box and can be worked on further and, on the other hand, is not a do-it-yourself solution requiring expertise in all aspects of machine learning. This paper describes the different components in the ecosystem of Autopilot, emphasizing the infrastructure choices that allow scalability, high-quality models, editable ML pipelines, consumption of artifacts of offline meta-learning, and convenient integration with the entire SageMaker system, allowing these trained models to be used in a production setting.
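To make the workflow in the abstract concrete, the following is a minimal Python sketch of two of its steps: guessing the problem type from the target column, and ranking candidate models on a leaderboard under a latency budget. It is an illustration under stated assumptions, not Autopilot's actual logic: the function names `infer_problem_type` and `leaderboard`, the `Candidate` record, the `max_classes` threshold, and the example metrics are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


def infer_problem_type(target: Sequence, max_classes: int = 20) -> str:
    """Guess the problem type from target values (illustrative heuristic only)."""
    values = [v for v in target if v is not None]
    distinct = set(values)
    # Exactly two distinct labels: treat as binary classification.
    if len(distinct) == 2:
        return "BinaryClassification"
    # Numeric target with many distinct values: treat as regression.
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in values
    )
    if numeric and len(distinct) > max_classes:
        return "Regression"
    # Everything else falls back to multiclass classification.
    return "MulticlassClassification"


@dataclass
class Candidate:
    name: str           # hypothetical pipeline identifier
    accuracy: float     # validation metric, higher is better
    latency_ms: float   # inference latency, lower is better


def leaderboard(
    candidates: List[Candidate], max_latency_ms: Optional[float] = None
) -> List[Candidate]:
    """Rank candidates by accuracy, optionally dropping those over a latency budget."""
    eligible = [
        c for c in candidates
        if max_latency_ms is None or c.latency_ms <= max_latency_ms
    ]
    return sorted(eligible, key=lambda c: c.accuracy, reverse=True)


if __name__ == "__main__":
    # Made-up candidates, used only to show the accuracy vs. latency trade-off.
    candidates = [
        Candidate("pipeline-a", accuracy=0.95, latency_ms=120.0),
        Candidate("pipeline-b", accuracy=0.91, latency_ms=8.0),
        Candidate("pipeline-c", accuracy=0.93, latency_ms=30.0),
    ]
    print(infer_problem_type(["yes", "no", "yes"]))
    for c in leaderboard(candidates, max_latency_ms=50.0):
        print(c.name, c.accuracy, c.latency_ms)
```

With no latency budget the most accurate candidate leads the board; a tighter budget removes it and promotes a faster, slightly less accurate one, which is the kind of trade-off a diverse leaderboard lets a user make.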
