Automatically tracking metadata and provenance of machine learning experiments

Sebastian Schelter; Joos-Hendrik Böse; Johannes Kirschnick; Thoralf Klein; Stephan Seufert

2017

We present a lightweight system to extract, store, and manage metadata and provenance information for common artifacts in machine learning (ML) experiments: datasets, models, predictions, evaluations, and training runs. Our system speeds up users' ML workflows and provides a basis for the comparability and repeatability of ML experiments. We achieve this by tracking the lineage of produced artifacts and automatically extracting metadata such as hyperparameters of models, schemas of datasets, or layouts of deep neural networks. Our system provides a general declarative representation of these ML artifacts, is integrated with popular frameworks such as MXNet, SparkML, and scikit-learn, and meets the demands of various production use cases at Amazon.
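
To make the idea of a declarative artifact representation with lineage tracking concrete, the following is a minimal sketch in plain Python. It is not the system described above: `Artifact`, `ProvenanceStore`, `extract_hyperparameters`, and `ToyModel` are hypothetical names invented for illustration, the store is in-memory only, and the one assumption about framework integration is that a model exposes a scikit-learn-style `get_params()` method.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Artifact:
    """Hypothetical declarative record for one ML artifact."""
    artifact_id: str
    kind: str  # e.g. "dataset", "model", "prediction", "evaluation"
    metadata: Dict[str, Any] = field(default_factory=dict)
    derived_from: List[str] = field(default_factory=list)  # ids of direct inputs


class ProvenanceStore:
    """Hypothetical in-memory store; a real system would persist its records."""

    def __init__(self) -> None:
        self._artifacts: Dict[str, Artifact] = {}

    def add(self, artifact: Artifact) -> Artifact:
        self._artifacts[artifact.artifact_id] = artifact
        return artifact

    def get(self, artifact_id: str) -> Artifact:
        return self._artifacts[artifact_id]

    def lineage(self, artifact_id: str) -> List[str]:
        """Return the ids of all transitive ancestors, nearest first."""
        seen: List[str] = []
        frontier = list(self._artifacts[artifact_id].derived_from)
        while frontier:
            current = frontier.pop(0)
            if current in seen:
                continue
            seen.append(current)
            frontier.extend(self._artifacts[current].derived_from)
        return seen


def extract_hyperparameters(model: Any) -> Dict[str, Any]:
    """Read hyperparameters from a model that offers get_params(); else return {}."""
    if hasattr(model, "get_params"):
        return dict(model.get_params())
    return {}


class ToyModel:
    """Stand-in for a framework estimator (assumption: get_params() is available)."""

    def __init__(self, learning_rate: float = 0.1, depth: int = 3) -> None:
        self.learning_rate = learning_rate
        self.depth = depth

    def get_params(self) -> Dict[str, Any]:
        return {"learning_rate": self.learning_rate, "depth": self.depth}


# Illustrative records only; the ids and schema are made up for this sketch.
store = ProvenanceStore()
store.add(Artifact("train-data", "dataset", {"schema": {"age": "int", "label": "bool"}}))
store.add(Artifact("model-1", "model", extract_hyperparameters(ToyModel(depth=5)), ["train-data"]))
store.add(Artifact("eval-1", "evaluation", {"metric": "accuracy"}, ["model-1"]))
```

Calling `store.lineage("eval-1")` walks the `derived_from` links back from the evaluation to the model and then to the training dataset, which is the kind of query that supports comparing and repeating experiments. Keeping each record as plain data (an id, a kind, a metadata dictionary, and input ids) is one possible way to stay independent of any single ML framework.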
