MLIO is a high-performance data access library for machine learning tasks with support for multiple data formats. It makes it easy for scientists to train models on their data without worrying about the format or where it's stored. Algorithm developers can also use MLIO to build production-quality algorithms that support a rich variety of data formats and provide helpful parsing and validation messages to their customers without compromising on performance.
MLIO is already used by various components of the Amazon SageMaker platform, such as its first-party algorithms and the Autopilot feature. The open-source Amazon SageMaker XGBoost and Scikit-learn container images also use MLIO to consume datasets.