MLSys 2021: Bridging the divide between machine learning and systems
Amazon distinguished scientist and conference general chair Alex Smola on what makes MLSys unique — both thematically and culturally.
The Conference on Machine Learning and Systems (MLSys), which starts next week, is only four years old, but Amazon scientists already have a rich history of involvement with it. Amazon Scholar Michael I. Jordan is on the steering committee; vice president and distinguished scientist Inderjit Dhillon is on the board and was general chair last year; and vice president and distinguished scientist Alex Smola, who is also on the steering committee, is this year’s general chair.
As the deep-learning revolution spread, MLSys was founded to bridge two communities that had much to offer each other but often worked independently: machine learning researchers and systems developers.
“If you look at the big machine learning conferences, they mostly focus on, ‘Okay, here's a cool algorithm, and here are the amazing things that it can do. And by the way, it now recognizes cats even better than before,’” Smola says. “They're conferences where people mostly show an increase in capability. At the same time, there are systems conferences, and they mostly care about file systems, databases, high availability, fault tolerance, and all of that.
“Now, why do you need something in-between? Well, because quite often in machine learning, approximate is good enough. You don't necessarily need such good guarantees from your systems. If you lower the requirement, you can do things cheaper, faster, or more scalably.”
Sys for ML, ML for sys
At the same time, as deep learning’s popularity grew, it was natural to ask whether it could help allocate resources in computer systems.
“This is, ultimately, what a lot of systems papers do,” Smola says. “They are along the lines of, Should I start a machine? Should I end one? How many devices should I schedule for a job? When do I decide that a machine has failed? How much redundancy do I need? So you can ask yourself, Well, given that these are a lot of resource decisions, can I use machine learning to predict what an optimal or at least a better strategy will be to handle those resource decisions?”
Papers accepted to MLSys, Smola explains, feature research in both directions — machine learning for systems and systems for machine learning. As an example, he pulls up the program listing for the conference session on Communication and Storage, on Tuesday, April 6.
“This is very much a systems session,” Smola says. “But even there, it covers both directions. For instance, the paper ‘In-network aggregation for shared machine learning clusters’ is about how you can do operations cheaply to facilitate more machine learning. But another paper is about how to use machine learning to make those storage systems themselves better.”
MLSys doesn’t just represent a merger of research programs, Smola says; it also represents a merger of cultures — which can make for some lively discussion.
“In the systems community, essentially, unless you actually have a working system, they won't take you seriously,” Smola says. “That makes the conference a little bit interesting, because you have two different cultures. You have the machine learning culture, where it's more like, ‘Hey, here is an impressionist painting of what could be. Next paper, please.’ And then the systems community, which is a lot more rigorous in terms of, ‘Well, here's something that actually works, and hey, we've demonstrated it. And by the way, maybe there's a product that's actually shipping now with it.’ And that makes the conference interesting, because those two cultures usually don't mix quite that much. And this maybe gives the systems papers a slightly more theoretical bent and the machine learning papers a slightly more empirical one.”
At this year’s MLSys, one of the additions that Smola oversaw is the introduction of a daylong Chips and Compilers Symposium, which brings the conversation about system design for machine learning down to the metal — the chip level. The symposium was organized by Mu Li, a senior principal scientist with Amazon Web Services.
“There are events like Hot Chips and Cool Chips where NVIDIA, Intel, AMD, ARM, and others show up and demonstrate the latest silicon,” Smola says. “But the silicon is only half the equation. So we figured this would be a good place to bring these two communities much more closely together. This is a community-building exercise.”
Like all computer science conferences in the past year, MLSys has moved online. One advantage of that, Smola says, is that it has drastically reduced the price of registration, to $25 for students and $100 for academics and professionals.
“It's super affordable for anybody,” Smola says.