ICLR: What representation learning means in the data center
Amazon Scholar Aravind Srinivasan on the importance of machine learning for real-time and offline resource management.
Until relatively recently, a major concern of machine learning research was feature engineering, or determining which aspects of a machine learning model’s input data were most useful for the task at hand. Feature engineering typically required domain expertise: vision scientists to identify important features of images, linguists to identify important features of speech, and so on.
The International Conference on Learning Representations (ICLR) was founded to investigate an alternative: learning features directly from data. This is the approach that fueled the deep-learning revolution, and in the nine years since its founding, ICLR has moved from the periphery of machine learning into the center of the mainstream.
Aravind Srinivasan is an Amazon Scholar, a Distinguished University Professor at the University of Maryland, and an area chair at this year’s ICLR. In his work at Amazon, he brings representation learning to bear in an environment that, when ICLR was founded, would have seemed far afield: data centers.
Srinivasan works chiefly on the Amazon Web Services Lambda service, which offers function execution as a service.
“You can think of a function as simply a piece of code,” Srinivasan says. “If you can learn representations of the inputs that we get, which are the functions, their shapes” — that is, their resource consumption over time — “how often they run, when they get invoked, how quickly they terminate, whether they have very strict deadlines or lax deadlines, you can use that data to do all kinds of optimizations, both online as well as longer-term planning.”
For instance, Srinivasan explains, “the function could be spiky: it could spike up its CPU utilization for some time, and then be relatively low maintenance for a bit, and spike again. If you can understand these shapes, you can cluster related functions together.
“You often want to cluster what are called anticorrelated functions together. If there are functions F1 and F2, and when F1 is spiking up, F2 will not spike up, and conversely, when F1 comes down, F2 spikes up, it's much better to pack them on the same worker, because they are not going to simultaneously require significant resources.
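The packing intuition above can be sketched numerically. The snippet below is a minimal illustration, not Lambda's actual scheduler: it uses made-up CPU-usage traces for two hypothetical spiky functions, measures how anticorrelated their shapes are with a Pearson correlation, and compares the peak load of packing them together against the worst case of aligned spikes.

```python
import numpy as np

# Hypothetical CPU-usage traces (percent utilization, sampled per minute)
# for two "spiky" functions; the names and numbers are illustrative only.
f1 = np.array([90, 10, 85, 15, 95, 5, 80, 20], dtype=float)
f2 = np.array([15, 80, 10, 90, 5, 85, 20, 75], dtype=float)

# Pearson correlation of the two usage shapes: values near -1 mean the
# functions are anticorrelated, i.e. they rarely spike at the same time.
corr = np.corrcoef(f1, f2)[0, 1]

# Peak combined load if the two functions share one worker, versus the
# naive worst case in which both spikes were to align.
combined_peak = (f1 + f2).max()
naive_peak = f1.max() + f2.max()

print(f"correlation:   {corr:.2f}")       # strongly negative
print(f"combined peak: {combined_peak:.0f}%  (vs. {naive_peak:.0f}% if spikes aligned)")
```

Because the traces are anticorrelated, the combined peak (105%) is far below the sum of the individual peaks (185%), which is exactly why a scheduler would prefer to co-locate such a pair.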
“Or suppose you have a new service that you want to roll out. You want to be able to predict, for instance, what its success rate will be. You can use machine learning to predict what kinds of jobs are most popular, and maybe what kinds of jobs are popular during which times of the year, which days of the week, et cetera. We can explicitly ask customers, but also, we want our own prediction models that can talk about the probability of success of a new anticipated service.
“Then there is the other significant pipeline post-learning, which is how to use these predictions in order to do better resource planning, to do better resource allocation, how to provide guaranteed qualities of service to different customers, et cetera.”
ML meets algorithm design
The idea of a post-processing pipeline that makes use of learned representations touches on a central theme of Srinivasan’s research, both at Amazon and in his academic lab: the intersection of machine learning and algorithm design.
“Both within the guts of machine learning and as a post-processing or even preprocessing step, machine learning can be very helpful,” Srinivasan says. “For example, there is growing awareness and concern as well that our models are becoming very, very large. Of course, computation time is a problem, but the carbon footprints of these models are becoming nontrivial. If you have a model that runs on many cores for many days, the amount of energy it takes is nontrivial.
“So can we make our models more efficient? Can we view neural-network architectures as a constrained optimization problem where you can make the neural-network inference faster while retaining accuracy and provide other sorts of guarantees? For example, can fairness be baked into how a neural network runs?”
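One concrete instance of "faster inference under a constraint" is magnitude pruning: remove as little weight mass as possible subject to a sparsity (compute) budget. The sketch below is a toy stand-in for the richer formulations Srinivasan alludes to, using a random matrix in place of a trained layer.

```python
import numpy as np

# A toy linear layer; the random weights are a stand-in for a trained model.
rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` of them are zero.

    This casts "make inference faster while retaining accuracy" as a simple
    constrained problem: minimize the weight magnitude removed, subject to a
    sparsity budget that proxies for inference cost.
    """
    k = int(sparsity * weights.size)
    # Threshold below which weights are dropped: the k-th smallest magnitude.
    threshold = np.sort(np.abs(weights), axis=None)[k]
    return np.where(np.abs(weights) < threshold, 0.0, weights)

W_sparse = prune_by_magnitude(W, sparsity=0.9)
kept = np.count_nonzero(W_sparse) / W.size
print(f"fraction of weights kept: {kept:.2f}")
```

With 90% of the weights zeroed, a sparse matrix-vector product touches a tenth of the original parameters; whether accuracy survives is the empirical question such constrained formulations try to answer.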
Fairness, Srinivasan says, is a research topic that has gained momentum in recent years. When he surveys the program at this year’s ICLR, the new emphasis on fairness is one of the things that jumps out at him.
“Amazon is playing a leading role in this,” Srinivasan says. “Amazon has a collaboration with the National Science Foundation to give out grants to people working on fairness and AI. There's a growing number of papers on fairness in forums like ICLR, as well as the other major machine learning and AI conferences.”
Fairness is also a good example of the kind of interesting scientific question that arises in the context of trying to provide better services to Amazon customers.
“Amazon is a real sandbox where the inferences one comes up with are of tremendous value to the corporation as well as of scientific interest,” Srinivasan says. “There is the opportunity to both develop new science and apply known science in interesting ways, to highly multimodal data in some cases. There are very interesting scientific and technical challenges, and there are very interesting practical challenges. I don't mean these are separate: they of course interlace with each other. So for people who are interested in large data sets, in uncertainty in data prediction, in predictive models, representation learning, all of these, you have an environment where there are very significant practical problems to be solved.”