Solving some of the largest, most complex operations problems
How Amazon’s Supply Chain Optimization Technologies team has evolved over time to meet a challenge of staggering complexity.
Amazon’s ability to grow to an unprecedented scale, while simultaneously meeting the growing expectations of its customers, particularly around delivery speeds, is a success story on many levels.
One of the keys to that success is a team that is fundamental to Amazon’s increasingly rapid transformation. A largely unsung team that in little more than a decade has built one of the largest and most sophisticated automated decision-making systems in the world. A team that has harnessed simulation, mathematical optimization, and machine learning to create the capability to deliver products at speeds once thought impossible at the mass market scale — in some cases within 2 hours — across a fulfillment network of dizzying complexity.
This is Amazon’s Supply Chain Optimization Technologies team (SCOT). If the Amazon Store were a human body, think of SCOT as its nervous system: essential to life, quietly acting in the background to automatically optimize critical functions and flows.
“At SCOT, using science and technology to optimize the supply chain is not just an enabler, it's our core focus,” says Ashish Agiwal, vice president, Fulfillment Optimization.
Today, SCOT’s systems have end-to-end responsibility for orchestrating Amazon Store’s supply chain.
SCOT is responsible for computing the delivery promises Amazon Store customers see when ordering, forecasting demand for its hundreds of millions of products, deciding which products to stock and in what quantities, allocating stock to warehouses and fulfillment centers (FCs) in anticipation of regional customer needs, offering markdown pricing when necessary, working out how to consolidate customer orders for maximum efficiency, coordinating inbound and inventory management from millions of sellers worldwide, and so much more.
But it was not always thus. Far from it, says Deepak Bhatia, vice president of SCOT, whose team’s methodologies and mechanisms will be a topic of conversation at INFORMS, the world’s largest operations research and analytics conference, taking place next week in Indianapolis, Indiana.
“A very different world”
In 2011 when Bhatia joined Amazon, the team that would evolve into SCOT was much smaller, he recalls, and its main concern was trying to automate Amazon’s product buying and inventory management.
“It was a very different world. The notion of an end-to-end supply chain tech function wasn’t there. But there were powerful intellects and a lot of energy in that team.”
In 2011, Amazon’s total revenue reached nearly $48 billion, and it was already clear to the senior leadership that the company’s scale would require the automation of buying and the management of inventory; monitoring spreadsheets was not a long-term solution. Indeed, even then the sheer range of products offered by Amazon meant the “illusion of control” was already kicking in among the groups managing inventory, says Bhatia. In fact, Bhatia notes, the sheer complexity and scale meant the challenge was beyond the scope of any team, let alone an individual.
In response, Bhatia and his colleagues set out to develop complex algorithms that could make buying and inventory placement decisions for a given category of products. And while that was all well and good in theory, trying it for real was a watershed moment.
“It was a huge deal. Will it improve things, and if so by how much? Will it completely break? In the beginning, we took baby steps. We made changes one product category at a time.”
Media category products were the early adopters. In randomized, controlled trials that ran over several months, some of these products were managed in the traditional way, and some by the new algorithms. Crucially, human judgement could still override the system’s decisions if deemed necessary.
The trial went well — the algorithms’ decisions were overridden only a small percentage of the time — and the approach was expanded across additional categories, including consumables such as groceries.
Going all in
“Then one day, in a high-level meeting someone said: ‘What if we go all in and make these categories 100% automated?’” Bhatia recalls. “Someone responded ‘All hell will break loose’.” And that, Bhatia notes, is where Amazon’s comfort with risk-taking came into play. “They decided to go all in.” That was around 2014. And the systems worked as designed, improving customer experience outcomes like in-stock rates while reducing costs.
“After this success, automating one product category at a time started to feel too risk-averse,” says Bhatia.
Over the next few years, the technology was rapidly rolled out across the retail business, all the while being iterated and improved upon, with increasing success in terms of efficiency and customer satisfaction. At the same time, the rapidly growing SCOT team was developing technologies that would enable them to join the dots from one end of the Amazon supply chain to the other.
For example, SCOT grew its own demand forecasting team, with a sharp focus on scientific and technological innovation. The forecasting aspect of SCOT’s work started out as a patchwork of models, which evolved eventually to deep learning approaches to decide what features of the retail data were most important.
Today, building on a 2018 in-house research breakthrough, the forecasting team is using a single model that learns business-critical demand patterns without even being told what to look for. Called the Multi-Horizon Quantile Recurrent Forecaster, the model can accurately forecast shifting seasonal demand, future planned-event demand spikes and even “cold-start forecasting” for products with limited sales history.
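To make the idea of a quantile forecast concrete: models of this kind are typically trained to minimize the quantile (or "pinball") loss, which penalizes under-forecasting and over-forecasting asymmetrically depending on the target quantile. The following is a minimal illustrative sketch of that loss, not SCOT's implementation; the function names and the sample numbers are assumptions chosen for illustration.

```python
def pinball_loss(actual, forecast, quantile):
    """Quantile (pinball) loss for one forecast at a given quantile level.

    For a high quantile such as 0.9, forecasting too low is penalized
    more heavily than forecasting too high.
    """
    error = actual - forecast
    return max(quantile * error, (quantile - 1) * error)


def mean_pinball_loss(actuals, forecasts, quantile):
    """Average pinball loss over paired actuals and forecasts."""
    losses = [pinball_loss(a, f, quantile) for a, f in zip(actuals, forecasts)]
    return sum(losses) / len(losses)


# Hypothetical weekly demand and a hypothetical 90th-percentile forecast.
actual_demand = [10, 12, 9, 15]
p90_forecast = [12, 12, 11, 13]
print(mean_pinball_loss(actual_demand, p90_forecast, 0.9))
```

With a target quantile of 0.9, a forecast that falls short of actual demand by two units costs nine times as much as one that overshoots by two units, which is why a high-quantile forecast tends to sit above typical demand.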
Forecast accuracy is particularly important at Amazon’s scale.
“SCOT is directing hundreds of billions of dollars of product flows. That means just a few percentage points of change in our topline predictions equates to several fulfillment centers worth of products,” says Salal Humair, a SCOT vice president and Amazon distinguished scientist.
As SCOT’s demand forecasting has improved, so too has its ability to ensure that products are best positioned to fulfill those anticipated customer orders.
The challenge of One-Day Delivery
While Amazon’s largely manual inventory management system became increasingly automated in the early part of the previous decade, those changes proved insufficient for the logistical challenges that lay ahead: Amazon’s ever more ambitious customer-delivery promises, particularly its One-Day Delivery promise in the US in 2019, and Prime Now, Amazon’s 2-hour grocery business.
“Before we announced the One-Day Delivery promise, a detailed SCOT simulation called Mechanical Sensei was the key to figuring out how much additional inventory we would need, where it would be placed, and how that would affect shipping costs,” says Humair.
So, at a time when Amazon was continuing to expand globally, the company’s bold delivery promises meant there was a pressing need to locate products closer to Amazon customers. This meant a significant increase in local distribution facilities, and yet another challenge: which items should be locally placed?
“Most of our systems were designed to operate under the simplifying assumption that demand for each item sold on the website is independent, but we know that’s not the case in reality,” says Jeffrey Maurer, vice president, Inventory Planning and Control. “When one product goes out of stock, or isn’t available for fast delivery, demand shifts to other products. We can’t make every product locally available in every location, so how do we account for these constraints while trying to maximize customer satisfaction?”
That nut has yet to be comprehensively cracked, but the simple fact of adding local warehousing resulted in a supply chain network of such layered complexity that the SCOT team realized its automated network would need yet another radical redesign.
It took them several years to solve for the new set of challenges.
“We had to iterate, fail, iterate, fail, iterate, fail many times,” Humair recalls.
Then, in 2020, the team unveiled its latest breakthrough: the “multi-echelon system”. This is a multi-product, multi-layered, multi-fulfillment center model for optimizing inventory levels for varying delivery speeds in a space where future demand, product lead times and capacity constraints are all uncertain, and where real-time customer promises and fulfillment make the demand patterns seen by FCs very hard to characterize.
“We have a strong sense of pride for the work the SCOT team is doing,” says Bhatia. “These sorts of solutions are just unheard of in academia and industry.”
The SCOT team was able to demonstrate significant improvements to inventory buying and placement through the multi-echelon system, but rolling it out across the business was a challenge.
“Not only did the teams, systems and coordination mechanisms all need to be rebuilt, but we also had to keep the business running,” says Humair. “We had to change the engine while still flying the plane!”
And then there was COVID. “The impact of COVID on our supply chain brought capacity management to the forefront,” says Maurer. “It was no longer enough to be approximately right at network level in terms of capacity management; we needed to get it exactly right at every facility and connection in our network.”
Ultimately, the successful combination of powerful forecasting, multi-echelon inventory management, and several other algorithms and systems (running the gamut from fulfillment to customer promise, inventory health, and inventory placement), along with unparalleled distribution capacity, enabled Amazon to deal with the effects of COVID as well as the enormous surges in demand created by shopping events such as Cyber Monday and Amazon’s own Prime Day. The latter, this year, resulted in the record-breaking purchase of more than 300 million items across more than 20 countries.
So what are the current and future challenges in SCOT’s sights?
“The range of problems requiring disruptive technology solutions is not exhausted,” Humair notes.
For example, about 60% of the Amazon Store’s sales are through Fulfillment by Amazon (FBA), a service that lets small and medium-sized businesses provide unique selection for Amazon customers at low costs and fast speeds.
Optimizing supply chain efficiency would be hard enough at Amazon’s scale, even if Amazon were in full control of every aspect of its fulfillment network. “However, we work with millions of FBA sellers with different cost structures and inventory management practices who independently decide what to sell, how much to inbound, and how to price their products,” notes Piyush Saraogi, vice president, FBA.
These businesses share Amazon’s storage capacity and transportation network, but make their own decisions on pricing and inventory management. COVID played a role here as well: capacity constraints meant the FBA team had to adopt limits on restocking.
“Balancing the supply and demand of capacity in a network with 60% FBA inventory is an incredibly complex business problem,” Saraogi says. “To balance capacity in the marketplace setting, we have to invent new approaches that offer predictability to our sellers and are consistent with our general laissez-faire approach to FBA, while giving Amazon the flexibility to balance the network and ensure our store has all the in-stock selection customers are looking for.
“Sellers may have developed a blockbuster new product, received fresh capital, or shifted distribution toward FBA. The science for leveraging this key seller input in a scalable manner into our inventory and capacity management systems is uncharted territory that our scientists, engineers, and product managers are working on.”
“This is a big challenge for SCOT,” Bhatia agrees. “How can we support all our independent third-party sellers in ways that result in a triple win, for them, for Amazon, and for our customers?”
The SCOT team also wrestles with something that is increasingly prevalent in the modern world of complex optimization modelling and machine learning: how to explain automated decisions to the people who need to understand why things are happening as they are.
“We have hundreds of people fielding questions from selling partners and other stakeholders,” says Humair. “Why have my in-stock rates changed? Why do I have more inventory? Each such question requires manual deep dives, hundreds of person hours to answer.” The team is currently developing new methods to make its systems more explainable.
“Indeed, the very fact that such technology is extremely complex and requires a sophisticated technical background to fully understand makes the idea of going all-in on data science a daunting proposition,” says Humair.
“Data is always ambiguous, so you need a lot of conviction and judgment to stay the course. But it has yielded spectacular benefits for Amazon, for our selling partners, and, most importantly, for our customers.”
Another big challenge is managing transportation through Amazon’s growing delivery fleet of trucks, planes, sort centers, and delivery stations. SCOT’s Fulfillment Optimization team, led by Agiwal, runs the systems that make outbound fulfillment decisions.
“These systems optimize millions of customer promises every second and billions of customer order fulfillment plans daily. This is done by evaluating hundreds of millions of potential transport routes across the network and tracking over a billion real-time inventory updates every day,” he says.
Amazon’s operation of its own transportation network has created what Agiwal calls “a very exciting problem space” that his team is now addressing. “Designing the network topology, optimizing connections in a multi-tier multi-modal network, and coordinating all operational resources at Amazon scale is unprecedented,” he notes.
“Our new priority is ensuring that our own delivery trucks or cargo planes are as full as possible while also meeting our customer-delivery windows,” says Bhatia.
That problem space also illustrates why Amazon SCOT is so unique.
“We are solving some of the largest, most complex problems in operations using solutions entirely built in-house,” says Agiwal. “We have some of the best scientists, engineers and product managers in the world, working together and controlling their own destiny. We have the luxury of large and diverse data sets and the ability to innovate and experiment at a massive scale with immediate, measurable impact on customer experience and costs. It is truly gratifying.”
That complexity also explains why SCOT is so appealing to data scientists, economists, and machine learning scientists of all stripes.
“Our problem dimensionality is high and closed-form solutions are rarely applicable,” notes Maurer. “Our teams continually invent and implement new algorithms and evolve the fundamental structure of our systems as the physical network changes. SCOT is a great place for people who are drawn to exceptionally complex problem spaces and motivated by having high production impact.”