How pioneering deep learning is reducing Amazon’s packaging waste

A combination of deep learning, natural language processing, and computer vision enables Amazon to home in on the right amount of packaging for each product.

January 4, 2022

Finding the right amount of packaging to ship an item can be challenging, and at Amazon, an ever-changing catalog of hundreds of millions of products makes it an ongoing one. Amazon’s scale also makes it impossible to choose packaging for each and every item by manual inspection. For the same reason, general packaging rules and run-of-the-mill logic just won’t cut it. What’s required is a smart, automated mechanism that can adapt on the fly to changing circumstances.

Prasanth Meiyappan, an applied scientist, and Matthew Bales, a research science manager, authored "Reducing Amazon’s packaging waste using multimodal deep learning." Their position paper was one of the 10 most-read research papers on Amazon Science in 2021.

Fortunately, machine learning approaches, particularly deep learning, thrive on big data and massive scale, and a pioneering combination of natural language processing and computer vision is enabling Amazon to home in on the right amount of packaging. These tools have helped Amazon drive change over the past six years, reducing per-shipment packaging weight by 36% and eliminating more than a million tons of packaging, equivalent to more than 2 billion shipping boxes.

“When I started at Amazon in 2017, we had a lot of physical testing of products going on, but not a scalable mechanism that could assess hundreds of millions of products to identify the optimal packaging type for each product,” says research science manager Matthew Bales. Bales, who is also a physicist, heads up machine learning within Amazon’s Customer Packaging Experience team.

“Statistical tests were the first piece, but they are essentially only useful when products have already been shipped in more than one package type. We wanted the capability to predict how a product would fare in a less-protective, lighter, and more sustainable package type. And once you're in that predictive space, you need machine learning,” Bales explains.

The power of customer feedback

To make a prediction about whether a given product could be safely shipped in a particular package type, Bales and his colleagues built an ML model based largely on the text-based data that customers find on the Amazon Store, such as the item name, description, price, and package dimensions.
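
As a rough illustration of how listing fields like these could become model inputs, the Python sketch below turns a product record into a simple feature dictionary. The field names, the `tokenize` and `featurize` helpers, and the sample listing are all hypothetical, invented for this example; the article does not describe the team's actual feature pipeline.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def featurize(listing):
    """Turn a (hypothetical) listing dict into a flat feature dict.

    Text fields become bag-of-words counts; numeric fields pass through.
    """
    features = Counter()
    for field in ("name", "description"):
        for token in tokenize(listing.get(field, "")):
            features[f"word={token}"] += 1
    features["price"] = listing["price"]
    # Package volume from the three dimensions, in whatever units the listing uses.
    length, width, height = listing["package_dimensions"]
    features["package_volume"] = length * width * height
    return dict(features)


# Hypothetical listing, invented for this example.
listing = {
    "name": "Ceramic Coffee Mug",
    "description": "Fragile ceramic mug, 12 oz",
    "price": 14.99,
    "package_dimensions": (5.0, 4.0, 4.5),
}
features = featurize(listing)
```

A production model would likely use learned text representations rather than raw word counts, but the sketch shows the basic idea: words such as "ceramic" or "fragile" become signals a classifier can associate with packaging outcomes.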

Balancing act

To arrive at this triple win, though, the team also had to take on a thorny challenge encountered frequently in the ML domain: class imbalance. In a nutshell, the problem is this: if you want an ML model to learn effectively, you ideally provide it with as many examples of failures as successes, so it can learn to differentiate effectively between the two.

The data used to train the model had many millions of examples of product/package pairings, yet depending on the package type, as few as 1% of those examples were for packages that turned out to be unsuitable in some way for the product within.
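
To see why a 1% failure rate is a problem, consider the short Python sketch below. It uses made-up labels (not Amazon data) in which 1% of pairings are failures, and scores a trivial "model" that always predicts success. The `evaluate` helper and the dataset size are assumptions for illustration.

```python
def evaluate(y_true, y_pred):
    """Return (accuracy, recall on the positive 'unsuitable' class)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    positives = sum(y_true)
    return correct / len(y_true), true_pos / positives


# Illustrative labels: 1 = package turned out unsuitable, 0 = package was fine.
# 1% of 10,000 pairings are failures, mirroring the article's lower bound.
y_true = [1] * 100 + [0] * 9_900

# A "model" that always predicts "fine" never flags a bad pairing.
y_pred = [0] * len(y_true)

accuracy, recall = evaluate(y_true, y_pred)
print(f"accuracy={accuracy:.2%}, recall={recall:.2%}")
# prints: accuracy=99.00%, recall=0.00%
```

The always-"fine" predictor looks 99% accurate yet catches none of the failures, which is why a model trained naively on such data can appear to perform well while learning nothing about the minority class.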

“Prior to implementing ML, we’ve shipped some product in envelopes and mailers for some time,” says Bales. “So, we had loads of examples of things that were good in mailers, but didn't have a lot of examples of things that were bad in mailers. ML models have problems with this kind of overwhelming imbalance.”

“The machine learning literature to do with packaging is pretty sparse,” Meiyappan says. “Not many people deal with the kind of datasets we are dealing with in the packaging domain. How effective a technique is in dealing with dataset imbalance is both domain and dataset specific.”

The team’s approach to the class imbalance problem was therefore primarily experimental. Of the six approaches they applied (four data-based, two algorithm-based), the clear winner produced a marked improvement in model accuracy. That was a data-based approach called two-phase learning with random undersampling, which focuses the model on the minority class in the first phase of training and then on all of the data in the second. “In our position paper we share that knowledge with the ML community,” says Bales, “so that anyone who encounters a similar problem might choose to try this approach for themselves, to see if it also works in their problem space.”
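
The two-phase idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the team's implementation: it reads "phase one" as training on a randomly undersampled, class-balanced subset and "phase two" as continued training on the full dataset. The logistic-regression model, the learning rates, the epoch counts, and the synthetic one-feature data are all assumptions chosen to keep the example small.

```python
import math
import random


def undersample(data, rng):
    """Randomly drop majority-class rows until both classes are the same size."""
    minority = [row for row in data if row[1] == 1]
    majority = [row for row in data if row[1] == 0]
    return minority + rng.sample(majority, len(minority))


def train(weights, data, epochs, lr, rng):
    """Run SGD for logistic regression, updating `weights` in place."""
    for _ in range(epochs):
        rows = data[:]
        rng.shuffle(rows)
        for x, y in rows:
            z = sum(w * xi for w, xi in zip(weights, x))
            z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
            pred = 1.0 / (1.0 + math.exp(-z))
            for i, xi in enumerate(x):
                weights[i] -= lr * (pred - y) * xi
    return weights


# Synthetic data: each row is ((bias, feature), label); 2% are label 1 ("unsuitable").
rng = random.Random(0)
data = [((1.0, rng.gauss(2.0, 1.0)), 1) for _ in range(20)]
data += [((1.0, rng.gauss(0.0, 1.0)), 0) for _ in range(980)]

weights = [0.0, 0.0]

# Phase 1: train on a balanced subset so the rare class is not drowned out.
balanced = undersample(data, rng)
train(weights, balanced, epochs=20, lr=0.1, rng=rng)

# Phase 2: continue training on all of the data, here with a smaller learning rate.
train(weights, data, epochs=2, lr=0.01, rng=rng)
```

In this reading, the first phase lets the model learn what distinguishes failures, and the second recalibrates it to the true class proportions. As Meiyappan notes, how well any such technique works depends on the domain and dataset.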

What’s next

The team said they are eager to expand the use of this tool by training the model to understand all of the languages used by Amazon’s customers while also incorporating the unique aspects of fulfillment in each country.

While Amazon scientists continue to research other ways to utilize machine learning to eliminate waste, the company is also working to reduce packaging waste throughout the e-commerce supply chain. Amazon is, for example, increasingly incentivizing its vendors to create optimized e-commerce packaging for themselves that saves space and materials without compromising product protection.

Through the Climate Pledge, which Amazon co-founded and committed to in 2019, the company’s goal is to reach net-zero carbon emissions across its global operations by 2040, while inspiring and inviting others to take action.

About the Author

Sean O'Neill

Sean O’Neill is a writer, editor, and science communicator based near Bristol, UK.
