How a universal model is helping one generation of Amazon robots train the next

New approach can cut the setup time required to develop vision-based machine learning solutions from between six to twelve months to one or two.

A fundamental theme at Amazon is movement. Obtaining a product ordered by a customer and moving that product as quickly and efficiently as possible from its source to the customer’s doorstep.

This video shows robots moving packages around an Amazon fulfillment center.

That journey will often take a package through multiple warehouses and include loadings, unloadings, sortings, and routings. Human associates are crucial to this process and so, increasingly, are robotic manipulators. A rising star in this department is the Robin robotic arm and the computer vision system that makes it possible.

Robin’s visual-perception algorithms can identify and locate packages on a conveyor belt below it, for example, and even distinguish individual packages and their type within a cluttered pile.

This perceptive ability is known as segmentation, and it is central to the development of flexible and adaptive robotic processes for Amazon fulfillment centers. That’s because packages vary enormously in their dimensions and physical characteristics, moving amid an ever-changing mix of packages and against varying backdrops.

Amazon's Robin robot arm is seen lifting packages
Robin’s visual-perception algorithms can identify and locate packages on a conveyor belt below it, for example, and even distinguish individual packages and their type within a cluttered pile.

Robin is a maturing technology, but there is a constant simmering of new ideas just below the surface at Amazon, with teams of scientists and engineers across the Amazon Robotics AI group and beyond collaborating to develop AI-powered robotic solutions to improve warehouse efficiency. A new modeling approach aims to serve them all.

An abundance of packages — but not data

The initial challenge for these early-stage collaborations is often the same.

“The biggest problem that new project teams usually face is data scarcity,” says Cassie Meeker, an Amazon Robotics AI applied scientist, based in Seattle. Obtaining images relevant to a warehouse process of interest takes time and resources, but that’s just the beginning.

Cassie Meeker, an Amazon Robotics AI applied scientist, is seen standing in front of a Robin robot arm
Cassie Meeker, an Amazon Robotics AI applied scientist, says she and her team started their quest to develop universal models by utilizing publicly available datasets to give their model basic classification skills.

“For some machine learning models, you must annotate each training image manually by drawing multiple polygons around the various packages in the picture,” Meeker explains. “It can take five minutes to annotate just one image if it’s cluttered.”

The lack of task-specific training data means teams might base their perceptual models on just a few hundred images, says Meeker: “If they're lucky, they have a thousand. But even a thousand images aren’t a lot for training a model.”

If new projects do not have sufficient variety in their training data, that’s a challenge.

“The production environment is typically very different to a prototyping environment, so when they go into the production phase on the warehouse floor, they will suddenly see all these things they've never seen before and that their perception system can’t identify,” says Meeker. “They could be setting themselves up for failure.”

This difficulty in obtaining data to train segmentation models is partly due to the very specific subject matter: packages. Many computer vision models are trained on enormous, publicly available datasets full of annotated imagery, including everything from aardvarks to zabaglione. A social media company might want to segment faces, or dogs or cats, because that’s what people have lots of pictures of.

“Many publicly available datasets are perfect for that,” says Meeker. “But at Amazon, we have such a specific application and annotation requirements. It just doesn’t translate well from cat pics.”

A ’universal model’ for packages

In short, building a dataset big enough to train a demanding machine learning model requires time and resources, with no guarantee that the novel robotic process you are working toward will prove successful. This became a recurring issue for Amazon Robotics AI. So this year, work began in earnest to address the data scarcity problem. The solution: a “universal model” able to generalize to virtually any package segmentation task.

To develop the model, Meeker and her colleagues first used publicly available datasets to give their model basic classification skills — being able to distinguish boxes or packages from other things, for example. Next, they honed the model, teaching it to distinguish between many types of packaging in warehouse settings — from plastic bags to padded mailers to cardboard boxes of varying appearance — using a trove of training data compiled by the Robin program and half a dozen other Amazon teams over the last few years. This dataset comprised almost half a million annotated images.

Meet the Amazon robot improving safety

Crucially, these images of packages were snapped from a variety of angles — not only straight down from above a conveyor belt — and against a variety of backgrounds. The sheer number and variation of images make the dataset useful in virtually any warehouse location that may benefit from robotic perception and manipulation.

Meeker estimates that starting a project with the universal model can slash the setup time required to develop vision-based ML solutions from between six to twelve months to just one or two. And it has been made available to other Amazon teams in a user-friendly form, so extensive machine learning expertise is not required.

The universal model has already demonstrated its prowess, courtesy of a project run by Amazon Robotics, called Cardinal. Cardinal is a prototype robotic arm-based system that perceives and picks up packages and places them neatly into large containers ready for transport on delivery trucks. Cardinal’s perception system was developed before the universal model was available, so the team spent a lot of time creating a bespoke training dataset for it, says Cardinal’s perception lead, Jeroen van Baar, an Amazon Robotics senior applied scientist, based in North Reading, Massachusetts.

This video shows Cardinal training itself to distinguish between package types.

“We trained the system using 25,000 annotated training images that we created ourselves. But those early training images were taken using a setup with a different appearance to our prototype Cardinal workstation,” van Baar says. “To achieve the performance that we initially desired, we had to fine-tune our model using a thousand new training images taken from that prototype setting.”

After being updated with only those new images, the universal model was as accurate for performing Cardinal’s task as the workstation’s own robust model.

“Had it been available sooner, I would only have captured data specific to our setup and fine-tuned the universal model from there,” says van Baar. “Being able to shorten training time so significantly is a major benefit.”

Related content
Company is testing a new class of robots that use artificial intelligence and computer vision to move freely throughout facilities.

And that’s the point. The universal model can quickly capitalize on any training data produced by a new-project team. This means that when new ideas are tested on the warehouse floor, or existing methods are transplanted to a new Amazon region where things are done slightly differently, the model will have enough data diversity to handle the differences.

Siddhartha Srinivasa, director of Robotics AI, thinks of the universal model as a supportive scaffold that you can use to build your house.

“We're not advocating that everybody live in the same house,” he says. “We're advocating that Amazon teams leverage the scaffolding we're providing to build whatever house they want, because it’s already very powerful, and it is getting better every day.”

Tipping point

Only recently has all this become possible.

“The Robotics AI program is young,” says Meeker. “In the beginning, there was no reason to use other teams’ data, because no one had very much.” But a tipping point has arrived. “We now have enough mature teams in production that we are seeing a real diversity and scaling of data. It is finally generalizable.”

Indeed, while the immediate focus of universal models is identifying and localizing various package types, diverse image data is now accumulating across a range of Amazon programs that cover more aspects of fulfillment centers.

Related content
Why detecting damage is so tricky at Amazon’s scale — and how researchers are training robots to help with that gargantuan task.

The universal model now includes images of unpackaged items, too, allowing it to perform segmentation across a greater diversity of warehouse processes. Initiatives such as multimodal identification, which aims to visually identify items without needing to see a barcode, and the automated damage detection program are accruing product-specific data that could be fed into the universal model, as well as images taken on the fulfillment center floor by the autonomous robots that carry crates of products.

“We’re moving towards a situation in which even data collected by small projects run by interns can be fed into the universal base model, incrementally improving the productivity of the entire robot fleet,” says Srinivasa.

We’re moving towards a situation in which even data collected by small projects run by interns can be fed into the universal base model, incrementally improving the productivity of the entire robot fleet.
Siddhartha Srinivasa

This diversity of data and its aggregation is particularly important for robotic perception within Amazon, especially given customers’ shifting needs, frequently novel Amazon packaging, and the company’s commitment to sustainability that means shipping more items in their own unique packaging.

All of this increases the visual variety of products and packages, making it harder for robots to identify from an image where one package ends and another begins.

Feeding the universal model in this way and having it available to new teams will accelerate the experimentation and deployment of future robotic processes. The use of the universal model is factored into Amazon’s immediate operational plans.

“We’re not doing this because it's cool — though it really is cool — but because it is inevitable,” says Srinivasa.

Related content

US, WA, Seattle
The Amazon Devices and Services organization designs, builds and markets Kindle e-readers, Fire Tablets, Fire TV Streaming Media Players and Echo devices. The Device Economics team is looking for an Economist to join our fast paced, start-up environment to help invent the future of product economics. We solve significant business problems in the devices and retail spaces by understanding customer behavior and developing business decision-making frameworks. You will build econometric and machine learning models for causal inference and prediction, using our world class data systems, and apply economic theory to solve business problems in a fast-moving environment. This involves analyzing Amazon Devices and Services customer behavior, and measuring and predicting the lifetime value of existing and future products. We build scalable systems to ensure that our models have broad applicability and large impact. You will work with Scientists, Economists, Product Managers, and Software Developers to provide meaningful feedback about stakeholder problems to inform business solutions and increase the velocity, quality, and scope behind our recommendations. Key job responsibilities Applies expertise in causal modeling to develop econometric/machine learning models to measure the economic value of devices and the business Reviews models and results for other scientists, mentors junior scientists Generates economic insights for the Devices and Services business and work with stakeholders to run the business for effectively Describes strategic importance of vision inside and outside of team. Identifies business opportunities, defines the problem and how to solve it. Engages with scientists, business leadership outside Devices and Services to understand interplay between different business units We are open to hiring candidates to work out of one of the following locations: Arlington, VA, USA | Seattle, WA, USA
US, WA, Seattle
Amazon Advertising's Publisher Technologies team is looking for an experienced Applied Scientist with proven research experience in control theory, online machine learning, and/or mechanism design to drive innovative algorithms for ad-delivery at scale. Your work will directly shape pacing, yield optimization, and ad-selection for Amazon's publishers and impact experiences for hundreds of millions of users and devices. About the team Amazon Advertising operates at the intersection of eCommerce, streaming, and advertising, offering a rich array of digital advertising solutions with the goal of helping our customers find and discover anything they want to buy. We help advertisers reach customers across Amazon's owned and operated sites (publishers) across the web and on millions of devices such as Amazon.com, Prime Video, FreeVee, Kindles, Fire tablets, Fire TV, Alexa, Mobile, Twitch, and more. Within Ads, Publisher Technologies is building the next generation of ad-serving products to allow our publishers to monetize their on-demand, streaming, and static content across Amazon’s ad network in a few clicks. Publishers interact directly with our technology, through programmatic APIs to optimize billions of impression opportunities per day. About the role Publisher Technologies is looking to build out our Publisher Ad Server Science + Simulation and Experimentation team to drive innovation across ad-server delivery algorithms for budget pacing, ad-selection, and yield optimization. We seek to ensure the highest quality experiences for Amazon's customers by matching them with most relevant ads while ensuring optimal yield for publishers. As a Senior Applied Scientist, you will research, invent, and apply cutting edge designs and methodologies in control theory, online optimization, and machine learning to improve publisher yield and customer experience. You will work closely with our engineering and product team to design and implement algorithms in production. In addition, you will contribute to the end state vision of AI enhanced ad-delivery. You will be a foundational member of the team that builds a world-class, green-field ad-delivery service for Amazon's video, audio, and display advertising. To be successful in this role, you must be customer obsessed, have a deep technical background in both online algorithms and distributed systems, comfort dealing with ambiguity, an eye for detail, and a passion to identify and solve for practical considerations that occur when complex control-loops have to operate autonomously and reliably to make millisecond level decisions at scale. You are a technical leader with track record of building control theoretic and/or machine learning models in production to drive business KPIs such as budget delivery. If you are interested working on challenging and practical problems that impact hundreds of millions of users and devices and span cutting edge areas of optimization and AI while having fun on a rapidly expanding team, come join us! Key job responsibilities * Developing new statistical, causal, machine learning, and simulation techniques and develop solution prototypes to drive innovation * Developing an understanding of key business metrics / KPIs and providing clear, compelling analysis that shapes the direction of our business * Working with technical and non-technical customers to design experiments, simulations, and communicate results * Collaborating with our dedicated software team to create production implementations for large-scale data analysis * Staying up-to-date with and contributing to the state-of-the-art research and methodologies in the area of advertising algorithms * Presenting research results to our internal research community * Leading training and informational sessions on our science and capabilities * Your contributions will be seen and recognized broadly within Amazon, contributing to the Amazon research corpus and patent portfolio. We are open to hiring candidates to work out of one of the following locations: Seattle, WA, USA
US, WA, Seattle
The Alexa Economics team is looking for a Senior Economics Manager who is able to provide structure around complex business problems, hone those complex problems into specific, scientific questions, and test those questions to generate insights. The candidate will work with various product, analytics, science, and engineering teams to develop models and algorithms on large scale data, design pilots and measure their impact, and transform successful prototypes into data products at scale. They will lead teams of researchers to produce robust, objective research results and insights which can be communicated to a broad audience inside and outside of Alexa. Key job responsibilities Ideal candidates will work closely with business partners to develop science that solves the most important business challenges. They will work well in a team setting with individuals from diverse disciplines and backgrounds. They will serve as an ambassador for science for business teams, so that leaders are equipped with the right data and mental model to make important business decisions. Ideal candidates will own the development of scientific models and manage the data analysis, modeling, and experimentation that is necessary for estimating and validating models. They will be customer centric – clearly communicating scientific approaches and findings to business leaders, listening to and incorporate their feedback, and delivering successful scientific solutions. A day in the life - Review new technical approaches to understand Engagement and associated benefits to Alexa. - Partner with Engineering and Product teams to inject econometric insights and models into customer-facing products. - Help business teams understand the key causal inputs that drive business outcome objectives. About the team The Alexa Engagement and Economics and Team uses data, analytics, economics, statistics, and machine learning to measure, report, and track business outputs and growth. We are a team that is obsessed with understanding customer behaviors, and leveraging all aspects from customers behaviors with Alexa and Amazon to develop and deliver solutions that can drive Alexa growth and long-term business success. We use causal inference to identify business optimization and product opportunities. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | Seattle, WA, USA
US, WA, Bellevue
We are seeking a passionate, talented, and inventive individual to join the Applied AI team and help build industry-leading technologies that customers will love. This team offers a unique opportunity to make a significant impact on the customer experience and contribute to the design, architecture, and implementation of a cutting-edge product. The mission of the Applied AI team is to enable organizations within Worldwide Amazon.com Stores to accelerate the adoption of AI technologies across various parts of our business. We are looking for an Applied Scientist to join our Applied AI team to work on LLM-based solutions. Key job responsibilities You will be responsible for developing and maintaining the systems and tools that enable us to accelerate knowledge operations and work in the intersection of Science and Engineering. You will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. A day in the life We are seeking an experienced Scientist who combines superb technical, research, analytical and leadership capabilities with a demonstrated ability to get the right things done quickly and effectively. This person must be comfortable working with a team of top-notch developers and collaborating with our research teams. We’re looking for someone who innovates, and loves solving hard problems. You will be expected to have an established background in building highly scalable systems and system design, excellent project management skills, great communication skills, and a motivation to achieve results in a fast-paced environment. You should be somebody who enjoys working on complex problems, is customer-centric, and feels strongly about building good software as well as making that software achieve its operational goals. About the team On our team you will push the boundaries of ML and Generative AI techniques to scale the inputs for hundreds of billions of dollars of annual revenue for our eCommerce business. If you have a passion for AI technologies, a drive to innovate and a desire to make a meaningful impact, we invite you to become a valued member of our team. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA
US, WA, Bellevue
We are looking for detail-oriented, organized, and responsible individuals who are eager to learn how to work with large and complicated data sets. Some knowledge of econometrics, as well as basic familiarity with Python is necessary, and experience with SQL and UNIX would be a plus. These are full-time positions at 40 hours per week, with compensation being awarded on an hourly basis. You will learn how to build data sets and perform applied econometric analysis at Internet speed collaborating with economists, scientists, and product managers. These skills will translate well into writing applied chapters in your dissertation and provide you with work experience that may help you with placement. Roughly 85% of previous cohorts have converted to full time economics employment at Amazon. If you are interested, please send your CV to our mailing list at econ-internship@amazon.com. We are open to hiring candidates to work out of one of the following locations: Arlington, VA, USA | Bellevue, WA, USA | Seattle, WA, USA
US, WA, Seattle
The ASFS Team is hiring an Intern in Economics. We are looking for detail-oriented, organized, and responsible individuals who are eager to learn how to work with large and complicated data sets. Knowledge of econometrics and macroeconomics, as well as familiarity with Python, Matlab, or R is necessary. This is a full-time position at 40 hours per week, with compensation being awarded on an hourly basis. You will use internal and external data to estimate macroeconometric models to answer critical business questions, also you will have the opportunity to collaborate with economists and data scientists. Roughly 85% of interns from previous cohorts have converted to full time economics employment at Amazon. If you are interested, please send your CV to our mailing list at econ-internship@amazon.com. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | New York City, NY, USA | Seattle, WA, USA
US, WA, Bellevue
As an Applied Scientist on our Learning and Development team, you will play a critical role in driving the design, development, and delivery of learning programs and initiatives aimed at enhancing leadership and associate development within the organization. You will leverage your expertise in learning science, data analysis, and statistical model design to create impactful learning journey roadmap that align with organizational goals and priorities. Key job responsibilities 1) Research and Analysis: Conduct research on learning and development trends, theories, and best practices related to leadership and associate development. Analyze data to identify learning needs, performance gaps, and opportunities for improvement within the organization. Use data-driven insights to inform the design and implementation of learning interventions. 2) Program Design and Development: Collaborate with cross-functional teams to develop comprehensive learning programs focused on leadership development and associate growth. Design learning experiences using evidence-based instructional strategies, adult learning principles, and innovative technologies. Create engaging and interactive learning materials, including e-learning modules, instructor-led workshops, and multimedia resources. 3) Evaluation and Continuous Improvement: Develop evaluation frameworks to assess the effectiveness and impact of learning programs on leadership development and associate performance. Collect and analyze feedback from participants and stakeholders to identify strengths, areas for improvement, and future learning needs. Iterate on learning interventions based on evaluation results and feedback to continuously improve program outcomes. 4) Thought Leadership and Collaboration: Serve as a subject matter expert on learning science, instructional design, and leadership development within the organization. Collaborate with stakeholders across the company to align learning initiatives with strategic priorities and business objectives. Share knowledge and best practices with colleagues to foster a culture of continuous learning and development. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA | Nashville, TN, USA
US, WA, Seattle
Amazon Web Services (AWS) is building a world-class marketing organization, and we are looking for an experienced Economist to join the central data and science organization for AWS Marketing. This candidate will develop innovative solutions to measure the return on marketing investments. They will work closely with business leaders, scientists, and engineers to translate business and functional requirements into concrete deliverables, including the design, development, testing, and deployment of innovative measurement solutions. They will interact with functional leaders owning events (e.g. re:Invent, summits, webinars), paid media (paid search, paid social, display), AWS-owned channels (email, website, console) as well as lead management organization to drive the development, fine-tuning and adoption of the consistent measurement framework across these diverse initiatives. We seek candidates with an entrepreneurial spirit who want to make a big impact on AWS growth. They will develop strong working relationships and thrive in a collaborative team environment. They will have the creativity, curiosity, and strong judgment to work on high-impact, high-visibility products to improve the experience of AWS leads and customers. Key job responsibilities - Apply your expertise in causal inference and ML to develop systems to measure B2B marketing impact - Develop and execute science products from concept, prototype to production incorporating feedback from customers, scientists and business leaders - Identify new opportunities for leveraging economic insights and models in the marketing space - Write technical white papers and business-facing documents to clearly explain complex technical concepts to audiences with diverse business/scientific backgrounds We are open to hiring candidates to work out of one of the following locations: Arlington, VA, USA | Austin, TX, USA | New York City, NY, USA | Seattle, WA, USA
US, GA, Atlanta
Looking for your next challenge? North America Sort Centers (NASC) are experiencing growth and looking for a skilled, highly motivated Data Scientist to join the NASC Engineering Data, Product and Simulation Team. The Sort Center network is the critical Middle-Mile solution in the Amazon Transportation Services (ATS) group, linking Fulfillment Centers to the Last Mile. The experience of our customers is dependent on our ability to efficiently execute volume flow through the middle-mile network. Key job responsibilities The Senior Data Scientist will design and implement solutions to address complex business questions using simulation. In this role, you will apply advanced analysis techniques and statistical concepts to draw insights from massive datasets, and create intuitive simulations and data visualizations. You can contribute to each layer of a data solution – you work closely with process design engineers, business intelligence engineers and technical product managers to obtain relevant datasets and create simulation models, and review key results with business leaders and stakeholders. Your work exhibits a balance between scientific validity and business practicality. On this team, you will have a large impact on the entire NASC organization, with lots of opportunity to learn and grow within the NASC Engineering team. This role will be the first dedicated simulation expert, so you will have an exceptional opportunity to define and drive vision for simulation best practices on our team. To be successful in this role, you must be able to turn ambiguous business questions into clearly defined problems, develop quantifiable metrics and deliver results that meet high standards of data quality, security, and privacy. About the team NASC Engineering’s Product and Analytics Team’s sole objective is to develop tools for under the roof simulation and optimization, supporting the needs of our internal and external stakeholders (i.e Process Design Engineering, NASC Engineering, ACES, Finance, Safety and Operations). We develop data science tools to evaluate what-if design and operations scenarios for new and existing sort centers to understand their robustness, stability, scalability, and cost-effectiveness. We conceptualize new data science solutions, using optimization and machine learning platforms, to analyze new and existing process, identify and reduce non-value added steps, and increase overall performance and rate. We work by interfacing with various functional teams to test and pilot new hardware/software solutions. We are open to hiring candidates to work out of one of the following locations: Atlanta, GA, USA | Bellevue, WA, USA
US, WA, Bellevue
Amazon’s Middle Mile Planning & Optimization team is looking for an exceptional Sr. Applied Scientist to solve complex optimization problems that ensure we exceed customer delivery promise expectations and minimize overall operational cost while supporting Amazon’s rapid growth globally. We use cutting edge technologies in large-scale optimization, predictive analytics, and generative AI to optimize the flow of packages within our network to efficiently match network capacity with shipment demand. Our services already handle thousands of requests per second, make business decisions impacting billions of dollars a year, and improve the delivery experience for millions of online shoppers. That said, this remains a fast-growing business and our journey has just started. Our mission is to build the most efficient and optimal transportation solution on the planet, using our technology and engineering muscle as our biggest advantage. Key job responsibilities You will work closely with product managers, research scientists, business/operations leaders, and technical leadership to build capabilities that transform our transportation network. This includes analyzing big data, building end-to-end workflows, prototype optimization/simulation models, and launch production capabilities. You will have exposure to senior leadership as you communicate results and provide scientific guidance to the business. Your insights will be a key influencer of our product strategy and roadmap and your experimental research will inform our future investment areas. About the team You will join the Surface Research Science (SRS) team, which is the science partner of the Middle-Mile Planning & Optimization tech organization. SRS is working on a fascinating range of problems, including some of the hardest and largest optimization, simulation, and prediction problems in the industry. Examples are long-term and short-term demand forecasting, capacity planning, driver scheduling, vehicle routing, and equipment rebalancing problems. We are open to hiring candidates to work out of one of the following locations: Bellevue, WA, USA