Teaching robots to respond to natural-language commands

Technique that relies on inverse reinforcement learning, or learning by example, improves task completion rate by 14% to 17% in simulations.

If general-purpose household robots ever become a reality, it would be nice to address them in natural language — to say to a robot, for instance, “Take the dirty dishes to the kitchen.”

Natural-language commands, however, introduce a new layer of complexity to the control of robotic systems, since the same sequence of actions can correspond to many different natural-language commands (“Can you clear the dishes from the dining room?”).

In a paper that my colleagues and I presented last week at the annual meeting of the Association for the Advancement of Artificial Intelligence (AAAI), we apply some of what we’ve learned working on natural-language understanding to the problem of natural-language robotic control.

In particular, we consider the case of inverse reinforcement learning (IRL), in which an AI agent learns to perform a specified task by observing human demonstrations. We augment the standard IRL framework, though, by specifying the agent’s goals in natural language rather than explicitly as unique states.

A diagram of the researchers’ training methodology, which alternates between updating an autonomous agent’s policy — a set of actions (a) to take in various states (s) in order to achieve a goal (G) — and training a discriminator to recognize the reward function implicit in experts’ examples. The discriminator learns from both positive and negative examples. Some negative examples (sampled trajectories) are relabeled (relabeled trajectories) and used to augment the experts' examples, both for updating the policy and for training the discriminator.
Credit: Glynis Condon

In experiments involving a benchmark data set consisting of high-quality 3-D simulations of an indoor environment, we compared our method to four leading approaches to IRL. 

In cases in which the agent is tested in an environment that it saw during training, our method improves its success rate in achieving goals specified in natural language by 14%, relative to the best-performing baseline. In novel test environments — environments unseen during training — our method improves the agent’s success rate by 17%.

In the paper, we also present a method whereby a trained AI agent deployed to an unfamiliar environment can generate its own training examples tailored to that environment. This additional self-supervised learning improves the agent’s success rate by an additional 36%.

Inverse reinforcement learning

Reinforcement learning is a paradigm in which an agent learns through trial and error. More specifically, it has a reward function — a measure of how successful it is at achieving some goal — and it learns a set of behaviors that maximize its reward.

In inverse reinforcement learning, by contrast, the agent is presented with a set of demonstrations — the examples of a human expert or other agent — and it must learn the reward function implicitly maximized by the experts.

Demonstrations are represented as trajectories, which consist of sequences of alternating states (of the environment and the agent’s place in it) and actions. With IRL, as with standard reinforcement learning, the agent’s ultimate aim is to learn a policy, which dictates which actions to take in which states. With IRL, however, the agent must learn the reward function and the policy simultaneously.
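To make the terminology concrete, here is a minimal Python sketch of a trajectory and a policy. The grid-cell state, the string actions, and the names `Trajectory` and `always_forward` are illustrative assumptions, not the representation used in the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

State = Tuple[int, int]   # hypothetical state: an (x, y) grid cell
Action = str              # hypothetical actions: "forward", "left", "right"


@dataclass
class Trajectory:
    """Alternating states and actions: s0, a0, s1, a1, ..., sT."""
    states: List[State]
    actions: List[Action]

    def __post_init__(self) -> None:
        # A trajectory with T actions visits T + 1 states.
        if len(self.states) != len(self.actions) + 1:
            raise ValueError("expected one more state than actions")

    def steps(self) -> List[Tuple[State, Action, State]]:
        """Return (state, action, next_state) triples."""
        return [
            (self.states[i], self.actions[i], self.states[i + 1])
            for i in range(len(self.actions))
        ]


# A policy maps a state to the action to take in that state.
Policy = Callable[[State], Action]


def always_forward(state: State) -> Action:
    """Trivial example policy: ignore the state and move forward."""
    return "forward"


demo = Trajectory(states=[(0, 0), (0, 1), (0, 2)], actions=["forward", "forward"])
```

In this toy form, an expert demonstration is a `Trajectory`, and the thing the agent ultimately has to learn is a function of type `Policy`.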

A common approach to IRL is to use a generative adversarial network, or GAN. The training data for the agent is a set of true trajectories, modeled by experts, which accomplish the goal to be learned.

But the training setup also includes an adversarial generator, which creates false trajectories, and the IRL discriminator must learn to distinguish the two. That is, it must learn a reward function that assigns a high value to true trajectories and a low value to false ones. Simultaneously, the adversarial generator tries to learn a policy that generates high-reward trajectories.
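The discriminator's side of this game can be sketched as a binary classification loss over reward scores. The binary cross-entropy below is the standard GAN discriminator objective and is an assumption here; the article does not state the exact loss used in the paper, and the reward values are made up for illustration.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def discriminator_loss(expert_rewards: Sequence[float],
                       generated_rewards: Sequence[float]) -> float:
    """Binary cross-entropy with expert trajectories labeled 1, generated 0.

    Each input is the scalar reward the model assigns to one trajectory.
    Note: this simple form is not numerically stable for very large rewards.
    """
    loss = 0.0
    for r in expert_rewards:
        loss -= math.log(sigmoid(r))          # push expert rewards up
    for r in generated_rewards:
        loss -= math.log(1.0 - sigmoid(r))    # push generated rewards down
    return loss / (len(expert_rewards) + len(generated_rewards))


# A reward function that ranks expert trajectories above generated ones
# incurs a lower loss than one that ranks them the other way around.
good = discriminator_loss([2.0, 3.0], [-2.0, -3.0])
bad = discriminator_loss([-2.0, -3.0], [2.0, 3.0])
```

Minimizing this loss is what drives the learned reward to be high on true trajectories and low on false ones, while the generator is trained to produce trajectories that the same reward scores highly.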

We vary this setup by combining each trajectory with an additional input: a natural-language specification of the goal. A single trajectory may have multiple natural-language goals, corresponding to multiple states and actions in the sequence: for example, “go down the hall”, “turn left”, “find the first doorway on your right”, and so on.

In this setting, the negative examples generated by the adversarial generator are trajectories with misaligned natural-language goals: the trajectory maps out a right turn, for instance, but the natural-language goal is “turn left”.

We alternate between using training examples to teach the agent the reward function and to update the agent’s policy. The reward function is trained on both trajectories and natural-language goals (NL goals), and its training data includes negative examples from the adversarial generator. For policy updates, the agent receives only the NL goals — and only from positive examples — and must predict the associated trajectories.
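The alternation can be illustrated with a deliberately tiny toy in which each "trajectory" is a single action and the reward is a lookup table. Everything here is a hypothetical simplification: the two-goal data set, the additive reward update, and the greedy policy step are assumptions chosen to keep the sketch short, not the update rules from the paper.

```python
from collections import defaultdict
from typing import Dict, Tuple

ACTIONS = ["left", "right"]
# Hypothetical expert data: (NL goal, demonstrated action) pairs.
EXPERT = [("turn left", "left"), ("turn right", "right")]


def train(iterations: int = 5, lr: float = 1.0):
    reward: Dict[Tuple[str, str], float] = defaultdict(float)
    # Start from a deliberately wrong policy so early samples are misaligned.
    policy = {"turn left": "right", "turn right": "left"}
    for _ in range(iterations):
        # Step 1: reward update from positive and negative examples.
        for goal, action in EXPERT:
            reward[(goal, action)] += lr      # expert pair: raise reward
        for goal in policy:
            sampled = policy[goal]            # generator's trajectory for this goal
            reward[(goal, sampled)] -= lr     # sampled pair: lower reward
        # Step 2: policy update, conditioned only on the NL goal.
        for goal in policy:
            policy[goal] = max(ACTIONS, key=lambda a: reward[(goal, a)])
    return policy, dict(reward)


policy, reward = train()
```

After the first iteration the misaligned pairs (for example, a right turn labeled "turn left") have negative reward, so the policy step switches to the actions the expert demonstrated and stays there.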

In our experiments, this basic model offered little improvement over existing IRL models, so we added several features to improve its performance.

Data augmentation

First, using our expert-supplied trajectories, we trained a variational goal generator to predict NL goals on the basis of trajectories. That model includes a variational autoencoder, a neural network that produces a highly compressed vector representation of each NL goal. The compressed representation captures semantic information about the NL goal, but it loses information about the goal’s phrasing. Re-expanding such a representation produces a new NL goal that is phrased differently from the original but preserves the semantic content.

We use these trajectories with rephrased NL goals as new positive training examples. This augments our supply of expert training examples, which tend to be scarce, increasing robustness through lexical variance.

When a negative example from the adversarial generator, whose NL goal is inaccurate, passes through the variational goal generator, the result is a relabeled trajectory with a correct NL goal. These relabeled trajectories are added to our supply of positive examples as well.
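The relabeling step can be sketched as follows. The rule-based `goal_generator` is a hypothetical stand-in for the learned variational goal generator, and representing a trajectory as a tuple of action names is likewise an assumption made for brevity.

```python
from typing import List, Tuple

# (trajectory as a sequence of actions, NL goal)
Example = Tuple[Tuple[str, ...], str]


def goal_generator(trajectory: Tuple[str, ...]) -> str:
    """Hypothetical stand-in for the learned model: describe the last turn."""
    last_turn = next(
        (a for a in reversed(trajectory) if a in ("left", "right")), None
    )
    return f"turn {last_turn}" if last_turn else "go down the hall"


def relabel(negatives: List[Example]) -> List[Example]:
    """Keep each sampled trajectory but replace its misaligned goal."""
    return [(traj, goal_generator(traj)) for traj, _ in negatives]


# A sampled trajectory that turns right although the goal said "turn left".
negatives = [(("forward", "right"), "turn left")]
positives = [(("forward", "left"), "turn left")]
positives += relabel(negatives)
```

The trajectory itself is unchanged; only its goal is rewritten, which turns a failed attempt at one goal into a valid demonstration of another.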

We use our added positive examples to both train the reward function and update the agent’s policy. Not only does this improve the accuracy of the reward function, but it also increases the agent’s ability to generalize to new settings, since it has more varied encounters with the environment to learn from than it would otherwise.

Finally, we explore an additional method for bootstrapping an agent that is asked to perform tasks in an unfamiliar environment. First, the agent learns a new, goal-agnostic policy from existing training data. This policy encodes general principles, such as not trying to move through closed doors. 

Then we use that general policy to generate sample trajectories in the new environment; these pass through the variational goal generator, which assigns them NL goals. We treat these newly labeled trajectories as expert examples in the new setting, and we use them to update the reward function. 
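A toy version of this self-supervised loop might look like the sketch below. The grid world, the single blocked cell standing in for a closed door, the random goal-agnostic policy, and the string-joining `goal_generator` are all hypothetical placeholders for the learned components described above.

```python
import random
from typing import List, Set, Tuple

Cell = Tuple[int, int]
MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def goal_agnostic_policy(cell: Cell, blocked: Set[Cell], rng: random.Random) -> str:
    """Pick any move that does not enter a blocked cell (e.g. a closed door)."""
    legal = [m for m, (dx, dy) in MOVES.items()
             if (cell[0] + dx, cell[1] + dy) not in blocked]
    return rng.choice(legal)


def rollout(start: Cell, blocked: Set[Cell], steps: int,
            rng: random.Random) -> List[str]:
    """Sample one trajectory in the new environment with the general policy."""
    cell, actions = start, []
    for _ in range(steps):
        move = goal_agnostic_policy(cell, blocked, rng)
        dx, dy = MOVES[move]
        cell = (cell[0] + dx, cell[1] + dy)
        actions.append(move)
    return actions


def goal_generator(actions: List[str]) -> str:
    """Hypothetical stand-in for the learned variational goal generator."""
    return "go " + ", then ".join(actions)


rng = random.Random(0)
blocked = {(1, 0)}        # hypothetical closed door just east of the start
pseudo_expert = []
for _ in range(3):
    actions = rollout((0, 0), blocked, steps=2, rng=rng)
    # Label the sampled trajectory and treat the pair as an expert example.
    pseudo_expert.append((actions, goal_generator(actions)))
```

The resulting `pseudo_expert` pairs play the role of expert demonstrations for the new environment, without any human having to provide them.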

This added layer of training is what increased our agents’ success rates by 36% when they were deployed to new environments. We think this kind of adaptability will be crucial to household robots of the future, which will need to adjust to new environments — when a family moves or goes on vacation, for instance — without being retrained from scratch.
