Amazon releases dataset for complex, multilingual question answering

Dataset that requires question-answering models to look up multiple facts and perform comparisons bridges a significant gap in the field.

Question answering (QA) is the machine learning task of predicting answers to questions. For example, given the question “Where was Natalie Portman born?”, a QA model could predict the answer “Jerusalem”, using articles from the web, facts in a knowledge graph, or knowledge stored within the model. This is an example of a simple question, since it can be answered using a single fact or a single source on the web, such as the Natalie Portman Wikipedia page.

Not all questions are simple. For example, the question “What movie had a higher budget, Titanic or Men in Black II?” is a complex question because it requires looking up two different facts (Titanic | budget | 200 million USD and Men in Black II | budget | 140 million USD), followed by a calculation to compare values (200 million USD > 140 million USD).
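The two-lookups-then-compare structure of this example can be sketched in a few lines of Python. This is only an illustration: the `FACTS` dictionary and the `lookup` and `higher_budget` helpers are hypothetical names, and the toy knowledge graph holds nothing but the two budget facts quoted above.

```python
# Toy knowledge graph keyed by (subject, predicate); values are in USD.
# It contains only the two facts quoted in the text above.
FACTS = {
    ("Titanic", "budget"): 200_000_000,
    ("Men in Black II", "budget"): 140_000_000,
}


def lookup(subject: str, predicate: str) -> int:
    """Single-fact lookup: all that a simple question requires."""
    return FACTS[(subject, predicate)]


def higher_budget(movie_a: str, movie_b: str) -> str:
    """Complex question: two fact lookups followed by a comparison."""
    budget_a = lookup(movie_a, "budget")
    budget_b = lookup(movie_b, "budget")
    return movie_a if budget_a > budget_b else movie_b


print(higher_budget("Titanic", "Men in Black II"))  # Titanic
```

A simple question stops after one call to `lookup`; the complex question needs both lookups plus the comparison step.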

While many state-of-the-art question-answering models perform well on simple questions, complex questions remain an open problem. One reason is a lack of datasets. Most existing QA datasets are large but simple, complex but small, or large and complex but synthetically generated and therefore less natural. A majority of QA datasets are also only in English.

To help bridge this gap, we have publicly released a new dataset: Mintaka, which we describe in a paper we're presenting at this year’s International Conference on Computational Linguistics (COLING).

Mintaka is a large, complex, natural, and multilingual question-answer dataset with 20,000 questions collected in English and professionally translated into eight languages: Arabic, French, German, Hindi, Italian, Japanese, Portuguese, and Spanish. We also ground Mintaka in the Wikidata knowledge graph by linking entities in the question text and answer text to Wikidata IDs.

The interface that Amazon Mechanical Turk workers used to annotate and link entities for the Mintaka dataset.

Building the dataset

We define a complex question as any question that requires an operation beyond a single fact lookup. We built Mintaka using the crowdsourcing platform Amazon Mechanical Turk (MTurk). First, we designed an MTurk task to elicit complex but natural questions. We asked workers to write question-answer pairs with one of the following complexity types:

  • Count (e.g., Q: How many astronauts have been elected to Congress? A: 4)
  • Comparative (e.g., Q: Is Mont Blanc taller than Mount Rainier? A: Yes)
  • Superlative (e.g., Q: Who was the youngest tribute in the Hunger Games? A: Rue)
  • Ordinal (e.g., Q: Who was the last Ptolemaic ruler of Egypt? A: Cleopatra)
  • Multi-hop (e.g., Q: Who was the quarterback of the team that won Super Bowl 50? A: Peyton Manning)
  • Intersection (e.g., Q: Which movie was directed by Denis Villeneuve and stars Timothee Chalamet? A: Dune)
  • Difference (e.g., Q: Which Mario Kart game did Yoshi not appear in? A: Mario Kart Live: Home Circuit)
  • Yes/No (e.g., Q: Has Lady Gaga ever made a song with Ariana Grande? A: Yes)
  • Generic (e.g., Q: Where was Michael Phelps born? A: Baltimore, Maryland)
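Several of these complexity types correspond to simple operations over sets of facts. The sketch below illustrates count, intersection, and difference with Python sets. The filmographies are a small, deliberately incomplete toy sample chosen for illustration, and the dictionary and function names are hypothetical; none of this reflects how Mintaka stores or answers questions.

```python
# Toy, deliberately incomplete filmographies (illustrative sample only).
DIRECTED_BY = {
    "Denis Villeneuve": {"Dune", "Arrival", "Blade Runner 2049"},
}
STARS = {
    "Timothee Chalamet": {"Dune", "Little Women", "Call Me by Your Name"},
}


def count_directed(director: str) -> int:
    """Count: how many movies in the toy sample did the director make?"""
    return len(DIRECTED_BY[director])


def directed_and_starring(director: str, actor: str) -> set:
    """Intersection: movies by this director that also star this actor."""
    return DIRECTED_BY[director] & STARS[actor]


def directed_not_starring(director: str, actor: str) -> set:
    """Difference: movies by this director that do not star this actor."""
    return DIRECTED_BY[director] - STARS[actor]


print(count_directed("Denis Villeneuve"))  # 3
print(sorted(directed_and_starring("Denis Villeneuve", "Timothee Chalamet")))
# ['Dune']
print(sorted(directed_not_starring("Denis Villeneuve", "Timothee Chalamet")))
# ['Arrival', 'Blade Runner 2049']
```

Each function goes beyond a single fact lookup, which is what makes the corresponding questions complex under the definition above.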

Question-answer pairs were limited to eight categories: movies, music, sports, books, geography, politics, video games, and history. They were collected as free text, with no restrictions on what sources could be used.

Next, we created an entity-linking task in which workers were shown question-answer pairs from the previous task and asked to identify or verify the entities in the question or answer and to provide supporting evidence from Wikipedia entries. For example, given the question “How many Oscars did Argo win?”, a worker could identify the film Argo as an entity and link to its Wikidata URL.
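One way to picture the output of this linking step is as a record that ties a span of the question text to a Wikidata ID. The sketch below is an assumption made for illustration, not Mintaka's actual schema: the `LinkedEntity` class, its field names, and the `link_mention` helper are hypothetical, and `Q_PLACEHOLDER` stands in for the film's real Wikidata ID, which is not given in this article.

```python
from dataclasses import dataclass

# Wikidata entity pages live under this prefix, followed by the entity ID.
WIKIDATA_URL_PREFIX = "https://www.wikidata.org/wiki/"


@dataclass(frozen=True)
class LinkedEntity:
    """Hypothetical record for one linked entity mention."""
    mention: str      # the text as it appears in the question or answer
    start: int        # character offset where the mention begins
    end: int          # exclusive character offset where the mention ends
    wikidata_id: str  # entity ID (a placeholder in the example below)

    @property
    def url(self) -> str:
        return WIKIDATA_URL_PREFIX + self.wikidata_id


def link_mention(text: str, mention: str, wikidata_id: str) -> LinkedEntity:
    """Locate the first occurrence of `mention` in `text` and attach an ID."""
    start = text.index(mention)  # raises ValueError if the mention is absent
    return LinkedEntity(mention, start, start + len(mention), wikidata_id)


question = "How many Oscars did Argo win?"
argo = link_mention(question, "Argo", "Q_PLACEHOLDER")  # placeholder, not the real ID
print(question[argo.start:argo.end])  # Argo
print(argo.url)  # https://www.wikidata.org/wiki/Q_PLACEHOLDER
```

Storing character offsets alongside the ID keeps the link unambiguous when the same string could refer to more than one entity.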

Examples of Mintaka questions are shown below:

Q: Which Studio Ghibli movie scored the lowest on Rotten Tomatoes?
A: Earwig and the Witch

Q: When Franklin D. Roosevelt was first elected, how long had it been since someone in his party won the presidential election?
A: 16 years

Q: Which member of the Red Hot Chili Peppers appeared in Point Break?
A: Anthony Kiedis

Results

A box plot showing the quartiles, median, and mean (black diamond) of the naturalness ranks for all five datasets (Mintaka and the four comparison datasets), from 1 (least natural) to 5 (most natural).

To see how Mintaka compares to previous QA datasets in terms of naturalness, we ran an evaluation on MTurk with four comparison datasets: KQA Pro, ComplexWebQuestions (CWQ), DROP, and ComplexQuestions (CQ). Workers were shown five questions, one from Mintaka and one from each comparison dataset, and asked to rank them from 1 (least natural) to 5 (most natural). On average, Mintaka ranked higher in naturalness than the other datasets. This indicates that Mintaka questions are perceived as more natural than automatically generated or passage-constrained questions.

Results of English baseline models on Mintaka.

We also evaluated eight baseline QA models trained using Mintaka. The best-performing baseline was the language model T5 for Closed Book QA, which scored 38% hits@1. The baselines show that Mintaka is a challenging dataset, and there is ample room for improving model design and training procedures.
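For readers unfamiliar with the metric, hits@1 is the fraction of questions for which a model's top-ranked prediction is correct. The sketch below computes hits@k under the assumption that correctness means an exact string match with a single gold answer; the article does not specify the matching rules used in the evaluation. The gold answers are the three example questions shown earlier, and the model predictions are invented purely for illustration.

```python
from typing import Sequence


def hits_at_k(
    ranked_predictions: Sequence[Sequence[str]],
    gold_answers: Sequence[str],
    k: int = 1,
) -> float:
    """Fraction of questions whose gold answer is among the top-k predictions.

    Assumes exact string matching, which is a simplification.
    """
    if len(ranked_predictions) != len(gold_answers):
        raise ValueError("need one ranked prediction list per gold answer")
    if not gold_answers:
        raise ValueError("cannot compute hits@k on an empty evaluation set")
    hits = sum(
        1
        for preds, gold in zip(ranked_predictions, gold_answers)
        if gold in preds[:k]
    )
    return hits / len(gold_answers)


# Gold answers from the three example questions in this article.
gold = ["Earwig and the Witch", "16 years", "Anthony Kiedis"]

# Invented predictions, ranked best-first, for illustration only.
predictions = [
    ["Earwig and the Witch", "Tales from Earthsea"],
    ["12 years", "16 years"],
    ["Flea", "Anthony Kiedis"],
]

print(f"hits@1 = {hits_at_k(predictions, gold, k=1):.2f}")  # hits@1 = 0.33
print(f"hits@2 = {hits_at_k(predictions, gold, k=2):.2f}")  # hits@2 = 1.00
```

Under this definition, a score of 38% hits@1 means the top prediction was correct for roughly 38 of every 100 questions.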

Mintaka bridges a significant gap in QA datasets by being large-scale, complex, naturally elicited, and multilingual. With the release of Mintaka, we hope to encourage researchers to continue pushing question-answering models to handle more-complex questions in more languages.

Related content

US, NY, New York
Are you passionate about solving big problems from ground-up? Do you enjoy building new state-of-the-art products at internet scale? Come lead the innovation in this startup team, vertical ad products. This is a green field problem without a known answer or a pattern to follow. We have ambitious vision to simplify full funnel advertising solutions, at scale, with specialized agentic AI-powered models and diversify the demand to strategic verticals including finserv, autos, locals.. etc. We are seeking an experienced Applied Scientist to drive innovation in our Ads Foundational Model. In this individual contributor role, you will apply advanced machine learning techniques to improve advertiser performance and customer experience. Key job responsibilities As an Applied Scientist on this team, you will: 1. Develop and drive the science strategy for Ads Foundational Model (Ads-FM), aligning it with the program's objectives and overall business goals. 2. Identify high-impact opportunities within Ads-FM program and lead the ideation, planning, and execution of science initiatives to address them. 3. Build and deploy machine learning models using computer vision, natural language processing, and deep learning to evaluate and enhance ad effectiveness. 4. Develop algorithms that extract meaningful signals from image, video, and audio content to predict and improve customer engagement 5. Leverage Amazon's extensive data repository to create predictive models that generate actionable recommendations for more compelling ad creative 6. Collaborate with business leaders and cross-functional teams to implement ML-powered solutions 7. Contribute to the ML roadmap for the Ads-FM program through innovation and research.
IN, KA, Bangalore
Amazon’s Last Mile Team is looking for a passionate individual with strong optimization and analytical skills to join its Last Mile Science team in the endeavor of designing and improving the most complex planning of delivery network in the world. Last Mile builds global solutions that enable Amazon to attract an elastic supply of drivers, companies, and assets needed to deliver Amazon's and other shippers' volumes at the lowest cost and with the best customer delivery experience. Last Mile Science team owns the core decision models in the space of jurisdiction planning, delivery channel and modes network design, capacity planning for on the road and at delivery stations, routing inputs estimation and optimization. Our research has direct impact on customer experience, driver and station associate experience, Delivery Service Partner (DSP)’s success and the sustainable growth of Amazon. Optimizing the last mile delivery requires deep understanding of transportation, supply chain management, pricing strategies and forecasting. Only through innovative and strategic thinking, we will make the right capital investments in technology, assets and infrastructures that allows for long-term success. Our team members have an opportunity to be on the forefront of supply chain thought leadership by working on some of the most difficult problems in the industry with some of the best product managers, scientists, and software engineers in the industry. Key job responsibilities Candidates will be responsible for developing solutions to better manage and optimize delivery capacity in the last mile network. The successful candidate should have solid research experience in one or more technical areas of Operations Research or Machine Learning. These positions will focus on identifying and analyzing opportunities to improve existing algorithms and also on optimizing the system policies across the management of external delivery service providers and internal planning strategies. 
They require superior logical thinkers who are able to quickly approach large ambiguous problems, turn high-level business requirements into mathematical models, identify the right solution approach, and contribute to the software development for production systems. To support their proposals, candidates should be able to independently mine and analyze data, and be able to use any necessary programming and statistical analysis software to do so. Successful candidates must thrive in fast-paced environments, which encourage collaborative and creative problem solving, be able to measure and estimate risks, constructively critique peer research, and align research focuses with the Amazon's strategic needs. As a senior scientist, you will also help coach/mentor junior scientists in the team.
US, WA, Seattle
This role will contribute to developing the Economics and Science products and services in the Fee domain, with specialization in supply chain systems and fees. Through the lens of economics, you will develop causal links for how Amazon, Sellers and Customers interact. You will be a key and senior scientist, advising Amazon leaders how to price our services. You will work on developing frameworks and scaleable, repeatable models supporting optimal pricing and policy in the two-sided marketplace that is central to Amazon's business. The pricing for Amazon services is complex. You will partner with science and technology teams across Amazon including Advertising, Supply Chain, Operations, Prime, Consumer Pricing, and Finance. We are looking for an experienced Principal Economist to improve our understanding of seller Economics, enhance our ability to estimate the causal impact of fees, and work with partner teams to design pricing policy changes. In this role, you will provide guidance to scientists to develop econometric models to influence our fee pricing worldwide. You will lead the development of causal models to help isolate the impact of fee and policy changes from other business actions, using experiments when possible, or observational data when not. Key job responsibilities The ideal candidate will have extensive Economics knowledge, demonstrated strength in practical and policy relevant structural econometrics, strong collaboration skills, proven ability to lead highly ambiguous and large projects, and a drive to deliver results. They will work closely with Economists, Data / Applied Scientists, Strategy Analysts, Data Engineers, and Product leads to integrate economic insights into policy and systems production. Familiarity with systems and services that constitute seller supply chains is a plus but not required. About the team The Stores Economics and Sciences team is a central science team that supports Amazon's Retail and Supply Chain leadership. 
We tackle some of Amazon's most challenging economics and machine learning problems, where our mandate is to impact the business on massive scale.
US, CA, Pasadena
The Amazon Center for Quantum Computing in Pasadena, CA, is looking to hire an Applied Scientist specializing in the design of microwave components for use in cryogenic environments. Working alongside other scientists and engineers, you will design and validate hardware performing microwave signal conditioning at cryogenic temperatures for Amazon quantum processors. Working effectively within a cross-functional team environment is critical. The ideal candidate will have a proven track record of hardware development from requirements development to validation. Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences, inspire us to never stop embracing our uniqueness. Mentorship and Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. 
Key job responsibilities Our scientists and engineers collaborate across diverse teams and projects to offer state of the art, cost effective solutions for the signal conditioning of Amazon quantum processor systems at cryogenic temperatures. You’ll bring a passion for innovation, collaboration, and mentoring to: Solve layered technical problems across our cryogenic signal chain. Develop requirements with key system stakeholders, including quantum device, test and measurement, hardware, and theory teams. Design, implement, test, deploy, and maintain innovative solutions that meet both performance and cost metrics. Research enabling technologies necessary for Amazon reach commercial viability in quantum computing . A day in the life As you research, design, and implement cryogenic microwave signal conditioning solutions, you will also: Participate in requirements, design, and test reviews. Work cross-functionally to help drive decisions using your unique technical background and skill set. Define and maintain standards for operational excellence. Work in a high-paced, startup-like environment where you are provided the resources to innovate quickly.
US, CA, Pasadena
The Amazon Center for Quantum Computing (CQC) team is looking for a passionate, talented, and inventive Research Engineer specializing in hardware design for cryogenic environments. The ideal candidate should have expertise in 3D CAD (SolidWorks), thermal and structural FEA (Ansys/COMSOL), hardware design for cryogenic applications, design for manufacturing, and mechanical engineering principles. The candidate must have demonstrated experience driving designs through full product development cycles (requirements, conceptual design, detailed design, manufacturing, integration, and testing). Candidates must also have a strong background in both cryogenic mechanical engineering theory and implementation. Working effectively within a cross-functional team environment is critical. Key job responsibilities The CQC collaborates across teams and projects to offer state-of-the-art, cost-effective solutions for scaling the signal delivery to quantum processor systems at cryogenic temperatures. Equally important is the ability to scale the thermal performance and improve EMI mitigation of the cryogenic environment. You will work on the following: - High density novel packaging solutions for quantum processor units - Cryogenic mechanical design for novel cryogenic signal conditioning sub-assemblies - Cryogenic mechanical design for signal delivery systems - Simulation-driven designs (shielding, filtering, etc.) to reduce sources of EMI within the qubit environment. 
- Own end-to-end product development through requirements, design reports, design reviews, assembly/testing documentation, and final delivery A day in the life As you design and implement cryogenic hardware solutions, from requirements definition to deployment, you will also: - Participate in requirements, design, and test reviews and communicate with internal stakeholders - Work cross-functionally to help drive decisions using your unique technical background and skill set - Refine and define standards and processes for operational excellence - Work in a high-paced, startup-like environment where you are provided the resources to innovate quickly About the team The Amazon Center for Quantum Computing (CQC) is a multi-disciplinary team of scientists, engineers, and technicians, on a mission to develop a fault-tolerant quantum computer. Inclusive Team Culture Here at Amazon, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Diverse Experiences Amazon values diverse experiences. Even if you do not meet all of the preferred qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. 
Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud. Export Control Requirement Due to applicable export control laws and regulations, candidates must be either a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum, or be able to obtain a US export license. If you are unsure if you meet these requirements, please apply and Amazon will review your application for eligibility.
IT, Turin
As an Applied Scientist in the Alexa AI team, you will spearhead the advancement and deployment of state-of-the-art ML/RAG systems that revolutionize how millions of customers interact with Alexa. You'll leverage your expertise in machine learning, natural language processing, and large language models to create reliable, scalable, high-performance products that set new standards in operational excellence. Working at the intersection of research and production, you'll translate latest AI innovations into customer-facing features that delight users daily. Your work will span the full ML lifecycle—from analyzing customer behavior patterns and building novel metrics for personal digital assistants, to deploying automated training pipelines and conducting rigorous A/B testing across diverse devices and endpoints. Collaborating closely with business, engineering, and science teams across Amazon, you'll lead high-visibility programs that automate workflows and deliver measurable customer impact. This role offers the unique opportunity to publish at top-tier conferences while seeing your innovations scale to one of the world's most popular voice assistants, serving millions of customers globally. Key job responsibilities As an Applied Scientist in the Alexa AI team: - You'll analyze and model customer behavior at scale, building novel metrics for personal digital assistants across diverse devices and endpoints. Your work will involve creating deep learning, policy-based learning, and machine learning algorithms that directly impact customer experiences, translating complex data patterns into actionable insights that drive product innovation. - Your technical leadership will extend to building and deploying automated model training and evaluation pipelines, implementing complex machine learning and deep learning algorithms, and conducting rigorous model and data analysis through online A/B testing. 
You'll research and implement novel approaches that push the boundaries of what's possible in conversational AI. - Beyond model development, you'll ensure operational excellence by taking ownership of production systems, including on-call responsibilities during peak and non-peak hours. Working alongside Software Development Engineers, you'll deploy fixes and handle high-severity issues, ensuring our ML systems maintain the reliability and performance that millions of Alexa customers depend on daily. A day in the life As an Applied Scientist in the Alexa AI team, your day will involve collaborating with talented engineers and scientists to build scalable solutions for our conversational assistant. You'll dive into data analysis, experiment with novel algorithms, and iterate on models based on real-time user feedback. Working in a fast-paced, ambiguous environment, you'll tackle complex technical challenges—from debugging production issues to presenting research findings to stakeholders. Your self-motivated approach will drive you to swiftly deliver impactful solutions while maintaining the high standards that define our mission to revolutionize user experiences for millions of customers. About the team The Alexa AI team develops the intelligence behind one of the world's most popular voice assistants, serving millions of customers globally. We're a diverse group of scientists, engineers, and researchers united by our mission to make Alexa more natural, helpful, and delightful. Our culture thrives on innovation, collaboration, and customer obsession. We tackle some of the most challenging problems in conversational AI—from natural language understanding to personalization at scale. Here, you'll work alongside world-class talent, publish at top-tier conferences, and see your innovations impact customers daily. We move fast, think big, and celebrate both successes and learnings.
IT, Turin
As an Applied Scientist in the Alexa AI team, you will spearhead the advancement and deployment of state-of-the-art ML/RAG systems that revolutionize how millions of customers interact with Alexa. You'll leverage your expertise in machine learning, natural language processing, and large language models to create reliable, scalable, high-performance products that set new standards in operational excellence. Working at the intersection of research and production, you'll translate latest AI innovations into customer-facing features that delight users daily. Your work will span the full ML lifecycle—from analyzing customer behavior patterns and building novel metrics for personal digital assistants, to deploying automated training pipelines and conducting rigorous A/B testing across diverse devices and endpoints. Collaborating closely with business, engineering, and science teams across Amazon, you'll lead high-visibility programs that automate workflows and deliver measurable customer impact. This role offers the unique opportunity to publish at top-tier conferences while seeing your innovations scale to one of the world's most popular voice assistants, serving millions of customers globally. Key job responsibilities As an Applied Scientist in the Alexa AI team: - You'll analyze and model customer behavior at scale, building novel metrics for personal digital assistants across diverse devices and endpoints. Your work will involve creating deep learning, policy-based learning, and machine learning algorithms that directly impact customer experiences, translating complex data patterns into actionable insights that drive product innovation. - Your technical leadership will extend to building and deploying automated model training and evaluation pipelines, implementing complex machine learning and deep learning algorithms, and conducting rigorous model and data analysis through online A/B testing. 
You'll research and implement novel approaches that push the boundaries of what's possible in conversational AI. - Beyond model development, you'll ensure operational excellence by taking ownership of production systems, including on-call responsibilities during peak and non-peak hours. Working alongside Software Development Engineers, you'll deploy fixes and handle high-severity issues, ensuring our ML systems maintain the reliability and performance that millions of Alexa customers depend on daily. A day in the life As an Applied Scientist in the Alexa AI team, your day will involve collaborating with talented engineers and scientists to build scalable solutions for our conversational assistant. You'll dive into data analysis, experiment with novel algorithms, and iterate on models based on real-time user feedback. Working in a fast-paced, ambiguous environment, you'll tackle complex technical challenges—from debugging production issues to presenting research findings to stakeholders. Your self-motivated approach will drive you to swiftly deliver impactful solutions while maintaining the high standards that define our mission to revolutionize user experiences for millions of customers. About the team The Alexa AI team develops the intelligence behind one of the world's most popular voice assistants, serving millions of customers globally. We're a diverse group of scientists, engineers, and researchers united by our mission to make Alexa more natural, helpful, and delightful. Our culture thrives on innovation, collaboration, and customer obsession. We tackle some of the most challenging problems in conversational AI—from natural language understanding to personalization at scale. Here, you'll work alongside world-class talent, publish at top-tier conferences, and see your innovations impact customers daily. We move fast, think big, and celebrate both successes and learnings.
IT, Turin
As an Applied Scientist in the Alexa AI team, you will spearhead the advancement and deployment of state-of-the-art ML/RAG systems that revolutionize how millions of customers interact with Alexa. You'll leverage your expertise in machine learning, natural language processing, and large language models to create reliable, scalable, high-performance products that set new standards in operational excellence. Working at the intersection of research and production, you'll translate latest AI innovations into customer-facing features that delight users daily. Your work will span the full ML lifecycle—from analyzing customer behavior patterns and building novel metrics for personal digital assistants, to deploying automated training pipelines and conducting rigorous A/B testing across diverse devices and endpoints. Collaborating closely with business, engineering, and science teams across Amazon, you'll lead high-visibility programs that automate workflows and deliver measurable customer impact. This role offers the unique opportunity to publish at top-tier conferences while seeing your innovations scale to one of the world's most popular voice assistants, serving millions of customers globally. Key job responsibilities As an Applied Scientist in the Alexa AI team: - You'll analyze and model customer behavior at scale, building novel metrics for personal digital assistants across diverse devices and endpoints. Your work will involve creating deep learning, policy-based learning, and machine learning algorithms that directly impact customer experiences, translating complex data patterns into actionable insights that drive product innovation. - Your technical leadership will extend to building and deploying automated model training and evaluation pipelines, implementing complex machine learning and deep learning algorithms, and conducting rigorous model and data analysis through online A/B testing. 
You'll research and implement novel approaches that push the boundaries of what's possible in conversational AI. - Beyond model development, you'll ensure operational excellence by taking ownership of production systems, including on-call responsibilities during peak and non-peak hours. Working alongside Software Development Engineers, you'll deploy fixes and handle high-severity issues, ensuring our ML systems maintain the reliability and performance that millions of Alexa customers depend on daily. A day in the life As an Applied Scientist in the Alexa AI team, your day will involve collaborating with talented engineers and scientists to build scalable solutions for our conversational assistant. You'll dive into data analysis, experiment with novel algorithms, and iterate on models based on real-time user feedback. Working in a fast-paced, ambiguous environment, you'll tackle complex technical challenges—from debugging production issues to presenting research findings to stakeholders. Your self-motivated approach will drive you to swiftly deliver impactful solutions while maintaining the high standards that define our mission to revolutionize user experiences for millions of customers. About the team The Alexa AI team develops the intelligence behind one of the world's most popular voice assistants, serving millions of customers globally. We're a diverse group of scientists, engineers, and researchers united by our mission to make Alexa more natural, helpful, and delightful. Our culture thrives on innovation, collaboration, and customer obsession. We tackle some of the most challenging problems in conversational AI—from natural language understanding to personalization at scale. Here, you'll work alongside world-class talent, publish at top-tier conferences, and see your innovations impact customers daily. We move fast, think big, and celebrate both successes and learnings.