Applying PECOS to product retrieval and text autocompletion

Two KDD papers demonstrate the power and flexibility of Amazon’s framework for “extreme multilabel ranking”.

In April, our research team at Amazon open-sourced our PECOS framework for extreme multilabel ranking (XMR), which is the general problem of classifying an input when you have an enormous space of candidate classes. PECOS presents a way to solve XMR problems that is both accurate and efficient enough for real-time use.

At this year’s Knowledge Discovery and Data Mining Conference (KDD), members of our team presented two papers that demonstrate both the power and flexibility of the PECOS framework.

Retrieved products.png
A comparison of the top ten products returned by the PECOS-based product retrieval system and two predecessors for the query "rose of jericho plant". Products outlined in green were purchased by at least one customer performing that search; products outlined in red were not purchased.

One applies PECOS to the problem of product retrieval, a use case very familiar to customers at the Amazon Store. The other is a less obvious application: session-aware query autocompletion, in which an autocompletion model — which predicts what a customer is going to type — bases its predictions on the customer’s last few text inputs, as well as on statistics for customers at large.

In both cases, we tailor PECOS’s default models to the tasks at hand and, in comparisons with several strong benchmarks, show that PECOS offers the best combination of accuracy and speed.

The PECOS model

The classic case of XMR would be the classification of a document according to a handful of topics, where there are hundreds of thousands of topics to choose from.

We generalize the idea, however, to any problem that, for a given input, finds a few matches from among a large set of candidates. In product retrieval, for instance, the names of products would be “labels” we apply to a query: “Echo Dot”, “Echo Studio”, and other such names would be labels applied to the query “Smart speaker”.

PECOS adopts a three-step solution to the XMR problem. First is the indexing step, in which PECOS groups labels according to topic. Next is the matching step, which matches an input to a topic (which significantly shrinks the space of candidates). Last comes the ranking step, which reranks the labels in the matched topic, based on features of the input.

PECOS-framework.png
The three-stage PECOS model.
Credit: Stacy Reilly

PECOS comes with default models for each of these steps, which we described in a blog post about the April code release. But users can modify those models as necessary, or create their own and integrate them into the PECOS framework.

Product retrieval

For the product retrieval problem, we adapt one of the matching models that comes standard with PECOS: XR-Linear. Details are in the earlier blog post (and in our KDD paper), but XR-Linear reduces computation time by using B-ary trees — a generalization of binary trees to trees whose nodes have B descendants each. The top node of the tree represents the full label set; the next layer down represents B partitions of the full set; the next layer represents B partitions of each partition in the previous layer, and so on.

Connections between nodes of the trees have associated weights, which are multiplied by features of the input query to produce a probability score. Matching is the process of tracing the most-probable routes through the tree and retrieving the topics at the most-probable leaf nodes. To make this process efficient, we use beam search: i.e., at each layer, we limit the number of nodes whose descendants we consider, a limit known as the beam width.

Beam search.gif
An example of linear ranking with a beam width of two. At each level of the tree, two nodes (green) are selected for further exploration. Each of their descendant nodes is evaluated (orange), and two of those are selected for further exploration.
Credit: Giana Bucchino

In our KDD paper on product retrieval, we vary this general model through weight pruning; i.e., we delete edges whose weights fall below some threshold, reducing the number of options the matching algorithm has to consider as it explores the tree. In the paper, we report experiments with several different weight thresholds and beam widths.

We also experimented with several different sets of input features. One was n-grams of query words. For instance, the query “Echo with screen” would produce the 1-grams “Echo”, “with”, “screen”, the 2-grams “Echo with” and “with screen”, and the 3-gram “Echo with screen”. This sensitizes the matching model to phrases that may carry more information than their constituent words.

Similarly, we used n-grams of input characters. If we use the token “#” to denote the end of a word, the same query would produce the character trigrams “Ech”, “cho”, “ho#”, “with”, “ith”, and so on. Character n-grams helps the model deal with typos or word variants.

Finally, we also used TF-IDF (term frequency–inverse document frequency) features, which normalize the frequency of a word in a given text by its frequency across all texts (which filters out common words like “the”). We found that our model performed best when we used all three sets of features.

As benchmarks in our experiments, we used the state-of-the-art linear model and the state-of-the-art neural model and found that our linear approach outperformed both, with a recall@10 — that is, the number of correct labels among the top ten — that was more than double the neural model’s and almost quadruple the linear model’s. At the same time, our model took about one-sixth as long to train as the neural model.

We also found that our model took an average of only 1.25 milliseconds to complete each query, which is fast enough for deployment in a real-time system like the Amazon Store.

Session-aware query autocompletion

Session-aware query autocompletion uses the history of a customer’s recent queries — not just general statistics for the customer base — to complete new queries. The added contextual information means that it can often complete queries accurately after the customer has typed only one or two letters.

To frame this task as an XMR problem, we consider the case in which the input is a combination of the customer’s previous query and the beginning — perhaps just a few characters — of a new query. The labels are queries that an information retrieval system has seen before.

In this case, PECOS didn’t work well out of the box, and we deduced that the problem was the indexing scheme used to cluster labels by topic. PECOS’s default indexing model embeds inputs, or converts them into vectors, then clusters labels according to proximity in the vector space.

We suspected that this was ineffective when the inputs to the autocompletion model were partial phrases — fragments of words that a user is typing in. So we experimented with an indexing model that instead used data structures known as tries(a variation on “tree” that borrows part of the word “retrieve”).

A trie is a tree whose nodes represent strings of letters, where each descendant node extends its parent node’s string by one letter. So if the top node of the trie represents the letter “P”, its descendants might represent the strings “PA” and “PE”; their descendants might represent the strings “PAN”, “PAD”, “PEN”, “PET”, and so on. With a trie, all the nodes that descend from a common parent constitute a cluster.

Clustering using tries dramatically improved the performance of our model, but it also slowed it down: the strings encoded by tries can get very long, which means that tracing a path through the trie can get very time consuming.

So we adopted a hybrid clustering technique that combines tries with embeddings. The top few layers of the hybrid tree constitute a trie, but the nodes that descend from the lowest of these layers represent strings whose embeddings are near that of the parent node in the vector space.

Tree, Trie, Trie-tree hybrid.cloned.png
Three different ways of clustering the eight strings "a", "ab", "abc", "abd", "abfgh", "abfgi", "bcde", and "bcdf". At left is a conventional tree; in the center is a trie; and at right is a trie-tree hybrid.

To ensure that the embeddings in the hybrid tree preserve some of the sequential information encoded by tries, we varied the standard TF-IDF approach. First we applied it at the character level, rather than at the word level, so that it measured the relative frequency of particular strings of letters, not just words.

Then we weighted the frequency statistics, overcounting character strings that occurred at the beginning of words, relative to those that occurred later. This forced the embedding to mimic the string extension logic of the tries.

Once we’d adopted this indexing scheme, we found that the PECOS model outperformed both the state-of-the-art linear model and the state-of-the art neural model, when measured by both mean reciprocal rank and the BLEU metric used to evaluate machine translation models.

The use of tries still came with a performance penalty: our model took significantly longer to process inputs than the earlier linear model did. But its execution time was still below the threshold for real-time application and significantly lower than the neural model’s.

Related content

US, MA, N.reading
Amazon Industrial Robotics is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. Key job responsibilities Design and deploy end-to-end teleoperation pipelines integrating VR/AR headsets and haptics interfaces with robotic hardware Implement force-feedback and tactile sensing algorithms to provide operators with a "sense of touch," improving performance in contact-rich manipulation tasks Collaborate with ML teams to ensure teleoperation interfaces capture high-fidelity state-action pairs, including proprioception, visual, and force/torque data for model training Develop custom networking and streaming protocols to minimize operator-to-robot latency. Conduct user studies to evaluate ergonomics, cognitive load, and "telepresence" effectiveness to iterate on UI/UX designs.
US, TX, Austin
Amazon Security is looking for a talented and driven Applied Scientist II to spearhead Generative AI acceleration within the Secure Third Party Tools (S3T) organization. The S3T team has bold ambitions to re-imagine security products that serve Amazon's pace of innovation at our global scale. This role will focus on leveraging large language models and agentic AI to transform third-party security risk management, automate complex vendor assessments, streamline controllership processes, and dramatically reduce assessment cycle times. You will drive builder efficiency and deliver bar-raising security engagements across Amazon. Key job responsibilities Lead the research, design, and development of GenAI-powered solutions to enhance the security and governance of third-party tools across Amazon Develop and fine-tune large language models (LLMs) and other ML models tailored to security use cases, including risk detection, anomaly identification, and automated compliance Collaborate with cross-functional teams — including Security Engineers, Software Development Engineers, and Product Managers — to translate scientific innovations into scalable, production-ready systems Define and drive the GenAI roadmap for the S3T organization, influencing strategy and prioritization Conduct rigorous experimentation, evaluate model performance, and iterate rapidly to deliver measurable impact Stay current with the latest advancements in GenAI and applied ML research, and bring relevant innovations into Amazon's security ecosystem Mentor junior scientists and contribute to a culture of scientific excellence within the team About the team Security is central to maintaining customer trust and delivering delightful customer experiences. At Amazon, our Security organization is designed to drive bar-raising security engagements. Our vision is that Builders raise the Amazon security bar when they use our recommended tools and processes, with no overhead to their business. Diverse Experiences Amazon Security values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Why Amazon Security? At Amazon, security is central to maintaining customer trust and delivering delightful customer experiences. Our organization is responsible for creating and maintaining a high bar for security across all of Amazon’s products and services. We offer talented security professionals the chance to accelerate their careers with opportunities to build experience in a wide variety of areas including cloud, devices, retail, entertainment, healthcare, operations, and physical stores. Inclusive Team Culture In Amazon Security, it’s in our nature to learn and be curious. Ongoing DEI events and learning experiences inspire us to continue learning and to embrace our uniqueness. Addressing the toughest security challenges requires that we seek out and celebrate a diversity of ideas, perspectives, and voices. Training & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, training, and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why flexible work hours and arrangements are part of our culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve.
GB, London
We are looking for a passionate, talented, and inventive Data Scientist with a strong machine learning and analytics background to help build industry-leading language technology powering Rufus, our AI-driven search and shopping assistant, helping customers with their shopping tasks at every step of their shopping journey. This innovative role focuses on developing and optimizing large language model (LLM)-powered conversational experiences. The core emphasis is to get the best performance out of state-of-the-art LLMs via careful and methodical instruction design, contextual grounding, informed choices of MCP tools and agent/multi-agent systems, evaluation frameworks, and experimentation to systematically improve LLM quality, robustness, and customer impact. The work combines scientific rigor with product intuition to systematically raise the bar for conversational AI performance at Amazon scale. Our mission in conversational shopping is to make it easy for customers to find and discover the best products to meet their needs by helping with their product research, providing comparisons and recommendations, answering product questions, enabling shopping directly from images or videos, providing visual inspiration, and more. We do this by leveraging advanced analytics, Natural Language Processing (NLP), Machine Learning (ML), A/B testing, causal inference, and data-driven insights to continuously improve our systems. Key job responsibilities As a Data Scientist on our team, you will develop and maintain LLM instructions iterations and evaluation frameworks, including automated eval pipelines, LLM-as-a-judge methodologies, rubric design, and dataset curation to measure nuanced aspects of response quality. You will partner with the wider org to experiment with techniques such as retrieval augmentation, context enrichment, prompt decomposition, and model fine-tuning or post-training strategies, if and when applicable. You will leverage petabytes of data and identify opportunities to leverage machine learning models aimed at making conversational systems more performant. A day in the life You will: Perform hands-on analysis of large-scale multimodal interaction datasets to develop insights into how customers engage with conversational AI systems and how to improve response quality and customer experience. Use statistical methods, experimentation, and data-driven analysis to develop scalable approaches for measuring, evaluating, and optimizing large language model (LLM)-based shopping assistant systems, leveraging structured and unstructured contextual signals. Design and analyze A/B tests and experiments to evaluate new features and model improvements, ensuring statistical rigor and actionable insights. Develop metrics, dashboards, and reporting frameworks to monitor system performance, customer engagement, and business impact. Conduct deep-dive analyses to identify opportunities for improving conversational relevance, grounding, customer satisfaction, and downstream business impact. Collaborate with Applied Scientists and Engineers to translate analytical insights into production systems, working closely on model evaluation and deployment. Establish automated processes for large-scale data analysis, ETL pipelines, metric generation, and experimentation frameworks. Communicate results and insights to both technical and non-technical audiences, including through presentations, written reports, and data visualizations. About the team The Rufus Features Science team, based in London, works alongside ~150 engineers, designers and product managers, shaping the future of AI-driven shopping experiences at Amazon. The team works on every aspect of the Rufus AI, from making Rufus agentic, enabling customers to set price alerts or empower Rufus to act on their behalf and automatically purchase products when the price is right, to understanding multimodal user queries and generating answers that combine text, image, audio and video, including deep research reports that scour the web and the Amazon catalog to provide detailed and personalised shopping guidance. We utilize and advance state-of-art techniques in the fields of Natural Language Processing, gen AI, Information Retrieval, Machine/Deep Learning, and Data Mining. We validate our work by actively participating in the internal and external scientific communities.
US, NY, New York
The Sponsored Products and Brands (SPB) team at Amazon Ads is re-imagining the advertising landscape through state-of-the-art generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. The Off-Search team within Sponsored Products and Brands (SPB) is focused on building delightful ad experiences across various surfaces beyond Search on Amazon—such as product detail pages, the homepage, and store-in-store pages—to drive monetization. Our vision is to deliver highly personalized, context-aware advertising that adapts to individual shopper preferences, scales across diverse page types, remains relevant to seasonal and event-driven moments, and integrates seamlessly with organic recommendations such as new arrivals, basket-building content, and fast-delivery options. To execute this vision, we work in close partnership with Amazon Stores stakeholders to lead the expansion and growth of advertising across Amazon-owned and -operated pages beyond Search. We operate full stack—from backend ads-retail edge services, ads retrieval, and ad auctions to shopper-facing experiences—all designed to deliver meaningful value. Curious about our advertising solutions? Discover more about Sponsored Products and Sponsored Brands to see how we’re helping businesses grow on Amazon.com and beyond! Key job responsibilities This role will be pivotal in redesigning how ads contribute to a personalized, relevant, and inspirational shopping experience, with the customer value proposition at the forefront. Key responsibilities include, but are not limited to: - Contribute to the design and development of GenAI, deep learning, multi-objective optimization and/or reinforcement learning empowered solutions to transform ad retrieval, auctions, whole-page relevance, and/or bespoke shopping experiences. - Collaborate cross-functionally with other scientists, engineers, and product managers to bring scalable, production-ready science solutions to life. - Stay abreast of industry trends in GenAI, LLMs, and related disciplines, bringing fresh and innovative concepts, ideas, and prototypes to the organization. - Contribute to the enhancement of team’s scientific and technical rigor by identifying and implementing best-in-class algorithms, methodologies, and infrastructure that enable rapid experimentation and scaling. - Mentor and grow junior scientists and engineers, cultivating a high-performing, collaborative, and intellectually curious team. A day in the life As an Applied Scientist on the Sponsored Products and Brands Off-Search team, you will contribute to the development in Generative AI (GenAI) and Large Language Models (LLMs) to revolutionize our advertising flow, backend optimization, and frontend shopping experiences. This is a rare opportunity to redefine how ads are retrieved, allocated, and/or experienced—elevating them into personalized, contextually aware, and inspiring components of the customer journey. You will have the opportunity to fundamentally transform areas such as ad retrieval, ad allocation, whole-page relevance, and differentiated recommendations through the lens of GenAI. By building novel generative models grounded in both Amazon’s rich data and the world’s collective knowledge, your work will shape how customers engage with ads, discover products, and make purchasing decisions. If you are passionate about applying frontier AI to real-world problems with massive scale and impact, this is your opportunity to define the next chapter of advertising science. About the team The Off-Search team within Sponsored Products and Brands (SPB) is focused on building delightful ad experiences across various surfaces beyond Search on Amazon—such as product detail pages, the homepage, and store-in-store pages—to drive monetization. Our vision is to deliver highly personalized, context-aware advertising that adapts to individual shopper preferences, scales across diverse page types, remains relevant to seasonal and event-driven moments, and integrates seamlessly with organic recommendations such as new arrivals, basket-building content, and fast-delivery options. To execute this vision, we work in close partnership with Amazon Stores stakeholders to lead the expansion and growth of advertising across Amazon-owned and -operated pages beyond Search. We operate full stack—from backend ads-retail edge services, ads retrieval, and ad auctions to shopper-facing experiences—all designed to deliver meaningful value. Curious about our advertising solutions? Discover more about Sponsored Products and Sponsored Brands to see how we’re helping businesses grow on Amazon.com and beyond!
CN, 44, Shenzhen
职位:Applied scientist 应用科学家实习生 毕业时间:2026年10月 - 2027年7月之间毕业的应届毕业生 · 入职日期:2026年6月及之前 · 实习时间:保证一周实习4-5天全职实习,至少持续3个月 · 工作地点:深圳福田区 投递须知: 1 填写简历申请时,请把必填和非必填项都填写完整。提交简历之后就无法修改了哦! 2 学校的英文全称请准确填写。中英文对应表请查这里(无法浏览请登录后浏览)https://docs.qq.com/sheet/DVmdaa1BCV0RBbnlR?tab=BB08J2 关于职位 Amazon Device &Services Asia团队正在寻找一位充满好奇心、善于沟通的应用科学家实习生,成为连接前沿AI研究与现实世界认知的桥梁。这是一个独特的角色——既需要动手参与机器学习项目,又要接受将复杂AI概念转化为通俗易懂内容的创意挑战。D&S Asia是亚马逊设备与服务业务在亚洲的支柱组织,自2009年支持Kindle制造起步,现已发展为横跨软硬件、AI(Alexa)及智能家居(Ring/Blink)的综合性团队,持续驱动区域业务创新与人才发展。 你将做什么 • 解密AI: 将复杂的技术发现转化为直观的解释、博客文章、教程或互动演示,让非技术背景的业务方和更广泛的社区都能理解 • 技术叙事: 与工程团队协作,以清晰、引人入胜的方式记录AI的能力与局限性 • 知识共享: 协助开发内部工作坊或"AI入门"课程,提升跨职能团队(产品、设计、商务)的AI素养 • 保持前沿: 持续学习并整合最新突破(如大语言模型、扩散模型、智能体),为团队输出简明易懂的趋势简报 • 研究与应用: 参与端到端的应用研究项目,从文献综述到原型开发,涵盖自然语言处理、计算机视觉或多模态AI领域
US, MA, N.reading
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics Group, we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of dexterous manipulation system that: - Enables unprecedented generalization across diverse tasks - Enables contact-rich manipulation in different environments - Seamlessly integrates low-level skills and high-level behaviors - Leverage mechanical intelligence, multi-modal sensor feedback and advanced control techniques. The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. A day in the life - Lead design and implementation of methods for Visual SLAM, navigation and spatial reasoning - Leverage simulation and real-world data collection to create large datasets for model development - Develop a hierarchical system that combines low-level control with high-level planning - Collaborate effectively with multi-disciplinary teams to co-design hardware and algorithms for dexterous manipulation
US, WA, Seattle
Amazon Prime is looking for an ambitious Economist Intern to help create econometric insights for world-wide Prime. Prime is Amazon's premiere membership program, with over 200M members world-wide. This role is at the center of many major company decisions that impact Amazon's customers. These decisions span a variety of industries, each reflecting the diversity of Prime benefits. These range from fast-free e-commerce shipping, digital content (e.g., exclusive streaming video, music, gaming, photos), reading, healthcare, and grocery offerings. Prime Science creates insights that power these decisions. As an economist intern in this role, you will create statistical tools that embed causal interpretations. You will utilize massive data, state-of-the-art scientific computing, econometrics (causal, counterfactual/structural, experimentation), and machine-learning, to do so. Some of the science you create will be publishable in internal or external scientific journals and conferences. You will work closely with a team of economists, applied scientists, data professionals (business analysts, business intelligence engineers), product managers, and software/data engineers. You will create insights from descriptive statistics, as well as from novel statistical and econometric models. You will create internal-to-Amazon-facing automated scientific data products to power company decisions. You will write strategic documents explaining how senior company leaders should utilize these insights to create sustainable value for customers. These leaders will often include the senior-most leaders at Amazon. The team is unique in its exposure to company-wide strategies as well as senior leadership. It operates at the research frontier of utilizing data, econometrics, artificial intelligence, and machine-learning to form business strategies. A successful candidate will have demonstrated a capacity for building, estimating, and defending statistical models (e.g., causal, counterfactual, machine-learning) using software such as R, Python, or STATA. They will have a willingness to learn and apply a broad set of statistical and computational techniques to supplement deep training in one area of econometrics. For example, many applications on the team motivate the use of structural econometrics and machine-learning. They rely on building scalable production software, which involves a broad set of world-class software-building skills often learned on-the-job. As a consequence, already-obtained knowledge of SQL, machine learning, and large-scale scientific computing using distributed computing infrastructures such as Spark-Scala or PySpark would be a plus. Additionally, this candidate will show a track-record of delivering projects well and on-time, preferably in collaboration with other team members (e.g. co-authors). Candidates must have very strong writing and emotional intelligence skills (for collaborative teamwork, often with colleagues in different functional roles), a growth mindset, and a capacity for dealing with a high-level of ambiguity. Endowed with these traits and on-the-job-growth, the role will provide the opportunity to have a large strategic, world-wide impact on the customer experiences of Prime members.
US, WA, Bellevue
The Mission Build AI safety systems that protect millions of Alexa customers every day. As conversational AI evolves, you'll solve challenging problems in Responsible AI by ensuring LLMs provide safe, trustworthy responses, building AI systems that understand nuanced human values across cultures, and maintaining customer trust at scale. What You'll Build You'll pioneer breakthrough solutions in Responsible AI at Amazon's scale. Imagine training models that set new safety standards, designing automated testing systems that hunt for vulnerabilities before they surface, and certifying the systems that power millions of daily conversations. You'll create intelligent evaluation systems that judge responses with human-level insight, build models that truly understand what makes interactions safe and delightful, and craft feedback mechanisms that help Alexa+ grasp the nuances of complex customer conversations. Here's where it gets even more exciting: you'll build AI agents that act as your team's safety net—automatically detecting and fixing production issues in real-time, often before anyone notices there was a problem. Your innovations won't just improve Alexa+; they'll fundamentally shape how it learns, evolves, and earns customer trust. As Alexa+ continues to delight customers, your work ensures it becomes more trustworthy, safer, and deeply aligned with customer needs and expectations. Your work directly protects customer trust at Amazon's scale. Every innovation you create—from novel safety mechanisms to sophisticated evaluation techniques—shapes how millions of people interact with AI confidently. You're not just building products; you're defining industry standards for responsible AI. This is frontier research with immediate real-world impact. You'll tackle problems that require innovative solutions: training models that remain truthful and grounded across diverse contexts, building reward models that capture the nuanced spectrum of human values across cultures and languages, and creating automated systems that continuously discover and address potential issues before customers encounter them. You'll collaborate with world-class scientists, product managers, and engineers to transform state-of-the-art ideas into production systems serving millions. What We're Looking For * Deep expertise in state-of-the-art NLP and Large Language Models * Track record of building scalable ML systems * Passion for impactful research—where frontier science meets real-world responsibility at scale * Excitement about solving problems that will shape the future of AI Ready to work on AI safety challenges that define the industry? Join us. Key job responsibilities This is where you'll make your mark. You'll architect breakthrough Responsible AI solutions that become industry benchmarks, pioneering algorithms that eliminate false information, designing frameworks that hunt down vulnerabilities before bad actors find them, and developing models that understand human values across every culture we serve. Working with world-class engineers and scientists, you'll push the boundaries of model training—transforming bold research into production systems that protect millions of customers daily while withstanding attacks and delivering exceptional experiences. But here's what makes this role truly special: you'll shape the future. You'll lead certification processes, advance optimization techniques, build evaluation systems that reason like humans, and mentor the next generation of AI safety experts. Every innovation you drive will set new standards for trustworthy AI at the world's largest scale. A day in the life As a Responsible AI Scientist, you're at the frontier of AI safety—experimenting with breakthrough techniques that push the boundaries of what's possible. You partner with engineering to transform research into production-ready solutions, tackling complex optimization challenges. You brainstorm with Product teams, translating ambitious visions into concrete objectives that drive real impact. Your expertise shapes critical deployment decisions as you review impactful work and guide go/no-go calls. You mentor the next generation of AI safety leaders, watching ideas spark and capabilities grow. This is where science meets impact—building AI that's not just intelligent, but trustworthy and aligned with human values. About the team Our team pioneers Responsible AI for conversational assistants. We ensure Alexa delivers safe, trustworthy experiences across all devices, modalities, and languages worldwide. We work on frontier AI safety challenges—and we're looking for scientists who want to help shape the future of trustworthy AI.
US, WA, Bellevue
The Mission Build AI safety systems that protect millions of Alexa customers every day. As conversational AI evolves, you'll solve challenging problems in Responsible AI by ensuring LLMs provide safe, trustworthy responses, building AI systems that understand nuanced human values across cultures, and maintaining customer trust at scale. What You'll Build You'll pioneer breakthrough solutions in Responsible AI at Amazon's scale. Imagine training models that set new safety standards, designing automated testing systems that hunt for vulnerabilities before they surface, and certifying the systems that power millions of daily conversations. You'll create intelligent evaluation systems that judge responses with human-level insight, build models that truly understand what makes interactions safe and delightful, and craft feedback mechanisms that help Alexa+ grasp the nuances of complex customer conversations. Here's where it gets even more exciting: you'll build AI agents that act as your team's safety net—automatically detecting and fixing production issues in real-time, often before anyone notices there was a problem. Your innovations won't just improve Alexa+; they'll fundamentally shape how it learns, evolves, and earns customer trust. As Alexa+ continues to delight customers, your work ensures it becomes more trustworthy, safer, and deeply aligned with customer needs and expectations. Your work directly protects customer trust at Amazon's scale. Every innovation you create—from novel safety mechanisms to sophisticated evaluation techniques—shapes how millions of people interact with AI confidently. You're not just building products; you're defining industry standards for responsible AI. This is frontier research with immediate real-world impact. You'll tackle problems that require innovative solutions: training models that remain truthful and grounded across diverse contexts, building reward models that capture the nuanced spectrum of human values across cultures and languages, and creating automated systems that continuously discover and address potential issues before customers encounter them. You'll collaborate with world-class scientists, product managers, and engineers to transform state-of-the-art ideas into production systems serving millions. What We're Looking For * Deep expertise in state-of-the-art NLP and Large Language Models * Track record of building scalable ML systems * Passion for impactful research—where frontier science meets real-world responsibility at scale * Excitement about solving problems that will shape the future of AI Ready to work on AI safety challenges that define the industry? Join us. Key job responsibilities This is where you'll make your mark. You'll architect breakthrough Responsible AI solutions that become industry benchmarks, pioneering algorithms that eliminate false information, designing frameworks that hunt down vulnerabilities before bad actors find them, and developing models that understand human values across every culture we serve. Working with world-class engineers and scientists, you'll push the boundaries of model training—transforming bold research into production systems that protect millions of customers daily while withstanding attacks and delivering exceptional experiences. But here's what makes this role truly special: you'll shape the future. You'll lead certification processes, advance optimization techniques, build evaluation systems that reason like humans, and mentor the next generation of AI safety experts. Every innovation you drive will set new standards for trustworthy AI at the world's largest scale. A day in the life As a Responsible AI Scientist, you're at the frontier of AI safety—experimenting with breakthrough techniques that push the boundaries of what's possible. You partner with engineering to transform research into production-ready solutions, tackling complex optimization challenges. You brainstorm with Product teams, translating ambitious visions into concrete objectives that drive real impact. Your expertise shapes critical deployment decisions as you review impactful work and guide go/no-go calls. You mentor the next generation of AI safety leaders, watching ideas spark and capabilities grow. This is where science meets impact—building AI that's not just intelligent, but trustworthy and aligned with human values. About the team Our team pioneers Responsible AI for conversational assistants. We ensure Alexa delivers safe, trustworthy experiences across all devices, modalities, and languages worldwide. We work on frontier AI safety challenges—and we're looking for scientists who want to help shape the future of trustworthy AI.
US, WA, Seattle
Innovators wanted! Are you an entrepreneur? A builder? A dreamer? This role is part of an Amazon Special Projects team that takes the company’s Think Big leadership principle to the limits. If you’re interested in innovating at scale to address big challenges in the world, this is the team for you. As an Applied Scientist on our team, you will focus on building state-of-the-art ML models for biology. Our team rewards curiosity while maintaining a laser-focus in bringing products to market. Competitive candidates are responsive, flexible, and able to succeed within an open, collaborative, entrepreneurial, startup-like environment. At the forefront of both academic and applied research in this product area, you have the opportunity to work together with a diverse and talented team of scientists, engineers, and product managers and collaborate with other teams. Key job responsibilities - Build, adapt and evaluate ML models for life sciences applications - Collaborate with a cross-functional team of ML scientists, biologists, software engineers and product managers