Diverse reasoning traces teach LLMs to make better decisions

How to train language models to generate diverse, accurate reasoning paths using tokens that control distinct reasoning strategies.

Overview by Amazon Nova
  • Amazon researchers introduce set-supervised fine-tuning (SSFT) and global forking policy optimization (GFPO) to train language models that generate diverse reasoning paths.
  • SSFT and GFPO improve single-shot accuracy on AIME 2025 and LiveCodeBench benchmarks without mode collapse.
  • Global forking tokens are used to elicit distinct reasoning modes, enabling the model to produce diverse, high-quality reasoning paths.
  • SSFT models reasoning as a set of complete solution paths, while GFPO selects the most effective reasoning mode for each input.
  • The combined approach of SSFT and GFPO results in gains of 5% to 7% in single-shot accuracy on standard benchmarks.
Was this answer helpful?

Large language models (LLMs) are pretrained on huge volumes of unlabeled data, but afterward, they’re typically post-trained on specific tasks such as instruction following, avoiding harmful outputs, and reasoning, or providing justifications for the outputs they generate.

Parallel reasoning — in which multiple, diverse reasoning paths are generated and compared for the same problem — is emerging as a key tool for understanding the limits of LLMs’ reasoning capability. It also underpins techniques for testing LLMs such as self-consistency, where multiple reasoning paths are aggregated to improve accuracy.

LLMs are generally optimized for reasoning through supervised fine-tuning (SFT), in which each training example is labeled with a single, human-verified reasoning trace. Given the usefulness of parallel reasoning for evaluation, the question naturally arises, Can we expand the limits of LLMs’ reasoning capacities by training them on diverse reasoning traces for each question? In a paper we presented at this year’s International Conference on Learning Representations (ICLR), we propose a method for doing just that, which avoids some previously identified pitfalls of parallel reasoning.

generate reasoning traces.png
For each question, we gather multiple reasoning traces from different models and sources, capturing diverse solution strategies that serve as supervision for parallel reasoning.

To prompt a single LLM to adopt different reasoning strategies, we introduce a set of global forking tokens (such as <think1> through <think6> in the figure below) in the post-training phase, each intended to elicit a distinct reasoning mode. These tokens enable the model to generate diverse, high-quality reasoning paths for the same problem.

naive sft.png
Under naïve SFT, different tokens fail to specialize: they achieve similar accuracy (top) and exhibit comparable reasoning effort (bottom), indicating mode collapse.

However, naïve post-training strategies such as SFT can lead to mode collapse, where different reasoning tokens produce nearly identical behaviors. To address this, we propose set-supervised fine tuning (SSFT) — a simple and principled training approach that enables models to learn multiple distinct reasoning strategies from diverse supervision. Instead of representing reasoning with a single trace, SSFT models it as a set of complete solution paths, which arrive at the same answer through different strategies.

To further teach the model which reasoning strategy to adopt in what contexts, we introduce a reinforcement learning paradigm we call global forking policy optimization. Between these two techniques, we observe gains of 5% to 7% in single-shot accuracy on standard benchmarks, indicating that improved reasoning-mode selection directly translates to better end-to-end performance.

Supervised fine tuning

In practice, multiple reasoning traces for the same question can be obtained by prompting multiple teacher models, sampling alternative reasoning paths from a single model, or aggregating solutions from heterogeneous sources.

SSFT pairs each such trace with a dedicated forking token (e.g., <think1> through <think6>), where each token indicates a different reasoning mode. During training, a bipartite matching step assigns traces to tokens for each question, encouraging the model to learn distinct behaviors rather than collapsing to a single pattern. The training objective sums the next-token prediction (NTP) losses across all matched pairs, evaluating each reasoning trace conditioned on its assigned control token.

sft and ssft.png
Standard SFT uses fixed or random matching (left) and is order dependent, while SSFT (right) uses min-cost bipartite matching to achieve order-invariant training.

As a result, each forking token is specialized to a distinct reasoning strategy, and the model produces more diverse solutions — measured by pass@k, the probability that at least one of k generated answers is correct — while maintaining strong single-shot accuracy ( pass@1).

Reinforcement learning

While supervised training encourages the model to learn diverse reasoning strategies, it does not explicitly teach the model which strategy to use for a given question. Choosing the right reasoning mode is inherently a decision problem, making it a natural fit for reinforcement learning.

We address this with global forking policy optimization (GFPO), a lightweight reinforcement learning approach that learns to select the most effective reasoning mode for each input. For a given question x, the model samples a global forking token from a distribution over control tokens (the <think i>s).

The model then produces an answer conditioned on the sampled token, and the output is verified to obtain a reward signal (e.g., correct or incorrect). These rewards are converted into advantages, which are used to update the policy over forking tokens. Importantly, the generated reasoning traces are treated as rollouts: their gradients are detached and used only for computing rewards, not for direct optimization.

By focusing optimization on the forking-token distribution, GFPO avoids the complexity of token-level reinforcement learning while still capturing the key decision — selecting the right reasoning mode upfront. This makes training both efficient and stable, while directly improving end-to-end performance.

gfpo pipeline.png
GFPO pipeline. The model samples a reasoning mode from the distribution π(<think i> | x), generates answers, and receives rewards from verification. Advantages derived from these rewards update only the forking-token distribution, while the generated reasoning traces are used solely for evaluation.

Together, SSFT and GFPO enable models to both learn diverse reasoning strategies and select the right one at inference time.

Evaluation

We evaluate SSFT+GFPO on both reasoning and coding benchmarks along two axes: (i) accuracy and (ii) diversity of reasoning. Across all settings, SSFT+GFPO consistently outperforms standard pipelines, such as SFT+GRPO.

58.80%

64.22%

52.07%

AIME 2025 (Pass@1)

AIME 2024 (Pass@1)

LiveCodeBench-v5 (Pass@1)

+6.84 vs. SFT+GRPO

+5.37 vs. SFT+GRPO

+4.94 vs. SFT

Beyond accuracy, a key goal of SSFT is to address mode collapse. SSFT explicitly encourages specialization, allowing different tokens to represent distinct reasoning strategies. This leads to two important effects. First, each global forking token consistently triggers a distinct reasoning pattern. Second, this diversity improves pass@k without compromising pass@1. This contrasts with temperature-based sampling, where increasing diversity typically comes at the cost of accuracy.

diverse reasoning effort.png
Different global forking tokens produce distinct reasoning behaviors, demonstrating specialization across modes.
SSFT improves pass@k while preserving pass@1 accuracy on AIME 2025, a challenging math reasoning benchmark.png
SSFT improves pass@k while preserving pass@1 accuracy on AIME 2025, a challenging math reasoning benchmark.

Below, we present a qualitative example illustrating our approach on a representative problem from the AIME 2025 benchmark, a challenging math reasoning dataset. The same question is solved using multiple qualitatively distinct strategies — such as algebraic manipulation, geometric reasoning, and case-based analysis — depending on the selected global forking token.

Multiple distinct solution strategies for the same problem, each induced by a different global forking token.png
Multiple distinct solution strategies for the same problem, each induced by a different global forking token
Code and models
Ready to try SSFT and GFPO on your own tasks? We've open-sourced the training pipeline, evaluation harnesses, and model weights on GitHub and Hugging Face.

Research areas

Related content

IN, KA, Bengaluru
RBS (Retail Business Services) Tech team works towards enhancing the customer experience (CX) and their trust in product data by providing technologies to find and fix Amazon CX defects at scale. Our platforms help in improving the CX in all phases of customer journey, including selection, discoverability & fulfilment, buying experience and post-buying experience (product quality and customer returns). The team also develops GenAI platforms for automation of Amazon Stores Operations. As a Sciences team in RBS Tech, we focus on foundational ML research and develop scalable state-of-the-art ML solutions to solve the problems covering customer experience (CX) and Selling partner experience (SPX). We work to solve problems related to multi-modal understanding (text and images), task automation through multi-modal LLM Agents, supervised and unsupervised techniques, multi-task learning, multi-label classification, aspect and topic extraction for Customer Anecdote Mining, image and text similarity and retrieval using NLP and Computer Vision for product groupings and identifying duplicate listings in product search results. Key job responsibilities As a Data Scientist, you will be responsible to design and deploy scalable GenAI, NLP and Computer Vision solutions that will impact the content visible to millions of customer and solve key customer experience issues. You will develop novel LLM, deep learning and statistical techniques for task automation, text processing, image processing, pattern recognition, and anomaly detection problems. You will define the research and experiments strategy with an iterative execution approach to develop AI/ML models and progressively improve the results over time. You will partner with business and engineering teams to identify and solve large and significantly complex problems that require scientific innovation. You will help the team leverage your expertise, by coaching and mentoring. You will contribute to the professional development of colleagues, improving their technical knowledge and the engineering practices. You will independently as well as guide team to file for patents and/or publish research work where opportunities arise. The RBS org deals with problems that are directly related to the selling partners and end customers and the ML team drives resolution to organization level problems. Therefore, the Data Scientist role will impact the large product strategy, identifies new business opportunities and provides strategic direction which is very exciting.
US, CA, Sunnyvale
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! We are looking for a self-motivated, passionate and resourceful Applied Science Manager to bring diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. You will lead a strong science team and work closely with other science and engineering leaders, product and business partners together to build the best personalized customer experience for Prime Video. At the end of the day, you will have the reward of seeing your contributions benefit millions of Amazon.com customers worldwide. Key job responsibilities - Lead to develop AI solutions for various Prime Video recommendation and personalization systems using Deep learning, GenAI, Reinforcement Learning, recommendation system and optimization methods; - Work closely with engineers and product managers to design, implement and launch AI solutions end-to-end; - Effectively communicate technical and non-technical ideas with teammates and stakeholders; - Stay up-to-date with advancements and the latest modeling techniques in the field; - Hire and grow a science team working in this exciting video personalization domain. About the team Prime Video Recommendation Science team owns science solution to power recommendation and personalization experience on various devices. We work closely with the engineering teams to launch our solutions in production.
US, CA, San Francisco
We are seeking a Member of Technical Staff Simulation Engineer to join our AI robotics research team developing foundation models for robotics. You will rapidly develop 3D physics-based and photorealistic simulations alongside scientists to enable training large-scale machine learning models. Key job responsibilities - Develop simulations for reinforcement learning, closed-loop simulations and synthetic data generation - Implement essential robotics features, including accurate modeling of sensors, actuators, and controllers - Build real-to-sim workflows for dynamic environments and robotics tasks - Implement simulation features to minimize sim-to-real gaps through domain randomization and system identification - Create asset toolchains supporting industry-standard formats (URDF, MJCF, USD) - Collaborate closely with a team of ML researchers to enable large-scale robotics training pipelines About the team At Frontier AI & Robotics (FAR), we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through frontier foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
IN, KA, Bengaluru
RBS (Retail Business Services) Tech team works towards enhancing the customer experience (CX) and their trust in product data by providing technologies to find and fix Amazon CX defects at scale. Our platforms help in improving the CX in all phases of customer journey, including selection, discoverability & fulfilment, buying experience and post-buying experience (product quality and customer returns). The team also develops GenAI platforms for automation of Amazon Stores Operations. As a Sciences team in RBS Tech, we focus on foundational ML research and develop scalable state-of-the-art ML solutions to solve the problems covering customer experience (CX) and Selling partner experience (SPX). We work to solve problems related to multi-modal understanding (text and images), task automation through multi-modal LLM Agents, supervised and unsupervised techniques, multi-task learning, multi-label classification, aspect and topic extraction for Customer Anecdote Mining, image and text similarity and retrieval using NLP and Computer Vision for product groupings and identifying duplicate listings in product search results. Key job responsibilities As an Applied Scientist, you will be responsible to design and deploy scalable GenAI, NLP and Computer Vision solutions that will impact the content visible to millions of customer and solve key customer experience issues. You will develop novel LLM, deep learning and statistical techniques for task automation, text processing, image processing, pattern recognition, and anomaly detection problems. You will define the research and experiments strategy with an iterative execution approach to develop AI/ML models and progressively improve the results over time. You will partner with business and engineering teams to identify and solve large and significantly complex problems that require scientific innovation. You will independently file for patents and/or publish research work where opportunities arise. The RBS org deals with problems that are directly related to the selling partners and end customers and the ML team drives resolution to organization level problems. Therefore, the Applied Scientist role will impact the large product strategy, identifies new business opportunities and provides strategic direction which is very exciting.
US, CA, San Francisco
Amazon’s Frontier AI & Robotics (FAR) team is seeking a Member of Technical Staff, Infrastructure to build and scale the foundational systems that power our robotics research and development platform. In this role, you will design and operate the distributed infrastructure that enables our researchers and engineers to train foundation models, run large-scale experiments, and deploy intelligent robotic systems at Amazon scale. Join the next revolution in robotics, where you’ll work alongside world-renowned AI pioneers to push the boundaries of what’s possible in robotic intelligence. As a Member of Technical Staff focused on Infrastructure, you’ll build the critical platform layer that accelerates every aspect of FAR’s research — from high-throughput data pipelines and experiment management systems to low-latency model serving and configuration delivery for robotic deployments. This role is deeply technical and focuses on performance, scalability, and reliability at scale. You will design systems that support volumes of training data, operate with strict latency requirements, and provide the compute and data foundation that enables breakthrough research across FAR’s robotics ecosystem. Key job responsibilities - Design and build scalable data infrastructure to support AI robotics research, including automated pipelines for data ingestion, processing, curation, and delivery - Build highly scalable experimentation and analytics infrastructure to support model evaluation, A/B testing, and feature performance monitoring across robotic systems - Design and operate low-latency configuration and model delivery systems powering progressive rollouts across FAR’s robotic platforms - Improve the performance, efficiency, and reliability of FAR’s core compute and storage infrastructure, ensuring systems remain fast and stable as research scales - Develop tooling and frameworks that accelerate research workflows, including dataset management, visualization, and quality assessment systems - Optimize query performance and data availability for experimentation and analytics workflows used by research teams - Collaborate directly with science and robotics teams to support research projects through both infrastructure development and hands-on technical contribution - Lead large technical initiatives and shape the architecture of FAR’s research platform infrastructure
US, MA, N.reading
Amazon is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. We are seeking a talented Applied Scientist to join our advanced robotics team, focusing on developing and applying cutting-edge simulation methodologies for advanced robotics systems. This role centers on research and development of physics-based simulation techniques, sim-to-real transfer methods, and machine learning approaches that enable rapid development, testing, and validation of robotic systems operating in complex, real-world environments. Key job responsibilities - Advance physics-based simulation fidelity for contact-rich manipulation and locomotion - Design and build high-performance simulation tools integrated into a robotics design stack - Translate research ideas into robust, verifiable data - Develop methods to quantify and reduce simulation-to-reality gaps across design, safety, and control - Architect scalable simulation solutions for rigid and deformable body dynamics - Build simulation pipelines optimized for a digital twin level of fidelity - Establish frameworks for continuous simulation improvement using real-world hardware - Collaborate with engineering, science, and safety teams on simulation requirements and validation About the team Our team is building a comprehensive robot simulation and modeling platform for advanced robotics development, combining locomotion and manipulation capabilities. We operate at the cutting edge of physics simulation, reinforcement learning, hardware-in-the-loop (HIL), and sim-to-real transfer, collaborating with world-class robotics engineers, scientists, and mechanical designers in a fast-paced, innovation-driven environment. This role uniquely combines fundamental research with real-world development. You will pursue core research questions in physics-based simulation while seeing your work translated into real robots, validated on real hardware. Working alongside Robot scientist and designers, you will help transform research ideas into scalable, quantifiable simulation capabilities that directly impact how robots are designed and built.
US, CA, East Palo Alto
As part of the AWS Solutions organization, we have a vision to provide business applications, leveraging Amazon’s unique experience and expertise, that are used by millions of companies worldwide to manage day-to-day operations. We will accomplish this by accelerating our customers’ businesses through delivery of intuitive and differentiated technology solutions that solve enduring business challenges. We blend vision with curiosity and Amazon’s real-world experience to build opinionated, turnkey solutions. Where customers prefer to buy over build, we become their trusted partner with solutions that are no-brainers to buy and easy to use. Key job responsibilities Everyone on the team needs to be entrepreneurial, wear many hats and work in a highly collaborative environment that’s more startup than big company. We’ll need to tackle problems that span a variety of domains: computer vision, image recognition, machine learning, real-time and distributed systems. As a Sr. Applied Scientist, you will help solve a variety of technical challenges and mentor other scientists. You will be the thought leader of the team. You will tackle challenging, novel situations every day and given the size of this initiative, you’ll have the opportunity to work with multiple technical teams at Amazon in different locations. You should be comfortable with a degree of ambiguity that’s higher than most projects and relish the idea of solving problems that, frankly, haven’t been solved at scale before - anywhere. Along the way, we guarantee that you’ll learn a ton, have fun and make a positive impact on millions of people. A key focus of this role will be developing and implementing advanced visual reasoning systems that can understand complex spatial relationships and object interactions in real-time. You'll work on designing autonomous AI agents that can make intelligent decisions based on visual inputs, understand customer behavior patterns, and adapt to dynamic retail environments. This includes developing systems that can perform complex scene understanding, reason about object permanence, and predict customer intentions through visual cues. About the team Just Walk Out (JWO) is a new kind of store with no lines and no checkout—you just grab and go! Customers simply use the Amazon Go app to enter the store, take what they want from our selection of fresh, delicious meals and grocery essentials, and go! Our checkout-free shopping experience is made possible by our Just Walk Out Technology, which automatically detects when products are taken from or returned to the shelves and keeps track of them in a virtual cart. When you’re done shopping, you can just leave the store. Shortly after, we’ll charge your account and send you a receipt. Check it out at amazon.com/go. Designed and custom-built by Amazonians, our Just Walk Out Technology uses a variety of technologies including computer vision, sensor fusion, and advanced machine learning. Innovation is part of our DNA! Our goal is to be Earths’ most customer centric company and we are just getting started. We need people who want to join an ambitious program that continues to push the state of the art in computer vision, machine learning, distributed systems and hardware design.
US, NY, New York
We are seeking a Robotics/AI Motor Control Scientist to develop cutting-edge machine learning algorithms for motor control systems in robots. In this role, you will focus on creating and optimizing intelligent motor control strategies to enable robots to perform complex, whole-body tasks. Your contributions will be essential in advancing robotics by enabling fluid, reliable, and safe interactions between robots and their environments. Key job responsibilities - Develop controllers that leverage reinforcement learning, imitation learning, or other advanced AI techniques to achieve natural, robust, and adaptive motor behaviors - Collaborate with multi-disciplinary teams to integrate motor control systems with robotic hardware, ensuring alignment with real-world constraints such as actuator dynamics and energy efficiency - Use simulation and real-world testing to refine and validate control algorithms - Stay updated on advancements in robotics, AI, and control systems to apply advanced techniques to robotic motion challenges - Lead technical projects from conception through production deployment - Mentor junior scientists and engineers - Bridge research initiatives with practical engineering implementation About the team Fauna Robotics, an Amazon company, is building capable, safe, and genuinely delightful robots for everyday life. Our goal is simple: make robots people actually want to live and interact with in everyday human spaces. We believe that future won’t arrive until building for robotics becomes far more accessible. Today, too much effort is spent reinventing the fundamentals. We’re changing that by developing tightly integrated hardware and software systems that make it faster, safer, and more intuitive to create real-world robotic products. Our work spans the full stack: mechanical design, control systems, dynamic modeling, and intelligent software. The focus is not just functionality, but experience. We’re building robots that feel responsive, expressive, and genuinely useful. At Fauna, you’ll work at the frontier of this space, helping define how robots move, manipulate, and interact with people in natural environments. It’s an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you. an opportunity to solve hard problems across hardware and software with a team focused on making robotics accessible and joyful to build. If you care about making robotics real for everyone and building systems that are as delightful as they are capable, we’re interested in hearing from you.
US, CA, Palo Alto
We are looking for a passionate Applied Scientist to help pioneer the next generation of agentic AI applications for Amazon advertisers. In this role, you will design agentic architectures, develop tools and datasets, and contribute to building systems that can reason, plan, and act autonomously across complex advertiser workflows. You will work at the forefront of applied AI, developing methods for fine-tuning, reinforcement learning, and preference optimization, while helping create evaluation frameworks that ensure safety, reliability, and trust at scale. You will work backwards from the needs of advertisers—delivering customer-facing products that directly help them create, optimize, and grow their campaigns. Beyond building models, you will advance the agent ecosystem by experimenting with and applying core primitives such as tool orchestration, multi-step reasoning, and adaptive preference-driven behavior. This role requires working independently on ambiguous technical problems, collaborating closely with scientists, engineers, and product managers to bring innovative solutions into production. Key job responsibilities - Design and build agents for our autonomous campaigns experience. - Design and implement advanced model and agent optimization techniques, including supervised fine-tuning, instruction tuning and preference optimization (e.g., DPO/IPO). - Curate datasets and tools for MCP. - Build evaluation pipelines for agent workflows, including automated benchmarks, multi-step reasoning tests, and safety guardrails. - Develop agentic architectures (e.g., CoT, ToT, ReAct) that integrate planning, tool use, and long-horizon reasoning. - Prototype and iterate on multi-agent orchestration frameworks and workflows. - Collaborate with peers across engineering and product to bring scientific innovations into production. - Stay current with the latest research in LLMs, RL, and agent-based AI, and translate findings into practical applications. About the team The Sponsored Products and Brands team at Amazon Ads is re-imagining the advertising landscape through the latest generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. The Autonomous Campaigns team within Sponsored Products and Brands is focused on guiding and supporting 1.6MM advertisers to meet their advertising needs of creating and managing ad campaigns. At this scale, the complexity of diverse advertiser goals, campaign types, and market dynamics creates both a massive technical challenge and a transformative opportunity: even small improvements in guidance systems can have outsized impact on advertiser success and Amazon’s retail ecosystem. Our vision is to build a highly personalized, context-aware campaign creation and management system that leverages LLMs together with tools such as auction simulations, ML models, and optimization algorithms. This agentic framework, will operate across both chat and non-chat experiences in the ad console, scaling to natural language queries as well as proactively delivering guidance based on deep understanding of the advertiser. To execute this vision, we collaborate closely with stakeholders across Ad Console, Sales, and Marketing to identify opportunities—from high-level product guidance down to granular keyword recommendations—and deliver them through a tailored, personalized experience. Our work is grounded in state-of-the-art agent architectures, tool integration, reasoning frameworks, and model customization approaches (including tuning, MCP, and preference optimization), ensuring our systems are both scalable and adaptive.
US, WA, Seattle
Amazon.com strives to be Earth's most customer-centric company where customers can shop in our stores to find and discover anything they want to buy. We hire the world's brightest minds, offering them a fast paced, technologically sophisticated and friendly work environment. Economists at Amazon partner closely with senior management, business stakeholders, scientist and engineers, and economist leadership to solve key business problems ranging from Amazon Web Services, Kindle, Prime, inventory planning, international retail, third party merchants, search, pricing, labor and employment planning, effective benefits (health, retirement, etc.) and beyond. Amazon Economists build econometric models using our world class data systems and apply approaches from a variety of skillsets – applied macro/time series, applied micro, econometric theory, empirical IO, empirical health, labor, public economics and related fields are all highly valued skillsets at Amazon. You will work in a fast moving environment to solve business problems as a member of either a cross-functional team embedded within a business unit or a central science and economics organization. You will be expected to develop techniques that apply econometrics to large data sets, address quantitative problems, and contribute to the design of automated systems around the company.