Shrinking machine learning models for offline use

"Perfect hashing" is among the techniques that reduce the memory footprints of machine learning models by 94%.

Last week, the Alexa Auto team announced the release of its new Alexa Auto Software Development Kit (SDK), enabling developers to bring Alexa functionality to in-vehicle infotainment systems.

The initial release of the SDK assumes that automotive systems will have access to the cloud, where the machine-learning models that power Alexa currently reside. But in the future, we would like Alexa-enabled vehicles — and other mobile devices — to have recourse to some core functions even when they’re offline. That will mean drastically reducing the size of the underlying machine-learning models, so they can fit in local memory.

At the same time, third-party developers have created more than 45,000 Alexa skills, which expand on Alexa’s native capabilities, and that number is increasing daily. Even in the cloud, third-party skills are loaded into memory only when explicitly invoked by a customer request. Shrinking the underlying models would reduce load time, ensuring that Alexa customers continue to experience millisecond response times.

At this year’s Interspeech, my colleagues and I will present a new technique for compressing machine-learning models that reduces their memory footprints by 94% while leaving their performance almost unchanged. We report our results in a paper titled “Statistical model compression for small-footprint natural language understanding.”

Quantization

Alexa’s natural-language-understanding systems, which interpret free-form utterances, use several different types of machine-learning (ML) models, but they all share some common traits. One is that they learn to extract “features” — or strings of text with particular predictive value — from input utterances. An ML model trained to handle music requests, for instance, will probably become sensitized to text strings like “the Beatles”, “Elton John”, “Whitney Houston”, “Adele”, and so on. Alexa’s ML models frequently have millions of features.

Another common trait is that each feature has a set of associated “weights,” which determine how large a role it should play in different types of computation. The need to store multiple weights for millions of features is what makes ML models so memory intensive.

Our first technique for compressing an ML model is to quantize its weights. We take the total range of weights (say, -100 to 100) and divide it into even intervals (say, -100 to -90, -90 to -80, and so on). Then we simply round each weight to the nearest boundary value of its interval. In practice, we use 256 intervals, which allows us to represent every weight in the model with a single byte of data (one byte can encode 256 distinct values), with minimal effect on the model’s accuracy. This approach has the added benefit of automatically rounding weights that are close to zero down to zero, so they can be discarded.
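As a rough illustration of the idea, here is a minimal Python sketch. It assumes a symmetric variant of the scheme, with 255 evenly spaced levels centered on zero so that each weight becomes a signed one-byte code. The function names `quantize` and `dequantize` and the sample weights are hypothetical and are not taken from the paper.

```python
from array import array


def quantize(weights):
    """Map each float weight to a signed one-byte code in [-127, 127].

    Assumption: levels are spaced evenly and symmetrically around zero,
    so zero is exactly representable and tiny weights collapse to it.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs else 1.0
    codes = array("b", (round(w / scale) for w in weights))
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate weights from the one-byte codes."""
    return [c * scale for c in codes]


# Hypothetical weights, chosen only for illustration.
weights = [-100.0, -42.3, -0.2, 0.1, 17.8, 100.0]
codes, scale = quantize(weights)
restored = dequantize(codes, scale)

print(list(codes))
print([round(r, 2) for r in restored])
```

Each code occupies one byte, and the rounding error per weight is at most half the spacing between levels. The two weights nearest zero receive code 0, which is what allows them to be dropped.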

Perfect hashing

Our other compression technique is more elegant. If an Alexa customer says, “Alexa, play ‘Yesterday,’ by the Beatles,” we want our system to pull up the weights associated with the feature “the Beatles” — not the weights associated with “Adele”, “Elton John”, and the rest. This requires a means of mapping particular features to the memory locations of the corresponding weights.

The standard way to perform such mappings is through hashing. A hash function is a mathematical function that takes arbitrary inputs and scrambles them up — hashes them — in such a way that the outputs (1) are of fixed size and (2) bear no predictable relationship to the inputs. If the output size is fixed at 16 bits, for instance, there are 65,536 possible hash values, but “Hank Williams” might map to value 1, while “Hank Williams, Jr.” maps to value 65,000.

Nonetheless, traditional hash functions sometimes produce collisions: Hank Williams, Jr. may not map to the same location as Hank Williams, but something totally arbitrary — the Bay City Rollers, say — might. In terms of runtime performance, this usually isn’t a big problem. If you hash the name “Hank Williams” and find two different sets of weights at the corresponding memory location, it doesn’t take that long to consult a metadata tag to determine which set of weights belongs to which artist.
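To see how collisions arise with a fixed-size hash, the sketch below maps strings to 16-bit values and groups the strings that share a value. It uses Python's `hashlib.blake2b` purely as a stand-in for a conventional seeded hash function. The names `hash16` and `colliding_buckets` and the synthetic keys are illustrative assumptions.

```python
import hashlib


def hash16(text, seed=0):
    """Hash a string to one of 65,536 values (16 bits).

    Assumption: BLAKE2b with a salt stands in for a seeded hash function.
    """
    digest = hashlib.blake2b(
        text.encode("utf-8"),
        digest_size=2,                    # 2 bytes = 16 bits of output
        salt=seed.to_bytes(8, "little"),  # the seed selects the function
    ).digest()
    return int.from_bytes(digest, "little")


def colliding_buckets(keys, seed=0):
    """Return {hash value: [keys]} for every value shared by 2+ keys."""
    buckets = {}
    for key in keys:
        buckets.setdefault(hash16(key, seed), []).append(key)
    return {h: ks for h, ks in buckets.items() if len(ks) > 1}


# Synthetic feature strings, used only to exercise the functions.
keys = [f"artist-{i}" for i in range(2000)]
collisions = colliding_buckets(keys)
print(f"{len(collisions)} of 65,536 hash values are shared by several keys")
```

Every entry in `collisions` is a slot where a conventional hash table would need extra metadata to tell the colliding keys apart.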

In terms of memory footprint, however, this approach to collision resolution makes a substantial difference. With quantization, the weights themselves will require just a few bytes of data; the metadata used to distinguish sets of weights could end up requiring more space in memory than the data it’s tagging.

We address this problem by using a more advanced hashing technique called perfect hashing, which maps a specific number of data items to the same number of memory slots but guarantees there will be no collisions. With perfect hashing, the system can simply hash a string of characters and pull up the corresponding weights — no metadata required.

Figure: Perfect-hashing algorithm. Our perfect-hashing algorithm relies on a family of conventional hash functions (h1, h2, etc.). Whenever a function in the family hashes an input without a collision, we toggle the corresponding 0 in an array to 1. Then we repeat the process with different functions and smaller arrays, until every input value has a unique hash.

To produce a perfect hash, we assume that we have access to a family of conventional hash functions, all of which produce random hashes. That is, each function in the family might hash “Hank Williams” to a different value, but that value tells you nothing about how the same function will hash any other string. In practice, we use the hash function MurmurHash, which can be seeded with a succession of different values.

Suppose that you have N input strings that you want to hash. We begin with an array of N 0’s. Then we apply our first hash function — call it Hash1 — to all N inputs. For every string that yields a unique hash value — no collisions — we change the corresponding 0 in the array to a 1.

Then we build a new array of 0’s, with entries for only the input strings that yielded collisions under Hash1. To those strings, we now apply a different hash function — say, Hash2 — and we again toggle the 0’s corresponding to collision-free hashes.

We repeat this process until every input string has a corresponding 1 in some array. Then we combine all the arrays into one giant array. The position of a 1 in the giant array indicates the unique memory location assigned to the corresponding input string.

Now, when the trained model receives an input, it applies Hash1 to each of the input’s substrings. If it finds a 1 at the resulting position in the first array, it goes to the associated address. If it finds a 0, it applies Hash2, checks the second array, and repeats the process until it finds a 1.
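The construction and lookup steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. BLAKE2b with a salt stands in for seeded MurmurHash, each level's array is as long as the number of strings still unplaced, and a position is turned into a dense index by counting the 1s that precede it. That last step is one common choice and is not spelled out in the text. The names `seeded_hash`, `build`, and `lookup` are hypothetical.

```python
import hashlib
from collections import Counter


def seeded_hash(text, seed):
    """Stand-in for a seeded hash such as MurmurHash (assumption)."""
    digest = hashlib.blake2b(
        text.encode("utf-8"), digest_size=8, salt=seed.to_bytes(8, "little")
    ).digest()
    return int.from_bytes(digest, "little")


def build(keys, max_levels=256):
    """Return one bit array per level; level i uses hash function i."""
    levels = []
    remaining = list(dict.fromkeys(keys))  # drop duplicates, keep order
    while remaining:
        seed = len(levels)
        if seed >= max_levels:
            raise RuntimeError("too many levels; try different seeds")
        size = len(remaining)
        slots = [seeded_hash(k, seed) % size for k in remaining]
        counts = Counter(slots)
        bits = [0] * size
        collided = []
        for key, slot in zip(remaining, slots):
            if counts[slot] == 1:
                bits[slot] = 1        # collision-free: toggle 0 to 1
            else:
                collided.append(key)  # retry with the next hash function
        levels.append(bits)
        remaining = collided
    return levels


def lookup(levels, key):
    """Return the key's index in 0..N-1, or None if no level holds a 1."""
    ones_in_earlier_levels = 0
    for seed, bits in enumerate(levels):
        slot = seeded_hash(key, seed) % len(bits)
        if bits[slot]:
            # Rank of this 1 within the concatenated arrays.
            return ones_in_earlier_levels + sum(bits[:slot])
        ones_in_earlier_levels += sum(bits)
    return None


artists = ["the Beatles", "Elton John", "Whitney Houston", "Adele",
           "Hank Williams", "Hank Williams, Jr.", "the Bay City Rollers"]
levels = build(artists)

# Store each feature's (placeholder) weights at its assigned index.
weights = [None] * len(artists)
for name in artists:
    weights[lookup(levels, name)] = f"weights for {name}"

print(weights[lookup(levels, "the Beatles")])
```

Only the bit arrays and the weight table need to be stored. The feature strings themselves are not kept, so a string that was never in the original set may still land on some slot, which is the price of dropping the metadata. A production version would also precompute the counts of 1s rather than summing on every lookup.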

Calling successive hash functions for some inputs does incur a slight performance penalty. But it’s a penalty that’s paid only where a conventional hash function would yield a collision, anyway. In our paper, we include both a theoretical analysis and experimental results that demonstrate that this penalty is almost negligible. And it’s certainly a small price to pay for the drastic reduction in memory footprint that the method affords.

Acknowledgments: Kanthashree Mysore Sathyendra, Stanislav Peshterliev

Research areas

Related content

US, MA, N.reading
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics Group we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. The ideal candidate will contribute to research and implementation that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. 
Key job responsibilities - Implement and optimize control algorithms for robot locomotion - Support development of behaviors that enable robots to traverse diverse terrain - Contribute to methods that integrate stability, locomotion, and manipulation tasks - Help create dynamics models and simulations that enable sim2real transfer of algorithms - Collaborate effectively with multi-disciplinary teams on hardware and algorithms for loco-manipulation
US, CA, Sunnyvale
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine innovative AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. We leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of robotics foundation models that: - Enable unprecedented generalization across diverse tasks - Integrate multi-modal learning capabilities (visual, tactile, linguistic) - Accelerate skill acquisition through demonstration learning - Enhance robotic perception and environmental understanding - Streamline development processes through reusable capabilities The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. 
As a Senior Applied Scientist, you will lead the development of machine learning systems that help robots perceive, reason, and act in real-world environments. You will set technical direction for adapting and advancing state-of-the-art models (open source and internal research) into robust, safe, and high-performing “robot brain” capabilities for our target tasks, environments, and robot embodiments. You will drive rigorous capability profiling and experimentation, lead targeted innovation where gaps exist, and partner across research, controls, hardware, and product teams to ensure outputs can be further customized and deployed on specific robots. Key job responsibilities - Lead technical initiatives for foundation-model capabilities (e.g., visuomotor / VLA / video-action worldmodel-action policies), from problem definition through validated model deliverables. - Own model readiness for our embodiment class: drive adaptation, fine-tuning, and optimization (latency/throughput/robustness), and define success criteria that downstream teams can build on. - Establish and evolve capability evaluation: define benchmark strategy, metrics, and profiling methodology to quantify performance, generalization, and failure modes; ensure evaluations drive clear roadmap decisions. - Drive the data + training strategy needed to close key capability gaps, including data requirements, collection/curation standards, dataset quality/provenance, and repeatable training recipes (sim + real). - Invent and validate new methods when leveraging SOTA is insufficient—new training schemes, model components, supervision signals, or sim↔real techniques—backed by strong empirical evidence. - Influence cross-team technical decisions by collaborating with controls/WBC, hardware, and product teams on interfaces, constraints, and integration plans; communicate results via design docs and technical reviews. 
- Mentor and raise the bar: guide junior scientists/engineers, set best practices for experimentation and code quality, and drive a culture of rigor and reproducibility.
US, CA, Sunnyvale
Amazon Industrial Robotics Group is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine innovative AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. We leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of robotics foundation models that: - Enable unprecedented generalization across diverse tasks - Integrate multi-modal learning capabilities (visual, tactile, linguistic) - Accelerate skill acquisition through demonstration learning - Enhance robotic perception and environmental understanding - Streamline development processes through reusable capabilities The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. 
As a Senior Applied Scientist, you will lead the development of machine learning systems that help robots perceive, reason, and act in real-world environments. You will set technical direction for adapting and advancing state-of-the-art models (open source and internal research) into robust, safe, and high-performing “robot brain” capabilities for our target tasks, environments, and robot embodiments. You will drive rigorous capability profiling and experimentation, lead targeted innovation where gaps exist, and partner across research, controls, hardware, and product teams to ensure outputs can be further customized and deployed on specific robots. Key job responsibilities - Lead technical initiatives for foundation-model capabilities (e.g., visuomotor / VLA / video-action worldmodel-action policies), from problem definition through validated model deliverables. - Own model readiness for our embodiment class: drive adaptation, fine-tuning, and optimization (latency/throughput/robustness), and define success criteria that downstream teams can build on. - Establish and evolve capability evaluation: define benchmark strategy, metrics, and profiling methodology to quantify performance, generalization, and failure modes; ensure evaluations drive clear roadmap decisions. - Drive the data + training strategy needed to close key capability gaps, including data requirements, collection/curation standards, dataset quality/provenance, and repeatable training recipes (sim + real). - Invent and validate new methods when leveraging SOTA is insufficient—new training schemes, model components, supervision signals, or sim↔real techniques—backed by strong empirical evidence. - Influence cross-team technical decisions by collaborating with controls/WBC, hardware, and product teams on interfaces, constraints, and integration plans; communicate results via design docs and technical reviews. 
- Mentor and raise the bar: guide junior scientists/engineers, set best practices for experimentation and code quality, and drive a culture of rigor and reproducibility.
US, WA, Seattle
We are looking for a passionate Applied Scientist to help pioneer the next generation of agentic AI applications for Amazon advertisers. In this role, you will design agentic architectures, develop tools and datasets, and contribute to building systems that can reason, plan, and act autonomously across complex advertiser workflows. You will work at the forefront of applied AI, developing methods for fine-tuning, reinforcement learning, and preference optimization, while helping create evaluation frameworks that ensure safety, reliability, and trust at scale. You will work backwards from the needs of advertisers—delivering customer-facing products that directly help them create, optimize, and grow their campaigns. Beyond building models, you will advance the agent ecosystem by experimenting with and applying core primitives such as tool orchestration, multi-step reasoning, and adaptive preference-driven behavior. This role requires working independently on ambiguous technical problems, collaborating closely with scientists, engineers, and product managers to bring innovative solutions into production. Key job responsibilities - Design and build agents to guide advertisers in conversational and non-conversational experience. - Design and implement advanced model and agent optimization techniques, including supervised fine-tuning, instruction tuning and preference optimization (e.g., DPO/IPO). - Curate datasets and tools for MCP. - Build evaluation pipelines for agent workflows, including automated benchmarks, multi-step reasoning tests, and safety guardrails. - Develop agentic architectures (e.g., CoT, ToT, ReAct) that integrate planning, tool use, and long-horizon reasoning. - Prototype and iterate on multi-agent orchestration frameworks and workflows. - Collaborate with peers across engineering and product to bring scientific innovations into production. 
- Stay current with the latest research in LLMs, RL, and agent-based AI, and translate findings into practical applications. About the team The Sponsored Products and Brands team at Amazon Ads is re-imagining the advertising landscape through the latest generative AI technologies, revolutionizing how millions of customers discover products and engage with brands across Amazon.com and beyond. We are at the forefront of re-inventing advertising experiences, bridging human creativity with artificial intelligence to transform every aspect of the advertising lifecycle from ad creation and optimization to performance analysis and customer insights. We are a passionate group of innovators dedicated to developing responsible and intelligent AI technologies that balance the needs of advertisers, enhance the shopping experience, and strengthen the marketplace. If you're energized by solving complex challenges and pushing the boundaries of what's possible with AI, join us in shaping the future of advertising. The Campaign Strategies team within Sponsored Products and Brands is focused on guiding and supporting 1.6MM advertisers to meet their advertising needs of creating and managing ad campaigns. At this scale, the complexity of diverse advertiser goals, campaign types, and market dynamics creates both a massive technical challenge and a transformative opportunity: even small improvements in guidance systems can have outsized impact on advertiser success and Amazon’s retail ecosystem. Our vision is to build a highly personalized, context-aware agentic advertiser guidance system that leverages LLMs together with tools such as auction simulations, ML models, and optimization algorithms. This agentic framework, will operate across both chat and non-chat experiences in the ad console, scaling to natural language queries as well as proactively delivering guidance based on deep understanding of the advertiser. 
To execute this vision, we collaborate closely with stakeholders across Ad Console, Sales, and Marketing to identify opportunities—from high-level product guidance down to granular keyword recommendations—and deliver them through a tailored, personalized experience. Our work is grounded in state-of-the-art agent architectures, tool integration, reasoning frameworks, and model customization approaches (including tuning, MCP, and preference optimization), ensuring our systems are both scalable and adaptive.
US, WA, Bellevue
Amazon LEO is Amazon's low Earth orbit satellite network. Our mission is to deliver fast, reliable internet connectivity to customers beyond the reach of existing networks. From individual households to schools, hospitals, businesses, and government agencies, Amazon LEO will serve people and organizations operating in locations without reliable connectivity. The Amazon LEO Global Business Operations (GBO) team drives data-driven decision-making across sales, marketing, operations, product, engineering, finance, and legal functions. We build scalable business intelligence solutions and data infrastructure to solve complex, ambiguous problems with LEO-wide impact. We are looking for a talented Research Scientist to contribute to LEO's long-term vision and strategy for capacity simulations and inventory optimization. This effort will be instrumental in helping LEO execute on its business plans globally. As one of our valued team members, you will be obsessed with matching our standards for operational excellence with a relentless focus on delivering results. Key job responsibilities In this role, you will: Collaborate with product, business development, sales, marketing, operations, finance, and various technical teams (engineering, science, R&D, simulations, etc.) to support the implementation of capacity simulations and inventory optimization solutions. Develop and prototype scalable solutions to optimization problems for operating and planning satellite resources. Support technical roadmap definition efforts by building models to predict future inventory availability and key operational and financial metrics across the network. Design experiments and simulations to evaluate optimization improvements and understand how they interact with each other. Analyze large amounts of satellite and business data to identify simulation and optimization opportunities. 
Communicate insights and recommendations to technical and non-technical audiences to support decision-making across LEO. Export Control Requirement: Due to applicable export control laws and regulations, candidates must be a U.S. citizen or national, U.S. permanent resident (i.e., current Green Card holder), or lawfully admitted into the U.S. as a refugee or granted asylum.
US, CA, Sunnyvale
Prime Video is a first-stop entertainment destination offering customers a vast collection of premium programming in one app available across thousands of devices. Prime members can customize their viewing experience and find their favorite movies, series, documentaries, and live sports – including Amazon MGM Studios-produced series and movies; licensed fan favorites; and programming from Prime Video add-on subscriptions such as Apple TV+, Max, Crunchyroll and MGM+. All customers, regardless of whether they have a Prime membership or not, can rent or buy titles via the Prime Video Store, and can enjoy even more content for free with ads. Are you interested in shaping the future of entertainment? Prime Video's technology teams are creating best-in-class digital video experience. As a Prime Video technologist, you’ll have end-to-end ownership of the product, user experience, design, and technology required to deliver state-of-the-art experiences for our customers. You’ll get to work on projects that are fast-paced, challenging, and varied. You’ll also be able to experiment with new possibilities, take risks, and collaborate with remarkable people. We’ll look for you to bring your diverse perspectives, ideas, and skill-sets to make Prime Video even better for our customers. With global opportunities for talented technologists, you can decide where a career Prime Video Tech takes you! Our organization is building world-class teams with deep expertise in large-scale recommender systems. This role sits at the intersection of AI research and direct business impact, where recommendation quality directly influences business outcomes and customer satisfaction. You'll be joining a team focused on foundational models for recommender systems and working on production systems that serve millions of customers and shape the future of personalized entertainment experiences. 
We're seeking talent who can deliver measurable impact on our core business metrics while advancing the state-of-the-art in personalization and recommendation technology. Key job responsibilities - Develop AI solutions for various Prime Video Search & Recommendation systems using Deep Learning, Reinforcement Learning, Optimization Methods, and most importantly, GenAI - Work closely with engineers and product managers to design, implement and launch AI solutions end-to-end - Design and conduct offline and online (A/B) experiments to evaluate proposed solutions based on in-depth data analyses - Effectively communicate technical and non-technical ideas with teammates and stakeholders - Stay up-to-date with advancements and the latest modeling techniques in the field - Publish your research findings in top conferences and journals About the team The Prime Video - Personalization & Discovery Science team owns science solution to power search experience on various devices, from sourcing, relevance, & ranking (to name a few). We are on a mission to deliver an AI-first customer experience. At the heart of this transformation are our recommendation systems -- core, customer-facing components that serve as primary drivers of engagement & growth.
CA, ON, Toronto
Are you interested in shaping the future of Advertising and B2B Sales? We are a growing team with an exciting AI-first charter and need your passion, innovative thinking, and creativity to help take our products to new heights. Amazon Advertising is one of Amazon's fastest growing and most profitable businesses, responsible for defining and delivering a collection of advertising products that drive discovery and sales. Our products are strategically important to our businesses driving long term growth. We break fresh ground in product and technical innovations every day! Within the Advertising Sales organization, we are building a central AI/ML team and are seeking top Applied Science talent to help us build new, science-backed services that drive success for our customers. Our goal is to transform the way account teams operate by creating AI agents that help optimize their end-to-end workflows, and developing actionable insights and recommendations they can share with their advertising accounts As an Applied Scientist on the team with a specific focus on creating autonomous AI agents that can operate accurately at large scale, you will bring deep expertise in Natural Language Processing (inc. tokenization, syntactic parsing, named entity recognition (NER), sentiment analysis, text classification), Large Language Models (inc. foundation model fundamentals, post-training, reward modeling, RAG, transformer architecture), Deep Learning, Reinforcement Learning and/or Recommender Systems. You have the scientific and technical skills to build and refine models that can be implemented in production and you continuously measure the performance of your system to drive continuous improvements. You will contribute to chart new courses with our ad sales support technologies, and you have the communication skills necessary to explain complex technical approaches to a variety of stakeholders and customers. 
You will be part of a team of fellow scientists and engineers taking on iterative approaches to tackle big, long-term problems. You are fluently able to leverage the latest Generative AI systems and services to accelerate and improve your work while maintaining high quality in your work outputs. Key job responsibilities Scientific Modeling - Conceptualize and lead state-of-the-art research on new Reinforcement Learning, Deep Learning, NLP, LLM, (Generative) Artificial Intelligence and Recommender System solutions to create AI agents and optimize all aspects of the Ad Sales business - Lead the technical approach for the design and implementation of successful models and algorithms in support of expert cross-functional teams delivering on demanding projects - Run regular A/B experiments, gather data, and perform statistical analysis - Improve the scalability, efficiency and automation of large-scale data analytics, model training, deployment and serving - Publish scientific findings in reports and papers that can be shared internally and externally Product Development Support - Partner with software engineering and product management teams to support product and service development, define success metrics and measurement approaches, and help drive adoption of innovative new features for our services. - Lead requirements gathering sessions with product teams and business stakeholders - Maintain scientific documentation and knowledge for product initiatives Collaboration & Communication - Work closely with software engineers to deliver end-to-end solutions into production - Translate complex scientific findings into actionable business recommendations for stakeholders and senior management - Provide clear, compelling reports and presentations on a regular basis with respect to your models and services - Communicate with internal teams to showcase results and identify best practices. 
About the team Sales AI is a central science and engineering organization within Amazon Advertising Sales that powers selling motions and account team workflows via state-of-the-art of AI/ML services. Sales AI is investing in a range of sales intelligence models, including the development of advertiser insights, recommendations and Generative AI-powered applications throughout account team workflows.