Two new papers discuss how Alexa recognizes sounds

Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon dioxide alarms going off.

At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.

Today I’ll briefly discuss two others, both of which, like the first, describe machine learning systems. One paper addresses the problem of media detection, or recognizing when the speech captured by a digital-assistant device comes from a TV or radio rather than a human speaker. In particular, we develop a way to better characterize media audio by examining longer-duration audio streams versus merely classifying short audio snippets. Media detection helps filter a particularly deceptive type of background noise out of speech signals.

For our other paper, we used semi-supervised learning to train a system developed from an external dataset to do acoustic-event detection. Semi-supervised learning uses small sets of annotated training data to leverage larger sets of unannotated data. In particular, we use tri-training, in which three different models are trained to perform the same task, but on slightly different data sets. Pooling their outputs corrects a common problem in semi-supervised training, in which a model’s errors end up being amplified.

Our media detection system is based on the observation that the audio characteristics we would most like to identify are those common to all instances of media sound, regardless of content. Our network design is an attempt to abstract away from the properties of particular training examples.

Like many machine learning models in the field of spoken-language understanding, ours uses recurrent neural networks (RNNs). An RNN processes sequenced inputs in order, and each output factors in the inputs and outputs that preceded it.

We use a convolutional neural network (CNN) as feature extractor, and stack RNN layers on top of it. But each RNN layer has only a fraction as many nodes as the one beneath it. That is, only every third or fourth output from the first RNN provides an input to the second, and only every third or fourth output of the second RNN provides an input to the third.

Pyramidal.jpg._CB465895532_.jpg
A standard stack of recurrent neural networks (left) and the “pyramidal” stack we use instead

Because the networks are recurrent, each output we pass contains information about the outputs we skip. But this “pyramidal” stacking encourages the model to ignore short-term variations in the input signal.

For every five-second snippet of audio processed by our system, the pyramidal RNNs produce a single output vector, representing the probabilities that the snippet belongs to any of several different sound categories.

But our system includes still another RNN, which tracks relationships between five-second snippets. We experimented with two different ways of integrating that higher-level RNN with the pyramidal RNNs. In the first, the output vector from the pyramidal RNN simply passes to the higher-level RNN, which makes the final determination about whether media sound is present.

In the other, however, the higher-level RNN lies between the middle and top layers of the pyramidal RNN. It receives its input from the middle layer, and its output, along with that of the middle layer, passes to the top layer of the pyramidal RNN.

contextual_2.jpg._CB465896350_.jpg
In the second of our two contextual models, a high-level RNN (red circles) receives inputs from one layer of a pyramidal RNN (groups of five blue circles), and its output passes to the next layer (groups of two blue circles).

This was our best-performing model. When compared to a model that used the pyramidal RNNs but no higher-level RNN, it offered a 24% reduction in equal error rate, which is the error rate that results when the system parameters are set so that the false-positive rate equals the false-negative rate.

Our other ICASSP paper presents our semi-supervised approach to acoustic-event detection (AED). One popular and simple semi-supervised learning technique is self-training, in which a machine learning model is trained on a small amount of labeled data and then itself labels a much larger set of unlabeled data. The machine-labeled data is then sorted according to confidence score — the system’s confidence that its labels are correct — and data falling in the right confidence window is used to fine-tune the model.

The model, that is, is retrained on data that it has labeled itself. Remarkably, this approach tends to improve the model’s performance.

But it also poses a risk. If the model makes a systematic error, and if it makes it with high confidence, then that error will feed back into the model during self-training, growing in magnitude.

Tri-training is intended to mitigate this kind of self-reinforcement. In our experiments, we created three different training sets, each the size of the original — 39,000 examples — by randomly sampling data from the original. There was substantial overlap between the sets, but in each, some data items were oversampled, and some were undersampled.

We trained neural networks on all three data sets and saved copies of them, which we might call initial models. Then we used each of those networks to label another 5.4 million examples. For each of the initial models, we used machine-labeled data to re-train it only if both of the other models agreed on the labels with high confidence. In all, we retained only 5,000 examples out of the more than five million in the unlabeled data set.

Finally, we used six different models to classify the examples in our test set: the three initial models and the three retrained models. On samples of three sounds — dog sounds, baby cries, and gunshots — pooling the results of all six models led to reductions in equal-error rate (EER) of 16%, 26%, and 19%, respectively, over a standard self-trained model.

Of course, using six different models to process the same input is impractical, so we also trained a seventh neural network to mimic the aggregate results of the first six. On the test set, that network was not quite as accurate as the six-network ensemble, but it was still a marked improvement over the standard self-trained model, reducing EER on the same three sample sets by 11%, 18%, and 6%, respectively.

Acknowledgments: Qingming Tang, Chieh-Chi Kao, Viktor Rozgic, Bowen Shi, Spyros Matsoukas, Chao Wang

Research areas

Related content

IN, TS, Hyderabad
Have you ever wondered how Amazon launches and maintains a consistent customer experience across hundreds of countries and languages it serves its customers? Are you passionate about data and mathematics, and hope to impact the experience of millions of customers? Are you obsessed with designing simple algorithmic solutions to very challenging problems? If so, we look forward to hearing from you! At Amazon, we strive to be Earth's most customer-centric company, where both internal and external customers can find and discover anything they want in their own language of preference. Our Translations Services (TS) team plays a pivotal role in expanding the reach of our marketplace worldwide and enables thousands of developers and other stakeholders (Product Managers, Program Managers, Linguists) in developing locale specific solutions. Amazon Translations Services (TS) is seeking an Applied Scientist to be based in our Hyderabad office. As a key member of the Science and Engineering team of TS, this person will be responsible for designing algorithmic solutions based on data and mathematics for translating billions of words annually across 130+ and expanding set of locales. The successful applicant will ensure that there is minimal human touch involved in any language translation and accurate translated text is available to our worldwide customers in a streamlined and optimized manner. With access to vast amounts of data, technology, and a diverse community of talented individuals, you will have the opportunity to make a meaningful impact on the way customers and stakeholders engage with Amazon and our platform worldwide. Together, we will drive innovation, solve complex problems, and shape the future of e-commerce. Key job responsibilities * Apply your expertise in LLM models to design, develop, and implement scalable machine learning solutions that address complex language translation-related challenges in the eCommerce space. * Collaborate with cross-functional teams, including software engineers, data scientists, and product managers, to define project requirements, establish success metrics, and deliver high-quality solutions. * Conduct thorough data analysis to gain insights, identify patterns, and drive actionable recommendations that enhance seller performance and customer experiences across various international marketplaces. * Continuously explore and evaluate state-of-the-art modeling techniques and methodologies to improve the accuracy and efficiency of language translation-related systems. * Communicate complex technical concepts effectively to both technical and non-technical stakeholders, providing clear explanations and guidance on proposed solutions and their potential impact. About the team We are a start-up mindset team. As the long-term technical strategy is still taking shape, there is a lot of opportunity for this fresh Science team to innovate by leveraging Gen AI technoligies to build scalable solutions from scratch. Our Vision: Language will not stand in the way of anyone on earth using Amazon products and services. Our Mission: We are the enablers and guardians of translation for Amazon's customers. We do this by offering hands-off-the-wheel service to all Amazon teams, optimizing translation quality and speed at the lowest cost possible.
US, VA, Arlington
Are you fascinated by the power of Large Language Models (LLM) and Artificial Intelligence (AI) to transform the way we learn and interact with technology? Are you passionate about applying advanced machine learning (ML) techniques to solve complex challenges in the cloud learning space? If so, AWS Training & Certification (T&C) team has an exciting opportunity for you as an Applied Scientist. At AWS T&C, we strive to be leaders in not only how we learn about the latest AI/ML development and AWS services, but also how the same technologies transform the way we learn about them. As an Applied Scientist, you will join a talented and collaborative team that is dedicated to driving innovation and delivering exceptional experiences in our Skill Builder platform for both new learners and seasoned developers. You will be a part of a global team that is focused on transforming how people learn. The position will interact with global leaders and teams across the globe as well as different business and technical organizations. Join us at the AWS T&C Science Team and become a part of a global team that is redefining the future of cloud learning. With access to vast amounts of data, exciting new technology, and a diverse community of talented individuals, you will have the opportunity to make a meaningful impact on the ways how worldwide learners engage with our learning system and builders develop on our platform. Together, we will drive innovation, solve complex problems, and shape the future of future-generation cloud builders. Please visit https://skillbuilder.awsto learn more. Key job responsibilities - Apply your expertise in LLM to design, develop, and implement scalable machine learning solutions that address challenges in discovery and engagement for our international audiences. - Collaborate with cross-functional teams, including software engineers, data engineers, scientists, and product managers, to define project requirements, establish success metrics, and deliver high-quality solutions. - Conduct thorough data analysis to gain insights, identify patterns, and drive actionable recommendations that enhance operational performance and customer experiences across Skill Builder. - Continuously explore and evaluate state-of-the-art techniques and methodologies to improve the accuracy and efficiency of AI/ML systems. - Communicate complex technical concepts effectively to both technical and non-technical stakeholders, providing clear explanations and guidance on proposed solutions and their potential impact. About the team Why AWS? Amazon Web Services (AWS) is the world’s most comprehensive and broadly adopted cloud platform. We pioneered cloud computing and never stopped innovating — that’s why customers from the most successful startups to Global 500 companies trust our robust suite of products and services to power their businesses. Inclusive Team Culture Here at AWS, it’s in our nature to learn and be curious. Our employee-led affinity groups foster a culture of inclusion that empower us to be proud of our differences. Ongoing events and learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon conferences, inspire us to never stop embracing our uniqueness. Diverse Experiences AWS values diverse experiences. Even if you do not meet all of the qualifications and skills listed in the job description, we encourage candidates to apply. If your career is just starting, hasn’t followed a traditional path, or includes alternative experiences, don’t let it stop you from applying. Mentorship & Career Growth We’re continuously raising our performance bar as we strive to become Earth’s Best Employer. That’s why you’ll find endless knowledge-sharing, mentorship and other career-advancing resources here to help you develop into a better-rounded professional. Work/Life Balance We value work-life harmony. Achieving success at work should never come at the expense of sacrifices at home, which is why we strive for flexibility as part of our working culture. When we feel supported in the workplace and at home, there’s nothing we can’t achieve in the cloud.
US, MA, N.reading
Amazon Industrial Robotics is seeking exceptional talent to help develop the next generation of advanced robotics systems that will transform automation at Amazon's scale. We're building revolutionary robotic systems that combine cutting-edge AI, sophisticated control systems, and advanced mechanical design to create adaptable automation solutions capable of working safely alongside humans in dynamic environments. This is a unique opportunity to shape the future of robotics and automation at an unprecedented scale, working with world-class teams pushing the boundaries of what's possible in robotic dexterous manipulation, locomotion, and human-robot interaction. This role presents an opportunity to shape the future of robotics through innovative applications of deep learning and large language models. At Amazon Industrial Robotics we leverage advanced robotics, machine learning, and artificial intelligence to solve complex operational challenges at an unprecedented scale. Our fleet of robots operates across hundreds of facilities worldwide, working in sophisticated coordination to fulfill our mission of customer excellence. We are pioneering the development of robotics dexterous hands that: - Enable unprecedented generalization across diverse tasks - Are compliant and durable - Can span tasks from power grasps to fine dexterity and nonprehensile manipulation - Can navigate the uncertainty of the environment - Leverage mechanical intelligence, multi-modal sensor feedback and advanced control techniques. The ideal candidate will contribute to research that bridges the gap between theoretical advancement and practical implementation in robotics. You will be part of a team that's revolutionizing how robots learn, adapt, and interact with their environment. Join us in building the next generation of intelligent robotics systems that will transform the future of automation and human-robot collaboration. Key job responsibilities - Design and implement robust sensing for dexterous manipulation, including but not limited to: Tactile sensing, Position sensing, Force sensing, Non-contact sensing - Prototype the various identified sensing strategies, considering the constraints of the rest of the hand design - Build and test full hand sensing prototypes to validate the performance of the solution - Develop testing and validation strategies, supporting fast integration into the rest of the robot - Partner with cross-functional teams to iterate on concepts and prototypes - Work with Amazon's robotics engineering and operations customers to deeply understand their requirements and develop tailored solutions - Document the designs, performance, and validation of the final system
US, CA, San Francisco
Join the next revolution in robotics at Amazon's Frontier AI & Robotics team. As a Senior Applied Scientist, you'll spearhead the development of breakthrough foundation models and full-stack robotics systems that enable robots to perceive, understand, and interact with the world in unprecedented ways. You'll drive technical excellence in areas such as locomotion, manipulation, sim2real transfer, multi-modal and multi-task robot learning, designing novel frameworks that bridge the gap between research and real-world deployment at Amazon scale. In this role, you'll combine hands-on technical work with scientific leadership, ensuring your team delivers robust solutions for dynamic real-world environments. You'll leverage Amazon's vast computational resources to tackle ambitious problems in areas like very large multi-modal robotic foundation models and efficient, promptable model architectures that can scale across diverse robotic applications. Key job responsibilities - Lead technical initiatives across the robotics stack, driving breakthrough approaches through hands-on research and development in areas including robot co-design, dexterous manipulation mechanisms, innovative actuation strategies, state estimation, low-level control, system identification, reinforcement learning, and sim-to-real transfer, as well as foundation models for perception and manipulation - Guide technical direction for full-stack robotics projects from conceptualization through deployment, taking a system-level approach that integrates hardware considerations with algorithmic development - Develop and optimize control algorithms and sensing pipelines that enable robust performance in production environments - Mentor fellow scientists while maintaining strong individual technical contributions - Collaborate with platform and hardware teams to ensure seamless integration across the entire robotics stack - Influence technical decisions and implementation strategies within your area of focus A day in the life - Design and implement innovative systems and algorithms, working hands-on with our extensive infrastructure to prototype and evaluate at scale - Guide fellow scientists in solving complex technical challenges across the full robotics stack - Lead focused technical initiatives from conception through deployment, ensuring successful integration with production systems - Drive technical discussions within your team and with key stakeholders - Conduct experiments and prototype new ideas using our massive compute cluster and extensive robotics infrastructure - Mentor team members while maintaining significant hands-on contribution to technical solutions About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through innovative foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, CA, San Francisco
Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers to push the boundaries of what's possible in robotic intelligence. As an Applied Scientist, you'll be at the forefront of developing breakthrough foundation models that enable robots to perceive, understand, and interact with the world in unprecedented ways. You'll drive independent research initiatives in areas such as locomotion, manipulation, sim2real transfer, multi-modal and multi-task robot learning, designing novel frameworks that bridge the gap between state-of-the-art research and real-world deployment at Amazon scale. In this role, you'll balance innovative technical exploration with practical implementation, collaborating with platform teams to ensure your models and algorithms perform robustly in dynamic real-world environments. You'll have access to Amazon's vast computational resources, enabling you to tackle ambitious problems in areas like very large multi-modal robotic foundation models and efficient, promptable model architectures that can scale across diverse robotic applications. Key job responsibilities - Drive independent research initiatives across the robotics stack, including robot co-design, dexterous manipulation mechanisms, innovative actuation strategies, state estimation, low-level control, system identification, reinforcement learning, and sim-to-real transfer, as well as foundation models for perception and manipulation - Lead full-stack robotics projects from conceptualization through deployment, taking a system-level approach that integrates hardware considerations with algorithmic development - Develop and optimize control algorithms and sensing pipelines that enable robust performance in production environments - Collaborate with platform and hardware teams to ensure seamless integration across the entire robotics stack - Contribute to the team's technical strategy and help shape our approach to next-generation robotics challenges A day in the life - Design and implement innovative systems and algorithms, leveraging our extensive infrastructure to prototype and evaluate at scale - Collaborate with our world-class research team to solve complex technical challenges - Lead technical initiatives from conception to deployment, working closely with robotics engineers to integrate your solutions into production systems - Participate in technical discussions and brainstorming sessions with team leaders and fellow scientists - Leverage our massive compute cluster and extensive robotics infrastructure to rapidly prototype and validate new ideas - Transform theoretical insights into practical solutions that can handle the complexities of real-world robotics applications About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through ground breaking foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
CA, QC, Montreal
Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers to push the boundaries of what's possible in robotic intelligence. As an Applied Scientist, you'll be at the forefront of developing breakthrough foundation models that enable robots to perceive, understand, and interact with the world in unprecedented ways. You'll drive independent research initiatives in areas such as perception, manipulation, scene understanding, sim2real transfer, multi-modal foundation models, and multi-task learning, designing novel algorithms that bridge the gap between state-of-the-art research and real-world deployment at Amazon scale. In this role, you'll balance innovative technical exploration with practical implementation, collaborating with platform teams to ensure your models and algorithms perform robustly in dynamic real-world environments. You'll have access to Amazon's vast computational resources, enabling you to tackle ambitious problems in areas like very large multi-modal robotic foundation models and efficient, promptable model architectures that can scale across diverse robotic applications. Key job responsibilities - Design and implement novel deep learning architectures that push the boundaries of what robots can understand and accomplish - Drive independent research initiatives in robotics foundation models, focusing on breakthrough approaches in perception, and manipulation, for example open-vocabulary panoptic scene understanding, scaling up multi-modal LLMs, sim2real/real2sim techniques, end-to-end vision-language-action models, efficient model inference, video tokenization - Lead technical projects from conceptualization through deployment, ensuring robust performance in production environments - Collaborate with platform teams to optimize and scale models for real-world applications - Contribute to the team's technical strategy and help shape our approach to next-generation robotics challenges A day in the life - Design and implement novel foundation model architectures, leveraging our extensive compute infrastructure to train and evaluate at scale - Collaborate with our world-class research team to solve complex technical challenges - Lead technical initiatives from conception to deployment, working closely with robotics engineers to integrate your solutions into production systems - Participate in technical discussions and brainstorming sessions with team leaders and fellow scientists - Leverage our massive compute cluster and extensive robotics infrastructure to rapidly prototype and validate new ideas - Transform theoretical insights into practical solutions that can handle the complexities of real-world robotics applications About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through ground breaking foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
IL, Tel Aviv
Come build the future of entertainment with us. Are you interested in helping shape the future of movies and television? Do you want to help define the next generation of how and what Amazon customers are watching? Prime Video is a premium streaming service that offers customers a vast collection of TV shows and movies - all with the ease of finding what they love to watch in one place. We offer customers thousands of popular movies and TV shows from Originals and Exclusive content to exciting live sports events. We also offer our members the opportunity to subscribe to add-on channels which they can cancel at any time and to rent or buy new release movies and TV box sets on the Prime Video Store. Prime Video is a fast-paced, growth business - available in over 240 countries and territories worldwide. The team works in a dynamic environment where innovating on behalf of our customers is at the heart of everything we do. If this sounds exciting to you, please read on We are seeking an exceptional Applied Scientist to join our Prime Video Sports personalization team in Israel. Our team is dedicated to developing state-of-the-art science to personalize the customer experience and help customers seamlessly find any live event in our selection. You will have the opportunity to work on innovative, large-scale projects that push the boundaries of what's possible in sports content delivery and engagement. Your expertise will be crucial in tackling complex challenges such as information retrieval, sequential modeling, realtime model optimizations, utilizing Large Language Models (LLMs), and building state-of-the-art complex recommender systems. Key job responsibilities We are looking for an Applied Scientist with domain expertise in Personalization, Information Retrieval, and Recommender Systems, or general ML to develop new algorithms and end-to-end solutions. As part of our team of applied scientists and software development engineers, you will be responsible for researching, designing, developing, and deploying algorithms into production pipelines. Your role will involve working with cutting-edge technologies in recommender systems and search. You'll also tackle unique challenges like temporal information retrieval to improve real-time sports content recommendations. As a technologist, you will drive the publication of original work in top-tier conferences in Machine Learning and Recommender Systems. We expect you to thrive in ambiguous situations, demonstrating outstanding analytical abilities and comfort in collaborating with cross-functional teams and systems. The ideal candidate is a self-starter with the ability to learn and adapt quickly in our fast-paced environment. About the team We are the Prime Video Sports team. In September 2018 Prime Video launched its first full-scale live streaming experience to world-wide Prime customers with NFL Thursday Night Football. That was just the start. Now Amazon has exclusive broadcasting rights to major leagues like NFL Thursday Night Football, Tennis majors like Roland-Garros and English Premier League to list a few and are broadcasting live events across 30+ sports world-wide. Prime Video is expanding not just the breadth of live content that it offers, but the depth of the experience. This is a transformative opportunity, the chance to be at the vanguard of a program that will revolutionize Prime Video, and the live streaming experience of customers everywhere.
US, WA, Seattle
Within Amazon’s Corporate Financial Planning & Analysis team (FP&A), we enjoy a unique vantage point into everything happening within Amazon. This is exciting opportunity for scientist to join our Financial Transformation team, where you will get to harness the power of statistical and machine learning models to revolutionize finance forecasting that spans entire company and business units. As a key player in this innovative group, you'll be at the forefront of applying state-of-the-art scientific approaches and emerging technologies to solve complex financial challenges. Your deep domain expertise will be instrumental in identifying and addressing customer needs, often venturing into uncharted territories where textbook solutions don't exist. You'll have the chance to author Finance AI articles, showcasing your novel work to both internal and external audiences. Key job responsibilities Your role will involve developing production-ready science models/components that directly impact large-scale systems and services, making critical decisions on implementation complexity and technology adoption. You'll be a driving force in MLOps, optimizing compute and inference usage and enhancing system performance. Beyond technical prowess, you'll contribute to financial strategic planning, mentor team members, and represent our tech. organization in the broader scientific community. This role offers a perfect blend of hands-on development, strategic thinking, and thought leadership in the exciting intersection of finance and advanced analytics. Ready to shape the future of financial forecasting? Join us and let's transform the industry together!
CA, QC, Montreal
Join the next revolution in robotics at Amazon's Frontier AI & Robotics team, where you'll work alongside world-renowned AI pioneers to push the boundaries of what's possible in robotic intelligence. As an Applied Scientist, you'll be at the forefront of developing breakthrough foundation models that enable robots to perceive, understand, and interact with the world in unprecedented ways. You'll drive independent research initiatives in areas such as perception, manipulation, scene understanding, sim2real transfer, multi-modal foundation models, and multi-task learning, designing novel algorithms that bridge the gap between state-of-the-art research and real-world deployment at Amazon scale. In this role, you'll balance innovative technical exploration with practical implementation, collaborating with platform teams to ensure your models and algorithms perform robustly in dynamic real-world environments. You'll have access to Amazon's vast computational resources, enabling you to tackle ambitious problems in areas like very large multi-modal robotic foundation models and efficient, promptable model architectures that can scale across diverse robotic applications. Key job responsibilities - Design and implement novel deep learning architectures that push the boundaries of what robots can understand and accomplish - Drive independent research initiatives in robotics foundation models, focusing on breakthrough approaches in perception, and manipulation, for example open-vocabulary panoptic scene understanding, scaling up multi-modal LLMs, sim2real/real2sim techniques, end-to-end vision-language-action models, efficient model inference, video tokenization - Lead technical projects from conceptualization through deployment, ensuring robust performance in production environments - Collaborate with platform teams to optimize and scale models for real-world applications - Contribute to the team's technical strategy and help shape our approach to next-generation robotics challenges A day in the life - Design and implement novel foundation model architectures, leveraging our extensive compute infrastructure to train and evaluate at scale - Collaborate with our world-class research team to solve complex technical challenges - Lead technical initiatives from conception to deployment, working closely with robotics engineers to integrate your solutions into production systems - Participate in technical discussions and brainstorming sessions with team leaders and fellow scientists - Leverage our massive compute cluster and extensive robotics infrastructure to rapidly prototype and validate new ideas - Transform theoretical insights into practical solutions that can handle the complexities of real-world robotics applications About the team At Frontier AI & Robotics, we're not just advancing robotics – we're reimagining it from the ground up. Our team is building the future of intelligent robotics through ground breaking foundation models and end-to-end learned systems. We tackle some of the most challenging problems in AI and robotics, from developing sophisticated perception systems to creating adaptive manipulation strategies that work in complex, real-world scenarios. What sets us apart is our unique combination of ambitious research vision and practical impact. We leverage Amazon's massive computational infrastructure and rich real-world datasets to train and deploy state-of-the-art foundation models. Our work spans the full spectrum of robotics intelligence – from multimodal perception using images, videos, and sensor data, to sophisticated manipulation strategies that can handle diverse real-world scenarios. We're building systems that don't just work in the lab, but scale to meet the demands of Amazon's global operations. Join us if you're excited about pushing the boundaries of what's possible in robotics, working with world-class researchers, and seeing your innovations deployed at unprecedented scale.
US, WA, Seattle
The Sponsored Products and Brands (SPB) team at Amazon Ads is transforming advertising through generative AI technologies. We help millions of customers discover products and engage with brands across Amazon.com and beyond. Our team combines human creativity with artificial intelligence to reinvent the entire advertising lifecycle—from ad creation and optimization to performance analysis and customer insights. We develop responsible AI technologies that balance advertiser needs, enhance shopping experiences, and strengthen the marketplace. Our team values innovation and tackles complex challenges that push the boundaries of what's possible with AI. Join us in shaping the future of advertising. Key job responsibilities This role will redesign how ads create personalized, relevant shopping experiences with customer value at the forefront. Key responsibilities include: - Design and develop solutions using GenAI, deep learning, multi-objective optimization and/or reinforcement learning to transform ad retrieval, auctions, whole-page relevance, and shopping experiences. - Partner with scientists, engineers, and product managers to build scalable, production-ready science solutions. - Apply industry advances in GenAI, Large Language Models (LLMs), and related fields to create innovative prototypes and concepts. - Improve the team's scientific and technical capabilities by implementing algorithms, methodologies, and infrastructure that enable rapid experimentation and scaling. - Mentor junior scientists and engineers to build a high-performing, collaborative team. A day in the life As an Applied Scientist on the Sponsored Products and Brands Off-Search team, you will contribute to the development in Generative AI (GenAI) and Large Language Models (LLMs) to revolutionize our advertising flow, backend optimization, and frontend shopping experiences. This is a rare opportunity to redefine how ads are retrieved, allocated, and/or experienced—elevating them into personalized, contextually aware, and inspiring components of the customer journey. You will have the opportunity to fundamentally transform areas such as ad retrieval, ad allocation, whole-page relevance, and differentiated recommendations through the lens of GenAI. By building novel generative models grounded in both Amazon’s rich data and the world’s collective knowledge, your work will shape how customers engage with ads, discover products, and make purchasing decisions. If you are passionate about applying frontier AI to real-world problems with massive scale and impact, this is your opportunity to define the next chapter of advertising science. About the team The Off-Search team within Sponsored Products and Brands (SPB) is focused on building delightful ad experiences across various surfaces beyond Search on Amazon—such as product detail pages, the homepage, and store-in-store pages—to drive monetization. Our vision is to deliver highly personalized, context-aware advertising that adapts to individual shopper preferences, scales across diverse page types, remains relevant to seasonal and event-driven moments, and integrates seamlessly with organic recommendations such as new arrivals, basket-building content, and fast-delivery options. To execute this vision, we work in close partnership with Amazon Stores stakeholders to lead the expansion and growth of advertising across Amazon-owned and -operated pages beyond Search. We operate full stack—from backend ads-retail edge services, ads retrieval, and ad auctions to shopper-facing experiences—all designed to deliver meaningful value.