Alexa’s speech recognition research at ICASSP 2022

Multimodal training, signal-to-interpretation, and BERT rescoring are just a few topics covered by Amazon’s 21 speech-related papers.

This week, the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) got under way in virtual form, to be followed by an in-person meeting two weeks later (May 22-27) in Singapore. ICASSP is the flagship conference of the IEEE Signal Processing Society and, as such, one of the premier venues for publishing the latest advances in automatic speech recognition (ASR) and other speech-processing and speech-related fields, with strong participation from both industry and academia.

This year, the Alexa AI ASR organization is represented by 21 papers, more than in any prior year, reflecting the growth of speech-related science in Alexa AI. Here we highlight a few of these papers, to give an idea of their breadth.

Multimodal pretraining for end-to-end ASR

Deep learning has become the method of choice for speech-based recognition and classification tasks. Increasingly, models are first pretrained on large unlabeled datasets through self-supervised representation learning and then “fine-tuned” on task-labeled data.

In their paper “Multi-modal pre-training for automated speech recognition”, David Chan and colleagues give a new twist to this approach by pretraining speech representations on audiovisual data. As the self-supervision task for both modalities, they adapt the masked language model, in which words of training sentences are randomly masked out and the model learns to predict them. In their case, however, the masks are applied to features extracted from the video and audio streams.

In "Multi-modal pre-training for automated speech recognition", Amazon researchers adapt the masked language model, which learns to predict masked-out words of training sentences, to features extracted from video and audio streams.

Once pretrained, the audio-only portion of the learned representation is fused with a more standard front-end representation to feed into an end-to-end speech recognition system. The researchers show that this approach yields more accurate ASR results than pretraining with only audio-based self-supervision, suggesting that the correlations between acoustic and visual signals are helpful in extracting higher-level structures relevant to the encoding of speech.
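
To make the masking idea concrete, here is a minimal Python sketch of masked prediction applied to feature frames rather than words. It is an illustration under stated assumptions, not the authors' implementation: the function names `mask_frames` and `reconstruction_loss`, the choice to zero out masked frames, the mean-squared-error objective, and the toy feature values are all hypothetical.

```python
import random

def mask_frames(features, mask_prob=0.15, seed=0):
    """Zero out randomly chosen frames; return the masked copy and the masked indices."""
    rng = random.Random(seed)
    masked = [list(frame) for frame in features]
    masked_idx = []
    for t, frame in enumerate(masked):
        if rng.random() < mask_prob:
            masked[t] = [0.0] * len(frame)  # assumption: masking means zeroing the frame
            masked_idx.append(t)
    return masked, masked_idx

def reconstruction_loss(predicted, target, masked_idx):
    """Mean squared error computed only over the masked frames (assumed objective)."""
    if not masked_idx:
        return 0.0
    total, count = 0.0, 0
    for t in masked_idx:
        for p, y in zip(predicted[t], target[t]):
            total += (p - y) ** 2
            count += 1
    return total / count

# Toy "audio" features: 10 frames with 4 dimensions each (made-up values).
audio = [[float(t + d) for d in range(4)] for t in range(10)]
masked_audio, idx = mask_frames(audio, mask_prob=0.3, seed=1)

# Stand-in "model" that simply echoes its masked input; a real model would
# try to predict the original frames from the unmasked context.
loss = reconstruction_loss(masked_audio, audio, idx)
```

In a multimodal setup of the kind the article describes, the same masking would be applied to video features as well, so that the model can draw on one modality to fill in gaps in the other.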

Signal-to-interpretation with multimodal embeddings

The advantages of multimodality are not limited to unsupervised-learning settings. In “Tie your embeddings down: Cross-modal latent spaces for end-to-end spoken language understanding”, Bhuvan Agrawal and coauthors study signal-to-interpretation (S2I) recognizers that map a sequential acoustic input to an embedding, from which the intent of an utterance is directly inferred.

In "Tie your embeddings down: Cross-modal latent spaces for end-to-end spoken language understanding", Amazon researchers train encoders to generate acoustic and text embeddings in the same representational space, so that the origin of the embeddings becomes indistinguishable.

This bypasses the need for explicit speech transcription but still uses supervision for utterance intents. Due to their compactness, S2I models are attractive for on-device deployment, which has multiple benefits. For example, Alexa AI has used on-device speech processing to make Alexa faster and lower-bandwidth.

Agrawal and colleagues show that S2I recognizers give better results when their acoustic embeddings are constrained to be close to embeddings of the corresponding textual input produced by a pretrained language model (BERT). As in the earlier paper, this cross-modal signal is used during learning only and not required for inference (i.e., at runtime). It is a clever way to sneak linguistic structure back into the S2I system while also infusing it with knowledge gleaned from the vastly larger language model training data.
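
A rough way to picture this constraint is as an extra term in the training loss. The sketch below is only an assumption-laden illustration: the squared Euclidean distance, the `tie_weight` parameter, and the function names are hypothetical and are not taken from the paper.

```python
import math

def cross_entropy(logits, label):
    """Softmax cross-entropy for a single example (numerically stabilized)."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

def squared_distance(a, b):
    """Squared Euclidean distance between two embeddings of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def s2i_loss(intent_logits, intent_label, acoustic_emb, text_emb, tie_weight=0.5):
    """Intent loss plus a penalty pulling the acoustic embedding toward the text embedding.

    text_emb stands in for a sentence embedding from a pretrained language model;
    it is needed only during training, not at inference time.
    """
    intent_term = cross_entropy(intent_logits, intent_label)
    tie_term = squared_distance(acoustic_emb, text_emb)
    return intent_term + tie_weight * tie_term

# Toy values: two intents, two-dimensional embeddings.
loss_value = s2i_loss([2.0, 0.5], 0, [1.0, 2.0], [0.8, 2.1])
```

Because the text embedding appears only in the loss, the deployed model needs just the acoustic encoder and the intent classifier.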

The TinyS2I architecture. From "TinyS2I: A small-footprint utterance classification model with contextual support for on-device SLU".

The idea of matching embeddings derived from audio to those for corresponding text strings (i.e., transcripts) also has other applications. In their paper “TinyS2I: A small-footprint utterance classification model with contextual support for on-device SLU”, Anastasios Alexandridis et al. show that extremely compact, low-latency speech-understanding models can be obtained for the utterances most frequently used to control certain applications, such as media playback.

The most frequent control commands (“pause”, “volume up”, and the like) can be classified directly from an acoustic embedding. For commands involving an item from a contextual menu (“play [title]”), the acoustic embedding is matched to the media title’s textual embedding. In this paper, unlike the previous one, the textual embeddings are trained jointly with the acoustic ones. But the same triplet loss function can be used to align the cross-modal embeddings in a shared space.
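
The two ingredients mentioned above, a triplet loss for alignment and nearest-neighbor matching against a contextual menu, can be sketched as follows. This is a simplified illustration, not the TinyS2I model: the cosine distance, the margin value, the function names, and the toy embeddings are assumptions.

```python
import math

def cosine_distance(a, b):
    """1 minus cosine similarity; assumes both vectors are non-zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Penalize the anchor unless it is closer to the positive than to the negative by the margin."""
    gap = cosine_distance(anchor, positive) - cosine_distance(anchor, negative)
    return max(0.0, gap + margin)

def match_title(acoustic_emb, title_embs):
    """Return the menu title whose text embedding is nearest to the acoustic embedding."""
    return min(title_embs, key=lambda title: cosine_distance(acoustic_emb, title_embs[title]))

# Hypothetical menu of two titles with made-up two-dimensional text embeddings.
titles = {"song a": [1.0, 0.0], "song b": [0.0, 1.0]}
acoustic = [0.9, 0.1]  # made-up acoustic embedding of the spoken title
best = match_title(acoustic, titles)
```

During training, the anchor would be an acoustic embedding, the positive its matching text embedding, and the negative the embedding of a different title; at runtime only `match_title` is needed.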

ASR rescoring with BERT

Deep encoders of text trained using the masked-language-model (MLM) paradigm, such as BERT, have been widely used as the basis for all sorts of natural-language tasks. As mentioned earlier, they can incorporate vast amounts of language data through self-supervised pretraining, followed by task-specific supervised fine-tuning.

So far, however, the practical impact of MLMs on ASR proper has been limited, in part because of unsatisfactory tradeoffs between computational overhead (latency) and achievable accuracy gains. This is now changing with the work of Liyan Xu et al., as described in “RescoreBERT: Discriminative speech recognition rescoring with BERT”.

The researchers show how BERT-generated sentence encodings can be incorporated into a model that rescores the text strings output by an ASR model. Because BERT is trained on large corpora of (text-only) public data, it understands the relative probabilities of different ASR hypotheses better than the ASR model can.

The researchers achieved their best results with a combined loss function that is based on both sentence pseudo-likelihood — a more computationally tractable estimate of sentence likelihood — and word error prediction. The resulting rescoring model is so effective compared to standard LSTM (long short-term memory) language models, while also exhibiting lower latency, that the RescoreBERT method has gone from internship project to Alexa production in less than a year.
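
For readers unfamiliar with sentence pseudo-likelihood, the sketch below shows how it could be computed and used to rerank an n-best list. It does not reproduce RescoreBERT or its training loss: the toy scorer `toy_masked_prob`, the additive score combination, the `weight` parameter, and the example hypotheses are all hypothetical.

```python
import math

def pseudo_log_likelihood(tokens, masked_token_prob):
    """Sum the log-probability of each token when that token alone is masked out."""
    return sum(math.log(masked_token_prob(tokens, i)) for i in range(len(tokens)))

def rescore(nbest, masked_token_prob, weight=1.0):
    """Rank (text, first_pass_log_score) pairs by combined score, best first."""
    def total(item):
        text, first_pass = item
        return first_pass + weight * pseudo_log_likelihood(text.split(), masked_token_prob)
    return sorted(nbest, key=total, reverse=True)

# Stand-in for a masked language model: favors words from a tiny vocabulary.
COMMON = {"play", "the", "beatles"}

def toy_masked_prob(tokens, i):
    return 0.9 if tokens[i] in COMMON else 0.1

# Made-up first-pass scores; the first pass slightly prefers the wrong spelling.
nbest = [("play the beetles", -1.0), ("play the beatles", -1.2)]
best_text, _ = rescore(nbest, toy_masked_prob)[0]
```

Computing the pseudo-likelihood this way requires one masked prediction per token, which is the kind of latency cost a production rescoring model has to keep under control.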

Ontological biasing for acoustic-event detection

We round out this short selection of papers with one from an ASR-adjacent field. In “Improved representation learning for acoustic event classification using tree-structured ontology”, Arman Zharmagambetov and coauthors look at an alternative to self-supervised training for the task of acoustic-event detection (AED). (AED is the technology behind Alexa’s ability to detect breaking glass, smoke alarms, and other noteworthy events around the house.)

They show that AED classifier training can be enhanced by forcing the resulting representations to identify not only the target event label (such as “dog barking”) but also supercategories (such as “domestic animal” and “animal sound”) drawn from an ontology, a hierarchical representation of relationships between concepts. The method can be further enhanced by forcing the classification to stay the same under distortions of the inputs. The researchers found that their method is more effective than purely self-supervised pretraining and comes close to fully supervised training with only a fraction of the labeled data.
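
One simple way to turn an ontology into extra training targets is to expand each event label into itself plus all of its ancestors. The sketch below illustrates that idea only; the `PARENT` table (including the "smoke alarm" branch), the function names, and the multi-hot encoding are assumptions and not the paper's tree-structured model.

```python
# Hypothetical child-to-parent table; None marks a top-level category.
PARENT = {
    "dog barking": "domestic animal",
    "domestic animal": "animal sound",
    "animal sound": None,
    "smoke alarm": "alarm",
    "alarm": None,
}

def with_ancestors(label, parent=PARENT):
    """Return the label followed by its supercategories, from most to least specific."""
    chain = []
    while label is not None:
        chain.append(label)
        label = parent[label]
    return chain

def multi_hot(label, classes, parent=PARENT):
    """Encode a label and all of its ancestors as a 0/1 target vector over classes."""
    active = set(with_ancestors(label, parent))
    return [1 if c in active else 0 for c in classes]

classes = sorted(PARENT)
target = multi_hot("dog barking", classes)
```

A classifier trained against such targets is rewarded for getting the supercategory right even when it misses the exact event label.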

In "Improved representation learning for acoustic event classification using tree-structured ontology", Amazon researchers present a two-module joint model consisting of a representation neural network and a decision tree based on a predefined tree-structured ontology.

Conclusion and outlook

As we have seen, Alexa relies on a range of audio-based technologies that use deep-learning architectures. The need to train these models robustly, fairly, and with limited supervision, as well as computational constraints at runtime, continues to drive research in Alexa Science. We have highlighted some of the results from that work as they are about to be presented to the wider science community, and we are excited to see the field as a whole come up with creative solutions and push toward ever more capable applications of speech-based AI.
