Amazon’s papers at SLT

Quantization with self-adjustable centroids, contrastive predictive coding for transfer learning, teacher ensembles for differential privacy, and more — Amazon’s speech research features a battery of cutting-edge machine learning techniques.

A quick guide to Amazon’s innovative work at the IEEE Spoken Language Technology Workshop (SLT), which begins next week:

Accelerator-aware training for transducer-based speech recognition
Suhaila Shakiah, Rupak Vignesh Swaminathan, Hieu Duy Nguyen, Raviteja Chinta, Tariq Afzal, Nathan Susanj, Athanasios Mouchtaris, Grant Strimel, Ariya Rastrow

Machine learning models trained at full precision can suffer performance falloffs when deployed on neural-network accelerator (NNA) chips, which leverage highly parallelized fixed-point arithmetic to improve efficiency. To avoid this problem, Amazon researchers propose a method for emulating NNA operations at training time.

Related content
Combination of distillation and distillation-aware quantization compresses BART model to 1/16th its size.

An analysis of the effects of decoding algorithms on fairness in open-ended language generation
Jwala Dhamala, Varun Kumar, Rahul Gupta, Kai-Wei Chang, Aram Galstyan

The researchers systematically study the effects of different decoding algorithms on the fairness of large language models, showing that fairness varies significantly with changes in decoding algorithms’ hyperparameters. They also provide recommendations for reporting decoding details during fairness evaluations and optimizing decoding algorithms.

An experimental study on private aggregation of teacher ensemble learning for end-to-end speech recognition
Chao-Han Huck Yang, I-Fan Chen, Andreas Stolcke, Sabato Marco Siniscalchi, Chin-Hui Lee

For machine learning models, meeting differential-privacy (DP) constraints usually means adding noise to data, which can hurt performance. Amazon researchers apply private aggregation of teacher ensembles (PATE), which uses different noisy models to train a single student model, to automatic speech recognition, reducing word error rate by 26% to 28% while meeting DP constraints.

Related content
Technique that mixes public and private training data can meet differential-privacy criteria while cutting error increase by 60%-70%.

Exploration of language-specific self-attention parameters for multilingual end-to-end speech recognition
Brady Houston, Katrin Kirchhoff

Multilingual, end-to-end, automatic-speech-recognition models perform better when they’re trained using both language-specific and language-universal model parameters. Amazon researchers show that using language-specific parameters in the attention mechanisms of Conformer-based encoders can improve the performance of ASR models across six languages by up to 12% relative to multilingual baselines and 36% relative to monolingual baselines.

Guided contrastive self-supervised pre-training for automatic speech recognition
Aparna Khare, Minhua Wu, Saurabhchand Bhati, Jasha Droppo, Roland Maas

Contrastive predictive coding (CPC) is a representation-learning method that maximizes the mutual information between a model’s intermediate representations and its output. Amazon researchers present a modification of CPC that maximizes the mutual information between representations from a prior-knowledge model and the output of a model being pretrained, reducing the word error rate relative to CPC pretraining only.

Guided CPC.png
The conventional contrastive-predictive-coding (CPC) representation-learning approach (left) and Amazon researchers' proposed guided CPC method (right, in red), which maximizes the mutual information between representations from a prior-knowledge model and the output of a model being pretrained. From "Guided contrastive self-supervised pre-training for automatic speech recognition".

Implicit acoustic echo cancellation for keyword spotting and device-directed speech detection
Samuele Cornell, Thomas Balestri, Thibaud Sénéchal

In realistic human-machine interactions, customer speech can overlap with device playback. Amazon researchers propose a way to improve keyword spotting and device-directed-speech detection in these circumstances. They teach the model to ignore playback audio via an implicit acoustic echo cancellation mechanism. They show that, by conditioning on the reference signal as well as the signal captured at the microphone, they can improve recall by as much as 56%.

Mixture of domain experts for language understanding: An analysis of modularity, task performance, and memory tradeoffs
Benjamin Kleiner, Jack FitzGerald, Haidar Khan, Gokhan Tur

Amazon researchers show that natural-language-understanding models that incorporate mixture-of-experts networks, in which each network layer corresponds to a different domain, are easier to update after deployment, with less effect on performance, than other types of models.

N-best hypotheses reranking for text-to-SQL systems
Lu Zeng, Sree Hari Krishnan Parthasarathi, Dilek Hakkani-Tür

Text-to-SQL models map natural-language requests to structured database queries, and today’s state-of-the-art systems rely on fine-tuning pretrained language models. Amazon researchers improve the coherence of such systems with a model that generates a query plan predicting whether a SQL query contains particular clauses; they improve the correctness of such systems with an algorithm that generates schemata that can be used to match prefixes and abbreviations for slot values (such as “left” and “L”).

Related content
At re:Invent, AWS announces that the CodeWhisperer preview has added support for two new programming languages.

On granularity of prosodic representations in expressive text-to-speech
Mikolaj Babianski, Kamil Pokora, Raahil Shah, Rafal Sienkiewicz, Daniel Korzekwa, Viacheslav Klimkov

In expressive-speech synthesis, the same input text can be mapped to different acoustic realizations. Prosodic embeddings at the utterance, word, or phoneme level can be used at training time to simplify that mapping. Amazon researchers study these approaches, showing that utterance-level embeddings have insufficient capacity and phoneme-level embeddings tend to introduce instabilities, while word-level representations strike a balance between capacity and predictability. The researchers use that finding to close the gap in naturalness between synthetic speech and recordings by 90%.

Personalization of CTC speech recognition models
Saket Dingliwal, Monica Sunkara, Srikanth Ronanki, Jeff Farris, Katrin Kirchhoff, Sravan Bodapati

Connectionist temporal classification (CTC) loss functions are an attractive option for automatic speech recognition because they yield simple models with low inference latency. But CTC models are hard to personalize because of their conditional-independence assumption. Amazon researchers propose a battery of techniques to bias a CTC model’s encoder and its beam search decoder, yielding a 60% improvement in F1 score on domain-specific rare words over a strong CTC baseline.

Related content
Accounting for data heterogeneity across edge devices enables more useful model updates, both locally and globally.

Remap, warp and attend: Non-parallel many-to-many accent conversion with normalizing flows
Abdelhamid Ezzerg, Tom Merritt, Kayoko Yanagisawa, Piotr Bilinski, Magdalena Proszewska, Kamil Pokora, Renard Korzeniowski, Roberto Barra-Chicote, Daniel Korzekwa

Regional accents affect not only how words are pronounced but prosodic aspects of speech such as speaking rate and intonation. Amazon researchers investigate an approach to accent conversion that uses normalizing flows. The approach has three steps: remapping the phonetic conditioning, to better match the target accent; warping the duration of the converted speech, to better suit the target phonemes; and applying an attention mechanism to implicitly align source and target speech sequences.

Residual adapters for targeted updates in RNN-transducer based speech recognition system
Sungjun Han, Deepak Baby, Valentin Mendelev

While it is possible to incrementally fine-tune an RNN-transducer (RNN-T) automatic-speech-recognition model to recognize multiple sets of new words, this creates a dependency between the updates, which is not ideal when we want each update to be applied independently. Amazon researchers propose training residual adapters on the RNN-T model and combining them on the fly through adapter fusion, enabling a recall on new words of more than 90%, with less than 1% relative word error rate degradation.

Residual adapters.png
An RNN-transducer model with n independently trained adapters combined through different adapter-fusion methods. From "Residual adapters for targeted updates in RNN-transducer based speech recognition system".

Sub-8-bit quantization for on-device speech recognition: a regularization-free approach
Kai Zhen, Martin Radfar, Hieu Nguyen, Grant Strimel, Nathan Susanj, Athanasios Mouchtaris

For on-device automatic speech recognition (ASR), quantization-aware training (QAT) can help manage the trade-off between performance and efficiency. Among existing QAT methods, one major drawback is that the quantization centroids have to be predetermined and fixed. Amazon researchers introduce a compression mechanism with self-adjustable centroids that results in a simpler yet more versatile quantization scheme that enables a 30.73% memory footprint savings and a 31.75% user-perceived latency reduction, compared to eight-bit QAT.

Related content

US, CA, Palo Alto
Join a team working on cutting-edge science to innovate search experiences for Amazon shoppers! Amazon Search helps customers shop with ease, confidence and delight WW. We aim to transform Search from an information retrieval engine to a shopping engine. In this role, you will build models to generate and recommend search queries that can help customers fulfill their shopping missions, reduce search efforts and let them explore and discover new products. You will also build models and applications that will increase customer awareness of related products and product attributes that might be best suited to fulfill the customer needs. Key job responsibilities On a day-to-day basis, you will: Design, develop, and evaluate highly innovative, scalable models and algorithms; Design and execute experiments to determine the impact of your models and algorithms; Work with product and software engineering teams to manage the integration of successful models and algorithms in complex, real-time production systems at very large scale; Share knowledge and research outcomes via internal and external conferences and journal publications; Project manage cross-functional Machine Learning initiatives. About the team The mission of Search Assistance is to improve search feature by reducing customers’ effort to search. We achieve this through three customer-facing features: Autocomplete, Spelling Correction and Related Searches. The core capability behind the three features is backend service Query Recommendation.
US, CA, Palo Alto
Amazon is investing heavily in building a world class advertising business and we are responsible for defining and delivering a collection of self-service performance advertising products that drive discovery and sales. Our products are strategically important to our Retail and Marketplace businesses driving long term growth. We deliver billions of ad impressions and millions of clicks daily and are breaking fresh ground to create world-class products. We are highly motivated, collaborative and fun-loving with an entrepreneurial spirit and bias for action. With a broad mandate to experiment and innovate, we are growing at an unprecedented rate with a seemingly endless range of new opportunities. The Ad Response Prediction team in Sponsored Products organization build advanced deep-learning models, large-scale machine-learning (ML) pipelines, and real-time serving infra to match shoppers’ intent to relevant ads on all devices, for all contexts and in all marketplaces. Through precise estimation of shoppers’ interaction with ads and their long-term value, we aim to drive optimal ads allocation and pricing, and help to deliver a relevant, engaging and delightful ads experience to Amazon shoppers. As the business and the complexity of various new initiatives we take continues to grow, we are looking for energetic, entrepreneurial, and self-driven science leaders to join the team. Key job responsibilities As a Principal Applied Scientist in the team, you will: Seek to understand in depth the Sponsored Products offering at Amazon and identify areas of opportunities to grow our business via principled ML solutions. Mentor and guide the applied scientists in our organization and hold us to a high standard of technical rigor and excellence in ML. Design and lead organization wide ML roadmaps to help our Amazon shoppers have a delightful shopping experience while creating long term value for our sellers. Work with our engineering partners and draw upon your experience to meet latency and other system constraints. Identify untapped, high-risk technical and scientific directions, and simulate new research directions that you will drive to completion and deliver. Be responsible for communicating our ML innovations to the broader internal & external scientific community.
US, CA, Santa Clara
AWS AI/ML is looking for world class scientists and engineers to join its AI Research and Education group working on foundation models, large-scale representation learning, and distributed learning methods and systems. At AWS AI/ML you will invent, implement, and deploy state of the art machine learning algorithms and systems. You will build prototypes and innovate on new representation learning solutions. You will interact closely with our customers and with the academic and research communities. You will be at the heart of a growing and exciting focus area for AWS and work with other acclaimed engineers and world famous scientists. Large-scale foundation models have been the powerhouse in many of the recent advancements in computer vision, natural language processing, automatic speech recognition, recommendation systems, and time series modeling. Developing such models requires not only skillful modeling in individual modalities, but also understanding of how to synergistically combine them, and how to scale the modeling methods to learn with huge models and on large datasets. Join us to work as an integral part of a team that has diverse experiences in this space. We actively work on these areas: * Hardware-informed efficient model architecture, training objective and curriculum design * Distributed training, accelerated optimization methods * Continual learning, multi-task/meta learning * Reasoning, interactive learning, reinforcement learning * Robustness, privacy, model watermarking * Model compression, distillation, pruning, sparsification, quantization About Us Inclusive Team Culture Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Amazon’s culture of inclusion is reinforced within our 14 Leadership Principles, which remind team members to seek diverse perspectives, learn and be curious, and earn trust. Work/Life Balance Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives. Mentorship & Career Growth Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future.
US, CA, Palo Alto
We’re working to improve shopping on Amazon using the conversational capabilities of large language models, and are searching for pioneers who are passionate about technology, innovation, and customer experience, and are ready to make a lasting impact on the industry. You'll be working with talented scientists, engineers, and technical program managers (TPM) to innovate on behalf of our customers. If you're fired up about being part of a dynamic, driven team, then this is your moment to join us on this exciting journey!"?
US, WA, Seattle
Do you want to join an innovative team of scientists who use machine learning to help Amazon provide the best experience to our Selling Partners by automatically understanding and addressing their challenges, needs and opportunities? Do you want to build advanced algorithmic systems that are powered by state-of-art ML, such as Natural Language Processing, Large Language Models, Deep Learning, Computer Vision and Causal Modeling, to seamlessly engage with Sellers? Are you excited by the prospect of analyzing and modeling terabytes of data and creating cutting edge algorithms to solve real world problems? Do you like to build end-to-end business solutions and directly impact the profitability of the company and experience of our customers? Do you like to innovate and simplify? If yes, then you may be a great fit to join the Selling Partner Experience Science team. Key job responsibilities Use statistical and machine learning techniques to create the next generation of the tools that empower Amazon's Selling Partners to succeed. Design, develop and deploy highly innovative models to interact with Sellers and delight them with solutions. Work closely with teams of scientists and software engineers to drive real-time model implementations and deliver novel and highly impactful features. Establish scalable, efficient, automated processes for large scale data analyses, model development, model validation and model implementation. Research and implement novel machine learning and statistical approaches. Lead strategic initiatives to employ the most recent advances in ML in a fast-paced, experimental environment. Drive the vision and roadmap for how ML can continually improve Selling Partner experience. About the team Selling Partner Experience Science (SPeXSci) is a growing team of scientists, engineers and product leaders engaged in the research and development of the next generation of ML-driven technology to empower Amazon's Selling Partners to succeed. We draw from many science domains, from Natural Language Processing to Computer Vision to Optimization to Economics, to create solutions that seamlessly and automatically engage with Sellers, solve their problems, and help them grow. Focused on collaboration, innovation and strategic impact, we work closely with other science and technology teams, product and operations organizations, and with senior leadership, to transform the Selling Partner experience.
US, WA, Seattle
The AWS AI Labs team has a world-leading team of researchers and academics, and we are looking for world-class colleagues to join us and make the AI revolution happen. Our team of scientists have developed the algorithms and models that power AWS computer vision services such as Amazon Rekognition and Amazon Textract. As part of the team, we expect that you will develop innovative solutions to hard problems, and publish your findings at peer reviewed conferences and workshops. AWS is the world-leading provider of cloud services, has fostered the creation and growth of countless new businesses, and is a positive force for good. Our customers bring problems which will give Applied Scientists like you endless opportunities to see your research have a positive and immediate impact in the world. You will have the opportunity to partner with technology and business teams to solve real-world problems, have access to virtually endless data and computational resources, and to world-class engineers and developers that can help bring your ideas into the world. Our research themes include, but are not limited to: few-shot learning, transfer learning, unsupervised and semi-supervised methods, active learning and semi-automated data annotation, large scale image and video detection and recognition, face detection and recognition, OCR and scene text recognition, document understanding, 3D scene and layout understanding, and geometric computer vision. For this role, we are looking for scientist who have experience working in the intersection of vision and language. We are located in Seattle, Pasadena, Palo Alto (USA) and in Haifa and Tel Aviv (Israel).
GB, London
Are you excited about applying economic models and methods using large data sets to solve real world business problems? Then join the Economic Decision Science (EDS) team. EDS is an economic science team based in the EU Stores business. The teams goal is to optimize and automate business decision making in the EU business and beyond. An internship at Amazon is an opportunity to work with leading economic researchers on influencing needle-moving business decisions using incomparable datasets and tools. It is an opportunity for PhD students and recent PhD graduates in Economics or related fields. We are looking for detail-oriented, organized, and responsible individuals who are eager to learn how to work with large and complicated data sets. Knowledge of econometrics, as well as basic familiarity with Stata, R, or Python is necessary. Experience with SQL would be a plus. As an Economics Intern, you will be working in a fast-paced, cross-disciplinary team of researchers who are pioneers in the field. You will take on complex problems, and work on solutions that either leverage existing academic and industrial research, or utilize your own out-of-the-box pragmatic thinking. In addition to coming up with novel solutions and prototypes, you may even need to deliver these to production in customer facing products. Roughly 85% of previous intern cohorts have converted to full time economics employment at Amazon.
US, CA, Cupertino
We're looking for an Applied Scientist to help us secure Amazon's most critical data. In this role, you'll work closely with internal security teams to design and build AR-powered systems that protect our customers' data. You will build on top of existing formal verification tools developed by AWS and develop new methods to apply those tools at scale. You will need to be innovative, entrepreneurial, and adaptable. We move fast, experiment, iterate and then scale quickly, thoughtfully balancing speed and quality. Inclusive Team Culture Here at AWS, we embrace our differences. We are committed to furthering our culture of inclusion. We have ten employee-led affinity groups, reaching 40,000 employees in over 190 chapters globally. We have innovative benefit offerings, and host annual and ongoing learning experiences, including our Conversations on Race and Ethnicity (CORE) and AmazeCon (gender diversity) conferences. Work/Life Balance Our team puts a high value on work-life balance. It isn’t about how many hours you spend at home or at work; it’s about the flow you establish that brings energy to both parts of your life. We believe striking the right balance between your personal and professional life is critical to life-long happiness and fulfillment. We offer flexibility in working hours and encourage you to find your own balance between your work and personal lives. Mentorship & Career Growth Our team is dedicated to supporting new members. We have a broad mix of experience levels and tenures, and we’re building an environment that celebrates knowledge sharing and mentorship. Our senior members enjoy one-on-one mentoring and thorough, but kind, code reviews. We care about your career growth and strive to assign projects based on what will help each team member develop into a better-rounded engineer and enable them to take on more complex tasks in the future. Key job responsibilities Deeply understand AR techniques for analyzing programs and other systems, and keep up with emerging ideas from the research community. Engage with our customers to develop understanding of their needs. Propose and develop solutions that leverage symbolic reasoning services and concepts from programming languages, theorem proving, formal verification and constraint solving. Implement these solutions as services and work with others to deploy them at scale across Payments and Healthcare. Author papers and present your work internally and externally. Train new teammates, mentor others, participate in recruiting and interviewing, and participate in our tactical and strategic planning. About the team Our small team of applied scientists works within a larger security group, supporting thousands of engineers who are developing Amazon's payments and healthcare services. Security is a rich area for automated reasoning. Most other approaches are quite ad-hoc and take a lot of human effort. AR can help us to reason deliberately and systematically, and the dream of provable security is incredibly compelling. We are working to make this happen at scale. We partner closely with our larger security group and with other automated reasoning teams in AWS that develop core reasoning services.
US, NY, New York
Search Thematic Ad Experience (STAX) team within Sponsored Products is looking for a leader to lead a team of talented applied scientists working on cutting-edge science to innovate on ad experiences for Amazon shoppers!. You will manage a team of scientists, engineers, and PMs to innovate new widgets on Amazon Search page to improve shopper experience using state-of-the-art NLP and computer vision models. You will be leading some industry first experiences that has the potential to revolutionize how shopping looks and feels like on Amazon, and e-commerce marketplaces in general. You will have the opportunity to design the vision on how ad experiences look on Amazon search page, and use the combination of advanced techniques and continuous experimentation to realize this vision. Your work will be core to Amazon’s advertising business. You will be a significant contributor in building the future of sponsored advertising, directly impacting the shopper experience for our hundreds of millions of shoppers worldwide, while delivering significant value for hundreds of thousands of advertisers across the purchase journey with ads on Amazon. Key job responsibilities * Be the technical leader in Machine Learning; lead efforts within the team, and collaborate and influence across the organization. * Be a critic, visionary, and execution leader. Invent and test new product ideas that are powered by science that addresses key product gaps or shopper needs. * Set, plan, and execute on a roadmap that strikes the optimal balance between short term delivery and long term exploration. You will influence what we invest in today and tomorrow. * Evangelize the team’s science innovation within the organization, company, and in key conferences (internal and external). * Be ruthless with prioritization. You will be managing a team which is highly sought after. But not all can be done. Have a deep understanding of the tradeoffs involved and be fierce in prioritizing. * Bring clarity, direction, and guidance to help teams navigate through unsolved problems with the goal to elevate the shopper experience. We work on ambiguous problems and the right approach is often unknown. You will bring your rich experience to help guide the team through these ambiguities, while working with product and engineering in crisply defining the science scope and opportunities. * Have strong product and business acumen to drive both shopper improvements and business outcomes. A day in the life * Lead a multidisciplinary team that embodies “customer obsessed science”: inventing brand new approaches to solve Amazon’s unique problems, and using those inventions in software that affects hundreds of millions of customers * Dive deep into our metrics, ongoing experiments to understand how and why they are benefitting our shoppers (or not) * Design, prototype and validate new widgets, techniques, and ideas. Take end-to-end ownership of moving from prototype to final implementation. * Be an advocate and expert for STAX science to leaders and stakeholders inside and outside advertising. About the team We are the Search thematic ads experience team within Sponsored products - a fast growing team of customer-obsessed engineers, technologists, product leaders, and scientists. We are focused on continuous exploration of contexts and creatives to drive value for both our customers and advertisers, through continuous innovation. We focus on new ads experiences globally to help shoppers make the most informed purchase decision while helping shortcut the time to discovery that shoppers are highly likely to engage with. We also harvest rich contextual and behavioral signals that are used to optimize our backend models to continually improve the shopper experience. We obsess about our customers and are continuously seeking opportunities to delight them.
US, CA, Palo Alto
Amazon is the 4th most popular site in the US. Our product search engine, one of the most heavily used services in the world, indexes billions of products and serves hundreds of millions of customers world-wide. We are working on a new initiative to transform our search engine into a shopping engine that assists customers with their shopping missions. We look at all aspects of search CX, query understanding, Ranking, Indexing and ask how we can make big step improvements by applying advanced Machine Learning (ML) and Deep Learning (DL) techniques. We’re seeking a thought leader to direct science initiatives for the Search Relevance and Ranking at Amazon. This person will also be a deep learning practitioner/thinker and guide the research in these three areas. They’ll also have the ability to drive cutting edge, product oriented research and should have a notable publication record. This intellectual thought leader will help enhance the science in addition to developing the thinking of our team. This leader will direct and shape the science philosophy, planning and strategy for the team, as we explore multi-modal, multi lingual search through the use of deep learning . We’re seeking an individual that can enhance the science thinking of our team: The org is made of 60+ applied scientists, (2 Principal scientists and 5 Senior ASMs). This person will lead and shape the science philosophy, planning and strategy for the team, as we push into Deep Learning to solve problems like cold start, discovery and personalization in the Search domain. Joining this team, you’ll experience the benefits of working in a dynamic, entrepreneurial environment, while leveraging the resources of Amazon [Earth's most customer-centric internet company]. We provide a highly customer-centric, team-oriented environment in our offices located in Palo Alto, California.