Using warped language models to correct speech recognition errors

Model using ASR hypotheses as extra inputs reduces word error rate of human transcriptions by almost 11%.

Language-related machine learning applications have made great strides in recent years, thanks in part to masked language models such as BERT: during training, sentences are fed to the models with certain words either masked out or replaced with random substitutions, and the models learn to output complete and corrected sentences.

The success of masked language models has led to the development of warped language models, which add insertions and deletions to the menu of possible alterations. Warped language models were designed specifically to address the types of errors common in automatic speech recognition (ASR), so they could serve as the basis for ASR models.

In a paper we presented at this year’s Interspeech, we describe how to use warped language models, not during ASR, but to correct ASR output — or to correct the human transcriptions of speech used to train ASR models.

This new use case required us to modify the design of warped language models so that they not only output text strings but also classify the errors in the input strings. From that information, we can produce a corrected text even when its word count differs from that of the input.

Since ASR models output not a single transcript of input speech but a ranked list of hypotheses, we also experimented with using multiple hypotheses as inputs to our error correction model. For the human transcriptions, we generated the hypotheses by running ASR on the same speech that had been transcribed.

We found that the multiple-hypothesis approach had particular benefits for the correction of human transcription errors. There, it was able to reduce the word error rate by about 11%. For ASR outputs, the same model reduced the word error rate by almost 6%.

Warped language models

Figure: The traditional architecture of a warped language model, in which each output token corresponds to exactly one input token.

A warped language model outputs a token (either a word or a special symbol, such as a blank when it detects a spurious insertion) for each word of its input. This means, however, that it can’t fully correct word deletions: at the position where a word was dropped, it has to choose between outputting the dropped word and outputting the input word that now occupies that position.

We adapt the basic architecture of the warped language model so that, for each input token, it predicts both an output token and a warping operation.

Figure: Our modified architecture. For each input token, it predicts both an output token and a warping operation.

Our model still outputs a single token for each input token, but from the combination of the token and the warping operation, a simple correction algorithm can reconstruct the original, unwarped sentence.

The figure below, for example, depicts our model’s handling of the input sentence “I saying that table I [mask] apples place oranges.” The middle row indicates our model’s output: first, an operation name, and second, an output token. When our model replaces the input “saying” with the output “was” and flags the operation as “drop”, the correction algorithm deduces that the sentence should have begun “I was saying”, not “I saying”.

Figure: An example of our model’s handling of all five operations used to train warped language models: keep (no alteration); drop; insert; mask; and rand (random substitution).
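
The correction step can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper’s implementation: the function name `apply_corrections`, the `[blank]` symbol, and the predicted tokens “like” and “and” are invented for the example. Only the input sentence and the “drop”/“was” prediction come from the text above.

```python
from typing import List, Tuple

BLANK = "[blank]"  # hypothetical symbol for "nothing belongs here"


def apply_corrections(inputs: List[str],
                      predictions: List[Tuple[str, str]]) -> List[str]:
    """Rebuild a sentence from per-token (operation, token) predictions."""
    if len(inputs) != len(predictions):
        raise ValueError("need exactly one prediction per input token")
    corrected: List[str] = []
    for word, (op, token) in zip(inputs, predictions):
        if op == "keep":
            corrected.append(word)            # input word was already right
        elif op in ("mask", "rand"):
            corrected.append(token)           # replace with the predicted word
        elif op == "insert":
            continue                          # spurious word: remove it
        elif op == "drop":
            corrected.extend([token, word])   # restore the missing word first
        else:
            raise ValueError(f"unknown operation: {op}")
    return corrected


inputs = "I saying that table I [mask] apples place oranges".split()
predictions = [
    ("keep", "I"), ("drop", "was"), ("keep", "that"), ("insert", BLANK),
    ("keep", "I"), ("mask", "like"), ("keep", "apples"),
    ("rand", "and"), ("keep", "oranges"),      # "like" and "and" are assumed
]
corrected = apply_corrections(inputs, predictions)
print(" ".join(corrected))
```

Because a “drop” prediction emits two words and an “insert” prediction emits none, the corrected sentence can be longer or shorter than the input even though the model makes exactly one prediction per input token.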

The great advantage of masked (and warped) language models is that they are unsupervised: the masking (and warping) operations can be performed automatically, enabling an effectively unlimited amount of training data. Our model is similarly unsupervised: we simply modify the warping algorithm so that, when it applies an operation, it also tags the output with the operation’s name.
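
As a rough sketch of how such tagged training data could be generated, the Python below warps a clean sentence and records an (operation, target token) label for every warped position. Everything specific here is an assumption for illustration: the function name `warp`, the toy vocabulary, the `[mask]` and `[blank]` symbols, and the warping probability are not taken from the paper. The placement of the “drop” label on the word that follows the dropped one mirrors the “I saying” example above.

```python
import random
from typing import List, Tuple

MASK, BLANK = "[mask]", "[blank]"             # hypothetical special symbols
VOCAB = ["table", "place", "blue", "seven"]   # toy vocabulary for the sketch
OPS = ["mask", "rand", "insert", "drop"]


def warp(tokens: List[str], vocab: List[str], rng: random.Random,
         p_warp: float = 0.3) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Return a warped copy of `tokens` plus one (op, target) label per warped token."""
    warped: List[str] = []
    labels: List[Tuple[str, str]] = []
    dropped = None  # word removed just before the current position, if any
    for i, tok in enumerate(tokens):
        if dropped is not None:
            # The word after a dropped word carries the "drop" label.
            warped.append(tok)
            labels.append(("drop", dropped))
            dropped = None
            continue
        op = rng.choice(OPS) if rng.random() < p_warp else "keep"
        if op == "drop" and i == len(tokens) - 1:
            op = "keep"  # no following word is available to carry the label
        if op == "drop":
            dropped = tok
        elif op == "insert":
            warped.append(rng.choice(vocab))   # spurious extra word
            labels.append(("insert", BLANK))
            warped.append(tok)
            labels.append(("keep", tok))
        elif op == "mask":
            warped.append(MASK)
            labels.append(("mask", tok))
        elif op == "rand":
            warped.append(rng.choice(vocab))   # random substitution
            labels.append(("rand", tok))
        else:
            warped.append(tok)
            labels.append(("keep", tok))
    return warped, labels


sentence = "i was saying that i like apples and oranges".split()
warped, labels = warp(sentence, VOCAB, random.Random(0))
for word, label in zip(warped, labels):
    print(word, label)
```

Since the labels are produced by the same procedure that corrupts the text, no human annotation is needed, which is what keeps the training unsupervised.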

Multiple hypotheses

After training our model on a corpus of English-language texts, we fine-tuned it on the output of an ASR model for a separate set of spoken-language utterances. For each utterance, we kept the top five ASR hypotheses.

Algorithms automatically aligned the tokens of the hypotheses and standardized their lengths, adding blank tokens where necessary. We treated hypotheses two through five as warped versions of the top hypothesis, automatically computing the minimum number of warping operations required to transform the top hypothesis into an alternate hypothesis and labeling the hypothesis tokens appropriately.
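
One plausible way to do this alignment is a standard word-level edit-distance computation with a backtrace, sketched below in Python. It is an illustration under assumptions rather than the authors’ algorithm: the function name `align`, the `[blank]` padding symbol, and the example hypotheses are hypothetical, and substitutions are labeled “rand” only by analogy with the warping operations.

```python
from typing import List, Tuple

BLANK = "[blank]"  # hypothetical padding symbol


def align(top: List[str], alt: List[str]) -> Tuple[int, List[Tuple[str, str, str]]]:
    """Align `alt` to `top`; return (edit count, columns of (op, top_tok, alt_tok))."""
    n, m = len(top), len(alt)
    # cost[i][j] = minimum number of edits turning top[:i] into alt[:j]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + (top[i - 1] != alt[j - 1]),  # keep / substitute
                cost[i - 1][j] + 1,                               # alt lacks a top word
                cost[i][j - 1] + 1,                               # alt has an extra word
            )
    # Walk back from the corner to recover one minimum-cost alignment.
    columns: List[Tuple[str, str, str]] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and cost[i][j] == cost[i - 1][j - 1] + (top[i - 1] != alt[j - 1]):
            op = "keep" if top[i - 1] == alt[j - 1] else "rand"
            columns.append((op, top[i - 1], alt[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            columns.append(("drop", top[i - 1], BLANK))
            i -= 1
        else:
            columns.append(("insert", BLANK, alt[j - 1]))
            j -= 1
    columns.reverse()
    return cost[n][m], columns


top_hyp = "i was saying that".split()
alt_hyp = "i saying that".split()
distance, columns = align(top_hyp, alt_hyp)
for column in columns:
    print(column)
```

Each column pairs a top-hypothesis token with an alternate-hypothesis token, using a blank wherever one side has no counterpart, so both sequences end up the same length.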

Figure: The multi-hypothesis version of the model.

For each input, our model combines all five hypotheses to produce a single vector representation (an embedding) that the model’s decoder uses to produce output strings. 

During training, the model outputs a separate set of predictions for each hypothesis. This ensures the fine-tuning of the operation predictor as well as the token predictor, as the operation classifications will differ for each hypothesis, even if the token strings are identical. At run time, however, we keep only the output corresponding to the top-ranked ASR hypothesis.

Without fine-tuning on the ASR hypotheses, our model reduced the word error rate of ASR model outputs by 5%. But it slightly increased the word error rate on the human transcriptions of speech. This is probably because the human-transcribed speech, even when erroneous, is still syntactically and semantically coherent, so errors are hard to identify. The addition of the alternate ASR hypotheses, however, enables the correction model to exploit additional information in the speech signal itself, leading to a pronounced reduction in word error rate.
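
For readers unfamiliar with the metric, word error rate is conventionally the word-level edit distance between a transcript and its reference, divided by the number of reference words. The Python sketch below computes it and a relative reduction. The sentences are made up for illustration and are not data from the paper, the helper name `word_error_rate` is hypothetical, and treating the reported percentages as relative reductions is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    prev = list(range(len(hyp) + 1))  # distances for the empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j - 1] + (r != h),  # match or substitution
                            prev[j] + 1,             # reference word missing
                            curr[j - 1] + 1))        # extra hypothesis word
        prev = curr
    return prev[-1] / len(ref)


reference = "i was saying that i like apples and oranges"
before = word_error_rate(reference, "i saying that i like apples an oranges")
after = word_error_rate(reference, "i was saying that i like apples an oranges")
relative_reduction = (before - after) / before
print(f"WER before: {before:.3f}, after: {after:.3f}, "
      f"relative reduction: {relative_reduction:.0%}")
```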

About the Author
Mahdi Namazifar is a senior applied scientist in the Alexa AI organization.
