Should Alexa read “2/3” as “two-thirds” or “February third”? The science of text normalization

Text normalization is an important process in conversational AI. If an Alexa customer says, “book me a table at 5:00 p.m.”, the automatic speech recognizer will transcribe the time as “five p m”. Before a skill can handle this request, “five p m” will need to be converted to “5:00PM”. Once Alexa has processed the request, it needs to synthesize the response — say, “Is 6:30 p.m. okay?” Here, 6:30PM will be converted to “six thirty p m” for the text-to-speech synthesizer. We call the process of converting “5:00PM” to “five p m” text normalization and its counterpart — converting “five p m” to “5:00PM” — inverse text normalization.

Figure: ASR = automatic speech recognition; NLU = natural-language understanding; DM = dialogue management; NLG = natural-language generation; and TTS = text-to-speech synthesis

In the example above, time expressions lead two lives inside Alexa, switching formats to meet an individual skill’s needs and to optimize the system’s performance, even though end users never see these internal switches. Many other types of expressions receive similar treatment, such as dates, e-mail addresses, numbers, and abbreviations.

To do text normalization and inverse text normalization in English, Alexa currently relies on thousands of handwritten rules. As the range of possible interactions with Alexa increases, authoring rules becomes an intrinsically error-prone process. Moreover, as Alexa continues to move into new languages, we would rather not rewrite all those rules from scratch.
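To give a feel for what one handwritten rule involves, here is a minimal sketch of a time-normalization rule in Python. It is not Alexa's actual rule set: the pattern, the function names, and the "oh" convention for single-digit minutes are all assumptions made for illustration.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty"}

# Hypothetical pattern: matches times written like "6:30PM" or "5:00 pm" only.
TIME_PATTERN = re.compile(r"^(\d{1,2}):(\d{2})\s*([AP])M$", re.IGNORECASE)


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 59."""
    if n < 20:
        return ONES[n]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"


def normalize_time(text: str) -> str:
    """Convert a written time such as '6:30PM' into its spoken form."""
    match = TIME_PATTERN.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized time expression: {text!r}")
    hour, minute = int(match.group(1)), int(match.group(2))
    if not (1 <= hour <= 12 and minute <= 59):
        raise ValueError(f"time out of range: {text!r}")
    words = [number_to_words(hour)]
    if minute >= 10:
        words.append(number_to_words(minute))
    elif minute > 0:
        # Assumed convention: "6:05PM" is spoken as "six oh five p m".
        words += ["oh", number_to_words(minute)]
    words += [match.group(3).lower(), "m"]
    return " ".join(words)


print(normalize_time("5:00PM"))  # five p m
print(normalize_time("6:30PM"))  # six thirty p m
```

Even this single rule ignores variants such as “6:30 p.m.” or 24-hour times, which hints at how quickly a rule inventory grows.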

Consequently, at this year’s meeting of the North American Chapter of the Association for Computational Linguistics (NAACL), my colleagues and I will report a set of experiments in using recurrent neural networks to build a text normalization system.

By breaking words in our network’s input and output streams into smaller strings of characters (called subword units), we demonstrate a 75% reduction in error rate relative to the best-performing neural system previously reported. We also show a 63% reduction in latency, or the time it takes to receive a response to a single request.

By factoring in additional information, such as words’ parts of speech and their capitalizations, we demonstrate a further error rate reduction of 81%.

What makes text normalization nontrivial is the ambiguity of its inputs: depending on context, for instance, “Dr.” could mean “doctor” or “Drive”, and “2/3” could mean “two-thirds” or “February third”. A text normalization system needs to consider context when determining how to handle a given word.

To that end, the best previous neural model adopted a window-based approach to textual analysis. With every input sentence it receives, the model slides a “window” of fixed length — say, five words — along the sentence. Within each window, the model decides only what to do with the central word; the words on either side are there for context.
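The windowing step described above can be sketched in a few lines of Python. The padding token and the function name are hypothetical; the article does not say how the earlier model treats words near the sentence boundaries.

```python
from typing import List, Tuple

PAD = "<pad>"  # hypothetical padding token for positions beyond the sentence


def sliding_windows(tokens: List[str], size: int = 5) -> List[Tuple[str, List[str]]]:
    """Return one (central word, window) pair per token in the sentence."""
    if size % 2 == 0:
        raise ValueError("window size must be odd so that there is a central word")
    half = size // 2
    padded = [PAD] * half + tokens + [PAD] * half
    # The window for token i spans padded[i : i + size]; its center is tokens[i].
    return [(tokens[i], padded[i:i + size]) for i in range(len(tokens))]


sentence = "book me a table at 5:00PM".split()
for center, window in sliding_windows(sentence):
    print(f"{center:>8}  {window}")
```

Each word appears in up to five windows, so a window-based model reads the same input several times over.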

But this is time-consuming. In principle, it would be more efficient to process the words of a sentence individually, rather than in five-word chunks. In the absence of windows, the model could gauge context using an attention mechanism. For each input word, the attention mechanism would determine which previously seen words should influence its interpretation.
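As a rough illustration of how an attention mechanism weighs context, the sketch below uses scaled dot-product scoring followed by a softmax. The scoring function is an assumption; the article does not specify which form of attention the model uses, and the vectors here are toy values.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into positive weights that sum to one."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(query: Vector, keys: List[Vector]) -> List[float]:
    """Score each previously seen word (key) against the current word (query)."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    return softmax(scores)


def attend(query: Vector, keys: List[Vector], values: List[Vector]) -> Vector:
    """Return the weighted average of the value vectors."""
    weights = attention_weights(query, keys)
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


# Toy example: the first key is most similar to the query, so it gets the most weight.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
print(attention_weights(query, keys))
```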

Figure: The activation pattern of an attention mechanism during the normalization of the input “archived from the original on 2011/11/11”

In our experiments, however, a sentence-based text normalization system with an attention mechanism performed poorly compared to a window-based model, making about 2.5 times as many errors. Our solution: break inputs into their subword components before passing them to the neural net and, similarly, train the model to output subword units. A separate algorithm then stitches the network’s outputs into complete words.
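The stitching step could look something like the following sketch. It assumes that the network marks the last subword of each word with an end-of-word suffix; that marker is hypothetical, since the article does not describe how word boundaries are encoded.

```python
from typing import List

WORD_END = "</w>"  # hypothetical end-of-word marker, not taken from the article


def stitch(subwords: List[str]) -> List[str]:
    """Join a stream of subword units back into complete words."""
    words: List[str] = []
    current = ""
    for piece in subwords:
        if piece.endswith(WORD_END):
            # This piece closes the current word.
            words.append(current + piece[:-len(WORD_END)])
            current = ""
        else:
            current += piece
    if current:
        # Flush a trailing word that never received an end marker.
        words.append(current)
    return words


print(stitch(["six</w>", "thir", "ty</w>", "p</w>", "m</w>"]))
# ['six', 'thirty', 'p', 'm']
```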

The big advantage of subword units is that they reduce the number of inputs that a neural network must learn to handle. A network that operates at the word level would, for instance, treat the following words as distinct inputs: crab, crabs, pine, pines, apple, apples, crabapple, crabapples, pineapple, and pineapples. A network that uses subwords might treat them as different sequences of four inputs: crab, pine, apple, and the letter s.
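The crabapple example can be reproduced with a small segmenter. The greedy longest-match strategy below is an assumption chosen for simplicity; the article does not say how words are split once the inventory is fixed.

```python
from typing import List, Set


def segment(word: str, inventory: Set[str]) -> List[str]:
    """Split a word into subword units by greedy longest match, left to right."""
    pieces: List[str] = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in inventory:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # Nothing matched: fall back to a single character.
            pieces.append(word[i])
            i += 1
    return pieces


INVENTORY = {"crab", "pine", "apple", "s"}
for word in ["crabs", "pineapples", "crabapple"]:
    print(word, segment(word, INVENTORY))
# crabs ['crab', 's']
# pineapples ['pine', 'apple', 's']
# crabapple ['crab', 'apple']
```

Ten distinct word-level inputs collapse to sequences over just four units.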

Using subword units also helps the model decide what to do with input words it hasn’t seen before. Even if a word isn’t familiar, it may have subword components that are, and that could be enough to help the model decide on a course of action.

To produce our inventory of subword units, we first break all the words in our training set into individual characters. An algorithm then combs through the data, identifying the most commonly occurring two-character units, three-character units, and so on, adding them to our inventory until it reaches capacity.
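The procedure described above resembles byte-pair-encoding-style merging, although the article does not name the algorithm. The sketch below is one plausible reading under that assumption: start from characters, then repeatedly merge the most frequent adjacent pair of units until the inventory is full. The function name and the tie-breaking rule are illustrative choices.

```python
from collections import Counter
from typing import List


def build_inventory(words: List[str], capacity: int) -> List[str]:
    """Grow a subword inventory by repeatedly merging the most frequent adjacent pair."""
    # Each distinct word starts as a sequence of single characters.
    corpus = Counter(tuple(word) for word in words)
    inventory = sorted({ch for seq in corpus for ch in seq})
    while len(inventory) < capacity:
        pairs: Counter = Counter()
        for seq, freq in corpus.items():
            for left, right in zip(seq, seq[1:]):
                pairs[(left, right)] += freq
        if not pairs:
            break  # every word is already a single unit
        # Most frequent pair; ties are broken alphabetically to stay deterministic.
        (left, right), _ = max(pairs.items(), key=lambda item: (item[1], item[0]))
        merged = left + right
        if merged not in inventory:
            inventory.append(merged)
        # Rewrite the corpus with the chosen pair fused into one unit.
        new_corpus: Counter = Counter()
        for seq, freq in corpus.items():
            out, i = [], 0
            while i < len(seq):
                if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
                    out.append(merged)
                    i += 2
                else:
                    out.append(seq[i])
                    i += 1
            new_corpus[tuple(out)] += freq
        corpus = new_corpus
    return inventory


training_words = ["crab", "crabs", "pine", "pines", "apple", "apples",
                  "crabapple", "crabapples", "pineapple", "pineapples"]
print(build_inventory(training_words, capacity=20))
```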

We tested six different inventory sizes, starting with 500 subword units and doubling the size until we reached 16,000. We found that an inventory of 2,000 subwords worked best.

We trained our model using 500,000 examples from a public data set, and we compared its performance to that of a window-based model and a sentence-based model that does not use subword units.

The baseline sentence-based model had a word error rate (WER) of 9.3%, meaning that 9.3% of its word-level output decisions were wrong. With a WER of 3.8%, the window-based model offered a significant improvement. But the model with subword units reduced the error rate still further, to 0.9%. It was also the fastest of the three models.
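The relative comparisons between the three models follow directly from these error rates. The short calculation below uses the rounded WER values quoted above, so its results are approximate.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Fraction by which the error rate falls when moving from baseline to improved."""
    return (baseline - improved) / baseline


# Rounded word error rates (percent) as quoted in the article.
wer = {"sentence": 9.3, "window": 3.8, "subword": 0.9}

ratio = wer["sentence"] / wer["window"]
reduction = relative_reduction(wer["window"], wer["subword"])

print(f"sentence-based vs. window-based errors: {ratio:.2f}x")   # 2.45x
print(f"subword vs. window-based reduction: {reduction:.1%}")    # 76.3%
```

Both results are in line with the “about 2.5 times as many errors” and the roughly 75% error reduction reported earlier.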

Once we had benchmarked our system against the two baselines, we re-trained it to use not only subword units but additional linguistic data that could be algorithmically extracted from the input, such as parts of speech, position within the sentence, and capitalization.

That data can help the system resolve ambiguities. For instance, if the word “resume” is tagged as a verb, it should simply be copied verbatim to the output stream. If, however, it’s tagged as a noun, it’s probably supposed to be the word “résumé”, and accents should be added. Similarly, the character strings “us” and “id” are more likely to be one-syllable words if lowercase and two-syllable abbreviations if capitalized.
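Two of these features, position and capitalization, can be read straight off the input, as the sketch below shows. The feature names and the four capitalization classes are hypothetical, and part-of-speech tags are left out because they would require a separate tagger.

```python
from typing import Dict, List, Union

Feature = Dict[str, Union[str, int, bool]]


def token_features(tokens: List[str]) -> List[Feature]:
    """Extract simple position and capitalization features for each token."""
    features: List[Feature] = []
    for i, token in enumerate(tokens):
        if token.isupper():
            case = "upper"   # e.g. "US", "ID"
        elif token[:1].isupper():
            case = "title"   # e.g. "Send"
        elif token.islower():
            case = "lower"   # e.g. "us", "id"
        else:
            case = "other"   # no cased letters, e.g. "2/3"
        features.append({
            "token": token,
            "position": i,
            "is_first": i == 0,
            "is_last": i == len(tokens) - 1,
            "case": case,
        })
    return features


for feature in token_features("Send it to the US office".split()):
    print(feature)
```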

With the addition of the linguistic data, the model’s WER dropped to just 0.2%.

Acknowledgments: Courtney Mansfield, Ankur Gandhe, Björn Hoffmeister, Ryan Thomas, Denis Filimonov, D. K. Joo, Siyu Wang, Gavrielle Lent

About the Author
Senior Speech Scientist in the Alexa Speech Group at Amazon.
