Using wake word acoustics to filter out background speech reduces speech recognition errors by 15%

One of the ways that we’re always trying to improve Alexa’s performance is by teaching her to ignore speech that isn’t intended for her.

At this year’s International Conference on Acoustics, Speech, and Signal Processing, my colleagues and I will present a new technique for doing this, which could complement the techniques that Alexa already uses.

We assume that the speaker who activates an Alexa-enabled device by uttering its “wake word” — usually “Alexa” — is the one Alexa should be listening to. Essentially, our technique takes an acoustic snapshot of the wake word and compares subsequent speech to it. Speech whose acoustics match those of the wake word is judged to be intended for Alexa, and all other speech is treated as background noise.

Rather than training a separate neural network to make this discrimination, we integrate our wake-word-matching mechanism into a standard automatic-speech-recognition system. We then train the system as a whole to recognize only the speech of the wake word utterer. In tests, this approach reduced speech recognition errors by 15%.

We implemented our technique using two different neural-network architectures. Both were variations of a sequence-to-sequence encoder-decoder network with an attention mechanism. A sequence-to-sequence network is one that processes an input sequence — here, a series of “frames”, or millisecond-scale snapshots of an audio signal — in order and produces a corresponding output sequence — here, phonetic renderings of speech sounds.
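
To make the notion of "frames" concrete, here is a minimal sketch of cutting a sampled audio signal into overlapping frames. The function name, the frame length, and the hop length are illustrative assumptions; the article does not say how the frames are computed.

```python
def split_into_frames(samples, frame_length, hop_length):
    """Cut a list of audio samples into overlapping fixed-length frames.

    frame_length and hop_length are counts of samples. Both are
    hypothetical parameters chosen for illustration only.
    """
    frames = []
    start = 0
    # Stop as soon as a full frame no longer fits in the signal.
    while start + frame_length <= len(samples):
        frames.append(samples[start:start + frame_length])
        start += hop_length
    return frames


# Toy signal of 10 samples, 4-sample frames, advancing 2 samples at a time.
samples = list(range(10))
frames = split_into_frames(samples, frame_length=4, hop_length=2)
```

Each frame overlaps its neighbor by half its length here. A real front end would typically also convert each frame into spectral features before passing it to the network.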

In an encoder-decoder network, the encoder summarizes the input as a vector — a sequence of numbers — of fixed length. Typically, the vector is more compact than the original input. The decoder then converts the vector into an output. The entire network is trained together, so that the encoder learns to produce summary vectors well suited to the decoder’s task.

Finally, the attention mechanism tells the decoder which elements of the encoder’s summary vector to focus on when producing an output. In a sequence-to-sequence model, the attention mechanism’s decision is typically based on the current states of both the encoder and decoder networks.
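
A single attention step can be sketched as follows. This is a simplified illustration rather than the model from the paper: scoring by dot product is an assumption made for brevity, and the names `attend`, `encoder_states`, and `decoder_state` are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(encoder_states, decoder_state):
    """Return (weights, context) for one decoding step.

    Each encoder state is scored against the current decoder state
    (dot-product scoring is an assumption), and the context vector
    is the weighted average of the encoder states.
    """
    scores = [dot(h, decoder_state) for h in encoder_states]
    weights = softmax(scores)
    dim = len(encoder_states[0])
    context = [
        sum(w * h[i] for w, h in zip(weights, encoder_states))
        for i in range(dim)
    ]
    return weights, context


encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [2.0, 0.0]
weights, context = attend(encoder_states, decoder_state)
```

The first and third encoder states align with the decoder state, so they receive larger weights than the second.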

Our baseline sequence-to-sequence encoder-decoder model with attention, which we modified to emphasize speech inputs with the same acoustic features as the “wake word” that activates Alexa

Our first modification to this baseline network was simply to add an input to the attention mechanism. In addition to receiving information about the current states of the encoder and decoder networks, our modified attention mechanism also receives the raw frame data corresponding to the wake word. During training, the attention mechanism automatically learns which acoustic characteristics of the wake word to look for in subsequent speech.
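
One way to picture the extra input is as an additional term in the attention score. The sketch below is a hand-written stand-in for what the network learns during training, not the mechanism from the paper. It assumes the wake word frames and the encoder states share a dimension, summarizes the wake word by averaging its frames, and uses a hypothetical `anchor_weight` to mix in cosine similarity.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, or 0.0 if either vector has zero length."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm > 0 else 0.0


def mean_vector(frames):
    """Average a list of equal-length frames into one vector."""
    count = len(frames)
    return [sum(f[i] for f in frames) / count for i in range(len(frames[0]))]


def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def anchored_attention_weights(encoder_states, decoder_state,
                               wake_word_frames, anchor_weight=1.0):
    """Attention weights with an added wake word similarity term.

    anchor_weight is a hypothetical fixed constant; in a trained
    model the contribution of the wake word would be learned.
    """
    anchor = mean_vector(wake_word_frames)
    scores = [
        dot(h, decoder_state) + anchor_weight * cosine(h, anchor)
        for h in encoder_states
    ]
    return softmax(scores)


# Both encoder states match the decoder state equally well,
# but only the first resembles the wake word frames.
encoder_states = [[1.0, 0.0], [0.0, 1.0]]
decoder_state = [1.0, 1.0]
wake_word_frames = [[0.9, 0.1], [1.1, -0.1]]
weights = anchored_attention_weights(
    encoder_states, decoder_state, wake_word_frames
)
```

With the wake word term included, the state that resembles the wake word receives the larger weight even though the decoder alone cannot tell the two apart.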

In another experiment, we trained the network more explicitly to emphasize input speech whose acoustic profile matches that of the wake word. First, we added a mechanism that directly compares the wake word acoustics with those of subsequent speech. Then we used the result of that comparison as an input to a mechanism that learns to suppress — or “mask” — some elements of the encoder’s summary vector before they even pass to the attention mechanism. Otherwise, the attention mechanism is the same as in the baseline model.
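
The masking idea can be sketched in the same spirit. Here a fixed sigmoid of cosine similarity stands in for the learned mask, and `sharpness` and `threshold` are hypothetical constants; the article says only that the mask is learned from the result of the comparison.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine similarity, or 0.0 if either vector has zero length."""
    norm = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / norm if norm > 0 else 0.0


def mean_vector(frames):
    count = len(frames)
    return [sum(f[i] for f in frames) / count for i in range(len(frames[0]))]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def mask_encoder_outputs(encoder_states, wake_word_frames,
                         sharpness=5.0, threshold=0.5):
    """Scale each encoder state by a mask value between 0 and 1.

    States that resemble the averaged wake word frames keep most of
    their magnitude; the rest are suppressed before attention runs.
    """
    anchor = mean_vector(wake_word_frames)
    masks = [
        sigmoid(sharpness * (cosine(h, anchor) - threshold))
        for h in encoder_states
    ]
    masked = [[m * x for x in h] for m, h in zip(masks, encoder_states)]
    return masks, masked


encoder_states = [[1.0, 0.0], [0.0, 1.0]]
wake_word_frames = [[0.9, 0.1], [1.1, -0.1]]
masks, masked = mask_encoder_outputs(encoder_states, wake_word_frames)
```

Note that the mask in this sketch depends only on the encoder states and the wake word, never on the decoder state.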

We expected the masking approach to outperform the less explicitly supervised attention mechanism, but in fact it fared slightly worse, reducing the error rate of the baseline model by only 13%, rather than 15%. We suspect that this is because the decision to mask encoder outputs is based solely on the state of the encoder network, whereas the modified attention mechanism factors in the state of the decoder network, too. In future work, we plan to explore a masking mechanism that also considers the decoder state.
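
To see what the two figures mean in practice, the short calculation below reads them as relative reductions of the baseline error rate, which is an assumption. The 10% baseline is a hypothetical value chosen for round numbers; the article does not report absolute error rates.

```python
def reduced_error_rate(baseline_rate, relative_reduction):
    """Error rate left after a relative reduction of the baseline."""
    return baseline_rate * (1.0 - relative_reduction)


baseline = 0.10  # hypothetical baseline error rate, not from the article

attention_rate = reduced_error_rate(baseline, 0.15)  # modified attention
masking_rate = reduced_error_rate(baseline, 0.13)    # masking approach
```

Under that hypothetical baseline, the modified attention mechanism would bring the error rate to 8.5% and the masking approach to 8.7%.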

Acknowledgments: Yiming Wang, I-Fan Chen, Yuzong Liu, Tongfei Chen, Björn Hoffmeister

About the Author
Xing Fan is a senior applied scientist in the Alexa AI group.
