Alexa scientists present two new techniques that improve wake word performance

The Amazon Echo is a hands-free smart home speaker you control with your voice. Wake word detection is the first step in enabling a delightful customer experience with an Echo or other Alexa-enabled device, so accurate detection of “Alexa” or a substitute wake word is critical. Building a wake word system with low error rates is challenging when the device has limited computational resources and must operate in the presence of background noise such as speech or music.

Next week, at the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2018, we are presenting two new techniques that improve on-device wake word detection performance:

  1. A new deep neural network (DNN) architecture for training a speech feature directly from raw audio input; and
  2. A novel background noise modeling method using monophone-based sound units that can take richer information into account.

In the first paper, we focus on improving the DNN-Hidden Markov Model (HMM) system by training a feature extraction DNN from raw audio rather than relying on the handcrafted speech features traditionally used in speech recognition. In the second paper, we present a new wake word system that comprises a two-stage classifier and show how wake word performance can be improved by incorporating richer phone (classes of sound) contexts into the two-stage system.

Time Delayed Bottleneck Highway Networks

The illustration below contrasts a conventional DNN architecture with our new DNN structure. The main difference is that our new system replaces a handcrafted log-mel filter bank energy (LFBE) front end with a trainable front-end DNN. By directly modeling raw audio rather than LFBE, we can learn novel features of the target wake word and optimize our classifier for improved performance. Except for the discrete Fourier transform (DFT), this approach is wholly data-driven. We apply a highway network to direct audio modeling to ease the optimization difficulties that a deep network structure causes. Furthermore, we efficiently reduce the large dimension of the input vector with a bottleneck layer followed by a time-delayed window.
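The three building blocks named above can be sketched in a few lines of plain Python. This is a minimal illustration, not the architecture from the paper: it assumes the standard highway formulation y = T(x) · H(x) + (1 − T(x)) · x with a ReLU transform and a sigmoid gate, and the layer sizes, weights, and window widths are hypothetical values chosen only to make the example run.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def highway_layer(x, W_h, b_h, W_t, b_t):
    """Standard highway layer: y = T(x) * H(x) + (1 - T(x)) * x."""
    h = [max(0.0, a + b) for a, b in zip(matvec(W_h, x), b_h)]  # candidate transform H(x)
    t = [sigmoid(a + b) for a, b in zip(matvec(W_t, x), b_t)]   # transform gate T(x)
    return [ti * hi + (1.0 - ti) * xi for ti, hi, xi in zip(t, h, x)]

def bottleneck(x, W_b):
    """Linear projection of x to a lower dimension."""
    return matvec(W_b, x)

def time_delay_window(frames, t, left, right):
    """Concatenate frames t-left .. t+right, repeating the edge frames at the boundaries."""
    last = len(frames) - 1
    window = []
    for k in range(t - left, t + right + 1):
        window.extend(frames[min(max(k, 0), last)])
    return window

# Toy usage with hypothetical sizes: 4-dim input frames, 2-dim bottleneck.
identity = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
zeros = [[0.0] * 4 for _ in range(4)]
W_b = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
frames_in = [[0.1 * (i + j) for j in range(4)] for i in range(5)]
compact = [bottleneck(highway_layer(f, identity, [0.0] * 4, zeros, [0.0] * 4), W_b)
           for f in frames_in]
stacked = time_delay_window(compact, t=2, left=1, right=1)  # 3 frames x 2 dims = 6 values
```

The gate lets each unit choose between transforming its input and passing it through unchanged, which is what makes deep stacks easier to train. Applying the bottleneck before the window keeps the stacked vector small: here three context frames cost 6 values rather than 12.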

Time Delayed Bottleneck Highway Networks

The graph below shows that our time-delayed bottleneck highway network with DFT input significantly improves performance across a range of false alarm rates (FAR), yielding approximately a 20 percent relative improvement in the area under the curve (AUC), a common measure of machine-learning model accuracy. Our work also makes clear that a larger amount of training data would improve wake word detection performance.
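To make the phrase "relative improvement in AUC" concrete, here is a small sketch of the arithmetic. It assumes the curve plots false rejection rate against false alarm rate, so that a smaller area is better, and it integrates with the trapezoid rule. The operating points are made-up illustrative numbers, not results from the paper.

```python
def trapezoid_auc(far, frr):
    """Area under the curve through the points (far[i], frr[i]), via the trapezoid rule."""
    points = sorted(zip(far, frr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (y0 + y1) * (x1 - x0)
    return area

def relative_improvement(baseline, new):
    """Fractional reduction of `new` relative to `baseline` (lower is better)."""
    return (baseline - new) / baseline

# Hypothetical operating points, for illustration only.
far_points = [0.0, 0.5, 1.0]
baseline_frr = [0.10, 0.08, 0.06]
new_frr = [0.08, 0.064, 0.048]

baseline_auc = trapezoid_auc(far_points, baseline_frr)
new_auc = trapezoid_auc(far_points, new_frr)
gain = relative_improvement(baseline_auc, new_auc)  # about 0.20, i.e. 20 percent relative
```

A relative figure is measured against the baseline's own value, so a 20 percent relative improvement means the area shrank by a fifth of the baseline area, not by 20 percentage points.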

Contrast of a conventional and our new wake word system

Monophone-Based Background Modeling

In the second paper, we introduce a two-stage on-device wake word detection system based on DNN acoustic modeling, propose a new approach for modeling background noise using monophone-based sound units, and show how richer information can be extracted from the monophone sound units to improve wake word accuracy.

An overview of the two-stage wake word system

With this new approach, we achieved about a 16 percent relative reduction in the false rejection rate (FRR), the rate at which Alexa doesn’t respond to the wake word, and about a 37 percent relative reduction in the false alarm rate (FAR), the rate at which Alexa mistakenly believes she’s heard the wake word. Moreover, when we introduce a second-stage classifier that extracts monophone units for final wake word detection, we reduce FAR by about 67 percent while using very few additional computational resources.
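One way to see why a second stage can cut false alarms at little extra cost is to view the system as a cascade: the second classifier runs only on segments the first stage has already accepted. The sketch below illustrates that control flow only. The scoring functions, thresholds, and toy scores are hypothetical stand-ins, not the models or features from the paper.

```python
def two_stage_detect(segments, first_stage, second_stage, t1, t2):
    """Return indices of accepted segments and how often the second stage ran."""
    detections = []
    second_stage_calls = 0
    for index, segment in enumerate(segments):
        if first_stage(segment) < t1:
            continue  # rejected by the first stage; the second stage never runs
        second_stage_calls += 1
        if second_stage(segment) >= t2:
            detections.append(index)  # both stages agree: report a wake word
    return detections, second_stage_calls

# Toy usage: each "segment" is a hypothetical (stage-1 score, stage-2 score) pair.
segments = [(0.1, 0.9), (0.8, 0.2), (0.9, 0.95), (0.3, 0.1)]
detections, calls = two_stage_detect(
    segments,
    first_stage=lambda s: s[0],
    second_stage=lambda s: s[1],
    t1=0.5,
    t2=0.5,
)
```

In this toy run the second stage is evaluated on two of the four segments and vetoes one first-stage candidate. The extra computation scales with how often the first stage fires rather than with the total amount of audio.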

Below are the papers we’re presenting at ICASSP next week. Although each method is presented as separate work, both techniques can of course be combined to achieve better wake word performance. That will be the focus of our future work.

"Time-Delayed Bottleneck Highway Networks Using A DFT Feature For Keyword Spotting"
"Monophone-based Background Modeling For Two-Stage On-Device Wake Word Detection"

Acknowledgements: Kenichi Kumatani, Sankaran Panchapagesan, Jinxi Guo, Ming Sun, Anirudh Raju, Jiacheng Gu, Ryan Thomas, Nikko Ström, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Arindam Mandal, as well as the entire Wake Word team for supporting this work.
