Two new papers discuss how Alexa recognizes sounds

Last year, Amazon announced the beta release of Alexa Guard, a new service that lets customers who are leaving the house instruct their Echo devices to listen for glass breaking or smoke and carbon monoxide alarms going off.

At this year’s International Conference on Acoustics, Speech, and Signal Processing, our team is presenting several papers on sound detection. I wrote about one of them a few weeks ago, a new method for doing machine learning with unbalanced data sets.

Today I’ll briefly discuss two others, both of which, like the first, describe machine learning systems. One paper addresses the problem of media detection, or recognizing when the speech captured by a digital-assistant device comes from a TV or radio rather than a human speaker. In particular, we develop a way to better characterize media audio by examining longer-duration audio streams rather than merely classifying short audio snippets. Media detection helps filter a particularly deceptive type of background noise out of speech signals.

For our other paper, we used semi-supervised learning to train a system developed from an external dataset to do acoustic-event detection. Semi-supervised learning uses small sets of annotated training data to leverage larger sets of unannotated data. In particular, we use tri-training, in which three different models are trained to perform the same task, but on slightly different data sets. Pooling their outputs corrects a common problem in semi-supervised training, in which a model’s errors end up being amplified.

Our media detection system is based on the observation that the audio characteristics we would most like to identify are those common to all instances of media sound, regardless of content. Our network design is an attempt to abstract away from the properties of particular training examples.

Like many machine learning models in the field of spoken-language understanding, ours uses recurrent neural networks (RNNs). An RNN processes sequenced inputs in order, and each output factors in the inputs and outputs that preceded it.

We use a convolutional neural network (CNN) as a feature extractor and stack RNN layers on top of it. But each RNN layer has only a fraction as many nodes as the one beneath it. That is, only every third or fourth output from the first RNN provides an input to the second, and only every third or fourth output of the second RNN provides an input to the third.

A standard stack of recurrent neural networks (left) and the “pyramidal” stack we use instead

Because the networks are recurrent, each output we pass contains information about the outputs we skip. But this “pyramidal” stacking encourages the model to ignore short-term variations in the input signal.
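
As a rough illustration of the pyramidal stacking described above, the sketch below runs a toy scalar recurrent layer and passes only every third output up to the next layer. It is not the network from the paper: the functions `toy_rnn` and `pyramidal_stack`, the weights, the stride of 3, and the 27-frame input are all assumptions chosen to keep the example small.

```python
import math

def toy_rnn(inputs, w_in=0.5, w_rec=0.5):
    """Toy scalar RNN: each output depends on the current input and the previous output."""
    outputs, h = [], 0.0
    for x in inputs:
        h = math.tanh(w_in * x + w_rec * h)
        outputs.append(h)
    return outputs

def pyramidal_stack(inputs, num_layers=3, stride=3):
    """Run stacked toy RNNs, passing only every `stride`-th output to the next layer."""
    seq = inputs
    lengths = []  # sequence length entering each layer
    for layer in range(num_layers):
        lengths.append(len(seq))
        seq = toy_rnn(seq)
        if layer < num_layers - 1:
            # Keep the last output in each group of `stride` steps; through the
            # recurrence it already carries information about the skipped ones.
            seq = seq[stride - 1::stride]
    return seq, lengths

frames = [math.sin(0.3 * t) for t in range(27)]  # stand-in for extracted features
top, lengths = pyramidal_stack(frames)
print(lengths)  # how many time steps each layer processes
```

Each layer sees a shorter sequence than the one beneath it, so higher layers summarize longer stretches of the input.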

For every five-second snippet of audio processed by our system, the pyramidal RNNs produce a single output vector, representing the probabilities that the snippet belongs to any of several different sound categories.

But our system includes still another RNN, which tracks relationships between five-second snippets. We experimented with two different ways of integrating that higher-level RNN with the pyramidal RNNs. In the first, the output vector from the pyramidal RNN simply passes to the higher-level RNN, which makes the final determination about whether media sound is present.

In the other, however, the higher-level RNN lies between the middle and top layers of the pyramidal RNN. It receives its input from the middle layer, and its output, along with that of the middle layer, passes to the top layer of the pyramidal RNN.

In the second of our two contextual models, a high-level RNN (red circles) receives inputs from one layer of a pyramidal RNN (groups of five blue circles), and its output passes to the next layer (groups of two blue circles).

This was our best-performing model. When compared to a model that used the pyramidal RNNs but no higher-level RNN, it offered a 24% reduction in equal error rate, which is the error rate that results when the system parameters are set so that the false-positive rate equals the false-negative rate.
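
To make the equal-error-rate definition concrete, here is a minimal sketch that approximates it by sweeping a decision threshold over the observed scores and picking the point where the false-positive and false-negative rates are closest. The function name `equal_error_rate` and the toy scores and labels are assumptions for illustration, not data from the paper.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by sweeping thresholds over the observed scores.

    scores: higher means "more likely positive"; labels: 1 = positive, 0 = negative.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(scores)):
        # A sample is classified as positive when its score is >= threshold.
        fnr = sum(s < threshold for s in positives) / len(positives)
        fpr = sum(s >= threshold for s in negatives) / len(negatives)
        if abs(fpr - fnr) < best_gap:
            best_gap, eer = abs(fpr - fnr), (fpr + fnr) / 2
    return eer

scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 1, 0, 0, 0]
print(equal_error_rate(scores, labels))  # prints 0.25
```

With these toy values, a threshold of 0.6 misses one of four positives and accepts one of four negatives, so both error rates are 25%.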

Our other ICASSP paper presents our semi-supervised approach to acoustic-event detection (AED). One popular and simple semi-supervised learning technique is self-training, in which a machine learning model is trained on a small amount of labeled data and then itself labels a much larger set of unlabeled data. The machine-labeled data is then sorted according to confidence score — the system’s confidence that its labels are correct — and data falling in the right confidence window is used to fine-tune the model.

The model, that is, is retrained on data that it has labeled itself. Remarkably, this approach tends to improve the model’s performance.

But it also poses a risk. If the model makes a systematic error, and if it makes it with high confidence, then that error will feed back into the model during self-training, growing in magnitude.
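
The selection step in self-training can be sketched as follows. This is a minimal illustration, not the procedure from the paper: the function `select_for_self_training`, the confidence window of 0.9 to 1.0, and the toy probabilities are all assumptions.

```python
def select_for_self_training(probabilities, low=0.9, high=1.0):
    """Pick machine-labeled examples whose confidence falls inside [low, high].

    probabilities: one list of class probabilities per unlabeled example.
    Returns (example_index, predicted_label, confidence) tuples.
    The window bounds are illustrative assumptions.
    """
    selected = []
    for index, probs in enumerate(probabilities):
        confidence = max(probs)
        label = probs.index(confidence)
        if low <= confidence <= high:
            selected.append((index, label, confidence))
    return selected

unlabeled_probs = [
    [0.97, 0.02, 0.01],  # confident: kept
    [0.40, 0.35, 0.25],  # uncertain: dropped
    [0.05, 0.93, 0.02],  # confident: kept
    [0.55, 0.30, 0.15],  # uncertain: dropped
]
pseudo_labeled = select_for_self_training(unlabeled_probs)
print(pseudo_labeled)
```

Note that nothing in this filter checks whether a confident label is correct, which is exactly how a systematic, high-confidence error can be fed back into the model.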

Tri-training is intended to mitigate this kind of self-reinforcement. In our experiments, we created three different training sets, each the size of the original — 39,000 examples — by randomly sampling data from the original. There was substantial overlap between the sets, but in each, some data items were oversampled, and some were undersampled.
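
One common way to build such training sets is bootstrap sampling, that is, drawing with replacement. The sketch below assumes that is the sampling scheme, which the description above is consistent with but does not state outright. The 1,000-item stand-in data set, the fixed seed, and the helper `bootstrap_sample` are illustrative.

```python
import random
from collections import Counter

def bootstrap_sample(dataset, rng):
    """Draw len(dataset) items with replacement, so some repeat and some are left out."""
    return [rng.choice(dataset) for _ in dataset]

rng = random.Random(0)        # fixed seed, chosen only to make the sketch repeatable
original = list(range(1000))  # small stand-in for the 39,000 labeled examples
training_sets = [bootstrap_sample(original, rng) for _ in range(3)]

for sample in training_sets:
    counts = Counter(sample)
    oversampled = sum(1 for c in counts.values() if c > 1)  # items drawn more than once
    missing = len(original) - len(counts)                   # items never drawn
    print(len(sample), oversampled, missing)
```

Each resampled set has the same size as the original but a different mix of repeated and omitted items, so the three models trained on them tend to make somewhat different errors.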

We trained neural networks on all three data sets and saved copies of them, which we might call initial models. Then we used each of those networks to label another 5.4 million examples. We retrained each initial model on machine-labeled data only when both of the other models agreed on the labels with high confidence. In all, we retained only 5,000 examples out of the more than five million in the unlabeled data set.
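
The agreement rule can be sketched as below: an example is added to one model's retraining set only if the other two models assign it the same label, each with high confidence. The function names, the 0.9 threshold, and the toy probabilities are assumptions for illustration, not values from the paper.

```python
def predict(probs):
    """Return (label, confidence) for one model's class probabilities."""
    confidence = max(probs)
    return probs.index(confidence), confidence

def tri_training_selection(model_probs, threshold=0.9):
    """For each model, collect examples on which BOTH other models agree confidently.

    model_probs: three lists (one per model) of per-example class probabilities.
    Returns {model_index: [(example_index, label), ...]}.
    The threshold is an illustrative assumption.
    """
    num_examples = len(model_probs[0])
    selected = {m: [] for m in range(3)}
    for m in range(3):
        others = [k for k in range(3) if k != m]
        for i in range(num_examples):
            label_a, conf_a = predict(model_probs[others[0]][i])
            label_b, conf_b = predict(model_probs[others[1]][i])
            if label_a == label_b and min(conf_a, conf_b) >= threshold:
                selected[m].append((i, label_a))
    return selected

model_probs = [
    [[0.95, 0.05], [0.60, 0.40], [0.08, 0.92]],  # model 0
    [[0.97, 0.03], [0.30, 0.70], [0.04, 0.96]],  # model 1
    [[0.55, 0.45], [0.20, 0.80], [0.03, 0.97]],  # model 2
]
selected = tri_training_selection(model_probs)
print(selected)
```

Because a model never contributes to its own retraining labels, a confident mistake made by a single model is not fed back into that model.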

Finally, we used six different models to classify the examples in our test set: the three initial models and the three retrained models. On samples of three sounds (dog sounds, baby cries, and gunshots), pooling the results of all six models led to reductions in equal error rate (EER) of 16%, 26%, and 19%, respectively, over a standard self-trained model.
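
One simple way to pool model outputs is to average the class probabilities. The sketch below assumes averaging. The text above does not specify the pooling rule, so treat `pool_predictions` and the toy numbers as illustrative only.

```python
def pool_predictions(model_probs):
    """Average class probabilities across models for each example.

    model_probs: one list per model, each holding per-example class probabilities.
    """
    num_models = len(model_probs)
    pooled = []
    for per_model in zip(*model_probs):  # the outputs of every model for one example
        pooled.append([sum(col) / num_models for col in zip(*per_model)])
    return pooled

# Six hypothetical models (three initial, three retrained), one example, two classes.
six_models = [
    [[0.8, 0.2]], [[0.6, 0.4]], [[0.7, 0.3]],
    [[0.9, 0.1]], [[0.5, 0.5]], [[0.7, 0.3]],
]
pooled = pool_predictions(six_models)
print(pooled)
```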

Of course, using six different models to process the same input is impractical, so we also trained a seventh neural network to mimic the aggregate results of the first six. On the test set, that network was not quite as accurate as the six-network ensemble, but it was still a marked improvement over the standard self-trained model, reducing EER on the same three sample sets by 11%, 18%, and 6%, respectively.
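
Training one network to mimic an ensemble is often done by using the ensemble's pooled probabilities as soft targets. The sketch below shows one such objective, a cross-entropy between the pooled (teacher) distribution and a single model's (student) distribution. Whether the paper uses this exact loss is an assumption, and the function name and numbers are hypothetical.

```python
import math

def soft_cross_entropy(student_probs, teacher_probs, eps=1e-12):
    """Cross-entropy of the student's distribution against soft teacher targets."""
    return -sum(t * math.log(s + eps) for s, t in zip(student_probs, teacher_probs))

teacher = [0.7, 0.3]          # pooled ensemble output for one example
student_close = [0.68, 0.32]  # a student that nearly matches the ensemble
student_far = [0.2, 0.8]      # a student that disagrees with the ensemble

loss_close = soft_cross_entropy(student_close, teacher)
loss_far = soft_cross_entropy(student_far, teacher)
print(loss_close < loss_far)  # prints True
```

Minimizing a loss of this kind over many examples pushes the single network's outputs toward the ensemble's, so only one model needs to run at inference time.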

Acknowledgments: Qingming Tang, Chieh-Chi Kao, Viktor Rozgic, Bowen Shi, Spyros Matsoukas, Chao Wang
