New sound detection approach improves on state of the art

Knowledge distillation technique for shrinking neural networks yields relative performance increases of up to 122%.

Sound detection is a popular application of today’s smart speakers. Alexa customers who activate Alexa Guard when they leave the house, for instance, receive notifications if their Alexa-enabled devices detect sounds such as glass breaking or smoke detectors going off while they’re away.

Sound detection — or, technically, acoustic-event detection (AED) — needs to run on-device: a home security application, for example, can’t miss a smoke alarm because of a momentary loss of Internet connectivity. 

A popular way to fit AED models on-device is to use knowledge distillation, in which a machine learning model with a small memory footprint is trained to reproduce the outputs of a more powerful but also much larger model.
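To make the idea concrete, here is a minimal sketch of a generic distillation loss: the student is penalized when its output distribution diverges from the teacher's. The softmax with a temperature, the KL divergence, and the names `softmax` and `distillation_loss` are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL divergence from the teacher's softened outputs to the student's.

    Assumption: a softmax/KL formulation; other output losses are possible.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(kl.mean())

# Hypothetical logits for one clip and three classes.
teacher = [[3.0, 0.0, 0.0]]
student = [[0.0, 0.0, 3.0]]
print(distillation_loss(student, teacher))  # positive: the outputs disagree
print(distillation_loss(teacher, teacher))  # zero: the outputs match
```

A higher temperature flattens both distributions, so the student also learns from the teacher's relative scores for the less likely classes.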

At this year’s Interspeech, we presented a new approach to knowledge distillation for AED systems. In tests, we compared our model to both a baseline model with no knowledge distillation and a model using a state-of-the-art knowledge distillation technique. On a standard metric called area under the precision-recall curve (AUPRC), our model improved on the earlier knowledge distillation model by 27% to 122%, relative to the baseline.

Our technique works by exploiting repetitions in the acoustic signal, which are common in the types of sounds that AED systems are typically trained to detect: the sounds of smoke detector alarms or barking dogs, for instance, have more-or-less recurrent acoustic patterns.

The spectrogram of an emergency vehicle siren, which maps power fluctuations in different frequency bands over time. The repetition in the signal is clearly visible (pink and gold lines).
From "Intra-utterance similarity preserving knowledge distillation for audio tagging"

While our system did deliver its greatest improvement over baseline on such repetitive signals, it also improved performance on loud, singular sounds such as engine and machinery impacts.

Deep neural networks, like the ones used in most AED models, are arranged into layers; input data is fed to the bottom layer, which processes it and passes the results to the next layer, which processes them and passes the results to the next layer, and so on.

Past work has improved knowledge distillation by using a technique called similarity-preserving knowledge distillation, which relies on similarities between the outputs of different network layers on training examples that share a label. 

For instance, sounds of breaking glass have certain acoustic characteristics not shared by sounds of barking dogs, and the layers’ outputs should reflect that. With similarity-preserving knowledge distillation, similarities inferred by the teacher model help guide the training of the student model.

We vary this approach to enforce similarities between the outputs of network layers for the same training example. That is, the outputs of the network layers should reflect the repetitions in the input signal. We thus call our approach intra-utterance similarity-preserving (IUSP) knowledge distillation.

We can enforce similarity between whichever layers of the teacher network — the larger network — and the student network — the smaller network — we want. For a given layer of the teacher model, we produce a matrix that maps its outputs for successive time steps of the input signal against themselves. The values in the matrix cells indicate the correlation between the layer’s outputs at different time steps.

The self-correlation matrix of the siren sound represented above. The bright green line on the central diagonal represents the strong correlation between the outputs at particular time steps and themselves. The other green diagonals indicate repetitions in the acoustic signal.
From "Intra-utterance similarity preserving knowledge distillation for audio tagging"
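A self-correlation matrix of this kind can be sketched in a few lines. The sketch below assumes that a layer's output is a `(T, D)` array with one `D`-dimensional feature vector per time step, and it uses cosine similarity between time steps as the correlation measure; both the layout and the function name `self_similarity` are assumptions made for illustration.

```python
import numpy as np

def self_similarity(features):
    """Return the (T, T) matrix of cosine similarities between time steps.

    features: array of shape (T, D), one feature vector per time step.
    """
    f = np.asarray(features, dtype=float)
    norms = np.linalg.norm(f, axis=1, keepdims=True)
    f = f / np.maximum(norms, 1e-12)  # L2-normalize each time step
    return f @ f.T

# Hypothetical layer output that repeats every two time steps.
layer_output = [[1.0, 0.0],
                [0.0, 1.0],
                [1.0, 0.0],
                [0.0, 1.0]]
print(self_similarity(layer_output))
```

For this toy input, the matrix has ones on the main diagonal and on the diagonals offset by two steps, which is the same kind of striped pattern as the off-center diagonals in the figure.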

During training, we evaluate the student model not only according to how well its final output matches that of the teacher model, but also according to how well the self-correlation matrices of its normalized outputs match the teacher’s.
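The two-part training objective can be sketched as an output-matching loss plus a penalty on the mismatch between the two self-correlation matrices. The mean squared difference, the `weight` parameter, and the function names below are assumptions for illustration; the paper's exact formulation may differ.

```python
import numpy as np

def self_similarity(features):
    """(T, T) cosine similarities between the rows of a (T, D) array."""
    f = np.asarray(features, dtype=float)
    f = f / np.maximum(np.linalg.norm(f, axis=1, keepdims=True), 1e-12)
    return f @ f.T

def iusp_penalty(student_features, teacher_features):
    """Mean squared difference between the two self-correlation matrices."""
    g_student = self_similarity(student_features)
    g_teacher = self_similarity(teacher_features)
    if g_student.shape != g_teacher.shape:
        raise ValueError("matrices differ in size; resize one of them first")
    return float(np.mean((g_student - g_teacher) ** 2))

def total_loss(output_loss, student_features, teacher_features, weight=1.0):
    """Output-matching loss plus the weighted similarity penalty."""
    return output_loss + weight * iusp_penalty(student_features, teacher_features)

# Hypothetical features: the teacher has 4 values per time step, the students 2.
teacher = [[1, 0, 0, 0], [0, 1, 0, 0], [1, 0, 0, 0], [0, 1, 0, 0]]
student_good = [[2, 0], [0, 3], [1, 0], [0, 1]]  # same repetition pattern
student_bad = [[1, 0], [1, 0], [1, 0], [1, 0]]   # ignores the repetition
print(iusp_penalty(student_good, teacher))
print(iusp_penalty(student_bad, teacher))
```

Because each matrix compares time steps with one another, the student and teacher features can have different widths as long as they cover the same number of time steps.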

Since the goal of knowledge distillation is to shrink the size of the machine learning model, the layers and intermediate features of the student model are often smaller — they have fewer processing nodes — than those of the teacher model. 

In that case, we use bilinear interpolation to make the student model’s self-correlation matrices the same size as the teacher’s. That is, we insert additional rows and columns into the matrix, and the value of each added cell is an interpolation between the values of the adjacent cells in the horizontal and vertical directions.
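As a rough illustration of the resizing step, the sketch below enlarges a square matrix with bilinear interpolation. It assumes an "align corners" convention, in which the corner cells of the original matrix are preserved; the function name `resize_bilinear` and that convention are illustrative choices rather than details from the paper.

```python
import numpy as np

def resize_bilinear(matrix, new_size):
    """Resize a 2-D matrix to (new_size, new_size) by bilinear interpolation."""
    m = np.asarray(matrix, dtype=float)
    rows, cols = m.shape
    # Fractional source coordinates for every target row and column.
    r = np.linspace(0, rows - 1, new_size)
    c = np.linspace(0, cols - 1, new_size)
    r0 = np.floor(r).astype(int)
    c0 = np.floor(c).astype(int)
    r1 = np.minimum(r0 + 1, rows - 1)
    c1 = np.minimum(c0 + 1, cols - 1)
    wr = (r - r0)[:, None]  # vertical interpolation weights
    wc = (c - c0)[None, :]  # horizontal interpolation weights
    # Interpolate horizontally along the two neighboring rows, then vertically.
    top = m[r0][:, c0] * (1 - wc) + m[r0][:, c1] * wc
    bottom = m[r1][:, c0] * (1 - wc) + m[r1][:, c1] * wc
    return top * (1 - wr) + bottom * wr

small = [[1.0, 0.0],
         [0.0, 1.0]]
print(resize_bilinear(small, 3))
# [[1.  0.5 0. ]
#  [0.5 0.5 0.5]
#  [0.  0.5 1. ]]
```

Each added cell is a weighted average of its neighbors in the original matrix, so the resized matrix stays symmetric when the input is symmetric.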

In our experiments, we used a standard benchmark data set that features eight classes of sound, including alarm sounds, dogs barking, impact sounds, and human speech.

As a baseline model, we used a standard AED network with no knowledge distillation. To assess our model, we also compared it to a model trained using similarity-preserving knowledge distillation.

We measured the models’ performance using area under the precision-recall curve, which represents the trade-off between false positives and false negatives, and we experimented with student models of four different sizes. We assessed the knowledge distillation models according to their degree of improvement over the baseline model.
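For readers who want to see the metric in action, the sketch below estimates the area under the precision-recall curve with average precision, a common step-wise estimate, and then expresses one model's score as a percentage improvement over a baseline. The function names and all numbers are hypothetical and are not taken from our experiments.

```python
def average_precision(labels, scores):
    """Average precision: the mean of the precision values at each true positive.

    labels: 0/1 ground-truth values; scores: the model's confidence per example.
    Tied scores are not handled specially in this sketch.
    """
    total_positives = sum(labels)
    if total_positives == 0:
        raise ValueError("at least one positive label is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this recall level
    return precision_sum / total_positives

def relative_improvement(model_score, baseline_score):
    """Percentage improvement of model_score over baseline_score."""
    return (model_score - baseline_score) / baseline_score * 100.0

# Hypothetical detections for one sound class.
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
print(average_precision(labels, scores))  # (1/1 + 2/3) / 2
print(relative_improvement(0.30, 0.20))   # about 50 (percent)
```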

Compared to the other knowledge distillation model, our model’s biggest improvement, a 122% increase in relative AUPRC, came with the smallest student model. The smallest improvement, 27% relative, came with the largest student model. Because the purpose of knowledge distillation is to produce a small model, the fact that the gains are largest for the smallest student indicates that our approach could be of use in real-world settings.
