ASRU: Integrating speech recognition and language understanding

Amazon’s Jimmy Kunzmann on how “signal-to-interpretation” models improve availability and performance.

Jimmy Kunzmann, a senior manager for applied science with Alexa AI, is one of the sponsorship chairs at this year’s IEEE Automatic Speech Recognition and Understanding Workshop (ASRU). His research team also presented two papers at the conference, both on the topic of “signal-to-interpretation”, or the integration of automatic speech recognition and natural-language understanding into a single machine learning model.

Jimmy Kunzmann, a senior manager for applied science with Alexa AI and a sponsorship chair at the IEEE Automatic Speech Recognition and Understanding Workshop (ASRU).

“Signal-to-interpretation derives the domain, intent, and slot values directly from the audio signal, and it’s becoming more and more a hot topic in research land,” Kunzmann says. “Research is driven largely by what algorithm gives the best performance in terms of accuracy, and signal-to-interpretation can drive accuracy up and latency and memory footprint down.”
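As a rough sketch of the idea (illustrative only, not Alexa’s production architecture), a signal-to-interpretation model can be pictured as a single network with a shared acoustic encoder and interpretation heads on top: utterance-level heads for domain and intent, and a frame-level head for slot tags. The layer sizes and label counts below are invented for the example.

    # Minimal signal-to-interpretation sketch: audio features in, NLU labels out,
    # with no intermediate transcript. Hyperparameters are illustrative.
    import torch
    import torch.nn as nn

    class SignalToInterpretation(nn.Module):
        def __init__(self, n_mels=80, hidden=256, n_domains=10, n_intents=50, n_slots=30):
            super().__init__()
            # Shared acoustic encoder over log-mel filterbank frames.
            self.encoder = nn.LSTM(n_mels, hidden, num_layers=2,
                                   batch_first=True, bidirectional=True)
            enc_dim = 2 * hidden
            self.domain_head = nn.Linear(enc_dim, n_domains)   # utterance-level
            self.intent_head = nn.Linear(enc_dim, n_intents)   # utterance-level
            self.slot_head = nn.Linear(enc_dim, n_slots)       # per-frame slot tags

        def forward(self, feats):                    # feats: (batch, frames, n_mels)
            enc, _ = self.encoder(feats)             # (batch, frames, 2 * hidden)
            pooled = enc.mean(dim=1)                 # utterance summary
            return {"domain": self.domain_head(pooled),
                    "intent": self.intent_head(pooled),
                    "slots": self.slot_head(enc)}

    model = SignalToInterpretation()
    out = model(torch.randn(1, 200, 80))             # roughly two seconds of 10 ms frames
    print({name: tensor.shape for name, tensor in out.items()})

Because one model serves both recognition and understanding, there is no hand-off between separate ASR and NLU components, which is where the latency and footprint savings Kunzmann describes come from.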

The Alexa AI team is constantly working to improve Alexa’s accuracy, but its interest in signal-to-interpretation stemmed from the need to ensure Alexa’s availability on resource-constrained devices with intermittent Internet connections.

“If Internet connectivity drops all of a sudden, and nothing is working anymore, in a home or car environment, that's frustrating — when your lights are not switched on anymore, or you can’t call your favorite contacts in your car,” Kunzmann says.

Kunzmann says that his team’s early work concentrated on finding techniques to dramatically reduce the memory footprint of models that run on-device — techniques such as perfect hashing. But that work still approached automatic speech recognition (ASR) and natural-language understanding (NLU) as separate, sequential tasks.
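Perfect hashing, for instance, lets a fixed lookup table be stored with no collision-handling overhead: a hash seed is chosen so that every key lands in a distinct slot. The toy search below is an assumption for illustration, not Amazon’s on-device code.

    # Brute-force search for a seed that hashes a fixed key set with no collisions,
    # so the table needs exactly len(keys) slots and no chaining or probing.
    def fnv1a(key: str, seed: int) -> int:
        h = 2166136261 ^ seed
        for byte in key.encode("utf-8"):
            h = ((h ^ byte) * 16777619) & 0xFFFFFFFF
        return h

    def find_perfect_seed(keys, max_seed=1_000_000):
        n = len(keys)
        for seed in range(max_seed):
            if len({fnv1a(k, seed) % n for k in keys}) == n:   # all slots distinct
                return seed
        raise RuntimeError("no perfect seed found; raise max_seed")

    keys = ["turn on the lights", "play music", "set a timer", "call mom"]
    seed = find_perfect_seed(keys)
    table = [None] * len(keys)
    for k in keys:
        table[fnv1a(k, seed) % len(keys)] = k    # minimal, collision-free table
    print(seed, table)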

More recently, he says, the team has moved to end-to-end neural-network-based models that tightly couple ASR and NLU, enabling more compact on-device models.

“By replacing traditional techniques with neural techniques, we could get a smaller footprint — and faster and more accurate models, actually,” Kunzmann says. “And the closer we couple all system components, the further we increase reliability.”

Running end-to-end models on device can also improve responsiveness, Kunzmann says.

“Fire TV customers said that when we process requests like switching channels or proceeding to the next page on-device, we are much faster, and usability goes up,” he says.

At ASRU, Kunzmann’s team is reporting on two new projects to make on-device, neural, signal-to-interpretation models even more useful.

Dynamic content

One paper, “Context-aware Transformer transducer for speech recognition”, considers the problem of how to incorporate personalized content — for instance, names from an address book, or the custom names of smart appliances — into neural models at run time.

“In the old days, they had so-called class-based language models, and at inference time, you could load these lists dynamically and get the user’s personalized content decoded,” Kunzmann says. “With neural approaches, you have a huge parameter set, but it is all pretrained. So you have to invent means of ingesting user data at run time.

“The neural network has numerous layers, represented typically as vectors of probabilities. If you are going from one layer to the other, you feed updated probabilities forward. You can ingest information by changing these probabilities based on dynamic content, which allows you to change output probabilities to recognize user context — like your personal address book or your location of interest.”

The architecture of the context-aware model proposed in "Context-aware Transformer transducer for speech recognition": (a) Transformer transducer model; (b) context biasing layer; (c) context-aware Transformer transducer (CATT) with audio embeddings; (d) CATT with audio and label embeddings.
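A simplified way to picture the biasing mechanism (a sketch under assumed dimensions, not the paper’s exact implementation) is as a cross-attention layer: the encoder’s audio embeddings attend over embeddings of the user’s dynamic entries, and the attended context is added back into the stream, nudging the output probabilities toward those entries.

    # Simplified context-biasing layer: audio embeddings cross-attend to
    # embeddings of run-time context entries (e.g., contact names).
    import torch
    import torch.nn as nn

    class ContextBiasingLayer(nn.Module):
        def __init__(self, dim=256, n_heads=4):
            super().__init__()
            self.attn = nn.MultiheadAttention(dim, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(dim)

        def forward(self, audio_emb, context_emb):
            # audio_emb:   (batch, frames, dim)   encoder output for the utterance
            # context_emb: (batch, entries, dim)  one embedding per dynamic entry
            bias, attn_weights = self.attn(query=audio_emb, key=context_emb,
                                           value=context_emb)
            return self.norm(audio_emb + bias), attn_weights

    layer = ContextBiasingLayer()
    audio = torch.randn(1, 200, 256)       # encoded audio frames
    contacts = torch.randn(1, 12, 256)     # embeddings of 12 address-book names
    biased, weights = layer(audio, contacts)
    print(biased.shape, weights.shape)     # (1, 200, 256), (1, 200, 12)

In the paper’s CATT variants, the queries for this biasing step can come from the audio embeddings alone or from both audio and label embeddings.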

Multilingual processing

The other ASRU paper from Kunzmann’s team, “In pursuit of babel: Multilingual end-to-end spoken language understanding”, tackles the problem of bringing multilingual models, which can respond in kind to requests in any of several languages, onto the device.

In the cloud-based version of Alexa’s multilingual service, the same customer utterance is sent to multiple ASR models at once. Once a separate language identification model has determined what language is being spoken, the output of the appropriate ASR model is used for further processing. This prevents delays, because it enables the ASR models to begin working before the language has been identified.
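In pseudocode form, the cloud setup is roughly the fan-out-and-select pattern below; run_asr and identify_language are hypothetical stand-ins for illustration, not Alexa APIs.

    # Fan the utterance out to per-language ASR and a language-ID model in
    # parallel, then keep only the ASR output that matches the detected language.
    import asyncio

    async def run_asr(locale: str, audio: bytes) -> str:
        await asyncio.sleep(0.3)                      # placeholder for real decoding
        return f"<{locale} transcript>"

    async def identify_language(audio: bytes) -> str:
        await asyncio.sleep(0.1)                      # LID typically finishes first
        return "es-ES"

    async def recognize(audio: bytes, locales=("en-US", "es-ES", "fr-FR")) -> str:
        asr_tasks = {loc: asyncio.create_task(run_asr(loc, audio)) for loc in locales}
        lid_task = asyncio.create_task(identify_language(audio))
        locale = await lid_task                       # ASR has already been running
        transcript = await asr_tasks[locale]          # keep only the matching output
        for loc, task in asr_tasks.items():
            if loc != locale:
                task.cancel()                         # discard the other hypotheses
        return transcript

    print(asyncio.run(recognize(b"...")))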

The architecture of the multilingual model proposed in "In pursuit of babel: Multilingual end-to-end spoken language understanding".

“On-device, we cannot afford that, because we don't have compute fleets running in parallel,” Kunzmann says. “Remember, signal-to-interpretation is one system that tightly couples ASR and NLU. In a nutshell, we show that we can train the signal-to-interpretation models on data from three different locales — in this case, English, Spanish, and French — and that improves accuracy and shrinks the model footprint. We could improve these systems’ performance by an order of magnitude and run these models on-device.”

“I think this is a core aspect of what we want to do in science at Amazon — driving the research community to new areas. Performance improvements, like dynamic content processing, are helping research generally, but they’re also helping solve our customer problems.”
