Alexa AI co-organizes special sessions at Interspeech
Sessions on multidevice scenarios, inclusive and fair speech technologies, trustworthy speech processing, and speech intelligibility prediction seek paper submissions.
At this year's Interspeech conference, in September, Alexa AI is co-organizing four special sessions — themed sessions within the main conference — all of which are currently seeking paper submissions.
One session is on machine learning and signal processing in the context of multiple networked smart devices. This session will address topics such as synchronization, arbitration (deciding which device should respond to a query), and privacy.
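To make the arbitration problem concrete, here is a minimal sketch of one simple, hypothetical policy: each device reports a wake-word confidence and an estimated signal-to-noise ratio, and the best-scoring device answers. All names and weights below are illustrative, not Alexa's actual algorithm.

```python
def arbitrate(reports, confidence_weight=0.7, snr_weight=0.3):
    """Pick the device that should respond to a query.

    reports: dict mapping device_id -> (wake_word_confidence in [0, 1],
             estimated SNR in dB). The SNR is squashed into [0, 1]
    before the two cues are combined with fixed (hypothetical) weights.
    """
    def score(report):
        confidence, snr_db = report
        snr_norm = max(0.0, min(1.0, snr_db / 30.0))  # clamp to [0, 1]
        return confidence_weight * confidence + snr_weight * snr_norm

    # The device with the highest combined score wins arbitration.
    return max(reports, key=lambda device: score(reports[device]))

winner = arbitrate({
    "kitchen": (0.91, 18.0),      # closest to the speaker
    "living_room": (0.62, 6.0),
    "bedroom": (0.40, 2.0),
})
# The kitchen device has both the highest confidence and the best SNR.
```

A real system would also have to handle clock synchronization and network latency between devices, which this sketch ignores.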
Another Interspeech session is on inclusive and fair speech technologies. Algorithmic bias has been well studied in natural-language processing and computer vision but less so in speech. Possible paper topics include methods of bias analysis and mitigation, dataset creation, and ASR for atypical speech.
The third session is on trustworthy speech processing, which focuses on the development of models whose goals go beyond accuracy to incorporate privacy, interpretability, fairness, ethics, bias mitigation, and related areas.
Finally, the fourth special session is on predicting the intelligibility of speech — both the raw acoustic signal and the signal generated by hearing aids — to hearing-impaired listeners. This session is related to the Clarity Challenge, a five-year initiative to improve hearing aids in which Alexa AI is participating.
There’s more information about the individual sessions below. Submissions to the special sessions should go through the main-conference submission portal. The submission deadline is March 21.
Challenges and opportunities for signal processing and machine learning for multiple smart devices
The purpose of this session is to promote research in multiple-device signal processing and machine learning by bringing together industry and academic experts to discuss topics that include but are not limited to:
- Multiple-device audio datasets
- Automatic speech recognition
- Keyword spotting
- Device arbitration (i.e., deciding which device should respond to the user’s query)
- Speech enhancement: de-reverberation, noise reduction, echo reduction
- Source separation
- Speaker localization and tracking
- Privacy-sensitive signal processing and machine learning
The session will bring together top researchers working in the multisensor domain, and even though their specific applications may differ (e.g., enhancement vs. acoustic-event detection), the similarity of the problem space encourages cross-pollination of techniques.
Organizers:
- Jarred Barber, applied scientist with Alexa AI
- Gregory Ciccarelli, applied scientist with Alexa AI
- Israel Cohen, Amazon Scholar and professor at Technion-Israel Institute of Technology
- Tao Zhang, senior manager of applied science with Alexa AI
Inclusive and Fair Speech Technologies
Alexa AI is co-organizing this session with leading researchers in the field from around the world. The session will feature a series of oral presentations (or posters with two-minute introductions if more than six papers are accepted) that may address but are not limited to the following topics:
- methods for bias analysis and mitigation, including algorithmic training criteria;
- creating, managing, and sharing datasets for bias quantification and methods for data augmentation, curation, and coding techniques, with an emphasis on user groups not included in standard corpora;
- ASR for atypical speech (e.g., ALS, stroke, deafness, Down syndrome);
- ethical considerations about inclusion, democratization of speech technologies, and making speech interaction seamless for all;
- applications of personalization techniques while fostering fairness (i.e., fairness-aware personalization).
Organizers:
- Peng Liu, senior machine learning scientist with Alexa AI
- Anirudh Mani, applied scientist with Alexa AI
- Tao Zhang, senior manager of applied science with Alexa AI
Trustworthy Speech Processing
Given the ubiquity of machine learning systems, it is important to ensure private and safe handling of data. Speech processing presents a unique set of challenges, given the rich information carried in linguistic and paralinguistic content, including speaker traits and characteristics of the speaker's state and interactions. This special session will bring together new and experienced researchers working on trustworthy machine learning and speech processing. The session organizers are seeking novel and relevant submissions from academic and industrial research groups showcasing both theoretical and empirical advances in trustworthy speech processing (TSP).
Topics of interest include but are not limited to:
- Differential privacy
- Federated learning
- Ethics in speech processing
- Model interpretability
- Quantifying and mitigating bias in speech processing
- New datasets, frameworks, and benchmarks for TSP
- Discovery and defense against emerging privacy attacks
- Trustworthy machine learning in applications of speech processing, such as automatic speech recognition
Organizers:
- Anil Ramakrishna, an applied scientist with Alexa AI
- Rahul Gupta, an applied-science manager with Alexa AI
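To give one of the session's topics a concrete shape: the Laplace mechanism is a standard differential-privacy primitive, sketched below for a simple counting query over utterance data. The query, the epsilon value, and all names here are illustrative only, not drawn from the session itself.

```python
import math
import random

def laplace_noise(scale):
    """Sample a Laplace(0, scale) variate via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_count(records, predicate, epsilon=1.0):
    """Release a count under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so the required noise scale is
    1 / epsilon: smaller epsilon means stronger privacy and more noise.
    """
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

# Hypothetical query: how many utterances are music-playback requests?
utterances = ["play music", "stop", "play jazz", "weather"]
noisy = private_count(utterances, lambda u: u.startswith("play"), epsilon=0.5)
```

The released value is the true count (2) plus zero-mean noise, so individual records cannot be confidently inferred from any single release, while aggregate statistics remain useful on average.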
Speech intelligibility prediction for hearing-impaired listeners
Disabling hearing impairment affects 360 million people worldwide, and one of the greatest challenges for hearing-impaired listeners is understanding speech in the presence of background noise. The development of better hearing aids requires prediction models that can take audio signals and use knowledge of the listener's characteristics (e.g., an audiogram) to estimate the signals' intelligibility. These include models that estimate the intelligibility of natural signals and models that estimate the intelligibility of signals that have been processed using hearing aid algorithms.
The Clarity Prediction Challenge (part of the five-year Clarity Challenge) provides noisy speech signals that have been processed by a number of hearing-aid signal-processing systems, along with corresponding intelligibility scores. Contestants are asked to produce models that predict those scores given just the signals, their clean references, and a characterization of each listener’s specific hearing impairment. The challenge will remain open until the Interspeech submission deadline, and all entrants are welcome.
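As a toy illustration of this kind of intrusive intelligibility prediction — loosely in the spirit of envelope-correlation metrics such as STOI, and in no way a challenge baseline — the sketch below compares short-time energy envelopes of a clean reference and a processed signal and maps their correlation to a score in [0, 1]. All parameter values are arbitrary placeholders.

```python
import math
import random

def frame_energies(signal, frame_len=256):
    """Short-time energy envelope: energy of consecutive fixed-size frames."""
    return [sum(s * s for s in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    vy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (vx * vy) if vx and vy else 0.0

def predict_intelligibility(clean, processed, slope=10.0, midpoint=0.5):
    """Map envelope correlation to a [0, 1] score with a logistic curve."""
    r = correlation(frame_energies(clean), frame_energies(processed))
    return 1.0 / (1.0 + math.exp(-slope * (r - midpoint)))

# Synthetic demo: an amplitude-modulated tone, lightly vs. heavily degraded.
random.seed(0)
clean = [math.sin(0.013 * i) * math.sin(0.3 * i) for i in range(4096)]
mild = [s + random.gauss(0.0, 0.05) for s in clean]
severe = [s + random.gauss(0.0, 5.0) for s in clean]
mild_score = predict_intelligibility(clean, mild)
severe_score = predict_intelligibility(clean, severe)
```

A real predictor would additionally condition on the listener's audiogram, which is precisely what distinguishes the hearing-impaired setting from generic intelligibility metrics.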
The special session welcomes submissions from entrants to the Clarity Prediction Challenge but also invites papers on related topics in hearing impairment and speech intelligibility, including but not limited to:
- Statistical speech modeling for intelligibility prediction
- Modeling energetic and informational noise masking
- Individualizing intelligibility models using audiometric data
- Intelligibility prediction in online and low-latency settings
- Model-driven speech intelligibility enhancement
- New methodologies for intelligibility model evaluation
- Speech resources for intelligibility model evaluation
- Applications of intelligibility modeling in acoustic engineering
- Modeling interactions between hearing impairment and speaking style
- Papers using the data supplied with the Clarity Prediction Challenge
Organizers:
- Daniel Korzekwa, an applied-science manager with Alexa AI