- Interspeech 2019: Machine learning approaches for building task-oriented dialogue systems require large conversational datasets with labels to train on. We are interested in building task-oriented dialogue systems from human-human conversations, which may be available in ample amounts in existing customer care center logs or can be collected from crowd workers. Annotating these datasets can be prohibitively expensive. Recently…
- Interspeech 2019: In automatic speech recognition, confidence measures provide a quantitative representation used to assess the reliability of generated hypothesis text. For personal assistant devices like Alexa, speech recognition errors are inevitable due to the growing number of applications. Hence, confidence scores provide an important metric for downstream consumers to gauge the correctness of ASR hypothesis text and…
- NAACL 2019: Neural network models have recently gained traction for sentence-level intent classification and token-based slot-label identification. In many real-world scenarios, users have multiple intents in the same utterance, and a token-level slot label can belong to more than one intent. We investigate an attention-based neural network model that performs multi-label classification for identifying multiple intents…
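The multi-label setup described in that abstract differs from standard intent classification in one key way: each intent is scored with its own sigmoid rather than a shared softmax, so several intents can fire on a single utterance. A minimal sketch with toy vectors (the encoder, attention vector `attn_w`, and intent matrix `intent_W` are all illustrative placeholders, not the paper's architecture):

```python
# Minimal sketch of attention-pooled multi-label intent classification.
# All weights here are toy placeholders; real systems learn them end to end.
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def multi_label_intents(token_vecs, attn_w, intent_W, threshold=0.5):
    """Attention-pool token vectors, then score each intent independently.

    A sigmoid per intent (instead of one softmax over intents) lets
    multiple intents exceed the threshold for the same utterance.
    """
    scores = token_vecs @ attn_w        # (T,) attention logits, one per token
    alpha = softmax(scores)             # attention weights over tokens
    pooled = alpha @ token_vecs         # (d,) utterance representation
    probs = sigmoid(intent_W @ pooled)  # (num_intents,) independent probabilities
    return [i for i, p in enumerate(probs) if p > threshold]
```

With a softmax output layer the two intent scores would be forced to compete; here an utterance can legitimately return zero, one, or several intent indices.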
- NAACL 2019: Text normalization (TN) is an important step in conversational systems. It converts written text to its spoken form to facilitate speech recognition, natural language understanding, and text-to-speech synthesis. Finite state transducers (FSTs) are commonly used to build grammars that handle text normalization (Sproat, 1996; Roark et al., 2012). However, translating linguistic knowledge into grammars requires…
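To make the written-to-spoken mapping concrete, here is a minimal rule-based sketch of the kind of transformation a TN grammar performs. Real systems compile such rules into weighted FSTs (for example with a library like Pynini); this regex version, with rules I chose for illustration, shows only the mapping, not the FST machinery:

```python
# Minimal sketch of rule-based text normalization (written -> spoken form).
# Production grammars compile rules like these into FSTs; the specific
# rules below are illustrative examples, not a real grammar.
import re

ONES = ["zero", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]

def digits_to_words(match):
    # Read each digit aloud; full grammars also produce cardinal and
    # ordinal readings ("forty-two", "forty-second"), skipped here.
    return " ".join(ONES[int(d)] for d in match.group())

RULES = [
    (re.compile(r"\bDr\.(?=\s)"), "Doctor"),     # abbreviation expansion
    (re.compile(r"\d+"), digits_to_words),       # digit verbalization
    (re.compile(r"%"), " percent"),              # symbol verbalization
]

def normalize(text):
    """Apply each rewrite rule in order, left to right over the text."""
    for pattern, repl in RULES:
        text = pattern.sub(repl, text)
    return text
```

Rule ordering matters, just as composition order matters for FSTs: digits are verbalized before the "%" rule so that "50%" becomes "five zero percent" rather than leaving the symbol attached to unread digits.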
-
IEEE Journal on Emerging and Selected Topics in Circuits and System (JETCAS)2019Large scale machine learning (ML) systems such as the Alexa automatic speech recognition (ASR) system continue to improve with increasing amounts of manually transcribed training data. Instead of scaling manual transcription to impractical levels, we utilize semi-supervised learning (SSL) to learn acoustic models (AM) from the vast firehose of untranscribed audio data. Learning an AM from 1 Million hours
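A common SSL recipe for this setting, sketched under my own assumptions rather than the paper's exact method, is pseudo-labeling with confidence filtering: a teacher model transcribes the untranscribed audio, and only hypotheses whose confidence clears a threshold join the training pool for the student acoustic model.

```python
# Minimal sketch of confidence-filtered pseudo-labeling for SSL.
# The teacher interface and threshold are illustrative assumptions,
# not the method described in the JETCAS paper.

def select_pseudo_labels(teacher, untranscribed, threshold=0.9):
    """Keep (audio, hypothesis) pairs whose teacher confidence clears the bar.

    `teacher` is any callable mapping an audio item to a
    (hypothesis, confidence) pair; low-confidence hypotheses are
    discarded so that recognition errors do not pollute training.
    """
    selected = []
    for audio in untranscribed:
        hypothesis, confidence = teacher(audio)
        if confidence >= threshold:
            selected.append((audio, hypothesis))
    return selected
```

At the million-hour scale the abstract mentions, even an aggressive threshold that discards most hypotheses still leaves far more training data than manual transcription could ever supply.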
Related content
- January 21, 2020: Self-learning system uses customers’ rephrased requests as implicit error signals.
- January 16, 2020: According to listener tests, whispers produced by a new machine learning model sound as natural as vocoded human whispers.
- December 11, 2019: Related data selection techniques yield benefits for both speech recognition and natural-language understanding.
- November 6, 2019: Today is the fifth anniversary of the launch of the Amazon Echo, so in a talk I gave yesterday at the Web Summit in Lisbon, I looked at how far Alexa has come and where we’re heading next.
- October 28, 2019: In a paper we’re presenting at this year’s Conference on Empirical Methods in Natural Language Processing, we describe experiments with a new data selection technique.
- October 17, 2019: This year at EMNLP, we will cohost the Second Workshop on Fact Extraction and Verification — or FEVER — which will explore techniques for automatically assessing the veracity of factual assertions online.