-
CVPR 2022: Outside-knowledge visual question answering (OKVQA) requires the agent to comprehend the image, make use of relevant knowledge from the entire web, and digest all the information to answer the question. Most previous works address the problem by first fusing the image and question in the multi-modal space, which is inflexible for further fusion with a vast amount of external knowledge. In this paper, we
-
ACL Findings 2022: Despite profound successes, contrastive representation learning relies on carefully designed data augmentations using domain-specific knowledge. This challenge is magnified in natural language processing, where no general rules exist for data augmentation due to the discrete nature of natural language. We tackle this challenge by presenting a Virtual augmentation Supported Contrastive Learning of sentence
-
ACL Findings 2022: Accurate automatic evaluation metrics for open-domain dialogs are in high demand. Existing model-based metrics for system response evaluation are trained on human annotated data, which is cumbersome to collect. In this work, we propose to use information that can be automatically extracted from the next user utterance, such as its sentiment or whether the user explicitly ends the conversation, as a proxy
-
ACL Findings 2022: Users interacting with voice assistants today need to phrase their requests in a very specific manner to elicit an appropriate response. This limits the user experience, and is partly due to the lack of reasoning capabilities of dialogue platforms and the hand-crafted rules that require extensive labor. One possible way to improve user experience and relieve the manual efforts of designers is to build an
-
WSDM 2022: Voice assistants such as Alexa, Siri, and Google Assistant have become increasingly popular worldwide. However, linguistic variations, variability of speech patterns, ambient acoustic conditions, and other such factors are often correlated with the assistants misinterpreting the user’s query. In order to provide better customer experience, retrieval based query reformulation (QR) systems are widely used
Related content
-
December 18, 2018: At a recent press event on Alexa’s latest features, Alexa’s head scientist, Rohit Prasad, mentioned multistep requests in one shot, a capability that allows you to ask Alexa to do multiple things at once. For example, you might say, “Alexa, add bananas, peanut butter, and paper towels to my shopping list.” Alexa should intelligently figure out that “peanut butter” and “paper towels” name two items, not four, and that bananas are a separate item.
-
December 17, 2018: In recent years, data representation has emerged as an important research topic within machine learning.
-
December 13, 2018: Language models are a key component of automatic speech recognition systems, which convert speech into text. A language model captures the statistical likelihood of any particular string of words, so it can help decide between different interpretations of the same sequence of sounds.
-
December 11, 2018: Suppose that you say to Alexa, “Alexa, play Mary Poppins.” Alexa must decide whether you mean the book, the video, or the soundtrack. How should she do it?
-
December 7, 2018: In the past few years, advances in artificial intelligence have captured our imaginations and led to the widespread use of voice services on our phones and in our homes.
-
December 4, 2018: Method factors in the utterances that immediately preceded the target utterance and its classification as a “dialogue act”