- AI-ML Systems 2024. While we can customize large language models (LLMs) for specific domains by fine-tuning on domain-specific labeled data, the performance of the customized models is highly dependent on the quality of the labeled data. Obtaining high-quality labeled data for custom domains often requires considerable human effort and associated costs. However, in many cases, unlabeled data is readily available at little…
- ACM SIGSPATIAL 2024. Determining the precise location of customers is important for an efficient and reliable delivery experience, both for customers and delivery associates. Address text is a primary source of information provided by customers about their location. In this paper, we study the important and challenging task of matching free-form customer address text to determine if two addresses represent the same physical…
- Large language models (LLMs) can be prone to hallucinations: generating unreliable outputs that are unfaithful to their inputs or to external facts, or that are internally inconsistent. In this work, we address several challenges for post-hoc hallucination detection in production settings. Our pipeline for hallucination detection entails, first, producing a confidence score representing the likelihood that a generated… (see the confidence-scoring sketch after this list)
- In recent years, Vision Language Models (VLMs) have achieved significant advances due to the success of large language models. The common strategy for aligning vision and language models involves a two-step process: an alignment (or pretraining) stage and an instruction-tuning stage. During the alignment stage, a projection module is trained to map image embeddings into the language space using a paired… (see the projection-module sketch after this list)
- Language Resources and Evaluation, 2024. In Artificial Intelligence research, perspectivism is an approach to machine learning that aims to leverage data annotated by different individuals in order to model the varied perspectives that influence their opinions and world views. We present the first survey of datasets and methods relevant to perspectivism in Natural Language Processing (NLP). We review datasets in which individual annotator labels…
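The hallucination-detection entry above mentions a pipeline whose first step produces a confidence score for each generated output. The teaser does not describe the scoring model itself, so the sketch below uses a hypothetical lexical-overlap scorer purely as a stand-in to illustrate the post-hoc, score-then-threshold pattern; the function names, the heuristic, and the threshold value are illustrative assumptions, not the paper's method.

```python
# Minimal sketch of post-hoc hallucination detection via a confidence score.
# The overlap heuristic is a hypothetical stand-in for whatever scorer the
# production pipeline actually uses; only the score-then-threshold shape is the point.

def _content_words(text: str) -> list[str]:
    """Lowercased words with punctuation and common stop words removed."""
    stop = {"the", "a", "an", "and", "or", "of", "to", "in", "on", "by", "is", "are", "was", "were"}
    words = [w.lower().strip(".,;:!?") for w in text.split()]
    return [w for w in words if w and w not in stop]


def support_score(source: str, generated: str) -> float:
    """Fraction of content words in the generated text that also appear in the source.

    A crude proxy for faithfulness: low overlap suggests the output may make
    claims that are unsupported by the input.
    """
    src = set(_content_words(source))
    gen = _content_words(generated)
    if not gen:
        return 1.0
    return sum(w in src for w in gen) / len(gen)


def flag_hallucination(source: str, generated: str, threshold: float = 0.5) -> bool:
    """Flag the output as a likely hallucination when the confidence score is low."""
    return support_score(source, generated) < threshold


if __name__ == "__main__":
    src = "The package was delivered to the customer on Tuesday by the local courier."
    ok = "The local courier delivered the package on Tuesday."
    bad = "The package was lost and the customer received a full refund."
    print(flag_hallucination(src, ok))   # False: well supported by the source
    print(flag_hallucination(src, bad))  # True: introduces unsupported claims
```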
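The VLM entry above describes the common alignment-stage recipe in which a projection module maps image embeddings into the language model's embedding space using paired data. As a rough illustration of that idea only, here is a minimal PyTorch sketch; the embedding dimensions, the two-layer MLP, and the class name ImageToLanguageProjector are assumptions for illustration, not the paper's design.

```python
# Minimal sketch of an alignment-stage projection module for a VLM.
# Generic illustration of the common recipe, not a specific paper's architecture.
import torch
import torch.nn as nn


class ImageToLanguageProjector(nn.Module):
    """Maps vision-encoder embeddings into the language model's embedding space."""

    def __init__(self, vision_dim: int = 1024, language_dim: int = 4096):
        super().__init__()
        self.proj = nn.Sequential(
            nn.Linear(vision_dim, language_dim),
            nn.GELU(),
            nn.Linear(language_dim, language_dim),
        )

    def forward(self, image_embeddings: torch.Tensor) -> torch.Tensor:
        # image_embeddings: (batch, num_patches, vision_dim)
        return self.proj(image_embeddings)


if __name__ == "__main__":
    projector = ImageToLanguageProjector()
    # Stand-in for the output of a frozen vision encoder on a paired image-caption batch.
    fake_image_embeddings = torch.randn(2, 256, 1024)
    projected = projector(fake_image_embeddings)
    print(projected.shape)  # torch.Size([2, 256, 4096]): ready to interleave with caption token embeddings
```

In many such recipes, only the projector's parameters are updated during this stage while the vision encoder and the language model stay frozen; the instruction-tuning stage then adapts the combined model to follow prompts.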
Related content
- September 06, 2022. Speech recognition and text-to-speech predominate, but other topics include audio watermarking, automatic dubbing, and compression.
- September 02, 2022. Method would enable customers to evaluate supporting evidence for tip reliability.
- August 23, 2022. New speech representations and self-supervised learning are two of the recent trends that most intrigue him.
- August 15, 2022. Data augmentation and post-editing strategies lift Amazon’s submission above competitors.
- August 02, 2022. With an encoder-decoder architecture, rather than a decoder-only one, the Alexa Teacher Model outperforms other large language models on few-shot tasks such as summarization and machine translation.
- August 01, 2022. McKeown awarded IEEE Innovation in Societal Infrastructure Award and named a member of the American Philosophical Society.