-
2024: Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection, followed by text generation. In this context, alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data for some of the component tasks. Yet, this enabling alignment step has usually been applied heuristically
-
2024: Multimodal Large Language Models (MLLMs) excel at synthesizing key information from diverse sources. However, generating accurate and faithful multimodal summaries is challenging, primarily due to the lack of appropriate multimodal datasets for fine-tuning that meaningfully integrate textual and visual modalities. To address this gap, we present a new dataset specifically designed for image-text multimodal
-
2024: End-to-end neural diarization (EEND) models offer significant improvements over traditional embedding-based Speaker Diarization (SD) approaches but fall short on generalizing to long-form audio with a large number of speakers. The EEND-vector-clustering method mitigates this by combining local EEND with global clustering of speaker embeddings from local windows, but this requires an additional speaker embedding
-
2024: Speaker Diarization (SD) systems are typically audio-based and operate independently of the ASR system in traditional speech transcription pipelines. They can produce speaker errors due to SD and/or ASR reconciliation, especially around speaker turns and regions of speech overlap. To reduce these errors, a Lexical Speaker Error Correction (LSEC), in which an external language model provides lexical information
-
Large language models (LLMs) exhibit an excellent ability to understand human languages, but do they also understand their own language, which appears as gibberish to us? In this work, we delve into this question, aiming to uncover the mechanisms underlying such behavior in LLMs. We employ the Greedy Coordinate Gradient optimizer to craft prompts that compel LLMs to generate coherent responses from seemingly nonsensical
Related content
-
March 31, 2021: Throughout the pandemic, the Alexa team has continued to invent on behalf of our customers.
-
March 26, 2021: In the future, says Amazon Scholar Emine Yilmaz, users will interact with computers to identify just the information they need, rather than scrolling through long lists of results.
-
March 24, 2021: Human-evaluation studies validate metrics, and experiments show evidence of bias in popular language models.
-
March 19, 2021: A model that uses both local and global context improves on the state of the art by 6% and 11% on two benchmark datasets.
-
March 16, 2021: Amanda Cullen, a PhD candidate in informatics at the University of California, Irvine, wanted to do work that had an impact outside of academia, and she found an ideal opportunity at Twitch.
-
March 11, 2021: Watch a recording of the presentation and Q&A roundtable featuring Amazon scientists and scholars.