- COLING 2025 Workshop on Evaluation of Multi-Modal Generation, 2025. Multimodal generative AI usually involves generating image or text responses given inputs in another modality. The evaluation of image-text relevancy is essential for measuring response quality or ranking candidate responses. In particular, binary relevancy evaluation, i.e., “Relevant” vs. “Not Relevant”, is a fundamental problem. However, this is a challenging task considering that texts have diverse formats
- 2025. Text-to-Image diffusion models have shown remarkable capabilities in generating high-quality images. However, current models often struggle to adhere to the complete set of conditions specified in the input text and return unfaithful generations. Existing works address this problem by either fine-tuning the base model or modifying the latent representations during the inference stage with gradient-based
- Findings of EMNLP 2024, 2024. In a plethora of recent work, large language models (LLMs) demonstrated impressive reasoning ability, but many proposed downstream reasoning tasks only focus on final answers. Two fundamental questions persist: 1) how consistent is the reasoning, and 2) can models detect unreliable reasoning? In this paper, we investigate self-contradictory (SELF-CONTRA) reasoning, where the model reasoning does not support
- Findings of EMNLP 2024, 2024. Self-anthropomorphism in robots manifests itself through their display of human-like characteristics in dialogue, such as expressing preferences and emotions. Our study systematically analyzes self-anthropomorphic expression within various dialogue datasets, outlining the contrasts between self-anthropomorphic and non-self-anthropomorphic responses in dialogue systems. We show significant differences in
- U2BigData 2024, 2024. This paper introduces a Context-Aware and User Intent-Aware follow-up Question Generation (CA-UIA-QG) method in multi-turn conversational settings. Our CA-UIA-QG model is designed to simultaneously consider the evolving context of a conversation and identify user intent. By integrating these aspects, it generates relevant follow-up questions, which can better mimic user behavior and align well with users
Related content
- September 16, 2020. How Amazon conducted customer-obsessed science research and engineering to release a vastly improved experience.
- September 14, 2020. University teams have until October 6, 2020 to submit their applications.
- September 14, 2020. Winning teams from the third annual Alexa Prize competition present their research in new video.
- August 25, 2020. ACL 2020 keynote presentation given by Amazon Scholar and Columbia University professor Kathleen McKeown.
- August 21, 2020. Watch the recording of Marcu's live interview with Alexa evangelist Jeff Blankenburg.
- August 20, 2020. The team’s non-real-time system is the top performer, while its real-time system finishes third overall and second among real-time systems, despite using only 4% of a CPU core.