- Amazon Technical Reports, 2025: Nova Premier is Amazon's most capable multimodal foundation model and a teacher model for distillation. It processes text, images, and video with a one-million-token context window, enabling analysis of large codebases, 400-page documents, and 90-minute videos in a single prompt [2]. We present the first comprehensive evaluation of Nova Premier's critical risk profile under the Frontier Model Safety Framework…
- 2025: Video summarization aims to generate a condensed textual version of an original video. Summaries may consist of either plain text or a shortlist of salient events, possibly including temporal or spatial references. Video Large Language Models (VLLMs) exhibit impressive zero-shot capabilities in video analysis. However, their performance varies significantly according to the LLM prompt, the characteristics…
- NeuS 2025: The "state" of State Space Models (SSMs) represents their memory, which fades exponentially over an unbounded span. By contrast, attention-based models have "eidetic" (i.e., verbatim, or photographic) memory over a finite span (the context size). Hybrid architectures combine state space layers with attention but still cannot recall the distant past and can access only the most recent tokens eidetically. Unlike…
- It is well known that large language models (LLMs) have good zero-shot and few-shot performance, which makes them promising candidates for inference when no or few training samples are available. However, when there is abundant task data, small custom-trained models match or exceed the performance of pre-trained LLMs, even after accounting for in-context examples. Further, smaller models…
- 2025: Fine-tuning large language models (LLMs) for specific tasks requires diverse, high-quality training data. However, obtaining sufficient relevant data remains a significant challenge. Existing data synthesis methods either depend on extensive seed datasets or struggle to balance task relevance and data diversity. To address these challenges, we propose Attribute-guided multI-hop Data Expansion (AIDE), a novel…
Related content
- September 24, 2020: A combination of audio and visual signals guides the device's movement, so the screen is always in view.
- September 24, 2020: Adjusting prosody and speaking style to conversational context is a first step toward "concept-to-speech".
- September 24, 2020: Natural turn-taking uses multiple cues (acoustic, linguistic, and visual) to help Alexa interact more naturally, without the need to repeat the wake word.
- September 24, 2020: Deep learning and reasoning enable customers to explicitly teach Alexa how to interpret their novel requests.
- September 18, 2020: Learn how Alexa Conversations helps developers author complex dialogue-management rules.
- September 16, 2020: How Amazon conducted customer-obsessed science research and engineering to release a vastly improved experience.