Conversational AI

Alexa Prize TaskBot Challenge 2 winner announced

Team TWIZ from NOVA School of Science and Technology awarded $500,000 prize for first-place overall performance.

By Alexa Prize team

October 3, 2023

Amazon today announced that a team from NOVA School of Science and Technology (FCT NOVA) in Portugal has earned first place in the Alexa Prize TaskBot Challenge 2. Participants worked to address one of the hardest problems in conversational AI — creating next-generation conversational AI experiences that delight customers by addressing their changing needs as they complete complex tasks.

TaskBot is the first conversational AI challenge to incorporate multimodal customer experiences. During the contest, in addition to verbal instructions, some customers with Echo Show or Fire TV devices were also presented with step-by-step instructions, images, or diagrams to enhance task guidance.

“The most encouraging and impressive advances were in the application of large language models to dialog management itself,” said Michael Johnston, an applied science manager in Alexa AI who leads the science and engineering teams supporting the Alexa Prize. “Rather than just using LLMs to create candidate responses, teams explored having an instruction-following LLM drive the whole conversation. I think cracking that problem for the task assistance domain was the major contributing factor in the quality and naturalness we saw in the top performing bots.”

Team TWIZ, advised by João Magalhães, took home $500,000 for earning first place in overall performance.

“I’m extremely happy about the team’s creativity in designing the groundbreaking TWIZ LLM,” Magalhães said. “Conversations about video content take CX to an all-new level and I’m very proud for helping to pioneer video dialogue in the Alexa Prize. I think there’s a lot to explore here.”

This year’s challenge was expanded to include more hobbies and at-home activities. Teams were asked to find interesting ways to incorporate visual aids into every conversation turn when a screen is available. Innovative ideas on improving the presentation of visual aids, as well as the coordination of visual and verbal modalities, were part of the judging criteria.

“User dialogues in the Alexa TaskBot are unique, shedding a new light into the execution of manual tasks,” said Rafael Ferreira, the TWIZ team lead. “Leveraged by these dialogues, we learned that using TWIZ allowed us to steer conversations in a more contextual and insightful way.”

Team GRILL from the University of Glasgow, advised by Jeff Dalton, earned $100,000 for second place, and team ISABEL from the University of Pittsburgh, advised by Malihe Alikhani, earned the $50,000 third-place prize. The work of the top three teams, along with that of the other participants, is now captured in a series of research papers.

“Working on the TaskBot 2 Challenge gave us the unique opportunity to develop and deploy cutting-edge language models,” said Sophie Fischer, GRILL team lead. “We learned that it's not just about model size or improved training, but about using models in new and creative ways to help people.”

Five university teams were selected to participate in the final live interactions phase of the TaskBot Challenge 2 earlier this year. The teams were selected based on, among other criteria, customer feedback and the scientific merit of the technical papers produced by each team. The other two finalist teams were team PLAN-Bot from Virginia Tech, advised by Ismini Lourentzou, and team Sage from the University of California, Santa Cruz, advised by Xin (Eric) Wang.

“Compared to previous challenges, it was interesting to see how broadly generative AI and large language models are applied,” Johnston said. “Previous challenges have used earlier language models for generating candidate responses, but with the rise of large capacity language models with the ability to follow instructions, teams use them for many different tasks needed to improve their bots.

“This included tasks like intent classification, formulating search queries, creating synthetic datasets, creating compelling descriptions of tasks, and more,” he continued. “Teams also explored different user interfaces to enable users to more easily clarify and iterate on their input using the screen and they also started to add assistive technology capabilities to increase the reach of the taskbots to underserved communities.”

Alexa customers interacted with the university taskbots on Amazon Echo or Fire TV devices. Customer ratings and feedback helped the student teams improve their bots as they competed.

Each university selected for the challenge received a $250,000 research grant, Alexa-enabled devices, free Amazon Web Services (AWS) cloud computing services to support their research and development efforts, access to Amazon scientists, the CoBot (conversational bot) toolkit, and other tools such as automated speech recognition through Alexa, neural detection and response generation models, conversational datasets, and design guidance and development support from the Alexa Prize team.

During the contest, customers engaged with the university teams’ taskbots. After initiating the interaction, customers received a brief message informing them that they were interacting with an Alexa Prize university taskbot before being randomly connected to one of the participating taskbots.

After exiting the conversation with the taskbot, the customer was prompted for a verbal rating, followed by an option to provide additional feedback. The interactions, ratings, and feedback were shared with the teams to help them improve their taskbots. Customer ratings were also used to determine which university teams advanced to the semifinals and finals.

Success in the previous TaskBot Challenge required teams to address many difficult AI obstacles. The challenge required the fusion of multiple AI techniques including knowledge representation and inference, commonsense and causal reasoning, and language understanding and generation.

“The performance of some of the taskbots in the second year of the competition improved drastically compared to TaskBot 1,” said Eugene Agichtein, a computer science professor at Emory University and Amazon Scholar who also served as the faculty advisor for two of Emory’s Alexa Prize teams. “I was thrilled to see leaps forward due in part to the lessons learned and data and models created in the first year of the TaskBot competition, combined with improvements in LLM technology.”

The “GRILLBot” team from University of Glasgow won the TaskBot 1 Challenge in 2022, earning a $500,000 prize for its performance. Teams from NOVA School of Science and Technology (Portugal) and The Ohio State University earned second- and third-place prizes, respectively.