TheWebConf: Stable themes, new wrinkles
Amazon Scholar Eugene Agichtein on incorporating knowledge into natural-language-processing models, multimodal interactions, and more.
Famously, in 1998, the first research paper about Google’s ranking algorithm was turned down by more-established academic conferences on information retrieval before finding a home at the upstart World Wide Web Conference, which was only four years old at the time.
“It was accepted to WWW because it was this new and emerging conference that was just taking cool ideas,” says Eugene Agichtein, an Amazon Scholar, the Winship Professor of computer science at Emory University, and a researcher whose 20-year involvement with the Web Conference included a stint as program committee co-chair in 2017. “It was accepting of new topics, and it moved faster and was more adaptable than traditional academic conferences. And it was more inclusive of industry work.”
This year, the formerly disruptive conference, now known simply as the Web Conference and nicknamed TheWebConf, receives another badge of mainstream acceptance, as it officially comes under the aegis of the Association for Computing Machinery.
“This year marks the historical transition of the conference series to ACM, the world’s largest scientific- and educational-computing society,” says Yoelle Maarek, the Amazon vice president for research and science at Alexa Shopping and a vice president of the conference’s new steering committee. “This definitely paints an even brighter future for the conference series.”
“Five years ago” — the year in which Agichtein was program chair — “we had a record number of submissions to the conference,” Agichtein says. “Out of 966 submissions, 164 were accepted. This year, there were almost double the submissions from five years ago. There were 1,820 submissions, with, again, a 17% acceptance rate. The conference has just exploded, and it remains incredibly competitive.
“Because of the acceptance rate, a lot of potentially cool and exciting work doesn't get in. However, there are a lot of what they call alternate tracks for industry, for posters and demos, and for web development where a lot of these emerging topics get accepted. For example, e-sports and online gaming, which would be a struggle to evaluate in a regular academic conference — e-sports has a special track at the Web Conference this year.”
Shifts and trends
In just the five years since he served as program chair, Agichtein says, there have been some notable shifts in the distribution of research topics covered at the conference.
“One of the popular topics five years ago was crowdsourcing, investigating methodologies for large-scale human data collection for training and evaluating machine learning models,” he says. “But by now, it has become a mainstream method for creating training data for large models. Similarly, there is no longer a separate track for conversational systems, because conversational interfaces have become incorporated into the general search or recommendation system tracks.”
“In ’17, we introduced a new track to the Web Conference on computational health,” Agichtein adds, “and I was very happy to see that there are a lot of papers this year on health on the web, with different names, like web for good or web for society. Especially with the pandemic, the web has become central to health-related activities and research — tracking things like infection rates. It was interesting to see how much it took off.”
Glancing over the program of this year’s Web Conference, Agichtein notices a few pronounced trends.
“User modeling has been a central part of the web, and this year is no exception,” he says. “It's all about trying to personalize content, trying to model how people are interacting with the systems. I would say there are at least two dozen papers on representing users, building user models, and trying to personalize or present content to them. And security, privacy, and trust remain a critical issue.”
Knowledge and multimodality
One of the research trends that most intrigues Agichtein is the incorporation of both structured and unstructured knowledge and reasoning into natural-language-processing models for conversational information retrieval and recommendation systems.
“I can give you an example close to our work at Amazon,” he says. “In order to generate an informed response, a conversational agent needs to be able to detect when, how, and what knowledge to incorporate into a conversation in a coherent manner. For example, to recommend an item such as a movie, an agent needs to represent the conversation context and retrieve useful knowledge about the movie itself and, ideally, provide relevant information about what made this movie appropriate for the user.
“There's been a wide variety of approaches to how to incorporate this knowledge, whether it's to incorporate it directly into the generative model by memorizing everything — storing it as part of the language model — or by retrieving knowledge from a variety of sources at runtime, which is the approach that we tend to favor.
“The new approaches will allow us to better select relevant knowledge or reason about which parts of the knowledge sources are helpful to include, because we have more capacity to capture the conversational context itself and more powerful models to pull in the knowledge needed to either generate a response or to select among possible responses or to understand what the user is trying to do.
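The runtime-retrieval approach Agichtein describes can be illustrated with a minimal sketch. It is not the system used at Amazon: the knowledge entries, the stopword list, the word-overlap scoring, and the function names `retrieve` and `respond` are all assumptions made up for illustration. The point is only that knowledge is looked up from an external source when the response is generated, rather than being memorized inside the language model.

```python
import re

# Hypothetical knowledge source: made-up movie entries, for illustration only.
KNOWLEDGE = [
    {"title": "Movie A", "facts": "a slow-burn space drama about isolation"},
    {"title": "Movie B", "facts": "a family comedy about a road trip"},
    {"title": "Movie C", "facts": "a space adventure with comedy for kids"},
]

# Small, assumed stopword list so that filler words do not count as matches.
STOPWORDS = {"a", "about", "for", "i", "maybe", "something", "the", "want", "with"}


def tokenize(text):
    """Lowercase the text and return its set of non-stopword terms."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS


def retrieve(context, knowledge, top_k=1):
    """Rank knowledge entries by word overlap with the conversation so far."""
    context_terms = tokenize(" ".join(context))
    scored = [(len(context_terms & tokenize(entry["facts"])), entry)
              for entry in knowledge]
    # Keep only entries that share at least one term with the conversation.
    scored = [(score, entry) for score, entry in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [entry for _, entry in scored[:top_k]]


def respond(context, knowledge):
    """Build a reply that cites the retrieved knowledge, or ask for more detail."""
    hits = retrieve(context, knowledge)
    if not hits:
        return "Could you tell me more about what you like?"
    best = hits[0]
    return f"You might like {best['title']}: {best['facts']}."


if __name__ == "__main__":
    conversation = ["I want something funny", "maybe a road trip story"]
    print(respond(conversation, KNOWLEDGE))
```

A production system would presumably replace the word-overlap score with a learned retriever and the template reply with a generative model, but the division of labor is the same: represent the conversation context, select relevant knowledge, then produce a response grounded in it.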
“The other thing I have been studying is how users interact with information retrieval and conversational systems. Conversational interfaces have become ubiquitous, thanks to Alexa and others, but there's a completely open area on how those agents would interact with users in the real world, and in combination with other modalities such as screens and available sensors. So when we have responsive and potentially autonomous devices like Amazon’s Astro or other robots interacting with users in the real, physical environment, we need completely new models to represent the physical context of the interaction and to connect the content and the user’s gestures to what they refer to on the screen or in the real world.
“In this spirit, we have organized the Alexa Prize TaskBot Challenge, providing an opportunity for university teams to develop conversational-AI agents to assist users with cooking and home improvement tasks. The user modeling track at TheWebConf would be a perfect venue for that kind of work.
“The research community has spent 20 years optimizing models to interpret user queries and result clicks on the web. Now we have much richer environments and interaction modalities. So you can imagine it'll take us another 20 years to really come up with accurate ways of interpreting user interactions with multimodal conversational systems embedded in the user’s space.”
For the most part, however, “the overall themes of TheWebConf have remained relatively stable for the last five years,” Agichtein says. “It's just that the diversity within each of the tracks has continued to increase. And it’s encouraging to continue to see strong representation of both academia and industry. That's the spirit in which the conference was founded.”