Context-Aware Deep-Learning Method Boosts Alexa Dialogue System’s Ability to Recognize Conversation Topics by 35%

Conversational-AI systems have traditionally fallen into two categories: goal-oriented systems, which help users fulfill requests, and chatbots, which carry on informative or entertaining conversations.

Recently, the two areas have begun to converge, but separately or together, they both benefit from accurate “topic modeling”. Identifying the topic of a particular utterance can help goal-oriented systems route requests more accurately and keep chatbots’ comments relevant and engaging. Accurate topic tracking has also been shown to be strongly correlated with users’ subjective assessments of the quality of chatbot conversations.

In a paper we’re presenting at this year’s IEEE Spoken Language Technologies conference, we describe a system that uses two additional sources of information to determine the topic of a given utterance: the utterances that immediately preceded it and its classification as a “dialogue act”. Factoring that information in improves the accuracy of the system’s topic classification by 35%.

We validated our approach using more than 100,000 annotated utterances collected during the 2017 Alexa Prize competition, in which 15 academic research teams deployed experimental Alexa chatbot systems. In addition to generating innovative ideas about system design, the Alexa Prize helps address the chicken-and-egg problem that plagues conversational AI: training quality chatbots depends on realistic interaction data, but realistic interaction data is hard to come by without chatbots that people want to talk to.

Over the years, conversational-AI researchers have developed some standard taxonomies for classifying utterances as dialogue acts such as InformationRequests, Clarifications, or UserInstructions. Dialogue management systems generally use such classifications to track the progress of conversations.

We asked a team of annotators to label the data in our training set according to 14 dialogue acts and 12 topics, such as Politics, Fashion, EntertainmentMovies, and EntertainmentBooks. We also asked them to identify keywords in the utterances that helped them determine topics. For instance, a chatbot’s declaration that “Gucci is a famous brand from Italy” was assigned the topic Fashion, and “Gucci”, “brand”, and “Italy” were tagged as keywords associated with that topic.

We built topic-modeling systems that used three different neural-network architectures. One was a simple but fast network called a deep averaging network, or DAN. Another was a variation on the DAN that learned to predict not only the topics of utterances but also the keywords that indicated those topics. The third was a more sophisticated network called a bidirectional long short-term memory (bi-LSTM) network.

Long short-term memory (LSTM) networks process sequential data — such as strings of spoken words — in order, and a given output factors in the outputs that preceded it. LSTMs are widely used in natural-language understanding: the interpretation of the fifth word in a sentence, for instance, will often depend on interpretations of the first four. A bidirectional LSTM (bi-LSTM) network is one that runs through the same data sequence both forward and backward.
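The bidirectional idea can be illustrated with a deliberately simplified sketch. The recurrence below is a toy cell with a single tanh update and hand-picked weights, not an LSTM (it has no gates or memory cell), and the names `run_recurrence` and `run_bidirectional` are hypothetical. It shows only how a forward pass and a backward pass over the same sequence yield two states per position.

```python
import math
from typing import List, Tuple


def run_recurrence(inputs: List[float], w_in: float = 0.5, w_state: float = 0.5) -> List[float]:
    """Toy recurrent pass: each state depends on the current input and the previous state."""
    state = 0.0
    states = []
    for x in inputs:
        state = math.tanh(w_in * x + w_state * state)
        states.append(state)
    return states


def run_bidirectional(inputs: List[float]) -> List[Tuple[float, float]]:
    """Pair each position's forward state with its backward state."""
    forward = run_recurrence(inputs)
    # Process the reversed sequence, then flip the states back into the original order.
    backward = run_recurrence(inputs[::-1])[::-1]
    return list(zip(forward, backward))


states = run_bidirectional([1.0, 0.0, -1.0])
```

At each position, the forward state summarizes everything up to that point and the backward state summarizes everything after it, which is what lets a bidirectional network use both left and right context.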

Inputs to all three networks consist of a given utterance, its dialogue act classification, and its conversational context. Here, context means the last five turns of conversation, where a turn is a combination of a speaker utterance and a chatbot response. The dialogue act classifications come from a separate DAN model, which we trained using our labeled data.
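One way to picture these inputs is as a small record that carries the utterance, its dialogue act, and a rolling window of the last five turns. The sketch below is illustrative only: the class and field names (`TopicModelInput`, `ConversationTracker`) are hypothetical, and the dialogue act is passed in as a plain string rather than produced by a trained classifier.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Tuple

CONTEXT_TURNS = 5  # context is the last five turns of conversation

Turn = Tuple[str, str]  # (speaker utterance, chatbot response)


@dataclass
class TopicModelInput:
    utterance: str
    dialogue_act: str
    context: List[Turn]


class ConversationTracker:
    """Keeps only the most recent turns so each input sees a fixed-size context."""

    def __init__(self) -> None:
        self._turns: Deque[Turn] = deque(maxlen=CONTEXT_TURNS)

    def build_input(self, utterance: str, dialogue_act: str) -> TopicModelInput:
        # Snapshot the window before the current turn is completed.
        return TopicModelInput(utterance, dialogue_act, list(self._turns))

    def complete_turn(self, utterance: str, response: str) -> None:
        # Once the deque is full, appending drops the oldest turn automatically.
        self._turns.append((utterance, response))


tracker = ConversationTracker()
for i in range(7):
    tracker.complete_turn(f"user says {i}", f"bot says {i}")
model_input = tracker.build_input("what about Gucci", "InformationRequest")
```

After seven completed turns, `model_input.context` holds only turns 2 through 6.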

In the DAN-based topic-modeling system, the first step is to embed the words of the input utterances, both the current utterance and the prior turns of conversation. An embedding is a representation of a word as a point in a high-dimensional space, such that words with similar meanings are grouped together. The DAN produces embeddings of full sentences by simply averaging the embeddings of their words.

The embeddings of the prior turns of conversation are then averaged with each other to produce a single summary embedding, which is appended to the embedding of the current utterance. The combined embedding then passes to a neural network, which learns to correlate embeddings with topic classifications.

The DAN architecture
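The averaging and appending steps of the DAN can be sketched in a few lines. This is a minimal illustration under stated assumptions: the two-dimensional embedding table is invented toy data, the function names are hypothetical, and every sentence is assumed to contain at least one word from the table. The classifier network and the dialogue act input are not shown.

```python
from typing import Dict, List

Vector = List[float]


def average(vectors: List[Vector]) -> Vector:
    """Element-wise mean of equal-length vectors (assumes a non-empty list)."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def embed_sentence(sentence: str, table: Dict[str, Vector]) -> Vector:
    """Sentence embedding = average of the embeddings of its known words."""
    known = [table[word] for word in sentence.lower().split() if word in table]
    return average(known)


def dan_features(current: str, prior: List[str], table: Dict[str, Vector]) -> Vector:
    """Append the averaged prior-turn summary to the current utterance embedding."""
    context_summary = average([embed_sentence(u, table) for u in prior])
    return embed_sentence(current, table) + context_summary  # list concatenation


# Toy embedding table (invented values, for illustration only).
toy_table = {
    "gucci": [1.0, 0.0],
    "brand": [0.0, 1.0],
    "italy": [1.0, 1.0],
    "movie": [-1.0, 0.0],
}
features = dan_features("gucci brand", ["italy", "movie italy"], toy_table)
print(features)  # [0.5, 0.5, 0.5, 0.75]
```

The first two values are the current utterance's embedding and the last two are the context summary. In the system described above, a vector of this kind is what the neural network receives for classification.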

The second system, which uses a modified DAN — or ADAN, for attentional DAN — adds several ingredients to this recipe. During training, the ADAN built a matrix that mapped every word it encountered against each of the 12 topics it was being asked to recognize, recording the frequency with which annotators correlated a particular word with a particular topic. Each word thus had 12 numbers associated with it — a 12-dimensional vector — indicating its relevance to each topic. This matrix, which we call a topic-word attention table, gives the ADAN its name.
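A table of this kind might be built roughly as follows. The sketch assumes one plausible reading of the description: each word in an utterance is counted against the topic that annotators assigned to that utterance. Whether the counts were normalized or restricted to particular words is not stated here, so raw counts are used. Only the four topics named in this post are listed, so the vectors have 4 dimensions rather than 12, and `build_topic_word_table` is a hypothetical name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Only four of the 12 topics are named in the article; the rest are omitted here.
TOPICS = ["Politics", "Fashion", "EntertainmentMovies", "EntertainmentBooks"]


def build_topic_word_table(labeled: Iterable[Tuple[str, str]]) -> Dict[str, List[int]]:
    """Map each word to a per-topic count vector from (utterance, topic) pairs."""
    topic_index = {topic: i for i, topic in enumerate(TOPICS)}
    table = defaultdict(lambda: [0] * len(TOPICS))
    for utterance, topic in labeled:
        for word in utterance.lower().split():
            table[word][topic_index[topic]] += 1
    return dict(table)


table = build_topic_word_table([
    ("Gucci is a famous brand from Italy", "Fashion"),
    ("that brand made a movie", "EntertainmentMovies"),
])
print(table["gucci"])  # [0, 1, 0, 0]
print(table["brand"])  # [0, 1, 1, 0]
```

In this toy data, "gucci" points only at Fashion, while "brand" is split between two topics and is therefore a weaker signal for either one.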

During operation, the ADAN embeds the words of the current utterance and the past utterances. Like the DAN, it averages the word embeddings of each past utterance, then averages those averages together. But it processes the words of the current utterance separately, adding to the embedding of each the corresponding 12-dimensional topic vector. Each of these combination vectors is also combined with the past-utterance summaries, before passing to the neural network for classification.

The ADAN architecture

The output of the neural network, however, includes not only a prediction of the topic label but also a prediction of which words in the input correspond to that label. Although such keywords were labeled in our data set, we used the labels only to gauge the system’s performance, not to train it. That is, it learned to identify keywords in an unsupervised way.

Because it can identify keywords, the ADAN, unlike the DAN and the bi-LSTM, is “interpretable”: it issues not only a judgment but also an explanation of the basis for that judgment.

We experimented with two different methods of feeding data about prior utterances to the bi-LSTM. With one method, we fed it an averaged embedding of all five prior turns; in the other, we fed it embeddings of the prior turns sequentially. The first method is more efficient, but the second proved to be more accurate.

The bi-LSTM architecture

We evaluated four different versions of each system: a baseline version, which used only information about the current utterance; a version that added in only prior-turn information; a version that added in only dialogue act information; and a version that added in both.
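The four versions amount to a two-by-two grid over two switches. A small sketch, with hypothetical flag names, enumerates them:

```python
from itertools import product
from typing import Dict, List


def ablation_variants() -> List[Dict[str, bool]]:
    """Enumerate every combination of the two optional input sources."""
    variants = []
    for use_prior_turns, use_dialogue_act in product([False, True], repeat=2):
        variants.append({
            "use_prior_turns": use_prior_turns,
            "use_dialogue_act": use_dialogue_act,
        })
    return variants


variants = ablation_variants()
for variant in variants:
    print(variant)
```

The first entry, with both flags off, corresponds to the baseline, and the last, with both on, to the fully augmented version.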

With all four systems — DAN, ADAN, and the two varieties of bi-LSTM — adding prior-turn information and dialogue act information, both separately and together, improved accuracy over baseline. The bi-LSTM system augmented with both dialogue act and prior-turn information performed best, with an accuracy of 74 percent, up from 55 percent for baseline.
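The 35% figure quoted earlier and the 74-versus-55 figures here are consistent if the 35% is read as a relative improvement. The text does not say so explicitly, so the following arithmetic is a check under that assumption:

```python
baseline_accuracy = 55  # percent, bi-LSTM using only the current utterance
best_accuracy = 74      # percent, bi-LSTM with dialogue act and prior-turn information

absolute_gain = best_accuracy - baseline_accuracy
relative_gain = absolute_gain / baseline_accuracy

print(f"{absolute_gain} points absolute, {relative_gain:.1%} relative")
# 19 points absolute, 34.5% relative
```

A gain of 19 percentage points over a 55 percent baseline is roughly a 35% relative improvement.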

The ADAN had the lowest accuracy scores, but we suspect that its decision model was too complex to learn accurate correlations from the amount of training data we had available. Its performance should improve with more data, and as dialogue systems grow more sophisticated, interpretability may prove increasingly important.

Acknowledgments: Chandra Khatri, Rahul Goel, Angeliki Metallinou, Anushree Venkatesh, Raefer Gabriel, Arindam Mandal
