Conversational AI

Alexa gets better at predicting customers’ goals

With a new machine learning system, Alexa can infer that an initial question implies a subsequent request.

By Anjishnu Kumar, Anand Rathi

November 11, 2020

3 min read

Amazon’s goal for Alexa is that customers should find interacting with her as natural as interacting with another human being. Toward that end, in September, we announced natural turn-taking, or conversing with Alexa without repetition of the wake word, and in July we began the public beta of Alexa Conversations, which makes it easier for developers to integrate sophisticated conversational experiences into their Alexa skills.

Now, we’re taking another step toward natural interaction with a capability that lets Alexa infer customers’ latent goals — goals that are implicit in customer requests but not directly expressed. For instance, if a customer asks, “How long does it take to steep tea?”, the latent goal could be setting a timer for steeping a cup of tea.

With the new capability, Alexa might answer that question, “Five minutes is a good place to start,” then follow up by asking, “Would you like me to set a timer for five minutes?”

Illustration: Alexa infers that a customer who asks about the weather at the beach may be interested in other information that could be useful for planning a beach trip.

Transitions like this appear simple, but under the hood a number of sophisticated algorithms are running to detect latent goals, formulate them into actions that frequently span different skills, and surface them to customers in a way that doesn’t feel disruptive.

The trigger model

The first step is to decide whether to anticipate a latent goal at all. Our early experiments showed that not all dialogue contexts are well suited to latent-goal discovery. When a customer asked for “recipes for chicken”, for instance, one of our initial prototypes would incorrectly follow up by asking, “Do you want me to play chicken sounds?”

To determine whether to suggest a latent goal, we use a deep-learning-based trigger model that factors in several aspects of the dialogue context, such as the text of the customer’s current session with Alexa and whether the customer has engaged with Alexa’s multi-skill suggestions in the past.

If the trigger model finds the context suitable, the system suggests a skill to service the latent goal. Those suggestions are based on relationships learned by the latent-goal discovery model. For instance, the model may have discovered that customers who ask how long tea should steep frequently follow up by asking Alexa to set a timer for that amount of time.

Latent-goal discovery

The latent-goal discovery model analyzes multiple features of customer utterances, including pointwise mutual information, which measures the likelihood of an interaction pattern in a given context relative to its likelihood across all Alexa traffic. Deep-learning-based sub-modules assess additional features, such as whether the customer was trying to rephrase a prior command or issue a new command, or whether the direct goal and the latent goal share common entities or values (such as the time-value required to steep tea).
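To make pointwise mutual information concrete, here is a minimal sketch in Python. It assumes interaction logs have been reduced to (first request, follow-up request) pairs; the log format, the intent labels, and the function name `pmi` are illustrative assumptions, not a description of Alexa's production system.

```python
import math
from collections import Counter

def pmi(pairs, first, follow_up):
    """Pointwise mutual information of a (first, follow_up) pair, in bits.

    PMI = log2( P(first, follow_up) / (P(first) * P(follow_up)) ).
    A positive value means the follow-up is more likely after `first`
    than it is across all traffic; a negative value means less likely.
    """
    n = len(pairs)
    pair_count = Counter(pairs)[(first, follow_up)]
    if pair_count == 0:
        return float("-inf")  # the pair was never observed
    first_count = sum(1 for f, _ in pairs if f == first)
    follow_count = sum(1 for _, g in pairs if g == follow_up)
    p_joint = pair_count / n
    return math.log2(p_joint / ((first_count / n) * (follow_count / n)))

# Toy, made-up interaction log (hypothetical intent labels).
log = [
    ("steep_tea_question", "set_timer"),
    ("steep_tea_question", "set_timer"),
    ("steep_tea_question", "play_music"),
    ("weather_question", "play_music"),
    ("weather_question", "play_music"),
    ("weather_question", "set_timer"),
    ("recipe_question", "play_music"),
    ("recipe_question", "play_music"),
]

tea_timer = pmi(log, "steep_tea_question", "set_timer")   # positive
tea_music = pmi(log, "steep_tea_question", "play_music")  # negative
```

In this toy log, setting a timer follows a tea-steeping question more often than it follows requests in general, so that pair scores above zero, while the tea-then-music pair scores below zero.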

Over time, the discovery model improves its predictions through active learning, which identifies sample interactions that would be particularly informative during future fine-tuning.
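The article does not say which active-learning criterion is used. One common choice is uncertainty sampling, sketched below under that assumption: the interactions whose predicted probability is closest to 0.5 are treated as the most informative ones to annotate. The function name and the scored examples are hypothetical.

```python
def select_for_labeling(scored, k):
    """Return the k (interaction, probability) items closest to 0.5.

    `scored` holds the model's predicted probability that the suggested
    latent goal is correct; values near 0.5 are the least certain.
    """
    return sorted(scored, key=lambda item: abs(item[1] - 0.5))[:k]

# Made-up model scores for candidate direct-goal -> latent-goal pairs.
candidates = [
    ("tea steeping -> timer", 0.97),
    ("beach weather -> tide times", 0.55),
    ("chicken recipes -> chicken sounds", 0.08),
    ("flight status -> ride booking", 0.46),
]

picked = select_for_labeling(candidates, 2)
```

Here the confident cases (0.97 and 0.08) are skipped, and the two borderline pairs are selected for review.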

Next, the semantic-role labeling model looks for named entities and other arguments from the current conversation, including Alexa’s own responses. Our context carryover models transform those entities into a structured format that the follow-on skill can consume, even if it is a third-party skill that uses its own ontology, or concept hierarchy.
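As a rough illustration of context carryover, the sketch below maps role-labeled entities from a conversation onto the slot names a follow-on skill expects. The `Entity` type, the role names, and the slot names `timerLength` and `timerLabel` are all hypothetical; real skills define their own schemas, and the production models are learned rather than driven by a lookup table.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Entity:
    role: str    # semantic role assigned by the labeler, e.g. "duration"
    value: str   # surface value, e.g. "five minutes"
    source: str  # who said it: "customer" or "alexa"

def carry_over(entities, slot_map):
    """Translate entities into the target skill's slot names.

    `slot_map` is a {role: slot_name} table for the target skill.
    Entities with no matching slot are dropped; a later entity
    overrides an earlier one that fills the same slot.
    """
    slots = {}
    for entity in entities:
        slot = slot_map.get(entity.role)
        if slot is not None:
            slots[slot] = entity.value
    return slots

# Entities from the tea example, including one from Alexa's own answer.
conversation = [
    Entity("beverage", "tea", "customer"),
    Entity("duration", "five minutes", "alexa"),
]
timer_slot_map = {"duration": "timerLength", "beverage": "timerLabel"}

request = carry_over(conversation, timer_slot_map)
```

The duration comes from Alexa's response rather than the customer's question, which is why the carryover step has to consider both sides of the conversation.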

Lastly, through bandit learning, in which machine learning models track whether recommendations are helping or not, underperforming experiences are automatically suppressed.
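The article does not name a specific bandit algorithm. The sketch below assumes Beta-Bernoulli Thompson sampling with a simple suppression rule, purely for illustration; the class name, thresholds, and suggestion names are hypothetical.

```python
import random

class SuggestionBandit:
    """Tracks accept/decline feedback per suggestion and suppresses poor ones."""

    def __init__(self, arms, min_trials=20, min_rate=0.05, seed=0):
        self.stats = {arm: [0, 0] for arm in arms}  # [accepted, declined]
        self.min_trials = min_trials  # feedback needed before suppressing
        self.min_rate = min_rate      # minimum acceptable acceptance rate
        self.rng = random.Random(seed)

    def record(self, arm, accepted):
        """Store one piece of customer feedback for a suggestion."""
        self.stats[arm][0 if accepted else 1] += 1

    def is_suppressed(self, arm):
        """True once an arm has enough trials and too low an acceptance rate."""
        accepted, declined = self.stats[arm]
        trials = accepted + declined
        return trials >= self.min_trials and accepted / trials < self.min_rate

    def choose(self):
        """Sample a plausible acceptance rate per active arm; pick the highest."""
        active = [arm for arm in self.stats if not self.is_suppressed(arm)]
        if not active:
            return None
        return max(
            active,
            key=lambda arm: self.rng.betavariate(
                self.stats[arm][0] + 1, self.stats[arm][1] + 1
            ),
        )

bandit = SuggestionBandit(["set_timer", "play_chicken_sounds"])
for _ in range(30):
    bandit.record("set_timer", accepted=True)
    bandit.record("play_chicken_sounds", accepted=False)

choice = bandit.choose()
```

After 30 declined suggestions, `play_chicken_sounds` falls below the acceptance threshold and is no longer offered, so `choose()` can only return `set_timer`.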

This capability is already available to Alexa customers in English in the United States. It requires no additional effort from skill developers to activate. However, skill developers can make their skills more visible to the discovery model by using the Name-Free Interaction Toolkit, which provides natural hooks for interactions between skills. While skills may experience different results, our early metrics show that latent-goal discovery has increased customer engagement with some developers’ skills.

We are thrilled about this invention as it aids discovery of Alexa’s skills and provides increased utility to our customers.

About the Authors

Anjishnu Kumar

Anjishnu Kumar is a senior applied scientist in the Alexa AI organization.

Anand Rathi

Anand Rathi is a director of software development in the Alexa AI organization.
