Co-authored with Marco Noel.
You can very quickly create a voice solution using Watson Assistant and Voice Agent. Out of the box, the Voice Agent uses a standard speech-to-text model. With a small additional effort, you can train a custom speech model that understands your domain even better. This guide will help you bootstrap that speech model using your existing Watson Assistant.
From the Watson Assistant editor, navigate to the Export option to download a JSON copy of your skill. Save this JSON file to your hard drive.
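If you prefer to script the export rather than click through the editor, the same JSON is available programmatically. Here is a minimal sketch assuming the ibm-watson Python SDK, whose `AssistantV1.get_workspace` call with `export=True` returns the full skill; the SDK call is passed in as a parameter so the save logic is easy to test without credentials.

```python
import json

def export_skill(get_workspace, workspace_id, path):
    """Download a full skill export and save it as JSON.

    `get_workspace` is expected to behave like the ibm-watson SDK's
    AssistantV1.get_workspace (its return value has .get_result()).
    """
    skill = get_workspace(workspace_id=workspace_id, export=True).get_result()
    with open(path, "w") as f:
        json.dump(skill, f, indent=2)
    return skill
```

With the real SDK you would construct an authenticated `AssistantV1` instance and call `export_skill(assistant.get_workspace, workspace_id, "skill.json")`.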
In previous posts we explored what analysts want to discover about their virtual assistant and some building blocks for analytics. In this post I will demonstrate some common recipes tailored to Watson Assistant logs.
Base layer: Get raw log events
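A minimal sketch of that base layer, assuming the ibm-watson SDK's `AssistantV1.list_logs` (whose result contains `logs` and a `pagination` object with a `next_cursor`); the call is injected as a parameter so the pagination loop can be exercised with a stub.

```python
def iter_log_events(list_logs, workspace_id, page_limit=500):
    """Yield every raw log event, following the pagination cursor.

    `list_logs` is expected to behave like the ibm-watson SDK's
    AssistantV1.list_logs (result dict has 'logs' and 'pagination').
    """
    cursor = None
    while True:
        result = list_logs(workspace_id=workspace_id,
                           page_limit=page_limit,
                           cursor=cursor).get_result()
        for event in result.get("logs", []):
            yield event
        cursor = result.get("pagination", {}).get("next_cursor")
        if not cursor:
            break
```

The loop keeps requesting pages until the service stops returning a `next_cursor`, so you get the complete event stream rather than just the first page.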
In Part 1 of this series we explored the personas who analyze virtual assistants and some existing tools that help them. In this post we will review the foundational components in building your own analysis pipeline. Building your own pipeline lets you fully customize the analysis to both your specific personas and your specific virtual assistant.
There are four key steps in developing Watson Assistant log analysis:
· Gather the logs
· Extract the fields of interest
· Define the analytic goal
· Filter the data for the analytic goal
In the remainder of this blog post we will explore these steps in detail. Part 3 of this series will demonstrate recipes for common analytic goals. …
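The "extract the fields of interest" step can be sketched as a small flattening function. The field paths below match the general shape of Watson Assistant v1 log events (input text under `request.input`, intents and conversation context under `response`), but verify them against your own logs before relying on them.

```python
def extract_fields(event):
    """Pull the commonly analyzed fields out of one raw log event."""
    response = event.get("response", {})
    intents = response.get("intents", [])
    top = intents[0] if intents else {}
    return {
        "conversation_id": response.get("context", {}).get("conversation_id"),
        "timestamp": event.get("request_timestamp"),
        "input_text": event.get("request", {}).get("input", {}).get("text"),
        "intent": top.get("intent"),
        "confidence": top.get("confidence"),
    }
```

Running this over every raw event gives you one flat row per conversation turn, which is the shape most of the later recipes expect.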
Your Watson Assistant solution is in production. Congratulations! Now it’s time to analyze the solution’s performance and implement improvements. This blog post will help you get the most out of your virtual assistant. I will cover the various personas interested in analysis, the types of analyses that help them, and how to develop these analyses.
Analytics means different things to different people. For text-based virtual assistants there are three primary personas, each with a different goal:
Executive: This persona needs metrics that tie back to key performance indicators (KPIs). These metrics are generally extracted by summarizing entire conversations and comparing groups of conversations. …
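Summarizing entire conversations can be sketched as a group-by over per-turn rows. The row keys and the low-confidence threshold below are illustrative assumptions, not a fixed schema; the point is rolling many turns up into one summary per conversation.

```python
from collections import defaultdict

def summarize_conversations(rows, low_confidence=0.4):
    """Roll per-turn rows up to one summary per conversation.

    Each row needs 'conversation_id' and 'confidence' keys, e.g. the
    output of a field-extraction pass over the raw logs.
    """
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["conversation_id"]].append(row)
    summaries = []
    for conv_id, turns in grouped.items():
        confidences = [t["confidence"] for t in turns
                       if t["confidence"] is not None]
        summaries.append({
            "conversation_id": conv_id,
            "turns": len(turns),
            "min_confidence": min(confidences) if confidences else None,
            "any_low_confidence": any(c < low_confidence for c in confidences),
        })
    return summaries
```

Conversation-level summaries like these are what executive KPIs are built from, for example the share of conversations containing at least one low-confidence turn.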
In the previous post we looked at a simple demonstration of overfitting in a classic regression scenario. In this post I will demonstrate overfitting in the context of virtual assistants, or as some call them “chat bots”.
Before deep-diving into overfitting I want to briefly discuss how we talk about the accuracy of assistants. An assistant is backed by a classifier that maps utterances into intents. For instance, the utterance “where is your new store opening” could map to a #Store_Location intent. …
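Accuracy over such a classifier is just the fraction of labeled utterances mapped to their expected intent. A sketch, using a deliberately toy keyword "classifier" as a stand-in for the real model:

```python
def keyword_classifier(text):
    """Toy stand-in for a real intent classifier."""
    return "#Store_Location" if "store" in text.lower() else "#General"

def intent_accuracy(classify, labeled):
    """Fraction of labeled utterances whose predicted intent matches."""
    correct = sum(1 for text, intent in labeled if classify(text) == intent)
    return correct / len(labeled)

labeled = [
    ("where is your new store opening", "#Store_Location"),
    ("store hours?", "#Store_Location"),
    ("where can I buy it", "#Store_Location"),  # missed by the toy model
    ("hello", "#General"),
    ("bye", "#General"),
]
print(intent_accuracy(keyword_classifier, labeled))
```

Swap in calls to your actual assistant for `keyword_classifier` and this becomes the core of a blind-test accuracy measurement.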
We want our AI models to be as accurate as they can be. That’s one of the selling points of AI — that we can encode the best version of our past knowledge and have an automated model infer and apply our judgement. How can we tell when the model is accurate enough to trust? More importantly, how can we tell if our efforts to improve accuracy are actually making the model worse? This can happen through a training problem called overfitting.
In this two-part series I will show how overfitting can affect various kinds of AI models. In this first post I’ll take a simple example and demonstrate both a reasonably accurate simple model as well as a “perfectly accurate” (overfit!) complex model. In the second post I will demonstrate overfitting in virtual assistants (sometimes called chatbots). …
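The first post's regression example can be sketched in pure Python: a simple linear least-squares fit versus a "perfectly accurate" polynomial that passes through every training point exactly. The data values below are made up for the demo.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit of y = a*x + b (the simple model)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def interpolant(xs, ys):
    """Lagrange polynomial through every point: zero training error."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

def mse(model, xs, ys):
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Noisy samples of the line y = 2x (noise values invented for the demo).
xs = [0, 1, 2, 3, 4, 5, 6, 7]
noise = [0.3, -0.2, 0.4, -0.5, 0.1, 0.6, -0.3, 0.2]
ys = [2 * x + e for x, e in zip(xs, noise)]

simple, complex_ = linear_fit(xs, ys), interpolant(xs, ys)
print("train MSE simple:", mse(simple, xs, ys))     # small but non-zero
print("train MSE complex:", mse(complex_, xs, ys))  # "perfect" (overfit)
print("prediction at x=3.5:", simple(3.5), "vs", complex_(3.5))
```

The complex model's zero training error looks like a win, but it has memorized the noise; between the training points it can wander far from the underlying line, which is exactly the failure mode the posts explore.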
In the embedded video I demonstrate reviewing a blind test result and how to improve a chatbot based on that result.
My initial bot was trained on 160 utterances (video 1), with a blind test against 200 utterances (video 2). In this video I split the 200 initially “blind” utterances in half: I analyze “part 1” and hold out “part 2” for a second blind test. From the “part 1” set I identify 45 examples to add to the training data, without looking at “part 2” at all. I retrain on the updated data (now 160+45=205 utterances) and run a new blind test on the “part 2” utterances with the WA-Testing-Tool. The updated training improves the blind test performance. I analyze where performance increased and where more work is needed, and discuss the need to run this cycle iteratively. …
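The split itself is simple but worth doing reproducibly so you can prove "part 2" was never inspected. A sketch, with a fixed seed as an assumed convention:

```python
import random

def split_blind_set(utterances, seed=42):
    """Split a labeled blind set into an analysis half and a held-out half.

    Shuffling with a fixed seed keeps the split reproducible: only
    "part 1" is mined for new training examples, while "part 2" stays
    untouched until the next blind test.
    """
    shuffled = list(utterances)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]
```

Shuffling before splitting matters because logs usually arrive in time order; taking the first half without shuffling could bias "part 1" toward older traffic.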
In the embedded video I demonstrate reviewing user utterances and how to turn them into “blind test” and future training data with the WA-Testing-Tool.
I first describe how to collect user utterances from logs and how to manually classify them into intents, taking care to ensure each example represents a single clear intent and to remove superfluous text. For example “Hi, thanks for helping, can you tell me where the store is?” should be shortened to “Can you tell me where the store is?” and associated with the “Store Location” intent. …
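Once cleaned and labeled, the pairs can be written out as a two-column CSV (one example per row, utterance then intent), which is the general shape Watson Assistant's intent import and the WA-Testing-Tool's test sets work with; check the exact expected layout against the tool's documentation before importing.

```python
import csv

def write_intent_csv(labeled, path):
    """Write (utterance, intent) pairs as a two-column CSV, one per row."""
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        for utterance, intent in labeled:
            writer.writerow([utterance.strip(), intent])

labeled = [
    # Trimmed to a single clear intent, superfluous text removed.
    ("Can you tell me where the store is?", "Store_Location"),
    ("What time do you open?", "Store_Hours"),
]
```

Using the `csv` module rather than string joins also handles utterances that themselves contain commas or quotes.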
K-folds cross validation helps you find confusion in your training data and intents. It does NOT predict runtime performance on utterances the model has not previously seen; however, it can help determine whether your initial intent structure is clear or confusing, as well as identify places where additional training is needed. …
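The mechanics of k-folds can be sketched as follows: partition the training examples into k folds, then use each fold as the test set exactly once while training on the rest.

```python
def k_fold_indices(n, k):
    """Partition indices 0..n-1 into k folds of near-equal size."""
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def k_fold_splits(examples, k=5):
    """Yield (train, test) pairs; each fold is the test set exactly once."""
    folds = k_fold_indices(len(examples), k)
    for i in range(k):
        test = [examples[j] for j in folds[i]]
        train = [examples[j] for m in range(k) if m != i for j in folds[m]]
        yield train, test
```

In practice you would shuffle first and stratify by intent so every intent appears in every fold (tools like the WA-Testing-Tool handle this for you); this sketch only shows the partitioning logic.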
Many AI projects have a speech recognition component such as voice assistants (for example: IBM’s Watson Assistant for Voice Interaction). Speech recognition models including IBM Speech to Text require training to understand new domains and that training requires a data collection exercise. This post describes the data collection and speech training at a high level to demonstrate how each step relates to the others. For a deeper treatment on data collection and training, see the fantastic “How to Train Your Speech Dragon” series by Marco Noel — part 1, part 2, and part 3.
The following diagram outlines the high-level steps in the data collection exercise and how these steps support the training of your speech model. …