So you’re building an AI-powered chatbot to help customers on your website, but where do you start? Unless you already know exactly what your customers will talk to your chatbot about, you’ll need to make some assumptions and pick your topics carefully.
A focus on domain specificity for your chatbot will benefit your customers and ensure you’re investing in the right thing.
This might sound obvious but, in my experience, too many companies are lured by “all knowing” chatbots and question answering systems that end up being far too generic and fall short of customer expectations.
As CTO at ICM Hub, I come at this subject from a particular perspective: we’re focused on AI automated customer service for airlines. Although our solution uses machine learning and natural language processing, delivered mostly in the form of a chatbot, our focus isn’t so much on the technology as it is on the impact we have on the final customer’s experience.
Getting specific is beneficial!
The technology needed to build chatbots and question answering systems is here, and there are businesses already investing in the development of these systems. However, one of the things I’ve seen a lot over the last few years is a tendency to focus on the wrong thing: breadth. I think it is because people feel that knowing a little about a lot is better than knowing a lot about a little, but I’m not sure I agree. Lots of us will relate to taking the time to call a helpline, send an e-mail or start a web chat with a specific question or issue, only to be disappointed that the answer is too generic and doesn’t address our question at all.
General or domain-agnostic solutions have their place, and search engines are a great example: they cover massive amounts of content across every subject and can connect users with relevant information quickly and efficiently. They don’t, however, answer your question (yes, there are exceptions) or solve your problem; rather, they help point you in the right direction. This approach is fine for a search engine, as we expect it to do exactly that. On the other hand, when we take the time to engage with a company directly, we often expect more than simply being pointed in the right direction: we want personalised help.
Deciding what topics to cover
There’s a concept used within the world of question answering and information retrieval systems known as “the long tail graph”. I used it quite regularly during my time at IBM Watson to explain to businesses and development teams how to identify good topics to cover. The graph plots frequency on the y axis against uniqueness on the x axis: a small number of common questions forms a tall peak on the left, and the curve then falls away into a long tail of rare, highly unique questions on the right.
The idea is that you focus your attention on topics that fall on the left-hand side of the graph, because they are the questions that occur most frequently and they are the easiest topics on which to train a machine learning model. For question answering systems that focus on FAQs this is a good approach for the short tail, but the longer I’ve worked with businesses on conversational solutions (e.g. chatbots) for customer service, the more I see the need for an alternative perspective.
Consider value & complexity
Focusing only on frequency and uniqueness leaves you at risk of spending time on topics that offer the least value or carry the highest complexity. This really isn’t how you want to get started: ideally you want to select the valuable topics that are most easily implemented. To deal with this, we add cost and complexity as extra dimensions in our analysis and use them to identify a number of candidate topics to cover.
Our approach is to analyse the domain using data from existing communication channels so we have a measure of frequency, cost, uniqueness and complexity. Cost represents what it currently costs to deal with a topic, while complexity is a discrete score from 1 to 10 of how hard the topic would be to automate. Visually the graph would look the same, but the topics that lie toward the top left can be quite different from those of other approaches.
y = frequency * cost | x = uniqueness * complexity
By following the above approach we ensure that we’re investing in the most frequently asked topics that carry the highest value and are easiest to implement. The result is a reduction in the time needed to implement a solution capable of providing a return on investment and an increased chance of long-term success. The idea is to get to a usable release and begin to get that valuable stream of data from the users as quickly as possible.
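As a rough illustration of the scoring described above, here is a minimal Python sketch. The topic names and figures are invented for the example, the 1-10 scale for uniqueness is an assumption (only complexity is described as 1-10), and ranking by the ratio y / x is just one simple way to express “closest to the top left”; it is not a formula taken from our actual analysis.

```python
from dataclasses import dataclass


@dataclass
class Topic:
    name: str
    frequency: int     # contacts per month about this topic (hypothetical data)
    cost: float        # current cost of handling one contact
    uniqueness: float  # how varied the questions are, assumed 1-10
    complexity: int    # how hard the topic is to automate, 1-10

    @property
    def y(self) -> float:
        # Vertical axis: y = frequency * cost
        return self.frequency * self.cost

    @property
    def x(self) -> float:
        # Horizontal axis: x = uniqueness * complexity
        return self.uniqueness * self.complexity


def rank_topics(topics):
    """Order topics so those nearest the top left of the graph come first.

    Assumption: "top left" is approximated by a high y / x ratio.
    uniqueness and complexity must both be >= 1, so x is never zero.
    """
    return sorted(topics, key=lambda t: t.y / t.x, reverse=True)


# Invented airline examples, for illustration only.
TOPICS = [
    Topic("refund dispute", frequency=150, cost=25.0, uniqueness=8, complexity=9),
    Topic("baggage allowance", frequency=900, cost=4.0, uniqueness=2, complexity=2),
    Topic("flight status", frequency=1200, cost=3.0, uniqueness=1, complexity=3),
]

for topic in rank_topics(TOPICS):
    print(f"{topic.name}: y={topic.y:.0f}, x={topic.x:.0f}")
```

With these made-up numbers all three topics have a similar y value, but the refund dispute sits far to the right because it is both unique and complex, so it drops to the bottom of the candidate list.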
Don’t forget about breadth, just iterate
The idea that you’ll identify the most valuable topics to cover and only support those is the wrong mindset. Once you’ve released a chatbot, users will start to ask it questions; at this point, instead of guessing exactly what users will ask and how they’ll ask it, you’ll have real data to analyse.
My advice is to resist the temptation to start adding the topics that you THINK are needed and instead “let the data guide you”. You can then focus on the things your customers are actually asking that aren’t yet covered by looking again at the frequency, cost and complexity. As a result of this approach you will naturally begin to add the kind of breadth you were tempted to add from the beginning, but with greater confidence and a lower risk of wasting time.
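To make “let the data guide you” a little more concrete, here is a small Python sketch of the first step: counting which uncovered topics users actually ask about. It assumes each logged message has already been labelled with a topic, which is a simplification, and the topic names and log entries are hypothetical.

```python
from collections import Counter


def uncovered_topic_counts(log, covered):
    """Count how often each topic the chatbot does not yet cover was asked."""
    return Counter(topic for topic in log if topic not in covered)


# Hypothetical topic labels taken from real user conversations.
log = [
    "flight status", "seat upgrade", "seat upgrade", "pet travel",
    "baggage allowance", "seat upgrade", "pet travel",
]
covered = {"flight status", "baggage allowance"}

counts = uncovered_topic_counts(log, covered)
for topic, asked in counts.most_common():
    print(f"{topic}: asked {asked} times")
```

The resulting frequencies are only the starting point: each candidate topic still needs its cost and complexity assessed before you decide what to add next.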
What challenges have you faced selecting topics for a chatbot and how have you overcome them?