Overheard in a Palo Alto coffee shop:
“… so, we tokenize the utterances, lemmatize the tokens, vectorize with StarSpace embedding, and run the featurized input through the neural net. The softmax activation layer tells us what the user intent was.”
“Sorry, you what the what and then what?”
“Or, you can say the user types commands, the bot interprets those and takes the appropriate action.”
“Oh, now I get it; it’s like a CLI on steroids!”
Calling a chatbot “a CLI” is like referring to a mobile phone as “a portable wireless teletype”: not even close to a complete representation, yet not entirely wrong. Like a smartphone, a well-trained bot will do as much or as little as the user requires of it.
Historical Perspective
The command-line interface (CLI) dominated human-computer interaction for a few decades, until the growing number of options that software offered made the graphical user interface (GUI) the new favorite way of talking to computers. Both interaction models are based on human-to-human communication: the operator communicates “an intent” complete with some input “entities,” each inserted in its “slot.” This is exactly how bots operate, so how is a bot not just another UI?
Bots as UI
A user interface is merely a human-readable catalog of a machine’s capabilities, plus a set of input slots tightly coupled to a specific application. A bot with a natural language interface (NLI), by contrast, acts as our agent. It infers the intent (even multiple intents) from a natural language utterance, identifies any entities in that utterance, such as food: pizza, toppings: (olives, pepperoni), and fulfills that intent, be it by talking to Pizza Hut’s API or by calling the local pizza shop on the phone. Or even by making the pizza itself.
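To make the intent, entity, and slot vocabulary concrete, here is a minimal sketch in Python. It uses plain keyword matching rather than a trained model, so it only illustrates the shape of the output, not how a real bot platform classifies utterances. The intent names, keyword lists, and the parse function are all hypothetical.

```python
import re
from dataclasses import dataclass, field

# Hypothetical vocabulary; a real bot would learn this from training data.
INTENT_KEYWORDS = {
    "order_food": {"order", "want", "get"},
    "cancel_order": {"cancel"},
}
ENTITY_VALUES = {
    "food": {"pizza", "salad"},
    "toppings": {"olives", "pepperoni", "mushrooms"},
}


@dataclass
class ParsedUtterance:
    intents: list = field(default_factory=list)
    entities: dict = field(default_factory=dict)


def parse(utterance: str) -> ParsedUtterance:
    """Return every matching intent and the entity values found per slot."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    token_set = set(tokens)
    result = ParsedUtterance()
    # An utterance may carry more than one intent, so collect all matches.
    for intent, keywords in INTENT_KEYWORDS.items():
        if token_set & keywords:
            result.intents.append(intent)
    # Keep entity values in the order the user mentioned them.
    for slot, values in ENTITY_VALUES.items():
        found = [t for t in tokens if t in values]
        if found:
            result.entities[slot] = found
    return result


parsed = parse("I want a pizza with olives and pepperoni")
print(parsed.intents)
print(parsed.entities)
```

Whatever the classifier behind it, the fulfillment code only ever sees this structured result, which is why the same intent can be fulfilled through an API call, a phone call, or anything else.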
NLI bots have already made quite an impact on the sales and support side, but how suitable are bot agents for more technical work? How would a chatbot perform in the DevOps world, replete with esoteric terms and complicated interfaces?
Although there is a lot less training material available (after all, more people eat pizza than reconfigure Kubernetes clusters), there are also far fewer ways to express a technical intent, so in the end technical bots are easier to train and make fewer mistakes than their sales brethren. But is a bot’s ability to classify an intent to deploy a new EC2 instance of a specific flavor enough to make hardcore DevOps techies give up their favorite CLI?
Beyond Interface: Advantages of Bot-assisted System Interaction
Context memory: Whether a bot remembers your favorite pizza toppings, or you have to say “pepperoni and mushrooms” a couple of times a week is of little consequence. However, if your input looks like this:
I’m sure you’d rather not have to type that ever again. The machine learning models of the leading bot platforms not only remember your slot values, they also remember the context in which those values were entered, so a well-trained bot is always able to suggest an appropriate value when looking to fill a slot.
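One way to picture context memory is a store that keeps the last value seen for each slot, keyed by the context in which it was entered. The sketch below is a deliberately simple stand-in for what a bot platform's models do; the SlotMemory class, the context names, and the slot values are all hypothetical.

```python
from typing import Dict, Optional


class SlotMemory:
    """Remembers the most recent value of each slot, per context."""

    def __init__(self) -> None:
        # context name -> {slot name: last value entered}
        self._values: Dict[str, Dict[str, str]] = {}

    def remember(self, context: str, slot: str, value: str) -> None:
        self._values.setdefault(context, {})[slot] = value

    def suggest(self, context: str, slot: str) -> Optional[str]:
        """Return the last value used for this slot in this context, if any."""
        return self._values.get(context, {}).get(slot)


memory = SlotMemory()
memory.remember("staging-deploy", "region", "us-west-2")
memory.remember("prod-deploy", "region", "eu-central-1")

# The same slot yields a different suggestion depending on the context.
print(memory.suggest("staging-deploy", "region"))
print(memory.suggest("prod-deploy", "region"))
print(memory.suggest("staging-deploy", "instance_type"))
```

When nothing has been remembered for a slot, the lookup returns None, which is the point at which a bot would fall back to asking the user.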
Natural language understanding: Just as a human would not be fooled by variations in word order or the use of synonyms, neither would a well-designed bot. Not having to remember the exact syntax or even spelling translates into higher efficiency.
Continuous training and improvement: The more time a bot spends with a team, the better it gets at helping members of that team, not only because its context memory fills with details about team members and tasks, but also because of reinforcement learning. In addition to the initial training corpus, most contemporary bot platforms feature an interactive training mode, in which a human user can provide real-time feedback on the bot’s performance.
Seamless automation: Since it’s an agent rather than a mere translation layer, the bot remembers successfully fulfilled intents. Chaining those into a sequence, with or without slot elicitation, branching, or decision making, is significantly easier than writing even the simplest of scripts.
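As a rough illustration of chaining, suppose each fulfilled intent is a function that takes the current slot values and returns them with its own results added. A sequence is then just the handlers applied in order. The handler names, the slots, and the chain helper below are hypothetical, and the handlers only build strings instead of calling real back-end systems.

```python
from typing import Callable, Dict, List

Slots = Dict[str, str]
Handler = Callable[[Slots], Slots]


def build_image(slots: Slots) -> Slots:
    # Stand-in for a real build step: derive an image tag from the service name.
    return {**slots, "image": f"{slots['service']}:latest"}


def deploy(slots: Slots) -> Slots:
    # Stand-in for a real deployment: reuse the slot filled by the previous step.
    return {**slots, "status": f"deployed {slots['image']} to {slots['env']}"}


def chain(steps: List[Handler]) -> Handler:
    """Combine handlers into one that runs them in order, passing slots along."""
    def run(slots: Slots) -> Slots:
        for step in steps:
            slots = step(slots)
        return slots
    return run


release = chain([build_image, deploy])
result = release({"service": "billing", "env": "staging"})
print(result["status"])
```

Because each step reads the slots the previous one filled, the user supplies the service and environment once and the rest of the sequence needs no further input.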
Collaboration platform integration: Teams typically meet in a collaboration platform such as Slack, WebEx or MS Teams, decide what needs to be done, then go and do it. With a bot on their team, they can get right to the “doing” part without leaving the collab platform; avoiding context-switching improves productivity.
Built-in analytics: A bot’s main job is to classify intents, fill slots and invoke fulfillment. However, in its spare time it could listen in and take notes. At the end of an event (release, incident, outage, etc.) it could use its NLP savviness to provide a linguistic breakdown of the event: How many questions were asked? How many answers? How many statements? Or even, what was the level of profanity use as compared to the average? This kind of summary provides benefits for parts of the company beyond the bot’s original team.
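A crude version of such a breakdown can be sketched with nothing more than punctuation and a word list. The example below treats any message ending in a question mark as a question and everything else as a statement; telling answers apart from other statements would need a real NLP model and is not attempted here. The summarize function, the flagged-word list, and the sample log are all hypothetical.

```python
import re
from typing import Dict, Iterable

# Hypothetical word list; a real deployment would maintain its own.
FLAGGED_WORDS = {"damn", "heck"}


def summarize(messages: Iterable[str]) -> Dict[str, int]:
    """Count questions, statements, and flagged words in a chat log."""
    counts = {"questions": 0, "statements": 0, "flagged_words": 0}
    for message in messages:
        text = message.strip()
        if not text:
            continue  # ignore empty messages
        if text.endswith("?"):
            counts["questions"] += 1
        else:
            counts["statements"] += 1
        words = re.findall(r"[a-z']+", text.lower())
        counts["flagged_words"] += sum(1 for w in words if w in FLAGGED_WORDS)
    return counts


log = [
    "Is the deploy done?",
    "Rolling back now.",
    "Damn, the health check failed again.",
    "Who owns the load balancer?",
]
summary = summarize(log)
print(summary)
```

Comparing the flagged-word count against an average would simply mean running the same function over earlier events and keeping the results.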
Conclusion
Bots can and do increase productivity in a software development or operations environment. Automation is important, but when done through traditional means such as scripting, its high cost often causes it to be prioritized behind other tasks. The use of an NLI-enabled bot as an agent for interfacing with various back-end systems not only ensures the very best in user interface (“CLI on steroids”), but also minimizes the need to switch contexts and significantly lowers the cost of automation.
More importantly, the emerging agent-facilitated process control prepares us for the next phase in human-machine interaction: the post-app world.
Bots will help us navigate the new world of commoditized services that will replace the familiar apps. They will track our experience and base future selections on our satisfaction with their past choices. What more can we ask for? Aren’t most of humanity’s dreams centered on intent (wish) fulfillment? What are the wish-fulfilling creatures from the fairy tales if not expressions of human desire for the perfect bot?