Conversational Programs: Introducing SWAPI Bot

Blog Posts, Programming, Software, Virtual Reality, Web Development
December 14, 2016

The Star Wars API is one of my favorite public APIs for testing out new technologies. This week, as part of a new project at work, I decided to brush up on my Node.js skills (and by “brush up on”, I mean “hack together something resembling an app”) and build a SWAPI Bot. You can converse with SWAPI Bot here.

Conversational programs, which can respond to human input and act accordingly, are going to be an increasingly important part of how we interact with software in the coming years. One area where “bots” have a lot to offer is virtual reality, yet conversational interfaces are still in their early stages for two reasons: cloud-based conversational frameworks have only recently become accessible to developers at large, and there isn’t yet an intuitive interface for discovering voice commands and their specifications.

Building an application that can respond to basic queries is becoming easier, and while SWAPI Bot is a simple web-based Node.js application, the promise is there for using bot framework technologies to build voice-activated interfaces that understand sentiment (check out LUIS for what Microsoft is doing in natural language processing), as well as conversational AI for immersive experiences.

I finally watched Westworld and updated my EULA accordingly.

With SWAPI Bot (as always, you can find the full source code on GitHub here) I wanted to break down the first stage of building a basic bot. I used the Microsoft Bot Framework and the Cognitive Services APIs to query SWAPI, parse a message, and then search for an image to include on a formatted card sent back to the user. SWAPI Bot covers the people, planets, and starships endpoints of the Star Wars API.
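The snippets below assume the usual Bot Framework scaffolding. Here's a minimal sketch of the setup I'm working from; the SWAPI base URL and the environment variable names are assumptions, so adjust them to your own configuration.

var restify = require('restify');
var builder = require('botbuilder');
var request = require('request');

// Assumed base URL for the Star Wars API.
var SWAPI_URL = "https://swapi.co/api/";

// Standard Bot Framework scaffolding: a restify server that hands incoming
// messages to a ChatConnector-backed UniversalBot.
var server = restify.createServer();
server.listen(process.env.port || process.env.PORT || 3978, function () {
    console.log('%s listening to %s', server.name, server.url);
});

var connector = new builder.ChatConnector({
    appId: process.env.MICROSOFT_APP_ID,
    appPassword: process.env.MICROSOFT_APP_PASSWORD
});
var bot = new builder.UniversalBot(connector);
server.post('/api/messages', connector.listen());

With that in place, the root dialog matches a handful of keywords: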

bot.dialog('/', new builder.IntentDialog()
    // Greetings: respond to "hello" or "hi".
    .matches(/^hello/i, function (session) {
        session.send("Hi, I'm SWAPI Bot!");
    })
    .matches(/^hi/i, function (session) {
        session.send("Hi, I'm SWAPI Bot!");
    })
    // Basic usage instructions.
    .matches(/^help/i, function (session) {
        session.send("Type 'people', 'planets', or 'starships' to learn more.");
    })
    // Branch into the dialog for the requested SWAPI endpoint.
    .matches(/^people/i, function (session) {
        session.beginDialog('/searchPerson');
    })
    .matches(/^planets/i, function (session) {
        session.beginDialog('/searchPlanets');
    })
    .matches(/^starships/i, function (session) {
        session.beginDialog('/searchSpaceships');
    })
    // Anything else: point the user at the help command.
    .onDefault(function (session) {
        session.send("Try typing 'help' to see what I can do!");
    }));

Conversationally, SWAPI Bot understands a very limited number of phrases, but if you start your conversation with “hi”, “hello”, “planets”, “people”, “help”, or “starships”, the bot branches off into different conversation patterns. Let’s consider the flow of searching for a person in the Star Wars universe. This dialog is triggered by typing the word “people” into the chat, which begins the “searchPerson” dialog tree:

bot.dialog('/searchPerson', [
    function (session) {
        // If the message already contains a number (e.g. "people 1"),
        // query SWAPI right away; otherwise prompt for a full query.
        var regex = /\d/g;
        var res = regex.test(session.message.text);
        if (!res) {
            builder.Prompts.text(session, "Type 'people' and a number to search the SWAPI database");
        }
        else {
            // "people 1" becomes "people/1", i.e. SWAPI_URL + "people/1"
            var fullURL = SWAPI_URL + session.message.text.replace(" ", "/");
            request(fullURL, function (error, response, body) {
                if (!error && response.statusCode == 200) {
                    formatAndEndCharacters(session, body);
                }
                else {
                    session.endDialog("Sorry, I wasn't able to find anything");
                }
            });
        }
    },
    function (session, results) {
        // Second step of the waterfall: handle the response to the prompt above.
        var fullURL = SWAPI_URL + results.response.replace(" ", "/");
        request(fullURL, function (error, response, body) {
            if (!error && response.statusCode == 200) {
                formatAndEndCharacters(session, body);
            }
            else {
                session.endDialog("Sorry, I wasn't able to find anything");
            }
        });
    }
]);

So what’s happening here? As I mentioned above, SWAPI Bot is a basic bot that knows how to say hello and respond to simple queries. In this function we check whether the message contains a full query (specified in the API as a number following the word “people”), and if it does, we skip asking the user for additional information and simply perform the request. If the user has only typed “people”, we prompt them for a full query and then connect to the API endpoint. If the response is successful, we pass the information to the function formatAndEndCharacters(), which does an image search and formats the result. If it isn’t, the session ends with an apology from SWAPI Bot.
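I won’t walk through formatAndEndCharacters() line by line here (the full version is in the GitHub repo), but roughly it looks something like the sketch below: search for an image of the character (I’m using the Bing Image Search endpoint from Cognitive Services as an illustration, with BING_SEARCH_KEY standing in for the subscription key), then build a hero card from a few fields in the SWAPI response.

function formatAndEndCharacters(session, body) {
    var character = JSON.parse(body);

    // Look up an image of the character. The Bing Image Search endpoint and
    // BING_SEARCH_KEY are illustrative; swap in whichever image service you use.
    var searchURL = "https://api.cognitive.microsoft.com/bing/v5.0/images/search?q="
        + encodeURIComponent(character.name);
    var options = {
        url: searchURL,
        headers: { "Ocp-Apim-Subscription-Key": BING_SEARCH_KEY }
    };

    request(options, function (error, response, searchBody) {
        var imageUrl = "";
        if (!error && response.statusCode == 200) {
            var results = JSON.parse(searchBody).value;
            if (results && results.length > 0) {
                imageUrl = results[0].thumbnailUrl;
            }
        }

        // Format the SWAPI data onto a hero card, attach the image if we
        // found one, and end the dialog with the card as the reply.
        var card = new builder.HeroCard(session)
            .title(character.name)
            .subtitle("Height: " + character.height + "cm, born " + character.birth_year)
            .text("Appears in " + character.films.length + " film(s).");
        if (imageUrl) {
            card.images([builder.CardImage.create(session, imageUrl)]);
        }

        session.endDialog(new builder.Message(session).attachments([card]));
    });
}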

Now, because I’m matching specific words, there’s not much this bot does that I couldn’t do with simple string matching, but the branching structure built into the Bot Framework makes it a little cleaner and more manageable, with fewer lines of code. The really interesting possibilities come from adding natural language processing, then combining that with speech-to-text functionality and a voice interface in an application.

Take a look at HoloBot for an example of adding a language bot to Unity applications for HoloLens. Imagine being able to say, in a VR app, “Hey, how do I do this?” and get an answer.

The addition of natural language processing changes the flow of the bot. If I were to add it to SWAPI Bot, instead of directly matching chat words to specific actions, I’d send each message off to the cloud to extract its intent and work out whether someone was asking for help or trying to learn about something specific.
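With LUIS, that swap is mostly a matter of wiring a recognizer into the intent dialog. Here's a rough sketch of what it could look like; the model URL and the intent names ('Help', 'SearchPerson') are placeholders for whatever you define in your own LUIS application.

// Swap regex matching for LUIS intents. LUIS_MODEL_URL and the intent names
// are placeholders for your own LUIS application.
var recognizer = new builder.LuisRecognizer(process.env.LUIS_MODEL_URL);

bot.dialog('/', new builder.IntentDialog({ recognizers: [recognizer] })
    .matches('Help', function (session) {
        session.send("Type 'people', 'planets', or 'starships' to learn more.");
    })
    .matches('SearchPerson', function (session) {
        session.beginDialog('/searchPerson');
    })
    .onDefault(function (session) {
        session.send("Try typing 'help' to see what I can do!");
    }));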

Conversational AI is a big area of interest for bots in VR, but there’s already a lot that bots do well that lends itself to interesting virtual reality experiences, especially around interfaces and actions. Consider “Hey, turn the music off” or “take me back to the main menu” scenarios – you get to free up buttons and UI space, while also adding flexibility and usability to your application.

I’m really excited to see where this takes off, and I have a project up my sleeve that combines a bot and VR into a voice-controlled interface sample. Stay tuned!
