The past few weeks have been a blur for me, but in some of the best ways possible. I’ve been able to spend a fair amount of my time programming on the HoloLens, and I participated as a mentor in Berkeley’s CalHacks hackathon this past weekend, which was equal parts exhausting and incredibly fun. It was the first time that I was able to officially support HoloLens hacking at a student event, and seeing the excitement around immersive computing made the low-sleep weekend totally worth it.
One of the things that I saw was a lot of interest in using the HoloLens’ mixed reality capabilities to tackle problems that rely heavily on the device’s ability to process and categorize information. The Microsoft Cognitive Services APIs provide a solid introductory foundation for computer vision, natural language processing, character recognition, and more – so I decided to start building up a few sample projects that pull the functionality of the Cognitive Services (formerly Project Oxford) solutions into Unity for use with the Universal Windows Platform and Windows Holographic.
I decided to focus on two sample projects to start, based on the most frequently asked questions at the hackathon. The first demo uses the Cognitive Services Emotion API, and the second hooks in the Computer Vision API to analyze and categorize images.
To keep things modular, the demos I made are pretty small in scope – and because of the way the Cognitive Services APIs are designed, the wrappers that I wrote in Unity ended up sharing much of their overall architecture and structure. I started with the Emotion API, which analyzes images containing people and returns the emotions it finds.
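For context on what that wrapper parses: the Emotion API responds with a JSON array containing one entry per detected face, each with a bounding box and a score for each emotion. Here is a minimal, language-agnostic sketch in Python of picking the strongest emotion per face – the sample payload and helper name are illustrative, not taken from the demo projects, and the Unity wrappers themselves are of course C#:

```python
import json

# Hypothetical sample of an Emotion API response: one entry per detected
# face, with a bounding rectangle and a confidence score per emotion.
sample_response = """
[
  {
    "faceRectangle": {"left": 68, "top": 97, "width": 64, "height": 64},
    "scores": {
      "anger": 0.001, "contempt": 0.002, "disgust": 0.001, "fear": 0.001,
      "happiness": 0.95, "neutral": 0.04, "sadness": 0.003, "surprise": 0.002
    }
  }
]
"""

def dominant_emotions(response_text):
    """Return the highest-scoring emotion label for each detected face."""
    faces = json.loads(response_text)
    return [max(face["scores"], key=face["scores"].get) for face in faces]

print(dominant_emotions(sample_response))  # ['happiness']
```

Because every face carries the same fixed set of emotion scores, the wrapper object for this response can be a simple, flat structure.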
The second demo, which uses the Computer Vision API, is initially structured the same way as the Emotion API demo: we still need to set up our Cognitive Services API access keys, choose an image, send it to the server, and process the response. The structure of the returned data is a little different, though, because an image can come back with an unknown number of categories.
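That variable-length result is the main structural difference: instead of a fixed set of scores per face, the response carries a list of category entries whose length depends on the image. A small Python sketch of handling that – the sample payload, category names, and threshold below are illustrative assumptions, not output from the actual service:

```python
import json

# Hypothetical sample of a Computer Vision analyze response; unlike the
# Emotion API, the number of entries under "categories" varies per image.
sample_response = """
{
  "categories": [
    {"name": "outdoor_mountain", "score": 0.7},
    {"name": "outdoor_water", "score": 0.2}
  ],
  "metadata": {"width": 1024, "height": 768, "format": "Jpeg"}
}
"""

def top_categories(response_text, threshold=0.1):
    """Return (name, score) pairs for categories above a confidence cutoff,
    tolerating any number of entries (including none)."""
    data = json.loads(response_text)
    return [(c["name"], c["score"])
            for c in data.get("categories", [])
            if c["score"] >= threshold]

print(top_categories(sample_response))
# [('outdoor_mountain', 0.7), ('outdoor_water', 0.2)]
```

In the Unity version this is why the Computer Vision wrapper deserializes into a growable collection rather than a fixed set of fields.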
Both of the projects contain an API access manager, a photo display script, and wrapper objects for the responses that come back from the server. While these cover only the most basic queries to Cognitive Services, the data structures for the returned objects are fairly flexible and easy to modify to accommodate the additional parameters and information that the APIs expose.
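The part both demos share most directly is the request itself: a single POST of raw image bytes with the subscription key attached as a header. A minimal Python sketch of that shared shape – the endpoint URL and key are placeholders, and in the Unity projects the equivalent lives in the C# API access manager:

```python
import urllib.request

API_KEY = "your-subscription-key"  # placeholder – use your own Cognitive Services key
ENDPOINT = "https://example-cognitive-endpoint/recognize"  # placeholder endpoint

def build_request(image_bytes):
    """Build (but do not send) the HTTP POST both demos share:
    binary image payload plus the subscription-key header."""
    return urllib.request.Request(
        ENDPOINT,
        data=image_bytes,
        headers={
            "Ocp-Apim-Subscription-Key": API_KEY,
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
```

Keeping the key and endpoint in one access-manager object means swapping between the Emotion and Computer Vision services is mostly a matter of changing the URL and the response wrapper.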
You can find the full source code for each of these examples up on GitHub: Unity Cognitive Services Demos
As immersive technologies grow more prominent as a platform for solving new computational problems, integration with powerful machine-learned APIs will enable a wealth of new solutions for creators. I’ll continue to update the Cognitive Services demo projects for Unity + HoloLens to highlight how easily these tools can be integrated into the Universal Windows and Holographic platforms – let me know if there’s one in particular you’d like to see next!