Microsoft Build 2017, the premier software developer event, is in the books. There were a significant number of announcements and releases across a wide array of software and services, but in this post I will cover Artificial Intelligence and Machine Learning, specifically Microsoft Cognitive Services.
Democratization of artificial intelligence, Microsoft’s promise to take AI and machine learning out of the ivory towers and make them accessible to all, is starting to take shape quite effectively. Let’s face it: resource constraints around AI/ML are a real problem. Most companies with real-world AI use cases just don’t have enough runway to build their own artificial intelligence offerings, and Microsoft Cognitive Services provide a sophisticated yet easy-to-use abstraction that fills this gap. Microsoft has also announced AI as an MVP category (http://aka.ms/AIMVP) for those creating intelligent apps and bots (voice, text, speech) and AI algorithms.
As a Microsoft MVP for Data Platform, I have had a front-row seat to see how Cognitive Services, a collection of powerful APIs and toolkits, has unfolded to fulfill the promise of AI democratization. Among the top 3 AI and machine learning capabilities discussed at Build 2017, I’d consider
- Custom Vision Service,
- Custom Decision Service, and
- Azure Batch AI Training
as the top contenders, with Video Index Service as a close runner-up.
Custom Vision Service
The Computer Vision service has been part of Cognitive Services since its inception; however, the current update significantly enhances it by introducing active learning and custom models. The API now lets you upload your own image datasets to create custom vision models, which become part of a feedback loop that helps improve the underlying classification accuracy.
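To make the feedback loop a little more concrete, here is a minimal sketch of one common active learning step, uncertainty sampling: images the current model is least sure about are queued for human labeling and then fed back into training. This is my own illustration, not the Custom Vision Service API; the `Prediction` class, the `select_for_review` helper, the image identifiers, and the 0.7 threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Prediction:
    image_id: str             # hypothetical identifier for an uploaded image
    scores: Dict[str, float]  # tag -> probability from the current model

    def top(self) -> Tuple[str, float]:
        # Return the most likely (tag, probability) pair.
        tag = max(self.scores, key=self.scores.get)
        return tag, self.scores[tag]


def select_for_review(predictions: List[Prediction], threshold: float = 0.7) -> List[str]:
    """Pick images whose top probability is below the threshold, least confident first."""
    uncertain = [p for p in predictions if p.top()[1] < threshold]
    uncertain.sort(key=lambda p: p.top()[1])
    return [p.image_id for p in uncertain]


# Made-up scores standing in for what a food classifier might return.
predictions = [
    Prediction("img-1", {"pizza": 0.95, "salad": 0.05}),
    Prediction("img-2", {"pizza": 0.55, "salad": 0.45}),
    Prediction("img-3", {"pizza": 0.40, "salad": 0.60}),
]

review_queue = select_for_review(predictions)
print(review_queue)  # image ids that should go back to a human for labeling
```

Once the queued images are labeled correctly and added to the training set, the next training iteration has more signal exactly where the model was weakest.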
Custom Decision Service
Custom Decision Service provides a “contextual decision-making API that sharpens with experience”. Essentially, it provides an abstraction over the reinforcement learning capabilities of Cognitive Services, which helps adapt the content in an application (think personalized interfaces, A/B testing, content recommendations) and respond in real time.
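For a feel of what "sharpens with experience" can mean, here is a toy epsilon-greedy contextual bandit. It is a sketch of the general reinforcement learning idea only; I am not claiming this is how Custom Decision Service is implemented, and the class name, the audience segments, the content categories, and the simulated click rewards are all hypothetical.

```python
import random
from collections import defaultdict


class EpsilonGreedyBandit:
    """Toy contextual bandit; NOT the Custom Decision Service API."""

    def __init__(self, actions, epsilon=0.2, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon            # fraction of decisions spent exploring
        self.rng = random.Random(seed)    # seeded so runs are repeatable
        self.counts = defaultdict(int)    # (context, action) -> times shown
        self.values = defaultdict(float)  # (context, action) -> mean reward

    def best(self, context):
        # Action with the highest observed mean reward for this context.
        return max(self.actions, key=lambda a: self.values[(context, a)])

    def choose(self, context):
        # Explore occasionally, otherwise exploit what has worked so far.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return self.best(context)

    def learn(self, context, action, reward):
        # Incremental update of the running mean reward.
        key = (context, action)
        self.counts[key] += 1
        self.values[key] += (reward - self.values[key]) / self.counts[key]


# Hypothetical audience segments and the content each one actually clicks on.
PREFERRED = {"sports_fan": "sports", "foodie": "recipes"}

bandit = EpsilonGreedyBandit(["news", "sports", "recipes"])
contexts = list(PREFERRED)
for step in range(2000):
    context = contexts[step % 2]
    action = bandit.choose(context)
    reward = 1.0 if action == PREFERRED[context] else 0.0  # simulated click
    bandit.learn(context, action, reward)

for context in contexts:
    print(context, "->", bandit.best(context))
```

The learner starts out knowing nothing, shows content mostly at random, and gradually settles on whatever each segment rewards, which is the same shape of problem as content recommendation or A/B testing with more than two arms.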
Azure Batch AI Training
Batch AI Training enables AI and machine learning developers to train their customized deep neural networks using any framework (yes, not just CNTK), including TensorFlow and Caffe.
Video Index Service
Video is a ubiquitous use case for the entertainment, advertising, and media verticals. Like the Cognitive Services image APIs, Video Index Service works on moving pictures to identify faces, voices, and emotions; its uses range from captions to targeted advertising to discovering relevant content (or irrelevant content to avoid). The service offers audio transcription, video indexing, face tracking and identification, speaker indexing, visual text recognition, voice activity detection, scene detection, keyframe extraction, sentiment analysis, translation, visual content moderation, keyword extraction, and annotation.
Food Classification with Custom Vision Service
Custom Vision Service Documentation
Brief Introduction to Cognitive Services - The Force is strong with this one!
Cognitive Services are classified into five categories: Vision, Speech, Language, Knowledge, and Search. These broad categories offer more specific APIs. In the Vision group, we see an amazing collection of powerful computer vision APIs around content moderation, intelligent video processing, video indexing, and the Face & Emotion API, as well as the long-awaited and newly minted Custom Vision Service, which allows users to upload their own images to create models.
In Speech, the Custom Speech Service helps recognize a variety of speaking styles and works with background noise and customized vocabulary; the Bing Speech API, Speaker Recognition API, and Translator services are also provided. The live demo of the PowerPoint translator service was definitely one of the Build 2017 highlights.
Language Understanding Intelligent Service (LUIS) is one of the marquee offerings in Cognitive Services. It contains an entire suite of NLU / NLP capabilities, teaching applications to understand entities, utterances, and general commands from user input. Other language services include the Bing Spell Check API, which detects and corrects spelling mistakes; the Web Language Model API, which helps build knowledge graphs using predictive language models; the Text Analytics API, which performs topic modeling and sentiment analysis; and the Translator Text API, which performs automatic text translation. The Linguistic Analysis API is a new addition that parses text and provides context around language concepts.
In the Knowledge spectrum, we have the Recommendations API to help predict and recommend items, Knowledge Exploration Service to enable interactive search experiences over structured data via natural language inputs, Entity Linking Intelligence Service for NER / disambiguation, Academic Knowledge API (search over academic content in the Microsoft Academic Graph), QnA Maker API, and the newly minted Custom Decision Service, which provides a contextual decision-making API with reinforcement learning features. Search APIs include autosuggest, news, web, image, video, and customized searches.
Some of the labs projects discussed during Build include Project Prague for gesture-based controls, Project Nanjing for isochrone calculations (travel time), Project Johannesburg for route logistics, Project Cuzco for events associated with Wikipedia entries, Project Abu Dhabi for distance matrices, and Project Wollongong for location insights.
And yes, these ARE the droids you are looking for
I have very little reason to doubt Satya Nadella’s claim that software bots will be as big as mobile apps; there is already evidence of the lines blurring, as most applications use AI and machine learning as an inherent part of their offering. The Microsoft Bot Framework has also been upgraded; its open source Bot Builder SDKs help build dialogs and integrate with Cognitive Services to see, hear, interpret, and interact in more human ways.
I have thoroughly enjoyed working with the Bot Framework. It provides a variety of features, including building bots for Skype, Bing, Teams in Office 365, and Skype for Business. The cognitive capabilities include features like Bot Smarts, Language Understanding, and QnA Maker. The framework also offers tools such as an Emulator for debugging on Mac, Windows, and Linux, a Channel Inspector to show how messages may look on multiple channels and message types, Adaptive Cards for conversation cards, a Payments Request API, and analytics on bot usage.
Human-computer interaction (HCI) research treats the ability of computers to understand what a person wants as one of its key problems; figuring out which pieces of information are relevant to the intention is the crux. Microsoft LUIS (Language Understanding Intelligent Service) now enables building language models (intents/entities) to understand actions, entities, and utterances. LUIS is not specific to the Bot Framework; it can be used as a general offering.
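As a rough illustration of what an application does with intents and entities, the sketch below parses a JSON result and falls back to a "None" intent when confidence is low. The sample payload is hand-written and only modeled on the general shape of a LUIS prediction result (a top-scoring intent plus a list of entities); treat the exact field names, the `interpret` helper, the `BookFlight` intent, and the 0.5 threshold as assumptions rather than the documented contract.

```python
import json

# Hand-written sample; field names are assumed, not copied from a live call.
SAMPLE_RESPONSE = """
{
  "query": "book a flight to Seattle tomorrow",
  "topScoringIntent": {"intent": "BookFlight", "score": 0.93},
  "entities": [
    {"entity": "seattle", "type": "Location", "score": 0.88},
    {"entity": "tomorrow", "type": "Date", "score": 0.91}
  ]
}
"""


def interpret(response_text, min_score=0.5):
    """Return (intent, entities) from a LUIS-style JSON result.

    The intent falls back to "None" when the model is not confident enough,
    so the caller can ask the user to rephrase instead of guessing.
    """
    data = json.loads(response_text)
    top = data.get("topScoringIntent", {})
    confident = top.get("score", 0.0) >= min_score
    intent = top.get("intent", "None") if confident else "None"
    # Map each entity type to the text that was recognized for it.
    entities = {e["type"]: e["entity"] for e in data.get("entities", [])}
    return intent, entities


intent, entities = interpret(SAMPLE_RESPONSE)
print(intent, entities)
```

A bot, a web form, or a voice front end can all sit on top of the same small dispatch step, which is why LUIS is useful well beyond the Bot Framework.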
No Pain No Gain; Be Quiet and Train (your Models that is)
Azure Batch AI Training is now offered for training customized deep neural networks on Azure. The preview allows you, and this is the kicker, to train models at scale across clustered GPUs using any framework, including Microsoft Cognitive Toolkit, TensorFlow, and Caffe.
Images courtesy of Microsoft Corporation and Adnan Hashmi.