• Sanjeev Surati

How are Voice Apps for Alexa and Google Assistant Made?

As brands experiment with ways to interact with their customers through virtual assistants such as Amazon Alexa and Google Assistant, understanding the basic building blocks of a voice application helps them design it properly. This article describes some of the key components of Alexa Skills and Google Actions.

Every voice interaction and voice application is built on Natural Language Processing (NLP). When a person interacts with a traditional graphical user interface (GUI) application, they press buttons, swipe, enter text into text boxes and select items from lists. NLP applications offer a new way to accomplish the same results by using the user’s voice instead. Nearly every voice application relies on three basic concepts that define its language model: intents, utterances and slots.


At the heart of any user interaction is “intent”: what is the user attempting to accomplish? A user may be supplying a “yes” or “no” answer, or attempting to find information on your company, e.g. “Tell me about Whetstone Technologies”. Users may express the same intent in different ways. An NLP engine takes in what the users say and, based on hints provided by your voice application, attempts to translate that into an intent.


There are many different ways a person can express intent. For example, “yeah”, “yes”, “yep”, and “sure”, when given in response to a yes / no question, are all different ways of expressing a “Yes” intent. The different ways of expressing an intent are called utterances. Users may also select what they want from a short list of choices, e.g. Whetstone or SoniBridge. Note: a long list of choices does not work as well in voice technology as it does when written, for example on a web site.
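As a rough illustration of how several utterances map to one intent, here is a minimal Python sketch. It is not how Alexa or Google Assistant work internally: real NLP engines generalize from sample utterances rather than requiring exact matches. The names `SAMPLE_UTTERANCES` and `resolve_intent`, and the “NoIntent” utterances, are assumptions made for the example.

```python
# Hypothetical sample utterances; intent names and the "NoIntent" entries
# are illustrative assumptions, not taken from any platform.
SAMPLE_UTTERANCES = {
    "YesIntent": ["yes", "yeah", "yep", "sure"],
    "NoIntent": ["no", "nope", "nah"],
}


def resolve_intent(spoken_text):
    """Return the intent whose sample utterances contain the spoken text, else None."""
    normalized = spoken_text.strip().lower()
    for intent, utterances in SAMPLE_UTTERANCES.items():
        if normalized in utterances:
            return intent
    return None


print(resolve_intent("Yep"))
print(resolve_intent("maybe"))
```

The point of the sketch is only that the application reasons about the intent name, not about the exact words the user happened to say.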


Slots are placeholders in an utterance representing different values that provide more context for the intent. For example, I may have an intent called “FindInformation”, which means that a user is trying to find out more about my company. A voice application may provide information on a couple of major topics and, rather than define a separate “FindInformation” intent for each topic, we define a slot “CompanyData” that can have multiple possible values. If we define the CompanyData slot to have two values, “Whetstone Technologies” and “SoniBridge”, we can then define an utterance “Tell me about {CompanyData}”, where CompanyData can be either “Whetstone Technologies” or “SoniBridge”. The NLP engine will then pass our application a “FindInformation” intent along with a “CompanyData” slot set to the appropriate value, and we can return a response with data appropriate to “Whetstone Technologies”, “SoniBridge” or any other slot value we add later.
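The FindInformation example can be sketched in a few lines of Python. This is a simplified stand-in, not the actual Alexa or Google Assistant schema or SDK: the `LANGUAGE_MODEL` layout, the helper names, and the placeholder responses are all assumptions, and the exact-pattern matching here is far stricter than a real NLP engine.

```python
import re

# Hypothetical language model; this structure is illustrative only.
LANGUAGE_MODEL = {
    "slots": {"CompanyData": ["Whetstone Technologies", "SoniBridge"]},
    "intents": {"FindInformation": ["Tell me about {CompanyData}"]},
}

# Placeholder responses keyed by slot value (assumed for the example).
RESPONSES = {
    "Whetstone Technologies": "Placeholder description of Whetstone Technologies.",
    "SoniBridge": "Placeholder description of SoniBridge.",
}


def template_to_regex(template, slots):
    """Turn 'Tell me about {CompanyData}' into a regex with one named group per slot."""
    pattern = ""
    # re.split with a capturing group leaves the slot names at odd indexes.
    for index, part in enumerate(re.split(r"\{(\w+)\}", template)):
        if index % 2 == 1:
            values = "|".join(re.escape(value) for value in slots[part])
            pattern += f"(?P<{part}>{values})"
        else:
            pattern += re.escape(part)
    return re.compile(pattern, re.IGNORECASE)


def canonical(slot_name, spoken_value, slots):
    """Map what the user said back to the slot value as defined in the model."""
    for value in slots[slot_name]:
        if value.lower() == spoken_value.lower():
            return value
    return spoken_value


def match(spoken_text, model=LANGUAGE_MODEL):
    """Return (intent, slot_values) for the first matching utterance, else (None, {})."""
    for intent, templates in model["intents"].items():
        for template in templates:
            found = template_to_regex(template, model["slots"]).fullmatch(spoken_text.strip())
            if found:
                slot_values = {
                    name: canonical(name, value, model["slots"])
                    for name, value in found.groupdict().items()
                }
                return intent, slot_values
    return None, {}


def handle(intent, slot_values):
    """Build a response from the intent and slot the engine passed in."""
    if intent == "FindInformation":
        return RESPONSES.get(slot_values.get("CompanyData"), "Sorry, I don't have that topic.")
    return "Sorry, I didn't understand that."


print(handle(*match("Tell me about SoniBridge")))
```

Adding a new topic later means adding one value to the CompanyData slot and one entry to the responses, with no new intent or handler required.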

There’s more to it, of course, but by understanding these three concepts, it becomes possible to create a language model that can then be used to back a voice application for your company or brand.

Bottom line: It’s not only what the users say, but how they say it, in what context, and what they intend to get out of the voice technology interaction. Taking these into consideration when designing your voice app makes it easier for developers to code and for the NLP engine to understand. Most importantly, it makes for an accurate and enjoyable user experience that customers will share with others.

#voicetechnology #amazonalexa #googleassistant #voicefirst #voiceassistants #voicetechexperts #voicetechnologyexperts #conversationaldesign #SoniBridge #WhetstoneTechnologies

© 2019-2020 Whetstone Technologies, Inc. All rights reserved.