One of the more common concerns about virtual assistants is that Amazon and Google are storing digital copies of what consumers say. Learn what voice assistant platforms share with voice app developers. What happens when you delete your smart speaker interactions?
Audio files feed NLP
While it is true that virtual assistants convert what you say into audio files (they have to, in order to send the data over the internet for natural language processing, or NLP), it does not necessarily follow that they keep those files.
Voice assistants offer a setting that prevents your interaction history from being stored. Even if your account is configured to store history, you can ask to have the data deleted at any time. One important point: a voice assistant does not begin recording what is said, or sending anything to the backend, until it hears a wake word or phrase such as "Alexa" or "Hey Google."
Alexa, delete... oops
We know from first-hand experience that once you delete your history, even Amazon support cannot retrieve the files. While working through recognition issues with a voice application's invocation name, we found that because we had deleted our voice history, developer support could not access the recordings behind our repro steps.
Audio file info passed to voice apps
Regardless of whether Google or Amazon store digital audio files, neither makes that audio available to voice applications.
Alexa passes the name of an intent that you defined in your natural language model and any slot values that were specified. But that’s it. You have no idea what the user actually said that resolved to your intent.
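As a rough sketch, the JSON an Alexa skill backend receives for an utterance looks like the request below (the `IntentRequest` structure follows the Alexa Skills Kit request format; the intent and slot names are hypothetical). Notice there is no field carrying the user's raw speech:

```python
import json

# Simplified example of an Alexa IntentRequest as delivered to a skill
# backend. "OrderPizzaIntent" and the "size" slot are hypothetical names.
# There is no transcript field: only the resolved intent and slot values.
alexa_request = json.loads("""
{
  "request": {
    "type": "IntentRequest",
    "intent": {
      "name": "OrderPizzaIntent",
      "slots": {
        "size": {"name": "size", "value": "large"}
      }
    }
  }
}
""")

intent = alexa_request["request"]["intent"]
print(intent["name"])                    # the resolved intent name
print(intent["slots"]["size"]["value"])  # slot value only, not the utterance
```

Whatever phrasing the user chose, the skill sees only that it resolved to this intent with these slot values.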
Google, in addition to the intent and slot values, passes the raw speech-to-text values. So, if you want to understand what users are saying, you can – which can help inform future iterations of your natural language model.
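By contrast, a request on the Google side (for example, a Dialogflow fulfillment webhook) includes the raw transcription alongside the matched intent. A minimal sketch, with hypothetical intent and parameter names:

```python
import json

# Simplified Dialogflow-style webhook request for a Google voice app.
# Unlike Alexa, "queryText" carries the raw speech-to-text of what the
# user said. "OrderPizzaIntent" and "size" are hypothetical names.
google_request = json.loads("""
{
  "queryResult": {
    "queryText": "I'd like to order a large pizza",
    "intent": {"displayName": "OrderPizzaIntent"},
    "parameters": {"size": "large"}
  }
}
""")

result = google_request["queryResult"]
print(result["queryText"])              # raw utterance text
print(result["intent"]["displayName"])  # matched intent
print(result["parameters"]["size"])     # extracted parameter
```

Logging `queryText` (with appropriate care) is what lets you see the phrasings your model missed and fold them into the next iteration.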
Clearly, these are different approaches to the problem of having your app understand both what the user said and what they intended. The important thing to remember is that unless you are using account linking, or collecting data such as an email address, you have no way to tie what was said to a specific person. And if an application does collect personally identifiable information (PII), then its developers should build in safeguards to protect the data, including limiting access, requiring credentials, and encrypting it.
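As one illustration of such a safeguard, a voice app that needs to correlate sessions by email address could store a keyed hash instead of the raw address. A minimal sketch using Python's standard library; in a real deployment the secret key would come from a secrets manager, not source code:

```python
import hmac
import hashlib

# Hypothetical safeguard: store a keyed (HMAC) hash of the user's email
# rather than the email itself. Sessions from the same user still
# correlate, but the raw address never sits in the database.
# Assumption: in production this key lives in a secrets manager.
SECRET_KEY = b"replace-with-key-from-secrets-manager"

def pseudonymize(email: str) -> str:
    """Return a stable, non-reversible token for the given email."""
    normalized = email.strip().lower()
    return hmac.new(SECRET_KEY, normalized.encode("utf-8"),
                    hashlib.sha256).hexdigest()

token = pseudonymize("User@Example.com")
same = pseudonymize("user@example.com")   # normalization makes these match
```

Because the hash is keyed, someone with a copy of the database but not the key cannot confirm guesses against it the way they could with a plain hash.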
Have further questions on this sort of topic or voice applications in general? Feel free to reach out to Whetstone Technologies via our Contact page to schedule a free one-hour consult.