We’re still at the beginning of the new year, so it’s common for experts to list their prophecies for the coming year. One thing a lot of people seem to be certain about is that 2019 is the year when voice input will advance significantly and become a much more established way of communicating with our devices.
According to Adobe Analytics, 71% of the owners of smart speakers (like Amazon Echo and Google Home) use their voice assistants at least daily, and 44% use them multiple times per day. Over 76% of smart speaker owners increased their usage of voice assistants in the last year.
So what’s the driver behind this change? You could say there are “changing user demands.” “There is an increased overall awareness and a higher level of comfort demonstrated specifically by millennial consumers. In this ever-evolving digital world where speed, efficiency, and convenience are constantly being optimized.” Me? I’m not entirely sure. I still get the feeling that we’re pursuing a lot of voice technology because we can, rather than because there is actual demand.
Personally, I use a HomePod in my office to play music (and mostly control it through voice) and I use Siri on my phone to set a timer while cooking. In order for voice assistants to actually break through to the masses in 2019, I believe there are a couple of things they first need to be able to handle.
First, they need to be able to help me be more productive and help me get actual work done. Activating Siri and saying ‘Add design review to my project in Things’ is not particularly faster than opening Things and just typing it in. I can open my email application and type a reply faster than I can tell Siri to do it. I can usually add an event to my calendar more quickly than I can get Siri to successfully add ‘Lunch with Klas on Tuesday, February 5th at 12:00 at Saltimporten’. It needs to help me get actual work done. And most importantly, I should be able to trust it to complete the task successfully without me having to double-check that it didn’t make a mistake.
If voice assistants are to become true ‘personal assistants’, they need to be much closer to what an actual personal assistant would be. Would an actual assistant ever get confused by the same simple, mundane command twice, and eventually just answer that they can’t perform the task? No. However, these companies are working hard to make us think of Siri and Alexa as humans. Voice is the new skeuomorphism.
The articulation of the metaphor of a human assistant, and the way voice assistants mimic humans, is literal. Just as buttons look literally like buttons in a skeuomorphic visual interface, the voice assistant that sounds literally like a human is a skeuomorphism.
There’s a lot of talk about how voice assistants are getting far better at understanding context. What this means is that follow-up questions are understood as related to a question asked before. So if I ask ‘What’s the weather like tomorrow in Oslo?’, I can follow up with ‘Will it snow?’ and my assistant understands that I’m asking about tomorrow and Oslo. Or if I ask ‘Who’s the president of the United States?’ and follow up with ‘Is he impeached yet?’, it understands that ‘he’ refers to the president of the United States.
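To make the idea concrete, here is a minimal sketch of how a follow-up question could inherit details from an earlier one. It is not how Siri, Alexa, or any real assistant is implemented; the `Conversation` class, the `ask` method, and the slot names (`place`, `day`) are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    # Details remembered from earlier turns, e.g. {"place": "Oslo"}.
    # (Hypothetical structure, not any vendor's API.)
    slots: dict = field(default_factory=dict)

    def ask(self, intent: str, **new_slots: str) -> dict:
        # New details override remembered ones; missing ones are inherited.
        self.slots.update(new_slots)
        return {"intent": intent, **self.slots}


chat = Conversation()
first = chat.ask("weather", place="Oslo", day="tomorrow")
follow_up = chat.ask("will_it_snow")  # no place or day given this time
print(follow_up)
```

The second request never mentions Oslo or tomorrow, yet the resolved query still carries both, which is roughly what ‘understanding context’ amounts to in these short exchanges.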
While that’s great for micro-interactions, I think what’s really missing, from a user experience perspective, is more personalized context. The assistant should be aware of what apps I use. If I tell Siri to ‘Remind me to call Klas’, it should be aware that I use Things for reminders, not the default Reminders app. Then it will automatically place the reminder in the correct app (based on my settings). Similarly, if I ask about the weather, it should know what my default weather app is. The requests should be on my terms, not theirs.
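As a rough sketch of what ‘on my terms’ could look like: a lookup that prefers my own settings and only falls back to the system default when I haven’t chosen anything. The function `resolve_app`, the `SYSTEM_DEFAULTS` table, and the preference keys are hypothetical and do not correspond to any real Siri setting or API.

```python
# Hypothetical system defaults, used only when the user has no preference.
SYSTEM_DEFAULTS = {"reminder": "Reminders", "weather": "Weather"}


def resolve_app(request_type: str, user_prefs: dict) -> str:
    # The user's own choice wins; otherwise fall back to the system default.
    return user_prefs.get(request_type, SYSTEM_DEFAULTS[request_type])


my_prefs = {"reminder": "Things"}  # I use Things for reminders

print(resolve_app("reminder", my_prefs))  # my choice
print(resolve_app("weather", my_prefs))   # no preference set, so the default
```

With this in place, ‘Remind me to call Klas’ would land in Things, while a weather question would still go to the default app until I say otherwise.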
It should be aware of what I’m currently doing. If I’m listening to The Talkshow, I should be able to say ‘Remind me of this’ and it should create a direct link to 30 seconds earlier in the podcast. If I’m browsing something on Amazon and I say ‘Remind me to buy this’, it should include a link to the product without me having to specify it.
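The podcast case is mostly simple arithmetic: take the current playback position, step back 30 seconds (without going below zero), and attach that to a link. Here is a small sketch of that, where the `reminder_link` function, the `t` query parameter, and the example URL are all assumptions for illustration; real podcast apps may use entirely different link formats.

```python
from urllib.parse import urlencode

REWIND_SECONDS = 30  # how far back the reminder should point


def reminder_link(episode_url: str, position_seconds: int) -> str:
    # Step back 30 seconds, but never before the start of the episode.
    start = max(0, position_seconds - REWIND_SECONDS)
    # The "t" parameter is a made-up convention for this sketch.
    return f"{episode_url}?{urlencode({'t': start})}"


link = reminder_link("https://example.com/episode", 754)
early_link = reminder_link("https://example.com/episode", 10)
print(link)
print(early_link)
```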
I’m sure that there is a lot of future in voice commands, but if they are supposed to ‘be the thing of 2019’, I think the software and execution behind them really have a long way to go. People are not going to be entertained by being able to ask ‘Who starred in Casino?’ forever.*
* Not to mention you can still only set ONE timer on iOS, and you can’t even do that on your Mac!