Voice control is an extremely powerful way to use technology. Anyone can use it, and there is next to no learning curve. How does it work?
When we ask our virtual assistants to change the lighting in a room, the audio clip is converted to text by a speech-to-text engine. The assistant’s algorithm then extracts the important information from the instruction, such as the command and its subject. This “intent” is a structured representation containing the entity to control (say, a lamp), the action to perform on it (change its color), and parameters describing how the color should be changed (such as “to white”).
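As a rough sketch, such an intent might be represented like the record below. The field names are my own illustration, not any particular assistant’s schema:

# A hypothetical intent record; field names are illustrative only.
intent = {
    "entity": "table lamp",            # the entity to control
    "action": "set_color",             # the action to perform on it
    "parameters": {"color": "white"},  # how the action should be performed
}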
Consider this voice instruction controlling a multi-colored LED lamp:
Home Assistant, change the color of the table lamp to white!
– You (wondering if it will work or not)
With a single command, we communicated four vital pieces of information to the hub:
- the intention to issue a voice command (keyword)
- the entity to control
- the action to perform
- and action parameters.
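To make this concrete, here is a minimal Python sketch of how a hub might pull those four pieces out of the transcribed text. The wake word, the regular expression, and the field names are assumptions chosen for illustration; real assistants use far more sophisticated parsing.

import re

WAKE_WORD = "home assistant"  # the keyword; real assistants detect this in the audio itself

# A naive pattern for one command shape:
# "change the <property> of the <entity> to <value>"
PATTERN = re.compile(
    r"change the (?P<property>\w+) of the (?P<entity>[\w ]+?) to (?P<value>\w+)"
)

def parse_command(transcript: str) -> dict | None:
    """Extract the entity, action and parameters from a transcribed command."""
    text = transcript.lower().strip(" !.")
    if not text.startswith(WAKE_WORD):
        return None  # no keyword, so no intention to issue a voice command
    match = PATTERN.search(text)
    if match is None:
        return None  # phrasing not understood
    prop = match.group("property")
    return {
        "entity": match.group("entity"),             # "table lamp"
        "action": f"set_{prop}",                     # "set_color"
        "parameters": {prop: match.group("value")},  # {"color": "white"}
    }

print(parse_command("Home Assistant, change the color of the table lamp to white!"))
# {'entity': 'table lamp', 'action': 'set_color', 'parameters': {'color': 'white'}}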
The clunky alternative
Imagine a different method of achieving the same outcome. Let’s use a clunky web form to illustrate the point. You would need a bookmark on your home screen to open the page. When the form loads, you are confronted with a drop-down list of twenty-something entities to control: lamps, garage door, window blinds, locks, heater, fans, A/C. Then comes another drop-down with actions to perform, and a freeform text field for entering optional action parameters in a set, structured format such as JSON (hardly ideal for a user-friendly interface):
{
  "action_parameter": {
    "to_color": "white"
  }
}
Entering this information through a web interface is, frankly, the worst, and would be reason enough for me not to bother with the product. It will always be a time-consuming process, executed on a touch-screen interface by people who very quickly come to understand that the original interface (the switch on the wall, or the remote control) was much more convenient to use.
Intent recognition
This example illustrates the power of voice control in making human-computer interfaces more natural and efficient. The technology powering voice assistants is getting better at recognizing intents, even when the instruction is delivered in different phrasings. In this way, your voice assistant can be trained to respond to different ways of issuing the same voice command, for example:
Home Assistant, I would like the table lamp to be white in colour.
This further lowers the learning curve by catering to the ways different people speak. It removes the need to learn and memorise robotic voice commands and blurs the boundaries between human-computer and human-human communication.
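One naive way to sketch this idea, far simpler than the machine-learned models real assistants use, is to map several utterance templates onto the same intent. The templates and names below are illustrative assumptions:

import re

# Several phrasings, one intent. Templates are illustrative assumptions.
TEMPLATES = [
    r"change the color of the (?P<entity>[\w ]+?) to (?P<color>\w+)",
    r"i would like the (?P<entity>[\w ]+?) to be (?P<color>\w+) in colou?r",
    r"turn the (?P<entity>[\w ]+?) (?P<color>\w+)",
]

def recognise(utterance: str) -> dict | None:
    """Try each template; every one of them resolves to the same set_color intent."""
    text = utterance.lower().strip(" !.")
    for template in TEMPLATES:
        match = re.search(template, text)
        if match:
            return {"action": "set_color", **match.groupdict()}
    return None

for phrase in [
    "Home Assistant, change the color of the table lamp to white!",
    "Home Assistant, I would like the table lamp to be white in colour.",
]:
    print(recognise(phrase))
# Both lines print: {'action': 'set_color', 'entity': 'table lamp', 'color': 'white'}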
Open source projects
All this makes voice control an excellent addition to any home automation system. If you’re like me and privacy is important to you, I suggest checking out Mycroft.ai, an open-source, privacy-focused voice assistant. The documentation shows promising features and use cases, and I am keen to integrate it into my own system. Check out the “Why use Mycroft AI?” page for more information.