In the future, voice will be the primary input mechanism for virtually all interfaces. Science fiction books, movies, and TV shows have predicted this for years, from the iconic ship computer in Star Trek to, more recently, the operating system Samantha in the 2013 film Her. Yet these interfaces are no longer purely fiction; platforms like Amazon Alexa and Google Assistant are enabling users to talk to their computers and take more hands-free actions than ever before. But what is the state of the art in 2019, and what does it mean for us as software engineers?
As the lead engineer for voice interfaces and emerging platforms at National Public Radio (NPR), I’ve been at the forefront of the push to develop consumer applications for voice platforms and have experimented with other applications of this technology. In this talk, I will cover the many things I’ve learned along the way, including what voice UI development is and isn’t. (Spoiler alert: you’re not doing machine learning or natural language processing when you’re developing consumer apps!) We’ll go over the core strategy for building a voice app (or “skill”) for platforms like Alexa and Google Assistant using JavaScript, and discuss the limitations of the current technology along with predictions for how it will evolve. Finally, I’ll conclude with my thoughts on the essential skills that a developer in this space should possess, so that you’ll feel prepared to enter the voice UI space, whether you’re planning to start working with it tomorrow or a decade from now.
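To make the spoiler concrete, here is a minimal sketch of what a skill backend typically looks like with Amazon’s ask-sdk-core SDK for Node.js. The platform performs the speech recognition and intent matching; your code only receives a structured intent and returns a response. “PlayNewsIntent” is a hypothetical intent name used for illustration, not an identifier from any actual NPR skill.

```javascript
// Minimal Alexa skill sketch using ask-sdk-core (Node.js).
// The Alexa platform handles all the ML/NLP; we just map intents to responses.
const Alexa = require('ask-sdk-core');

// Fires when the user opens the skill ("Alexa, open ...").
const LaunchRequestHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'LaunchRequest';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Welcome! Ask me to play the latest news.')
      .reprompt('What would you like to hear?')
      .getResponse();
  },
};

// "PlayNewsIntent" is a hypothetical intent defined in the skill's
// interaction model; the platform resolves spoken phrases to it for us.
const PlayNewsIntentHandler = {
  canHandle(handlerInput) {
    return Alexa.getRequestType(handlerInput.requestEnvelope) === 'IntentRequest'
      && Alexa.getIntentName(handlerInput.requestEnvelope) === 'PlayNewsIntent';
  },
  handle(handlerInput) {
    return handlerInput.responseBuilder
      .speak('Here is the latest news.')
      .getResponse();
  },
};

// Wire the handlers into an AWS Lambda entry point.
exports.handler = Alexa.SkillBuilders.custom()
  .addRequestHandlers(LaunchRequestHandler, PlayNewsIntentHandler)
  .lambda();
```

As the sketch shows, the developer’s job is routing structured requests and crafting responses, which is ordinary application code rather than machine learning.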