Smart Speakers: The New Face of Faceless Computing

Sometimes, new technology sneaks up on you. Who would have thought, in 1993, that the hyperlinked internet with pictures would explode into the indispensable World Wide Web? In 2004, the GPS in my Prius seemed like an unnecessary luxury, but it quickly usurped that formerly essential book of Los Angeles street maps, the Thomas Guide. In 2014, I declared that I didn’t need that big, fancy iPhone 6. Now, I carry it wherever I go.

Will a voice-activated computer interface become a similarly disruptive technology?

As internet entrepreneur and investor Chris Dixon declares in a blog entry, “The next big thing will start out looking like a toy” (“The Next Big Thing Will Start Out Looking Like a Toy,” Cdixon Blog, Jan. 3, 2010; cdixon.org/2010/01/03/the-next-big-thing-will-start-out-looking-like-a-toy). “Disruptive technologies are dismissed as toys,” he writes, “because when they are first launched they ‘undershoot’ user needs.” His example is the telephone, which at first could only carry voices a mile or two. What possible use was that to the railroads, which were then its primary customers? “What they failed to anticipate was how rapidly telephone technology and infrastructure would improve …,” quickly making the telephone an essential utility of everyday life.

In the same way, the new smart speakers such as Amazon’s Echo (and particularly the miniature Dot) seem like playthings. Ask Echo the time, the date, the weather, and Alexa, the voice of the device, speaks a quick answer. Ask it to report sport scores or answer simple trivia questions, and Alexa is on the case.

Those with newer iPhones or Android-based smartphones are familiar with this personal digital assistant (PDA) technology. In the car, via Bluetooth, I ask my iPhone’s Siri to read my text messages so that I don’t have to look at my phone while driving. I can answer texts too, although my dictated responses sometimes come out garbled; the car is noisy. Still, the technology astonishes me.

And that is using a 3-year-old iPhone in a 13-year-old Prius. At the CES show in January 2017, Shawn DuBravac, the chief economist for the Consumer Technology Association (CTA), spoke about the revolution of speech recognition: “We’ve seen more progress in this technology in the last 30 months than we saw in the last 30 years. Ultimately vocal computing is replacing the traditional graphical user interface” (“CES 2017: Voice Is the Next Computer Interface,” Stephanie Condon, ZDNet, Jan. 4, 2017; zdnet.com/article/ces-2017-voice-is-the-next-computer-interface).

As with telephone technology in the late 19th century, voice recognition is improving so rapidly that it is “ushering in a new era of faceless computing,” according to DuBravac.

Smart Speakers

The easiest way to experience the new age of faceless computing is to buy one of the new voice-activated digital assistants, or “smart speakers.”

These smart speakers listen, send your spoken words over a wireless internet connection, interpret them and then respond. At the time of this writing, there are two main competing smart speakers on the market: the Amazon Echo and Google Home. More devices in the works include Apple HomePod, which is scheduled to be released in December 2017. Microsoft’s voice recognition product, called Cortana, is available on PCs running on Windows 10 and in the fall of 2017 became available on a smart speaker called Invoke (harmankardon.com/invoke.html). Microsoft and Amazon recently announced plans to make Alexa available over Cortana and vice versa (“Alexa and Cortana Will Soon Work Together, Allowing Each to Access the Other,” Greg Sterling, Search Engine Land, Aug. 30, 2017; searchengineland.com/alexa-cortana-will-soon-work-together-allowing-access-281699).

Amazon Echo

The Echo, a 9.25" tall cylindrical speaker with a microphone array, came on the open market in June 2015 at a price of $130. (A smaller version, the Echo Dot, debuted in 2016 at a cost of $50; both units are cheaper now.) Echo responds to a “wake word”: “Alexa” (although this can be changed to “Amazon,” “Echo,” or “Computer”).

Users download the free Alexa app on their smartphones and synchronize Echo with their Amazon account. Then, they can say “Alexa, set an alarm,” “Alexa, what is the weather?” or even ask her to tell a joke. She will play music too. Ask her to play a song or an album; she will play it from the Amazon music service, if you have a Prime membership. (Alternatively, you can use the settings on the app to play music through Spotify or the freely available TuneIn: tunein.com.) Users may also set preferences for news outlets or radio stations in the app. They can then ask for a “flash briefing” to hear the top stories from their favorite outlets.

Alexa will respond to queries about the availability of products. But just as the Kindle is on some level a device designed to buy Amazon ebooks, Alexa also functions as an Amazon buying machine. For example, my son asked Alexa about the availability of decaffeinated coffee beans. Alexa answered with a price and then the question, “Would you like to buy it?” “No,” said my son. Nevertheless, Alexa persisted, offering the top search result for decaf cappuccino. “Would you like to buy it?” she asked again. “No, no, no,” was my son’s answer. She finally shut up about it.

The Echo can do all this right out of the box. Still, the real fun happens when users download “skills,” which are like smartphone apps. According to the website Voicebot.ai, on July 2, 2017, there were more than 15,000 skills available for the Echo (“Amazon Alexa Skill Count Passes 15,000 in the U.S.,” Bret Kinsella, July 2, 2017; voicebot.ai/2017/07/02/amazon-alexa-skill-count-passes-15000-in-the-u-s). These skills are all free and cover a vast range of tasks: anything from hailing a ride from Uber or Lyft to ordering a pizza from Domino’s or reading out a guide for a 7-minute workout. There are games too, including Jeopardy! and The Magic Door, an interactive adventure game. To get a skill, “enable” it from the app or via the Amazon site (search for “Alexa skills”). Once you have enabled a skill, “invoke” it by asking Alexa to open it by name.

Google Home

In November 2016, the smart speaker called Google Home debuted at a cost of about $130. (That price has since gone down.) Google Home, 5.62" tall and 3.79" around, was designed to be a direct competitor to the Amazon Echo. Because it is produced by Google, it is better at answering questions than the Amazon product, but it is worse at buying things and having them delivered to your house.

Indeed, as of July 2, 2017, Google Home offered only 378 voice apps compared to Echo’s 15,000. Still, as user “dmclone” comments on Reddit, “With Echo it seems like you always have to add skills and then you forget what things it can do. With Home it allows me to be dumb and just say simple phrases that it understands” (reddit.com/r/homeautomation/comments/69y480/google_home_vs_echo_after_a_year_of_use/dhab5fx).

The delivery of news headlines on Google Home differs from that on Echo. Google’s news stream is driven by algorithms, which can, as we saw in the 2016 election, favor “fake news,” false stories pushed by Russian bots, for example. On Echo, the users choose their preferred news stream, such as NPR versus Fox. If there is bias in the news reported over the service, it has been consciously chosen by the user.