Today I had the chance to talk to Todd Mozer, CEO of speech recognition technology developer Sensory Inc. In the past 15 years, about 60 million products have shipped with its chips and software embedded. These include everything from in-car voice command systems to Bluetooth headsets to voice-controlled toys.
The company’s next big area of growth, according to Mozer, lies in speech-controlled Internet devices (SCIDs). This essentially means the many consumer electronics products that surround us, and some yet to be marketed, that are both Internet connected and speech enabled.
The growth case rests on the increasing saturation of wireless broadband (4G rollouts, etc.), which is driven by competition among wireless carriers. One result of this, along with Wi-Fi saturation, will be more connected devices. Intel, in fact, predicts there will be 15 billion connected devices by 2015.
Raise Your Voice
Those connected devices include everything from coffee makers to shoehorns (kidding, only sort of). The reason this is interesting to Sensory is that many of these devices don’t fit our typical view of an Internet-connected device (read: keyboard and screen). Voice then becomes a logical input mechanism.
One example Mozer showed me was a clock that is completely speech enabled: you can set and ask the time by voice, and it reports current weather and other localized variables. In this case, the voice recognition chip sits on the client device and performs a fixed set of functions, which is nothing new.
But when the device is networked to a server-side speech processing engine, a much broader set of possibilities emerges. The same clock can now tell you the weather in Paris or the closest theater where “The Hangover” is playing. The connectivity also opens the possibility for ad support. This relies on more audio content (ads), but Google is moving in that direction, as we’ve argued, with Google Voice and other efforts.
Sensory has done work with GOOG-411 and currently works with Microsoft on Bing 411. This involves connecting client-side voice commands (from an in-car Bluetooth device, for example) to a free directory assistance (DA) call. The DA providers obviously love it because it sends them traffic. Bing 411 is more interesting, says Mozer, because it has branched out beyond local listings to include a broader index of weather, stocks and such.
Other possibilities, Mozer says, include tying speech recognition to VoIP capability so that users can call people or places from these different devices. This can include ordering a product or making a reservation. It starts to get interesting if you tie it to call tracking services that monetize local phone leads. Skype has also done some interesting work in this area, which we uncovered a few months ago.
Bottom line: If Mozer and Intel are right about the explosion of SCIDs, you can imagine a day when the average home is filled with not only Wi-Fi connectivity but also speech recognition. The kitchen, living room, den, car and other places become search engines. True, that’s already the promise of the mobile phone, but this broadens it and gets us closer to being truly “wireless.”
As for Sensory as a company, it has been profitable for four of the past six years; the down years had to do with the economy plus strategic investments in Bluetooth integrations. Stay tuned for more announcements in the next 60 days that I can’t talk about yet, which will get the company closer to bringing us the wide world of SCIDs.