Viva la Voice! Hypervoice, hot wording & killer apps on the rise

What do StubHub, Uber and The RealReal have in common?  Besides the fact that they provide services we cannot stop raving about, they neatly illustrate how high the bar has been set for new user adoption.  

We don’t tend to pick up something new because it’s a “nice to have” or even a “must have” app or service. Today, a new tool must fundamentally reinvent an experience in a profound, meaningful and whip-simple way. Our tolerance for any conceivable inconvenience – such as a learning curve – seems to be at an all-time low.

With that in mind, where does Hypervoice fit along the adoption life cycle? 

We all know that emergent paradigms and market shifts take time. But unless you are one of the early pioneers (complete with arrows in your back), you may not realize that the advent of Hypervoice is a 14-year-long journey.  

Until very recently, the conditions for voice as a native Web object have been, well, unfavorable.  Economic, regulatory and behavioral barriers have resulted in a steady stream of casualties.  One particularly painful source of delay was the Great Recession when funding for voice innovation evaporated, and VC focus shifted almost exclusively to digital – primarily mobile and social media.  

Given the bumpy past, an outsider may have a hard time fathoming the recent outpouring of optimism for Hypervoice. However, riding high after two Hypervoice events last week, I am more confident than ever of its impending mass adoption.

Thanks to Siri, the connected car and most recently – conversational search and hot wording from Google – voice as an interface is approaching a behavioral tipping point.  The phone as a typewriter is quickly being seen as an outdated, outmoded model.  This critical behavioral micro-step – of talking to your phone or computer instead of through it – is required for Hypervoice to take hold. 

Simply put, we need to be willing to talk to our devices.  And we have never been closer.  

Then again, old habits die hard.

What is clear is that the juggernaut of WebRTC is blazing the path for a plethora of new voice applications to emerge. Similarly, Big Data has both startups and enterprises searching for the killer app. Thankfully, you don’t have to be a data scientist to recognize that voice is a far more information-rich media type than text. "Big Voice Data" will be a hot trend in short order.

When viewed through the prism of these two zeitgeists, it is easy to see why the voice innovation community is optimistic. Voice is getting downright sexy.

Do You See What I See? SeeMail Brings Hypervoice to Life

What does Hypervoice look like?  How does it make you feel?  While enterprise and B2B applications were the first examples cited, consumer Hypervoice applications are rapidly emerging. 

Case in point, check out SeeMail - a mobile app that adds voice to the photo sharing experience.  SeeMail was the first to combine photos and audio in this unique and profound way. Recently, I had the pleasure of speaking with SeeMail’s Founder & CEO, Ward Chandler, to hear what inspired him to view voice in this novel, intimate way. 

KF: What inspired you to found SeeMail?

WC:  I'd been thinking about some new type of "visual voicemail" hybrid for a long time - I registered the SeeMail.com domain way back in 1998.  When I started looking through old family photos and started reading the stories that my dad had written on the back, it hit me - create a mobile photo sharing app where people can share the story behind the photo in their own words.

KF: Wow, that’s an early insight - 1998 was long before Instagram was a household name.

KF:  How have people responded to the app thus far? 

WC: "This is fantastic - why didn't someone do this before?!" 

SeeMail has been in the market for over a year now. Since SeeMail can be shared privately between friends or family, people love the personal, intimate aspect of the app.

KF: What is the driver behind SeeMail’s early success?

WC:  It’s a much more intimate experience to hear the person's voice or the sounds around them. You know that they are thinking of you and want to share that moment specifically with you.   

KF: How does the Hypervoice model intersect with your vision for SeeMail?

WC:  The concept of Hypervoice is very much aligned with the fundamental premise of SeeMail.  I believe that over the last 30 years the telephone has turned into a typewriter, and it's time to turn it back into a device that we talk on instead of type on.  We have the technology now to recognize and convert voice to text. And we can capture and catalog conversations and make them searchable.  That technology gives us the best of both worlds – the searchable, retrievable benefits of text and the personal, emotional communication that only the human voice can relay.  

I expect that in 10 years we will look back at all this texting and emailing and laugh.   

KF:  I hope we don’t have to wait 10 years! Thank you for being such an early visionary and pioneer, Ward. 

Unified Communications and Hypervoice: how are they related?

Historically, voice communication has been tied to a synchronous paradigm. Hypervoice allows us to occupy a new opportunity space: asynchronous voice. In March I presented on Hypervoice at UC Expo in London. (You can watch the video here.) A question worth thinking about is how the two worlds of Unified Communications and Hypervoice are inter-related. To get there, we first need to see where the former is headed.

Unified communications has been moving away from being about the "unifying" of disparate synchronous and asynchronous messaging technologies, and is instead re-positioning itself as being about collaboration. Indeed, it is being referred to as "Unified Communications and Collaboration" or UC&C. This is unsurprising, since all business communication is ultimately about collaboration towards some directed goal.

An important part of the value of UC&C is the ability to integrate with business processes, such as trouble ticket management, or purchase approvals. These "Communications Enabled Business Processes" make the system more aware of our goals and their context. The humans are left to collaborate over human issues, and the machines focus on automation within their domain.

The key objective of CEBP is to reduce the cycle time of business processes, so that issues can be surfaced from the "automated" domain for human resolution, and the decision results handed back for continued automation. As such, communications and collaboration are becoming increasingly embedded into the systems in which the business processes occur, rather than being a separate stand-alone activity or application. This allows the business process to be managed and monitored end-to-end.

This contextual embedding applies equally to voice communications, which used to exist in a separate silo of the enterprise PBX. The convergence of voice into enterprise collaboration is evidenced by Oracle's recent acquisitions of Acme Packet and Tekelec. And here is where we can see how Hypervoice enters the fray.

At the moment those enterprise CEBP activities are highly text-centric. Users are presented with overflowing inboxes of messages from automated systems requiring attention. A cascade of further messages is then generated as people co-ordinate around issues. Each round of text-centric messaging induces delay to getting a business process to move forward.

Every conference call has been preceded by a long negotiation over timing and participation. That meant that there was a strong incentive to stay in a textual mode; the set-up time of voice overwhelmed its subsequent benefits of synchronous interaction. We have tried to mitigate this issue by using presence to indicate who might be available for a conversation, and to manage an ad-hoc escalation to voice. It has never really been a satisfactory solution once the group of people involved exceeds 2-3 persons.

However, with Hypervoice we have a wider range of options. We can have an ad-hoc voice conversation without it disrupting the overall interaction flow, or excluding participants. Hypervoice potentially opens up a new way of working, as much a step-change as was the jump from inter-office memos to email.

Imagine for a moment you are in your "virtual work environment" of the early 2020s... You see visualised a number of business tasks in front of you, almost like virtual water coolers around which people are available to conduct discussions relevant to the tasks at hand. You can step up to one of those spaces, and will find others who are working on the same problem or issue. It is quite possible you will collaboratively review a few minutes of audio from a related conversation held by another team working on the same issue. You have your conversation about the business problem, take notes, interact with various other business objects and processes, set some actions in train, and then "step back". No emails, no IMs, but a complete record of the interaction is retained.

Moving between text and voice becomes a simple and natural shift, unlike how we manage conversations today. The machines are working to set up spaces and conversations for us, based on who is around, and using more natural metaphors. We see our "business terrain" laid out before us, in a very different way to the inbox of today. The boundaries between synchronous and asynchronous start to blur; we have flows of conversations, about flows of business issues. We can step between different flows, and know we won't be missing out by joining another conversation.

Whilst this may seem futuristic and fantastic, these changes can happen very quickly. Things we take for granted, like getting an accurate map of our locality everywhere we go, were novelties less than a decade ago. As former Intel CEO Andy Grove famously said: "A fundamental rule in technology says that whatever can be done will be done."

Hypervoice in Healthcare: Engaging patients & reducing diagnostic errors

WebRTC and VoIP are about the delivery mechanism of voice, not the content itself.

In healthcare, the content of conversations between doctor and patient is incredibly important. I am spending nights thinking about how Hypervoice can enhance health and overall care, and I’m sure there are 100 ways I can’t yet even imagine.

One of my initial goals, stemming from my passion for reducing diagnostic errors, is to engage patients more actively in the diagnostic process (by linking what they say to what they do) – more efficiently, effectively and securely. When patients are allowed only 12 seconds to speak, it is disrespectful and annoying, and it can lead to a ‘rush to diagnosis’ in which other diagnostic possibilities are never considered. With incomplete information, the potential for diagnostic error is much greater. By using Hypervoice to capture and structure conversations, and to make them easily searchable for reflection and reminders, it is possible to promote engagement and participation, provided empathy is also offered more generously.

Clinicians reiterate their desire for the patient to prepare in advance of a visit. A potential use case I see for Hypervoice is for patients to tell their story in a facilitated conversation that is recorded and structured, so that the visit begins with a ‘head start’. Hypervoice has the opportunity to play a significant role by providing a complete and powerful patient history that can be recorded, searched and reflected upon in less time than it takes to listen to a patient who may ramble, be confused at the time or simply be poorly prepared.

The Association of American Medical Colleges estimates that the United States faces a shortage of more than 90,000 physicians by 2020 - a number that will grow to more than 130,000 by 2025. This trend is ubiquitous throughout the world. Therefore, we need to think about task shifting: tasks can shift to non-doctors and to the patient.

I am a huge supporter of patients (if they are stable adults) being the executives of their own care. They give the ultimate "go" or "no go" - but they need the right knowledge to make good decisions. We can help patients and empower them to become better Narrative Creators (1). I think we can do that by facilitating conversations and using Hypervoice as an enabler.

Hypervoice is certainly not the only answer, but it’s the beginning in recognizing a challenge and a potential solution to improve patient care and safety, as well as something else I am becoming a fervent believer in – DIY or Do-it-Yourself Health.

(1) Graham Douglas is a pioneer of Applied Mind Science with a wealth of experience in innovative projects in government, business, civil society organizations and in international development. Contact: www.integrative-thinking.com.

Read my first guest blog post on Hypervoice in Healthcare here

Martin's take on the Future of Conferencing

The Hypervoice team was in Las Vegas March 7-8 to attend the Telespan workshop on the Future of Conferencing. The Telespan event is where the teleconferencing industry gathers to celebrate its wins and mourn its losses. The players cover conference calling, video conferencing, and newer web-centric entrants.

The general tone of the event was one of “change or perish”. The teleconferencing industry is mature, margins are under pressure, and impending regulatory changes in the USA may transform the cost and revenue model. The number of mentions of WebRTC – both as a threat and opportunity – was notably high. For a long time, the per-minute model of conference calls created an incentive to minimise productivity and keep calls long, rather than to maximise it and make calls short. The consequences of this are now catching up.

I presented an Introduction to Hypervoice along with Hypervoice Consortium co-founder Kelly Fitzsimmons, and Tracy Isacke of Telefónica Digital, who is their Director of Business Development & Investments and based in the Silicon Valley office. The idea of Hypervoice clearly excites and interests a lot of people, and the hard question we keep getting is “where do I begin?”

I would like to highlight three players who attended the Telespan event and who have already begun showing value in the Hypervoice space.

VoiceBase

Jay Blazensky is co-founder of VoiceBase, and drew a lot of people to his demo at the Hypervoice table. His company is the “Google of spoken voice”. They were demonstrating the ability to ingest large amounts of podcasts and published spoken audio, automatically perform voice-to-text transcription, and then make the results searchable. The search results highlight the position of each search word (e.g. “Advantages” and “Benefits”) in the spoken audio.
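For readers who want a feel for the idea, here is a minimal Python sketch of searching spoken audio by word. It is not VoiceBase's implementation or API. It simply assumes a speech-to-text engine has already produced a transcript with a start time for each word; the `Word` type, the sample transcript and the function names are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Word:
    text: str     # the transcribed word (hypothetical engine output)
    start: float  # offset into the audio, in seconds


def build_index(transcript):
    """Map each lowercased word to the times (in seconds) it is spoken."""
    index = defaultdict(list)
    for word in transcript:
        # Normalise case and trailing punctuation so "benefits." matches "Benefits".
        index[word.text.lower().strip(".,!?")].append(word.start)
    return index


def search(index, term):
    """Return the audio offsets where the search term occurs, in order."""
    return index.get(term.lower(), [])


# Made-up transcript for illustration only.
transcript = [
    Word("The", 0.0), Word("advantages", 0.4), Word("of", 1.1),
    Word("voice", 1.3), Word("include", 1.8), Word("clear", 2.4),
    Word("benefits.", 2.9), Word("Those", 3.8), Word("benefits", 4.1),
    Word("compound.", 4.7),
]

index = build_index(transcript)
hits_benefits = search(index, "Benefits")      # offsets a player could jump to
hits_advantages = search(index, "advantages")
```

Each offset returned is a position a player could highlight or jump to, which is the essence of making recorded voice searchable.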

Jay made the point that telcos are missing a trick with recorded voice. It can attract new users, make existing ones sticky (by persisting their data), and recruit new ones via the viral nature of sharing.

ZipDX

The ZipDX service provides a fascinating real-time transcription of audio, augmenting the conversation as you speak, as well as providing a parallel text that lets you return to any portion later. Note the “play” button next to each little piece of transcription that lets you return to that place.

The service is of particular interest to international organisations with multi-lingual needs. The ability to apply different language transcription to each separate audio stream in the mix is new and differentiating.

HarQen

HarQen, one of the six Hypervoice Consortium founding partners, was also present. They have two products in this space. Their VoiceAdvantage product for HR recruitment allows you to quickly find and screen the best candidates for a job, with voice automation resulting in a two-thirds drop in the cost of recruitment. Their Symposia product is an early example of the kind of rich recording, tagging and workflow integration we expect to spread widely in this space. HarQen President Ane Ohm commented: "There is a need to build partnerships and mindshare for this emerging segment. Hypervoice is a rallying point for voice-enhanced business communications." You can see Symposia in action during our first Hypervoice virtual event, which we blogged about here

Finally, our best wishes to Telespan workshop organiser Elliot Gold who announced his retirement at the close of the event, after 33 years running his business.