Speech Recognition, the Brand and the Voice

How to Choose a Voice for Your Application

Marcus Graham, Founder/CEO, GM Voices

What began some thirty-odd years ago as an effort to let callers know that your business was closed has grown into something far more sophisticated and critical to businesses today. Those simple answering machine messages, such as “Our office is closed…”, have morphed into a wide range of automated call-routing, information-dispensing and transaction-completing technologies that are changing the way our society does business.

As more companies implement voice-driven self-service applications to lower costs, the smart ones have already recognized the impact this vital contact channel has on their customers. A critical part of the customer experience in automated voice applications is the recorded voice that guides callers to their desired information or transaction.

Historically, the voices heard on a company’s recorded telephone messages were given little thought because they were originally considered an extension of the receptionist. The receptionist answered the phone live during the day, and after closing time the receptionist’s recorded voice answered via a phone answering machine. When touch-tone automated attendant and IVR systems arrived in the 80s and 90s, the receptionist continued doing many of the greetings and menus.

While the receptionist continued providing the voice on some applications, radio announcers began recording more of the messages. These ‘enunciation’ experts could pronounce any particular word perfectly, but they didn’t sound natural. They had that ‘disc jockey’ sound. In fact, most automated systems working today use this sort of voice talent.

Let’s assume that the speech application is functionally sound, with effective call flows, scripts and high recognition rates. The only aspect the caller really ‘connects with,’ then, is the voice. Callers know it’s not a real person, but they do want it to be personable. Research has shown that people attribute human characteristics to speech applications because it just seems natural to do so. After all, it is a voice speaking to them, right? It stands to reason that the more pleasant the voice, the better the exchange.

Speech Recognition

When speech recognition began to make its way out of the labs and into the marketplace in the late 90s, it soon became apparent that the best way to encourage callers to speak naturally, and so improve recognition rates, was to speak naturally to them through the prerecorded voice prompts. This realization dramatically raised the level of quality expected when recording greetings, menus, and prompts for the telephone.

Ironically, at the same time, much of the higher-end advertising and production world began moving toward using more believable voice actors in television, corporate productions, and shows. Using actors posing as ‘real people’ proved a more credible and effective communications technique, and it is even more widespread today. Most of the successful speech applications in use today employ actors as the voice talent.

An important part of creating a virtual personality for a speech application is developing a persona or biography that clarifies the impression the company wants to leave with the caller and that the actor can use when recording the messages. Who is this person answering the phone? How old are they? Where do they live? Do they have kids? Using this biography, the voice actor can become the person who’s answering the phone and maintain consistency throughout the recording of hundreds or even thousands of voice prompts.

The Brand

How do you go about identifying the type of voice that will work best with your application? It starts with the brand. Think of the brand as a container that holds every experience the user has had with that company or product. All the touchpoints, the good, the bad, and the ugly, paint a picture in the individual customer’s mind. The voice that is ultimately chosen for the company’s speech application should embody the same attributes as the brand.

With multiple applications, there may be a need to brand by department, depending on the company and requirements. For example, specialized speech applications for particular departments may warrant positioning for something other than the overall brand and may require a different personality or strategy.

The automated-voice application is simply another contact channel with customers. It should be given the same care and consideration as other contact points. Voice customer self-service has become one of the dominant contact channels for many companies. As a result, Chief Marketing Officers have discovered that it has a huge impact on the customer’s brand awareness.

The good news for people on the technical side is that the marketing department needs to pony up some budget money for implementing speech recognition applications! We’ve heard numerous respondents in focus groups and in-depth individual interviews state that the choice of speech recognition over touch-tone is a competitive advantage. Positioning speech as a marketing tool may ease the financial burden faced by many IT departments if they play it right within their organization.

The Voice Solution Provider

The discussion of voice talent selection usually goes from considering ‘the brand’ directly to ‘choosing a voice,’ but a more fundamental decision should be considered before selecting the voice talent, one that has a large impact on the final product: you need to evaluate the ideal way for your organization to get those voice prompts. There are three basic sources: 1) you, 2) a technology provider, or 3) a voice solution provider.


You

You manage the voice talent recruitment, audition, selection, contracts, and general management process. This is rarely an effective solution. Just as your business specializes in its area of expertise, you’ve got to decide whether you want to learn this business yourself or hire people who do it every day. On occasion, we’ve had companies that already had a relationship with a voice talent and asked us to work with that talent on their behalf to manage the process.

Technology Provider

This is a one-stop-shopping solution that many companies choose. A technology provider, such as Nuance, Avaya, Genesys, or IBM, has a professional services group with experience in this area and can do an effective job. Today, more of these companies outsource this task to a company in the next group. It’s just too complex to do well part-time.

Voice Solution Provider

The third option uses a specialized provider, such as my company, GM Voices, that focuses entirely on providing voice talent solutions for speech applications. These specialized companies typically have more voice talent options, quicker turnaround, lower costs, and more natural-sounding concatenated speech (account and phone number strings). This option can help you with broader voice-branding solutions by putting your selected voice talent on all the other automated applications such as automated attendant, call center queues, and even on-hold messages.

The Voice Talent

Once you’ve decided on the best option for your organization, you need to look at voice talent. The choice of provider you made above will affect this process because each provider will likely have voice talent options readily available to you. Regardless of which direction you go, you may consider an in-house voice or an outsourced voice.

In-House Voice

This is usually a throwback to the after-hours message recorded by the receptionist many years ago. I know a few staff members who turned auto-attendant work for their company into a career as a voice talent. But realistically, you can’t expect a quality application from non-professional voice talent. Familiarity with voice-over recording, maintaining performance consistency, and a dozen other considerations do not come naturally. These skills are learned over years of refining the craft of voice acting.

Outsourced Voice

These are professionals who can deliver on most of the requirements. Of course, there are many professionals who ‘won’t pass the audition’ for various reasons. Remember, choosing a voice is a very subjective process. It’s hard to articulate why you like a voice or don’t care for one, but you know it when you hear it. The difference between a Radio Announcer, a Voice Talent, and a Voice Actor is worthy of some discussion.

A Radio Announcer is typically a rich-voiced performer who’s been in the radio business for some time. Male and female announcers are heard every day all over the country. Next time you listen to the radio, listen to the words. They’re usually pronounced perfectly and they likely have a deep tone, but they don’t sound like real people. It’s very difficult to turn off the patterned cadence of radio talking. “That’s right, now back to more music.” It’s hard to say that like a real person.

A Voice Talent is someone who’s trained in using their voice. They sound somewhat conversational, but it’s still not quite real. These people usually are the anonymous voices you hear on radio and television commercials. They also do much of the narration you hear in business presentations or even on television. Anyone who’s ever done a voice-over will call themselves a voice talent. It’s really hard to articulate the difference.

A Voice Actor is an actor. They’ve likely been trained in drama or theatre where they learned how to ‘become’ the character in the show. It’s about sounding real by being real. That’s why the persona biographies are so important. That’s how they figure out how to perform when the microphone is turned on. They are the guys and girls next door who talk to you like…the guy and girl next door!

When it’s all said and done, you can’t put a voice talent on a spreadsheet. Choosing a voice is still a very subjective, creative, art-oriented process. And there are clearly preferences in particular applications for gender, age, and other factors.

One factor that will play a growing role in choosing voices is research. We’ve recently worked on projects for large IVR users where the marketing players are participating in the discussion and enthusiastically embracing traditional consumer-based research to validate voice choices. Through focus groups and in-depth individual interviews, we’re finding out from customers what they really want. Marketing leaders will continue to get more involved in large-scale speech implementations as these applications increasingly shape the brand in the mind of the customer.

GM Voices welcomes the opportunity to discuss these and other issues relating to voices and technology. Visit us at www.gmvoices.com to listen to all our voice samples and languages.

© GM Voices, Inc. All rights reserved.