AI Tools That Modify Speech Have Downsides
Speech technologies, such as those that soften accents and filter out emotions, can strip away vital vocal cues and hinder authentic communication.
At the annual TED Conference earlier this year, the room fell silent for the demonstration of a potentially groundbreaking technology: a “cone of silence” tool, currently in development, designed to block out all surrounding noise so conversation partners can hear only each other’s voices. While many of us in attendance were impressed by the simulated “quiet table” in a busy restaurant, I could only wonder how much of each speaker’s voice might be lost along the way.
This isn’t the only technology “cleaning up” human speech. Artificial intelligence applications are being developed to soften call center workers’ accents to boost sales, an idea critics have called dehumanizing. Other technologies are reducing tones of anger from upset callers, in an effort to make the experience of call center workers less difficult.
It may be just a matter of time before similar tools are used in daily workplace interactions. Businesses are increasingly distributed and global, with employees working remotely or in satellite offices. Those with different accents or less fluency in English may be tempted or even encouraged to put such tools to use when speaking with colleagues or presenting at a meeting. And people who don’t want to put up with angry coworkers may press a filter on their phones to weed out that pesky emotion.
The potential benefits of using these tools seem clear: speeding up communication, preventing misunderstandings, avoiding the biases that some people hold against certain accents, and more. But these same tools also threaten to silence crucial parts of the very voices they aim to enhance.
The Challenge of Fostering Trust
Because my focus is on reading the room in order to elicit the truth, I’m closely watching how this technological revolution could affect our ability to recognize dishonesty and what that could mean for workplaces.
This shift raises significant concerns. To understand why, we need to acknowledge the complexities of human speech. When we speak, we convey more than just the meanings of our words. Our speech patterns, pitches, vocal tones, cadences, and other vocal mannerisms help shape how others perceive us. They can also provide indications as to whether we’re telling the truth.
In my book Liespotting, I shared research on multiple crucial auditory cues.