The human ear is still a lot more accurate than AI transcription

10th December 2018
Anna Flockett

In recent years, the quality of artificial intelligence-driven transcription services has improved considerably. For example, between 2016 and 2017, Microsoft reported it had reduced its error rate by 12%, which puts the accuracy of its automated transcriptions currently at 94.9%. Automated transcription is the conversion of a video or audio file into a text file by voice recognition software.
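Accuracy figures like these are commonly derived from a word error rate (WER): the number of word substitutions, deletions, and insertions needed to turn the automated transcript into a reference transcript, divided by the number of words in the reference. The article does not say how the quoted figures were measured, so the following is only a minimal Python sketch under the assumption that accuracy is taken as 1 minus WER; the function name and the sample sentences are illustrative.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two transcripts,
    divided by the number of words in the reference."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # word missing from the hypothesis
                curr[j - 1] + 1,     # extra word in the hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


# Illustrative sentences: one substituted word ("fox" -> "box")
# and one dropped word (the second "the").
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown box jumps over lazy dog"

wer = word_error_rate(reference, hypothesis)
print(f"WER: {wer:.1%}")           # WER: 22.2%
print(f"Accuracy: {1 - wer:.1%}")  # Accuracy: 77.8%
```

Under that same assumption, an accuracy of 94.9% corresponds to a word error rate of 5.1%, or roughly one wrong word in every twenty.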

However, despite the emergence of AI-driven apps such as Trint, Otter, and Wreally, which are growing in popularity thanks to improvements in automated speech-to-text technology, both language and tech experts believe that automated transcription services will not replace human-driven services at any time in the near future.

The biggest advantage that a human transcriber has over an automated counterpart is that the human ear is better attuned to a range of external factors, which makes a human less likely to misunderstand, omit, or completely skip words.

This is due to a number of variables, starting with the fact that humans are more adept at filtering out background noise. AI-driven services, by contrast, can produce a transcript with an error rate of 12% even from clean audio. Humans, who grow up around different cultural contexts and languages, have also been shown to be more adept than machines at identifying different accents.

Earlier in 2018, a study found that Amazon Alexa and Google Assistant had problems with accuracy when identifying different accents, irrespective of how fluent the speaker’s English might be: accuracy dropped by 2.6% for speakers with a Chinese accent and by as much as 4.2% for speakers with a Spanish accent.

The difficulties experienced by AI-driven services come from the fact that they have a dictionary-based vocabulary, which means they are only able to understand a series of short commands and a limited number of words. As a result, they are unable to recognise different accents, overlapping speech, or colloquial and slang terms. This puts them at a significant disadvantage to human-driven competitors, who are able to achieve accuracy rates of between 99% and 100%.

“We cannot doubt the fact that the advancements AI has made in the transcription sphere in recent years is phenomenal,” commented Peter Trebek, CEO of freelancer-driven transcription service provider GoTranscript. “However, with error rates over 5%, there is still some considerable improvements to be made until AI-driven solutions are considered to be on par with human transcribers.”

Without doubt, AI-driven solutions have made significant gains, with experts claiming up to an 80% reduction in transcription errors over the last decade. However, programmers will need to work more closely with language experts to bring these solutions on par with human transcribers and eliminate the current linguistic deficits.

© Copyright 2024 Electronic Specifier