Four types of audio transcription and when to use them

Posted July 1, 2021

Thanks to machine learning, the potential uses of audio transcription are rapidly expanding. To get the most from this increasingly important service, however, it’s vital to have a clear understanding of what it involves. There is a wide variety of audio transcription services, each with different benefits and limitations, and the type you choose can have a major impact on your project.

In this article, we’ll help you to figure out which type of audio transcription is the best for you. By taking a detailed look at each of the four main services, you’ll be able to take the first step towards a transcription that perfectly suits your audio files.

Verbatim transcription

Also called true verbatim or strict verbatim transcription, this is one of the most detailed types of transcription available. It aims to capture all filler words, pauses and non-verbal communication contained within the recording, as well as all the words uttered by the speaker. As a result, verbatim transcripts are usually long and extremely detailed. If the audio contains multiple speakers, verbatim transcribers may also note interruptions, conversational affirmations such as “right” or “uh huh” and overlapping conversation.

For example:

Included: All words and non-verbal communication, such as laughter, pauses, and coughing. Ambient background noise such as talking or noises from an audience may also be included.

Not included: Noises that are irrelevant to the transcript or that may interfere unnecessarily with the flow of reading, such as police sirens, thunder or construction work.

Edited transcription

Edited transcription, also called clean verbatim transcription, is often the default for transcription providers. Just like verbatim transcription, it is committed to preserving the meaning of what was said. A good edited transcription will not paraphrase the speech or change its meaning in any way. However, it doesn’t aim to capture the way that the speaker communicates. Stammering, filler words such as “like” or “you know,” and unnecessary non-verbal communication are usually left out because they add little to the meaning. Instead, the aim of edited transcription is to strike a balance between completeness and readability.

For example:

Included: All essential text uttered by the speaker.

Not included: Ambient noise, sounds, and non-verbal communication. Filler words or phrases which don’t affect meaning are also omitted. For audio with multiple speakers, interruptions and affirmations may also be excluded.
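To make the difference from verbatim transcription concrete, here is a minimal Python sketch of the kind of cleanup an edited transcript applies. It is an illustration under stated assumptions, not a description of how any provider works: the function name `to_edited`, the `FILLERS` list, and the convention that non-verbal events appear in square brackets are all hypothetical.

```python
import re

# Hypothetical conventions, not taken from any provider's style guide:
# non-verbal events are written in square brackets, and FILLERS is an
# illustrative list that a real style guide would replace.
FILLERS = ("you know", "um", "uh")


def to_edited(verbatim: str) -> str:
    """Reduce one verbatim line to a rough edited (clean verbatim) line."""
    # Drop bracketed non-verbal tags such as [laughs] or [pause].
    text = re.sub(r"\[[^\]]*\]", "", verbatim)
    # Drop filler words or phrases together with the commas around them.
    fillers = "|".join(re.escape(f) for f in FILLERS)
    text = re.sub(rf",?\s*\b(?:{fillers})\b,?", "", text, flags=re.IGNORECASE)
    # Collapse stammered repeats such as "I, I" into a single word.
    text = re.sub(r"\b(\w+)(?:,? \1\b)+", r"\1", text, flags=re.IGNORECASE)
    # Tidy the spacing left behind by the removals.
    text = re.sub(r"\s+", " ", text)
    text = re.sub(r"\s+([,.?!])", r"\1", text)
    return text.strip()


print(to_edited("Um, I, I think the launch went, uh, really well [laughs]."))
# prints: I think the launch went really well.
```

A rule-based pass like this is deliberately crude. It would also collapse legitimate repeats such as “had had,” and it cannot tell whether a word like “like” is filler or meaning, which is why this judgment is normally left to experienced transcribers.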

Intelligent transcription

Sometimes referred to as intelligent verbatim transcription, this service is largely concerned with transcribing audio into concise, readable texts. Here, transcribers have much more leeway to edit and remove portions of the speech than in the varieties of transcription outlined above. Instead of clinging solely to the speech as it was uttered, intelligent transcription aims to communicate the meaning of the speech in the most natural way possible. This can include the removal of repeated sentences and phrases or even the grammatical restructuring of what was said.

For example:

Included: A version of the recording which prioritizes clear communication of the meaning of what was said. Punctuation and grammar errors will be corrected where appropriate.

Not included: Noise and non-verbal communication, as well as all filler words, repetitions and off-topic content. The transcript may not match the audio word for word, but it should convey the meaning of what was said.

Phonetic transcription

Phonetic transcription is a specialized form of transcription which differs significantly from the other types of audio transcription mentioned above. It aims to capture the way that speakers utter sounds, with a particular focus on pronunciation of words. This can also extend to annotation of the way that the speaker’s tone rises and falls, as well as how different sounds overlap within the audio. Phonetic transcription requires a specialized notation system to be performed properly, as can be seen in the example below:

Included: A complete catalogue of all the sounds uttered by speakers in the audio, written in a phonetic alphabet. Further annotations detailing the speaker’s intonation may also be included. It’s worth confirming whether a more traditional transcript will also be included with your phonetic transcript.

Not included: Noises that may interfere with the transcript.

Which type of transcription should I use?

With such varied output across these different transcription services, it’s obvious that each one will suit different project types. You should carefully consider which will give you the best ROI, as the contents of your transcript can vary widely across transcription types. As a general rule:

Verbatim transcription suits heavily detailed projects that require analysis of a complete transcript, such as legal work.
Edited transcription creates clean, professional texts that are both formal and comprehensive. Texts that need publishing often benefit from this type of transcript.
Intelligent transcription delivers transcripts that are clear and easy to understand. As such, it suits a wide variety of general business purposes, where documents need to be quickly read, digested, and shared.
Phonetic transcription explains how something was said on an extremely detailed level. It’s generally used for specialized projects in academia or linguistics.

Certain industries also gravitate towards a particular type of transcription. Check out the graph below to see how other businesses in your industry normally handle their recordings:

The information above should help you to get your transcription plans in motion. However, choosing the correct service is only the beginning of the process. To maximize your transcription ROI, you’ll need experienced transcribers to carry out the work for you. That’s where TELUS International can help. Check out our AI Data Solutions to learn more about a variety of audio projects we can assist you with.
