In Issue 1/2021

Introducing forensic transcription

In this article, I introduce a form of transcription that may be unfamiliar to Tiro readers: transcription of forensic audio.

Forensic audio is spoken language recorded secretly (typically by police) and used as evidence in a criminal trial. The need for secrecy means the recording is often very indistinct, to the extent that professional verbatim reporters struggle to transcribe the content.

It might be expected that specialist transcribers with advanced training would do the work. This is far from the case. In many jurisdictions, transcripts of indistinct forensic audio are created by police investigating the case. Since police lack relevant expertise, their transcripts are not always reliable.

This creates a clear threat to justice. In Australia, the Research Hub for Language in Forensic Evidence aims to develop better ways of transcribing indistinct forensic audio.

To help explain our approach, it is useful to start from the more familiar situation of verbatim transcription. We can then consider forensic transcription by contrast.

Producing reliable transcripts of court proceedings

A common misconception is that verbatim transcription is a simple matter of writing down spoken words – like schoolroom dictation.

However, producing transcripts reliable enough to form an official record of court proceedings is a complex process. It requires individuals with advanced skill and professionalism operating within a controlled system with multiple components.

One essential component is a system of accreditation, to ensure practitioners meet appropriate standards of speed and accuracy. Another is a system to ensure they have the equipment and conditions they need to hear clearly.

Even with excellent skills and conditions, a further component is needed: a process to ensure practitioners have relevant background information about the context in which they are working. Without this contextual information, obtained via extended employment on one project or via research and briefing, they may be unable to transcribe unpredictable expressions.

Even with all these components in place, professional transcripts are liable to contain some errors. That is why the last part of the process requires the lawyers and judges who spoke during the proceedings to check the transcript to ensure their words have been represented appropriately.

From accuracy to reliability

This notion of transcripts being “appropriate” raises another misconception.

It is often thought that transcripts should be accurate in the sense of recording every syllable precisely. However, a better criterion might be reliability, in the sense of helping end-users achieve their purpose.

On this view, practitioners should balance completeness against usability (Eugeni, 2020a). Syllable-level accuracy is not just difficult to achieve; it is actually not useful. Too much detail can obscure the information that users want to retrieve from the official record (Voutilainen, 2018). This emphasises that the practitioner is not an invisible conduit, but a responsible agent making judgments in context (Eugeni, 2020b).

For these and other reasons, it is widely agreed that a transcript is not a neutral substitution for the speech it represents (Haworth, 2018). Some suggest a better term for transcription is “entextualisation” (Park & Bucholtz, 2009).

From court transcripts to forensic transcription

Forensic transcription is different from court reporting in multiple ways that make achieving reliability even more complex and challenging.

The most obvious difference is that forensic audio is often extremely indistinct. A transcript is needed, not as an official record but to help members of the court hear what is said, so they can use the evidence in reaching a verdict.

As already mentioned, when forensic audio is too hard for professional transcribers, many courts allow transcripts to be provided by police working on the case.

This may be surprising, but the reason is simple. Police can often hear more in indistinct audio related to their cases than the professionals can. In addition, their transcripts assist others to hear.

The question to consider next is why police can hear more than professionals. The law ascribes their ability to the fact that they have listened to the audio many times. That is clearly a misconception, since professionals can also listen many times.

The real answer involves recognition that transcription begins with the perception of speech. This is easy to overlook for court transcripts, where the audio is easy for anyone to hear, and it is easy to check what was said. However, with forensic transcription, “what was said” is uncertain, and most people cannot understand the audio. The fact that police seem to hear more than others reflects some interesting facts about human speech perception.

There’s more to speech perception than meets the ear

It is often assumed that perceiving speech is a simple matter of recognising sounds and forming them into words.

However, speech perception relies heavily on listeners having information about the context in which the sounds are heard. As we have seen, even professionals with excellent listening conditions still need to be briefed with contextual information.

With indistinct audio, the role of context is far stronger. While listeners without contextual information are unlikely to make out any words, those who know the context may hear confidently.

The real advantage police have in transcribing forensic audio, then, is their access to information about the context in which the recording was made.

The problem is that contextual information available to police may be incomplete or disputed. This, combined with their lack of training in transcription, means their transcripts are prone to error.

Ineffective evaluation of police transcripts

Of course, the law understands that police transcripts might not be fully reliable, and insists they must be checked by prosecution and defence lawyers. If necessary, the judge also checks them.

As we have seen, lawyers and judges have experience and confidence in checking verbatim transcripts of court proceedings. However, evaluating police transcripts of indistinct forensic audio is a significantly different task.

With court transcripts, the lawyers were present during the proceedings, so they know what was said. With forensic audio, they do not know for sure what was said. The only way they can check the transcript is by confirming they hear the words that are written.

The problem is that this kind of checking is ineffective. With indistinct audio, listeners’ ears tend to follow a transcript even if it is wrong (see multimedia demonstrations in Burridge, 2017). The result is that important errors may not be detected and corrected.

When we recall that the purpose of a forensic transcript is to assist listeners in determining the content of indistinct audio used as crucial evidence in a criminal trial, it is easy to see why Australian researchers are seeking a better way to transcribe indistinct forensic audio.


Misconceptions about transcription have created a paradoxical situation in which transcripts of the most indistinct audio are created and evaluated by the least qualified personnel (see Fraser, 2014).

The Research Hub for Language in Forensic Evidence (RHLIFE, 2021) aims to help overcome this threat to justice by designing professional accreditations specifically for transcription of indistinct forensic audio. Please contact us if you feel you can help our work. 

Professor Helen Fraser is the Director of the Research Hub for Language in Forensic Evidence at the University of Melbourne in Australia.


Burridge, K. (2017). The dark side of mondegreens: How a simple mishearing can lead to wrongful conviction. The Conversation.

Eugeni, C. (2020a). “What’s in a name?”. Tiro – The Journal of Professional Reporting and Transcription, 1.

Eugeni, C. (2020b). The reporter’s invisibility. Tiro – The Journal of Professional Reporting and Transcription, 2.

Fraser, H. (2014). Transcription of indistinct forensic recordings: Problems and solutions from the perspective of phonetic science. Language and Law/Linguagem E Direito, 1(2), 5-21.

Haworth, K. J. (2018). Tapes, transcripts and trials. International Journal of Evidence and Proof, 22(4), 428-450.

Park, J. S.-Y., & Bucholtz, M. (2009). Public transcripts: Entextualization and linguistic representation in institutional contexts. Text and Talk – An Interdisciplinary Journal of Language, Discourse & Communication Studies, 29(5).

RHLIFE (2021). Website of the Research Hub for Language in Forensic Evidence.

Voutilainen, E. (2018). The regulation of linguistic quality in the official speech-to-text reports of the Finnish parliament. CoMe – Studies on Communication and Linguistic and Cultural Mediation, 2(1), 61-73.
