There are many specialized tools for manual digital transcription. While it is not necessary to use them (one can also use Microsoft Word or Google Docs), knowing your options can make transcribing much less of a hassle and expand the research possibilities of your transcriptions.

One advantage of transcription software is that it combines audio playback with typing in a single interface, and that the resulting transcription is time-aligned, i.e. the start and end time of each text fragment (a word, a couple of words, a phrase or even a paragraph) is known. This time alignment makes it possible to search for spoken words and to generate subtitles.
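To make the benefit of time alignment concrete, here is a minimal Python sketch of what can be done once each fragment has a known start and end time: searching for a spoken word and writing subtitles in the SubRip (.srt) layout. The `Fragment` class and the function names are assumptions made for this illustration; they do not correspond to the export format of any particular transcription tool.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One time-aligned piece of transcript (hypothetical structure)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as HH:MM:SS,mmm, as used in .srt files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(fragments):
    """Turn time-aligned fragments into numbered subtitle blocks."""
    blocks = []
    for index, frag in enumerate(fragments, start=1):
        times = f"{srt_timestamp(frag.start)} --> {srt_timestamp(frag.end)}"
        blocks.append(f"{index}\n{times}\n{frag.text}\n")
    return "\n".join(blocks)


def search(fragments, word):
    """Return the (start, end) times of every fragment containing the word."""
    needle = word.lower()
    return [(f.start, f.end) for f in fragments if needle in f.text.lower()]
```

With fragments at the utterance level, `search` tells you where in the recording to listen, and `to_srt` produces text that most video players can load as subtitles. Neither is possible with a transcript that is "just text".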

Transcriptions made with an ordinary text editor (Notepad, Word, etc.) lack this time alignment: the result is just text. Combining this text with forced alignment, however, yields the same kind of time-aligned transcription as dedicated transcription software; this is explored on our page on post-transcription.

The "time resolution" of a transcription made with such software depends on the human editor, who may select short fragments (words or even phonemes) or rather long ones (paragraphs). Another frequently used method of time alignment is to place timestamps at a fixed interval (e.g. every 30 seconds or every 5 minutes).
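As an illustration of the fixed-interval method, the following Python sketch generates the markers a transcriber could place in the text for a recording of a given length. The bracketed `[HH:MM:SS]` marker style and the function names are assumptions chosen for this example, not a convention required by any tool.

```python
def format_marker(seconds: int) -> str:
    """Format whole seconds as a bracketed [HH:MM:SS] marker (assumed style)."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    return f"[{hours:02d}:{minutes:02d}:{secs:02d}]"


def interval_markers(duration_s: int, step_s: int = 30):
    """List the markers for a recording, one every step_s seconds."""
    if step_s <= 0:
        raise ValueError("step_s must be a positive number of seconds")
    return [format_marker(t) for t in range(0, duration_s + 1, step_s)]
```

For a recording of 95 seconds and the default 30-second interval, `interval_markers(95)` gives markers for 0, 30, 60 and 90 seconds. A shorter interval gives a finer time resolution at the cost of more interruptions in the text.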

Once the transcription is made (in a text editor, with or without timestamps, or in dedicated transcription software at the utterance level), a final forced alignment will determine the start and end times of each word more precisely and, if desired, the start and end times of the spoken phonemes.

For oral historians, time alignment at the utterance level will be "enough", but modern technology makes it simple to automatically add finer granularity to time-aligned transcriptions.


Here, I will list three tools designed specifically for transcription. The first is a transcription-centered text editor for plain transcripts, the second is useful for manually transcribing on a sentence-by-sentence basis, and the third goes even deeper, making it possible to transcribe on a word-by-word or phoneme-by-phoneme basis.