Background

Most transcriptions made by humans are written in a word processor such as Microsoft Word. Word offers many features, is well known, is available on nearly every computer, and is widely regarded as the default word processor.
However, the result is stored as .doc or .docx, Microsoft's own file formats, and it is nearly always unstructured. People can write whatever they want in the transcription. They do so with other people (i.e. readers) in mind, but normally not with a computer in mind.

For example, one can write:

John: Can you tell me about that episode? (Mary starts crying and John waits for 5 minutes).

Mary: yes, I can. So,.......

For human readers it is clear that the part between the brackets is a comment and not part of the transcription. A computer, however, may not recognise this and will process the comment as if it were something that was said.

So, in order to do something with the transcripts other than reading them, the transcripts need to be structured.

TXT2CXML

To structure the hand-made transcripts, a software program called TXT2CXML was written. TXT2CXML (text-to-CXML) converts a transcription document into a more structured CXML file. CXML stands for Conversational XML and is a protocol created by Telecats (Enschede, NL) for the transcriptions of the Dutch Parliament.

What the program does

The program tries to work out who the speakers in a transcript are. It does so by assuming that the speaker is the first word of a new line if that word is followed by a colon and a space (= ": ").
Lines that do not start with a word followed by this colon+space are considered to belong to the previous turn.
Additionally, empty lines are skipped (and not recorded as a speaker).

Example

Original:
 john: I felt a sleep
 mary: what did you do?
I mean, once you realized that....

Result:
 john: I felt a sleep
 mary: what did you do? I mean, once you realized that....

Finally, the program counts the number of turns.
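
The rules described in this section can be sketched in a few lines of Python. This is only an illustration of the described behaviour, not the actual TXT2CXML source: the function name parse_turns and the regular expression are assumptions, and the handling of a label-less line that appears before the first speaker is a guess.

```python
import re

# Assumption: a speaker label is the first word of a line (a run of
# non-space characters) immediately followed by a colon and a space.
SPEAKER_RE = re.compile(r"^(\S+): (.*)$")


def parse_turns(text):
    """Split a plain-text transcript into (speaker, utterance) turns."""
    turns = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue  # empty lines are skipped
        match = SPEAKER_RE.match(line)
        if match:
            # A new speaker label starts a new turn.
            turns.append([match.group(1), match.group(2).strip()])
        elif turns:
            # No label: the line belongs to the previous turn.
            turns[-1][1] += " " + line
        # A label-less line before the first turn is dropped in this sketch.
    return [tuple(turn) for turn in turns]


sample = (
    " john: I felt a sleep\n"
    " mary: what did you do?\n"
    "I mean, once you realized that....\n"
)
turns = parse_turns(sample)
print(len(turns), "turns")
for speaker, utterance in turns:
    print(speaker + ": " + utterance)
```

Applied to the example above, the sketch yields two turns, with the unlabelled line appended to the turn of mary.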

The result is stored in an XML file with the file extension .cxml. It can be read and processed like any other XML file. However, to increase readability, an XSLT file is provided that converts the CXML file into a more readable HTML page (see Fig. 1).

[Figure: the CXML file (left) and the result in a browser when cxml.xslt is used (right)]
Fig. 1: The resulting CXML file (left, as shown in an editor like Oxygen) and the same file shown in the browser (right).
The accompanying XSLT file "transforms" the CXML content into a more readable HTML page.
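
Because a .cxml file is ordinary XML, any XML library can read it. The following Python sketch uses the standard-library ElementTree parser to count how often each element occurs. The element names in the sample (conversation, turn) are hypothetical stand-ins chosen for illustration; the real CXML element names may differ.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def tag_counts(xml_text):
    """Count how often each element name occurs in an XML document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())


# Hypothetical stand-in for the content of a .cxml file.
sample = """<conversation>
  <turn speaker="john">I felt a sleep</turn>
  <turn speaker="mary">what did you do? I mean, once you realized that....</turn>
</conversation>"""

counts = tag_counts(sample)
for tag, number in counts.items():
    print(tag, number)
```

For a file on disk, ET.parse(path).getroot() returns the same kind of root element.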

Preprocessing the transcription

Sometimes the transcription contains a lot of additional text, lines and other unusable information, such as the file location on the hard disk, footnotes and more. Text can be added, modified and deleted in TXT2CXML, but the program lacks the full functionality of a modern word processor. So, before saving the document as UTF-8 plain text, one may use the word processor for some search-and-replace operations (for example, unifying the way a city is written by replacing all mentions of ‘Rome’ with ‘Roma’).
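
For text that has already been exported, the same kind of search-and-replace can be done with a short script. The sketch below is an illustration only (the function name unify_spelling is an assumption); it replaces whole words, so that ‘Rome’ is changed but ‘Romeo’ is left alone.

```python
import re


def unify_spelling(text, old, new):
    """Replace whole-word occurrences of `old` with `new`."""
    pattern = re.compile(r"\b" + re.escape(old) + r"\b")
    # A function is used as replacement so that `new` is inserted literally.
    return pattern.sub(lambda match: new, text)


line = "Rome is old. Romeo went to Rome."
print(unify_spelling(line, "Rome", "Roma"))
```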

Using TXT2CXML

saveas
Fig. 2: Screenshot of “save as” command
As said, many transcription documents are made in MS Word. This program, however, cannot read .doc or .docx files: the transcription needs to be plain text encoded as UTF-8. This is easy to do within Microsoft Word: simply save the transcription as "plain UTF-8 text" by using the Save As... command (see Fig. 2).

The resulting .txt document can be read into the TXT2CXML program.
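
If it is unclear whether an exported file really is UTF-8, this can be checked before opening it. The Python sketch below is an illustration with assumed function names; it works on the raw bytes of a file (for example, the result of pathlib.Path(name).read_bytes()). The source encoding cp1252 in the usage example is only an assumption about how an older export might have been saved.

```python
def is_utf8(data):
    """Return True if the bytes form valid UTF-8 text."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def convert_to_utf8(data, source_encoding):
    """Re-encode bytes from another encoding as UTF-8."""
    return data.decode(source_encoding).encode("utf-8")


legacy = "café".encode("cp1252")  # assumed legacy export, not UTF-8
print(is_utf8(legacy))
print(is_utf8(convert_to_utf8(legacy, "cp1252")))
```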

 

Opening the UTF-8-Text-only transcription

The first step is to open the .txt file. The file is read, processed and shown in the "Original" tab.

Identifying speakers

Besides the text of the document, some metadata is "calculated", e.g. the number of speaker turns, the number of different speakers, etc. The speakers found are shown in a small table. By default, the program detects the speaker IDs (for example, "AvH: I was wondering..." → Speaker-ID = AvH). Name, gender, role and description of the speaker are initially set to unknown.

speakertable
Fig. 3: Speaker metadata table. The orange arrow points to the "Original" tab, the green arrow to the small table with the (3) speakers.
Abbreviation is the ID as written in the transcript, Name is the full name (if available and desirable), and Role is the role of the speaker in the interview.
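
As an illustration of how such metadata could be derived, the Python sketch below takes a list of (speaker ID, utterance) turns and builds the counts plus one table row per speaker, with every descriptive field set to "unknown". The function name, the dictionary keys and the sample turns are assumptions made for this example, not the internal data structures of TXT2CXML.

```python
def speaker_metadata(turns):
    """Summarise a list of (speaker_id, utterance) turns."""
    speakers = {}
    for speaker_id, _utterance in turns:
        # One row per speaker, in order of first appearance;
        # every descriptive field starts as "unknown".
        speakers.setdefault(speaker_id, {
            "name": "unknown",
            "gender": "unknown",
            "role": "unknown",
            "description": "unknown",
        })
    return {
        "turn_count": len(turns),
        "speaker_count": len(speakers),
        "speakers": speakers,
    }


# Made-up turns for illustration only.
turns = [("AvH", "I was wondering..."), ("R", "Yes?"), ("AvH", "Never mind.")]
meta = speaker_metadata(turns)
print(meta["turn_count"], "turns,", meta["speaker_count"], "speakers")
```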

Editing Speakers

There are two main ways of editing the speaker metadata. The first is to edit the transcription itself. For example, in Figure 3, three speakers are shown: Interviewer, Respondent, and Respondent 1. The easiest way to correct these speakers is to edit the transcription by changing all occurrences of Respondent 1 to Respondent (if they are indeed the same person). After editing the transcript, click “Recalculate Metadata” on the speaker metadata table. This recalculates the metadata and correctly shows 2 speakers (see Figure 4).

speakertable2
Fig. 4: The recalculated speaker metadata table
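
When a transcript is long, the first way of editing (changing the labels in the transcription itself) can also be done on the plain-text file before it is opened. The Python sketch below is an illustration with an assumed function name; it only touches a label at the start of a line that is followed by a colon and a space, so mentions of the name inside an utterance stay as they are.

```python
import re


def rename_speaker(text, old_id, new_id):
    """Rename a speaker label at the start of each line."""
    pattern = re.compile(
        r"^([ \t]*)" + re.escape(old_id) + r": ", flags=re.MULTILINE
    )
    # Keep any leading indentation and insert the new label literally.
    return pattern.sub(lambda match: match.group(1) + new_id + ": ", text)


transcript = (
    "Interviewer: How are you?\n"
    "Respondent 1: Fine.\n"
    "Respondent: And you?\n"
)
print(rename_speaker(transcript, "Respondent 1", "Respondent"))
```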

The other way to edit speaker metadata is to modify it directly in the speaker metadata table (see the green arrow in Fig. 3). To edit a cell, click on it and replace the old text with the new text. All cells can be modified except the Abbreviation (= speaker ID) in the first column.
The result is a more complete and informative set of information about the speakers in the interview (see Fig. 5).

speakertable2
Fig. 5: Modified speaker metadata table

Writing CXML

The final step is to save the transcript as a CXML file. This is done by clicking the “Write CXML” button. The filename is automatically the same as that of the input UTF-8 text-only transcription file, except that the file extension .txt is replaced by .cxml.
The file location is taken from the default settings in the Settings tab of the TXT2CXML program.
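
The naming rule can be expressed compactly with Python's pathlib. This is a sketch of the rule as described, not the program's own code; the function name and the example paths are hypothetical.

```python
from pathlib import Path


def cxml_path(input_file, output_dir):
    """Return the output path: same base name, extension .cxml."""
    return Path(output_dir) / Path(input_file).with_suffix(".cxml").name


# Hypothetical input file and output folder.
print(cxml_path("transcripts/interview01.txt", "output"))
```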

Background

Most transcriptions, if made manually, are written with a word processor such as Microsoft Word. Word in particular is well known, available on nearly every computer, and widely regarded as the default word processor. Beyond that, it has many features that allow customisation of transcripts.
However, the result is stored as .doc or .docx, Microsoft's own file formats, and it is nearly always unstructured (meaning transcripts are not standardised). People can write transcriptions however they want. They structure the data with other people (i.e. readers) in mind, but not normally with a computer in mind.

For example, one can write:

John: Can you tell me about that episode? (Mary starts crying and John waits for 5 minutes).

Mary: yes, I can. So,.......

For human readers it is clear that the part between the brackets is a comment and not part of the transcription. The computer, however, may not understand this and will process this piece of data along with the rest of the transcript as if it were something said.

So, in order to do something with the transcripts other than reading them, the transcripts need to be structured. TXT2CXML is a software program created with this problem in mind: it structures the data to allow for more diverse and fruitful analysis.

TXT2CXML

To structure the hand-made transcripts, TXT2CXML (text-to-CXML) converts a transcription document into a more structured CXML file. CXML stands for Conversational XML and is a protocol created by Telecats (Enschede, NL) for the transcriptions of the Dutch Parliament.

As said, many transcription documents are made in MS Word. So, the first step is to save the transcription document as "plain UTF-8 text" by using the Save As... command in your word processor.

saveas
The resulting .txt document can be read into the TXT2CXML program.

Preprocessing the transcription

Sometimes the transcription contains a lot of additional text, lines and other unusable information. Text can be added, modified and deleted in TXT2CXML, but the program lacks the full functionality of a modern word processor. So, before saving the document as UTF-8 plain text, one may use the word processor for some search-and-replace operations (for example, unifying the way a city is written by replacing all occurrences of ‘Rome’ with ‘Roma’).

Opening the UTF-8-Text-only transcription

The first step is to open the .txt file. The file is read, processed and shown in the Original tab.

Speakers

Besides the text of the document, some metadata is "calculated": the number of turns, the number of different speakers, etc. Moreover, the speakers found are shown in a small table. By default, the program detects the speaker IDs (for example, "AvH: I was wondering..." → Speaker-ID = AvH). Name, gender, role and description of the speaker are set to unknown.

speakertable
The user can edit these data in the speakers table. However, it makes sense to first edit the transcription and then recalculate the metadata. In this example, we see 4 speakers: Domanda, D, Risposta, and R. It is obvious that Domanda and D are the same speaker, and so are Risposta and R. So, the best thing to do is to replace Domanda: in the first transcription line with D: (and Risposta: with R:) and click the Recalculate button. This results in the same table but with 2 speakers, as can be seen below.

speakertable2

Next, the user can modify the cells in the table (except those in the first column, which holds the abbreviations/IDs). The result is a more complete set of information about the speakers in the interview.

speakertable2

Writing CXML

The last step is to write the transcript in the CXML format and save it on your computer. This is done by clicking the Write CXML button. The name of the file is the same as that of the input UTF-8 text-only transcription file, except that the file extension .txt is replaced by .cxml.
The location where the files are stored is a parameter on the Settings tab (the first tab of the program).
