Post-Transcription

Once the ASR has finished, the results can be downloaded. However, each ASR engine seems to have its own set of output formats: sometimes you get a custom XML file, sometimes a CSV file, and sometimes something else.

Moreover, when the output is XML, each ASR engine seems to use its own XML schema.

We therefore wrote software that reads the output of the ASR engines we support and transforms it into one of the following output formats:

  • SRT: the standard subtitle format used by nearly all existing video and audio players (like VLC)
  • VTT (WebVTT): the newer web version of the SRT format, supported by all modern browsers
  • Karaoke: an HTML file in which each recognised word is linked to the audio file; clicking on a word plays the audio from that word onwards, and words are highlighted as they are played.
  • CHA: the CHAT transcript format used in CHILDES (spring 2020: work in progress).
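
To illustrate how closely SRT and VTT are related, here is a minimal Python sketch that writes both formats from the same list of timed text segments. It is not the FromTo implementation: the segment layout (start time, end time, text) and the function names format_timestamp, to_srt and to_vtt are assumptions made for this example only. The visible differences are the decimal mark in the timestamps (comma in SRT, dot in VTT), the numbered cues in SRT, and the "WEBVTT" header line in VTT.

```python
from typing import List, Tuple

# Hypothetical intermediate representation (an assumption for this sketch):
# one tuple per subtitle cue: (start in seconds, end in seconds, text).
Segment = Tuple[float, float, str]


def format_timestamp(seconds: float, decimal_mark: str) -> str:
    """Format seconds as HH:MM:SS<mark>mmm (mark is ',' for SRT, '.' for VTT)."""
    total_ms = int(round(seconds * 1000))
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{decimal_mark}{ms:03d}"


def to_srt(segments: List[Segment]) -> str:
    """Build SRT text: numbered cues with comma-separated milliseconds."""
    blocks = []
    for index, (start, end, text) in enumerate(segments, start=1):
        timing = f"{format_timestamp(start, ',')} --> {format_timestamp(end, ',')}"
        blocks.append(f"{index}\n{timing}\n{text}\n")
    return "\n".join(blocks)


def to_vtt(segments: List[Segment]) -> str:
    """Build WebVTT text: 'WEBVTT' header, dot-separated milliseconds."""
    blocks = ["WEBVTT\n"]
    for start, end, text in segments:
        timing = f"{format_timestamp(start, '.')} --> {format_timestamp(end, '.')}"
        blocks.append(f"{timing}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    example = [(0.0, 1.5, "hello world"), (1.5, 3.25, "second line")]
    print(to_srt(example))
    print(to_vtt(example))
```

In such a design, each ASR engine's output would first be read into the common segment list, so that every output format only needs one writer.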

The software (FromTo) can be downloaded under DOWNLOADS below, for 64-bit Windows and 64-bit macOS.