As a follow-up to the CLARIN-PLUS workshops on Oral History (OH) archives in Oxford (April 2016) and Utrecht (December 2016), the Arezzo workshop is meant to finalize the setup of a transcription chain for OH interviews.
The envisaged outcome of the Arezzo workshop is an implementation plan for an OH transcription chain that can be integrated into the CLARIN infrastructure. Once the implementation plan is written, it will be submitted to CLARIN ERIC for final approval. The funding has been reserved already.
The second workshop (10-12 May 2017) in Arezzo is a two-day workshop for a maximum of 30 participants (by invitation only).
The main goals of the workshop are to:
- finalize the proposal for the "ideal transcription chain" for oral historians
- find necessary colleagues/partners
- identify possible (CLARIN) hosts for OH transcription services for the three languages.
Location
The meeting takes place at the Department of Education, Human Sciences and Intercultural Communication, Siena University (Campus ‘Il Pionta’).
The campus is located at Viale Cittadini 33 (tel. +39-0575-9261).
It is very near the Arezzo railway station, and the historical centre is less than 10 minutes away on foot.
Directions: Once you get to the railway station, walk through the underpass to Campo di Marte and take the exit on the right, walk straight to the traffic light, cross the road and walk in the opposite direction to the cars. After a few meters, you will find the Campus on your left.
Here you can find a virtual tour of the Campus.
Programme
Below is the draft version of the workshop programme. The times are only indicative: some parts may need more or less time than expected.
Wednesday 10 May
14:00 | Welcome | Silvia Calamai | |
14:15 | Overview | Henk van den Heuvel | Background, Objectives, Agenda, targets of workshop |
14:30 | Transcription chain | Henk van den Heuvel | The various building blocks of a transcription chain, as discussed in Utrecht workshop. |
14:45 | AD-conversion | Arjan van Hessen | AD-conversion-tools |
ASR-tools: Full Speech Recognition for different languages | |||
15:00 | ASR tools, English | Thomas Hain | Focusing on WebASR.org |
15:20 | ASR tools, Dutch | Roeland Ordelman | KALDI recognizer Dutch NISV |
15:40 | ASR tools, Dutch | Henk van den Heuvel | Web interface incl. OH version, incl. results |
16:00 | BREAK | ||
ASR-tools: Alignment of audio and transcripts for various languages | |||
16:15 | WebMAUS | John Coleman & Christoph Draxler | WebMAUS Aligner |
16:30 | Italian Alignment | Piero Cosi | The Italian Aligner |
16:45 | Experience feedback | Graham Gibbs | Participants report on their experiences with the ASR tools and alignment tools |
17:15 | DIY | Arjan van Hessen | Discussion about desired formats of the ASR-tools. What do you want to get back from the ASR-engine? Hands-on Experience if necessary |
18:30 | Close of first day | Silvia Calamai | Are you hungry? |
19:30 | Dinner |
Thursday 11 May
9:15 | Buon Giorno | Henk van den Heuvel | Summary of day 1 and Overview of day 2 |
Transcription: Guidelines, Standards, Editors, Crowdsourcing | |||
9:25 | Transcription guidelines | Stef Scagliola & Silvia Calamai | Various standards, best practices for Oral History |
9:45 | Manual transcription correction services | Arjan van Hessen | What is there to be used by individual researchers (for example SubtitleEdit) |
10:00 | Web-based annotation editors | Christoph Draxler | Portal for individual researchers and for use in a crowdsourcing environment |
11:00 | BREAK | ||
11:15 | Crowdsourcing | Arjan van Hessen | CrowdFlower (later renamed Figure Eight and acquired by Appen in 2019): crowdsourcing strategies and transcription correction |
11:25 | Discussion | All | Participants report on their experiences with transcription services and crowdsourcing platforms |
12:00 | Hands-on experience | Arjan van Hessen & Christoph Draxler | Correct your own transcriptions, set up a crowdsourcing experiment where people can help you with the transcriptions, and try out the transcription guidelines (are they good, and what is missing?) |
13:00 | LUNCH | ||
Metadata: Guidelines, Standards, Editors | |||
14:00 | Metadata | Stef Scagliola & Louise Corti | Overview of standards, relevant categories, language of metadata, translation etc |
14:30 | Metadata editor | Henk van den Heuvel | A metadata editor as implemented at CLST |
14:45 | Discussion | All | Participants report on their experiences with metadata editing |
15:00 | BREAK | ||
Presentations on data management/hosting in NL, UK, IT ((persistent) archiving options) | |||
15:15 | National Infra: NL | René van Horik | About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues |
15:30 | National Infra: UK | Louise Corti | About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues |
15:45 | National Infra: IT | Monica Monachini | About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues |
16:00 | National Infra: CZ | Pavel Straňák | About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues |
16:20 | Discussion | Henk van den Heuvel | |
18:00 | Close of meeting | Silvia Calamai | |
19:30 | Dinner |
Friday 12 May
9:45 | Buongiorno | Henk van den Heuvel | Summary of day 2 and overview of day 3 |
10:00 | Wrapping up | Henk van den Heuvel | |
10:30 | Proposal | Arjan van Hessen | Concluding actions for finalising the implementation proposal |
11:30 | BREAK | ||
11:45 | Time schedule | Arjan van Hessen | Setup of the time schedules for the next months: from workshop to proposal. |
12:15 | Plan for a publication | Stef Scagliola | How to set up some publications based on the work done in this workshop? |
12:45 | LUNCH | ||
14:00 | Adjourn | Henk van den Heuvel & Silvia Calamai |
Participants
As of 1 May 2017, the following people have confirmed their attendance at the workshop.
Name | Country | Affiliation | Expertise |
Silvia Calamai | IT | Dipartimento di Scienze della formazione, scienze umane e della comunicazione interculturale | Linguistics |
Bianca Pastori | IT | Dipartimento Culture e civiltà, Università di Verona | Oral History |
Riccardo Del Gratta | IT | ILC | Infrastructure |
Piero Cosi | IT | Institute of Cognitive Sciences and Technologies | Infrastructure |
Franco Cutugno | IT | Dipartimento di Ingegneria elettrica e delle Tecnologie dell'Informazione | Language and Speech Technology |
Monica Monachini | IT | Institute for Computational Linguistics “A. Zampolli” | Infrastructure |
Stef Scagliola | LU | Faculté des Lettres, des Sciences Humaines, des Arts et des Sciences de l'Education | Oral History |
Pavel Straňák | CZ | Institute of Formal and Applied Linguistics, Charles University, Czech Republic | Language and Speech Technology |
Christoph Draxler | DE | Ludwig-Maximilians-University of Munich, Institut für Phonetik und Sprachverarbeitung, Munich, Bavaria, Germany | Language and Speech Technology |
Louise Corti | UK | UK Data Archive | Oral History |
Graham Gibbs | UK | University of Huddersfield | Oral History |
Maureen Haaker | UK | Department of Sociology, University of Essex | Oral History |
John Coleman | UK | Phonetics Laboratory, University of Oxford | Language and Speech Technology |
Martin Wynne | UK | Bodleian Libraries, University of Oxford | Infrastructure |
Thomas Hain | UK | Speech and Hearing Group at the Department of Computer Science, University of Sheffield | Language and Speech Technology |
Afelonne Doek | NL | International Institute of Social History | Oral History |
Roeland Ordelman | NL | Netherlands Institute of Sound and Vision | Language and Speech Technology |
René van Horik | NL | DANS | Infrastructure |
Leon Wessels | NL | CLARIN-EU | Infrastructure |
Norah Karrouche | NL | CLARIAH - WP5 | Oral History |
Henk van den Heuvel | NL | CLST, Radboud University | Language and Speech Technology |
Arjan van Hessen | NL | HMI, University of Twente | Language and Speech Technology |
ASR comments
Comments on the ASR-engines
Below are the comments of various participants on the three ASR engines (English, Dutch and Italian). The majority of the comments deal with the Sheffield WebASR engine, some with the Radboud ASR engine, and just one with the Italian ASR engine. This is because the Italian ASR engine was not online at the time of the workshop; it could only be accessed via Piero Cosi's computer.
English
Before you can use the web-based engine, you need a registration approved by the speech group of the University of Sheffield. WebASR is at this moment the best-featured ASR engine.
I tried it with some YouTube files. The first was a journalistic interview and the result is good, apart from some misunderstandings. The two voices, interviewer and interviewee, are not separated. The second file was an oral history interview with an elderly Irish woman speaking about her family and origins (probably the worst choice I could have made) and the result is, understandably, awful. It would be better to have a .doc or similar output file.
Bianca Pastori
Activated login; uploaded an mp3 of "Child in Time" by Deep Purple. Updated the XML metadata to select the segments to transcribe. The transcription was quite distant from the original text. No problem, that was to be expected. I'll try another wav at work.
Riccardo Del Gratta
Activated login. Uploaded 1 x mp3 (anthropology) and 2 x wav files (popper and HLS). Didn't realise one could upload metadata, e.g. an interview summary (though for a short clip that might not be so useful?). Got a zip, ttml (didn't understand the format?), XML and pdf files. It actually did a very good job indeed, although the speech was very clear and pronounced. No punctuation!! However, once downloaded, the file names are not intuitive; it would be better if they included the name of the input file. Hard to match up. Also extract as rtf as another default setting? XML = really detailed metadata with phonetic output! Converted the pdf to Word to compare against my own gold standard transcript. Pretty good!
Louise Corti
Using ‘Alex interview start.wav’ gives ‘Alex interview start Transcript.pdf’. It does not pick up different speakers and misses the start of the text.
Using ‘Lateral_flow_test.wav’ gives a good transcript of a single speaker, but with some missing repeats and no acronyms.
The lecture recording did less well: poor sound at the start and two speakers. Lots of mistakes, but editable text.
Graham Gibbs
When I first tried to use this programme, I had several 500 errors. I did eventually get this to work for my audio files after trying at a different time, however the transcripts were not particularly useful. This, however, is clearly down to the quality of audio - these were mp3s with background noise, elderly voices, etc. I will check other audio files on here to experiment with best practices for sound recording. Very valuable resource - will run all audio files through here first (and hopefully find a good method for recording reliably high quality audio).
Maureen Haaker
Signed up and got a reply after a few hours. Is there a person or a machine that responds? Successfully uploaded audio and downloaded the outcome as a PDF: running text with no speaker turns. It took me 1 hour to correct and edit the transcription of 5 minutes of audio. I had no clue about alternative output formats that already meet this requirement; if the output is presented in XML, I can't make sense of it. I need something readable with speaker turns and time codes; this saves me time and makes the use of ASR an advantage.
Second attempt, after the presentations: I tried to upload an English-language focus group to WebASR. First I had to convert an MPEG audio file of 1.15 hours into something manageable. I used Movavi converter on my own Mac, and that was easy. Then I tried Audacity to cut a 5-minute fragment. That was a tedious task, as the functionality to mark the segment was not clear and I had to ask for help. I then managed to cut a fragment, but the Audacity interface did not make clear whether it was really 5 minutes. Then I uploaded the audio and tried to create a text file with my notes on the focus group, to upload it and improve the performance of the ASR. When I asked Arjan, it appeared that this function is not supported yet. All this took me an entire hour. I had expected that the speed with which the document was uploaded would reflect the speed of processing, but it took more time: it has been processing for about 30 minutes now. Then I asked Thomas and it appeared that I had uploaded not 5 minutes but 1.15 hours. The problem is that in Audacity you must not save the project but export the fragment; if you forget, it will give the original file a new name.
Stef Scagliola
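The fragment-cutting step described above can also be scripted, which makes it easy to confirm that the exported file really has the intended length. The following is only a minimal sketch using Python's standard `wave` module; the function name `export_fragment` is hypothetical, it assumes the recording has already been converted to an uncompressed WAV file, and it is not part of WebASR or Audacity.

```python
import wave


def export_fragment(src_path, dst_path, start_s=0.0, duration_s=300.0):
    """Copy a fragment of a WAV file to a new file and return its length in seconds.

    Hypothetical helper: assumes an uncompressed (PCM) WAV input.
    """
    with wave.open(src_path, "rb") as src:
        rate = src.getframerate()
        # Clamp the start position so it never lies beyond the end of the file.
        start_frame = min(int(start_s * rate), src.getnframes())
        src.setpos(start_frame)
        # readframes returns fewer frames if the file ends before the fragment does.
        frames = src.readframes(int(duration_s * rate))
        with wave.open(dst_path, "wb") as dst:
            dst.setnchannels(src.getnchannels())
            dst.setsampwidth(src.getsampwidth())
            dst.setframerate(rate)
            dst.writeframes(frames)
    # Re-open the result and report its real duration as a sanity check.
    with wave.open(dst_path, "rb") as out:
        return out.getnframes() / out.getframerate()
```

Calling `export_fragment("focus_group.wav", "fragment.wav", duration_s=300)` (file names are placeholders) writes the first five minutes to a separate file and returns the duration of what was written, so an accidental upload of the full recording is easier to spot.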
Several errors in the beginning; Thomas helped me in the workshop. Results are not as good as the Dutch results.
Afelonne Doek
- works fine for me
- extra info on the expected (remaining) processing time would be nice
- names of output files have no correspondence to the name of the audio input file, which is confusing!
Henk van den Heuvel
Dutch
Dutch ASR-engine
Before you can use the web-based engine, you need a registration approved by the CLST of Radboud University. Currently only 16 kHz, 16-bit, mono WAV files are supported.
Easy to use, but I had no idea how to convert the file. I tried oTranscribe immediately after this, and then went back. It worked smoothly, but the outcome was not very good: short interventions are missing and the transcription is bad. It would be useful to have an info sheet that summarizes the weaknesses of ASR. I would then have understood from the outset why some parts of the interview were badly transcribed (speaker turns, different languages in one recording, etc.).
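Whether a file meets this input requirement can be checked before uploading. The following is a minimal sketch using Python's standard `wave` module; the function name `check_wav_format` is hypothetical and not part of the Radboud service, and it only inspects the WAV header without converting anything.

```python
import wave


def check_wav_format(path, rate=16000, sample_width_bytes=2, channels=1):
    """Return a list of mismatches against 16 kHz, 16-bit, mono.

    Hypothetical helper; an empty list means the header matches.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != rate:
            problems.append(
                f"sample rate is {wav.getframerate()} Hz, expected {rate} Hz"
            )
        if wav.getsampwidth() != sample_width_bytes:
            problems.append(
                f"sample width is {8 * wav.getsampwidth()} bit, "
                f"expected {8 * sample_width_bytes} bit"
            )
        if wav.getnchannels() != channels:
            problems.append(
                f"file has {wav.getnchannels()} channel(s), expected {channels}"
            )
    return problems
```

A file that does not match would still need to be converted with an audio tool of your choice; this sketch only reports what is wrong.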
Norah Karrouche
First attempts failed because it was not clear what to upload and what went wrong. On-site help during the workshop in Arezzo made it very easy. Some very good results, some poorer results. Not sure whether the results are influenced more by the lack of domain-specific vocabulary or by the quality of the recording.
Afelonne Doek
Some remarks on missing features or unclear manuals after testing the engine during and after the Arezzo workshop.
- Input audio is limited to 16 kHz, 16-bit WAV (this should be mentioned)
- Project names and file names: alphanumeric characters should be allowed
- The program suggests multiple audio file upload but works on single files only
- Output formats should be clearly explained
Henk van den Heuvel
Italian
Italian ASR-engine
The Windows-based engine has to be downloaded and installed on a personal computer.
The results Piero showed are encouraging. It is important to stress the quality of the audio documents that can be useful for training the system (e.g. no overlapping voices, good audio quality, standard speech, orthographic accuracy in the transcription, correct identification of disfluencies). I think a collection of recent and brand-new interviews could help this process.
Bianca Pastori