CLARIN works with various communities to collaborate on research projects, to identify requirements for language technology services, and to support the curation of linguistic data.
CLARIN has embarked on a number of activities relating to working with oral history archive materials. These were launched at a workshop held in Oxford on 18-19 April 2016. The goal of the workshop was to initiate several strands of work, including:

  • extending and maintaining a registry of oral history collections in Europe,
  • making oral history collections visible via the Virtual Language Observatory,
  • holding a 'hackathon' to work on mini-projects with oral history and develop case studies of good practice,
  • publishing how-to guides including screencast videos on how to transcribe, align and search oral collections,
  • developing support services to make it easier to find and use Natural Language Processing (NLP) tools in oral history research,
  • developing new collaborative projects, including crowdsourcing elements to involve the public in transcription.


Oral history is a specific sub-discipline of history that has benefitted from the increased popularity of the personal narrative. Oral history can be defined as the practice of eliciting people’s personal memory of lived experiences that are absent in written archives, and documenting them with a recording device with the purpose of turning the interviews into historical sources.

The ‘digital turn’ has had an enormous impact on this archival practice. Much unique and valuable spoken language data currently reside in oral history archives, in the form of digital audio and video, written transcripts and non-digitized recordings. Speech and language technologists have developed various software tools and platforms for analysing and exploring the layers of meaning in spoken data. But despite the large amount of research carried out in numerous disciplines to create, explore and analyse oral history data, state-of-the-art software is often not exploited by researchers in the humanities and the social sciences. At the same time, oral history data are rather underused by linguists. CLARIN therefore organized a workshop to bring together those doing research on oral history archive data, including archivists, language technologists, social scientists and linguists.

As part of the CLARIN-PLUS project, a two-day workshop took place at the University of Oxford on Monday 18th and Tuesday 19th April 2016.

The focus of the workshop was on the following questions:

  • What language technologies exist and can be used to help explore and analyse collections?
  • What are the barriers to uptake for these tools, and what can CLARIN do to remove them?
  • How can we integrate disparate collections to make more coherent historical collections, language corpora, and virtual collections?
  • Can we identify themes that could be studied from a cross-European (comparative) perspective, and what could CLARIN do to support such studies?

The outcomes of the workshop included:

  • Proposals for new resource development and integration in CLARIN;
  • Proposals for new joint research projects;
  • Requirements for the tools and services that could support researchers working with oral history data, including ideas for tutorial development.