As a follow-up to the CLARIN-PLUS workshops on Oral History (OH) archives in Oxford (April 2016) and Utrecht (December 2016), the Arezzo workshop is meant to finalize the setup of a transcription chain for OH interviews.
The envisaged outcome of the Arezzo workshop is an implementation plan for an OH transcription chain that can be integrated into the CLARIN infrastructure. Once the implementation plan is written, it will be submitted to CLARIN ERIC for final approval. The funding has been reserved already.
The second workshop (10-12 May 2017) in Arezzo is a two-day workshop for a maximum of 30 participants (by invitation only).
The main goals of the workshop are to:
- finalize the proposal for the "ideal transcription chain" for oral historians
- find necessary colleagues/partners
- identify possible (CLARIN) hosts for OH transcription services for the three languages.
The meeting takes place at the Department of Education, Human Sciences and Intercultural Communication – Siena University (Campus ‘Il Pionta’).
The University Campus is located at Viale Cittadini 33.
The venue is very near the Arezzo railway station, and the historical centre is less than 10 minutes away on foot.
Directions: Once you get to the railway station, walk through the underpass to Campo di Marte and take the exit on the right, walk straight to the traffic light, cross the road and walk in the opposite direction to the cars. After a few meters, you will find the Campus on your left.
Here you can find a virtual tour of the Campus.
Programme Wednesday 10 May
|14:15||Overview||Henk van den Heuvel||Background, Objectives, Agenda, targets of workshop|
|14:30||Transcription chain||Henk van den Heuvel||The various building blocks of a transcription chain, as discussed in Utrecht workshop.|
|14:45||AD-conversion||Arjan van Hessen||
ASR-tools: Full Speech Recognition for different languages
|15:00||ASR tools, English||Thomas Hain||Focusing on WebASR.org|
|15:20||ASR tools, Dutch||Roeland Ordelman||KALDI recognizer Dutch NISV|
|15:40||ASR tools, Dutch||Henk van den Heuvel||Web interface incl. OH version, incl. results|
ASR-tools: Alignment of audio and transcripts for various languages
|16:15||WebMAUS||John Coleman &
|16:30||Italian Alignment||Piero Cosi||The Italian Aligner|
|16:45||Experience feedback||Graham Gibbs||Participants report on their experiences with the ASR tools and alignment tools|
|17:15||DIY||Arjan van Hessen||Discussion about desired formats of the ASR-tools. What do you want to get back from the ASR-engine? Hands-on experience if necessary|
|18:30||Close of first day||Silvia Calamai||Are you hungry?|
Programme Thursday 11 May
|9:15||Buongiorno||Henk van den Heuvel||Summary of day 1 and overview of day 2|
Transcription: Guidelines, Standards, Editors, Crowdsourcing
|9:25||Transcription guidelines||Stef Scagliola & Silvia Calamai||Various standards, best practices for Oral History|
|9:45||Manual transcription correction services||Arjan van Hessen||What is there to be used by individual researchers (for example SubtitleEdit)|
|10:00||Web-based annotation editors||Christoph Draxler||Portal for individual researchers and for use in a crowdsourcing environment|
|11:15||Crowdsourcing||Arjan van Hessen||Crowdflower (in 2020 bought by Appen) crowdsourcing strategies and transcription correction|
|11:25||Discussion||All||Participants report on their experiences with transcription services and crowdsourcing platforms|
|12:00||Hands-on experience||Arjan van Hessen & Christoph Draxler||Correct your own transcriptions, set up a crowdsourcing experiment where people can help you with the transcriptions, and try out the transcription guidelines (are they good or not, and what is missing?)|
Metadata: Guidelines, Standards, Editors
|14:00||Metadata||Stef Scagliola & Louise Corti||Overview of standards, relevant categories, language of metadata, translation, etc.|
|14:30||Metadata editor||Henk van den Heuvel||A metadata editor as implemented at CLST|
|14:45||Discussion||All||Participants report on their experiences with metadata editing|
Presentations on data management/hosting in NL, UK, IT (persistent archiving options)
|15:15||National Infra: NL||Rene van Horik||About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues|
|15:30||National Infra: UK||Louise Corti||About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues|
|15:45||National Infra: IT||Monica Monachini||About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues|
|16:00||National Infra: CZ||Pavel Stranak||About the data infrastructure in the country and how our services could fit into that & access to data, tools, metadata for the research community at large & IPR / informed consent / ethical issues|
|16:20||Discussion||Henk van den Heuvel|
|18:00||Close of meeting||Silvia Calamai|
Programme Friday 12 May
|9:45||Buongiorno||Henk van den Heuvel||Summary of day 2 and overview of day 3|
|10:00||Wrapping up||Henk van den Heuvel||
|10:30||Proposal||Arjan van Hessen||Concluding actions for finalising the implementation proposal|
|11:45||Time schedule||Arjan van Hessen||Setup of the time schedules for the next months: from workshop to proposal.|
|12:15||Plan for a publication||Stef Scagliola||How to set up some publications based on the work done in this workshop?|
|14:00||Adjourn||Henk van den Heuvel & Silvia Calamai|
More images of the workshop can be found here.