munchen-2018 - Speech & Technology

ORAL HISTORY:
USERS AND THEIR SCHOLARLY PRACTICES IN A MULTIDISCIPLINARY WORLD

Main building of the Ludwig-Maximilians-University at Geschwister-Scholl-Platz in Maxvorstadt, München

Over the past two years, researchers from various backgrounds have been exploring the use of digital techniques and tools for working with oral history (OH) data. These endeavours have been supported by CLARIN (Common Language Resources and Technology Infrastructure), a European infrastructure for harmonizing and sharing data and tools for linguists, which aims to broaden its audience to include, for example, the social science community. It intends to do so by reaching out to scholars who are interested in cross-disciplinary approaches and who work with interviews, oral histories or other qualitative data.

CLARIN is interested in how it can better support the diversity of practices among social scientists, oral historians and digital humanists, and how it can lower barriers to the use and uptake of its resources and technologies. By supporting workshops, it can gather researcher-oriented feedback and suggestions for refining and improving its resources.

The third workshop in our programme, supported by CLARIN funding, will be held in Munich, Germany, on 19-21 September 2018 and will focus on the analysis phase of the research process. It builds on the work done during previous workshops on creating automatically generated transcripts. By bringing together CLARIN technologies for speech retrieval and alignment from different countries, a prototype ‘Transcription Chain’ has been developed, which will also be tested during the workshop.

The invitation-only workshop aims to gather evidence on scholars' everyday practices when working with OH data. It will do so by documenting and comparing these engrained practices, and by having participants venture into different approaches and methods for their data, namely by using unfamiliar annotation. For the purpose of this workshop we have agreed to work with audiovisual and textual data in four languages (Dutch, English, German and Italian) and will prepare some materials for participants to work with in language groups.

Some preparatory work has to be done before the workshop: completing a matrix of user approaches and tools for your specialty, analysing some extracts of data, and reviewing some documentation that we will send in advance. We estimate that this will take between 5 and 10 hours of your time. The 'homework' aims to capture participants' varied expertise, which will be drawn on in the workshop.
No specific technical knowledge is required.

All participants need to be able to commit to the full three days, from lunchtime on Wednesday 19 September to lunchtime on Friday 21 September. We are able to pay up to €275 towards your economy-class travel. Please ensure that your travel is booked well in advance, and check with us beforehand if you need to. Hotels and meals will be provided from Wednesday lunchtime to Friday lunchtime. We may be able to arrange pick-ups from airports if people's travel plans coincide.

Programme

The workshop programme is shown below.

Wednesday 19 September

Overview and demos


Morning	Travel time
14:00 - 14:15	Welcome and overview of the workshop
	Who we are (Organisers)
	General overview of CLARIN mission and tools (Louise Corti)
14:15 - 15:30	Very brief summary of landscape(s)
	What oral historians do with interviews (Norah Karrouche)
	What social scientists do with interviews (Maureen Haaker and Silvana di Gregorio)
	What computational linguists do with spoken corpora (Florentina Armaselu)
	What sociolinguists do with spoken corpora (Silvia Calamai and Stef Scagliola)
	What social signal processing scholars do with spoken data (Khiet Truong)
	Check on homework progress
15:30 - 16:00	Coffee
16:00 - 18:00	Demonstration of the TChain workflow (Christoph Draxler and Arjan van Hessen)
	Introduction to the workshop sessions/resources (Louise and Maureen)
	Expected outcomes and evaluation method of the workshop (Norah and Max Broekhuizen)
	Introductions in language groups
19:30	Dinner

Thursday 20 September

9:15 - 9:30	Assemble into language groups
9:30 - 11:00	Session 1: The Transcription Chain (TChain) (hands on), Arjan and Christoph
	Try out the TChain with pre-prepared data (available on a USB stick) and, if there is enough time, short segments of your own data (Dutch, English, German or Italian, no more than 10 minutes, WAV format)
	Audio-data conversion and segment selection (GoldWave, Audacity)
	Download the recognition results and post-process them
	10 minutes language group evaluation
11:00 - 11:30	Coffee
11:30 - 13:00	Session 2: Researcher annotation tools (hands on), Liliana Melgar and Silvana di Gregorio
	Assume we have alignment of audio and transcription, so that we can move on to annotation
	Introduce different annotation tools. Are there computational elements included? How are sound and moving image included?
	Try out ELAN and NVivo software with pre-prepared aligned data
13:00 - 14:00	Lunch
14:00 - 15:45	Continue Session 2: Researcher annotation tools, Liliana Melgar and Silvana di Gregorio
	Try out various software with pre-prepared aligned data
	Document experiences
	10 minutes language group evaluation
15:45 - 16:15	Coffee
16:15 - 18:00	Session 3: On-the-fly linguistic tools (hands on), Jeannine Beeken
	Try out linguistic tools (VOYANT, Stanford CoreNLP) on oral history data that has been preprocessed
	How do linguistic research questions relate to the social science and history paradigms?
	What kind of meaning can be extracted from this?
	When does scale matter for the results to be meaningful?
	Document experiences
	10 minutes language group evaluation
19:30	Dinner

Friday 21 September

9:00 - 11:00	Session 4: Preprocessing and textometry tools (hands on), Florentina
	Try out a linguistic tool (TXM) on workshop oral history data that has been preprocessed
	What kind of meaning can be extracted from this?
	How is this different from the tools in Session 3?
	Document experiences
	10 minutes language group evaluation
11:00 - 11:15	Coffee
11:15 - 13:15	Session 5: Emotion recognition tools: audio and video, Khiet Truong
	Presentation and demonstration of Praat and openSMILE in combination with ELAN
	Investigating silences, tone, emotion and facial emotion
	Group work
13:15 - 13:30	Short summary and adjourn
13:30 - 14:00	Optional lunch

München Workshop 2018

Agenda München 2018

Oral History under scrutiny in München

Cross disciplinary overtures between linguists, historians and social scientists

Transcription Chain

Hiccups: Scalability and Conversion

Landscape of disciplines

Researcher Annotation Tools

On-the-fly linguistic tools (no pre-processing)

Linguistic tools with pre-processing

Emotion recognition tools

Summary

Some final quotes from participants

Acknowledgement

