Programme
Friday, August 18th, 2023
09:00-09:10 Opening Session
- 09:00-09:05 SIGUL 2023 Opening Talk
- SIGUL Co-Chair: Maite Melero
- 09:05-09:10 Welcome Speech
- Local Organizer: Ailbhe Ní Chasaide
09:10-10:00 Keynote Speech (Chair: Maite Melero)
- Reclaiming Our Voices: Imagining Community-Led AI/ML Practices
- Subhashish Panigrahi
10:00-10:40 SIGUL Session 1: NLP (Chair: Mélanie Jouitteau)
- 10:00-10:20 (on-site) [Lg: Romanian (Romania)]
- Short-Cutting Manual Acquisition in Deep-Learning Deciphering of Old Documents (paper | slides)
- D. Cristea
- 10:20-10:40 (on-site) [Lg: Saraiki (Pakistan)]
- A Finite-State Morphological Analyzer for Saraiki (paper | slides)
- M. Alam, A. O’Neil, D.G. Swanson and F.M. Tyers
10:40-11:10 Coffee Break
11:10-12:30 SIGUL Session 2: Resources, Pronunciation & Dialect (Chair: Alex Cristia)
- 11:10-11:30 (remote) [Lg: 6 Tibeto-Burman and 4 Indo-Aryan (India)]
- Collecting Speech Data for Endangered and Under-resourced Indian Languages (paper | slides)
- R. Kumar, M. Takhellambam, A. Gope, B. Lahiri, S. Ratan, N. Mathur, S. Singh
- 11:30-11:50 (remote) [Lg: Kanak (New Caledonia)]
- A visit to the Cliffs of Jokin: A role for phonetizers in second language pronunciation and word learning, with an example from the under-resourced languages of New Caledonia (paper | slides)
- P. Welby, B. Bigi, A. Corral, F. Wacalie, G. Wattelez
- 11:50-12:10 (on-site) [Lg: Korebaju (Colombia)]
- An intra- and inter-dialectal study of Korebaju vowels (paper | slides)
- J.V. Rodríguez, N. Vallee, T. Chacon, C. Savariaux, S. Gerber
- 12:10-12:30 (on-site) [Lg: Nepali, Lyngam, Na-nasu, War, Na (Nepal/Tibet)]
- From ‘Snippet-lects’ to Doculects and Dialects: Leveraging Neural Representations of Speech for Placing Audio Signals in a Language Landscape (paper | slides)
- S. Guillaume, G. Wisniewski, A. Michaud
12:30-13:30 Lunch Break
13:30-14:30 SIGUL Poster Session (Chair: Satoshi Tamura)
- (remote) [Lg: Konkani (Indo-Aryan)]
- (on-site) [Lg: Maltese – Semitic (Sicilian Arabic)]
- (on-site) [Lg: North Sami (Finland)]
- (on-site) [Lg: Lule and North Sami (Finland)]
- (on-site) [Lg: English-Japanese (Japan)]
14:30-16:10 SIGUL Session 3: Language Technologies 1 (Chair: Dewi Jones)
- 14:30-14:50 (on-site) [Lg: Hungarian (Hungary)]
- What kind of multi- or cross-lingual pre-training is the most effective for a spontaneous, less-resourced ASR task? (paper | slides)
- P. Mihajlik, M.S. Kadar, G. Dobsinszki, Y. Meng, M. Kedalai, J. Linke, T. Fegyó, K. Mády
- 14:50-15:10 (on-site) [Lg: Irish – Celtic (Ireland)]
- Towards spoken dialect identification of Irish (paper | slides)
- L. Lonergan, M. Qian, N. Ní Chiaráin, C. Gobl, A. Ní Chasaide
- 15:10-15:30 (on-site) [Lg: Austrian-German (Austria)]
- Neural Speech Synthesis for Austrian Dialects with Standard German Grapheme-to-Phoneme Conversion and Dialect Embeddings (paper | slides)
- L. Gutscher, M. Pucher, V. Garcia
- 15:30-15:50 (remote) [Lg: Nepali (Nepal)]
- Nepali Text-to-Speech Synthesis using Tacotron2 for Mel-spectrogram Generation (paper | slides)
- S. Khadka, R. GC, P. Paudel, R. Shah, B. Joshi
- 15:50-16:10 (remote) [Lg: English-Japanese (Japan)]
- Low-Resource Japanese-English Speech-to-Text Translation Leveraging Speech-Text Unified-model Representation Learning (paper | slides)
- T. Tran, S. Sakti
16:10-16:30 Coffee Break
16:30-17:30 SIGUL Session 4: Language Technologies 2 (Chair: Alexis Michaud)
- 16:30-16:50 (on-site) [Lg: 25 Languages (Babel data)]
- Multilingual models with language embeddings for low-resource speech recognition (paper | slides)
- L.-M. Lam-Yee-Mui, W.B. Kheder, V.B. Le, C. Barras, J.-L. Gauvain
- 16:50-17:10 (on-site) [Lg: Bambara (Mali)]
- Low-resource ASR-free keyword spotting using listen-and-confirm (paper | slides)
- E. Westhuizen, M. Ribeiro, J.J. Vüren, P. Hidalgo-Sanchis, T. Niesler
- 17:10-17:30 (remote) [Lg: isiXhosa (South Africa)]
- Automatic transcription and (de)standardisation (paper | slides)
- N. Markl, E. Wallington, O. Klejch, T. Reitmaier, G. Bailey, J. Pearson, S. Robinson, M. Jones, P. Bell
17:30-18:15 SIGUL Discussion with Panel (Chair: Sakriani Sakti)
- Language resources and technology: Inclusion, Diversity, and Sovereignty
- Panelists:
- Subhashish Panigrahi (Keynote speaker) – on-site
- Dan Cristea (Author paper ID#14) – on-site
- Alexis Michaud (Author paper ID#12) – on-site
- Nina Markl (Author paper ID#9) – remote
- Ewan Dunbar (Zero Resources Speech Technology) – on-site
- Jiatong Shi (ML-SUPERB: Multilingual Speech Universal PERformance Benchmark) – remote
- Michael Auli (Meta AI Project on Scaling Speech Technology to 1000+ Languages) – remote
Saturday, August 19th, 2023
09:00-10:40 Special Session on Celtic Languages: Paper Presentation (Chair: Theodorus Fransen)
- 09:00-09:20 (on-site) [Lg: Breton – Celtic (Brittany)]
- Community internally driven corpus buildings: Three examples from the Breton ecosystem (paper | slides)
- M. Jouitteau
- 09:20-09:40 (on-site) [Lg: Scottish Gaelic – Celtic (Scotland)]
- A Transformer-Based Orthographic Standardiser for Scottish Gaelic (paper | slides)
- J. Huang, B. Alex, M. Bauer, D.S. Jasin, Y. Liang, R. Thomas, W. Lamb
- 09:40-10:00 (on-site) [Lg: Welsh – Celtic (Wales)]
- Developing Live Welsh Speech Recognition Models for a Commercial Product – a case study (paper | slides)
- P. Vangberg, L.S. Farhat, D. Jones, S. Kinahan
- 10:00-10:20 (on-site) [Lg: Irish – Celtic (Ireland)]
- The language communities as active partners in technology provisions: the Irish ABAIR experience (paper | slides)
- A. Ní Chasaide, N. Ní Chiaráin, H. Berthelsen, A. Murphy, L. Lonergan, J. Sloan, C. Wendler, C. McCabe, E. Barnes, C. Gobl
- 10:20-10:40 (on-site) [Lg: Irish – Celtic (Ireland)]
- ABAIR & ÉIST: A demonstration of speech technologies for Irish (paper | slides)
- A. Murphy, L. Lonergan, M. Qian, H. Berthelsen, C. Wendler, N. Ní Chiaráin, A. Ní Chasaide, C. Gobl
10:40-11:10 Coffee Break
11:10-12:00 Special Session on Celtic Languages: Keynote Speech (Chair: Maite Melero)
- New Challenges, Old Problems: AI and Under-resourced Languages
- Delyth Prys
12:00-12:45 Special Session on Celtic Languages: Discussion with Panel (Chair: Delyth Prys & Sakriani Sakti)
- Role of Local Communities & Policy/Support
- Panelists
- Mélanie Jouitteau (Rep. Breton-Brittany; author paper ID#16) – on-site
- Will Lamb (Rep. Scottish Gaelic – Scotland; author paper ID#7) – on-site
- Dewi Jones (Rep. Welsh-Wales; author paper ID#11) – on-site
- Ailbhe Ní Chasaide (Rep. Irish – Ireland; author paper ID#26) – on-site
- Eoin Ó Droighneáin (Digital plan for the Irish Language – Dept. of Tourism, Culture, Arts, Gaeltacht, Sport & Media) – on-site
- Apolonia Tamata (Rep. from other local community – Fiji Island) – remote
- Bhuvana Ramabhadran (IEEE Education and outreach activity: Language of the Worlds) – remote
- Eva Hachem (UNESCO for IDIL Program) – remote
12:45-14:00 Lunch Break
14:00-15:40 Joint SIGUL-SLaTE Session: Paper Presentation 1 (Chair: Ahmed Ali)
- 14:00-14:20 (on-site SLaTE) [Lg: Irish – Celtic (Ireland)]
- Mol an Óige: a phonological awareness and early literacy platform for Irish
- A. Ní Chasaide, E. Barnes, R. Errity, O. Mroz, O. Ní hAonghusa, S. Ní Chasaide, A. Giovannini, N. Ní Chiaráin
- 14:20-14:40 (on-site SLaTE) [Lg: Finnish and Finland Swedish (Finland)]
- New data, benchmark and baseline for L2 speaking assessment for low-resource languages
- M. Kurimo, Y. Getman, E. Voskoboinik, R. Al-Ghezi, H. Kallio, M. Kuronen, A. Zansen, R. Hilden, S. Kronholm, A. Huhta, K. Linden
- 14:40-15:00 (on-site SIGUL) [Lg: Māori (New Zealand)]
- Towards Automatic Marking of Pepeha: a Formulaic Māori Language Speech (paper | slides)
- C. Watson, P. Allen, P. Keegan, K. Mahelona, P.-L. Jones
- 15:00-15:20 (on-site SIGUL) [Lg: Irish – Celtic (Ireland)]
- Geabaire, the first Irish AAC system: voice as a vehicle for change (paper | slides)
- E. Barnes, J. Cummins, R. Errity, O. Morrin, H. Berthelsen, C. Wendler, H. Husca, N. Ní Chiaráin, A. Ní Chasaide
15:20-15:50 Coffee Break
15:50-16:50 Joint SIGUL-SLaTE Session: Paper Presentation 2 (Chair: Mark Gales)
- 15:50-16:10 (on-site SLaTE) [Lg: Irish – Celtic (Ireland)]
- An Bat Mírialta: Stateful Development of an Irregular Verb Bot for Irish
- J. Sloan, N. Ní Chiaráin
- 16:10-16:30 (on-site SLaTE) [Lg: Irish – Celtic (Ireland)]
- Filling the SLaTE: examining the contribution LLMs can make to Irish iCALL content generation
- N. Ní Chiaráin, N.R. Gunning, O. Nolan, M. Comtois
- 16:30-16:50 (on-site SLaTE) [Lg: 21 languages]
- ChatGPT + LARA = C-LARA
- B. Bedi, B. Chiera, C. Chua, N. Ní Chiaráin, M. Rayner, A. Simonson, R. Zviel-Girshin
16:50-17:35 Joint SIGUL-SLaTE Session: Discussion with Panel (Chair: Helmer Strik & Sakriani Sakti)
- Indigenous Language Revitalization through Education & Language Learning Technology
- Panelists
- Mikko Kurimo (Author paper SLaTE2) – on-site
- Catherine Watson (Author paper ID#27) – on-site
- Emily Barnes (Author paper ID#20) – on-site
- Neasa Ní Chiaráin (Author paper SLaTE3, SLaTE4, SLaTE5) – on-site
- Catia Cucchiarini (Nederlandse Taalunie [Dutch Language Union]) – on-site
- Trond Trosterud (The Arctic University of Norway) – remote
- Alex Cristia (Free Online School on Language Acquisition) – on-site
- Lorna Williams (Lil’wat from the St’at’yem’c First Nation; “How Universities Can Support Indigenous Language Revitalization and Maintenance”) – remote
17:35-18:00 Closing and Group Photos
- 17:35-17:40 SIGUL 2023 Closing Talk
- SIGUL Co-Chair: Sakriani Sakti
- 17:40-17:45 SLaTE 2023 Closing Talk
- SLaTE Co-Chair: Helmer Strik
- 17:45-18:00 Group Photos