Principal scientist, Naver Labs Europe
Self Supervised Representation Learning for Pre-training Speech Systems
Self-supervised learning from large amounts of unlabeled data has been successfully explored for image processing and natural language processing. Since 2019, recent works have also investigated self-supervised representation learning from speech, and were notably successful in improving performance on downstream tasks such as speech recognition. These recent works suggest that it is possible to reduce dependence on labeled data for building speech systems through acoustic representation learning. In this talk I will present an overview of these recent approaches to self-supervised learning from speech and describe my own investigations into using them for spoken language processing tasks where the amount of training data is limited.
Laurent Besacier has been a principal scientist at Naver Labs Europe since January 2021, where he leads the Natural Language Processing (NLP) group. Before that, he was a full professor at Université Grenoble Alpes (UGA) from 2009, where he led the GETALP group (natural language and speech processing) for eight years. Laurent is still affiliated with UGA. His main research expertise and interests lie in the fields of natural language processing, automatic speech recognition, machine translation, under-resourced languages, machine-assisted language documentation, and the evaluation of NLP systems. Laurent is also involved in MIAI (Grenoble AI institute), where he holds a chair simply entitled 'AI & Language'.
Associate Professor, National Engineering Laboratory for Speech and Language Information Processing (NEL-SLIP), University of Science and Technology of China (USTC)
Environmentally Robust Speech Recognition: A Corpus-Based Perspective
In the past two decades, with the development of big data and machine learning technologies, great progress has been made in automatic speech recognition (ASR), from theoretical paradigm shifts to many real-world applications. In this talk, we will explore the environmental robustness of ASR from a corpus-based perspective. Corpus development plays a very important role in the advancement of ASR technologies: for example, progress slowed on the simulated, single-speaker, single-microphone digit-string recognition tasks of the early stage, and accelerated with the realistic, multi-speaker, multi-array, multimodal large-vocabulary conversational speech recognition tasks of the current stage. The main challenges and future research trends in this area will also be discussed.
Jun Du received the B.Eng. and Ph.D. degrees from the Department of Electronic Engineering and Information Science, University of Science and Technology of China (USTC), in 2004 and 2009, respectively. From July 2009 to June 2010, he was with iFlytek Research, leading a team to develop the ASR prototype system of the mobile app “iFlytek Input”. From July 2010 to January 2013, he was with Microsoft Research Asia (MSRA) as an Associate Researcher, working on handwriting recognition, OCR, and speech recognition. Since February 2013, he has been with the National Engineering Laboratory for Speech and Language Information Processing (NEL-SLIP), USTC. His main research interests include speech signal processing and pattern recognition applications. He has published more than 150 conference and journal papers, with more than 4200 citations on Google Scholar. His team is one of the pioneers in the area of deep-learning-based speech enhancement. As its corresponding author, he received the 2018 IEEE Signal Processing Society Best Paper Award for the IEEE/ACM TASLP paper “A Regression Approach to Speech Enhancement Based on Deep Neural Networks”. Building on these achievements in speech enhancement, he led a joint team of members from USTC and iFlytek Research to win all three tasks in the 2016 CHiME-4 challenge, all four tasks in the 2018 CHiME-5 challenge, two tasks in the 2020 CHiME-6 challenge, the SELD task of the 2020 DCASE challenge, and all tasks in the 2020 DIHARD-III challenge. He is currently an IEEE Senior Member, an associate editor of IEEE/ACM TASLP, and a member of the IEEE SLTC. He is also one of the organizers of the DIHARD Challenge.