{"id":1281,"date":"2022-11-16T06:28:03","date_gmt":"2022-11-16T06:28:03","guid":{"rendered":"https:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/?page_id=1281"},"modified":"2022-12-11T00:35:08","modified_gmt":"2022-12-10T16:35:08","slug":"program-details","status":"publish","type":"page","link":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/program-details\/","title":{"rendered":"Program Details"},"content":{"rendered":"\t\t<div data-elementor-type=\"wp-page\" data-elementor-id=\"1281\" class=\"elementor elementor-1281\">\n\t\t\t\t\t\t<section class=\"elementor-section elementor-top-section elementor-element elementor-element-823b315 elementor-section-boxed elementor-section-height-default elementor-section-height-default\" data-id=\"823b315\" data-element_type=\"section\" data-e-type=\"section\">\n\t\t\t\t\t\t<div class=\"elementor-container elementor-column-gap-default\">\n\t\t\t\t\t<div class=\"elementor-column elementor-col-100 elementor-top-column elementor-element elementor-element-bb8d296\" data-id=\"bb8d296\" data-element_type=\"column\" data-e-type=\"column\">\n\t\t\t<div class=\"elementor-widget-wrap elementor-element-populated\">\n\t\t\t\t\t\t<div class=\"elementor-element elementor-element-72b6749 elementor-widget elementor-widget-text-editor\" data-id=\"72b6749\" data-element_type=\"widget\" data-e-type=\"widget\" data-widget_type=\"text-editor.default\">\n\t\t\t\t<div class=\"elementor-widget-container\">\n\t\t\t\t\t\t\t\t\t<h1>Day 1, Sunday, 11 Dec 2022<\/h1><p><strong>kindly take note that the tutorials and grand challenges will be conducted online only.<\/strong><\/p><h2><strong>Tutorials<\/strong><\/h2><p><strong>Tutorial 1:<\/strong> Exploring the Frontier of Large-Scale Semi-Supervised Learning for Speech Processing\u00a0<br \/><em>Presenters: Yu Zhang, Bo Li, Daniel Park, Google<br \/><\/em>Time: 9:30-11:30, Sunday, 11 Dec 2022<br \/><a 
href=\"http:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/wp-content\/uploads\/2022\/12\/T1.zip\">Download T1 material\u00a0<\/a>(Alternative links: <a href=\"https:\/\/drive.google.com\/file\/d\/1bXc7yYgP-8qRLqWZRAE0A2EhHXDn0pLs\/view?usp=sharing\">Part1<\/a>, <a href=\"https:\/\/drive.google.com\/file\/d\/1TMJEv1RYC1garBio7LDJK__O9mQ2ICQW\/view?usp=share_link\">Part2<\/a>, <a href=\"https:\/\/drive.google.com\/file\/d\/1ca_MQ1zOa8XX-8zIq66ZRbfwrA_cPkv9\/view?usp=share_link\">Part3<\/a>)<\/p><p><strong>Tutorial 2:<\/strong> TorchAudio Tutorial<br \/><em>Presenters: Xiaohui Zhang, Zhaoheng Ni, Jeff Hwang, Caroline Chen, Meta<br \/><\/em>Time: 9:30-11:30, Sunday, 11 Dec 2022<br \/><a href=\"http:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/wp-content\/uploads\/2022\/12\/T2.pdf\">Download T2 material<\/a><\/p><p><strong>Tutorial 3:<\/strong> Towards Solving Cocktail Party Problem with Artificial Intelligence<br \/><em>Presenter: Dr. Chenglin Xu, Kuaishou Technology<br \/><\/em>Time: 13:00-15:00, Sunday, 11 Dec 2022<br \/><a href=\"http:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/wp-content\/uploads\/2022\/12\/T3.pdf\">Download T3 material<\/a><\/p><p><strong>Tutorial 4:<\/strong> Quantum Machine Learning for Speech Processing: from Theoretical Foundations to Practices<br \/><em>Presenters: Prof. Jun Qi, Fudan University, Shanghai, China; Huck Yang, Ph.D. 
candidate, Georgia Institute of Technology, Atlanta, GA, USA<br \/><\/em>Time: 13:00-15:00, Sunday, 11 Dec 2022<br \/><a href=\"http:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/wp-content\/uploads\/2022\/12\/T4.zip\">Download T4 material<\/a><\/p><p><strong>Tutorial 5:<\/strong> Recent Advances on Automatic Dialogue Evaluation<br \/><em>Presenters: Luis Fernando D&#8217;Haro, Universidad Polit\u00e9cnica de Madrid; Chen Zhang, National University of Singapore<br \/><\/em>Time: 15:30-17:30, Sunday, 11 Dec 2022<br \/><a href=\"http:\/\/www.colips.org\/conferences\/iscslp2022\/wp\/wp-content\/uploads\/2022\/12\/T5.pdf\">Download T5 material<\/a><\/p><h2><strong>Grand Challenges<\/strong><\/h2><h3><strong>Challenge 1: Conversational Short-Phrase Speaker Diarization Challenge (CSSD)<\/strong><\/h3><p>Time: 9:30-11:30, Sunday, 11 Dec 2022<br \/>Chair: Qingqing Zhang<\/p><p><strong>GC1.1<\/strong> (#126) The Conversational Short-phrase Speaker Diarization (CSSD) Task: Dataset, Evaluation Metric and Baselines<br \/><em>Gaofeng Cheng, Yifan Chen, Runyan Yang, Qingxuan Li, Zehui Yang, Lingxuan Ye, Pengyuan Zhang, Qingqing Zhang, Lei Xie, Yanmin Qian, Kong Aik Lee and Yonghong Yan<\/em><\/p><p><strong>GC1.2<\/strong> (#129) Spectral Clustering Based EEND-vector Clustering: A Robust System Fine-tuned on Simulated Conversations<br \/><em>Kai Li<\/em><\/p><p><strong>GC1.3<\/strong> (#130) The X-Lance Speaker Diarization System for the Conversational Short-phrase Speaker Diarization Challenge 2022<br \/><em>Tao Liu, Xu Xiang, Zhengyang Chen, Bing Han, Kai Yu and Yanmin Qian<\/em><\/p><p><strong>GC1.4<\/strong> (#132) TSUP Speaker Diarization System for Conversational Short-phrase Speaker Diarization Challenge<br \/><em>Bowen Pang, Huan Zhao, Gaosheng Zhang, Xiaoyue Yang, Yang Sun, Li Zhang, Qing Wang and Lei Xie<\/em><\/p><h3><strong>Challenge 2: Intelligent Cockpit Speech Recognition Challenge (ICSRC)<\/strong><\/h3><p>Time: 13:00-15:00, Sunday, 11 Dec 2022<br \/>Chair: Lei 
Xie<\/p><p><strong>GC2.1<\/strong> (#142) The ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge (ICSRC): Dataset, Tracks, Baseline and Results<br \/><em>Ao Zhang, Fan Yu, Kaixun Huang, Lei Xie, Longbiao Wang, Eng Siong Chng, Hui Bu, Binbin Zhang, Wei Chen and Xin Xu<\/em><\/p><p><strong>GC2.2<\/strong> (#139) The FawAI ASR System for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge<br \/><em>Yujia Sun, Bing Ge, Bo Chen, Zhen Fu, Jinxin He, Hongwei Gao and Xue Wang<\/em><\/p><p><strong>GC2.3<\/strong> (#140) LeVoice ASR Systems for the ISCSLP 2022 Intelligent Cockpit Speech Recognition Challenge<br \/><em>Yan Jia, Mi Hong, Jingyu Hou, Kailong Ren, Sifan Ma, Jin Wang, Yinglin Ji, Fangzhen Peng,\u00a0 Lin Yang and Junjie Wang<\/em><\/p><p><strong>GC2.4<\/strong> (#141) Efficient Conformer-Based CTC Model for Intelligent Cockpit Speech Recognition<br \/><em>Hanzhi Guo, Yunshu Chen, Xukang Xie, Gaopeng Xu and Wei Guo<\/em><\/p><h3><strong>Challenge 3: Chinese-English Code-Switching Automatic Speech Recognition (CSASR)<\/strong><\/h3><p>Time: 15:30-17:30, Sunday, 11 Dec 2022<br \/>Chair: Qingqing Zhang<\/p><p><strong>GC3.1<\/strong> (#138) Summary on the ISCSLP 2022 Chinese-English Code-switching ASR Challenge<br \/><em>Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang and Yonghong Yan<\/em><\/p><p><strong>GC3.2<\/strong> (#135) The NPU-ASLP System for The ISCSLP 2022 Magichub Code-Switching ASR Challenge<br \/><em>Yuhao Liang, Peikun Chen, Fan Yu, Xinfa Zhu, Tianyi Xu, Yingying Gao and Lei Xie<\/em><\/p><p><strong>GC3.3<\/strong> (#136) Hybrid CTC Language Identification Structure for Mandarin-English Code-Switching ASR<br \/><em>Hengxin Yin, Guangyu Hu, Fei Wang and Pengfei Ren<\/em><\/p><h1>Day 2, Monday, 12 Dec 2022<\/h1><h2><strong>Opening Session<\/strong><\/h2><p>Time: 8:30-9:00, Monday, 12 Dec 2022<\/p><h2><strong>Keynote Speech 1<\/strong><\/h2><p>Time: 9:00-10:00, 
Monday, 12 Dec 2022<br \/>Title: Advancing end-to-end automatic speech recognition and beyond<br \/>Speaker: Dr Jinyu Li, Partner Applied Science Manager, Microsoft<br \/>Chair: Nancy Chen<\/p><h2><strong>Oral 1: Speech Recognition I<\/strong><\/h2><p>Session Chairs: Jen-Tzung Chien, Siqi Cai<br \/>Time: 10:30-12:30, Monday, 12 Dec 2022<\/p><p><strong>OS1.1<\/strong> (#99) An Ensemble Teacher-Student Learning Approach with Poisson Sub-sampling to Differential Privacy Preserving Speech Recognition<br \/><em>Chao-Han Huck Yang, Jun Qi, Sabato Marco Siniscalchi and Chin-Hui Lee<\/em><\/p><p><strong>OS1.2 <\/strong>(#25) Adaptive Attention Network with Domain Adversarial Training for Multi-Accent Speech Recognition<br \/><em>Yanbing Yang, Hao Shi, Yuqin Lin, Meng Ge, Longbiao Wang, Qingzhi Hou and Jianwu Dang<\/em><\/p><p><strong>OS1.3<\/strong> (#26) Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models<br \/><em>Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo and Yulong Wan<\/em><\/p><p><strong>OS1.4<\/strong> (#47) Towards Language-universal Mandarin-English Speech Recognition with Unsupervised Label Synchronous Adaptation<br \/><em>Song Li, Haoneng Luo, Wenxuan Hu, Yuan Liu, Shiliang Zhang, Lin Li and Qingyang Hong<\/em><\/p><p><strong>OS1.5<\/strong> (#49) Sequence Distribution Matching for Unsupervised Domain Adaptation in ASR<br \/><em>Qingxuan Li, Han Zhu, Liuping Luo, Gaofeng Cheng, Pengyuan Zhang, Jiasong Sun and Yonghong Yan<\/em><\/p><p><strong>OS1.6<\/strong> (#86) Improving Rare Words Recognition through Homophone Extension and Unified Writing for Low-resource Cantonese Speech Recognition<br \/><em>Ho-Lam Chung, Junan Li, Pengfei Liu, Wai Kim LEUNG, Xixin Wu and Helen Meng<\/em><\/p><h2><strong>Oral 2: Speech Production and Perception I<\/strong><\/h2><p>Session Chairs: Aijun Li, Changhuai You<br \/>Time: 10:30-12:30, Monday, 12 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS2.1<\/strong> (#54) Perception and 
Production of Mandarin Vowels by Teenagers &#8211; Blind and Sighted<br \/><em>Moyu Chen, Jing Qi, and Xiyu Wu<\/em><\/p><p><strong>OS2.2<\/strong> (#70) The Production of Contrastive Focus by Children Learning Mandarin Chinese<br \/><em>Jing Lu and Ping Tang<\/em><\/p><p><strong>OS2.3<\/strong> (#77) Production Characteristics of Vowels in the Standard Chinese by Preschool Bilingual Teachers <br \/>Jiao Lin Pan and Yuan Jia<\/p><p><strong>OS2.4<\/strong> (#81) Effects of Aspiration on Tone Production and Perception in Standard Chinese<br \/><em>Chong Cao and Aijun Li<\/em><\/p><p><strong>OS2.5<\/strong> (#84) The Disyllabic Tone Production and Tone Context Effect in Mandarin-speaking Children with Cochlear Implants <br \/><em>Jingwen Cheng, Yingming Gao, Yuchen Yan, Xiaoli Feng, Binghuai Lin, and Jinsong Zhang<\/em><\/p><p><strong>OS2.6<\/strong> (#03) A Preliminary Ultrasonic Investigation of Tenseness in Northern Yi<em><br \/>Shuwen Chen<\/em><\/p><h2><strong>Oral 3: Speech Synthesis<\/strong><\/h2><p>Session Chairs: Yuan-Fu Liao<br \/>Time: 14:00-16:00, Monday, 12 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS3.1<\/strong> (#02) Style-Label-Free: Cross-Speaker Style Transfer by Quantized VAE and Speaker-wise Normalization in Speech Synthesis<br \/><em>Chunyu Qiang, Peng Yang, Hao Che, Xiaorui Wang and Zhongyuan Wang<\/em><\/p><p><strong>OS3.2<\/strong> (#37) Multi-speaker Multi-style Text-to-speech Synthesis with Single-speaker Single-style Training Data Scenarios<br \/><em>Qicong Xie, Tao Li, Xinsheng Wang, Zhichao Wang, Lei Xie, Guoqiao Yu and Guanglu Wan<\/em><\/p><p><strong>OS3.3<\/strong> (#51) Robust MelGAN: A robust universal neural vocoder for high-fidelity TTS<br \/><em>Kun Song, Jian Cong, Xinsheng<\/em>\u00a0<em>Wang, Yongmao Zhang, Lei Xie, Ning Jiang and Haiying Wu<\/em><\/p><p><strong>OS3.4<\/strong> (#52) AccentSpeech: Learning Accent from Crowd-sourced Data for Target Speaker TTS with Accents<br \/><em>Yongmao Zhang, Zhichao Wang, Peiji 
Yang, Hongshen Sun, Zhisheng Wang and Lei Xie<\/em><\/p><p><strong>OS3.5<\/strong> (#28) CorrectSpeech: A Fully Automated System for Speech Correction and Accent Reduction <br \/><em>Daxin Tan, Liqun Deng, Nianzu Zheng, Yu Ting Yeung, Xin Jiang, Xiao Chen and Tan Lee<\/em><\/p><p><strong>OS3.6<\/strong> (#62) HILvoice: Human-in-the-Loop Style Selection for Elder-Facing Speech Synthesis<br \/><em>Xueyuan Chen, Qiaochu Huang, Xixin Wu, Zhiyong Wu and Helen Meng<\/em><\/p><h2><strong>Special Session 1: Data Augmentation in Speech Technologies<\/strong><\/h2><p>Session Chairs: Rohan Kumar Das<br \/>Time: 14:00-16:00, Monday, 12 Dec 2022<strong><br \/><\/strong><\/p><p><strong>SS1.1<\/strong> (#103) Dynamic Thresholding on FixMatch with Weak and Strong Data Augmentations for Sound Event Detection<br \/><em>Tanmay Khandelwal and Rohan Kumar Das<\/em><\/p><p><strong>SS1.2<\/strong> (#118) Data Augmentation for Infant Cry Classification<br \/><em>Aastha Kachhi, Shreya Chaturvedi, Hemant A. Patil and Dipesh Kumar Singh<\/em><\/p><p><strong>SS1.3<\/strong> (#122) Low Pass Filtering and Bandwidth Extension for Robust Anti-spoofing Countermeasure Against Codec Variabilities<br \/><em>Yikang Wang, Xingming Wang, Hiromitsu Nishizaki and Ming Li<\/em><\/p><p><strong>SS1.4<\/strong> (#08) Improving Speech Recognition with Augmented Synthesized Data and Conditional Model Training<br \/><em>Shaofei Xue, Jian Tang and Yazhu Liu<\/em><\/p><p><strong>SS1.5<\/strong> (#85) Speaking Style Compensation on Synthetic Audio for Robust Keyword Spotting<br \/><em>Houjun Huang and Yanmin Qian<\/em><\/p><p><strong>SS1.6<\/strong> (#45) A Study on Joint Modeling and Data Augmentation of Multi-Modalities for Audio-Visual Scene Classification<br \/><em>Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang and Chin-Hui Lee<\/em><\/p><h2><strong>Oral 4: Voice Conversion &amp; Spoofing Speech 
Detection<\/strong><\/h2><p>Session Chairs: Junichi Yamagishi<br \/>Time: 16:30-18:45, Monday, 12 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS4.1<\/strong> (#38) End-to-End Voice Conversion with Information Perturbation<br \/><em>Qicong Xie, Shan Yang, Yi Lei, Lei Xie and Dan Su<\/em><\/p><p><strong>OS4.2<\/strong> (#42) Mix-Guided VC: Any-to-many Voice Conversion by Combining ASR and TTS Bottleneck Features<br \/><em>Zeqing Zhao, Sifan Ma, Yan Jia, Jingyu Hou, Lin Yang and Junjie Wang<\/em><\/p><p><strong>OS4.3<\/strong> (#76) A New Spoken Language Teaching Tech: Combining Multi-attention and AdaIN for One Shot Cross Language Voice Conversion<br \/><em>Dengfeng Ke, Wenhan Yao, Ruixin Hu, Qi Luo, Liangjie Huang, Qi Luo and Wentao Shu<\/em><\/p><p><strong>OS4.4<\/strong> (#116) The Impact of Room Acoustics on Replay Speech Signal<br \/><em>Madhu R. Kamble and Hemant A. Patil<\/em><\/p><p><strong>OS4.5<\/strong> (#124) Effect of Speaker-Microphone Proximity on Pop Noise: Continuous Wavelet Transform-Based Approach<br \/><em>Priyanka Gupta and Hemant A. 
Patil<\/em><\/p><p><strong>OS4.6<\/strong> (#127) Synthetic Voice Detection and Audio Splicing Detection using SE-Res2Net-Conformer Architecture<br \/><em>Lei Wang, Benedict Yeoh and Jun Wah Ng<\/em><\/p><p><strong>OS4.7<\/strong> (#144) Audio Splicing Localization: Can We Accurately Locate the Splicing Tampering?<br \/><em>Zhiping Zeng and Zhizheng Wu<\/em><\/p><h2><strong>Oral 5: Speech Enhancement and Separation<\/strong><\/h2><p>Session Chairs: Fei Chen, Xiaohai Tian<br \/>Time: 16:30-18:45, Monday, 12 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS5.1<\/strong> (#71) Masking-based Neural Beamformer for Multichannel Speech Enhancement<br \/><em>Shuai Nie, Shan Liang, Zhanlei Yang, Longshuai Xiao, Wenju Liu and Jianhua Tao<\/em><\/p><p><strong>OS5.2<\/strong> (#36) Deep Multi-task Cascaded Acoustic Echo Cancellation and Noise Suppression<br \/><em>Junjie Li, Meng Ge, Longbiao Wang and Jianwu Dang<\/em><\/p><p><strong>OS5.3<\/strong> (#30) Boosting the Performance of SpEx+ by Attention and Contextual Mechanism<br \/><em>Chenyi Li, Zhiyong Wu, Wei Rao, Yannan Wang and Helen Meng<\/em><\/p><p><strong>OS5.4<\/strong> (#16) Assessing the Effect of Temporal Misalignment between the Probe and Processed Speech Signals on Objective Speech Quality Evaluation<br \/><em>Shangdi Liao and Fei Chen<\/em><\/p><p><strong>OS5.5<\/strong> (#06) Speech-enhanced and Noise-aware Networks for Robust Speech Recognition<br \/><em>Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao and Hsin-Min Wang<\/em><\/p><p><strong>OS5.6<\/strong> (#57) Separate-to-Recognize: Joint Multi-target Speech Separation and Speech Recognition for Speaker-attributed ASR<br \/><em>Yuxiao Lin, Zhihao Du, ShiLiang Zhang, Fan Yu, Zhou Zhao and Fei Wu<\/em><\/p><p><strong>OS5.7<\/strong> (#133) Speech Enhancement Based on CycleGAN with Noise-informed Training<br \/><em>Wen-Yuan Ting, Syu-Siang Wang, Hsin-Li Chang, Borching Su and Yu Tsao<\/em><\/p><h1>Day 3, Tuesday, 13 Dec 2022<\/h1><h2><strong>Keynote 
Speech 2<\/strong><\/h2><p>Time: 8:30-9:30, Tuesday, 13 Dec 2022<br \/>Title: Recent progress in code-switch Singapore English+Mandarin large vocabulary continuous speech recognition<br \/>Speaker: Prof Eng Siong Chng, Associate Professor, Nanyang Technological University<br \/>Chair: Rong Tong<\/p><h2><strong>Oral 6: Speech Recognition II<\/strong><\/h2><p>Session Chairs: Huck Yang<strong><br \/><\/strong>Time: 10:00-12:00, Tuesday, 13 Dec 2022<\/p><p><strong>OS6.1<\/strong> (#10) Incorporating VAD into ASR System by Multi-task Learning<br \/><em>Meng Li, Yan Xia and Feng Lin<\/em><\/p><p><strong>OS6.2<\/strong> (#20) Improving ASR in Reverberant Environments<br \/><em>Yen-Lun Liao, Chi-Han Lin, Ren-Yuan Lyu and Jyh-Shing Roger Jang<\/em><\/p><p><strong>OS6.3<\/strong> (#23) 3M: Multi-loss, Multi-path and Multi-level Neural Networks for speech recognition<br \/><em>Zhao You, Shulin Feng, Dan Su and Dong Yu<\/em><\/p><p><strong>OS6.4<\/strong> (#32) Multi-Level Modeling Units for End-to-End Mandarin Speech Recognition<br \/><em>Yuting Yang, Binbin Du and Yuke Li<\/em><\/p><p><strong>OS6.5<\/strong> (#72)\u00a0 Exploiting Single-Channel Speech for Multi-Channel End-to-End Speech Recognition: A Comparative Study<br \/><em>Keyu An, Ji Xiao and Zhijian Ou<\/em><\/p><p><strong>OS6.6<\/strong> (#73) Ensemble and Re-ranking based on Language Models to Improve ASR<br \/><em>Shu-Fen Tsai, Shih-Chan Kuo, Ren-Yuan Lyu and Jyh-Shing Roger Jang<\/em><\/p><h2><strong>Oral 7: Speech Production and Perception II<\/strong><\/h2><p>Session Chairs: Shuwen Chen, Yanfeng Lu<br \/>Time: 10:00-12:00, Tuesday, 13 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS7.1<\/strong> (#19) Acoustic and Perceptual Study of Tones in Jin Chinese (Togtoh Variety)<br \/><em>Yue Wang and Wen Liu<\/em><\/p><p><strong>OS7.2<\/strong> (#53) Acoustic-perceptual correlates of whispered Mandarin consonants<br \/><em>Min Xu, Jing Shao, Hongwei Ding and Lan Wang<\/em><\/p><p><strong>OS7.3<\/strong> (#55) 
Bilingual Advantage? Perception of the Japanese Consonant Length Contrast by Monolingual vs Bilingual Speakers of Mongolian<br \/><em>Kimiko Tsukada, Yurong Yurong and Badmaavanchin Munguntsetseg<\/em><\/p><p><strong>OS7.4<\/strong> (#66) Multichannel Emotional Perception in Chinese Female: Faces, Voices and Bodies<br \/><em>Ruiqi Ge and Xiyu Wu<\/em><\/p><p><strong>OS7.5<\/strong> (#128) Coda Nasal Perception in Wenzhou Wu and Rugao Mandarin by Native Speakers of Standard Mandarin<br \/><em>Yanyang Chen, Xinya Zhang, Ying Chen and Jiazheng Wang<\/em><\/p><p><strong>OS7.6<\/strong> (#22) Objective Hand Complexity Comparison between Two Mandarin Chinese Cued Speech Systems<br \/><em>Li Liu, Gang Feng, Xiaoxi Ren and Xianping Ma<\/em><\/p><h2><strong>Oral 8: Speech Synthesis &amp; Speaker Embedding<\/strong><\/h2><p>Session Chairs: Xiaoxiao Miao<br \/>Time: 13:00-15:00, Tuesday, 13 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS8.1<\/strong> (#34) Rhythm-controllable Attention with High Robustness for Long Sentence Speech Synthesis<br \/><em>Dengfeng Ke, Yayue Deng, Yukang Jia, Jinlong Xue, Qi Luo, Ya Li, Jianqing Sun, Jiaen Liang and Binghuai Lin<\/em><\/p><p><strong>OS8.2<\/strong> (#74) AdaptiveFormer : A Few-shot Speaker Adaptative Speech Synthesis Model based on FastSpeech2<br \/><em>Dengfeng Ke, Ruixin Hu, Qi Luo, Liangjie Huang, WenHan Yao, Wentao Shu, Jinsong Zhang and Yanlu Xie<\/em><\/p><p><strong>OS8.3<\/strong> (#27) ECAPA-TDNN for Multi-speaker Text-to-speech Synthesis<br \/><em>Jinlong Xue, Yayue Deng, Yichen Han, Ya Li, Jianqing Sun and Jiaen Liang<\/em><\/p><p><strong>OS8.4<\/strong> (#78) Low-Resource Speech Synthesis with Speaker-Aware Embedding<br \/><em>Li-Jen Yang, I-Ping Yeh and Jen-Tzung Chien<\/em><\/p><p><strong>OS8.5<\/strong> (#44) A Phone-Level Speaker Embedding Extraction Framework with Multi-Gate Mixture-of-Experts Based Multi-Task Learning<br \/><em>Zhijunyi Yang, Mengjie Du, Rongfeng Su, Xiaokang Liu, Nan Yan and Lan 
Wang<\/em><\/p><p><strong>OS8.6<\/strong> (#120) Shuffle is What You Need<br \/><em>Wan Lin, Lantian Li and Dong Wang<\/em><\/p><h2><strong>Special Session 2: Deep Noise Reduction<\/strong><\/h2><p>Session Chairs: Xueliang Zhang, Lei Wang<br \/>Time: 13:00-15:00, Tuesday, 13 Dec 2022<strong><br \/><\/strong><\/p><p><strong>SS2.1<\/strong> (#104) On the Use of Absolute Threshold of Hearing-based Loss for Full-band Speech Enhancement<br \/><em>Rohith Mars and Rohan Kumar Das<\/em><\/p><p><strong>SS2.2<\/strong> (#106) RAT: RNN-Attention Transformer for Speech Enhancement<br \/><em>Tailong Zhang, Shulin He, Hao Li and Xueliang Zhang<\/em><\/p><p><strong>SS2.3<\/strong> (#109) A Speech-Noise-Equilibrium Loss Function for Deep Learning-Based Speech Enhancement<br \/><em>Weitong Zhao, Fushi Xie, Kang Ouyang and Nengheng Zheng<\/em><\/p><p><strong>SS2.4<\/strong> (#101) Speakerfilter-Pro: an Improved Target Speaker Extractor Combines the Time Domain and Frequency Domain<br \/><em>Shulin He, Hao Li and Xueliang Zhang<\/em><\/p><p><strong>SS2.5<\/strong> (#105)\u00a0 Two-Branch Network with Selective Kernel Convolution for Time-Domain Speech Enhancement<br \/><em>Hui Li, Zhihua Huang and Chuangjian Guo<\/em><\/p><p><strong>SS2.6<\/strong> (#11) Optimizing Shoulder to Shoulder: A Coordinated Sub-Band Fusion Model for Full-Band Speech Enhancement<br \/><em>Guochen Yu, Andong Li, Wenzhe Liu, Chengshi Zheng, Yutian Wang and Hui Wang<\/em><\/p><h1>Day 4, Wednesday, 14 Dec 2022<\/h1><h2><strong>Keynote Speech 3<\/strong><\/h2><p>Time: 8:30-9:30, Wednesday, 14 Dec 2022<br \/>Title: Automated Assessment and Feedback: the Role of Spoken Grammatical Error Correction<br \/>Speaker: Kate Knill, Principal Research Associate, University of Cambridge<br \/>Chair: Junichi Yamagishi<\/p><h2><strong>Oral 9: Multimodality<\/strong><\/h2><p>Session Chairs: Ming-Hsiang Su, Ya Li<br \/>Time: 10:00-12:00, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS9.1<\/strong> (#46) Deep 
Learning Based Audio-Visual Multi-Speaker DOA Estimation Using Permutation-Free Loss Function<br \/><em>Qing Wang, Hang Chen, Ya Jiang, Zhe Wang, Yuyang Wang, Jun Du and Chin-Hui Lee<\/em><\/p><p><strong>OS9.2<\/strong> (#108) Multi-Task Joint Learning for Embedding Aware Audio-Visual Speech Enhancement<br \/><em>Chenxi Wang, Hang Chen, Jun Du, Baocai Yin and Jia Pan<\/em><\/p><p><strong>OS9.3<\/strong> (#80) Multimodal Automatic Speech Fluency Evaluation Method for Putonghua Proficiency Test Propositional Speaking Section<br \/><em>Jiajun Liu, Huazhen Meng, Yunfei Shen, Linna Zheng and Aishan Wumaier<\/em><\/p><p><strong>OS9.4<\/strong> (#114) Cantonese Neural Speech Synthesis from Found Newscasting Video Data and its Speaker Adaptation<br \/><em>Raymond Chung<\/em><\/p><p><strong>OS9.5<\/strong> (#107) A Preliminary Study on Taiwanese OCR for Assisting Textual Database Construction from Historical Documents<br \/><em>Yuan-Fu Liao, Yu-Hsuan Huang, Matus Pleva, Daniel Hl\u00e1dek and Ming-Hsiang Su<\/em><\/p><p><strong>OS9.6<\/strong> (#18) Reconstruction of Speech Spectrogram based on Non-invasive EEG Signal<br \/><em>Di Zhou, Masashi Unoki, Gaoyan Zhang and Jianwu Dang<\/em><\/p><h2><strong>Oral 10: Speech Prosody<\/strong><\/h2><p>Session Chairs: Bin Li, Huayun Zhang<br \/>Time: 10:00-12:00, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS10.1<\/strong> (#07) J-TranPSP: A Joint Transition-based Model for Prosodic Structure Prediction, Word Segmentation and PoS Tagging<br \/><em>Binbin Shen, Jian Luan, Shengyan Zhang, Quanbo Shen and Yujun Wang<\/em><\/p><p><strong>OS10.2<\/strong> (#12) A Mandarin Prosodic Boundary Prediction Model Based on Multi-Source Semi-Supervision<br \/><em>Peiyang Shi, Zengqiang Shang and Pengyuan Zhang<\/em><\/p><p><strong>OS10.3<\/strong> (#59) English lexical stresses in non-native speech under adverse conditions<br \/><em>Mosi He, Ting Zhang, Bin Li and Kin Cheung<\/em><\/p><p><strong>OS10.4<\/strong> (#35) Stress 
Gravity of Neutral Tone Words in Different Information Structures<br \/><em>Jingwen Huang and Aijun Li<\/em><\/p><p><strong>OS10.5<\/strong> (#67) Prosodic Encoding of Mandarin Chinese Intonation by Uygur Speakers in Declarative and Interrogative Sentences<br \/><em>Tong Li, Hui Feng and Yuan Jia<\/em><\/p><p><strong>OS10.6<\/strong> (#102) In-group Advantage for Chinese and English Emotional Prosody in Quiet and Noise Conditions<br \/><em>Yuhan Yan, Shanpeng Li and Ying Chen<\/em><\/p><h2><strong>Oral 11: Lightweight Model &amp; Knowledge Distillation<\/strong><\/h2><p>Session Chairs: Yanmin Qian, Yi Zhou<br \/>Time: 13:00-15:00, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS11.1<\/strong> (#48) Multi-Resolution Stacked 1D-CNN for Small-Footprint Keyword Spotting with Two-Stage Detection <br \/><em>Jian Tang and Shaofei Xue<\/em><\/p><p><strong>OS11.2<\/strong> (#65) Lightweight End-to-End Deep Learning Model for Music Source Separation<br \/><em>Yao-Ting Wang, Yi-Xing Lin, Kai-Wen Liang, Tzu-Chiang Tai and Jia-Ching Wang<\/em><\/p><p><strong>OS11.3<\/strong> (#97) AdaVITS: Tiny VITS for Low Computing Resource Speaker Adaptation <br \/><em>Kun Song, Heyang Xue, Xinsheng Wang, Jian Cong, Yongmao Zhang, Lei Xie, Bing Yang, Xiong Zhang and Dan Su<\/em><\/p><p><strong>OS11.4<\/strong> (#43) Label-free Knowledge Distillation with Contrastive Loss for Light-weight Speaker Recognition <br \/><em>Zhiyuan Peng, Xuanji He, Ke Ding, Tan Lee and Guanglu Wan<\/em><\/p><p><strong>OS11.5<\/strong> (#121) Improving Speech Separation with Knowledge Distilled from Self-supervised Pre-trained Models <br \/><em>Bowen Qu, Chenda Li, Jinfeng Bai and Yanmin Qian<\/em><\/p><p><strong>OS11.6<\/strong> (#111) Text-Informed Knowledge Distillation for Robust Speech Enhancement and Recognition <br \/><em>Wei Wang, Wangyou Zhang, Shaoxiong Lin and Yanmin Qian<\/em><\/p><h2><strong>Oral 12: Speech Technology for Health<\/strong><\/h2><p>Session Chairs: Nan Yan, Jeremy Wong<br 
\/>Time: 13:00-15:00, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS12.1<\/strong> (#94) Prediction of Depression Severity Based on Transformer Encoder and CNN Model<br \/><em>Jiahao Lu, Bin Liu, Zheng Lian, Cong Cai, Jianhua Tao and Ziping Zhao<\/em><\/p><p><strong>OS12.2<\/strong> (#05) Depressive Tendency Recognition by Fusing Speech and Text Features: A Comparative Analysis<br \/><em>Xiaoyong Lu, Yimin He, Jingyi Yuan, Tao Pan and Yafan Wang<\/em><\/p><p><strong>OS12.3<\/strong> (#17) Medical Difficult Airway Detection using Speech Technology<br \/><em>Zhikai Zhou, Shuang Cao, Zhengyang Chen, Bei Liu, Ming Xia, Hong Jiang and Yanmin Qian<\/em><\/p><p><strong>OS12.4<\/strong> (#88) CUEMPATHY: A Counseling Speech Dataset for Psychotherapy Research<br \/><em>Dehua Tao, Harold Chui, Sarah Luk and Tan Lee<\/em><\/p><p><strong>OS12.5<\/strong> (#24) Aphasia Detection for Cantonese-Speaking and Mandarin-Speaking Patients Using Pre-Trained Language Models<br \/><em>Ying Qin, Tan Lee, Anthony Pak Hin Kong and Feng Lin<\/em><\/p><p><strong>OS12.6<\/strong> (#39) Respiratory and laryngeal influences on voice in post-stroke dysarthria: a pilot study<br \/><em>Tinghao Zhao, Xiaoxia Du, Juan Liu, Rongfeng Su, Nan Yan and Lan Wang<\/em><\/p><h2><strong>Oral 13: Listening Comprehension of Machines and Humans<\/strong><\/h2><p>Session Chair: Wei-Qiang Zhang, Yanfeng Lu<br \/>Time: 15:30-17:30, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS13.1<\/strong> (#110) End-to-end speech topic classification based on pre-trained model Wavlm<br \/><em>Tengfei Cao, Liang He and Fangjing Niu<\/em><\/p><p><strong>OS13.2<\/strong> (#79) BERT-based Chinese Medicine Named Entity Recognition Model Applied to Medication Reminder Dialogue System<br \/><em>Tsung-Hsien Yang, Matus Pleva, Daniel Hl\u00e1dek and Ming-Hsiang Su<\/em><\/p><p><strong>OS13.3<\/strong> (#29) Dialogue scenario classification based on social factors<br \/><em>Yuning Liu, Di Zhou, Masashi 
Unoki, Jianwu Dang and Aijun Li<\/em><\/p><p><strong>OS13.4<\/strong> (#112) BERT-LID: Leveraging BERT to Improve Spoken Language Identification<br \/><em>Yuting Nie, Junhong Zhao, Wei-Qiang Zhang and Jinfeng Bai<\/em><\/p><p><strong>OS13.5<\/strong> (#123) An Exploratory Study for Quantifying the Contextual Information for Successful Chinese L2 Speech Comprehension<br \/><em>Rian Bao, Linkai Peng, Yuchen Yan and Jinsong Zhang<\/em><\/p><p><strong>OS13.6<\/strong> (#92) The Contribution of Phonological and Fluency Factors to Chinese L2 Comprehensibility Ratings: A Case Study of Urdu-speaking Learners<br \/><em>Rian Bao, Linkai Peng, Yingming Gao and Jinsong Zhang<\/em><\/p><h2><strong>Oral 14: Acoustic Phonetics &amp; Prosody<\/strong><\/h2><p>Session Chair: Wen Liu, Bo Li<br \/>Time: 15:30-17:30, Wednesday, 14 Dec 2022<strong><br \/><\/strong><\/p><p><strong>OS14.1<\/strong> (#21) An Acoustic Study on Fricative Vowel [i\u0291] in Zhongwei Chinese<br \/><em>Xinyi Zhang and Wen Liu<\/em><\/p><p><strong>OS14.2<\/strong> (#69) Acoustic Features of Consonants of Standard Chinese and English by Uyghur Native Speakers<br \/><em>Yuan Jia and Xintong\u00a0 Zuo<\/em><\/p><p><strong>OS14.3<\/strong> (#33) A Study on Mandarin Chinese \u201cBu\u201d Tone Sandhi Followed by English Words<br \/><em>Kaige Gao and Xiyu Wu<\/em><\/p><p><strong>OS14.4<\/strong> (#68) An Entropy-based Study on the Acquisition of Mandarin Initial Consonants by Korean Learners<br \/><em>Xiaoli Feng, Yingming Gao, Jinsong Zhang and Yanchun Cao<\/em><\/p><p><strong>OS14.5<\/strong> (#58) Impacts of Aging on Suprasegmental and Segmental Encoding of Vocally-Expressed Confidence in Wuxi Dialect<br \/><em>Yujie Ji, Qiqi Sun, Zhikang Peng and Xiaoming Jiang<\/em><\/p><p><strong>OS14.6<\/strong> (#31) Acceptance of Tonal and Segmental Variability Correlates to Inventory Size in Mandarin Chinese<br \/><em>Julie Siying Chen and Stephen Politzer-Ahles<\/em><\/p><h2><strong>SIG-CSLP 
Assembly<\/strong><\/h2><p>Time: 17:30-18:00, Wednesday, 14 Dec 2022<\/p><h2><strong>Closing Session<\/strong><\/h2><p>Time: 18:00-18:30, Wednesday, 14 Dec 2022<\/p>\t\t\t\t\t\t\t\t<\/div>\n\t\t\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/div>\n\t\t\t\t\t<\/div>\n\t\t<\/section>\n\t\t\t\t<\/div>\n\t\t","protected":false},"excerpt":{"rendered":"<p>Day 1, Sunday, 11 Dec 2022 kindly take note that the tutorials and grand challenges will be conducted online only. Tutorials Tutorial 1: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Speech Processing\u00a0Presenters: Yu Zhang, Bo Li, Daniel Park, GoogleTime:&#8230;<br \/><a class=\"read-more-button\" href=\"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/program-details\/\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"parent":0,"menu_order":0,"comment_status":"closed","ping_status":"closed","template":"","meta":{"footnotes":""},"class_list":["post-1281","page","type-page","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/pages\/1281","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/pages"}],"about":[{"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/types\/page"}],"author":[{"embeddable":true,"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/comments?post=1281"}],"version-history":[{"count":280,"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/pages\/1281\/revisions"}],"predecessor-version":[{"id":1785,"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\/v2\/pages\/1281\/revisions\/1785"}],"wp:attachment":[{"href":"https:\/\/www.colips.org\/conferences\/iscslp2022\/web\/wp-json\/wp\
/v2\/media?parent=1281"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}