Keynote Speech

Title: Diagnosing Large Language Models in Text Generation and Reasoning

Speaker: Prof Deyi Xiong, Tianjin University



Deyi Xiong is a Professor of Computer Science at Tianjin University (TJU), Director of both the Natural Language Processing Laboratory at the College of Intelligence and Computing, TJU, and the International Joint Research Center of Language Intelligence and Technology at TJU. His research focuses on natural language processing, specifically machine translation, dialogue, natural language generation and commonsense reasoning. He has published over 100 papers in prestigious journals and conferences, including Computational Linguistics, IEEE TPAMI, IEEE TASLP, Artificial Intelligence, AAAI, IJCAI, ACL, and EMNLP. He was the program co-chair of IALP 2021 and CWMT 2017. He has also served as an area chair of conferences including ACL, EMNLP, NAACL and COLING. He was the founder and co-organizer of multiple ACL/EMNLP/NAACL-affiliated workshops such as S2MT 2015, SedMT 2016 and DiscoMT 2019. He is a member of the standing committee of reviewers of CL, an action editor of both TACL and ARR, and an editorial board member of the International Journal of Asian Language Processing.



Large language models (LLMs), trained on huge amounts of data via self-supervised learning, have recently made remarkable progress in both natural language understanding and generation. Studies have also found that LLMs are capable of reasoning over a range of tasks (e.g., quantitative reasoning, commonsense reasoning). In this talk, I will closely examine such capabilities of LLMs in text generation (focusing on the quality of LLM-authored texts) and reasoning (especially commonsense reasoning and reasoning over long-range context). To examine and diagnose LLMs in text generation and reasoning, we build large-scale manually annotated datasets, TGEA 1.0/2.0 and Chinese WPLC, where raw data are carefully selected from thousands of machine-authored sentences and millions of human-written texts. I will present the data selection criteria, error taxonomy, annotation procedure, and quality control for these datasets, and summarize our findings in building and exploring them. Open questions and future research directions with respect to LLM text generation and reasoning will also be discussed in this talk.