Keynote Speakers

Gerald Penn, University of Toronto
Title: The Quantitative Study of Writing Systems

Abstract: If you understood all of the world's languages, you would still not be able to read many of the texts that you find on the world wide web, because they are written in non-Roman scripts -- often ones that have been arbitrarily encoded for electronic transmission in the absence of an accepted standard. This very modern nuisance reflects a dilemma as ancient as writing itself: the association between a language as it is spoken and its written form has a sort of internal logic to it that we can comprehend, but the conventions are different in every individual case -- even among languages that use the same script, or between scripts used by the same language. This conventional association between language and script, called a writing system, is indeed reminiscent of the Saussurean conception of language itself, a conventional association of meaning and sound, upon which modern linguistic theory is based. Despite linguists' reliance upon writing to present and preserve linguistic data, however, writing systems were a largely forgotten corner of linguistics until the 1960s, when Gelb presented their first classification. This talk will describe recent work that aims to place the study of writing systems upon a sound computational and statistical foundation. While archaeological decipherment may eternally remain the holy grail of this area of research, it also has applications to speech synthesis, machine translation, and multilingual document retrieval.

Claire Cardie, Cornell University
Title: A Sentimental Journey

Abstract: Within the field of natural language processing, sentiment analysis refers to the computational study of subjective, opinion-oriented language. Editorials, reviews (of products, movies, books, etc.), blogs, chat room dialog, and even (purportedly factual) newspaper articles represent some of the genres of text for which accurate identification and interpretation of opinions is a critical component for understanding the text in its entirety. And given the correspondingly wide range of potential applications for opinion analysis in business, politics, the intelligence community, and the entertainment industry, it is possibly not surprising that researchers have been devoting increasingly more energy to sentiment analysis in recent years. This talk will provide an overview of current research in this rapidly growing field, focusing on the presentation of effective techniques for extracting and summarizing opinions in unstructured text. In the talk, I will also describe the key challenges for research in this area, present a few related (and as yet unsolved) problems, and discuss promising directions for future work in sentiment analysis.

Hwee-Tou Ng, National University of Singapore
Title: Towards Word Sense Disambiguation in the Large

Abstract: Word sense disambiguation (WSD) is the task of determining the correct meaning or sense of a word in context. A critical problem faced by current supervised WSD systems is the lack of manually annotated training data. Tackling this data acquisition bottleneck is crucial in order to build WSD systems with broad coverage of words. In this talk, I will present results of our attempt to scale up WSD, exploiting large quantities of Chinese-English parallel text. Our evaluation indicates that our implemented approach of gathering training examples from parallel text is promising when tested on the nouns and adjectives of the SENSEVAL-2 and SENSEVAL-3 English all-words tasks. This is joint work with Yee Seng Chan.

©2005-2006 Chinese and Oriental Languages Information Processing Society, Singapore | Last updated on September 24, 2006.