ISCA COLIPS I2R


Conference Areas

1: Speech Perception and Production
  1.1 Models of speech production
  1.2 Physiology and neurophysiology of speech production
  1.3 Neural basis of speech production
  1.4 Speech acoustics
  1.5 Singing acoustics
  1.6 Coarticulation
  1.7 Infant spoken language acquisition, infant speech perception
  1.8 L2 acquisition
  1.9 Models of speech perception
  1.10 Physiology and neurophysiology of speech perception
  1.11 Neural basis of speech perception
  1.12 Multimodal speech perception
  1.13 Interaction between speech production and speech perception
  1.14 Acoustic and articulatory cues in speech perception
  1.15 Perception of prosody
  1.16 Perception of emotions
  1.17 Perception of singing voice
  1.18 Multilingual studies  
  1.19 Speech and hearing disorders 

2: Prosody, Phonetics, Phonology, and Para-/Non- Linguistic Information
  2.1 Articulatory and acoustic cues of prosody
  2.2 Linguistic systems
  2.3 Language descriptions
  2.4 Phonetics and phonology
  2.5 Discourse and dialog structures
  2.6 Phonological processes and models
  2.7 Laboratory phonology
  2.8 Phonetic universals
  2.9 Sound changes
  2.10 Socio-phonetics
  2.11 Phonetics of L1-L2 interaction
  2.12 Acoustic phonetics
  2.13 Phonation, voice quality
  2.14 Paralinguistic and nonlinguistic cues (other than emotion, expression)
  2.15 Non-verbal communication
  2.16 Speaker trait and age recognition

3: Analysis of Speech and Audio Signals
  3.1 Speech analysis and representation
  3.2 Audio signal analysis and representation
  3.3 Speech and audio segmentation and classification
  3.4 Voice activity detection
  3.5 Pitch and harmonic analysis
  3.6 Source separation and computational auditory scene analysis
  3.7 Speaker spatial localization
  3.8 Voice separation
  3.9 Music signal processing and understanding
  3.10 Singing analysis

4: Speech Coding and Enhancement
  4.1 Speech coding and transmission
  4.2 Low-bit-rate speech coding
  4.3 Perceptual audio coding of speech signals
  4.4 Noise reduction for speech signals
  4.5 Speech enhancement: single-channel
  4.6 Speech enhancement: multi-channel
  4.7 Speech intelligibility
  4.8 Active noise control
  4.9 Speech enhancement in hearing aids
  4.10 Adaptive beamforming for speech enhancement
  4.11 Dereverberation for speech signals
  4.12 Echo cancellation for speech signals

5: Speaker and Language Identification
  5.1 Language identification and verification
  5.2 Dialect and accent recognition
  5.3 Speaker characterization, verification and identification
  5.4 Features for speaker and language recognition
  5.5 Robustness to variable and degraded channels
  5.6 Speaker confidence estimation
  5.7 Extraction of para-linguistic information (gender, stress, mood, age, emotion)
  5.8 Speaker diarization
  5.9 Multimodal and multimedia speaker recognition
  5.10 Higher-level knowledge in speaker and language recognition

6: Speech Synthesis and Spoken Language Generation
  6.1 Grapheme-to-phoneme conversion for synthesis
  6.2 Text processing for speech synthesis (text normalization, syntactic and semantic analysis)
  6.3 Segmental-level and/or concatenative synthesis
  6.4 Signal processing/statistical model for synthesis
  6.5 Speech synthesis paradigms and methods: silent speech, articulatory synthesis,
        parametric synthesis, etc.
  6.6 Prosody modeling and generation
  6.7 Expression, emotion and personality generation
  6.8 Voice conversion and modification, morphing
  6.9 Concept-to-speech conversion
  6.10 Cross-lingual and multilingual aspects for synthesis
  6.11 Avatars and talking faces
  6.12 Tools and data for speech synthesis
  6.13 Quality assessment/evaluation metrics in synthesis

7: Speech Recognition - Signal Processing, Acoustic Modeling, Robustness, and Adaptation
  7.1 Feature extraction and low-level feature modeling for ASR
  7.2 Prosodic features and models
  7.3 Robustness against noise, reverberation
  7.4 Far field and microphone array speech recognition
  7.5 Speaker normalization (e.g., VTLN)
  7.6 Deep neural networks
  7.7 Discriminative acoustic training methods for ASR
  7.8 Acoustic model adaptation (speaker, bandwidth, emotion, accent)
  7.9 Speaker adaptation; speaker adapted training methods
  7.10 Pronunciation variants and modeling for speech recognition
  7.11 Acoustic confidence measures
  7.12 Multimodal aspects (e.g., AV speech recognition)
  7.13 Cross-lingual and multilingual aspects, non native accents
  7.14 Acoustic modeling for conversational speech (dialog, interaction)

8: Speech Recognition - Architecture, Search & Linguistic Components
  8.1 Lexical modeling and access: units and models
  8.2 Automatic lexicon learning
  8.3 Supervised/unsupervised morphological models
  8.4 Prosodic features and models for LM
  8.5 Discriminative training methods for LM  
  8.6 Language model adaptation (domain, diachronic adaptation)
  8.7 Language modeling for conversational speech (dialog, interaction)
  8.8 Search methods, decoding algorithms and implementation; lattices; multipass strategies
  8.9 New computational strategies, data-structures for ASR
  8.10 Computational resource constrained speech recognition
  8.11 Confidence measures
  8.12 Cross-lingual and multilingual aspects for speech recognition
  8.13 Structured classification approaches

9: LVCSR and Its Applications, Technologies and Systems for New Applications
  9.1 Multimodal systems
  9.2 Applications in education and learning (including CALL, assessment of language fluency)
  9.3 Applications in medical practice (CIS, voice assessment, ...)
  9.4 Speech science in end-user applications
  9.5 Systems for LVCSR and its applications
  9.6 Rich transcription
  9.7 New types of deep neural network learning for LVCSR and related applications
  9.8 Innovative products and services based on speech technologies
  9.9 Sparse, template-based representations
  9.10 New paradigms (e.g. articulatory models, silent speech interfaces, topic models)

10: Spoken Language Processing - Dialog, Summarization, Understanding
  10.1 Spoken dialog systems
  10.2 Multimodal dialog systems
  10.3 Stochastic modeling for dialog
  10.4 Question/answering from speech
  10.5 Spoken document summarization
  10.6 Systems for spoken language understanding
  10.7 Topic spotting and classification
  10.8 Entity extraction from speech
  10.9 Semantic analysis and classification
  10.10 Conversation and interaction

11: Spoken Language Processing - Translation, Information Retrieval
  11.1 Spoken machine translation
  11.2 Speech-to-speech translation systems
  11.3 Transliteration
  11.4 Voice search
  11.5 Spoken term detection
  11.6 Audio indexing
  11.7 Spoken document retrieval
  11.8 Systems for mining spoken data, search/retrieval of speech documents

12: Spoken Language Evaluation, Standardization and Resources 
  12.1 Speech and multimodal resources and annotation
  12.2 Metadata descriptions of speech, audio, and text resources
  12.3 Metadata for semantic/content markup
  12.4 Metadata for linguistic/discourse structure (e.g., disfluencies, sentence/topic boundaries,
          speech acts)
  12.5 Methodologies and tools for language resource construction and annotation
  12.6 Automatic segmentation and labeling of resources
  12.7 Multilingual resources
  12.8 Validation, quality assurance, evaluation of language resources
  12.9 Evaluation and standardization of speech and language technologies and systems


Copyright © 2013-2015 Chinese and Oriental Languages Information Processing Society
Conference managed by Meeting Matters International