Areas & Topics

1 Speech Perception, Production and Acquisition by Human Listeners

1.1 Models of speech production
1.2 Physiology and neurophysiology of speech production and perception
1.3 Models of speech perception
1.4 Acoustic and articulatory cues in speech perception
1.5 Interaction between speech production and speech perception
1.6 Multimodal speech perception
1.7 Cognition and brain studies on speech
1.8 Code switching and multilingual studies
1.9 L1 acquisition and processing
1.10 Bilingual and L2 acquisition and processing
1.11 Combining speech and other biosignals
1.12 Other topics in Speech Perception, Production and Acquisition

2 Phonetics, Phonology, and Prosody

2.1 Phonetics and phonology
2.2 Language descriptions
2.3 Acoustic phonetics
2.4 Phonation and voice quality
2.5 Articulatory and acoustic features of prosody
2.6 Sociophonetics and sound changes
2.7 Phonetics of L1-L2 interaction
2.8 Forensic phonetics
2.9 Phonetic and linguistic aspects of paralinguistics
2.10 Other topics in Phonetics, Phonology, and Prosody

3 Paralinguistics in Speech and Language: Human and Automatic Analysis and Processing

3.1 Analysis of speaker states
3.2 Analysis of speaker traits
3.3 Automatic analysis of speaker states
3.4 Automatic analysis of speaker traits
3.5 Social signal processing
3.6 Sentiment analysis and opinion mining
3.7 Perception of paralinguistic phenomena
3.8 Multimodal paralinguistics
3.9 Other topics in Paralinguistics in Speech and Language

4 Speaker and Language Identification

4.1 Language identification and verification, language diarization
4.2 Dialect and accent recognition
4.3 Speaker verification and identification
4.4 Features for speaker and language recognition
4.5 Speaker diarization
4.6 Evaluation of speaker and language identification systems
4.7 Multimodal/multimedia speaker recognition and diarization
4.8 Other topics in Speaker and Language Identification

5 Analysis of Speech and Audio Signals

5.1 Bioacoustics
5.2 Speech signal analysis and representation
5.3 Acoustic event detection and acoustic scene classification
5.4 Speech and audio segmentation
5.5 Speech type classification
5.6 Voice activity detection and audio segmentation
5.7 Detection, inference, and segmentation of phonetic events and articulation
5.8 Source separation
5.9 Spatial audio
5.10 Singing analysis
5.11 Speech and audio quality assessment
5.12 Other topics in Analysis of Speech and Audio Signals

6 Speech Coding and Enhancement

6.1 Speech coding and transmission
6.2 Noise reduction for speech signals
6.3 Speech enhancement: single-channel
6.4 Speech enhancement: multi-channel
6.5 Speech intelligibility
6.6 Speech enhancement in hearing aids
6.7 Dereverberation for speech signals
6.8 Echo cancellation for speech signals
6.9 Evaluation of speech transmission, coding and enhancement
6.10 Bandwidth expansion
6.11 Privacy and security in speech communication
6.12 Other topics in Speech Coding and Enhancement

7 Speech Synthesis and Spoken Language Generation

7.1 Grapheme-to-phoneme conversion for synthesis
7.2 Text processing for speech synthesis
7.3 Signal processing methods for synthesis
7.4 Speech synthesis paradigms and methods
7.5 Towards end-to-end speech synthesis
7.6 Statistical parametric speech synthesis
7.7 Prosody modeling and generation
7.8 Expression, emotion and personality generation
7.9 Synthesis of singing voices
7.10 Voice modification, conversion and morphing
7.11 Cross-lingual and multilingual aspects in speech synthesis, code switching
7.12 Multimodal synthesis for avatars and talking heads
7.13 Tools and data for speech synthesis
7.14 Evaluation of speech synthesis
7.15 Other topics in Speech Synthesis and Spoken Language Generation

8 Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation

8.1 Feature extraction and low-level feature modeling for ASR
8.2 Prosodic features and models
8.3 Robustness against noise or reverberation
8.4 Far field and microphone array speech recognition
8.5 Novel neural network architectures (e.g. sequence models, LSTM variants)
8.6 Neural network training methods (including new objective functions)
8.7 Discriminative acoustic training methods for ASR
8.8 Acoustic model adaptation (bandwidth, emotion, accent)
8.9 Speaker adaptation and normalization
8.10 Pronunciation variants and modeling for speech recognition
8.11 Cross-lingual and multilingual aspects, non-native accents
8.12 Acoustic modeling for conversational speech (dialog, interaction)
8.13 Other topics in Speech Recognition: Signal Processing, Acoustic Modeling, Robustness, Adaptation

9 Speech Recognition: Architecture, Search, and Linguistic Components

9.1 Lexical modeling (lexicon learning, units, morphological models, ...)
9.2 Language model adaptation (domain, diachronic adaptation)
9.3 Language modeling
9.4 Search methods, decoding algorithms, lattices, multipass strategies
9.5 New computational strategies, data-structures for ASR
9.6 Computational resource constrained speech recognition
9.7 Confidence measures
9.8 Cross-lingual and multilingual components for speech recognition, code switching
9.9 Other topics in Speech Recognition: Architecture, Search, and Linguistic Components

10 Speech Recognition: Technologies and Systems for New Applications

10.1 Multimodal systems
10.2 Applications in education and learning (incl. CALL, assessment of fluency)
10.3 Applications in medical practice
10.4 Rich transcription
10.5 Innovative products and services based on speech technologies
10.6 New paradigms (e.g. articulatory models, topic models)
10.7 Zero-resource speech recognition
10.8 Other topics in Speech Recognition: Technologies and Systems for New Applications

11 Spoken dialog systems and conversational analysis

11.1 Spoken dialog systems
11.2 Discourse and dialog structures
11.3 Multimodal interaction and interfaces
11.4 Conversation, communication and interaction
11.5 Analysis of verbal, co-verbal and nonverbal behavior
11.6 Language modeling for conversational speech (dialog, interaction)
11.7 Interactive systems for speech/language training, therapy, communication aids
11.8 Stochastic modeling for dialog
11.9 Question-answering from speech
11.10 Systems for spoken language understanding
11.11 Other topics in Spoken dialog systems and conversational analysis

12 Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources and Evaluation

12.1 Spoken machine translation
12.2 Speech-to-speech translation systems
12.3 Voice search
12.4 Spoken term detection
12.5 Indexing, mining and retrieval of speech and audio documents
12.6 Speech and multimodal resources
12.7 Evaluation of speech technology systems
12.8 Metadata descriptions of speech, audio and text resources
12.9 Methodologies and tools for language resource construction, annotation and evaluation
12.10 Spoken document summarization
12.11 Semantic analysis and classification
12.12 Entity extraction from speech
12.13 Other topics in Spoken Language Processing: Translation, Information Retrieval, Summarization, Resources and Evaluation

13 Speech, voice, and hearing disorders

13.1 Speech disorders
13.2 Voice disorders
13.3 Hearing disorders
13.4 Phonation and voice quality
13.5 Automatic assessment of pathological speech
13.6 Parkinson's Disease
13.7 Dysarthric speech
13.8 Paralinguistics of pathological speech and language
13.9 Applications for voice, speech, and hearing assessment
13.10 Speech technology for disordered speech
13.11 Speech technology for disordered hearing
13.12 Speech technology for disordered voice
13.13 Silent speech interfaces
13.14 Other topics in Speech, voice, and hearing disorders

14 Special Sessions

14.1 Zero Resource Speech Challenge 2021
14.2 Speech Recognition of Atypical Speech
14.3 Shared Task on Automatic Speech Recognition for Non-native Children's Speech
14.4 Oriental Language Recognition
14.5 Far-field Multi-Channel Speech Enhancement Challenge for Video Conferencing (ConferencingSpeech 2021)
14.6 Voice quality characterization for clinical voice assessment: Voice production, acoustics, and auditory perception
14.7 Automatic Speech Recognition in Air Traffic Management (ASR-ATM)
14.8 Alzheimer's Dementia Recognition through Spontaneous Speech: The ADReSS Challenge
14.9 SdSV Challenge 2021: Analysis and Exploration of New Ideas on Short-Duration Speaker Verification
14.10 Multilingual and code-switching ASR challenges for low resource Indian languages
14.11 Acoustic Echo Cancellation (AEC) Challenge
14.12 Non-Autoregressive Sequential Modeling for Speech Processing
14.13 DiCOVA: Diagnosis of COVID-19 using Acoustics
14.14 Deep Noise Suppression Challenge INTERSPEECH 2021
14.15 The Fearless Steps Challenge (Phase 3: FS#3)
14.16 Privacy-preserving Machine Learning for Audio, Speech and Language Processing
14.17 Computational Paralinguistics ChallengE (ComParE) - COVID-19 Cough, COVID-19 Speech, Escalation & Primates
14.18 OpenASR20 and Low Resource ASR Development
14.19 AutoSpeech 2021
14.20 Learned Prosodic Representations in Emotional Speech Classification and Synthesis