22nd Interspeech 2021: Brno, Czechia
- Hynek Hermansky, Honza Cernocký, Lukás Burget, Lori Lamel, Odette Scharenborg, Petr Motlícek: Interspeech 2021, 22nd Annual Conference of the International Speech Communication Association, Brno, Czechia, 30 August - 3 September 2021. ISCA 2021
Speech Synthesis: Other Topics
- Michael Pucher, Thomas Woltron: Conversion of Airborne to Bone-Conducted Speech with Deep Neural Networks. 1-5
- Markéta Rezácková, Jan Svec, Daniel Tihelka: T5G2P: Using Text-to-Text Transfer Transformer for Grapheme-to-Phoneme Conversion. 6-10
- Olivier Perrotin, Hussein El Amouri, Gérard Bailly, Thomas Hueber: Evaluating the Extrapolation Capabilities of Neural Vocoders to Extreme Pitch Values. 11-15
- Phat Do, Matt Coler, Jelske Dijkstra, Esther Klabbers: A Systematic Review and Analysis of Multilingual Data Strategies in Text-to-Speech for Low-Resource Languages. 16-20
Disordered Speech
- Tanya Talkar, Nancy Pearl Solomon, Douglas S. Brungart, Stefanie E. Kuchinsky, Megan M. Eitel, Sara M. Lippa, Tracey A. Brickell, Louis M. French, Rael T. Lange, Thomas F. Quatieri: Acoustic Indicators of Speech Motor Coordination in Adults With and Without Traumatic Brain Injury. 21-25
- Juan Camilo Vásquez-Correa, Julian Fritsch, Juan Rafael Orozco-Arroyave, Elmar Nöth, Mathew Magimai-Doss: On Modeling Glottal Source Information for Phonation Assessment in Parkinson's Disease. 26-30
- Khalid Daoudi, Biswajit Das, Solange Milhé de Saint Victor, Alexandra Foubert-Samier, Anne Pavy-Le Traon, Olivier Rascol, Wassilios G. Meissner, Virginie Woisard: Distortion of Voiced Obstruents for Differential Diagnosis Between Parkinson's Disease and Multiple System Atrophy. 31-35
- Pu Wang, Bagher BabaAli, Hugo Van hamme: A Study into Pre-Training Strategies for Spoken Language Understanding on Dysarthric Speech. 36-40
- Rosanna Turrisi, Arianna Braccia, Marco Emanuele, Simone Giulietti, Maura Pugliatti, Mariachiara Sensi, Luciano Fadiga, Leonardo Badino: EasyCall Corpus: A Dysarthric Speech Dataset. 41-45
Speech Signal Analysis and Representation II
- Xiaoyu Bie, Laurent Girin, Simon Leglaive, Thomas Hueber, Xavier Alameda-Pineda: A Benchmark of Dynamical Variational Autoencoders Applied to Speech Spectrogram Modeling. 46-50
- Metehan Yurt, Pavan Kantharaju, Sascha Disch, Andreas Niedermeier, Alberto N. Escalante-B., Veniamin I. Morgenshtern: Fricative Phoneme Detection Using Deep Neural Networks and its Comparison to Traditional Methods. 51-55
- RaviShankar Prasad, Mathew Magimai-Doss: Identification of F1 and F2 in Speech Using Modified Zero Frequency Filtering. 56-60
- Yann Teytaut, Axel Roebel: Phoneme-to-Audio Alignment with Recurrent Neural Networks for Speaking and Singing Voice. 61-65
Feature, Embedding and Neural Architecture for Speaker Recognition
- Seong-Hu Kim, Yong-Hwa Park: Adaptive Convolutional Neural Network for Text-Independent Speaker Recognition. 66-70
- Jiajun Qi, Wu Guo, Bin Gu: Bidirectional Multiscale Feature Aggregation for Speaker Verification. 71-75
- Yu-Jia Zhang, Yih-Wen Wang, Chia-Ping Chen, Chung-Li Lu, Bo-Cheng Chan: Improving Time Delay Neural Network Based Speaker Recognition with Convolutional Block and Feature Aggregation Methods. 76-80
- Yanfeng Wu, Junan Zhao, Chenkai Guo, Jing Xu: Improving Deep CNN Architectures with Variable-Length Training Samples for Text-Independent Speaker Verification. 81-85
- Tinglong Zhu, Xiaoyi Qin, Ming Li: Binary Neural Network for Speaker Verification. 86-90
- Youzhi Tu, Man-Wai Mak: Mutual Information Enhanced Training for Speaker Embedding. 91-95
- Ge Zhu, Fei Jiang, Zhiyao Duan: Y-Vector: Multiscale Waveform Encoder for Speaker Embedding. 96-100
- Yan Liu, Zheng Li, Lin Li, Qingyang Hong: Phoneme-Aware and Channel-Wise Attentive Learning for Text Dependent Speaker Verification. 101-105
- Hongning Zhu, Kong Aik Lee, Haizhou Li: Serialized Multi-Layer Multi-Head Attention for Neural Speaker Embedding. 106-110
Speech Synthesis: Toward End-to-End Synthesis II
- Cheng Gong, Longbiao Wang, Ju Zhang, Shaotong Guo, Yuguang Wang, Jianwu Dang: TacoLPCNet: Fast and Stable TTS by Conditioning LPCNet on Mel Spectrogram Predictions. 111-115
- Taejun Bak, Jae-Sung Bae, Hanbin Bae, Young-Ik Kim, Hoon-Young Cho: FastPitchFormant: Source-Filter Based Decomposed Modeling for Speech Synthesis. 116-120
- Taiki Nakamura, Tomoki Koriyama, Hiroshi Saruwatari: Sequence-to-Sequence Learning for Deep Gaussian Process Based Speech Synthesis Using Self-Attention GP Layer. 121-125
- Naoto Kakegawa, Sunao Hara, Masanobu Abe, Yusuke Ijima: Phonetic and Prosodic Information Estimation from Texts for Genuine Japanese End-to-End Text-to-Speech. 126-130
- Xudong Dai, Cheng Gong, Longbiao Wang, Kaili Zhang: Information Sieve: Content Leakage Reduction in End-to-End Prosody Transfer for Expressive Speech Synthesis. 131-135
- Qingyun Dou, Xixin Wu, Moquan Wan, Yiting Lu, Mark J. F. Gales: Deliberation-Based Multi-Pass Speech Synthesis. 136-140
- Isaac Elias, Heiga Zen, Jonathan Shen, Yu Zhang, Ye Jia, R. J. Skerry-Ryan, Yonghui Wu: Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling. 141-145
- Chunyang Wu, Zhiping Xiu, Yangyang Shi, Ozlem Kalinli, Christian Fuegen, Thilo Köhler, Qing He: Transformer-Based Acoustic Modeling for Streaming Speech Synthesis. 146-150
- Ye Jia, Heiga Zen, Jonathan Shen, Yu Zhang, Yonghui Wu: PnG BERT: Augmented BERT on Phonemes and Graphemes for Neural TTS. 151-155
- Zhenhao Ge, Lakshmish Kaushik, Masanori Omote, Saket Kumar: Speed up Training with Variable Length Inputs by Efficient Batching Strategies. 156-160
Speech Enhancement and Intelligibility
- Yuhang Sun, Linju Yang, Huifeng Zhu, Jie Hao: Funnel Deep Complex U-Net for Phase-Aware Speech Enhancement. 161-165
- Qiquan Zhang, Qi Song, Aaron Nicolson, Tian Lan, Haizhou Li: Temporal Convolutional Network with Frequency Dimension Adaptive Attention for Speech Enhancement. 166-170
- Changjie Pan, Feng Yang, Fei Chen: Perceptual Contributions of Vowels and Consonant-Vowel Transitions in Understanding Time-Compressed Mandarin Sentences. 171-175
- Ritujoy Biswas, Karan Nathwani, Vinayak Abrol: Transfer Learning for Speech Intelligibility Improvement in Noisy Environments. 176-180
- Ayako Yamamoto, Toshio Irino, Kenichi Arai, Shoko Araki, Atsunori Ogawa, Keisuke Kinoshita, Tomohiro Nakatani: Comparison of Remote Experiments Using Crowdsourcing and Laboratory Experiments on Speech Intelligibility. 181-185
- Wenzhe Liu, Andong Li, Yuxuan Ke, Chengshi Zheng, Xiaodong Li: Know Your Enemy, Know Yourself: A Unified Two-Stage Framework for Speech Enhancement. 186-190
- Qiuqiang Kong, Haohe Liu, Xingjian Du, Li Chen, Rui Xia, Yuxuan Wang: Speech Enhancement with Weakly Labelled Data from AudioSet. 191-195
- Tsun-An Hsieh, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao: Improving Perceptual Quality by Phone-Fortified Perceptual Loss Using Wasserstein Distance for Speech Enhancement. 196-200
- Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao: MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement. 201-205
- Amin Edraki, Wai-Yip Chan, Jesper Jensen, Daniel Fogerty: A Spectro-Temporal Glimpsing Index (STGI) for Speech Intelligibility Prediction. 206-210
- Yuanhang Qiu, Ruili Wang, Satwinder Singh, Zhizhong Ma, Feng Hou: Self-Supervised Learning Based Phone-Fortified Speech Enhancement. 211-215
- Khandokar Md. Nayem, Donald S. Williamson: Incorporating Embedding Vectors from a Human Mean-Opinion Score Prediction Model for Monaural Speech Enhancement. 216-220
- Jianwei Zhang, Suren Jayasuriya, Visar Berisha: Restoring Degraded Speech via a Modified Diffusion Model. 221-225
Spoken Dialogue Systems I
- Hoang Long Nguyen, Vincent Renkens, Joris Pelemans, Srividya Pranavi Potharaju, Anil Kumar Nalamalapu, Murat Akbacak: User-Initiated Repetition-Based Recovery in Multi-Utterance Dialogue Systems. 226-230
- Nuo Chen, Chenyu You, Yuexian Zou: Self-Supervised Dialogue Learning for Spoken Conversational Question Answering. 231-235
- Ruolin Su, Ting-Wei Wu, Biing-Hwang Juang: Act-Aware Slot-Value Predicting in Multi-Domain Dialogue State Tracking. 236-240
- Yuya Chiba, Ryuichiro Higashinaka: Dialogue Situation Recognition for Everyday Conversation Using Multimodal Information. 241-245
- Yoshihiro Yamazaki, Yuya Chiba, Takashi Nose, Akinori Ito: Neural Spoken-Response Generation Using Prosodic and Linguistic Context for Conversational Systems. 246-250
- Weiyuan Xu, Peilin Zhou, Chenyu You, Yuexian Zou: Semantic Transportation Prototypical Network for Few-Shot Intent Detection. 251-255
- Li Tang, Yuke Si, Longbiao Wang, Jianwu Dang: Domain-Specific Multi-Agent Dialog Policy Learning in Multi-Domain Task-Oriented Scenarios. 256-260
- Haoyu Wang, John Chen, Majid Laali, Kevin Durda, Jeff King, William Campbell, Yang Liu: Leveraging ASR N-Best in Deep Entity Retrieval. 261-265
Topics in ASR: Robustness, Feature Extraction, and Far-Field ASR
- Shuai Zhang, Jiangyan Yi, Zhengkun Tian, Ye Bai, Jianhua Tao, Xuefei Liu, Zhengqi Wen: End-to-End Spelling Correction Conditioned on Acoustic Feature for Code-Switching Speech Recognition. 266-270
- Kathleen Siminyu, Xinjian Li, Antonios Anastasopoulos, David R. Mortensen, Michael R. Marlo, Graham Neubig: Phoneme Recognition Through Fine Tuning of Phonetic Representations: A Case Study on Luhya Language Varieties. 271-275
- Erfan Loweimi, Zoran Cvetkovic, Peter Bell, Steve Renals: Speech Acoustic Modelling Using Raw Source and Filter Components. 276-280
- Masakiyo Fujimoto, Hisashi Kawai: Noise Robust Acoustic Modeling for Single-Channel Speech Recognition Based on a Stream-Wise Transformer Architecture. 281-285
- Anton Ratnarajah, Zhenyu Tang, Dinesh Manocha: IR-GAN: Room Impulse Response Generator for Far-Field Speech Recognition. 286-290
- Junqi Chen, Xiao-Lei Zhang: Scaling Sparsemax Based Channel Selection for Speech Recognition with ad-hoc Microphone Arrays. 291-295
- Feng-Ju Chang, Martin Radfar, Athanasios Mouchtaris, Maurizio Omologo: Multi-Channel Transformer Transducer for Speech Recognition. 296-300
- Emiru Tsunoo, Kentaro Shibata, Chaitanya Narisetty, Yosuke Kashiwagi, Shinji Watanabe: Data Augmentation Methods for End-to-End Speech Recognition on Distant-Talk Scenarios. 301-305
- Guodong Ma, Pengfei Hu, Jian Kang, Shen Huang, Hao Huang: Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition. 306-310
- Tatiana Likhomanenko, Qiantong Xu, Vineel Pratap, Paden Tomasello, Jacob Kahn, Gilad Avidov, Ronan Collobert, Gabriel Synnaeve: Rethinking Evaluation in ASR: Are Our Models Robust Enough? 311-315
- Max W. Y. Lam, Jun Wang, Chao Weng, Dan Su, Dong Yu: Raw Waveform Encoder with Multi-Scale Globally Attentive Locally Recurrent Networks for End-to-End Speech Recognition. 316-320
Voice Activity Detection and Keyword Spotting
- Yuanbo Hou, Zhesong Yu, Xia Liang, Xingjian Du, Bilei Zhu, Zejun Ma, Dick Botteldooren: Attention-Based Cross-Modal Fusion for Audio-Visual Voice Activity Detection in Musical Video Streams. 321-325
- Ui-Hyun Kim: Noise-Tolerant Self-Supervised Learning for Audio-Visual Voice Activity Detection. 326-330
- Hyun-Jin Park, Pai Zhu, Ignacio Lopez-Moreno, Niranjan Subrahmanya: Noisy Student-Teacher Training for Robust Keyword Spotting. 331-335
- Osamu Ichikawa, Kaito Nakano, Takahiro Nakayama, Hajime Shirouzu: Multi-Channel VAD for Transcription of Group Discussion. 336-340
- Hengshun Zhou, Jun Du, Hang Chen, Zijun Jing, Shifu Xiong, Chin-Hui Lee: Audio-Visual Information Fusion Using Cross-Modal Teacher-Student Learning for Voice Activity Detection in Realistic Environments. 341-345
- Naoki Makishima, Mana Ihori, Tomohiro Tanaka, Akihiko Takashima, Shota Orihashi, Ryo Masumura: Enrollment-Less Training for Personalized Voice Activity Detection. 346-350
- Yuto Nonaka, Chee Siang Leow, Akio Kobayashi, Takehito Utsuro, Hiromitsu Nishizaki: Voice Activity Detection for Live Speech of Baseball Game Based on Tandem Connection with Speech/Noise Separation Model. 351-355
- Young D. Kwon, Jagmohan Chauhan, Cecilia Mascolo: FastICARL: Fast Incremental Classifier and Representation Learning with Efficient Budget Allocation in Audio Sensing Applications. 356-360
- Bo Wei, Meirong Yang, Tao Zhang, Xiao Tang, Xing Huang, Kyuhong Kim, Jaeyun Lee, Kiho Cho, Sung-Un Park: End-to-End Transformer-Based Open-Vocabulary Keyword Spotting with Location-Guided Local Attention. 361-365
- Saurabhchand Bhati, Jesús Villalba, Piotr Zelasko, Laureano Moro-Velázquez, Najim Dehak: Segmental Contrastive Predictive Coding for Unsupervised Word Segmentation. 366-370
- Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu: A Lightweight Framework for Online Voice Activity Detection in the Wild. 371-375
Voice and Voicing
- Aurélie Chlébowski, Nicolas Ballier: "See what I mean, huh?" Evaluating Visual Inspection of F0 Tracking in Nasal Grunts. 376-380
- Bruce Xiao Wang, Vincent Hughes: System Performance as a Function of Calibration Methods, Sample Size and Sampling Variability in Likelihood Ratio-Based Forensic Voice Comparison. 381-385
- Anne Bonneau: Voicing Assimilations by French Speakers of German in Stop-Fricative Sequences. 386-390
- Titas Chakraborty, Vaishali Patil, Preeti Rao: The Four-Way Classification of Stops with Voicing and Aspiration for Non-Native Speech Evaluation. 391-395
- Saba Urooj, Benazir Mumtaz, Sarmad Hussain, Ehsan ul Haq: Acoustic and Prosodic Correlates of Emotions in Urdu Speech. 396-400
- Nour Tamim, Silke Hamann: Voicing Contrasts in the Singleton Stops of Palestinian Arabic: Production and Perception. 401-405
- Thomas Coy, Vincent Hughes, Philip Harrison, Amelia Jane Gully: A Comparison of the Accuracy of Dissen and Keshet's (2016) DeepFormants and Traditional LPC Methods for Semi-Automatic Speaker Recognition. 406-410
- Michael Jessen: MAP Adaptation Characteristics in Forensic Long-Term Formant Analysis. 411-415
- Justin J. H. Lo: Cross-Linguistic Speaker Individuality of Long-Term Formant Distributions: Phonetic and Forensic Perspectives. 416-420
- Rachel Soo, Khia A. Johnson, Molly Babel: Sound Change in Spontaneous Bilingual Speech: A Corpus Study on the Cantonese n-l Merger in Cantonese-English Bilinguals. 421-425
- Wendy Lalhminghlui, Priyankoo Sarmah: Characterizing Voiced and Voiceless Nasals in Mizo. 426-430
The INTERSPEECH 2021 Computational Paralinguistics Challenge (ComParE) - COVID-19 Cough, COVID-19 Speech, Escalation & Primates
- Björn W. Schuller, Anton Batliner, Christian Bergler, Cecilia Mascolo, Jing Han, Iulia Lefter, Heysem Kaya, Shahin Amiriparian, Alice Baird, Lukas Stappen, Sandra Ottl, Maurice Gerczuk, Panagiotis Tzirakis, Chloë Brown, Jagmohan Chauhan, Andreas Grammenos, Apinan Hasthanasombat, Dimitris Spathis, Tong Xia, Pietro Cicuta, Léon J. M. Rothkrantz, Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp: The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates. 431-435
- Rubén Solera-Ureña, Catarina Botelho, Francisco Teixeira, Thomas Rolland, Alberto Abad, Isabel Trancoso: Transfer Learning-Based Cough Representations for Automatic Detection of COVID-19. 436-440
- Philipp Klumpp, Tobias Bocklet, Tomas Arias-Vergara, Juan Camilo Vásquez-Correa, Paula Andrea Pérez-Toro, Sebastian P. Bayerl, Juan Rafael Orozco-Arroyave, Elmar Nöth: The Phonetic Footprint of Covid-19? 441-445
- Edresson Casanova, Arnaldo Candido Jr., Ricardo Corso Fernandes Junior, Marcelo Finger, Lucas Rafael Stefanel Gris, Moacir Antonelli Ponti, Daniel Peixoto Pinto da Silva: Transfer Learning and Data Augmentation Techniques to the COVID-19 Identification Tasks in ComParE 2021. 446-450
- Steffen Illium, Robert Müller, Andreas Sedlmeier, Claudia Linnhoff-Popien: Visual Transformers for Primates Classification and Covid Detection. 451-455
- Thomas Pellegrini: Deep-Learning-Based Central African Primate Species Classification with MixUp and SpecAugment. 456-460
- Robert Müller, Steffen Illium, Claudia Linnhoff-Popien: A Deep and Recurrent Architecture for Primate Vocalization Classification. 461-465
- Joeri A. Zwerts, Jelle Treep, Casper S. Kaandorp, Floor Meewis, Amparo C. Koot, Heysem Kaya: Introducing a Central African Primate Vocalisation Dataset for Automated Species Classification. 466-470
- Georgios Rizos, Jenna Lawson, Zhuoda Han, Duncan Butler, James Rosindell, Krystian Mikolajczyk, Cristina Banks-Leite, Björn W. Schuller: Multi-Attentive Detection of the Spider Monkey Whinny in the (Actual) Wild. 471-475
- José Vicente Egas López, Mercedes Vetráb, László Tóth, Gábor Gosztolya: Identifying Conflict Escalation and Primates by Using Ensemble X-Vectors and Fisher Vector Features. 476-480
- Oxana Verkholyak, Denis Dresvyanskiy, Anastasia Dvoynikova, Denis Kotov, Elena Ryumina, Alena Velichko, Danila Mamontov, Wolfgang Minker, Alexey Karpov: Ensemble-Within-Ensemble Classification for Escalation Prediction from Speech. 481-485
- Dominik Schiller, Silvan Mertes, Pol van Rijn, Elisabeth André: Analysis by Synthesis: Using an Expressive TTS Model as Feature Extractor for Paralinguistic Speech Classification. 486-490
Survey Talk 1: Heidi Christensen
- Heidi Christensen: Towards Automatic Speech Recognition for People with Atypical Speech.
Embedding and Network Architecture for Speaker Recognition
- Chau Luu, Peter Bell, Steve Renals: Leveraging Speaker Attribute Information Using Multi Task Learning for Speaker Verification and Diarization. 491-495
- Magdalena Rybicka, Jesús Villalba, Piotr Zelasko, Najim Dehak, Konrad Kowalczyk: Spine2Net: SpineNet with Res2Net and Time-Squeeze-and-Excitation Blocks for Speaker Recognition. 496-500
- Themos Stafylakis, Johan Rohdin, Lukás Burget: Speaker Embeddings by Modeling Channel-Wise Correlations. 501-505
- Weipeng He, Petr Motlícek, Jean-Marc Odobez: Multi-Task Neural Network for Robust Multiple Speaker Embedding Extraction. 506-510
- Junyi Peng, Xiaoyang Qu, Jianzong Wang, Rongzhi Gu, Jing Xiao, Lukás Burget, Jan Cernocký: ICSpk: Interpretable Complex Speaker Embedding Extractor from Raw Waveform. 511-515
Speech Perception I
- Xiao Xiao, Nicolas Audibert, Grégoire Locqueville, Christophe d'Alessandro, Barbara Kuhnert, Claire Pillot-Loiseau: Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers. 516-520
- Aleese Block, Michelle Cohn, Georgia Zellou: Variation in Perceptual Sensitivity and Compensation for Coarticulation Across Adult and Child Naturally-Produced and TTS Voices. 521-525
- Mohammad Jalilpour-Monesi, Bernd Accou, Tom Francart, Hugo Van hamme: Extracting Different Levels of Speech Information from EEG Using an LSTM-Based Model. 526-530
- Louis ten Bosch, Lou Boves: Word Competition: An Entropy-Based Approach in the DIANA Model of Human Word Comprehension. 531-535
- Louis ten Bosch, Lou Boves: Time-to-Event Models for Analyzing Reaction Time Sequences. 536-540
- Sophie Brand, Kimberley Mulder, Louis ten Bosch, Lou Boves: Models of Reaction Times in Auditory Lexical Decision: RTonset versus RToffset. 541-545
Acoustic Event Detection and Acoustic Scene Classification
- Gwantae Kim, David K. Han, Hanseok Ko: SpecMix: A Mixed Sample Data Augmentation Method for Training with Time-Frequency Domain Features. 546-550
- Helin Wang, Yuexian Zou, Wenwu Wang: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification. 551-555
- Xu Zheng, Yan Song, Li-Rong Dai, Ian McLoughlin, Lin Liu: An Effective Mutual Mean Teaching Based Domain Adaptation Method for Sound Event Detection. 556-560
- Ritika Nandi, Shashank Shekhar, Manjunath Mulimani: Acoustic Scene Classification Using Kervolution-Based SubSpectralNet. 561-565
- Harshavardhan Sundar, Ming Sun, Chao Wang: Event Specific Attention for Polyphonic Sound Event Detection. 566-570
- Yuan Gong, Yu-An Chung, James R. Glass: AST: Audio Spectrogram Transformer. 571-575
- Soonshin Seo, Donghyun Lee, Ji-Hwan Kim: Shallow Convolution-Augmented Transformer with Differentiable Neural Computer for Low-Complexity Classification of Variable-Length Acoustic Scene. 576-580
- Helen L. Bear, Veronica Morfi, Emmanouil Benetos: An Evaluation of Data Augmentation Methods for Sound Scene Geotagging. 581-585