IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 28
Volume 28, 2020
- Jamal Amini, Richard Christian Hendriks, Richard Heusdens, Meng Guo, Jesper Jensen:
Rate-Constrained Noise Reduction in Wireless Acoustic Sensor Networks. 1-12
- Chitralekha Gupta, Haizhou Li, Ye Wang:
Automatic Leaderboard: Evaluation of Singing Quality Without a Standard Reference. 13-26
- Sefik Emre Eskimez, Ross K. Maddox, Chenliang Xu, Zhiyao Duan:
Noise-Resilient Training Method for Face Landmark Generation From Speech. 27-38
- Peidong Wang, Ke Tan, DeLiang Wang:
Bridging the Gap Between Monaural Speech Enhancement and Recognition With Distortion-Independent Acoustic Modeling. 39-48
- Yuki Mitsufuji, Stefan Uhlich, Norihiro Takamune, Daichi Kitamura, Shoichi Koyama, Hiroshi Saruwatari:
Multichannel Non-Negative Matrix Factorization Using Banded Spatial Covariance Matrices in Wavenumber Domain. 49-60
- Yaron Laufer, Sharon Gannot:
Scoring-Based ML Estimation and CRBs for Reverberation, Speech, and Noise PSDs in a Spatially Homogeneous Noise Field. 61-76
- Naveen Kumar Desiraju, Simon Doclo, Markus Buck, Tobias Wolff:
Online Estimation of Reverberation Parameters For Late Residual Echo Suppression. 77-91
- Mehdi Zohourian, Rainer Martin:
Binaural Direct-to-Reverberant Energy Ratio and Speaker Distance Estimation. 92-104
- Youhyun Shin, Sang-goo Lee:
Learning Context Using Segment-Level LSTM for Neural Sequence Labeling. 105-115
- Gongping Huang, Jingdong Chen, Jacob Benesty:
Design of Planar Differential Microphone Arrays With Fractional Orders. 116-130
- Ming-Hsiang Su, Chung-Hsien Wu, Liang-Yu Chen:
Attention-Based Response Generation Using Parallel Double Q-Learning for Dialog Policy Decision in a Conversational System. 131-143
- Satoru Emura:
Wave-Domain Residual Echo Reduction Using Subspace Tracking. 144-156
- Xin Wang, Shinji Takaki, Junichi Yamagishi, Simon King, Keiichi Tokuda:
A Vector Quantized Variational Autoencoder (VQ-VAE) Autoregressive Neural F0 Model for Statistical Parametric Speech Synthesis. 157-170
- Falk-Martin Hoffmann, Philip Arthur Nelson, Filippo Maria Fazi:
DOA Estimation Performance With Circular Arrays in Sound Fields With Finite Rate of Innovation. 171-184
- Rongfeng Su, Xunying Liu, Lan Wang, Jingzhou Yang:
Cross-Domain Deep Visual Feature Generation for Mandarin Audio-Visual Speech Recognition. 185-197
- Titouan Parcollet, Mohamed Morchid, Xavier Bost, Georges Linarès, Renato De Mori:
Real to H-Space Autoencoders for Theme Identification in Telephone Conversations. 198-210
- Antonio Canclini, Fabio Antonacci, Stefano Tubaro, Augusto Sarti:
A Methodology for the Robust Estimation of the Radiation Pattern of Acoustic Sources. 211-224
- Yi Yu, Hongsen He, Badong Chen, Jianghui Li, Youwen Zhang, Lu Lu:
M-Estimate Based Normalized Subband Adaptive Filter Algorithm: Performance Analysis and Improvements. 225-239
- Haoxiang Wen, Senquan Yang, Yuanquan Hong, Huan Luo:
A Partial Update Adaptive Algorithm for Sparse System Identification. 240-255
- Martin Bo Møller, Jan Østergaard:
A Moving Horizon Framework for Sound Zones. 256-265
- Stylianos Ioannis Mimilakis, Konstantinos Drossos, Estefanía Cano, Gerald Schuller:
Examining the Mapping Functions of Denoising Autoencoders in Singing Voice Separation. 266-278
- Lachlan Birnie, Thushara D. Abhayapala, Prasanga N. Samarasinghe:
Reflection Assisted Sound Source Localization Through a Harmonic Domain MUSIC Framework. 279-293
- Wenhao Ding, Liang He:
Adaptive Multi-Scale Detection of Acoustic Events. 294-306
- Weijian Zhang, Peng Song:
Transfer Sparse Discriminant Subspace Learning for Cross-Corpus Speech Emotion Recognition. 307-318
- Bidisha Sharma, Ye Wang:
Automatic Evaluation of Song Intelligibility Using Singing Adapted STOI and Vocal-Specific Features. 319-331
- Hai Morgenstern, Boaz Rafaely:
Perceptually-Transparent Online Estimation of Two-Channel Room Transfer Function for Sound Calibration. 332-342
- Shaojin Ding, Guanlong Zhao, Christopher Liberatore, Ricardo Gutierrez-Osuna:
Learning Structured Sparse Representations for Voice Conversion. 343-354
- Mireia Díez, Lukás Burget, Federico Landini, Jan Cernocký:
Analysis of Speaker Diarization Based on Bayesian HMM With Eigenvoice Priors. 355-368
- Jia-Chen Gu, Zhen-Hua Ling, Quan Liu:
Utterance-to-Utterance Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots. 369-379
- Ke Tan, DeLiang Wang:
Learning Complex Spectral Mapping With Gated Convolutional Recurrent Networks for Monaural Speech Enhancement. 380-390
- Richeng Duan, Tatsuya Kawahara, Masatake Dantsuji, Hiroaki Nanjo:
Cross-Lingual Transfer Learning of Non-Native Acoustic Modeling for Pronunciation Error Detection and Diagnosis. 391-401
- Xin Wang, Shinji Takaki, Junichi Yamagishi:
Neural Source-Filter Waveform Models for Statistical Parametric Speech Synthesis. 402-415
- Sanjeel Parekh, Slim Essid, Alexey Ozerov, Ngoc Q. K. Duong, Patrick Pérez, Gaël Richard:
Weakly Supervised Representation Learning for Audio-Visual Scene Analysis. 416-428
- Jianfei Yu, Jing Jiang, Rui Xia:
Entity-Sensitive Attention and Fusion Network for Entity-Level Multimodal Sentiment Classification. 429-439
- John G. Beerends, Niels M. P. Neumann, Egon L. van den Broek, Anna Llagostera Casanovas, Jovana Torres Menendez, Christian Schmidmer, Jens Berger:
Subjective and Objective Assessment of Full Bandwidth Speech Quality. 440-449
- Vikram C. Mathad, S. R. Mahadeva Prasanna:
Vowel Onset Point Based Screening of Misarticulated Stops in Cleft Lip and Palate Speech. 450-460
- Minh Nguyen, Gia H. Ngo, Nancy F. Chen:
Hierarchical Character Embeddings: Learning Phonological and Semantic Representations in Languages of Logographic Origin Using Recursive Neural Networks. 461-473
- Dani Cherkassky, Sharon Gannot:
Successive Relative Transfer Function Identification Using Blind Oblique Projection. 474-486
- Ivo Trowitzsch, Christopher Schymura, Dorothea Kolossa, Klaus Obermayer:
Joining Sound Event Detection and Localization Through Spatial Segregation. 487-502
- Shinichi Mogami, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo, Nobutaka Ono:
Independent Low-Rank Matrix Analysis Based on Time-Variant Sub-Gaussian Source Model for Determined Blind Source Separation. 503-518
- Hamzeh Ghasemzadeh, Meisam Khalil Arjmandi:
Toward Optimum Quantification of Pathology-Induced Noises: An Investigation of Information Missed by Human Auditory System. 519-528
- Fei Ma, Wen Zhang, Thushara Dheemantha Abhayapala:
Active Control of Outgoing Broadband Noise Fields in Rooms. 529-539
- Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai:
Non-Parallel Sequence-to-Sequence Voice Conversion With Disentangled Linguistic and Speaker Representations. 540-552
- Tao Dai, Li Zhu, Yaxiong Wang, Kathleen M. Carley:
Attentive Stacked Denoising Autoencoder With Bi-LSTM for Personalized Context-Aware Citation Recommendation. 553-568
- Yuta Nishimura, Katsuhito Sudoh, Graham Neubig, Satoshi Nakamura:
Multi-Source Neural Machine Translation With Missing Data. 569-580
- Jin Wang, Liang-Chih Yu, K. Robert Lai, Xuejie Zhang:
Tree-Structured Regional CNN-LSTM Model for Dimensional Sentiment Analysis. 581-591
- Abul Azad, Lamine Mili:
Robust Speech Filter and Voice Encoder Parameter Estimation Using the Phase-Phase Correlator. 592-604
- Abdullah Fahim, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Multi-Source DOA Estimation Through Pattern Recognition of the Modal Coherence of a Reverberant Soundfield. 605-618
- Yaron Laufer, Bracha Laufer-Goldshtein, Sharon Gannot:
ML Estimation and CRBs for Reverberation, Speech, and Noise PSDs in Rank-Deficient Noise Field. 619-634
- Zhongqing Wang, Qingying Sun, Shoushan Li, Qiaoming Zhu, Guodong Zhou:
Neural Stance Detection With Hierarchical Linguistic Representations. 635-645
- Ruizhi Li, Xiaofei Wang, Sri Harish Mallidi, Shinji Watanabe, Takaaki Hori, Hynek Hermansky:
Multi-Stream End-to-End Speech Recognition. 646-655
- Yu Maeno, Yuki Mitsufuji, Prasanga N. Samarasinghe, Naoki Murata, Thushara D. Abhayapala:
Spherical-Harmonic-Domain Feedforward Active Noise Control Using Sparse Decomposition of Reference Signals from Distributed Sensor Arrays. 656-670
- Qingyu Zhou, Nan Yang, Furu Wei, Shaohan Huang, Ming Zhou, Tiejun Zhao:
A Joint Sentence Scoring and Selection Framework for Neural Extractive Document Summarization. 671-681
- Ivan Kukanov, Trung Ngo Trong, Ville Hautamäki, Sabato Marco Siniscalchi, Valerio Mario Salerno, Kong Aik Lee:
Maximal Figure-of-Merit Framework to Detect Multi-Label Phonetic Features for Spoken Language Recognition. 682-695
- Shoichi Koyama, Gilles Chardon, Laurent Daudet:
Optimizing Source and Sensor Placement for Sound Field Control: An Overview. 696-714
- Atsushi Ando, Ryo Masumura, Hosana Kamiyama, Satoshi Kobashikawa, Yushi Aono, Tomoki Toda:
Customer Satisfaction Estimation in Contact Center Calls Based on a Hierarchical Multi-Task Model. 715-728
- Thomas Dietzen, Simon Doclo, Marc Moonen, Toon van Waterschoot:
Integrated Sidelobe Cancellation and Linear Prediction Kalman Filter for Joint Multi-Microphone Speech Dereverberation, Interfering Speech Cancellation, and Noise Reduction. 740-754
- Thomas Dietzen, Simon Doclo, Marc Moonen, Toon van Waterschoot:
Square Root-Based Multi-Source Early PSD Estimation and Recursive RETF Update in Reverberant Environments by Means of the Orthogonal Procrustes Problem. 755-769
- Liwen Zhang, Ziqiang Shi, Jiqing Han:
Pyramidal Temporal Pooling With Discriminative Mapping for Audio Classification. 770-784
- Mengfan Zhang, Zhongshu Ge, Tiejun Liu, Xihong Wu, Tianshu Qu:
Modeling of Individual HRTFs Based on Spatial Principal Component Analysis. 785-797
- Laureano Moro-Velázquez, Estefanía Hernández-García, Jorge Andrés Gómez García, Juan Ignacio Godino-Llorente, Najim Dehak:
Analysis of the Effects of Supraglottal Tract Surgical Procedures in Automatic Speaker Recognition Performance. 798-812
- Yijia Liu, Wanxiang Che, Bing Qin, Ting Liu:
Exploring Segment Representations for Neural Semi-Markov Conditional Random Fields. 813-824
- Morten Kolbæk, Zheng-Hua Tan, Søren Holdt Jensen, Jesper Jensen:
On Loss Functions for Supervised Monaural Time-Domain Speech Enhancement. 825-838
- Yang Ai, Zhen-Hua Ling:
A Neural Vocoder With Hierarchical Generation of Amplitude and Phase Spectra for Statistical Parametric Speech Synthesis. 839-851
- Dongyan Yu, Huiping Duan, Jun Fang, Bing Zeng:
Predominant Instrument Recognition Based on Deep Neural Network With Auxiliary Classification. 852-861
- Ali Aroudi, Simon Doclo:
Cognitive-Driven Binaural Beamforming Using EEG-Based Auditory Attention Decoding. 862-875
- Christopher Gribben, Hyunkook Lee:
The Perception of Band-Limited Decorrelation Between Vertically Oriented Loudspeakers. 876-888
- Olivier Perrotin, Ian Vince McLoughlin:
Glottal Flow Synthesis for Whisper-to-Speech Conversion. 889-900
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen:
Differential Beamforming on Graphs. 901-913
- Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot:
Global and Local Simplex Representations for Multichannel Source Separation. 914-928
- Henning F. Schepker, Sven Nordholm, Simon Doclo:
Acoustic Feedback Suppression for Multi-Microphone Hearing Devices Using a Soft-Constrained Null-Steering Beamformer. 929-940
- Zhong-Qiu Wang, DeLiang Wang:
Deep Learning Based Target Cancellation for Speech Dereverberation. 941-950
- Yeongseok Kim, Youngjin Park:
Blockwise Weighted Least Square Active Noise Control for CPU-GPU Architecture. 951-963
- Odette Scharenborg, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller:
Speech Technology for Unwritten Languages. 964-975
- Andros Tjandra, Sakriani Sakti, Satoshi Nakamura:
Machine Speech Chain. 976-989
- M. Khadem-hosseini, Shahrokh Ghaemmaghami, Azra Abtahi, Saeed Gazor, Farrokh Marvasti:
Error Correction in Pitch Detection Using a Deep Learning Based Classification. 990-999
- Enzo De Sena, Zoran Cvetkovic, Hüseyin Hacihabiboglu, Marc Moonen, Toon van Waterschoot:
Localization Uncertainty in Time-Amplitude Stereophonic Reproduction. 1000-1015
- Vera Erbes, Sascha Spors:
Localisation Properties of Wave Field Synthesis in a Listening Room. 1016-1024
- Jia Pan, Genshun Wan, Jun Du, Zhongfu Ye:
Online Speaker Adaptation Using Memory-Aware Networks for Speech Recognition. 1025-1037
- Weicheng Cai, Jinkun Chen, Jun Zhang, Ming Li:
On-the-Fly Data Loader and Utterance-Level Aggregation for Speaker and Language Recognition. 1038-1051
- George Sterpu, Christian Saam, Naomi Harte:
How to Teach DNNs to Pay Attention to the Visual Modality in Speech Recognition. 1052-1064
- Christopher Schymura, Dorothea Kolossa:
Audiovisual Speaker Tracking Using Nonlinear Dynamical Systems With Dynamic Stream Weights. 1065-1078
- Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen:
A Simple Theory and New Method of Differential Beamforming With Uniform Linear Microphone Arrays. 1079-1093
- Chung-Ying Ho, Kuo-Kai Shyu, Cheng-Yuan Chang, Sen M. Kuo:
Efficient Narrowband Noise Cancellation System Using Adaptive Line Enhancer. 1094-1103
- Aditya Arie Nugraha, Kouhei Sekiguchi, Kazuyoshi Yoshii:
A Flow-Based Deep Latent Variable Model for Speech Spectrogram Modeling and Enhancement. 1104-1117
- Beat Gfeller, Christian Havnø Frank, Dominik Roblek, Matthew Sharifi, Marco Tagliasacchi, Mihajlo Velimirovic:
SPICE: Self-Supervised Pitch Estimation. 1118-1128
- Christoph Urbanietz, Gerald Enzner:
Direct Spatial-Fourier Regression of HRIRs from Multi-Elevation Continuous-Azimuth Recordings. 1129-1142
- Yaakov Buchris, Israel Cohen, Jacob Benesty, Alon Amar:
Joint Sparse Concentric Array Design for Frequency and Rotationally Invariant Beampattern. 1143-1158
- Tharindu Fernando, Sridha Sridharan, Mitchell McLaren, Darshana Priyasad, Simon Denman, Clinton Fookes:
Temporarily-Aware Context Modeling Using Generative Adversarial Networks for Speech Activity Detection. 1159-1169
- Haipeng Sun, Rui Wang, Kehai Chen, Masao Utiyama, Eiichiro Sumita, Tiejun Zhao:
Unsupervised Neural Machine Translation With Cross-Lingual Language Representation Agreement. 1170-1182
- Qiaoling Zhang, WeiQiang Xu, Weiwei Zhang, Jie Feng, Zhiyong Chen:
Multi-Hypothesis Square-Root Cubature Kalman Particle Filter for Speaker Tracking in Noisy and Reverberant Environments. 1183-1197
- Yinhe Zheng, Guanyi Chen, Minlie Huang:
Out-of-Domain Detection for Natural Language Understanding in Dialog Systems. 1198-1209
- Ina Kodrasi, Hervé Bourlard:
Spectro-Temporal Sparsity Characterization for Dysarthric Speech Detection. 1210-1222
- Bharat Padi, Anand Mohan, Sriram Ganapathy:
Towards Relevance and Sequence Modeling in Language Recognition. 1223-1232
- Iván López-Espejo, Zheng-Hua Tan, Jesper Jensen:
Improved External Speaker-Robust Keyword Spotting for Hearing Assistive Devices. 1233-1247
- Vishnuvardhan Varanasi, Harshit Gupta, Rajesh M. Hegde:
A Deep Learning Framework for Robust DOA Estimation Using Spherical Harmonic Decomposition. 1248-1259
- Sahar Hashemgeloogerdi, Mark F. Bocko:
Adaptive Feedback Cancellation in Hearing Aids Based on Orthonormal Basis Functions With Prediction-Error Method Based Prewhitening. 1260-1269
- Maximo Cobos, Fabio Antonacci, Luca Comanducci, Augusto Sarti:
Frequency-Sliding Generalized Cross-Correlation: A Sub-Band Time Delay Estimation Approach. 1270-1281
- Yingying Zhu, Haiquan Zhao, Xiangping Zeng, Badong Chen:
Robust Generalized Maximum Correntropy Criterion Algorithms for Active Noise Control. 1282-1292
- Hassan Taherian, Zhong-Qiu Wang, Jorge Chang, DeLiang Wang:
Robust Speaker Recognition Based on Single-Channel and Multi-Channel Speech Enhancement. 1293-1302
- Cunhang Fan, Jianhua Tao, Bin Liu, Jiangyan Yi, Zhengqi Wen, Xuefei Liu:
End-to-End Post-Filter for Speech Separation With Deep Attention Fusion Features. 1303-1314
- T. Lavanya, T. Nagarajan, P. Vijayalakshmi:
Multi-Level Single-Channel Speech Enhancement Using a Unified Framework for Estimating Magnitude and Phase Spectra. 1315-1327
- Adrien Ycart, Emmanouil Benetos:
Learning and Evaluation Methodologies for Polyphonic Music Sequence Prediction With LSTMs. 1328-1341
- Takatomo Kano, Sakriani Sakti, Satoshi Nakamura:
End-to-End Speech Translation With Transcoding by Multi-Task Learning for Distant Language Pairs. 1342-1355
- Huanyu Zuo, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Intensity Based Spatial Soundfield Reproduction Using an Irregular Loudspeaker Array. 1356-1369
- Chenglin Xu, Wei Rao, Eng Siong Chng, Haizhou Li:
SpEx: Multi-Scale Time Domain Speaker Extraction Network. 1370-1384
- Wangyou Zhang, Xuankai Chang, Yanmin Qian, Shinji Watanabe:
Improving End-to-End Single-Channel Multi-Talker Speech Recognition. 1385-1394
- Alakananda Vempala, Eduardo Blanco:
Extracting Biographical Spatial Timelines: Corpus and Experiments. 1395-1403
- Qiquan Zhang, Aaron Nicolson, Mingjiang Wang, Kuldip K. Paliwal, Chenxu Wang:
DeepMMSE: A Deep Learning Approach to MMSE-Based Noise Power Spectral Density Estimation. 1404-1415
- Dhananjay Ram, Lesly Miculicich, Hervé Bourlard:
Neural Network Based End-to-End Query by Example Spoken Term Detection. 1416-1427
- Enea Ceolini, Ilya Kiselev, Shih-Chii Liu:
Evaluating Multi-Channel Multi-Device Speech Separation Algorithms in the Wild: A Hardware-Software Solution. 1428-1439
- Su Zhu, Zijian Zhao, Rao Ma, Kai Yu:
Prior Knowledge Driven Label Embedding for Slot Filling in Natural Language Understanding. 1440-1451
- Haoran Miao, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan:
Online Hybrid CTC/Attention End-to-End Automatic Speech Recognition Architecture. 1452-1465
- Liwei Lin, Xiangdong Wang, Hong Liu, Yueliang Qian:
Specialized Decision Surface and Disentangled Feature for Weakly-Supervised Polyphonic Sound Event Detection. 1466-1478
- Dong-Yuan Shi, Woon-Seng Gan, Bhan Lam, Shulin Wen:
Feedforward Selective Fixed-Filter Active Noise Control: Algorithm and Implementation. 1479-1492
- Zhihao Du, Xueliang Zhang, Jiqing Han:
A Joint Framework of Denoising Autoencoder and Generative Vocoder for Monaural Speech Enhancement. 1493-1505
- Yue Zhang, Yile Wang, Jie Yang:
Lattice LSTM for Chinese Sentence Representation. 1506-1519