ICASSP 2020: Barcelona, Spain
- 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2020, Barcelona, Spain, May 4-8, 2020. IEEE 2020, ISBN 978-1-5090-6631-5
- Mohammad Alaee-Kerahroodi, Shankar Mysore Rama R. Bhavani, Kumar Vijay Mishra, Björn E. Ottersten:
Information Theoretic Approach for Waveform Design in Coexisting MIMO Radar and MIMO Communications. 1-5
- Cece Jin, Tao Zhang, Weijie Kong, Thomas H. Li, Ge Li:
Regression Before Classification for Temporal Action Detection. 1-5
- Victor Besnier, Himalaya Jain, Andrei Bursuc, Matthieu Cord, Patrick Pérez:
This Dataset Does Not Exist: Training Models from Generated Images. 1-5
- Tsung-Han Hsieh, Kai-Hsiang Cheng, Zhe-Cheng Fan, Yu-Ching Yang, Yi-Hsuan Yang:
Addressing The Confounds Of Accompaniments In Singer Identification. 1-5
- Nicholas J. Bryan:
Impulse Response Data Augmentation and Deep Neural Networks for Blind Room Acoustic Parameter Estimation. 1-5
- Jongpil Lee, Nicholas J. Bryan, Justin Salamon, Zeyu Jin, Juhan Nam:
Disentangled Multidimensional Metric Learning for Music Similarity. 6-10
- Vincent Lostanlen, Sripathi Sridhar, Brian McFee, Andrew Farnsworth, Juan Pablo Bello:
Learning the Helix Topology of Musical Pitch. 11-15
- Karim M. Ibrahim, Jimena Royo-Letelier, Elena V. Epure, Geoffroy Peeters, Gaël Richard:
Audio-Based Auto-Tagging With Contextual Tags for Music. 16-20
- Furkan Yesiler, Joan Serrà, Emilia Gómez:
Accurate and Scalable Version Identification Using Musically-Motivated Embeddings. 21-25
- Chaoya Jiang, Deshun Yang, Xiaoou Chen:
Similarity Learning For Cover Song Identification Using Cross-Similarity Matrices of Multi-Level Deep Sequences. 26-30
- Efthymios Tzinis, Shrikant Venkataramani, Zhepei Wang, Y. Cem Sübakan, Paris Smaragdis:
Two-Step Sound Source Separation: Training On Learned Latent Targets. 31-35
- David Ditter, Timo Gerkmann:
A Multi-Phase Gammatone Filterbank for Speech Separation Via Tasnet. 36-40
- Naoya Takahashi, Mayank Kumar Singh, Sakya Basak, Sudarsanam Parthasaarathy, Sriram Ganapathy, Yuki Mitsufuji:
Improving Voice Separation by Incorporating End-To-End Speech Recognition. 41-45
- Yi Luo, Zhuo Chen, Takuya Yoshioka:
Dual-Path RNN: Efficient Long Sequence Modeling for Time-Domain Single-Channel Speech Separation. 46-50
- Christian Uhle, Matteo Torcoli, Jouni Paulus:
Controlling the Perceived Sound Quality for Dialogue Enhancement With Deep Learning. 51-55
- Masahito Togami, Yoshiki Masuyama, Tatsuya Komatsu, Yu Nakagome:
Unsupervised Training for Deep Speech Source Separation with Kullback-Leibler Divergence Based Probabilistic Loss Function. 56-60
- Çagdas Bilen, Giacomo Ferroni, Francesco Tuveri, Juan Azcarreta, Sacha Krstulovic:
A Framework for the Robust Evaluation of Sound Event Detection. 61-65
- Koichi Miyazaki, Tatsuya Komatsu, Tomoki Hayashi, Shinji Watanabe, Tomoki Toda, Kazuya Takeda:
Weakly-Supervised Sound Event Detection with Self-Attention. 66-70
- Thi Ngoc Tho Nguyen, Douglas L. Jones, Woon-Seng Gan:
A Sequence Matching Network for Polyphonic Sound Event Localization and Detection. 71-75
- Bowen Shi, Ming Sun, Krishna C. Puvvada, Chieh-Chi Kao, Spyros Matsoukas, Chao Wang:
Few-Shot Acoustic Event Detection Via Meta Learning. 76-80
- Yu Wang, Justin Salamon, Nicholas J. Bryan, Juan Pablo Bello:
Few-Shot Sound Event Detection. 81-85
- Romain Serizel, Nicolas Turpault, Ankit Parag Shah, Justin Salamon:
Sound Event Detection in Synthetic Domestic Environments. 86-90
- Fatemeh Pishdadian, Gordon Wichern, Jonathan Le Roux:
Learning to Separate Sounds from Weakly Labeled Scenes. 91-95
- Efthymios Tzinis, Scott Wisdom, John R. Hershey, Aren Jansen, Daniel P. W. Ellis:
Improving Universal Sound Separation Using Sound Classification. 96-100
- Qiuqiang Kong, Yuxuan Wang, Xuchen Song, Yin Cao, Wenwu Wang, Mark D. Plumbley:
Source Separation with Weakly Labelled Data: an Approach to Computational Auditory Scene Analysis. 101-105
- Sunwoo Kim, Haici Yang, Minje Kim:
Boosted Locality Sensitive Hashing: Discriminative Binary Codes for Source Separation. 106-110
- Satoru Emura, Hiroshi Sawada, Shoko Araki, Noboru Harada:
A Frequency-Domain BSS Method Based on ℓ1 Norm, Unitary Constraint, and Cayley Transform. 111-115
- Shrikant Venkataramani, Efthymios Tzinis, Paris Smaragdis:
End-To-End Non-Negative Autoencoders for Sound Source Separation. 116-120
- Aren Jansen, Daniel P. W. Ellis, Shawn Hershey, R. Channing Moore, Manoj Plakal, Ashok C. Popat, Rif A. Saurous:
Coincidence, Categorization, and Consolidation: Learning to Recognize Sounds with Minimal Supervision. 121-125
- Truc Nguyen, Franz Pernkopf, Michal Kosmider:
Acoustic Scene Classification for Mismatched Recording Devices Using Heated-Up Softmax and Spectrum Correction. 126-130
- Nicolas Turpault, Romain Serizel, Emmanuel Vincent:
Limitations of Weak Labels for Embedding and Tagging. 131-135
- Harsh Shrivastava, Yifang Yin, Rajiv Ratn Shah, Roger Zimmermann:
Mt-Gcn For Multi-Label Audio Tagging With Noisy Labels. 136-140
- Mark D. McDonnell, Wei Gao:
Acoustic Scene Classification Using Deep Residual Networks with Late Fusion of Separated High and Low Frequency Paths. 141-145
- Mohammad K. Ebrahimpour, Timothy M. Shea, Andreea Danielescu, David C. Noelle, Christopher T. Kello:
End-To-End Auditory Object Recognition Via Inception Nucleus. 146-150
- Maximilian Kentgens, Andreas Behler, Peter Jax:
Translation of a Higher Order Ambisonics Sound Scene Based on Parametric Decomposition. 151-155
- Diego Di Carlo, Clement Elvira, Antoine Deleforge, Nancy Bertin, Rémi Gribonval:
Blaster: An Off-Grid Method for Blind and Regularized Acoustic Echoes Retrieval. 156-160
- Hannes Helmholz, Jens Ahrens, David L. Alon, Sebastià V. Amengual Garí, Ravish Mehra:
Evaluation of Sensor Self-Noise In Binaural Rendering of Spherical Microphone Array Signals. 161-165
- Kentaro Ariga, Tomoya Nishida, Shoichi Koyama, Natsuki Ueno, Hiroshi Saruwatari:
Mutual-Information-Based Sensor Placement for Spatial Sound Field Recording. 166-170
- Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi:
Fast Acoustic Scattering Using Convolutional Neural Networks. 171-175
- Benoit Alary, Archontis Politis:
Frequency-Dependent Directional Feedback Delay Network. 176-180
- Yuma Koizumi, Kohei Yatabe, Marc Delcroix, Yoshiki Masuyama, Daiki Takeuchi:
Speech Enhancement Using Self-Adaptation and Multi-Head Self-Attention. 181-185
- Vincent W. Neo, Christine Evers, Patrick A. Naylor:
PEVD-Based Speech Enhancement in Reverberant Environments. 186-190
- Marvin Tammen, Dörte Fischer, Bernd T. Meyer, Simon Doclo:
DNN-Based Speech Presence Probability Estimation for Multi-Frame Single-Microphone Speech Enhancement. 191-195
- Kristina Tesch, Timo Gerkmann:
Nonlinear Spatial Filtering for Multichannel Speech Enhancement in Inhomogeneous Noise Fields. 196-200
- Heinrich W. Löllmann, Andreas Brendel, Walter Kellermann:
Generalized Coherence-Based Signal Enhancement. 201-205
- Soumi Maiti, Michael I. Mandel:
Speaker Independence of Neural Vocoders and Their Effect on Parametric Resynthesis Speech Enhancement. 206-210
- Gongping Huang, Jacob Benesty, Jingdong Chen, Israel Cohen:
Robust and steerable kronecker product differential beamforming With rectangular microphone arrays. 211-215
- Christoph Böddeker, Tomohiro Nakatani, Keisuke Kinoshita, Reinhold Haeb-Umbach:
Jointly Optimal Dereverberation and Beamforming. 216-220
- Szymon Wozniak, Konrad Kowalczyk:
Exploiting Rays in Blind Localization of Distributed Sensor Arrays. 221-225
- Noman Akbar, Glenn Dickins, Mark R. P. Thomas, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
A Novel Method for Obtaining Diffuse Field Measurements for Microphone Calibration. 226-230
- Masahito Togami:
Multi-Channel Speech Source Separation and Dereverberation With Sequential Integration of Determined and Underdetermined Models. 231-235
- Robin Scheibler, Nobutaka Ono:
Fast and Stable Blind Source Separation with Rank-1 Updates. 236-240
- Marco A. Martínez Ramírez, Emmanouil Benetos, Joshua D. Reiss:
Modeling Plate and Spring Reverberation Using A DSP-Informed Deep Neural Network. 241-245
- Sanna Wager, George Tzanetakis, Cheng-i Wang, Minje Kim:
Deep Autotuner: A Pitch Correcting Network for Singing Performances. 246-250
- Alec Wright, Vesa Välimäki:
Perceptual loss function for neural modeling of audio systems. 251-255
- Stylianos I. Mimilakis, Nicholas J. Bryan, Paris Smaragdis:
One-Shot Parametric Audio Production Style Transfer with Application to Frequency Equalization. 256-260
- Jayneel Parekh, Preeti Rao, Yi-Hsuan Yang:
Speech-To-Singing Conversion in an Encoder-Decoder Framework. 261-265
- Pablo Alonso-Jiménez, Dmitry Bogdanov, Jordi Pons, Xavier Serra:
Tensorflow Audio Models in Essentia. 266-270
- Kaori Suefusa, Tomoya Nishida, Harsh Purohit, Ryo Tanabe, Takashi Endo, Yohei Kawaguchi:
Anomalous Sound Detection Based on Interpolation Deep Neural Network. 271-275
- Wei Wei, Hongning Zhu, Emmanouil Benetos, Ye Wang:
A-CRNN: A Domain Adaptation Model for Sound Event Detection. 276-280
- Yuma Koizumi, Masahiro Yasuda, Shin Murata, Shoichiro Saito, Hisashi Uematsu, Noboru Harada:
SPIDERnet: Attention Network For One-Shot Anomaly Detection In Sounds. 281-285
- Yanxiong Li, Mingle Liu, Konstantinos Drossos, Tuomas Virtanen:
Sound Event Detection Via Dilated Convolutional Recurrent Neural Networks. 286-290
- Manjunath Mulimani, Akash B. Kademani, Shashidhar G. Koolagudi:
A Deep Neural Network-Driven Feature Learning Method for Polyphonic Acoustic Event Detection from Real-Life Recordings. 291-295
- Sixin Hong, Yuexian Zou, Wenwu Wang, Meng Cao:
Weakly Labelled Audio Tagging Via Convolutional Networks with Spatial and Channel-Wise Attention. 296-300
- Vinod Subramanian, Arjun Pankajakshan, Emmanouil Benetos, Ning Xu, SKoT McDonald, Mark B. Sandler:
A Study on the Transferability of Adversarial Attacks in Sound Event Classification. 301-305
- Mahiout Thomas, Fillatre Lionel, Deruaz-Pepin Laurent:
Propeller Noise Detection with Deep Learning. 306-310
- Heinrich Dinkel, Kai Yu:
Duration Robust Weakly Supervised Sound Event Detection. 311-315
- Chieh-Chi Kao, Ming Sun, Weiran Wang, Chao Wang:
A Comparison of Pooling Methods on LSTM Models for Rare Acoustic Event Classification. 316-320
- Yiwei Sun, Shabnam Ghaffarzadegan:
An Ontology-Aware Framework for Audio Event Classification. 321-325
- Jie Yan, Yan Song, Li-Rong Dai, Ian McLoughlin:
Task-Aware Mean Teacher Method for Large Scale Weakly Labeled Semi-Supervised Sound Event Detection. 326-330
- Andrew A. Catellier, Stephen D. Voran:
Wawenets: A No-Reference Convolutional Waveform-Based Approach to Estimating Narrowband and Wideband Speech Quality. 331-335
- Mathias Bach Pedersen, Asger Heidemann Andersen, Søren Holdt Jensen, Jesper Jensen:
A Neural Network for Monaural Intrusive Speech Intelligibility Prediction. 336-340
- Roy Fejgin, Janusz Klejsa, Lars F. Villemoes, Cong Zhou:
Source Coding of Audio Signals with a Generative Model. 341-345
- Gabriel Mittag, Sebastian Möller:
Full-Reference Speech Quality Estimation with Attentional Siamese Neural Networks. 346-350
- Seong-Hyeon Shin, Seungkwon Beack, Wootaek Lim, Hochong Park:
Enhanced Method of Audio Coding Using CNN-Based Spectral Recovery with Adaptive Structure. 351-355
- Arijit Biswas, Dai Jia:
Audio Codec Enhancement with Generative Adversarial Networks. 356-360
- Kai Zhen, Mi Suk Lee, Jongmo Sung, Seungkwon Beack, Minje Kim:
Efficient and Scalable Neural Residual Waveform Coding with Collaborative Quantization. 361-365
- Kai Zhen, Mi Suk Lee, Minje Kim:
A Dual-Staged Context Aggregation Method towards Efficient End-to-End Speech Enhancement. 366-370
- Simon Leglaive, Xavier Alameda-Pineda, Laurent Girin, Radu Horaud:
A Recurrent Variational Autoencoder for Speech Enhancement. 371-375
- Shulin He, Hao Li, Xueliang Zhang:
Speakerfilter: Deep Learning-Based Target Speaker Extraction Using Anchor Speech. 376-380
- Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani:
Tackling Real Noisy Reverberant Meetings with All-Neural Source Separation, Counting, and Diarization System. 381-385
- Tomohiko Nakamura, Hiroshi Saruwatari:
Time-Domain Audio Source Separation Based on Wave-U-Net Combined with Discrete Wavelet Transform. 386-390
- Simone Spagnol:
Auditory Model Based Subsetting of Head-Related Transfer Function Datasets. 391-395
- Mengfan Zhang, Xihong Wu, Tianshu Qu:
Individual Distance-Dependent HRTFS Modeling Through A Few Anthropometric Measurements. 401-405
- Amir Ivry, Israel Cohen, Baruch Berdugo:
Evaluation of Deep-Learning-Based Voice Activity Detectors and Room Impulse Response Models in Reverberant Environments. 406-410
- Michele Geronazzo, Jason-Yves Tissieres, Stefania Serafin:
A Minimal Personalization of Dynamic Binaural Synthesis with Mixed Structural Modeling and Scattering Delay Networks. 411-415
- Hugo Caracalla, Axel Roebel:
Sound Texture Synthesis Using RI Spectrograms. 416-420
- Jérôme Daniel, Srdan Kitic:
Time Domain Velocity Vector for Retracing the Multipath Propagation. 421-425
- Jiaqi Su, Zeyu Jin, Adam Finkelstein:
Acoustic Matching By Embedding Impulse Responses. 426-430
- David Looney, Nikolay D. Gaubitch:
Joint Estimation Of Acoustic Parameters From Single-Microphone Speech Observations. 431-435
- Liming Shi, Taewoong Lee, Lijun Zhang, Jesper Kjær Nielsen, Mads Græsbøll Christensen:
A Fast Reduced-Rank Sound Zone Control Algorithm Using The Conjugate Gradient Method. 436-440
- Meng Guo:
An Empirical Study on Acoustic Feedback Path Across Hearing Aid Users. 441-445
- Ofer Schwartz, Emanuël A. P. Habets, Sharon Gannot:
Low Complexity NLMS for Multiple Loudspeaker Acoustic ECHO Canceller Using Relative Loudspeaker Transfer Functions. 446-450
- Patrick Meyer, Samy Elshamy, Tim Fingscheidt:
A Multichannel Kalman-Based Wiener Filter Approach for Speaker Interference Reduction in Meetings. 451-455
- Johannes Fabry, Peter Jax:
Primary Path Estimator Based on Individual Secondary Path for ANC Headphones. 456-460
- Mhd Modar Halimeh, Walter Kellermann:
Efficient Multichannel Nonlinear Acoustic Echo Cancellation Based on a Cooperative Strategy. 461-465
- Meiling Hu, Jing Lu:
Active Control of Line Spectral Noise with Simultaneous Secondary Path Modeling Without Auxiliary Noise. 466-470
- Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu:
Robust Frequency-Domain Recursive Least M-Estimate Adaptive Filter For Acoustic System Identification. 471-475
- Sankha Subhra Bhattacharjee, Nithin V. George:
Nearest Kronecker Product Decomposition Based Normalized Least Mean Square Algorithm. 476-480
- Sahar Hashemgeloogerdi, Sebastian Braun:
Joint Beamforming and Reverberation Cancellation Using a Constrained Kalman Filter With Multichannel Linear Prediction. 481-485
- Zhong-Qiu Wang, DeLiang Wang:
Multi-Microphone Complex Spectral Mapping for Speech Dereverberation. 486-490
- Hannes Gamper, Dimitra Emmanouilidou, Sebastian Braun, Ivan J. Tashev:
Predicting Word Error Rate for Reverberant Speech. 491-495
- Chitralekha Gupta, Emre Yilmaz, Haizhou Li:
Automatic Lyrics Alignment and Transcription in Polyphonic Music: Does Background Music Help? 496-500
- Hendrik Schreiber, Christof Weiß, Meinard Müller:
Local Key Estimation In Classical Music Recordings: A Cross-Version Study on Schubert's Winterreise. 501-505
- Fabrizio Pedersoli, George Tzanetakis, Kwang Moo Yi:
Improving Music Transcription by Pre-Stacking A U-Net. 506-510
- Laure Prétet, Gaël Richard, Geoffroy Peeters:
Learning to Rank Music Tracks Using Triplet Loss. 511-515
- Junyan Jiang, Gus Xia, Dave B. Carlton, Chris N. Anderson, Ryan H. Miyakawa:
Transformer VAE: A Hierarchical Model for Structure-Aware and Interpretable Music Representation Learning. 516-520
- Jianyu Fan, Yi-Hsuan Yang, Kui Dong, Philippe Pasquier:
A Comparative Study of Western and Chinese Classical Music Based on Soundscape Models. 521-525
- Andrea Vaglio, Romain Hennequin, Manuel Moussallam, Gaël Richard, Florence d'Alché-Buc:
Audio-Based Detection of Explicit Content in Music. 526-530
- Johanna Devaney:
New Metrics for Evaluating the Accuracy of Fundamental Frequency Estimation Approaches in Musical Signals. 531-535
- Minz Won, Sanghyuk Chun, Oriol Nieto, Xavier Serra:
Data-Driven Harmonic Filters for Audio Representation Learning. 536-540
- Zhesong Yu, Xiaoshuo Xu, Xiaoou Chen, Deshun Yang:
Learning a Representation for Cover Song Identification Using Convolutional Neural Network. 541-545
- T. J. Tsai:
Towards Linking the Lakh and IMSLP Datasets. 546-550
- Ping Gao, Cheng-You You, Tai-Shih Chi:
A Multi-Dilation and Multi-Resolution Fully Convolutional Network for Singing Melody Extraction. 551-555
- Xudong Zhao, Gongping Huang, Jingdong Chen, Jacob Benesty:
An Improved Solution to the Frequency-Invariant Beamforming with Concentric Circular Microphone Arrays. 556-560
- Ryan M. Corey, Andrew C. Singer:
Binaural Audio Source Remixing with Microphone Array Listening Devices. 561-565
- Reza Varzandeh, Kamil Adiloglu, Simon Doclo, Volker Hohmann:
Exploiting Periodicity Features for Joint Detection and DOA Estimation of Speech Sources Using Convolutional Neural Networks. 566-570
- Yonggang Hu, Prasanga N. Samarasinghe, Thushara D. Abhayapala, Sharon Gannot:
Unsupervised Multiple Source Localization Using Relative Harmonic Coefficients. 571-575
- Daniele Mirabilii, Kishor Kayyar Lakshminarayana, Wolfgang Mack, Emanuël A. P. Habets:
Data-Driven Wind Speed Estimation Using Multiple Microphones. 576-580
- Christopher Schymura, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Tomohiro Nakatani, Shoko Araki, Dorothea Kolossa:
A Dynamic Stream Weight Backprop Kalman Filter for Audiovisual Speaker Tracking. 581-585
- Elior Hadad, Sharon Gannot:
Maximum Likelihood Multi-Speaker Direction of Arrival Estimation Utilizing a Weighted Histogram. 586-590