ICASSP 2019: Brighton, UK
IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2019, Brighton, United Kingdom, May 12-17, 2019. IEEE 2019, ISBN 978-1-4799-8131-1
- Keisuke Imoto, Seisuke Kyochi: Sound Event Detection Using Graph Laplacian Regularization Based on Event Co-occurrence. 1-5
- Arindam Jati, Naveen Kumar, Ruxin Chen, Panayiotis G. Georgiou: Hierarchy-aware Loss Function on a Tree Structured Label Space for Audio Event Detection. 6-10
- Shuoyang Li, Yuantao Gu, Yuhui Luo, Jonathon A. Chambers, Wenwu Wang: Enhanced Streaming Based Subspace Clustering Applied to Acoustic Scene Data Clustering. 11-15
- Jordi Pons, Joan Serrà, Xavier Serra: Training Neural Audio Classifiers with Few Data. 16-20
- Eduardo Fonseca, Manoj Plakal, Daniel P. W. Ellis, Frederic Font, Xavier Favory, Xavier Serra: Learning Sound Event Classifiers from Web Audio with Noisy Labels. 21-25
- Szu-Yu Chou, Kai-Hsiang Cheng, Jyh-Shing Roger Jang, Yi-Hsuan Yang: Learning to Match Transient Sound Events Using Attentional Similarity for Few-shot Sound Recognition. 26-30
- Yun Wang, Juncheng Li, Florian Metze: A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling. 31-35
- Sandeep Kothinti, Keisuke Imoto, Debmalya Chakrabarty, Gregory Sell, Shinji Watanabe, Mounya Elhilali: Joint Acoustic and Class Inference for Weakly Supervised Sound Event Detection. 36-40
- Zuzanna Podwinska, Iwona Sobieraj, Bruno M. Fazenda, William J. Davies, Mark D. Plumbley: Acoustic Event Detection from Weakly Labeled Data Using Auditory Salience. 41-45
- Yuanbo Hou, Qiuqiang Kong, Shengchen Li, Mark D. Plumbley: Sound Event Detection with Sequentially Labelled Data Based on Connectionist Temporal Classification and Unsupervised Clustering. 46-50
- Huy Phan, Oliver Y. Chén, Philipp Koch, Lam Dang Pham, Ian McLoughlin, Alfred Mertins, Maarten De Vos: Unifying Isolated and Overlapping Audio Event Detection with Multi-label Multi-task Convolutional Recurrent Neural Networks. 51-55
- Zhao Ren, Qiuqiang Kong, Jing Han, Mark D. Plumbley, Björn W. Schuller: Attention-based Atrous Convolutional Neural Networks: Visualisation and Understanding Perspectives of Acoustic Scenes. 56-60
- Yoshiki Masuyama, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada: Deep Griffin-Lim Iteration. 61-65
- Jonathan Le Roux, Gordon Wichern, Shinji Watanabe, Andy M. Sarroff, John R. Hershey: The Phasebook: Building Complex Masks via Discrete Representations for Source Separation. 66-70
- Zhong-Qiu Wang, Ke Tan, DeLiang Wang: Deep Learning Based Phase Reconstruction for Speaker Separation: A Trigonometric Perspective. 71-75
- Shanshan Wang, Gaurav Naithani, Tuomas Virtanen: Low-latency Deep Clustering for Speech Separation. 76-80
- Efthymios Tzinis, Shrikant Venkataramani, Paris Smaragdis: Unsupervised Deep Clustering for Source Separation: Direct Learning from Mixtures Using Spatial Information. 81-85
- Xiong Xiao, Zhuo Chen, Takuya Yoshioka, Hakan Erdogan, Changliang Liu, Dimitrios Dimitriadis, Jasha Droppo, Yifan Gong: Single-channel Speech Extraction Using Speaker Inventory and Attention Network. 86-90
- Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach: All-neural Online Source Separation, Counting, and Diarization for Meeting Analysis. 91-95
- Shota Inoue, Hirokazu Kameoka, Li Li, Shogo Seki, Shoji Makino: Joint Separation and Dereverberation of Reverberant Mixtures with Multichannel Variational Autoencoder. 96-100
- Simon Leglaive, Laurent Girin, Radu Horaud: Semi-supervised Multichannel Speech Enhancement with Variational Autoencoders and Non-negative Matrix Factorization. 101-105
- Deepak Baby, Sarah Verhulst: Sergan: Speech Enhancement Using Relativistic Generative Adversarial Networks with Gradient Penalty. 106-110
- Chaitanya Narisetty, Tatsuya Komatsu, Reishi Kondo: Bayesian Non-parametric Multi-source Modelling Based Determined Blind Source Separation. 111-115
- Alexander Schmidt, Walter Kellermann: Informed Ego-noise Suppression Using Motor Data-driven Dictionaries. 116-120
- Zhongshu Ge, Xihong Wu, Tianshu Qu: Improvements to the Matching Projection Decoding Method for Ambisonic System with Irregular Loudspeaker Layouts. 121-125
- Dylan Menzies, Filippo Maria Fazi: Small Array Reproduction Method for Ambisonic Encodings Using Headtracking. 126-130
- Maximilian Kentgens, Peter Jax: Space Warping Based Dimensionality Reduction of Higher Order Ambisonics Signals. 131-135
- Taewoong Lee, Jesper Kjær Nielsen, Mads Græsbøll Christensen: Towards Perceptually Optimized Sound Zones: A Proof-of-concept Study. 136-140
- Junqing Zhang, Wen Zhang, Thushara D. Abhayapala, Jingli Xie, Lijun Zhang: 2.5D Multizone Reproduction with Active Control of Scattered Sound Fields. 141-145
- Jens Ahrens: Auralization of Omnidirectional Room Impulse Responses Based on the Spatial Decomposition Method and Synthetic Spatial Data. 146-150
- Jacob Møller Hjerrild, Mads Græsbøll Christensen: Estimation of Guitar String, Fret and Plucking Position Using Parametric Pitch Estimation. 151-155
- Tsung-Han Hsieh, Li Su, Yi-Hsuan Yang: A Streamlined Encoder/decoder Architecture for Melody Extraction. 156-160
- Ryo Nishikimi, Eita Nakamura, Satoru Fukayama, Masataka Goto, Kazuyoshi Yoshii: Automatic Singing Transcription Based on Encoder-decoder Recurrent Neural Networks with a Weakly-supervised Attention Mechanism. 161-165
- Yu-Te Wu, Berlin Chen, Li Su: Polyphonic Music Transcription with Semantic Segmentation. 166-170
- Marco A. Martínez Ramírez, Joshua D. Reiss: Modeling Nonlinear Audio Effects with End-to-end Deep Neural Networks. 171-175
- Jong Wook Kim, Rachel M. Bittner, Aparna Kumar, Juan Pablo Bello: Neural Music Synthesis for Flexible Timbre Control. 176-180
- Daniel Stoller, Simon Durand, Sebastian Ewert: End-to-end Lyrics Alignment for Polyphonic Music Using an Audio-to-character Recognition Model. 181-185
- Andrew McLeod, Eita Nakamura, Kazuyoshi Yoshii: Improved Metrical Alignment of Midi Performance Based on a Repetition-aware Online-adapted Grammar. 186-190
- Yu-An Wang, Yu-Kai Huang, Tzu-Chuan Lin, Shang-Yu Su, Yun-Nung Chen: Modeling Melodic Feature Dependency with Modularized Variational Auto-encoder. 191-195
- Eita Nakamura, Kentaro Shibata, Ryo Nishikimi, Kazuyoshi Yoshii: Unsupervised Melody Style Conversion. 196-200
- Christopher J. Tralie, Brian McFee: Enhanced Hierarchical Music Structure Annotations via Feature Level Similarity Fusion. 201-205
- Akira Maezawa: Music Boundary Detection Based on a Hybrid Deep Model of Novelty, Homogeneity, Repetition and Duration. 206-210
- Song Li, Roman Schlieper, Jürgen Peissig: A Hybrid Method for Blind Estimation of Frequency Dependent Reverberation Time Using Speech Signals. 211-215
- Ryan M. Corey, Naoki Tsuda, Andrew C. Singer: Acoustic Impulse Responses for Wearable Audio Devices. 216-220
- Maoran Xu, Ziyu Wang, Gus Xia: Transferring Piano Performance Control across Environments. 221-225
- Helena Peic Tukuljac, Ville Pulkki, Hannes Gamper, Keith W. Godin, Ivan J. Tashev, Nikunj Raghuvanshi: A Sparsity Measure for Echo Density Growth in General Environments. 226-230
- Andrea F. Genovese, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan J. Tashev: Blind Room Volume Estimation from Single-channel Noisy Speech. 231-235
- Kentaro Shibata, Ryo Nishikimi, Satoru Fukayama, Masataka Goto, Eita Nakamura, Katsutoshi Itoyama, Kazuyoshi Yoshii: Joint Transcription of Lead, Bass, and Rhythm Guitars Based on a Factorial Hidden Semi-Markov Model. 236-240
- Beici Liang, György Fazekas, Mark B. Sandler: Piano Sustain-pedal Detection Using Convolutional Neural Networks. 241-245
- Rainer Kelz, Sebastian Böck, Gerhard Widmer: Deep Polyphonic ADSR Piano Note Transcription. 246-250
- Kin Wah Edward Lin, Masataka Goto: Zero-mean Convolutional Network with Data Augmentation for Sound Level Invariant Singing Voice Separation. 251-255
- Filipe Lins, Marcelo O. Johann, Emmanouil Benetos, Rodrigo Schramm: Automatic Transcription of Diatonic Harmonica Recordings. 256-260
- Christoph Hold, Hannes Gamper, Ville Pulkki, Nikunj Raghuvanshi, Ivan J. Tashev: Improving Binaural Ambisonics Decoding by Spherical Harmonics Domain Tapering and Coloration Compensation. 261-265
- Grady Kestler, Shahrokh Yadegari, David Nahamoo: Head Related Impulse Response Interpolation and Extrapolation Using Deep Belief Networks. 266-270
- Tzu-Yu Chen, Tzu-Hsuan Kuo, Tai-Shih Chi: Autoencoding HRTFS for DNN Based HRTF Personalization Using Anthropometric Features. 271-275
- Mengfan Zhang, Yue Qiao, Xihong Wu, Tianshu Qu: Distance-dependent Modeling of Head-related Transfer Functions. 276-280
- Federico Borra, Israel Dejene Gebru, Dejan Markovic: Soundfield Reconstruction in Reverberant Environments Using Higher-order Microphones and Impulse Response Measurements. 281-285
- Weitao Yuan, Shengbei Wang, Xiangrui Li, Masashi Unoki, Wenwu Wang: Proximal Deep Recurrent Neural Network for Monaural Singing Voice Separation. 286-290
- Michael Michelashvili, Sagie Benaim, Lior Wolf: Semi-supervised Monaural Singing Voice Separation with a Masking Network Trained on Synthetic Mixtures. 291-295
- Ke Wang, Frank K. Soong, Lei Xie: A Pitch-aware Approach to Single-channel Speech Separation. 296-300
- Prem Seetharaman, Gordon Wichern, Shrikant Venkataramani, Jonathan Le Roux: Class-conditional Embeddings for Music Source Separation. 301-305
- Olga Slizovskaia, Leo Kim, Gloria Haro, Emilia Gómez: End-to-end Sound Source Separation Conditioned on Instrument Labels. 306-310
- Wenqiang Pu, Jinjun Xiao, Tao Zhang, Zhi-Quan Luo: A Joint Auditory Attention Decoding and Adaptive Binaural Beamforming Algorithm for Hearing Devices. 311-315
- Elizabeth Ren, Hans-Andrea Loeliger: Exact Discrete-time Realizations of the Gammatone Filter. 316-320
- Mathew Shaji Kavalekalam, Jesper Kjær Nielsen, Mads Græsbøll Christensen, Jesper Bünsow Boldt: Hearing Aid-controlled Beamformer for Binaural Speech Enhancement Using a Model-based Approach. 321-325
- Sascha Dick, Jürgen Herre: Predicting the Precision of Elevation Localization Based on Head Related Transfer Functions. 326-330
- Frank Zalkow, Stefan Balke, Meinard Müller: Evaluating Salience Representations for Cross-modal Retrieval of Western Classical Music Recordings. 331-335
- Jordi Pons, Xavier Serra: Randomly Weighted CNNs for (Music) Audio Classification. 336-340
- Christof Weiß, Fabian Brand, Meinard Müller: Mid-level Chord Transition Features for Musical Style Analysis. 341-345
- Matthew C. McCallum: Unsupervised Learning of Deep Features for Music Segmentation. 346-350
- Lincon Sales de Souza, Bernardo B. Gatto, Kazuhiro Fukui: Classification of Bioacoustic Signals with Tangent Singular Spectrum Analysis. 351-355
- Prem Seetharaman, Gordon Wichern, Jonathan Le Roux, Bryan Pardo: Bootstrapping Single-channel Source Separation via Unsupervised Spatial Clustering on Stereo Mixtures. 356-360
- Cong Han, Yi Luo, Nima Mesgarani: Online Deep Attractor Network for Real-time Single-channel Speech Separation. 361-365
- Robin Scheibler, Nobutaka Ono: Multi-modal Blind Source Separation with Microphones and Blinkies. 366-370
- Nobutaka Ito, Tomohiro Nakatani: FastMNMF: Joint Diagonalization Based Accelerated Algorithms for Multichannel Nonnegative Matrix Factorization. 371-375
- Sunwoo Kim, Mrinmoy Maity, Minje Kim: Incremental Binarization on Recurrent Neural Networks for Single-channel Source Separation. 376-380
- Yun-Ning Hung, Yi-An Chen, Yi-Hsuan Yang: Multitask Learning for Frame-level Instrument Recognition. 381-385
- Donmoon Lee, Jaejun Lee, Jeongsoo Park, Kyogu Lee: Enhancing Music Features by Knowledge Transfer from User-item Log Data. 386-390
- Shun Sawada, Satoru Fukayama, Masataka Goto, Keiji Hirata: Transdrums: A Drum Pattern Transfer System Preserving Global Pattern Structure. 391-395
- Bidisha Sharma, Chitralekha Gupta, Haizhou Li, Ye Wang: Automatic Lyrics-to-audio Alignment on Polyphonic Music Using Singing-adapted Acoustic Models. 396-400
- Yi Qin, Alexander Lerch: Tuning Frequency Dependency in Music Classification. 401-405
- Ali Aroudi, Simon Doclo: Cognitive-driven Binaural LCMV Beamformer Using EEG-based Auditory Attention Decoding. 406-410
- Lucas Ondel, Ruizhi Li, Gregory Sell, Hynek Hermansky: Deriving Spectro-temporal Properties of Hearing from Speech Data. 411-415
- Nico Gößling, Simon Doclo: RTF-steered Binaural MVDR Beamforming Incorporating an External Microphone for Dynamic Acoustic Scenarios. 416-420
- Benjamin R. Hammond, Philip J. B. Jackson: Robust Full-sphere Binaural Sound Source Localization Using Interaural and Spectral Cues. 421-425
- Gongping Huang, Xudong Zhao, Jingdong Chen, Jacob Benesty: Properties and Limits of the Minimum-norm Differential Beamformers with Circular Microphone Arrays. 426-430
- Christian Schüldt: Trigonometric Interpolation Beamforming for a Circular Microphone Array. 431-435
- Pasi Pertilä, Mikko Parviainen: Time Difference of Arrival Estimation of Speech Signals Using Deep Neural Networks with Integrated Time-frequency Masking. 436-440
- Wenxing Yang, Gongping Huang, Jacob Benesty, Israel Cohen, Jingdong Chen: On the Design of Flexible Kronecker Product Beamformers with Linear Microphone Arrays. 441-445
- Adrian Herzog, Emanuël Anco Peter Habets: Direction Preserving Wiener Matrix Filtering for Ambisonic Input-output Systems. 446-450
- Paolo Vecchiotti, Ning Ma, Stefano Squartini, Guy J. Brown: End-to-end Binaural Sound Localisation from the Raw Waveform. 451-455
- Shun Ueda, Kentaro Shibata, Yusuke Wada, Ryo Nishikimi, Eita Nakamura, Kazuyoshi Yoshii: Bayesian Drum Transcription Based on Nonnegative Matrix Factor Decomposition with a Deep Score Prior. 456-460
- Steven Spratley, Daniel Beck, Trevor Cohn: A Unified Neural Architecture for Instrumental Audio Tasks. 461-465
- Ning Zhang, Tao Jiang, Feng Deng, Yan Li: Automatic Singing Evaluation without Reference Melody Using Bi-dense Neural Network. 466-470
- Eero-Pekka Damskägg, Lauri Juvela, Etienne Thuillier, Vesa Välimäki: Deep Learning for Tube Amplifier Emulation. 471-475
- Sanna Wager, George Tzanetakis, Stefan Sullivan, Cheng-i Wang, John Shimmin, Minje Kim, Perry Cook: Intonation: A Dataset of Quality Vocal Performances Refined by Spectral Clustering on Pitch Congruence. 476-480
- Magdalena Fuentes, Brian McFee, Hélène C. Crayencour, Slim Essid, Juan Pablo Bello: A Music Structure Informed Downbeat Tracking System Using Skip-chain Conditional Random Fields and Deep Learning. 481-485
- Jakob Abeßer, Meinard Müller: Fundamental Frequency Contour Classification: A Comparison between Hand-crafted and CNN-based Features. 486-490
- Tom Bäckström: Overlap-add Windows with Maximum Energy Concentration for Speech and Audio Processing. 491-495
- Chia-Hung Wan, Shun-Po Chuang, Hung-yi Lee: Towards Audio to Scene Image Synthesis Using Generative Adversarial Network. 496-500
- Thanh-Ha Le, Philippe Gilberton, Ngoc Q. K. Duong: Discriminate Natural versus Loudspeaker Emitted Speech. 501-505
- Laure Prétet, Romain Hennequin, Jimena Royo-Letelier, Andrea Vaglio: Singing Voice Separation: A Study on Training Data. 506-510
- Hayato Ito, Shoichi Koyama, Natsuki Ueno, Hiroshi Saruwatari: Feedforward Spatial Active Noise Control Based on Kernel Interpolation of Sound Field. 511-515
- Huiyuan Sun, Thushara D. Abhayapala, Prasanga N. Samarasinghe: Time Domain Spherical Harmonic Analysis for Adaptive Noise Cancellation over a Spatial Region. 516-520
- Jingli Xie, Danqi Jin, Wen Zhang, Xiao-Lei Zhang, Jie Chen, DeLiang Wang: Robust Sparse Multichannel Active Noise Control. 521-525
- Naoki Murata, Jihui Zhang, Yu Maeno, Yuki Mitsufuji: Global and Local Mode-domain Adaptive Algorithms for Spatial Active Noise Control Using Higher-order Sources. 526-530
- Masahito Togami: Spatial Constraint on Multi-channel Deep Clustering. 531-535
- Masahito Togami: Multi-channel Itakura Saito Distance Minimization with Deep Neural Network. 536-540
- Simon Leglaive, Umut Simsekli, Antoine Liutkus, Laurent Girin, Radu Horaud: Speech Enhancement with Variational Autoencoders and Alpha-stable Distributions. 541-545
- Li Li, Hirokazu Kameoka, Shoji Makino: Fast MVAE: Joint Separation and Classification of Mixed Sources Based on Multichannel Variational Autoencoder with Auxiliary Classifier. 546-550
- Eric L. Ferguson, Stefan B. Williams, Craig T. Jin: Improved Multipath Time Delay Estimation Using Cepstrum Subtraction. 551-555
- Guanjun Li, Shan Liang, Shuai Nie, Wenju Liu: Adaptive Dereverberation Using Multi-channel Linear Prediction with Deficient Length Filter. 556-560
- Yonggang Hu, Prasanga N. Samarasinghe, Thushara D. Abhayapala, Glenn Dickins: Modeling Characteristics of Real Loudspeakers Using Various Acoustic Models: Modal-domain Approaches. 561-565
- Eric Carlos Hamdan, Filippo Maria Fazi: Low Frequency Crosstalk Cancellation and Its Relationship to Amplitude Panning. 566-570
- Oliver Thiergart, Guendalina Milano, Emanuël Anco Peter Habets: Combining Linear Spatial Filtering and Non-linear Parametric Processing for High-quality Spatial Sound Capturing. 571-575
- Leo McCormack, Archontis Politis, Ville Pulkki: Sharpening of Angular Spectra Based on a Directional Re-assignment Approach for Ambisonic Sound-field Visualisation. 576-580
- Yuhta Takida, Shoichi Koyama, Natsuki Ueno, Hiroshi Saruwatari: Robust Gridless Sound Field Decomposition Based on Structured Reciprocity Gap Functional in Spherical Harmonic Domain. 581-585
- Dingding Yao, Junfeng Li, Huaxing Xu, Risheng Xia, Yonghong Yan: A Subband Energy Modification Method for Elevation Control in Median Plane. 586-590
- Pavel Záviska, Pavel Rajmic, Ondrej Mokrý, Zdenek Prusa: A Proper Version of Synthesis-based Sparse Audio Declipper. 591-595
- Daiki Takeuchi, Kohei Yatabe, Yuma Koizumi, Yasuhiro Oikawa, Noboru Harada: Data-driven Design of Perfect Reconstruction Filterbank for DNN-based Sound Source Enhancement. 596-600
- Robert Rehr, Timo Gerkmann: An Analysis of Noise-aware Features in Combination with the Size and Diversity of Training Data for DNN-based Speech Enhancement. 601-605