ICASSP 2018: Calgary, AB, Canada
- 2018 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2018, Calgary, AB, Canada, April 15-20, 2018. IEEE 2018, ISBN 978-1-5386-4658-8
- Zhong-Qiu Wang, Jonathan Le Roux, John R. Hershey:
Multi-Channel Deep Clustering: Discriminative Spectral and Spatial Embeddings for Speaker-Independent Speech Separation. 1-5 - Chenglin Xu, Wei Rao, Xiong Xiao, Eng Siong Chng, Haizhou Li:
Single Channel Speech Separation with Constrained Utterance Level Permutation Invariant Training Using Grid LSTM. 6-10 - Lukas Drude, Thilo von Neumann, Reinhold Haeb-Umbach:
Deep Attractor Networks for Speaker Re-Identification and Blind Source Separation. 11-15 - Li Li, Hirokazu Kameoka:
Deep Clustering with Gated Convolutional Networks. 16-20 - Ke Tan, Jitong Chen, DeLiang Wang:
Gated Residual Networks with Dilated Convolutions for Supervised Speech Separation. 21-25 - Y. Cem Sübakan, Paris Smaragdis:
Generative Adversarial Source Separation. 26-30 - Hideaki Kagami, Hirokazu Kameoka, Masahiro Yukawa:
Joint Separation and Dereverberation of Reverberant Mixtures with Determined Multichannel Non-Negative Matrix Factorization. 31-35 - Lauréline Perotin, Romain Serizel, Emmanuel Vincent, Alexandre Guérin:
Multichannel Speech Separation with Recurrent Neural Networks from High-Order Ambisonics Recordings. 36-40 - Wei Xue, Alastair H. Moore, Mike Brookes, Patrick A. Naylor:
Multichannel Kalman Filtering for Speech Enhancement. 41-45 - Yaron Laufer, Sharon Gannot:
A Bayesian Hierarchical Model for Speech Enhancement. 46-50 - Juan Azcarreta, Nobutaka Ito, Shoko Araki, Tomohiro Nakatani:
Permutation-Free CGMM: Complex Gaussian Mixture Model with Inverse Wishart Mixture Model Based Spatial Prior for Permutation-Free Source Separation and Source Counting. 51-55 - Antoine Liutkus, Christian Rohlfing, Antoine Deleforge:
Audio Source Separation with Magnitude Priors: The BEADS Model. 56-60 - Andreas Brendel, Walter Kellermann:
Learning-Based Acoustic Source-Microphone Distance Estimation Using the Coherent-to-Diffuse Power Ratio. 61-65 - Vishnuvardhan Varanasi, Rajesh M. Hegde:
Stochastic Online Dictionary Learning for Speech Source Localization and Separation in Spherical Harmonic Domain. 66-70 - Bracha Laufer-Goldshtein, Ronen Talmon, Israel Cohen, Sharon Gannot:
Multi-View Source Localization Based on Power Ratios. 71-75 - Christopher Schymura, Dorothea Kolossa:
Potential-Field-Based Active Exploration for Acoustic Simultaneous Localization and Mapping. 76-80 - Jesper Kjær Nielsen:
Loudspeaker and Listening Position Estimation Using Smart Speakers. 81-85 - Benjamin R. Hammond, Philip J. B. Jackson:
Robust Full-Sphere Binaural Sound Source Localization. 86-90 - Filip Korzeniowski, David R. W. Sears, Gerhard Widmer:
A Large-Scale Study of Language Models for Chord Prediction. 91-95 - Xiaoshuo Xu, Xiaoou Chen, Deshun Yang:
Effective Cover Song Identification Based on Skipping Bigrams. 96-100 - Eita Nakamura, Emmanouil Benetos, Kazuyoshi Yoshii, Simon Dixon:
Towards Complete Polyphonic Music Transcription: Integrating Multi-Pitch Detection and Rhythm Quantization. 101-105 - Tian Cheng, Jordan B. L. Smith, Masataka Goto:
Music Structure Boundary Detection and Labelling by a Deconvolution of Path-Enhanced Self-Similarity Matrix. 106-110 - Anna M. Kruspe, Masataka Goto:
Retrieval of Song Lyrics from Sung Queries. 111-115 - Cheng-i Wang, George Tzanetakis:
Singing Style Investigation by Residual Siamese Convolutional Neural Networks. 116-120 - Yong Xu, Qiuqiang Kong, Wenwu Wang, Mark D. Plumbley:
Large-Scale Weakly Supervised Audio Classification Using Gated Convolutional Neural Network. 121-125 - Aren Jansen, Manoj Plakal, Ratheet Pandya, Daniel P. W. Ellis, Shawn Hershey, Jiayang Liu, R. Channing Moore, Rif A. Saurous:
Unsupervised Learning of Semantic Audio Representations. 126-130 - Rui Lu, Zhiyao Duan, Changshui Zhang:
Multi-Scale Recurrent Neural Network for Sound Event Detection. 131-135 - Ivan Kukanov, Ville Hautamäki, Kong-Aik Lee:
Maximal Figure-of-Merit Embedding for Multi-Label Audio Classification. 136-140 - Huy Phan, Philipp Koch, Ian McLoughlin, Alfred Mertins:
Enabling Early Audio Event Detection with Neural Networks. 141-145 - Abelino Jimenez, Benjamin Elizalde, Bhiksha Raj:
Acoustic Scene Classification Using Discrete Random Hashing for Laplacian Kernel Machines. 146-150 - Sangeon Yong, Juhan Nam:
Singing Expression Transfer from One Voice to Another for a Given Song. 151-155 - Yin-Jyun Luo, Ming-Tso Chen, Tai-Shih Chi, Li Su:
Singing Voice Correction Using Canonical Time Warping. 156-160 - Jong Wook Kim, Justin Salamon, Peter Li, Juan Pablo Bello:
CREPE: A Convolutional Representation for Pitch Estimation. 161-165 - Paul Magron, Tuomas Virtanen:
Bayesian Anisotropic Gaussian Model for Audio Source Separation. 166-170 - Jordan B. L. Smith, Masataka Goto:
Nonnegative Tensor Factorization for Source Separation of Loops in Audio. 171-175 - Christian Dittmar, Patricio López-Serrano, Meinard Müller:
Unifying Local and Global Methods for Harmonic-Percussive Source Separation. 176-180 - Fabrice Katzberg, Radoslaw Mazur, Marco Maaß, Philipp Koch, Alfred Mertins:
Compressive Sampling of Sound Fields Using Moving Microphones. 181-185 - Mirco Pezzoli, Federico Borra, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
Estimation of the Sound Field at Arbitrary Positions in Distributed Microphone Networks Based on Distributed Ray Space Transform. 186-190 - Ziteng Wang, Junfeng Li, Yonghong Yan, Emmanuel Vincent:
Semi-Supervised Learning with Deep Neural Networks for Relative Transfer Function Inverse Regression. 191-195 - Satoru Emura, Noboru Harada:
Sound Field Decomposition Using SPICE Decomposition. 196-200 - Luca Remaggi, Hansung Kim, Philip J. B. Jackson, Filippo Maria Fazi, Adrian Hilton:
Acoustic Reflector Localization and Classification. 201-205 - Sebastian Braun, João Felipe Santos, Emanuël A. P. Habets, Tiago H. Falk:
Dual-Channel Modulation Energy Metric for Direct-to-Reverberation Ratio Estimation. 206-210 - Yu Maeno, Yuki Mitsufuji, Thushara D. Abhayapala:
Mode Domain Spatial Active Noise Control Using Sparse Signal Representation. 211-215 - Fei Ma, Wen Zhang, Thushara D. Abhayapala:
Reference Signal Generation for Broadband ANC Systems in Reverberant Rooms. 216-220 - Jacob Donley, Christian H. Ritz, W. Bastiaan Kleijn:
On the Comparison of Two Room Compensation / Dereverberation Methods Employing Active Acoustic Boundary Absorption. 221-225 - Jan Franzen, Tim Fingscheidt:
An Efficient Residual Echo Suppression for Multi-Channel Acoustic Echo Cancellation Based on the Frequency-Domain Adaptive Kalman Filter. 226-230 - Guillaume Carbajal, Romain Serizel, Emmanuel Vincent, Eric Humbert:
Multiple-Input Neural Network-Based Residual Echo Suppression. 231-235 - Mhd Modar Halimeh, Christian Huemmer, Walter Kellermann:
Nonlinear Acoustic Echo Cancellation Using Elitist Resampling Particle Filter. 236-240 - Stefan Liebich, Raphael Brandis, Johannes Fabry, Peter Jax, Peter Vary:
Active Occlusion Cancellation with Hear-Through Equalization for Headphones. 241-245 - Meng Guo, Martin Kuriger, Christophe Lesimple, Bernhard Kuenzle:
Extension and Evaluation of a Spectro-Temporal Modulation Method to Improve Acoustic Feedback Performance in Hearing Aids. 246-250 - Johannes Gauer, Anil M. Nagathil, Rainer Martin:
Binaural Spectral Complexity Reduction of Music Signals for Cochlear Implant Listeners. 251-255 - Lincon Sales de Souza, Bernardo B. Gatto, Kazuhiro Fukui:
Grassmann Singular Spectrum Analysis for Bioacoustics Classification. 256-260 - Anshul Thakur, Vinayak Abrol, Pulkit Sharma, Padmanabhan Rajan:
Compressed Convex Spectral Embedding for Bird Species Classification. 261-265 - Vincent Lostanlen, Justin Salamon, Andrew Farnsworth, Steve Kelling, Juan Pablo Bello:
Birdvox-Full-Night: A Dataset and Benchmark for Avian Flight Call Detection. 266-270 - Sai-hua Zhang, Zhao Zhao, Zhi-yong Xu, Kristen Bellisario, Bryan C. Pijanowski:
Automatic Bird Vocalization Identification Based on Fusion of Spectral Pattern and Texture Features. 271-275 - Frank Kurth, Kevin Wilkinghoff:
Robust Detection of Jittered Multiply Repeating Audio Events Using Iterated Time-Warped ACF. 276-280 - Costas Yiallourides, Alastair H. Moore, Edouard Auvinet, Catherine Van Der Straeten, Patrick A. Naylor:
Acoustic Analysis and Assessment of the Knee in Osteoarthritis During Walking. 281-285 - Amir Hossein Poorjam, Max A. Little, Jesper Rindom Jensen, Mads Græsbøll Christensen:
A Parametric Approach for Classification of Distortions in Pathological Voices. 286-290 - Tamás Grósz, Gábor Gosztolya, László Tóth, Tamás Gábor Csapó, Alexandra Markó:
F0 Estimation for DNN-Based Ultrasound Silent Speech Interfaces. 291-295 - Amir Hossein Poorjam, Max A. Little, Jesper Rindom Jensen, Mads Græsbøll Christensen:
A Supervised Approach to Global Signal-to-Noise Ratio Estimation for Whispered and Pathological Voices. 296-300 - Jonah Casebeer, Hillol Sarker, Murtaza Dhuliawala, Nicholas Fay, Mary Pietrowicz, Amar Das:
Verbal Protest Recognition in Children with Autism. 301-305 - Xianjun Xia, Roberto Togneri, Ferdous Ahmed Sohel, David Huang:
Confidence Based Acoustic Event Detection. 306-310 - Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley:
Audio Set Classification with Attention Model: A Probabilistic Perspective. 316-320 - Qiuqiang Kong, Yong Xu, Wenwu Wang, Mark D. Plumbley:
A Joint Separation-Classification Model for Sound Event Detection of Weakly Labelled Data. 321-325 - Anurag Kumar, Maksim Khadkevich, Christian Fügen:
Knowledge Transfer from Weakly Labeled Audio Using Convolutional Neural Network for Sound Events and Scenes. 326-330 - Yuzhong Wu, Tan Lee:
Reducing Model Complexity for DNN Based Large-Scale Audio Classification. 331-335 - Huy Phan, Martin Krawczyk-Becker, Timo Gerkmann, Alfred Mertins:
Weighted and Multi-Task Loss for Rare Audio Event Detection. 336-340 - Mark Cartwright, Justin Salamon, Ayanna Seals, Oded Nov, Juan Pablo Bello:
Investigating the Effect of Sound-Event Loudness on Crowdsourced Audio Annotations. 341-345 - Shota Ikawa, Kunio Kashino:
Generating Sound Words from Audio Signals of Acoustic Events with Sequence-to-Sequence Model. 346-350 - Robin Scheibler, Eric Bezzam, Ivan Dokmanic:
Pyroomacoustics: A Python Package for Audio Room Simulation and Array Processing Algorithms. 351-355 - Adib Mehrabi, Keunwoo Choi, Simon Dixon, Mark B. Sandler:
Similarity Measures for Vocal-Based Drum Sample Retrieval Using Deep Convolutional Auto-Encoders. 356-360 - Xueyang Wang, Ryan Stables, Bochen Li, Zhiyao Duan:
Score-Aligned Polyphonic Microtiming Estimation. 361-365 - Taejun Kim, Jongpil Lee, Juhan Nam:
Sample-Level CNN Architectures for Music Auto-Tagging Using Raw Waveforms. 366-370 - Li Su:
Vocal Melody Extraction Using Patch-Based CNN. 371-375 - Yiming Wu, Wei Li:
Music Chord Recognition Based on MIDI-Trained Deep Feature and BLSTM-CRF Hybrid Decoding. 376-380 - Hsin Chou, Ming-Tso Chen, Tai-Shih Chi:
A Hybrid Neural Network Based on the Duplex Model of Pitch Perception for Singing Melody Extraction. 381-385 - Adrien Ycart, Emmanouil Benetos:
Polyphonic Music Sequence Transduction with Meter-Constrained LSTM Networks. 386-390 - Fu'ze Cong, Shu-Chang Liu, Li Guo, Geraint A. Wiggins:
A Parallel Fusion Approach to Piano Music Transcription Based on Convolutional Neural Network. 391-395 - Juheon Lee, Sungkyun Chang, Sang Keun Choe, Kyogu Lee:
Cover Song Identification Using Song-to-Song Cross-Similarity Matrix with Convolutional Neural Network. 396-400 - Yu-Te Wu, Berlin Chen, Li Su:
Automatic Music Transcription Leveraging Generalized Cepstral Features and Deep Learning. 401-405 - Stefan Kühl, Sebastian Nagel, Tobias Kabzinski, Christiane Antweiler, Peter Jax:
A Joint Perspective of Periodically Excited Efficient NLMS Algorithm and Inverse Cyclic Convolution. 406-410 - Tharun Adithya Srikrishnan, Bhaskar D. Rao, Ritwik Giri, Tao Zhang:
Improved Noise Characterization for Relative Impulse Response Estimation. 411-415 - Ajay Dagar, Satyavolu Sai Nitish, Rajesh M. Hegde:
Joint Adaptive Impulse Response Estimation and Inverse Filtering for Enhancing In-Car Audio. 416-420 - Joonas Nikunen, Tuomas Virtanen:
Estimation of Time-Varying Room Impulse Responses of Multiple Sound Sources from Observed Mixture and Isolated Source Signals. 421-425 - Jacob Moller Hjerrild, Mads Græsbøll Christensen:
Estimation of Source Panning Parameters and Segmentation of Stereophonic Mixtures. 426-430 - Karim M. Ibrahim, Mahmoud Allam:
Primary-Ambient Source Separation for Upmixing to Surround Sound Systems. 431-435 - Fabian-Robert Stöter, Soumitro Chakrabarty, Bernd Edler, Emanuël A. P. Habets:
Classification vs. Regression in Supervised Learning for Single Channel Speaker Count Estimation. 436-440 - Ina Kodrasi, Simon Doclo:
Joint Late Reverberation and Noise Power Spectral Density Estimation in a Spatially Homogeneous Noise Field. 441-445 - Daniele Giacobello, Tobias Lindstrøm Jensen:
Speech Dereverberation Based on Convex Optimization Algorithms for Group Sparse Linear Prediction. 446-450 - Marvin Tammen, Ina Kodrasi, Simon Doclo:
Complexity Reduction of Eigenvalue Decomposition-Based Diffuse Power Spectral Density Estimators Using the Power Method. 451-455 - Xiaohui Ma, Patrick J. Hegarty, Jakob Juul Larsen:
Mitigation of Nonlinear Distortion in Sound Zone Control by Constraining Individual Loudspeaker Driver Amplitudes. 456-460 - Daiki Takeuchi, Kohei Yatabe, Yasuhiro Oikawa:
Realizing Directional Sound Source in FDTD Method by Estimating Initial Value. 461-465 - Natsuki Ueno, Shoichi Koyama, Hiroshi Saruwatari:
Sound Field Reproduction with Exterior Cancellation Using Analytical Weighting of Harmonic Coefficients. 466-470 - Peng Chen, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
3D Exterior Soundfield Reproduction Using a Planar Loudspeaker Array. 471-475 - Wen Zhang, Junqing Zhang, Thushara D. Abhayapala, Lijun Zhang:
2.5D Multizone Reproduction Using Weighted Mode Matching. 476-480 - Terence Betlehem, Lakshmi Krishnan, Paul D. Teal:
Temperature Robust Active-Compensated Sound Field Reproduction Using Impulse Response Shaping. 481-485 - Shota Minami, Jun Kuroda, Yasuhiro Oikawa:
Individual Difference of Ultrasonic Transducers for Parametric Array Loudspeaker. 486-490 - Taewoong Lee, Jesper Kjær Nielsen, Jesper Rindom Jensen, Mads Græsbøll Christensen:
A Unified Approach to Generating Sound Zones Using Variable Span Linear Filters. 491-495 - Kimitaka Tsutsumi, Yoichi Haneda, Ken'ichi Noguchil, Hideaki Takada:
Directivity Synthesis with Multipoles Comprising a Cluster of Focused Sources Using a Linear Loudspeaker Array. 496-500 - Shoichi Koyama, Gilles Chardon, Laurent Daudet:
Joint Source and Sensor Placement for Sound Field Control Based on Empirical Interpolation Method. 501-505 - Gongping Huang, Jingdong Chen, Jacob Benesty:
On the Design of Robust Steerable Frequency-Invariant Beampatterns with Concentric Circular Microphone Arrays. 506-510 - Xin Leng, Jingdong Chen, Jacob Benesty, Israel Cohen:
On Speech Enhancement Using Microphone Arrays in the Presence of Co-Directional Interference. 511-515 - Mehdi Zohourian, Rainer Martin:
GSC-Based Binaural Speaker Separation Preserving Spatial Cues. 516-520 - Randall Ali, Toon van Waterschoot, Marc Moonen:
Generalised Sidelobe Canceller for Noise Reduction in Hearing Devices Using an External Microphone. 521-525 - Ziteng Wang, Lu Yin, Junfeng Li, Yonghong Yan:
On SDW-MWF and Variable Span Linear Filter with Application to Speech Recognition in Noisy Environments. 526-530 - Takuya Higuchi, Keisuke Kinoshita, Nobutaka Ito, Shigeki Karita, Tomohiro Nakatani:
Frame-by-Frame Closed-Form Update for Mask-Based Adaptive MVDR Beamforming. 531-535 - Ying Zhou, Yanmin Qian:
Robust Mask Estimation By Integrating Neural Network-Based and Clustering-Based Approaches for Adaptive Acoustic Beamforming. 536-540 - Qingju Liu, Yong Xu, Philip J. B. Jackson, Wenwu Wang, Philip Coleman:
Iterative Deep Neural Networks for Speaker-Independent Binaural Blind Speech Separation. 541-545 - Nobutaka Ito, Takashi Makino, Shoko Araki, Tomohiro Nakatani:
Maximum-Likelihood Online Speaker Diarization in Noisy Meetings Based on Categorical Mixture Model and Probabilistic Spatial Dictionary. 546-550 - Manoj Dinakaran, Fabian Brinkmann, Stine Harder, Robert Pelzer, Peter Grosche, Rasmus R. Paulsen, Stefan Weinzierl:
Perceptually Motivated Analysis of Numerically Simulated Head-Related Transfer Functions Generated By Various 3D Surface Scanning Systems. 551-555 - Haiming Mai, Bosun Xie, Jianliang Jiang:
Influence of the Number of Loudspeakers on the Timbre in Mixed-Order Ambisonics Reproduction. 556-560 - Tianshu Qu, Zhichao Huang, Yue Qiao, Xihong Wu:
Matching Projection Decoding Method for Ambisonics System. 561-565 - Christoph Urbanietz, Gerald Enzner:
Binaural Rendering of Dynamic Head and Sound Source Orientation Using High-Resolution HRTF and Retarded Time. 566-570 - Clément Gaultier, Nancy Bertin, Rémi Gribonval:
Cascade: Channel-Aware Structured Cosparse Audio Declipper. 571-575 - Tian Tan, Yanmin Qian, Dong Yu:
Knowledge Transfer in Permutation Invariant Training for Single-Channel Multi-Talker Speech Recognition. 571-5718 - Ichrak Toumi, Valentin Emiya:
Sparse Non-Local Similarity Modeling for Audio Inpainting. 576-580 - Zhengshan Shi, Tomoyasu Nakano, Masataka Goto:
Instlistener: An Expressive Parameter Estimation System Imitating Human Performances of Monophonic Musical Instruments. 581-585 - Eric Grinstein, Ngoc Q. K. Duong, Alexey Ozerov, Patrick Pérez:
Audio Style Transfer. 586-590 - Prem Seetharaman, Gautham J. Mysore, Paris Smaragdis, Bryan Pardo:
Blind Estimation of the Speech Transmission Index for Speech Quality Prediction. 591-595 - Dominic Ward, Hagen Wierstorf, Russell D. Mason, Emad M. Grais, Mark D. Plumbley:
BSS Eval or PEASS? Predicting the Perception of Singing-Voice Separation. 596-600 - Estefanía Cano, Judith Liebetrau, Derry Fitzgerald, Karlheinz Brandenburg:
The Dimensions of Perceptual Quality of Sound Source Separation. 601-605