ICASSP 2016: Shanghai, China
- 2016 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2016, Shanghai, China, March 20-25, 2016. IEEE 2016, ISBN 978-1-4799-9988-0
AASP-L1: Room Acoustics and Geometry Estimation
- Ingmar Jager, Richard Heusdens, Nikolay D. Gaubitch: Room geometry estimation from acoustic echoes using graph-based echo labeling. 1-5
- Christine Evers, Alastair H. Moore, Patrick A. Naylor: Acoustic simultaneous localization and mapping (A-SLAM) of a moving microphone array and its surrounding speakers. 6-10
- Miranda Krekovic, Ivan Dokmanic, Martin Vetterli: EchoSLAM: Simultaneous localization and mapping with acoustic echoes. 11-15
- Giacomo Vairetti, Søren Holdt Jensen, Enzo De Sena, Marc Moonen, Michael Catrysse, Toon van Waterschoot: Multichannel identification of room acoustic systems with adaptive filters based on orthonormal basis functions. 16-20
- Tiexing Wang, Fangrong Peng, Biao Chen: First order echo based room shape recovery using a single mobile device. 21-25
- Yusuke Hioka, Kenta Niwa: Estimating direct-to-reverberant ratio mapped from power spectral density using deep neural network. 26-30
AASP-L2: Source Separation I
- John R. Hershey, Zhuo Chen, Jonathan Le Roux, Shinji Watanabe: Deep clustering: Discriminative embeddings for segmentation and separation. 31-35
- Derry Fitzgerald, Antoine Liutkus, Roland Badeau: PROJET - Spatial audio separation using projections. 36-40
- Minje Kim, Paris Smaragdis: Efficient neighborhood-based topic modeling for collaborative audio enhancement on massive crowdsourced recordings. 41-45
- Paul Magron, Roland Badeau, Bertrand David: Complex NMF under phase constraints based on signal modeling: Application to audio source separation. 46-50
- Kazuyoshi Yoshii, Katsutoshi Itoyama, Masataka Goto: Student's T nonnegative matrix factorization and positive semidefinite tensor factorization for single-channel audio source separation. 51-55
- Yuki Mitsufuji, Shoichi Koyama, Hiroshi Saruwatari: Multichannel blind source separation based on non-negative tensor factorization in wavenumber domain. 56-60
AASP-L3: Signal Processing for Music
- Francisco J. Rodríguez-Serrano, Sebastian Ewert, Pedro Vera-Candeas, Mark B. Sandler: A score-informed shift-invariant extension of complex matrix factorization for improving the separation of overlapped partials in music recordings. 61-65
- Sankalp Gulati, Joan Serrà, Vignesh Ishwar, Sertan Sentürk, Xavier Serra: Phrase-based rāga recognition using vector space modeling. 66-70
- Satoru Fukayama, Masataka Goto: Music emotion recognition with adaptive aggregation of Gaussian process regressors. 71-75
- Ajay Srinivasamurthy, Andre Holzapfel, Ali Taylan Cemgil, Xavier Serra: A generalized Bayesian model for tracking long metrical cycles in acoustic music signals. 76-80
- Colin Raffel, Daniel P. W. Ellis: Optimizing DTW-based audio-to-MIDI alignment and matching. 81-85
- Jesper Kjær Nielsen, Tobias Lindstrøm Jensen, Jesper Rindom Jensen, Mads Græsbøll Christensen, Søren Holdt Jensen: Fast and statistically efficient fundamental frequency estimation. 86-90
AASP-L4: Microphone Array Processing
- Amin Hassani, Alexander Bertrand, Marc Moonen: LCMV beamforming with subspace projection for multi-speaker speech enhancement. 91-95
- Despoina Pavlidi, Symeon Delikaris-Manias, Ville Pulkki, Athanasios Mouchtaris: 3D DOA estimation of multiple sound sources based on spatially constrained beamforming driven by intensity vectors. 96-100
- Thomas Sherson, W. Bastiaan Kleijn, Richard Heusdens: A distributed algorithm for robust LCMV beamforming. 101-105
- Matthew O'Connor, W. Bastiaan Kleijn, Thushara D. Abhayapala: Distributed sparse MVDR beamforming using the bi-alternating direction method of multipliers. 106-110
- Hamza A. Javed, Alastair H. Moore, Patrick A. Naylor: Spherical microphone array acoustic rake receivers. 111-115
- Isha Agrawal, Rajesh M. Hegde: Radial filters for near field source separation in spherical harmonic domain. 116-120
AASP-L5: Source Separation II
- Scott Wisdom, John R. Hershey, Jonathan Le Roux, Shinji Watanabe: Deep unfolding for multichannel source separation. 121-125
- Fabian-Robert Stöter, Antoine Liutkus, Roland Badeau, Bernd Edler, Paul Magron: Common fate model for unison source separation. 126-130
- Damien Bouvier, Nicolas Obin, Marco Liuni, Axel Roebel: A source/filter model with adaptive constraints for NMF-based speech separation. 131-135
- Dionyssos Kounades-Bastian, Laurent Girin, Xavier Alameda-Pineda, Sharon Gannot, Radu Horaud: An inverse-gamma source variance prior with factorized parameterization for audio source separation. 136-140
- Lukas Drude, Christoph Böddeker, Reinhold Haeb-Umbach: Blind speech separation based on complex spherical k-mode clustering. 141-145
- Zhong-Qiu Wang, Yan Zhao, DeLiang Wang: Phoneme-specific speech separation. 146-150
AASP-L6: Signal Processing for Reverberant Acoustic Environments
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets: Joint maximum likelihood estimation of late reverberant and speech power spectral density in noisy environments. 151-155
- Deepak Baby, Hugo Van hamme: Supervised speech dereverberation in noisy environments using exemplar-based sparse representations. 156-160
- Christian Hofmann, Walter Kellermann: Source-specific system identification. 161-165
- Ina Kodrasi, Ante Jukic, Simon Doclo: Robust sparsity-promoting acoustic multi-channel equalization for speech dereverberation. 166-170
- Robin Scheibler, Martin Vetterli: The recursive Hessian sketch for adaptive filtering. 171-175
- Jesper Rindom Jensen, Jesper Kjær Nielsen, Richard Heusdens, Mads Græsbøll Christensen: DOA estimation of audio sources in reverberant environments. 176-180
AASP-L7: Noise Modeling and Signal Enhancement
- Xiaofei Li, Laurent Girin, Sharon Gannot, Radu Horaud: Non-stationary noise power spectral density estimation based on regional statistics. 181-185
- Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Hitoshi Ohmuro: Integrated approach of feature extraction and sound source enhancement based on maximization of mutual information. 186-190
- Mathew Shaji Kavalekalam, Mads Græsbøll Christensen, Fredrik Gran, Jesper Bünsow Boldt: Kalman filter for speech enhancement in cocktail party scenarios using a codebook-based approach. 191-195
- Jahn Heymann, Lukas Drude, Reinhold Haeb-Umbach: Neural network based spectral mask estimation for acoustic beamforming. 196-200
- Dörte Fischer, Timo Gerkmann: Single-microphone speech enhancement using MVDR filtering and Wiener post-filtering. 201-205
- Robert Rehr, Timo Gerkmann: Bias correction methods for adaptive recursive smoothing with applications in noise PSD estimation. 206-210
AASP-P1: Hearing Aids and Environmental Sound Recognition
- Huy Phan, Marco Maaß, Lars Hertel, Radoslaw Mazur, Ian McLoughlin, Alfred Mertins: Learning compact structural representations for audio events using regressor banks. 211-215
- Debmalya Chakrabarty, Mounya Elhilali: Abnormal sound event detection using temporal trajectories mixtures. 216-220
- Kun Qian, Christoph Janott, Zixing Zhang, Clemens Heiser, Björn W. Schuller: Wavelet features for classification of VOTE snore sounds. 221-225
- Tim Fischer, Johannes Schneider, Wilhelm Stork: Classification of breath and snore sounds using audio data recorded with smartphones in the home environment. 226-230
- Henning F. Schepker, Linh Thi Thuc Tran, Sven Nordholm, Simon Doclo: Improving adaptive feedback cancellation in hearing aids using an affine combination of filters. 231-235
- Meng Guo, Anders Meng, Bernhard Kuenzle, Krista Kappeler: Intrusive howling detection methods for hearing aid evaluations. 236-240
- Elior Hadad, Daniel Marquardt, Simon Doclo, Sharon Gannot: Extensions of the binaural MWF with interference reduction preserving the binaural cues of the interfering source. 241-245
- Dianna Yee, A. Homayoun Kamkar-Parsi, Henning Puder, Rainer Martin: A speech enhancement system using binaural hearing aids and an external microphone. 246-250
- Huiqun Deng, Jun Yang: Estimating ear canal geometry and eardrum reflection coefficient from ear canal input impedance. 251-255
AASP-P2: Music Information Retrieval
- Elio Quinton, Mark B. Sandler, Simon Dixon: Estimation of the reliability of multiple rhythm features extraction from a single descriptor. 256-260
- Jun-qi Deng, Yu-Kwong Kwok: Automatic chord estimation on seventhsbass chord vocabulary using deep neural network. 261-265
- Thomas Prätzlich, Meinard Müller: Triple-based analysis of music alignments without the need of ground-truth annotations. 266-270
- Dairoku Kawai, Kazumasa Yamamoto, Seiichi Nakagawa: Speech analysis of sung-speech and lyric recognition in monophonic singing. 271-275
- Eita Nakamura, Masatoshi Hamanaka, Keiji Hirata, Kazuyoshi Yoshii: Tree-structured probabilistic model of monophonic written music based on the generative theory of tonal music. 276-280
- Stefan Balke, Vlora Arifi-Müller, Lukas Lamprecht, Meinard Müller: Retrieving audio recordings using musical themes. 281-285
- Sankalp Gulati, Joan Serrà, Vignesh Ishwar, Xavier Serra: Discovering rāga motifs by characterizing communities in networks of melodic patterns. 286-290
- Cheng-i Wang, Gautham J. Mysore: Structural segmentation with the Variable Markov Oracle and boundary adjustment. 291-295
- Simon Durand, Juan Pablo Bello, Bertrand David, Gaël Richard: Feature adapted convolutional neural networks for downbeat tracking. 296-300
AASP-P3: Spatial Audio Processing and HRTFs
- Luca Bonacina, Antonio Canclini, Fabio Antonacci, Marco Marcon, Augusto Sarti, Stefano Tubaro: A low-cost solution to 3D pinna modeling for HRTF prediction. 301-305
- Archontis Politis, Mark R. P. Thomas, Hannes Gamper, Ivan J. Tashev: Applications of 3D spherical transforms to personalization of head-related transfer functions. 306-310
- Jacob Donley, Christian H. Ritz, W. Bastiaan Kleijn: Improving speech privacy in personal sound zones. 311-315
- Kainan Chen, Jürgen T. Geiger, Karim Helwani, Mohammad Javad Taghizadeh: Localization of sound sources with known statistics in the presence of interferers. 316-320
- Jianjun He, Rishabh Ranjan, Woon-Seng Gan: Fast continuous HRTF acquisition with unconstrained movements of human subjects. 321-325
- Takuma Okamoto: 2.5D higher order ambisonics for a sound field described by angular spectrum coefficients. 326-330
- Tilak Rajapaksha, Xiaojun Qiu, Eva Cheng, Ian S. Burnett: Geometrical room geometry estimation from room impulse responses. 331-335
- Xuejie Liu, Xiaoli Zhong: An improved anthropometry-based customization method of individual head-related transfer functions. 336-339
- Oliver Thiergart, Weilong Huang, Emanuël A. P. Habets: A low complexity weighted least squares narrowband DOA estimator for arbitrary array geometries. 340-344
AASP-P4: Microphone Arrays and Spatial Acoustic Processing I
- Naoki Murata, Shoichi Koyama, Hirokazu Kameoka, Norihiro Takamune, Hiroshi Saruwatari: Sparse sound field decomposition with multichannel extension of complex NMF. 345-349
- Xiaoguang Wu, Huawei Chen: Further results on mainlobe orientation reversal of the first-order steerable differential array due to microphone phase errors. 350-354
- Mai Quyen Pham, Benoit Oudompheng, Barbara Nicolas, Jérôme I. Mars: Sparse deconvolution for moving-source localization. 355-359
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen: Informed Direction of Arrival estimation using a spherical-head model for Hearing Aid applications. 360-364
- Daan H. M. Schellekens, Martin Bo Møller, Martin Olsen: Time domain acoustic contrast control implementation of sound zones for low-frequency input signals. 365-369
- Yuji Koyano, Kohei Yatabe, Yusuke Ikeda, Yasuhiro Oikawa: Physical-model based efficient data representation for many-channel microphone array. 370-374
- Nasim Radmanesh, Bhaskar D. Rao: Frequency-based customization of multizone sound system design. 375-379
- Yiteng Arden Huang, Alejandro Luebs, Jan Skoglund, W. Bastiaan Kleijn: Globally optimized least-squares post-filtering for microphone array speech enhancement. 380-384
- Shoko Araki, Masahiro Okada, Takuya Higuchi, Atsunori Ogawa, Tomohiro Nakatani: Spatial correlation model based observation vector clustering and MVDR beamforming for meeting recognition. 385-389
AASP-P5: Microphone Arrays and Spatial Acoustic Processing II
- Hongsen He, Jingdong Chen, Jacob Benesty, Tao Yang: On time delay estimation based on multichannel spatiotemporal sparse linear prediction. 390-394
- Shoichi Koyama, Hiroshi Saruwatari: Sound field decomposition in reverberant environment using sparse and low-rank signal models. 395-399
- Changlei Li, Jacob Benesty, Gongping Huang, Jingdong Chen: Subspace superdirective beamformers based on joint diagonalization. 400-404
- Ryu Takeda, Kazunori Komatani: Sound source localization based on deep neural networks with directional activate function exploiting phase information. 405-409
- Lucio Bianchi, V. Baldini Anastasio, Dejan Markovic, Fabio Antonacci, Augusto Sarti, Stefano Tubaro: A linear operator for the computation of soundfield maps. 410-414
- Sina Hafezi, Alastair H. Moore, Patrick A. Naylor: 3D acoustic source localization in the spherical harmonic domain based on optimized grid search. 415-419
- Knud B. Christensen, Mads Græsbøll Christensen, Jesper Bünsow Boldt, Fredrik Gran: Experimental study of generalized subspace filters for the cocktail party situation. 420-424
AASP-P6: Source Separation & Denoising
- Hüseyin Hacihabiboglu: Acoustic source separation using the short-time quaternion Fourier transforms of particle velocity signals. 425-429
- Mehdi Zohourian, Rainer Martin: Binaural speaker localization and separation based on a joint ITD/ILD model and head movement tracking. 430-434
- Kenta Niwa, Yuma Koizumi, Tomoko Kawase, Kazunori Kobayashi, Yusuke Hioka: Pinpoint extraction of distant sound source based on DNN mapping from multiple beamforming outputs to prior SNR. 435-439
- Josue Sanz-Robinson, Liechao Huang, Tiffany Moy, Warren Rieutort-Louis, Yingzhe Hu, Sigurd Wagner, James C. Sturm, Naveen Verma: Robust blind source separation in a reverberant room based on beamforming with a large-aperture microphone array. 440-444
- Richard Füg, Andreas Niedermeier, Jonathan Driedger, Sascha Disch, Meinard Müller: Harmonic-percussive-residual sound separation using the structure tensor on spectrograms. 445-449
- Matt McVicar, Raúl Santos-Rodriguez, Tijl De Bie: Learning to separate vocals from polyphonic mixtures via ensemble methods and structured output prediction. 450-454
- Gurunath Reddy M., K. Sreenivasa Rao: Predominant melody extraction from vocal polyphonic music signal by combined spectro-temporal method. 455-459
- Andreas I. Koutrouvelis, Richard C. Hendriks, Jesper Jensen, Richard Heusdens: Improved multi-microphone noise reduction preserving binaural cues. 460-464
- Nobutaka Ito, Shoko Araki, Tomohiro Nakatani: Modeling audio directional statistics using a complex Bingham mixture model for blind source extraction from diffuse noise. 465-468
AASP-P7: Non-Negative Models
- Shuai Nie, Shan Liang, Hao Li, Xueliang Zhang, Zhanlei Yang, Wenju Liu, Like Dong: Exploiting spectro-temporal structures using NMF for DNN-based supervised speech separation. 469-473
- Christian Rohlfing, Julian Mathias Becker, Mathias Wien: NMF-based informed source separation. 474-478
- Kisoo Kwon, Jong Won Shin, Nam Soo Kim: NMF-based source separation utilizing prior knowledge on encoding vector. 479-483
- Cagdas Bilen, Alexey Ozerov, Patrick Pérez: Automatic allocation of NTF components for user-guided audio source separation. 484-488
- Tomohiko Nakamura, Hirokazu Kameoka: Shifted and convolutive source-filter non-negative matrix factorization for monaural audio source separation. 489-493
- Yinan Li, Xiongwei Zhang, Meng Sun, Gang Min, Jibin Yang: Adaptive extraction of repeating non-negative temporal patterns for single-channel speech enhancement. 494-498
- Thanh T. Vu, Benjamin Bigot, Eng Siong Chng: Combining non-negative matrix factorization and deep neural networks for speech enhancement and automatic speech recognition. 499-503
AASP-P8: Noise, Echo, Feedback and Reverberation Reduction
- Mahdi Parchami, Wei-Ping Zhu, Benoît Champagne: Speech dereverberation using linear prediction with estimation of early speech spectral variance. 504-508
- Wenyu Jin: Adaptive reverberation cancelation for multizone soundfield reproduction using sparse methods. 509-513
- Ritwik Giri, Bhaskar D. Rao, Fred Mustiere, Tao Zhang: Dynamic relative impulse response estimation using structured sparse Bayesian learning. 514-518
- Maria Luis Valero, Emanuël A. P. Habets: Insight into a phase modulation technique for signal decorrelation in multi-channel acoustic echo cancellation. 519-523
- Jihui Zhang, Thushara D. Abhayapala, Prasanga N. Samarasinghe, Wen Zhang, Shouda Jiang: Sparse complex FxLMS for active noise cancellation over spatial regions. 524-528
- Jianming Liu, Steven L. Grant: Proportionate affine projection algorithms for block-sparse system identification. 529-533
- Christian Hofmann, Michael Günther, Michael Buerger, Walter Kellermann: Higher-order listening room compensation with additive compensation signals. 534-538
- Meng Guo, Bernhard Kuenzle: On the periodically time-varying bias in adaptive feedback cancellation systems with frequency shifting. 539-543
AASP-P9: Audio and Music Content Analysis
- Xinxing Li, Haishu Xianyu, Jiashen Tian, Wenxiao Chen, Fanhang Meng, Mingxing Xu, Lianhong Cai: A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction. 544-548
- Haishu Xianyu, Xinxing Li, Wenxiao Chen, Fanhang Meng, Jiashen Tian, Mingxing Xu, Lianhong Cai: SVR based double-scale regression for dynamic emotion prediction in music. 549-553
- Colin Raffel, Daniel P. W. Ellis: Pruning subsequence search with attention-based embedding. 554-558
- Peter Jancovic, Münevver Köküer, Masoud Zakeri, Martin J. Russell: Bird species recognition using HMM-based unsupervised modelling of individual syllables with incorporated duration modelling. 559-563
- Jose A. Belloch, Vesa Välimäki: Efficient target-response interpolation for a graphic equalizer. 564-568
- Thomas Prätzlich, Jonathan Driedger, Meinard Müller: Memory-restricted multiscale dynamic time warping. 569-573
- Yi-Chan Wu, Homer H. Chen: Emotion-flow guided music accompaniment generation. 574-578
- Hong Su, Hui Zhang, Xueliang Zhang, Guanglai Gao: Convolutional neural network for robust pitch determination. 579-583
AASP-P10: Noise Modelling, Signal Enhancement and Equalization
- Anna Maly, Pejman Mowlaee: On the importance of harmonic phase modification for improved speech signal reconstruction. 584-588
- Chuang Shi, Yoshinobu Kajikawa: Automatic gain control for parametric array loudspeakers. 589-593
- Christian Hofmann, Walter Kellermann: Generalized wave-domain transforms for listening room equalization with azimuthally irregularly spaced loudspeaker arrays. 594-598