ICASSP 2012: Kyoto, Japan
- 2012 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2012, Kyoto, Japan, March 25-30, 2012. IEEE 2012, ISBN 978-1-4673-0046-9
Loudspeaker and Microphone Array Signal Processing
- Edwin Mabande, Michael Buerger, Walter Kellermann:
  Design of robust polynomial beamformers for symmetric arrays. 1-4
- Stefanie Brown, Shuai Wang, Deep Sen:
  Analysis of the spherical wave truncation error for spherical harmonic soundfield expansions. 5-8
- Gilles Chardon, Laurent Daudet:
  Narrowband source localization in an unknown reverberant environment using wavefield sparse decomposition. 9-12
- Martin Schneider, Walter Kellermann:
  Adaptive listening room equalization using a scalable filtering structure in the wave domain. 13-16
- Dominic Schmid, Sarmad Malik, Gerald Enzner:
  An expectation-maximization algorithm for multichannel adaptive speech dereverberation in the frequency-domain. 17-20
- Daniel Marquardt, Volker Hohmann, Simon Doclo:
  Binaural cue preservation for hearing aids using multi-channel Wiener filter with instantaneous ITF preservation. 21-24
Echo Cancellation
- Cristian Lucian Stanciu, Jacob Benesty, Constantin Paleologu, Tomas Gänsler, Silviu Ciochina:
  A novel perspective on stereophonic acoustic echo cancellation. 25-28
- Jason Wung, Ted S. Wada, Biing-Hwang Juang:
  Inter-channel decorrelation by sub-band resampling in frequency domain. 29-32
- Jose Manuel Gil-Cacho, Toon van Waterschoot, Marc Moonen, Søren Holdt Jensen:
  Nonlinear acoustic echo cancellation based on a parallel-cascade kernel affine projection algorithm. 33-36
- Sarmad Malik, Gerald Enzner:
  Variational Bayesian inference for nonlinear acoustic echo cancellation using adaptive cascade modeling. 37-40
- Moctar Mossi Idrissa, Christelle Yemdji, Nicholas W. D. Evans, Christophe Beaugeant, Fabrice Plante, Fatimazahra Marfouq:
  Dual amplifier and loudspeaker compensation using fast convergent and cascaded approaches to non-linear acoustic echo cancellation. 41-44
- Mohamed Krini, Gerhard Schmidt:
  Method for temporal interpolation of short-term spectra and its application to adaptive system identification. 45-48
Source Separation: Music and Speech
- Pablo Sprechmann, Pablo Cancela, Guillermo Sapiro:
  Gaussian mixture models for score-informed instrument separation. 49-52
- Antoine Liutkus, Zafar Rafii, Roland Badeau, Bryan Pardo, Gaël Richard:
  Adaptive filtering for music/voice separation exploiting the repeating musical structure. 53-56
- Po-Sen Huang, Scott Deeann Chen, Paris Smaragdis, Mark Hasegawa-Johnson:
  Singing-voice separation from monaural recordings using robust principal component analysis. 57-60
- Felix Weninger, Jordi Feliu, Björn W. Schuller:
  Supervised and semi-supervised suppression of background music in monaural speech recordings. 61-64
- Jalal Taghia, Rainer Martin, Richard C. Hendriks:
  On mutual information as a measure of speech intelligibility. 65-68
- Pejman Mowlaee, Rahim Saeidi, Mads Græsbøll Christensen, Rainer Martin:
  Subjective and objective quality assessment of single-channel speech separation algorithms. 69-72
Music: Classification and Recognition
- Rémi Foucard, Slim Essid, Mathieu Lagrange, Gaël Richard:
  A regressive boosting approach to automatic audio tagging based on soft annotator fusion. 73-76
- Ju-Chiang Wang, Hsin-Min Wang, Shyh-Kang Jeng:
  Playing with tagging: A real-time tagging music player. 77-80
- Justin Salamon, Bruno Miguel Machado Rocha, Emilia Gómez:
  Musical genre classification using melody features extracted from polyphonic music signals. 81-84
- Felix Weninger, Noam Amir, Ofer Amir, Irit Ronen, Florian Eyben, Björn W. Schuller:
  Robust feature extraction for automatic recognition of vibrato singing in recorded polyphonic music. 85-88
- Andre Holzapfel, Matthew E. P. Davies, José Ricardo Zapata, João Lobato Oliveira, Fabien Gouyon:
  On the automatic identification of difficult examples for beat tracking: Towards building new evaluation datasets. 89-92
- Peter Grosche, Meinard Müller:
  Toward musically-motivated audio fingerprints. 93-96
Source Separation and Signal Enhancement
- Jingdong Chen, Jacob Benesty:
  Single-channel noise reduction in the STFT domain based on the bifrequency spectrum. 97-100
- Nicolas Sturmel, Laurent Daudet:
  Iterative phase reconstruction of Wiener filtered signals. 101-104
- Timo Gerkmann, Richard C. Hendriks:
  Improved MMSE-based noise PSD tracking using temporal cepstrum smoothing. 105-108
- Mehrez Souden, Shoko Araki, Keisuke Kinoshita, Tomohiro Nakatani, Hiroshi Sawada:
  A multichannel MMSE-based framework for joint blind source separation and noise reduction. 109-112
- Andreas Schwarz, Klaus Reindl, Walter Kellermann:
  A two-channel reverberation suppression scheme based on blind signal separation and Wiener filtering. 113-116
- Chao Li, Jacob Benesty, Jingdong Chen:
  Optimal rectangular filtering matrix for noise reduction in the time domain. 117-120
Music: Transcription, Separation and Transformations
- Sebastian Böck, Markus Schedl:
  Polyphonic piano note transcription with recurrent neural networks. 121-124
- Holger Kirchhoff, Simon Dixon, Anssi Klapuri:
  Shift-variant non-negative matrix deconvolution for music transcription. 125-128
- Sebastian Ewert, Meinard Müller:
  Using score-informed constraints for NMF-based source separation. 129-132
- Kazuki Ochiai, Hirokazu Kameoka, Shigeki Sagayama:
  Explicit beat structure modeling for non-negative matrix factorization-based multipitch analysis. 133-136
- Marcelo F. Caetano, Xavier Rodet:
  A source-filter model for musical instrument sound transformation. 137-140
- Hao Mu, Woon-Seng Gan, Ee-Leng Tan:
  A psychoacoustic bass enhancement system with improved transient and steady-state performance. 141-144
Perception and Echo Cancellation
- Malcolm Slaney, Trevor Agus, Shih-Chii Liu, Emine Merve Kaya, Mounya Elhilali:
  A model of attention-driven scene analysis. 145-148
- Carlos Renato C. Nakagawa, Sven Nordholm, Wei-Yong Yan:
  Dual microphone solution for acoustic feedback cancellation for assistive listening. 149-152
- Ramdas Kumaresan, Vijay Kumar Peddinti, Peter Cariani:
  Synchrony capture filterbank (SCFB): An auditory periphery inspired method for tracking sinusoids. 153-156
- Zhangli Chen, Guangshu Hu:
  A revised method of calculating auditory excitation patterns and loudness for time-varying sounds. 157-160
- Srikanth Vishnubhotla, Jinjun Xiao, Buye Xu, Martin F. McKinney, Tao Zhang:
  Annoyance perception and modeling for hearing-impaired listeners. 161-164
- Ivan Tashev:
  Coherence based double talk detector with soft decision. 165-168
- Constantin Paleologu, Jacob Benesty, Felix Albu:
  Regularization of the improved proportionate affine projection algorithm. 169-172
- Christian Schüldt, Fredric Lindström, Ingvar Claesson:
  Robust low-complexity transfer logic for two-path echo cancellation. 173-176
- Pratik Shah, Steven L. Grant, Jacob Benesty:
  On an iterative method for basis pursuit with application to echo cancellation with sparse impulse responses. 177-180
- Osamu Hoshuyama:
  Dual-microphone echo canceller for suppressing loud nonlinear echo. 181-184
- Radoslaw Mazur, Jan Ole Jungmann, Alfred Mertins:
  Optimized gradient calculation for room impulse response reshaping algorithm based on p-norm optimization. 185-188
- Seong-Woo Kim, Young-Cheol Park, Dae Hee Youn:
  A variable step-size filtered-x gradient adaptive lattice algorithm for active noise control. 189-192
Loudspeaker and Microphone Array Signal Processing
- Futoshi Asano, Hideki Asoh, Kazuhiro Nakadai:
  Sound source localization in spatially colored noise using a hierarchical Bayesian model. 193-196
- Shmulik Markovich Golan, Sharon Gannot, Israel Cohen:
  A sparse blocking matrix for multiple constraints GSC beamformer. 197-200
- Tofigh Naghibi, Beat Pfister:
  An approach to prevent adaptive beamformers from cancelling the desired signal. 205-208
- Hai Morgenstern, Boaz Rafaely:
  Analysis of acoustic MIMO systems in enclosed sound fields. 209-212
- Francesco Nesta, Maurizio Omologo:
  Enhanced multidimensional spatial functions for unambiguous localization of multiple sparse acoustic sources. 213-216
- Karim Youssef, Sylvain Argentieri, Jean-Luc Zarader:
  A binaural sound source localization method using auditive cues and vision. 217-220
- Hiroaki Itou, Ken'ichi Furuya, Yoichi Haneda:
  Localized sound reproduction using circular loudspeaker array based on acoustic evanescent wave. 221-224
- Kenta Niwa, Sumitaka Sakauchi, Ken'ichi Furuya, Manabu Okamoto, Yoichi Haneda:
  Diffused sensing for sharp directivity microphone array. 225-228
- Paolo Annibale, Rudolf Rabenstein:
  Speed of sound and air temperature estimation using the TDOA-based localization framework. 229-232
- Chenchi Luo, James H. McClellan, Milind Borkar, Arthur J. Redfern:
  A model based excursion protection algorithm for loudspeakers. 233-236
- Lars-Johan Brännmark, Adrian Bahne, Anders Ahlén:
  Improved loudspeaker-room equalization using multiple loudspeakers and MIMO feedforward control. 237-240
Source Separation
- John Woodruff, DeLiang Wang:
  Binaural speech segregation based on pitch and azimuth tracking. 241-244
- Yasuaki Iwata, Tomohiro Nakatani:
  Introduction of speech log-spectral priors into dereverberation based on Itakura-Saito distance minimization. 245-248
- Robert Peharz, Franz Pernkopf:
  On linear and mixmax interaction models for single channel source separation. 249-252
- Jalil Taghia, Nasser Mohammadiha, Arne Leijon:
  A variational Bayes approach to the underdetermined blind source separation with automatic determination of the number of sources. 253-256
- Yu Ting Yeung, Tan Lee, Cheung-Chi Leung:
  Integrating multiple observations for model-based single-microphone speech separation with conditional random fields. 257-260
- Hiroshi Sawada, Hirokazu Kameoka, Shoko Araki, Naonori Ueda:
  Efficient algorithms for multichannel extensions of Itakura-Saito nonnegative matrix factorization. 261-264
- Shoko Araki, Tomohiro Nakatani:
  Sparse vector factorization for underdetermined BSS using wrapped-phase GMM and source log-spectral prior. 265-268
- Takuro Maruyama, Shoko Araki, Tomohiro Nakatani, Shigeki Miyabe, Takeshi Yamada, Shoji Makino, Atsushi Nakamura:
  New analytical update rule for TDOA inference for underdetermined BSS in noisy environments. 269-272
- Kamil Adiloglu, Emmanuel Vincent:
  A general variational Bayesian framework for robust feature extraction in multisource recordings. 273-276
- Ricard Marxer, Jordi Janer:
  A Tikhonov regularization method for spectrum decomposition in low latency audio source separation. 277-280
- Jordi Janer, Ricard Marxer, Keita Arimoto:
  Combining a harmonic-based NMF decomposition with transient analysis for instantaneous percussion separation. 281-284
- Christopher Osterwise, Steven L. Grant:
  Effect of frequency oversampling and cascade initialization on permutation control in frequency domain BSS. 285-288
Noise Reduction and Source Separation
- Datao You, Jiqing Han, Guibin Zheng, Tieran Zheng:
  Sparse power spectrum based robust voice activity detector. 289-292
- Roland Maas, Emanuël A. P. Habets, Armin Sehr, Walter Kellermann:
  On the application of reverberation suppression to robust speech recognition. 297-300
- Seon Man Kim, Hong Kook Kim, Sung Joo Lee, Yunkeun Lee:
  Adaptation mode control with residual noise estimation for beamformer-based multi-channel speech enhancement. 301-304
- Emanuël A. P. Habets, Jacob Benesty, Jingdong Chen:
  Multi-microphone noise reduction using interchannel and interframe correlations. 305-308
- Oliver Thiergart, Giovanni Del Galdo, Emanuël A. P. Habets:
  Signal-to-reverberant ratio estimation based on the complex spatial coherence between omnidirectional microphones. 309-312
- Jacob Benesty, Jingdong Chen:
  A multichannel widely linear approach to binaural noise reduction using an array of microphones. 313-316
- Shakeel Ahmed, Akihisa Oishi, Muhammad Tahir Akhtar, Wataru Mitsuhashi:
  Auxiliary noise power scheduling for on-line secondary path modeling in single channel feedforward active noise control systems. 317-320
- Tongwei Wang, Woon-Seng Gan, Yong Kim Chong:
  Psychoacoustic hybrid active noise control system. 321-324
- Annea Barkefors, Simon Berthilsson, Mikael Sternad:
  Extending the area silenced by active noise control using multiple loudspeakers. 325-328
- Azar Mahmoodzadeh, Hamid Sheikhzadeh, Hamid Reza Abutalebi, Hamid Soltanian-Zadeh:
  A hybrid coherent-incoherent method of modulation filtering for Single Channel Speech Separation. 329-332
Audio Analysis and Synthesis
- Zixing Zhang, Björn W. Schuller:
  Semi-supervised learning helps in sound event classification. 333-336
- Christian Borß, Rainer Martin:
  On the construction of window functions with constant-overlap-add constraint for arbitrary window shifts. 337-340
- Björn W. Schuller, Simone Hantke, Felix Weninger, Wenjing Han, Zixing Zhang, Shrikanth S. Narayanan:
  Automatic recognition of emotion evoked by general sound events. 341-344
- Mads Græsbøll Christensen:
  A method for low-delay pitch tracking and smoothing. 345-348
- Anil M. Nagathil, Rainer Martin:
  Optimal signal reconstruction from a constant-Q spectrum. 349-352
- Patrick Kramer, Jakob Abeßer, Christian Dittmar, Gerald Schuller:
  A digital waveguide model of the electric bass guitar including different playing techniques. 353-356
- Jörg-Hendrik Bach, Arne-Freerk Meyer, Duncan C. McElfresh, Jörn Anemüller:
  Automatic classification of audio data using nonlinear neural response models. 357-360
- Juan José Burred:
  Genetic motif discovery applied to audio analysis. 361-364
- Matthias Janke, Michael Wand, Keigo Nakamura, Tanja Schultz:
  Further investigations on EMG-to-speech conversion. 365-368
- Brian Hamilton, Philippe Depalle:
  A unified view of non-stationary sinusoidal parameter estimation methods using signal derivatives. 369-372
- Jan Jagla, Julien Maillard, Nadine Martin:
  Sample-based engine noise synthesis using a harmonic synchronous overlap-and-add method. 373-376
Spatial Audio and Audio Coding
- Mark A. Poletti, Terence Betlehem, Thushara D. Abhayapala:
  Analysis of 2D sound reproduction with fixed-directivity loudspeakers. 377-380
- Shoichi Koyama, Ken'ichi Furuya, Yusuke Hiwasaki, Yoichi Haneda:
  Design of transform filter for reproducing arbitrarily shifted sound field using phase-shift of spatio-temporal frequency. 381-384
- Andrew Wabnitz, Nicolas Epain, Craig T. Jin:
  A frequency-domain algorithm to upscale ambisonic sound scenes. 385-388
- Kimberly J. Fink, Laura E. Ray:
  Tuning principal component weights to individualize HRTFs. 389-392
- Terence Betlehem, Paul D. Teal, Yusuke Hioka:
  Efficient crosstalk canceler design with impulse response shortening filters. 393-396
- Julien Capobianco, Grégory Pallone, Laurent Daudet:
  Dynamic strategy for window splitting, parameters estimation and interpolation in spatial parametric audio coders. 397-400
- Satoshi Esaki, Kenta Niwa, Takanori Nishino, Kazuya Takeda:
  Estimating sound source depth using a small-size array. 401-404
- Xiguang Zheng, Christian H. Ritz, Jiangtao Xi:
  Encoding navigable speech sources: An analysis by synthesis approach. 405-408
- Mads Græsbøll Christensen:
  Multi-channel maximum likelihood pitch estimation. 409-412
- Minyue Li, Janusz Klejsa, Alexey Ozerov, W. Bastiaan Kleijn:
  Audio coding with power spectral density preserving quantization. 413-416
- Tejaswi Nanjundaswamy, Kenneth Rose:
  Bidirectional cascaded long term prediction for frame loss concealment in polyphonic audio signals. 417-420
Music: Classification and Recognition
- Aggelos Gkiokas, Vassilis Katsouros, George Carayannis, Themos Stafylakis:
  Music tempo estimation and beat tracking by applying source separation and metrical relations. 421-424
- Daichi Sakaue, Katsutoshi Itoyama, Tetsuya Ogata, Hiroshi G. Okuno:
  Initialization-robust multipitch estimation based on latent harmonic allocation using overtone corpus. 425-428
- Siu Wa Lee, Shen Ting Ang, Minghui Dong, Haizhou Li:
  Generalized F0 modelling with absolute and relative pitch features for singing voice synthesis. 429-432
- Hiromasa Fujihara, Anssi Klapuri, Mark D. Plumbley:
  Instrumentation-based music similarity using sparse representations. 433-436
- Lise Regnier, Geoffroy Peeters:
  Singer verification: Singer model .vs. song model. 437-440
- Ken O'Hanlon, Hidehisa Nagano, Mark D. Plumbley:
  Structured sparsity for automatic music transcription. 441-444
- Maksim Khadkevich, Thomas Fillon, Gaël Richard, Maurizio Omologo:
  A probabilistic approach to simultaneous extraction of beats and downbeats. 445-448
- Aiko Uemura, Jiro Katto:
  Chord recognition using Doubly Nested Circle of Fifths. 449-452
- Eric J. Humphrey, Taemin Cho, Juan Pablo Bello:
  Learning a robust Tonnetz-space transform for automatic chord recognition. 453-456
- Tzu-Chun Yeh, Ming-Ju Wu, Jyh-Shing Roger Jang, Wei-Lun Chang, I-Bin Liao:
  A hybrid approach to singing pitch extraction based on trend estimation and hidden Markov models. 457-460
- Masahiro Nakano, Yasunori Ohishi, Hirokazu Kameoka, Ryo Mukai, Kunio Kashino:
  Bayesian nonparametric music parser. 461-464
- Hideyuki Tachibana, Hirokazu Kameoka, Nobutaka Ono, Shigeki Sagayama:
  Comparative evaluations of various harmonic/percussive sound separation algorithms based on anisotropic continuity of spectrogram. 465-468
Content Analysis for Music, Multimedia, and Medicine
- Stefano D'Angelo, Vesa Välimäki:
  Wave-digital polarity and current inverters and their application to virtual analog audio processing. 469-472
- Peter Grosche, Meinard Müller:
  Toward characteristic audio shingles for efficient cross-version music retrieval. 473-476
- Chung-Che Wang, Chieh-Hsing Chen, Chin-Yang Kuo, Li-Ting Chiu, Jyh-Shing Roger Jang:
  Accelerating query by singing/humming on GPU: Optimization for web deployment. 477-480
- Bong-Wan Kim, Dae-Lim Choi, Jae-Deok Lim, SeungWan Han, Yong-Ju Lee:
  Audio-based automatic detection of objectionable contents in noisy conditions using normalized segmental two-dimensional MFCC. 481-484
- Xavier Anguera:
  Speaker independent discriminant feature extraction for acoustic pattern-matching. 485-488
- Anurag Kumar, Pranay Dighe, Rita Singh, Sourish Chaudhuri, Bhiksha Raj:
  Audio event detection from acoustic unit occurrence patterns. 489-492