


IEEE/ACM Transactions on Audio, Speech and Language Processing, Volume 25
Volume 25, Number 1, January 2017
- Jin Chu Wu, Alvin F. Martin, Craig S. Greenberg, Raghu N. Kacker:
The Impact of Data Dependence on Speaker Recognition Evaluation. 1-14
- Hélène Papadopoulos, George Tzanetakis:
Models for Music Analysis From a Markov Logic Networks Perspective. 15-30
- Ahmed Al-Tmeme, Wai Lok Woo, Satnam Singh Dlay, Bin Gao:
Underdetermined Convolutive Source Separation Using GEM-MU With Variational Approximated Optimum Model Order NMF2D. 31-45
- Mark A. Hasegawa-Johnson, Preethi Jyothi, Daniel McCloy, Majid Mirbagheri, Giovanni M. Di Liberto, Amit Das, Bradley Ekin, Chunxi Liu, Vimal Manohar, Hao Tang, Edmund C. Lalor, Nancy F. Chen, Paul Hager, Tyler Kekona, Rose Sloan, Adrian K. C. Lee:
ASR for Under-Resourced Languages From Probabilistic Transcription. 46-59
- Zhen Huang, Sabato Marco Siniscalchi, Chin-Hui Lee:
Bayesian Unsupervised Batch and Online Speaker Adaptation of Activation Function Parameters in Deep Models for Automatic Speech Recognition. 60-71
- Simon Durand, Juan Pablo Bello, Bertrand David, Gaël Richard:
Robust Downbeat Tracking Using an Ensemble of Convolutional Networks. 72-85
- Zhehuai Chen, Yimeng Zhuang, Yanmin Qian, Kai Yu:
Phone Synchronous Speech Recognition With CTC Lattices. 86-97
- Bo Wu, Kehuang Li, Minglei Yang, Chin-Hui Lee:
A Reverberation-Time-Aware Approach to Speech Dereverberation Based on Deep Neural Networks. 98-107
- Hongjie Chen, Lei Xie, Cheung-Chi Leung, Xiaoming Lu, Bin Ma, Haizhou Li:
Modeling Latent Topics and Temporal Distance for Story Segmentation of Broadcast News. 108-119
- Hua Xing, John H. L. Hansen:
Single Sideband Frequency Offset Estimation and Correction for Quality Enhancement and Speaker Recognition. 120-132
- Andreas I. Koutrouvelis, Richard Christian Hendriks, Richard Heusdens, Jesper Jensen:
Relaxed Binaural LCMV Beamforming. 133-148
- Morten Kolbæk, Zheng-Hua Tan, Jesper Jensen:
Speech Intelligibility Potential of General and Specialized Deep Neural Network Based Speech Enhancement Systems. 149-163
- Jakob Abeßer, Klaus Frieler, Estefanía Cano, Martin Pfleiderer, Wolf-Georg Zaddach:
Score-Informed Analysis of Tuning, Intonation, Pitch Modulation, and Dynamics in Jazz Solos. 168-177
- Alastair H. Moore, Christine Evers, Patrick A. Naylor:
Direction of Arrival Estimation in the Spherical Harmonic Domain Using Subspace Pseudointensity Vectors. 178-192
- Kun Li, Xiaojun Qian, Helen M. Meng:
Mispronunciation Detection and Diagnosis in L2 English Speech Using Multidistribution Deep Neural Networks. 193-207
- Yoonchang Han, Jae-Hun Kim, Kyogu Lee:
Deep Convolutional Neural Networks for Predominant Instrument Recognition in Polyphonic Music. 208-221
Volume 25, Number 2, February 2017
- Hanchi Chen, Thushara Dheemantha Abhayapala, Prasanga N. Samarasinghe, Wen Zhang:
Direct-to-Reverberant Energy Ratio Estimation Using a First-Order Microphone. 226-237
- Peter Bell, Pawel Swietojanski, Steve Renals:
Multitask Learning of Context-Dependent Targets in Deep Neural Network Acoustic Models. 238-247
- Rui Zhao, Kezhi Mao:
Topic-Aware Deep Compositional Models for Sentence Classification. 248-260
- Dalia El Badawy, Ngoc Q. K. Duong, Alexey Ozerov:
On-the-Fly Audio Source Separation - A Novel User-Friendly Framework. 261-272
- Filip Elvander, Johan Sward, Andreas Jakobsson:
Online Estimation of Multiple Harmonic Signals. 273-284
- Vincent Renkens, Hugo Van hamme:
Weakly Supervised Learning of Hidden Markov Models for Spoken Language Acquisition. 285-295
- Luca Remaggi, Philip J. B. Jackson, Philip Coleman, Wenwu Wang:
Acoustic Reflector Localization: Novel Image Source Reversion and Direct Localization Methods. 296-309
- Prasanga N. Samarasinghe, Thushara D. Abhayapala, Hanchi Chen:
Estimating the Direct-to-Reverberant Energy Ratio Using a Spherical Harmonics-Based Spatial Correlation Model. 310-319
- Shmulik Markovich Golan, Sharon Gannot, Walter Kellermann:
Combined LCMV-TRINICON Beamforming for Separating Multiple Speech Sources in Noisy and Reverberant Environments. 320-332
- Shakeel Ahmed, Muhammad Tahir Akhtar:
Gain Scheduling of Auxiliary Noise and Variable Step-Size for Online Acoustic Feedback Cancellation in Narrow-Band Active Noise Control Systems. 333-343
- Gabriel Sargent, Frédéric Bimbot, Emmanuel Vincent:
Estimating the Structural Segmentation of Popular Music Pieces Under Regularity Constraints. 344-358
- Jordan Cheer, Stephen Daley:
An Investigation of Delayless Subband Adaptive Filtering for Multi-Input Multi-Output Active Noise Control Applications. 359-373
- Sebastian J. Schlecht, Emanuël A. P. Habets:
Feedback Delay Networks: Echo Density and Mixing Time. 374-383
- Johannes Abel, Magdalena Kaniewska, Cyril Guillaume, Wouter Tirry, Tim Fingscheidt:
An Instrumental Quality Measure for Artificially Bandwidth-Extended Speech Signals. 384-396
- Robert Rehr, Timo Gerkmann:
An Analysis of Adaptive Recursive Smoothing with Applications to Noise PSD Estimation. 397-408
- Emilio Granell, Carlos D. Martínez-Hinarejos:
Multimodal Crowdsourcing for Transcribing Handwritten Documents. 409-419
- Yaping Ma, Yegui Xiao:
A New Strategy for Online Secondary-Path Modeling of Narrowband Active Noise Control. 420-434
- Jose A. Belloch, Alberto González, Enrique S. Quintana-Ortí, Miguel Ferrer, Vesa Välimäki:
GPU-Based Dynamic Wave Field Synthesis Using Fractional Delay Filters and Room Compensation. 435-447
Volume 25, Number 3, March 2017
- Qi He, Feng Bao, Changchun Bao:
Multiplicative Update of Auto-Regressive Gains for Codebook-Based Speech Enhancement. 457-468
- Zhongqing Wang, Sophia Yat Mei Lee, Shoushan Li, Guodong Zhou:
Emotion Analysis in Code-Switching Text With Joint Factor Graph Model. 469-480
- Ashwin Bellur, Mounya Elhilali:
Feedback-Driven Sensory Mapping Adaptation for Robust Speech Activity Detection. 481-492
- Zhiyuan Tang, Lantian Li, Dong Wang, Ravichander Vipperla:
Collaborative Joint Training With Multitask Recurrent Model for Speech and Speaker Recognition. 493-504
- Bidisha Sharma, S. R. Mahadeva Prasanna:
Sonority Measurement Using System, Source, and Suprasegmental Information. 505-518
- Hung-yi Lee, Bo-Hsiang Tseng, Tsung-Hsien Wen, Yu Tsao:
Personalizing Recurrent-Neural-Network-Based Language Model by Social Network. 519-530
- Ji Ming, Danny Crookes:
Speech Enhancement Based on Full-Sentence Correlation and Clean Speech Recognition. 531-543
- Quoc Truong Do, Tomoki Toda, Graham Neubig, Sakriani Sakti, Satoshi Nakamura:
Preserving Word-Level Emphasis in Speech-to-Speech Translation. 544-556
- Zhenghua Li, Jiayuan Chao, Min Zhang, Wenliang Chen, Meishan Zhang, Guohong Fu:
Coupled POS Tagging on Heterogeneous Annotations. 557-571
- Clement S. J. Doire, Mike Brookes, Patrick A. Naylor, Christopher M. Hicks, Dave Betts, Mohammad A. Dmour, Søren Holdt Jensen:
Single-Channel Online Enhancement of Speech Corrupted by Reverberation and Noise. 572-587
- Aleksandr Sizov, Kong-Aik Lee, Tomi Kinnunen:
Direct Optimization of the Detection Cost for I-Vector-Based Spoken Language Recognition. 588-597
- Imran A. Sheikh, Dominique Fohr, Irina Illina, Georges Linarès:
Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition. 598-610
- Mojtaba Farmani, Michael Syskind Pedersen, Zheng-Hua Tan, Jesper Jensen:
Informed Sound Source Localization Using Relative Transfer Functions for Hearing Aid Applications. 611-623
- Vikram C. M., S. R. Mahadeva Prasanna:
Epoch Extraction From Telephone Quality Speech Using Single Pole Filter. 624-636
- Motoi Omachi, Tetsuji Ogawa, Tetsunori Kobayashi:
Associative Memory Model-Based Linear Filtering and Its Application to Tandem Connectionist Blind Source Separation. 637-650
- Dani Cherkassky, Sharon Gannot:
Blind Synchronization in Wireless Acoustic Sensor Networks. 651-661
- Laurent Girin, Thomas Hueber, Xavier Alameda-Pineda:
Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping. 662-673
- Mohamad Hasan Bahari, Alexander Bertrand, Marc Moonen:
Blind Sampling Rate Offset Estimation for Wireless Acoustic Sensor Networks Through Weighted Least-Squares Coherence Drift Estimation. 674-686
- Adam Kuklasinski, Simon Doclo, Søren Holdt Jensen, Jesper Rindom Jensen:
Correction to "Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise". 687
Volume 25, Number 4, April 2017
- Sharon Gannot, Emmanuel Vincent, Shmulik Markovich Golan, Alexey Ozerov:
A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation. 692-730
- Dongwen Ying, Ruohua Zhou, Junfeng Li, Yonghong Yan:
Window-Dominant Signal Subspace Methods for Multiple Short-Term Speech Source Localization. 731-744
- Sean U. N. Wood, Jean Rouat, Stéphane Dupont, Gueorgui Pironkov:
Blind Speech Separation and Enhancement With GCC-NMF. 745-755
- Constantin Spille, Birger Kollmeier, Bernd T. Meyer:
Combining Binaural and Cortical Features for Robust Speech Recognition. 756-767
- Yuma Koizumi, Kenta Niwa, Yusuke Hioka, Kazunori Kobayashi, Hitoshi Ohmuro:
Informative Acoustic Feature Selection to Maximize Mutual Information for Collecting Target Sources. 768-779
- Takuya Higuchi, Nobutaka Ito, Shoko Araki, Takuya Yoshioka, Marc Delcroix, Tomohiro Nakatani:
Online MVDR Beamformer Based on Complex Gaussian Mixture Model With Spatial Prior for Noise Robust ASR. 780-793
- Eita Nakamura, Kazuyoshi Yoshii, Shigeki Sagayama:
Rhythm Transcription of Polyphonic Piano Music Based on Merged-Output HMM for Multiple Voices. 794-806
- Omid Ghahabi, Javier Hernando:
Deep Learning Backend for Single and Multisession i-Vector Speaker Recognition. 807-817
- Penny Karanasou, Chunyang Wu, Mark J. F. Gales, Philip C. Woodland:
I-Vectors and Structured Neural Networks for Rapid Adaptation of Acoustic Models. 818-828
- G. Aneeja, B. Yegnanarayana:
Extraction of Fundamental Frequency From Degraded Speech Using Temporal Envelopes at High SNR Frequencies. 829-838
- Seyyed Saeed Sarfjoo, Cenk Demiroglu, Simon King:
Using Eigenvoices and Nearest-Neighbors in HMM-Based Cross-Lingual Speaker Adaptation With Limited Data. 839-851
- Yung-Yue Chen, Jia-Hao Zhang:
Background Noise Reduction Design for Dual Microphone Cellular Phones: Robust Approach. 852-862
- Liner Yang, Xinxiong Chen, Zhiyuan Liu, Maosong Sun:
Improving Word Representations with Document Labels. 863-870
- Shiliang Zhang, Cong Liu, Hui Jiang, Si Wei, Li-Rong Dai, Yu Hu:
Nonrecurrent Neural Structure for Long-Term Dependence. 871-884
- Xuefeng Yang, Kezhi Mao:
Task Independent Fine Tuning for Word Embeddings. 885-894
- Huawei Chen:
Design of Robust Broadband Beamformers Using Worst-Case Performance Optimization: A Semidefinite Programming Approach. 895-907
- Sandro Cumani, Pietro Laface:
Nonlinear I-Vector Transformations for PLDA-Based Speaker Recognition. 908-919
Volume 25, Number 5, May 2017
- Manu Airaksinen, Tomas Bäckström, Paavo Alku:
Quadratic Programming Approach to Glottal Inverse Filtering by Joint Norm-1 and Norm-2 Optimization. 929-939
- Ofer Schwartz, Sharon Gannot, Emanuël A. P. Habets:
Multispeaker LCMV Beamformer and Postfilter for Source Separation and Noise Reduction. 940-951
- Dongmei Wang, Chengzhu Yu, John H. L. Hansen:
Robust Harmonic Features for Classification-Based Pitch Estimation. 952-964
- Tara N. Sainath, Ron J. Weiss, Kevin W. Wilson, Bo Li, Arun Narayanan, Ehsan Variani, Michiel Bacchiani, Izhak Shafran, Andrew W. Senior, Kean K. Chin, Ananya Misra, Chanwoo Kim:
Multichannel Signal Processing With Deep Neural Networks for Automatic Speech Recognition. 965-979
- Hanieh Khalilian, Ivan V. Bajic, Rodney G. Vaughan:
A Simulation Study of a Three-Dimensional Sound Field Reproduction System for Immersive Communication. 980-995
- Andreas Franck, Wenwu Wang, Filippo Maria Fazi:
Sparse ℓ1-Optimal Multiloudspeaker Panning and Its Relation to Vector Base Amplitude Panning. 996-1010
- Songbin Li, Yizhen Jia, C.-C. Jay Kuo:
Steganalysis of QIM Steganography in Low-Bit-Rate Speech Signals. 1011-1022
- Naoyuki Kanda, Xugang Lu, Hisashi Kawai:
Maximum-a-Posteriori-Based Decoding for End-to-End Acoustic Models. 1023-1034
- Navid Shokouhi, John H. L. Hansen:
Teager-Kaiser Energy Operators for Overlapped Speech Detection. 1035-1047
- Yi-Chin Huang, Chung-Hsien Wu, Yan-You Chen, Ming-Ge Shie, Jhing-Fa Wang:
Personalized Spontaneous Speech Synthesis Using a Small-Sized Unsegmented Semispontaneous Speech. 1048-1060
- Jeongsoo Park, Jaeyoung Shin, Kyogu Lee:
Exploiting Continuity/Discontinuity of Basis Vectors in Spectrogram Decomposition for Harmonic-Percussive Sound Separation. 1061-1074
- Xueliang Zhang, DeLiang Wang:
Deep Learning Based Binaural Speech Separation in Reverberant Environments. 1075-1084
- Masood Delfarah, DeLiang Wang:
Features for Masking-Based Monaural Speech Separation in Reverberant Conditions. 1085-1094
- Feiran Yang, Gerald Enzner, Jun Yang:
Statistical Convergence Analysis for Optimal Control of DFT-Domain Adaptive Echo Canceler. 1095-1106
- Takashi Nose, Yusuke Arao, Takao Kobayashi, Komei Sugiura, Yoshinori Shiga:
Sentence Selection Based on Extended Entropy Using Phonetic and Prosodic Contexts for Statistical Parametric Speech Synthesis. 1107-1116
- Gergely Firtha, Peter Fiala, Frank Schultz, Sascha Spors:
Improved Referencing Schemes for 2.5D Wave Field Synthesis Driving Functions. 1117-1127
- Esteban Maestre, Gary P. Scavone, Julius O. Smith III:
Joint Modeling of Bridge Admittance and Body Radiativity for Efficient Synthesis of String Instrument Sound by Digital Waveguides. 1128-1139
- Gongping Huang, Jacob Benesty, Jingdong Chen:
On the Design of Frequency-Invariant Beampatterns With Uniform Circular Microphone Arrays. 1140-1153
- Zdenek Prusa, Péter Balázs, Peter L. Søndergaard:
A Noniterative Method for Reconstruction of Phase From STFT Magnitude. 1154-1164
Volume 25, Number 6, June 2017
- Gaël Richard, Tuomas Virtanen, Juan Pablo Bello, Nobutaka Ono, Hervé Glotin:
Introduction to the Special Section on Sound Scene and Event Analysis. 1169-1171
- Héctor A. Sánchez-Hevia, David Ayllón, Roberto Gil-Pita, Manuel Rosa-Zurera:
Maximum Likelihood Decision Fusion for Weapon Classification in Wireless Acoustic Sensor Networks. 1172-1182
- Nithin Rao Koluguri, G. Nisha Meenakshi, Prasanta Kumar Ghosh:
Spectrogram Enhancement Using Multiple Window Savitzky-Golay (MWSG) Filter for Robust Bird Sound Detection. 1183-1192
- Dan Stowell, Emmanouil Benetos, Lisa F. Gill:
On-Bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts. 1193-1206
- Brandon T. Carroll, Bradley M. Whitaker, Wayne Daley, David V. Anderson:
Outlier Learning via Augmented Frozen Dictionaries. 1207-1215
- Victor Bisot, Romain Serizel, Slim Essid, Gaël Richard:
Feature Learning With Matrix Factorization Applied to Acoustic Scene Classification. 1216-1229
- Yong Xu, Qiang Huang, Wenwu Wang, Peter Foster, Siddharth Sigtia, Philip J. B. Jackson, Mark D. Plumbley:
Unsupervised Feature Learning Based on Deep Models for Environmental Audio Tagging. 1230-1241
- Rene Grzeszick, Axel Plinge, Gernot A. Fink:
Bag-of-Features Methods for Acoustic Event Detection and Classification. 1242-1252
- Alain Rakotomamonjy:
Supervised Representation Learning for Audio Scene Classification. 1253-1265
- Emmanouil Benetos, Grégoire Lafay, Mathieu Lagrange, Mark D. Plumbley:
Polyphonic Sound Event Tracking Using Linear Dynamical Systems. 1266-1277
- Huy Phan, Lars Hertel, Marco Maaß, Philipp Koch, Radoslaw Mazur, Alfred Mertins:
Improved Audio Scene Classification Based on Label-Tree Embeddings and Convolutional Neural Networks. 1278-1290
- Emre Çakir, Giambattista Parascandolo, Toni Heittola, Heikki Huttunen, Tuomas Virtanen:
Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection. 1291-1303
- Jens Schröder, Niko Moritz, Jörn Anemüller, Stefan Goetze, Birger Kollmeier:
Classifier Architectures for Acoustic Scenes and Events: Implications for DNNs, TDNNs, and Perceptual Features from DCASE 2016. 1304-1314
- Wenjun Yang, Sridhar Krishnan:
Combining Temporal Features by Local Binary Pattern for Acoustic Scene Classification. 1315-1321
- David Dov, Ronen Talmon, Israel Cohen:
Multimodal Kernel Method for Activity Detection of Sound Sources. 1322-1334
- Keisuke Imoto, Nobutaka Ono:
Spatial Cepstrum as a Spatial Feature Using a Distributed Microphone Array for Acoustic Scene Analysis. 1335-1343
- Ivo Trowitzsch, Johannes Mohr, Youssef Kashef, Klaus Obermayer:
Robust Detection of Environmental Sounds in Binaural Auditory Scenes. 1344-1356
- Abu Shafin Mohammad Mahdee Jameel, Shaikh Anowarul Fattah, Rajib Goswami, Wei-Ping Zhu, M. Omair Ahmad:
Noise Robust Formant Frequency Estimation Method Based on Spectral Model of Repeated Autocorrelation of Speech. 1357-1370
- Na Li, Man-Wai Mak, Jen-Tzung Chien:
DNN-Driven Mixture of PLDA for Robust Speaker Verification. 1371-1383
- Kai Wu, Vaninirappuputhenpurayil Gopalan Reju, Andy W. H. Khong, Shu Ting Goh:
Swarm Intelligence Based Particle Filter for Alternating Talker Localization and Tracking Using Microphone Arrays. 1384-1397
Volume 25, Number 7, July 2017
- Yu-An Chen, Ju-Chiang Wang, Yi-Hsuan Yang, Homer H. Chen:
Component Tying for Mixture Model Adaptation in Personalization of Music Emotion Recognition. 1409-1420
- Hossein Zeinali, Hossein Sameti, Lukás Burget:
HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification. 1421-1435
- Xinzhou Xu, Jun Deng, Nicholas Cummins, Zixing Zhang, Chen Wu, Li Zhao, Björn W. Schuller:
A Two-Dimensional Framework of Multiple Kernel Subspace Learning for Recognizing Emotion in Speech. 1436-1449
- Mandy Korpusik, James R. Glass:
Spoken Language Understanding for a Nutrition Dialogue System. 1450-1461
- Mahmoud Fakhry, Piergiorgio Svaizer, Maurizio Omologo:
Audio Source Separation in Reverberant Environments Using β-Divergence-Based Nonnegative Factorization. 1462-1476
- Bracha Laufer-Goldshtein, Ronen Talmon, Sharon Gannot:
Semi-Supervised Source Localization on Multiple Manifolds With Distributed Microphones. 1477-1491
- Donald S. Williamson, DeLiang Wang:
Time-Frequency Masking in the Complex Domain for Speech Dereverberation and Denoising. 1492-1501
- Liang Lu, Steve Renals:
Small-Footprint Highway Deep Neural Networks for Speech Recognition. 1502-1511
- Ina Kodrasi, Simon Doclo:
Signal-Dependent Penalty Functions for Robust Acoustic Multi-Channel Equalization. 1512-1525
- Jung-Hee Kim, Jin Kim, Jae Hyeon Jeon, Sang Won Nam:
Delayless Individual-Weighting-Factors Sign Subband Adaptive Filter With Band-Dependent Variable Step-Sizes. 1526-1534
- Yannan Wang, Jun Du, Li-Rong Dai, Chin-Hui Lee:
A Gender Mixture Detection Approach to Unsupervised Single-Channel Speech Separation Based on Deep Neural Networks. 1535-1546
- Giacomo Vairetti, Enzo De Sena, Michael Catrysse, Søren Holdt Jensen, Marc Moonen, Toon van Waterschoot:
A Scalable Algorithm for Physically Motivated and Sparse Approximation of Room Impulse Responses With Orthonormal Basis Functions. 1547-1561