


INTERSPEECH 2011: Florence, Italy
- 12th Annual Conference of the International Speech Communication Association, INTERSPEECH 2011, Florence, Italy, August 27-31, 2011. ISCA 2011

Keynote Sessions
Keynote 1
- Julia Hirschberg:

Speaking More Like You: Entrainment in Conversational Speech. 4001
Keynote 2
- Tom M. Mitchell:

Neural Representations of Word Meanings. 4002
Keynote 3
- Alex Pentland:

Signals and Speech. 1-4
Keynote 4: Roundtable - Future and Applications of Speech and Language Technologies for the Good Health of Society
- Gabriele Miceli:

Language Disorders: Viewpoints on a Complex Object. - Björn Granström:

Speech Technology in (Re)Habilitation of Persons with Communication Disabilities. - Hiroshi Ishiguro:

From Teleoperated Androids to Cellphones as Surrogates.
Regular Oral Sessions
Speaker Recognition - Modeling
- Avi Matza:

Skew Gaussian Mixture Models for Speaker Recognition. 5-8 - Orith Toledo-Ronen, Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo:

Towards Goat Detection in Text-Dependent Speaker Verification. 9-12 - Jean-François Bonastre, Xavier Anguera Miró, Gabriel Hernández Sierra, Pierre-Michel Bousquet:

Speaker Modeling Using Local Binary Decisions. 13-16 - Hagai Aronowitz, Ron Hoory, Jason W. Pelecanos, David Nahamoo:

New Developments in Voice Biometrics for User Authentication. 17-20 - Miranti Indar Mandasari, Mitchell McLaren, David A. van Leeuwen:

Evaluation of i-vector Speaker Recognition Systems for Forensic Application. 21-24 - Mohammed Senoussaoui, Patrick Kenny, Niko Brümmer, Edward de Villiers, Pierre Dumouchel:

Mixture of PLDA Models in i-vector Space for Gender-Independent Speaker Recognition. 25-28
Speech Perception - Speech Intelligibility
- Nandini Iyer, Douglas Brungart, Brian D. Simpson:

Segregation of Whispered Speech Interleaved with Noise or Speech Maskers. 29-32 - Roi Kliper, Hendrik Kayser, Daphna Weinshall, Israel Nelken, Jörn Anemüller:

Monaural Azimuth Localization Using Spectral Dynamics of Speech. 33-36 - Jan Rennies, Thomas Brand, Birger Kollmeier:

Prediction of Binaural Intelligibility Level Differences in Reverberation. 37-40 - Aurore Gautreau, Michel Hoen, Fanny Meunier:

Let's All Speak Together! Exploring the Impact of Various Languages on the Comprehension of Speech in Multi-Linguistic Babble. 41-44 - Valeriy Shafiro, Stanley Sheft, Robert Risley:

Cross-Rate Variation in the Intelligibility of Dual-Rate Gated Speech in Older Listeners. 45-48 - Chia-ying Lee, James R. Glass, Oded Ghitza:

An Efferent-Inspired Auditory Model Front-End for Speech Recognition. 49-52
Speech Representation and Modelling
- Faten Ben Ali, Laurent Girin, Sonia Djaziri Larbi:

A Long-Term Harmonic Plus Noise Model for Speech Signals. 53-56 - Alan Ó Cinnéide, David Dorran, Mikel Gainza, Eugene Coyle:

A Frequency Domain Approach to ARX-LF Voiced Speech Parameterization and Synthesis. 57-60 - Vikram Ramanarayanan, Athanasios Katsamanis, Shrikanth S. Narayanan:

Automatic Data-Driven Learning of Articulatory Primitives from Real-Time MRI Data Using Convolutive NMF with Sparseness Constraints. 61-64 - Dong Wang, Ravichander Vipperla, Nicholas W. D. Evans:

Online Pattern Learning for Non-Negative Convolutive Sparse Coding. 65-68 - Nicolas Malyska, Thomas F. Quatieri, Robert B. Dunn:

Sinewave Representations of Nonmodality. 69-72 - Ch. Srikanth Raj, Thippur V. Sreenivas:

Time-Varying Signal Adaptive Transform and IHT Recovery of Compressive Sensed Speech. 73-76
Emotion, Speaking Style, and Social Behavior
- Martin Wöllmer, Felix Weninger, Florian Eyben, Björn W. Schuller:

Acoustic-Linguistic Recognition of Interest in Speech with Bottleneck-BLSTM Nets. 77-80 - Mustafa Erden, Levent M. Arslan:

Automatic Detection of Anger in Human-Human Call Center Dialogs. 81-84 - Keng-hao Chang, Howard Lei, John F. Canny:

Improved Classification of Speaking Styles for Mental Health Monitoring Using Phoneme Dynamics. 85-88 - Matthew Black, Panayiotis G. Georgiou, Athanasios Katsamanis, Brian R. Baucom, Shrikanth S. Narayanan:

"You made me do it": Classification of Blame in Married Couples' Interactions by Fusing Automatically Derived Speech and Language Information. 89-92 - Martijn Goudbeek, Marie Nilsenová:

Context and Priming Effects in the Recognition of Emotion of Old and Young Listeners. 93-96 - Agustín Gravano, Rivka Levitan, Laura Willson, Stefan Benus, Julia Hirschberg, Ani Nenkova:

Acoustic and Prosodic Correlates of Social Behavior. 97-100
HMM-based Speech Synthesis I
- Kyung Hwan Oh, June Sig Sung, Doo Hwa Hong, Nam Soo Kim:

Decision Tree-Based Clustering with Outlier Detection for HMM-Based Speech Synthesis. 101-104 - Hanna Silén, Elina Helander, Moncef Gabbouj:

Prediction of Voice Aperiodicity Based on Spectral Representations in HMM Speech Synthesis. 105-108 - Takashi Nose, Takao Kobayashi:

A Perceptual Expressivity Modeling Technique for Speech Synthesis Based on Multiple-Regression HSMM. 109-112 - Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:

Multi-Speaker Modeling with Shared Prior Distributions and Model Structures for Bayesian Speech Synthesis. 113-116 - Zhen-Hua Ling, Korin Richmond, Junichi Yamagishi:

Feature-Space Transform Tying in Unified Acoustic-Articulatory Modelling for Articulatory Control of HMM-Based Speech Synthesis. 117-120 - Matt Shannon, Heiga Zen, William J. Byrne:

The Effect of Using Normalized Models in Statistical Speech Synthesis. 121-124
Speaker Recognition - Modeling, Automatic Procedures, Analysis I
- Ce Zhang, Rong Zheng, Bo Xu:

Restoring the Residual Speaker Information in Total Variability Modeling for Speaker Verification. 125-128 - Hagai Aronowitz, Oren Barkan:

New Developments in Joint Factor Analysis for Speaker Verification. 129-132 - Joaquin Gonzalez-Rodriguez:

Speaker Recognition Using Temporal Contours in Linguistic Units: The Case of Formant and Formant-Bandwidth Trajectories. 133-136 - Ondrej Glembek, Lukás Burget, Niko Brümmer, Oldrich Plchot, Pavel Matejka:

Discriminatively Trained i-vector Extractor for Speaker Verification. 137-140 - Michelle Hewlett Sanchez, Luciana Ferrer, Elizabeth Shriberg, Andreas Stolcke:

Constrained Cepstral Speaker Recognition Using Matched UBM and JFA Training. 141-144 - Alan McCree, Douglas E. Sturim, Douglas A. Reynolds:

A New Perspective on GMM Subspace Compensation Based on PPCA and Wiener Filtering. 145-148
Speech Perception - Perceptual Learning and Cross-Language Perception
- Odette Scharenborg, Holger Mitterer, James M. McQueen:

Perceptual Learning of Liquids. 149-152 - Annelie Tuinman, Holger Mitterer, Anne Cutler:

The Efficiency of Cross-Dialectal Word Recognition. 153-156 - Minoru Tsuzaki, Keiichi Tokuda, Hisashi Kawai, Jinfu Ni:

Estimation of Perceptual Spaces for Speaker Identities Based on the Cross-Lingual Discrimination Task. 157-160 - Sharon Peperkamp, Camillia Bouchon:

The Relation Between Perception and Production in L2 Phonological Processing. 161-164 - Maria Paola Bissiri, María Luisa García Lecumberri, Martin Cooke, Jan Volín:

The Role of Word-Initial Glottal Stops in Recognizing English Words. 165-168 - Caicai Zhang, Gang Peng, William S.-Y. Wang:

Effect of Language Experience on the Categorical Perception of Cantonese Vowel Duration. 169-172
Speech Analysis
- Christian Fischer Pedersen, Ove Andersen, Paul Dalsgaard:

Adaptive Estimation of Zeros of Time-Varying Z-Transforms. 173-176 - John Kane, Christer Gobl:

Identifying Regions of Non-Modal Phonation Using Features of the Wavelet Transform. 177-180 - Xing Fan, Keith W. Godin, John H. L. Hansen:

Acoustic Analysis of Whispered Speech for Phoneme and Speaker Dependency. 181-184 - Afsaneh Asaei, Mohammad Javad Taghizadeh, Hervé Bourlard, Volkan Cevher:

Multi-Party Speech Recovery Exploiting Structured Sparsity Models. 185-188 - Sri Harish Reddy Mallidi, Sriram Ganapathy, Hynek Hermansky:

Modulation Spectrum Analysis for Recognition of Reverberant Speech. 189-192 - Petko Nikolov Petkov, W. Bastiaan Kleijn, Bert de Vries:

Discrete Choice Models for Non-Intrusive Quality Assessment. 193-196
Speech Enhancement and Dereverberation
- Keisuke Kinoshita, Mehrez Souden, Marc Delcroix, Tomohiro Nakatani:

Single Channel Dereverberation Using Example-Based Speech Enhancement with Uncertainty Decoding Technique. 197-200 - Jan S. Erkelens, Richard Heusdens:

A Statistical Room Impulse Response Model with Frequency Dependent Reverberation Time for Single-Microphone Late Reverberation Suppression. 201-204 - Chenxi Zheng, Tiago H. Falk, Wai-Yip Chan:

An Assessment of the Improvement Potential of Time-Frequency Masking for Speech Dereverberation. 205-208 - Thiago de M. Prego, Amaro A. de Lima, Sergio L. Netto:

Perceptual Improvement of a Two-Stage Algorithm for Speech Dereverberation. 209-212 - Najib Hadir, Friedrich Faubel, Dietrich Klakow:

A Model-Based Spectral Envelope Wiener Filter for Perceptually Motivated Speech Enhancement. 213-216 - Jorge I. Marin-Hurtado, Devangi N. Parikh, David V. Anderson:

Binaural Noise-Reduction Method Based on Blind Source Separation and Perceptual Post Processing. 217-220
ASR - Feature Extraction II
- Tim Ng, Bing Zhang, Spyridon Matsoukas, Long Nguyen:

Region Dependent Transform on MLP Features for Speech Recognition. 221-224 - Martin Heckmann, Claudius Gläser:

Discriminant Sub-Space Projection of Spectro-Temporal Speech Features Based on Maximizing Mutual Information. 225-228 - Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura:

Combining Feature Space Discriminative Training with Long-Term Spectro-Temporal Features for Noise-Robust Speech Recognition. 229-232 - Sumit Chopra, Patrick Haffner, Dimitrios Dimitriadis:

Combining Frame and Segment Level Processing via Temporal Pooling for Phonetic Classification. 233-236 - Dong Yu, Michael L. Seltzer:

Improved Bottleneck Features Using Pretrained Deep Neural Networks. 237-240 - Yuan-Fu Liao, Chia-Hsing Lin, We-Der Fang:

Minimum Classification Error Based Spectro-Temporal Feature Extraction for Robust Audio Classification. 241-244
Speaker Recognition - Modeling, Automatic Procedures, Analysis II
- Ce Zhang, Rong Zheng, Bo Xu:

Data-Driven Gaussian Component Selection for Fast GMM-Based Speaker Verification. 245-248 - Daniel Garcia-Romero, Carol Y. Espy-Wilson:

Analysis of i-vector Length Normalization in Speaker Recognition Systems. 249-252 - Weiwu Jiang, Zhifeng Li, Helen M. Meng:

An Analysis Framework Based on Random Subspace Sampling for Speaker Verification. 253-256 - Nicolas Scheffer, Yun Lei, Luciana Ferrer:

Factor Analysis Back Ends for MLLR Transforms in Speaker Recognition. 257-260 - Craig S. Greenberg, Alvin F. Martin, Bradford Barr, George R. Doddington:

Report on Performance Results in the NIST 2010 Speaker Recognition Evaluation. 261-264 - Marcel Kockmann, Luciana Ferrer, Lukás Burget, Jan Cernocký:

iVector Fusion of Prosodic and Cepstral Features for Speaker Verification. 265-268
Speech Production - Articulatory Measurements
- Yoon-Chul Kim, Michael I. Proctor, Shrikanth S. Narayanan, Krishna S. Nayak:

Visualization of Vocal Tract Shape Using Interleaved Real-Time MRI of Multiple Scan Planes. 269-272 - Ralf Winkler, Susanne Fuchs, Pascal Perrier, Mark Tiede:

Biomechanical Tongue Models: An Approach to Studying Inter-Speaker Variability. 273-276 - Jun Wang, Jordan R. Green, Ashok Samal, David Marx:

Quantifying Articulatory Distinctiveness of Vowels. 277-280 - Michael I. Proctor, Adam C. Lammert, Athanasios Katsamanis, Louis M. Goldstein, Christina Hagedorn, Shrikanth S. Narayanan:

Direct Estimation of Articulatory Kinematics from Real-Time Magnetic Resonance Image Sequences. 281-284 - Peter Birkholz, Christiane Neuschaefer-Rube:

Combined Optical Distance Sensing and Electropalatography to Measure Articulation. 285-288 - Santitham Prom-on, Yi Xu, Fang Liu:

Simulating Post-L F0 Bouncing by Modeling Articulatory Dynamics. 289-292
Acoustic Event Detection
- Jürgen T. Geiger, Mohamed Anouar Lakhal, Björn W. Schuller, Gerhard Rigoll:

Learning New Acoustic Events in an HMM-Based System Using MAP Adaptation. 293-296 - Yi Ren Leng, Tran Huy Dat, Norihide Kitaoka, Haizhou Li:

Alternative Frequency Scale Cepstral Coefficient for Robust Sound Event Recognition. 297-300 - Akinori Ito, Akihito Aiba, Masashi Ito, Shozo Makino:

Evaluation of Abnormal Sound Detection using Multi-Stage GMM in Various Environments. 301-304 - Joerg Schmalenstroeer, Markus Bartek, Reinhold Haeb-Umbach:

Unsupervised Learning of Acoustic Events Using Dynamic Time Warping and Hierarchical K-Means++ Clustering. 305-308 - Pradeep Natarajan, Stavros Tsakalidis, Vasant Manohar, Rohit Prasad, Premkumar Natarajan:

Unsupervised Audio Analysis for Categorizing Heterogeneous Consumer Domain Videos. 313-316
Speech Synthesis - Unit Selection and Hybrid Approaches
- Vivek Kumar Rangarajan Sridhar, Ann K. Syrdal, Alistair Conkie, Srinivas Bangalore:

Enriching Text-to-Speech Synthesis Using Automatic Dialog Act Tags. 317-320 - Lukas Latacz, Wesley Mattheyses, Werner Verhelst:

Joint Target and Join Cost Weight Training for Unit Selection Synthesis. 321-324 - Andreas Windmann, Igor Jauk, Fabio Tamburini, Petra Wagner:

Prominence-Based Prosody Prediction for Unit Selection Speech Synthesis. 325-328 - Sathish Pammi, Marc Schröder:

Evaluating the Meaning of Synthesized Listener Vocalizations. 329-332 - Iñaki Sainz, Daniel Erro, Eva Navas, Inma Hernáez:

A Hybrid TTS Approach for Prosody and Acoustic Modules. 333-336 - Alexander Sorin, Slava Shechtman, Vincent Pollet:

Uniform Speech Parameterization for Multi-Form Segment Synthesis. 337-340
Speech Enhancement Analysis and Evaluation
- Ryoichi Miyazaki, Hiroshi Saruwatari, Kiyohiro Shikano:

Theoretical Analysis of Musical Noise and Speech Distortion in Structure-Generalized Parametric Blind Spatial Subtraction Array. 341-344 - Yan Tang, Martin Cooke:

Subjective and Objective Evaluation of Speech Intelligibility Enhancement Under Constant Energy and Duration Constraints. 345-348 - Nagarjuna Reddy Muraka, Chandra Sekhar Seelamantula:

A Risk-Estimation-Based Comparison of Mean Square Error and Itakura-Saito Distortion Measures for Speech Enhancement. 349-352 - Mahdi Triki:

On Noise Tracking for Noise Floor Estimation. 353-356 - Ben Milner:

Maximum a posteriori Estimation of Noise from Non-Acoustic Reference Signals in Very Low Signal-to-Noise Ratio Environments. 357-360 - Ryo Wakisaka, Hiroshi Saruwatari, Kiyohiro Shikano, Tomoya Takatani:

Blind Speech Prior Estimation for Generalized Minimum Mean-Square Error Short-Time Spectral Amplitude Estimator. 361-364
Speaker Recognition - Analysis and Statistics I
- Kornel Laskowski, Qin Jin:

Harmonic Structure Transform for Speaker Recognition. 365-368 - Hemant A. Patil, Maulik C. Madhavi, Keshab K. Parhi:

Combining Evidence from Spectral and Source-Like Features for Person Recognition from Humming. 369-372 - Yanhua Long, Zhi-Jie Yan, Frank K. Soong, Li-Rong Dai, Wu Guo:

Improvements in Speaker Characterization Using Spectral Subband Energy Based on Harmonic plus Noise Model. 373-376 - Yosef A. Solewicz, Hagai Aronowitz:

Implicit Segmentation in Two-Wire Speaker Recognition. 377-380 - Sibel Yaman, Jason W. Pelecanos, Mohamed Kamal Omar:

Boosting Speaker Recognition Performance with Compact Representations. 381-384 - Carlos Vaquero, Alfonso Ortega, Eduardo Lleida:

Partitioning of Two-Speaker Conversation Datasets. 385-388
Speech Production - Coarticulation and Speech Timing
- Stefan Benus, Marianne Pouplier:

Jaw Movement in Vowels and Liquids Forming the Syllable Nucleus. 389-392 - Barbara Gili Fivela, Antonio Stella, Sonia D'Apolito, Francesco Sigona:

Coarticulation Across Prosodic Domains in Italian: An Ultrasound Investigation. 393-396 - Juraj Simko, Fred Cummins, Stefan Benus:

Investigating the Stability of Intergestural Timing Relations. 397-400 - Claudio Zmarich, Barbara Gili Fivela, Pascal Perrier, Christophe Savariaux, Graziano Tisato:

Speech Timing Organization for the Phonological Length Contrast in Italian Consonants. 401-404 - Chiara Celata, Silvia Calamai:

Timing in Italian VNC Sequences at Different Speech Rates. 405-408 - Christina Hagedorn, Michael I. Proctor, Louis Goldstein:

Automatic Analysis of Singleton and Geminate Consonant Articulation Using Real-Time Magnetic Resonance Imaging. 409-412
Speech Segmentation
- Yih-Ru Wang:

A Two-Stage Sample-Based Phone Boundary Detector Using Segmental Similarity Features. 413-416 - Qiang Huang, Stephen J. Cox:

Iterative Improvement of Speaker Segmentation in a Noisy Environment Using High-Level Knowledge. 417-420 - Diego Castán, Carlos Vaquero, Alfonso Ortega, David Martínez González, Jesús Antonio Villalba López, Eduardo Lleida:

Hierarchical Audio Segmentation with HMM and Factor Analysis in Broadcast News Domain. 421-424 - Ozlem Kalinli:

Syllable Segmentation of Continuous Speech Using Auditory Attention Cues. 425-428 - Vijayaditya Peddinti, Kishore Prahallad:

Exploiting Phone-Class Specific Landmarks for Refinement of Segment Boundaries in TTS Databases. 429-432 - Agnès Pedone, Juan José Burred, Simon Maller, Pierre Leveau:

Phoneme-Level Text to Audio Synchronization on Speech Signals with Background Music. 433-436
ASR - Acoustic Models II
- Frank Seide, Gang Li, Dong Yu:

Conversational Speech Transcription Using Context-Dependent Deep Neural Networks. 437-440 - Guangsen Wang, Khe Chai Sim:

Sequential Classification Criteria for NNs in Automatic Speech Recognition. 441-444 - Mathew Magimai-Doss, Ramya Rasipuram, Guillermo Aradilla, Hervé Bourlard:

Grapheme-Based Automatic Speech Recognition Using KL-HMM. 445-448 - Joseph Keshet, Chih-Chieh Cheng, Mark Stoehr, David A. McAllester:

Direct Error Rate Minimization of Hidden Markov Models. 449-452 - Xie Sun, Xin Chen, Yunxin Zhao:

On the Effectiveness of Statistical Modeling Based Template Matching Approach for Continuous Speech Recognition. 453-456 - Guangsen Wang, Khe Chai Sim:

Comparison of Smoothing Techniques for Robust Context Dependent Acoustic Modelling in Hybrid NN/HMM Systems. 457-460
Robust Speech Recognition II
- Ramón Fernandez Astudillo, João Paulo da Silva Neto:

Propagation of Uncertainty Through Multilayer Perceptrons for Robust Automatic Speech Recognition. 461-464 - Katariina Mahkonen, Antti Hurmalainen, Tuomas Virtanen, Jort F. Gemmeke:

Mapping Sparse Representation to State Likelihoods in Noise-Robust Automatic Speech Recognition. 465-468 - Heikki Kallasjoki, Ulpu Remes, Jort F. Gemmeke, Tuomas Virtanen, Kalle J. Palomäki:

Uncertainty Measures for Improving Exemplar-Based Source Separation. 469-472 - Hsien-Cheng Liao, Yuan-Fu Liao, Chin-Hui Lee:

Maximum Confidence Measure Based Interaural Phase Difference Estimation for Noise Masking in Dual-Microphone Robust Speech Recognition. 473-476 - Shirin Badiezadegan, Richard C. Rose:

A Performance Monitoring Approach to Fusing Enhanced Spectrogram Channels in Robust Speech Recognition. 477-480 - Ning Cheng, Xunying Liu, Lan Wang:

Generalized Variable Parameter HMMs for Noise Robust Speech Recognition. 481-484
Speaker Recognition - Analysis and Statistics II
- Pierre-Michel Bousquet, Driss Matrouf, Jean-François Bonastre:

Intersession Compensation and Scoring Methods in the i-vectors Space for Speaker Recognition. 485-488 - Szymon Drgas, Adam Dabrowski:

Kernel Alignment Maximization for Speaker Recognition Based on High-Level Features. 489-492 - Balaji Vasan Srinivasan, Daniel Garcia-Romero, Dmitry N. Zotkin, Ramani Duraiswami:

Kernel Partial Least Squares for Speaker Recognition. 493-496 - Mohamed Kamal Omar, Jason W. Pelecanos:

Conversational-Side-Specific Inter-Session Variability Compensation. 497-500 - David A. van Leeuwen, Niko Brümmer:

A Speaker Line-Up for the Likelihood Ratio. 501-504 - Jesús Antonio Villalba López, Niko Brümmer:

Towards Fully Bayesian Speaker Recognition: Integrating Out the Between-Speaker Covariance. 505-508
Speaker Recognition - Analysis and Statistics II
- Hemant A. Patil, Pallavi N. Baljekar:

Novel VTEO Based Mel Cepstral Features for Classification of Normal and Pathological Voices. 509-512 - Eiji Shimura, Kazuhiko Kakehi:

Temporal Performance of Dysarthric Patients in Speech and Tapping Tasks. 513-516 - Xinhui Zhou, Maureen L. Stone, Carol Y. Espy-Wilson:

A Comparative Acoustic Study on Speech of Glossectomy Patients and Normal Subjects. 517-520 - Ali Alpan, Francis Grenez, Jean Schoentgen:

Dysperiodicity Analysis of Perceptually Assessed Synthetic Speech Stimuli. 521-524 - Alain Ghio, Frédérique Weisz, Giovanna Baracca, Giovanna Cantarella, Danièle Robert, Virginie Woisard, Franco Fussi, Antoine Giovanni:

Is the Perception of Voice Quality Language-Dependant? A Comparison of French and Italian Listeners and Dysphonic Speakers. 525-528 - Juan Rafael Orozco-Arroyave, S. Murillo Rendón, Andrés Marino Álvarez-Meza, Julián D. Arias-Londoño, Edilson Delgado-Trejos, Jesús Francisco Vargas-Bonilla, César Germán Castellanos-Domínguez:

Automatic Selection of Acoustic and Non-Linear Dynamic Features in Voice Signals for Hypernasality Detection. 529-532
ASR - Lexical, Prosodic and Multi-Lingual Models
- Sravana Reddy, Evandro B. Gouvêa:

Learning from Mistakes: Expanding Pronunciation Lexicons Using Word Recognition Errors. 533-536 - David Imseng, Hervé Bourlard, John Dines, Philip N. Garner, Mathew Magimai-Doss:

Improving Non-Native ASR Through Stochastic Multilingual Phoneme Space Transformations. 537-540 - Scott Novotney, Richard M. Schwartz, Sanjeev Khudanpur:

Unsupervised Arabic Dialect Adaptation with Self-Training. 541-544 - Dino Seppi, Kris Demuynck, Dirk Van Compernolle:

Template-Based Automatic Speech Recognition Meets Prosody. 545-548 - Ibrahim Badr, Ian McGraw, James R. Glass:

Pronunciation Learning from Continuous Speech. 549-552 - Yanmin Qian, Daniel Povey, Jia Liu:

State-Level Data Borrowing for Low-Resource Speech Recognition Based on Subspace GMMs. 553-556
Source Separation
- Yasmina Benabderrahmane, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy:

Blind Speech Separation in Multiple Environments Using a Frequency Oriented PCA Method for Convolutive Mixtures. 557-560 - Zbynek Koldovský, Jirí Málek, Petr Tichavský:

Blind Speech Separation in Time-Domain Using Block-Toeplitz Structure of Reconstructed Signal Matrices. 561-564 - Auxiliadora Sarmiento, Iván Durán-Díaz, Sergio Cruces, Pablo Aguilera:

Generalized Method for Solving the Permutation Problem in Frequency-Domain Blind Source Separation of Convolved Speech Signals. 565-568 - Emad M. Grais, Hakan Erdogan:

Adaptation of Speaker-Specific Bases in Non-Negative Matrix Factorization for Single Channel Speech-Music Separation. 569-572 - Shuhua Zhang, Laurent Girin:

An Informed Source Separation System for Speech Signals. 573-576 - Ngoc Thuy Tran, William G. Cowley, André Pollok:

Adaptive Blocking Beamformer for Speech Separation. 577-580
Multimodal Signal Processing
- Per Ola Kristensson, Keith Vertanen:

Asynchronous Multimodal Text Entry Using Speech and Gesture Keyboards. 581-584 - Niall McLaughlin, Ji Ming, Danny Crookes:

Robust Bimodal Person Identification Using Face and Speech with Limited Training Data and Corruption of Both Modalities. 585-588 - Atef Ben Youssef, Thomas Hueber, Pierre Badin, Gérard Bailly:

Toward a Multi-Speaker Visual Articulatory Feedback System. 589-592 - Thomas Hueber, Elie-Laurent Benaroya, Bruce Denby, Gérard Chollet:

Statistical Mapping Between Articulatory and Acoustic Data for an Ultrasound-Based Silent Speech Interface. 593-596 - Joerg Schmalenstroeer, Florian Jacob, Reinhold Haeb-Umbach, Marius H. Hennecke, Gernot A. Fink:

Unsupervised Geometry Calibration of Acoustic Sensor Networks Using Source Correspondences. 597-600 - Michael Wand, Matthias Janke, Tanja Schultz:

Investigations on Speaking Mode Discrepancies in EMG-Based Speech Recognition. 601-604
ASR - Language Models II
- Tomás Mikolov, Anoop Deoras, Stefan Kombrink, Lukás Burget, Jan Cernocký:

Empirical Evaluation and Combination of Advanced Language Modeling Techniques. 605-608 - Geoffrey Zweig, Shuangyu Chang:

Personalizing Model M for Voice-Search. 609-612 - Takahiro Shinozaki, Yu Kubota, Sadaoki Furui, Eiji Utsunomiya, Yasutaka Shindoh:

Sentence Selection by Direct Likelihood Maximization for Language Model Adaptation. 613-616 - Ebru Arisoy, Bhuvana Ramabhadran, Hong-Kwang Jeff Kuo:

Feature Combination Approaches for Discriminative Language Models. 617-620 - Sankaranarayanan Ananthakrishnan, Stavros Tsakalidis, Rohit Prasad, Premkumar Natarajan:

On-Line Language Model Biasing for Multi-Pass Automatic Speech Recognition. 621-624 - Moonyoung Kang, Tim Ng, Long Nguyen:

Mandarin Word-Character Hybrid-Input Neural Network Language Model. 625-628
Phonology and Phonetics
- Vahid Sadeghi:

Laryngealization and Breathiness in Persian. 629-632 - Viola Müller, Jonathan Harrington, Felicitas Kleber, Ulrich Reubold:

Age-Dependent Differences in the Neutralization of the Intervocalic Voicing Contrast: Evidence from an Apparent-Time Study on East Franconian. 633-636 - Barbara Samlowski, Bernd Möbius, Petra Wagner:

Comparing Syllable Frequencies in Corpora of Written and Spoken Language. 637-640 - Luca Iacoponi, Renata Savy:

Sylli: Automatic Phonological Syllabification for Italian. 641-644 - André N. Xavier, Plínio A. Barbosa:

A Preliminary Study on the Production of Signs in Brazilian Sign Language when One of the Manual Articulators is Unavailable. 645-648 - Ho-hsien Pan, Mao-Hsu Chen, Shao-Ren Lyu:

Electroglottograph and Acoustic Cues for Phonation Contrasts in Taiwan Min Falling Tones. 649-652
Voice Conversion
- Daisuke Saito, Keisuke Yamamoto, Nobuaki Minematsu, Keikichi Hirose:

One-to-Many Voice Conversion Based on Tensor Representation of Speaker Space. 653-656 - Yu Qiao, Tong Tong, Nobuaki Minematsu:

A Study on Bag of Gaussian Model with Application to Voice Conversion. 657-660 - Lei Li, Yoshihiko Nankaku, Keiichi Tokuda:

A Bayesian Approach to Voice Conversion Based on GMMs Using Multiple Model Structures. 661-664 - Mahdi Eslami, Hamid Sheikhzadeh, Abolghasem Sayadiyan:

Quality Improvement of Voice Conversion Systems Based on Trellis Structured Vector Quantization. 665-668 - Hadas Benisty, David Malah:

Voice Conversion Using GMM with Enhanced Global Variance. 669-672 - Elizabeth Godoy, Olivier Rosec, Thierry Chonavel:

Spectral Envelope Transformation Using DFW and Amplitude Scaling for Voice Conversion with Parallel or Nonparallel Corpora. 673-676
Robust Speech Recognition III
- Pejman Mowlaee, Rahim Saeidi, Zheng-Hua Tan, Mads Græsbøll Christensen, Tomi Kinnunen, Pasi Fränti, Søren Holdt Jensen:

Sinusoidal Approach for the Single-Channel Speech Separation and Recognition Challenge. 677-680 - Cemil Demir, A. Taylan Cemgil, Murat Saraclar:

Semi-Supervised Single-Channel Speech-Music Separation for Automatic Speech Recognition. 681-684 - Hari Krishna Maganti, Marco Matassoni:

A Level-Dependent Auditory Filter-Bank for Speech Recognition in Reverberant Environments. 685-688 - Mehrez Souden, Keisuke Kinoshita, Marc Delcroix, Tomohiro Nakatani:

A Multichannel Feature-Based Processing for Robust Speech Recognition. 689-692 - Xiong Xiao, Jinyu Li, Chng Eng Siong, Haizhou Li:

Feature Normalization Using Structured Full Transforms for Robust Speech Recognition. 693-696 - Masakiyo Fujimoto, Shinji Watanabe, Tomohiro Nakatani:

A Robust Estimation Method of Noise Mixture Model for Noise Suppression. 697-700
Spoken Language Understanding
- Xiao Li, Ye-Yi Wang, Gökhan Tür:

Multi-Task Learning for Spoken Language Understanding with Shared Slots. 701-704 - Dustin Hillard, Asli Celikyilmaz, Dilek Hakkani-Tür, Gökhan Tür:

Learning Weighted Entity Lists from Web Click Logs for Spoken Language Understanding. 705-708 - Dilek Hakkani-Tür, Gökhan Tür, Larry P. Heck, Elizabeth Shriberg:

Bootstrapping Domain Detection Using Query Click Logs for New Domains. 709-712 - Asli Celikyilmaz, Dilek Hakkani-Tür, Gökhan Tür:

Approximate Inference for Domain Detection in Spoken Language Understanding. 713-716 - Chien-Lin Huang, Bin Ma, Haizhou Li, Chung-Hsien Wu:

Speech Indexing Using Semantic Context Inference. 717-720 - Yun-Cheng Ju, Jasha Droppo:

Automatically Optimizing Utterance Classification Performance without Human in the Loop. 721-724
Dialect and Accent Identification
- Philippe Boula de Mareüil, Jean-Luc Rouas, Manuela Yapomo:

In Search of Cues Discriminating West-African Accents in French. 725-728 - Abualsoud Hanani, Martin J. Russell, Michael J. Carey:

Computer and Human Recognition of Regional Accents of British English. 729-732 - Rong Tong, Bin Ma, Haizhou Li, Chng Eng Siong:

Target-Aware Lattice Rescoring for Dialect Recognition. 733-736 - Murat Akbacak, Dimitra Vergyri, Andreas Stolcke, Nicolas Scheffer, Arindam Mandal:

Effective Arabic Dialect Classification Using Diverse Phonotactic Models. 737-740 - Nancy F. Chen, Wade Shen, Joseph P. Campbell:

Characterizing Deletion Transformations Across Dialects Using a Sophisticated Tying Mechanism. 741-744 - Fadi Biadsy, Julia Hirschberg, Daniel P. W. Ellis:

Dialect and Accent Recognition Using Phonetic-Segmentation Supervectors. 745-748
First Language Acquisition
- Kouki Miyazawa, Hideaki Miura, Hideaki Kikuchi, Reiko Mazuka:

The Multi Timescale Phoneme Acquisition Model of the Self-Organizing Based on the Dynamic Features. 749-752 - Helen Brown, M. Gareth Gaskell:

The Time-Course of Talker-Specificity Effects for Newly-Learned Pseudowords: Evidence for a Hybrid Model of Lexical Representation. 753-756 - Britta Lintfert, Antje Schweitzer, Bernd Möbius:

A Parametric Approach to Intonation Acquisition Research: Validation on Child-Directed Speech Data. 757-760 - Maarten Versteegh, Louis ten Bosch, Lou Boves:

Modelling Novelty Preference in Word Learning. 761-764 - Gopal Ananthakrishnan, Giampiero Salvi:

Using Imitation to Learn Infant-Adult Acoustic Mappings. 765-768 - Christina Bergmann, Louis ten Bosch, Lou Boves:

Thresholding Word Activations for Response Scoring - Modelling Psycholinguistic Data. 769-772
ASR - Acoustic Models III
- Roger Hsiao, Tanja Schultz:

Generalized Baum-Welch Algorithm and its Implication to a New Extended Baum-Welch Algorithm. 773-776 - Frank Diehl, Mark John Francis Gales, Xunying Liu, Marcus Tomalin, Philip C. Woodland:

Word Boundary Modelling and Full Covariance Gaussians for Arabic Speech-to-Text Systems. 777-780 - Tom Ko, Brian Mak:

A Fully Automated Derivation of State-Based Eigentriphones for Triphone Modeling with No Tied States Using Regularization. 781-784 - Tara N. Sainath, Bhuvana Ramabhadran, David Nahamoo, Dimitri Kanevsky:

Reducing Computational Complexities of Exemplar-Based Sparse Representations with Applications to Large Vocabulary Speech Recognition. 785-788 - Yu Zhang, Jian Xu, Zhi-Jie Yan, Qiang Huo:

An i-vector Based Approach to Training Data Clustering for Improved Speech Recognition. 789-792 - Senaka Buthpitiya, Ian R. Lane, Jike Chong:

Rapid Training of Acoustic Models Using Graphics Processing Unit. 793-796
Spoken Dialogue Systems I
- Teruhisa Misu, Kiyonori Ohtake, Chiori Hori, Hisashi Kawai, Satoshi Nakamura:

User Study of Spoken Decision Support System. 797-800 - Antoine Raux, Yi Ma:

Efficient Probabilistic Tracking of User Goal and Dialog History for Spoken Dialog Systems. 801-804 - Alexander Schmitt, Alexander Zgorzelski, Wolfgang Minker:

Tackling a Shilly-Shally Classifier for Predicting Task Success in Spoken Dialogue Interaction. 805-808 - Toyomi Meguro, Yasuhiro Minami, Ryuichiro Higashinaka, Kohji Dohsaka:

Evaluation of Listening-Oriented Dialogue Control Rules Based on the Analysis of HMMs. 809-812 - David Suendermann, Jackson Liscombe, Jonathan Bloom, Grace Li, Roberto Pieraccini:

Large-Scale Experiments on Data-Driven Design of Commercial Spoken Dialog Systems. 813-816 - Fredrik Kronlid, Jessica Villing, Alexander Berman, Staffan Larsson:

Comparing System-Driven and Free Dialogue in In-Vehicle Interaction. 817-820
Spoken Language Resources, Evaluation and Standardization II
- Michael A. Carlin, Samuel Thomas, Aren Jansen, Hynek Hermansky:

Rapid Evaluation of Speech Representations for Spoken Term Discovery. 821-824 - Ben Hixon, Eric Schneider, Susan L. Epstein:

Phonemic Similarity Metrics to Compare Pronunciation Methods. 825-828 - Janto Skowronek, Alexander Raake:

Investigating the Effect of Number of Interlocutors on the Quality of Experience for Multi-Party Audio Conferencing. 829-832 - Jáchym Kolár, Lori Lamel:

On Development of Consistently Punctuated Speech Corpora. 833-836 - Shrikanth S. Narayanan, Erik Bresch, Prasanta Kumar Ghosh, Louis Goldstein, Athanasios Katsamanis, Yoon Kim, Adam C. Lammert, Michael I. Proctor, Vikram Ramanarayanan, Yinghua Zhu:

A Multimodal Real-Time MRI Articulatory Corpus for Speech Research. 837-840 - Denis Burnham, Dominique Estival, Steven Fazio, Jette Viethen, Felicity Cox, Robert Dale, Steve Cassidy, Julien Epps, Roberto Togneri, Michael Wagner, Yuko Kinoshita, Roland Göcke, Joanne Arciuli, Mark Onslow, Trent W. Lewis, Andrew Butcher, John Hajek:

Building an Audio-Visual Corpus of Australian English: Large Corpus Collection with an Economical Portable and Replicable Black Box. 841-844
Language Identification
- Rong Zheng, Ce Zhang, Bo Xu:

Data-Driven UBM Generation via Tied Gaussians for GMM-Supervector Based Accent Identification. 845-848 - David Martínez González, Jesús Antonio Villalba López, Antonio Miguel, Alfonso Ortega, Eduardo Lleida:

I3A Language Recognition System for Albayzin 2010 LRE. 849-852 - Mikel Peñagarikano, Amparo Varona, Luis Javier Rodríguez, Germán Bordel:

Dimensionality Reduction for Using High-Order n-Grams in SVM-Based Phonotactic Language Recognition. 853-856 - Najim Dehak, Pedro A. Torres-Carrasquillo, Douglas A. Reynolds, Réda Dehak:

Language Recognition via i-vectors and Dimensionality Reduction. 857-860 - David Martínez González, Oldrich Plchot, Lukás Burget, Ondrej Glembek, Pavel Matejka:

Language Recognition in iVectors Space. 861-864
Second Language Acquisition, Development and Learning II
- Xiaojun Qian, Helen M. Meng, Frank K. Soong:

On Mispronunciation Lexicon Generation Using Joint-Sequence Multigrams in Computer-Aided Pronunciation Training (CAPT). 865-868 - Bianca Sisinni, Mirko Grimaldi:

Validating a Second Language Perception Model for Classroom Context - A Longitudinal Study within the Perceptual Assimilation Model. 869-872 - Makiko Sadakata, James M. McQueen:

The Role of Variability in Non-Native Perceptual Learning of a Japanese Geminate-Singleton Fricative Contrast. 873-876 - Jared Bernstein, Jian Cheng, Masanori Suzuki:

Fluency Changes with General Progress in L2 Proficiency. 877-880 - Slim Ouni:

Tongue Gestures Awareness and Pronunciation Training. 881-884 - Wim A. van Dommelen, Valérie Hazan:

Impact of Speaker Variability on Speech Perception in Non-Native Listeners. 885-888
ASR - Search, Keyword Spotting and Confidence Measures II
- Evelyn Kurniawati, Samsudin Ng, Karthik Muralidhar, Sapna George:

A Template Based Voice Trigger System Using Bhattacharyya Edit Distance. 889-892 - David Nolden, Ralf Schlüter, Hermann Ney:

Acoustic Look-Ahead for More Efficient Decoding in LVCSR. 893-896 - Frank Duckhorn, Matthias Wolff, Rüdiger Hoffmann:

A New Epsilon Filter for Efficient Composition of Weighted Finite-State Transducers. 897-900 - Sabato Marco Siniscalchi, Torbjørn Svendsen, Chin-Hui Lee:

A Bottom-Up Stepwise Knowledge-Integration Approach to Large Vocabulary Continuous Speech Recognition Using Weighted Finite State Machines. 901-904 - Matthew Stephen Seigel, Philip C. Woodland:

Combining Information Sources for Confidence Estimation with CRF Models. 905-908 - Kouichi Katsurada, Shinta Sawada, Shigeki Teshima, Yurie Iribe, Tsuneo Nitta:

Evaluation of Fast Spoken Term Detection Using a Suffix Array. 909-912
SLP for Information Extraction and Retrieval I
- Timothy J. Hazen:

Latent Topic Modeling for Audio Corpus Summarization. 913-916 - Richard Dufour, Yannick Estève, Paul Deléglise:

Investigation of Spontaneous Speech Characterization Applied to Speaker Role Recognition. 917-920 - Armando Muscariello, Guillaume Gravier, Frédéric Bimbot:

Zero-Resource Audio-Only Spoken Term Detection Based on a Combination of Template Matching Techniques. 921-924 - Yeon-Jun Kim, David C. Gibbon:

Automatic Learning in Content Indexing Service Using Phonetic Alignment. 925-928 - Pei-Ning Chen, Kuan-Yu Chen, Berlin Chen:

Leveraging Relevance Cues for Improved Spoken Document Retrieval. 929-932 - Yun-Nung Chen, Yu Huang, Ching-feng Yeh, Lin-Shan Lee:

Spoken Lecture Summarization by Random Walk over a Graph Constructed with Automatically Extracted Key Terms. 933-936
Speaker Diarization I
- Hagai Aronowitz:

Speaker Diarization Using a priori Acoustic Information. 937-940 - Kofi Boakye, Oriol Vinyals, Gerald Friedland:

Improved Overlapped Speech Handling for Speaker Diarization. 941-944 - Stephen Shum, Najim Dehak, Ekapol Chuangsuwanich, Douglas A. Reynolds, James R. Glass:

Exploiting Intra-Conversation Variability for Speaker Diarization. 945-948 - Masafumi Nishida, Seiichi Yamamoto:

Speaker Clustering Based on Non-Negative Matrix Factorization. 949-952 - Sree Harsha Yella, Fabio Valente:

Information Bottleneck Features for HMM/GMM Speaker Diarization of Meetings Recordings. 953-956 - David Wang, Robbie Vogt, Sridha Sridharan, David Dean:

Cross Likelihood Ratio Based Speaker Clustering Using Eigenvoice Models. 957-960
Prosody I
- Giuseppina Turco, Michele Gubian, Jessamyn Schertz:

A Quantitative Investigation of the Prosody of Verum Focus in Italian. 961-964 - Amelie Dorn, Ailbhe Ní Chasaide:

Effects of Focus on f0 and Duration in Irish (Gaelic) Declaratives. 965-968 - Jennifer Cole, Stefanie Shattuck-Hufnagel:

The Phonology and Phonetics of Perceived Prosody: What do Listeners Imitate? 969-972 - Amandine Michelas, Noël Nguyen:

Uncovering the Effect of Imitation on Tonal Patterns of French Accentual Phrases. 973-976 - Pilar Prieto, Cecilia Pugliesi, Joan Borràs-Comes, Ernesto Arroyo, Josep Blat:

Crossmodal Prosodic and Gestural Contribution to the Perception of Contrastive Focus. 977-980 - Erin Cvejic, Jeesun Kim, Chris Davis:

Temporal Relationship Between Auditory and Visual Prosodic Cues. 981-984
ASR - New Paradigms
- Xie Sun, Yunxin Zhao:

New Methods for Template Selection and Compression in Continuous Speech Recognition. 985-988 - Shi-Xiong Zhang, Mark J. F. Gales:

Structured Support Vector Machines for Noise Robust Continuous Speech Recognition. 989-992 - Masayuki Suzuki, Gakuto Kurata, Masafumi Nishimura, Nobuaki Minematsu:

Continuous Digits Recognition Leveraging Invariant Structure. 993-996 - Dimitri Kanevsky, David Nahamoo, Tara N. Sainath, Bhuvana Ramabhadran:

Convergence of Line Search A-Function Methods. 997-1000 - Yasuhisa Fujii, Kazumasa Yamamoto, Seiichi Nakagawa:

Hidden Boosted MMI and Hierarchical State Posterior Feature for Automatic Speech Recognition Based on Hidden Conditional Neural Fields. 1001-1004 - Jun Cai, Bruce Denby, Pierre Roussel-Ragot, Gérard Dreyfus, Lise Crevier-Buchman:

Recognition and Real Time Performances of a Lightweight Ultrasound Based Silent Speech Interface Employing a Language Model. 1005-1008
Spoken Dialogue Systems II
- Heriberto Cuayáhuitl, Nina Dethlefs:

Optimizing Situated Dialogue Management in Unknown Environments. 1009-1012 - Om Deshmukh, Shajith Ikbal, Ashish Verma, Etienne Marcheret:

Acoustic-Similarity Based Technique to Improve Concept Recognition. 1013-1016 - Doug Peters, Peter Stubley:

Dialog Methods for Improved Alphanumeric String Capture. 1017-1020 - David DeVault, Kenji Sagae, David R. Traum:

Detecting the Status of a Predictive Incremental Speech Understanding Model for Real-Time Decision-Making in a Spoken Dialogue System. 1021-1024 - Senthilkumar Chandramohan, Matthieu Geist, Fabrice Lefèvre, Olivier Pietquin:

User Simulation in Dialogue Systems Using Inverse Reinforcement Learning. 1025-1028 - Paul A. Crook, Oliver Lemon:

Lossless Value Directed Compression of Complex User Goal States for Statistical Spoken Dialogue Systems. 1029-1032
Speaker Diarization II
- Janez Zibert, France Mihelic:

Prosodic and Phonetic Features for Speaker Clustering in Speaker Diarization Systems. 1033-1036 - Marijn Huijbregts, David A. van Leeuwen:

Diarization-Based Speaker Retrieval for Broadcast Television Archives. 1037-1040 - Martin Zelenák, Javier Hernando:

The Detection of Overlapping Speech with Prosodic Features for Speaker Diarization. 1041-1044 - Sree Hari Krishnan Parthasarathi, Hervé Bourlard, Daniel Gatica-Perez:

LP Residual Features for Robust, Privacy-Sensitive Speaker Diarization. 1045-1048 - Houman Ghaemmaghami, David Dean, Robbie Vogt, Sridha Sridharan:

Extending the Task of Diarization to Speaker Attribution. 1049-1052 - Viet-Anh Tran, Viet Bac Le, Claude Barras, Lori Lamel:

Comparing Multi-Stage Approaches for Cross-Show Speaker Diarization. 1053-1056
Prosody II
- György Szaszák, Katalin Nagy, András Beke:

Analysing the Correspondence Between Automatic Prosodic Segmentation and Syntactic Structure. 1057-1060 - Joseph Tepperman, Emily Nava:

Long-Distance Rhythmic Dependencies and their Application to Automatic Language Identification. 1061-1064 - Andrew Rosenberg:

Symbolic and Direct Sequential Modeling of Prosody for Classification of Speaking-Style and Nativeness. 1065-1068 - Wentao Gu, Ting Zhang, Hiroya Fujisaki:

Prosodic Analysis and Perception of Mandarin Utterances Conveying Attitudes. 1069-1072 - Chierh Cheng, Michele Gubian:

Predicting Taiwan Mandarin Tone Shapes from their Duration. 1073-1076 - Charlotte Wollermann, Ulrich Schade, Bernhard Schröder:

Variation of Accent Type and of Context - Influences on Pragmatic Focus Interpretation. 1077-1080
Adaptation for ASR
- Shinji Watanabe, Atsushi Nakamura, Biing-Hwang Juang:

Model Adaptation for Automatic Speech Recognition Based on Multiple Time Scale Evolution. 1081-1084 - Catherine Breslin, K. K. Chin, Mark J. F. Gales, Kate M. Knill:

Integrated Online Speaker Clustering and Adaptation. 1085-1088 - Zoltán Tüske, Christian Plahl, Ralf Schlüter:

A Study on Speaker Normalized MLP Features in LVCSR. 1089-1092 - Yongwon Jeong, Young Kuk Kim:

Matrix-Variate Distribution of Training Models for Robust Speaker Adaptation. 1093-1096 - Michael L. Seltzer, Alex Acero:

Separating Speaker and Environmental Variability Using Factored Transforms. 1097-1100 - Mazin Gilbert, Iker Arizmendi, Enrico Bocchieri, Diamantino Caseiro, Vincent Goffin, Andrej Ljolje, Mike Phillips, Chao Wang, Jay G. Wilpon:

Your Mobile Virtual Assistant Just Got Smarter! 1101-1104
SLP for Information Extraction and Retrieval II
- Vincent Claveau, Sébastien Lefèvre:

Topic Segmentation of TV-Streams by Mathematical Morphology and Vectorization. 1105-1108 - Mimi Lu, Cheung-Chi Leung, Lei Xie, Bin Ma, Haizhou Li:

Probabilistic Latent Semantic Analysis for Broadcast News Story Segmentation. 1109-1112 - Evandro B. Gouvêa:

Hybrid Speech Recognition for Voice Search: A Comparative Study. 1113-1116 - Bo Peng, Yao Qian, Frank K. Soong, Bo Zhang:

A New Phonetic Candidate Generator for Improving Search Query Efficiency. 1117-1120 - Yukiko Suzuki, Kiyoaki Aikawa:

Towards Voice-Input Symbolic Pattern Retrieval Using Parameter-Based Search. 1121-1124 - Vikram Gupta, Jitendra Ajmera, Arun Kumar, Ashish Verma:

A Language Independent Approach to Audio Search. 1125-1128
Regular Poster Sessions
Second Language Acquisition, Development and Learning I
- Mikhail Ordin, Leona Polyanskaya, Christiane Ulbrich:

Acquisition of Timing Patterns in Second Language. 1129-1132 - Hongyan Li, Shen Huang, Shijin Wang, Bo Xu:

Context-Dependent Duration Modeling with Backoff Strategy and Look-Up Tables for Pronunciation Assessment and Mispronunciation Detection. 1133-1136 - Mee Sonu, Keiichi Tajima, Hiroaki Kato, Yoshinori Sagisaka:

Perceptual Training of Vowel Length Contrast of Japanese by L2 Listeners: Effects of an Isolated Word versus a Word Embedded in Sentences. 1137-1140 - E.-Chin Wu:

Similar Vowels in L1/L2 Production: Confused or Discerned in Early L2 English Learners with Different Amount of Exposure. 1141-1144 - Lya Meister, Einar Meister:

Production and Perception of Estonian Vowels by Native and Non-Native Speakers. 1145-1148 - Hiroshi Kibishi, Seiichi Nakagawa:

New Feature Parameters for Pronunciation Evaluation in English Presentations at International Conferences. 1149-1152 - Gérard Bailly, Will Barbour:

Synchronous Reading: Learning French Orthography by Audiovisual Training. 1153-1156 - Christos Koniaris, Olov Engwall:

Phoneme Level Non-Native Pronunciation Analysis by an Auditory Model-Based Native Assessment Scheme. 1157-1160 - Pavel Sturm, Radek Skarnitzl:

The Open Front Vowel /æ/ in the Production and Perception of Czech Students of English. 1161-1164 - Catia Cucchiarini, Henk van den Heuvel, Eric Sanders, Helmer Strik:

Error Selection for ASR-Based English Pronunciation Training in 'My Pronunciation Coach'. 1165-1168 - Tomoko Nariai, Kazuyo Tanaka:

An Experimental Analysis of Pitch Patterns in Japanese Speakers of English with Verification by Speech Re-Synthesis. 1169-1172 - Tomoko Nariai, Kazuyo Tanaka, Yoshiaki Itoh:

An Analysis of Word Duration in Native Speakers and Japanese Speakers of English. 1173-1176
Speech Enhancement
- Laura Laaksonen, Ville Myllylä, Riitta Niemistö:

Evaluating Artificial Bandwidth Extension by Conversational Tests in Car Using Mobile Devices with Integrated Hands-Free Functionality. 1177-1180 - Hannu Pulakka, Ulpu Remes, Santeri Yrttiaho, Kalle J. Palomäki, Mikko Kurimo, Paavo Alku:

Low-Frequency Bandwidth Extension of Telephone Speech Using Sinusoidal Synthesis and Gaussian Mixture Model. 1181-1184 - Amr H. Nour-Eldin, Peter Kabal:

Memory-Based Approximation of the Gaussian Mixture Model Framework for Bandwidth Extension of Narrowband Speech. 1185-1188 - Philip Harding, Ben Milner:

Speech Enhancement by Reconstruction from Cleaned Acoustic Features. 1189-1192 - Jae-Hun Choi, Sang-Kyun Kim, Joon-Hyuk Chang:

A Soft Decision-Based Speech Enhancement Using Acoustic Noise Classification. 1193-1196 - Chao Li, Wenju Liu:

A Noise Estimation Method Based on Speech Presence Probability and Spectral Sparseness. 1197-1200 - Chao Li, Wenju Liu:

Improved a posteriori Speech Presence Probability Estimation Based on Cepstro-Temporal Smoothing and Time-Frequency Correlation. 1201-1204 - Md Foezur Rahman Chowdhury, Sid-Ahmed Selouani, Douglas D. O'Shaughnessy:

A Rapid Adaptation Algorithm for Tracking Highly Non-Stationary Noises based on Bayesian Inference for On-Line Spectral Change Point Detection. 1205-1208 - Kuldip K. Paliwal, Belinda Schwerin, Kamil K. Wójcicki:

Single Channel Speech Enhancement Using MMSE Estimation of Short-Time Modulation Magnitude Spectrum. 1209-1212 - Atanu Saha, Tetsuya Shimamura:

Speech Enhancement Using Masking Properties in Adverse Environments. 1213-1216 - Bhiksha Raj, Rita Singh, Tuomas Virtanen:

Phoneme-Dependent NMF for Speech Enhancement in Monaural Mixtures. 1217-1220 - Christina Leitner, Franz Pernkopf, Gernot Kubin:

Kernel PCA for Speech Enhancement. 1221-1224 - Angel M. Gomez, Belinda Schwerin, Kuldip K. Paliwal:

Objective Intelligibility Prediction of Speech by Combining Correlation and Distortion Based Techniques. 1225-1228
ASR - Feature Extraction I
- Frantisek Grézl, Martin Karafiát:

Integrating Recent MLP Feature Extraction Techniques into TRAP Architecture. 1229-1232 - Martin Wöllmer, Björn W. Schuller, Gerhard Rigoll:

Feature Frame Stacking in RNN-Based Tandem ASR Systems - Learned vs. Predefined Context. 1233-1236 - Christian Plahl, Ralf Schlüter, Hermann Ney:

Improved Acoustic Feature Combination for LVCSR by Neural Networks. 1237-1240 - Joel Pinto, Mathew Magimai-Doss, Hervé Bourlard:

Hierarchical Tandem Features for ASR in Mandarin. 1241-1244 - Fabio Valente, Mathew Magimai-Doss, Wen Wang:

Analysis and Comparison of Recent MLP Features for LVCSR Systems. 1245-1248 - Jaehyung Lee, Soo-Young Lee:

Deep Learning of Speech Features for Improved Phonetic Recognition. 1249-1252 - Heyun Huang, Yang Liu, Jort F. Gemmeke, Louis ten Bosch, Bert Cranen, Lou Boves:

Globality-Locality Consistent Discriminant Analysis for Phone Classification. 1253-1256 - Hynek Boril, Frantisek Grézl, John H. L. Hansen:

Front-End Compensation Methods for LVCSR Under Lombard Effect. 1257-1260 - Jung-Won Lee, Jeung-Yoon Choi, Hong-Goo Kang:

Classification of Fricatives Using Feature Extrapolation of Acoustic-Phonetic Features in Telephone Speech. 1261-1264 - Sami Keronen, Jouni Pohjalainen, Paavo Alku, Mikko Kurimo:

Noise Robust Feature Extraction Based on Extended Weighted Linear Prediction in LVCSR. 1265-1268 - Bernd T. Meyer, Suman V. Ravuri, Marc René Schädler, Nelson Morgan:

Comparing Different Flavors of Spectro-Temporal Features for ASR. 1269-1272 - Ehsan Variani, Thomas Schaaf:

VTLN in the MFCC Domain: Band-Limited versus Local Interpolation. 1273-1276 - Sridhar Krishna Nemala, Kailash Patil, Mounya Elhilali:

Multistream Bandpass Modulation Features for Robust Speech Recognition. 1277-1280 - Davide Marino, Thomas Hain:

An Analysis of Automatic Speech Recognition with Multiple Microphones. 1281-1284
Spoken Dialogue & Spoken Language Understanding Systems
- Géraldine Damnati, Delphine Charlet:

Multi-View Approach for Speaker Turn Role Labeling in TV Broadcast News Shows. 1285-1288 - Sudeep Gandhe, Michael Rushforth, Priti Aggarwal, David R. Traum:

Evaluation of an Integrated Authoring Tool for Building Advanced Question-Answering Characters. 1289-1292 - Gökhan Tür, Dilek Hakkani-Tür, Dustin Hillard, Asli Celikyilmaz:

Towards Unsupervised Spoken Language Understanding: Exploiting Query Click Logs for Slot Filling. 1293-1296 - Donghyeon Lee, Cheongjae Lee, Minwoo Jeong, Kyungduk Kim, Seokhwan Kim, Junhwi Choi, Gary Geunbae Lee:

Web-Enhanced Content Retrieval for Information Access Dialogue System. 1297-1300 - Lucie Daubigney, Milica Gasic, Senthilkumar Chandramohan, Matthieu Geist, Olivier Pietquin, Steve J. Young:

Uncertainty Management for On-Line Optimisation of a POMDP-Based Large-Scale Spoken Dialogue System. 1301-1304 - Sunao Hara, Norihide Kitaoka, Kazuya Takeda:

Detection of Task-Incomplete Dialogs Based on Utterance-and-Behavior Tag N-Gram for Spoken Dialog Systems. 1305-1308 - Ruhi Sarikaya, Stanley F. Chen, Bhuvana Ramabhadran:

Shrinkage-Based Features for Natural Language Call Routing. 1309-1312 - Leonid Rachevsky, Dimitri Kanevsky, Ruhi Sarikaya, Bhuvana Ramabhadran:

Clustering with Modified Cosine Distance Learned from Constraints. 1313-1316 - Andrew Fandrianto, Brian Langner, Alan W. Black:

Using Speaker ID to Discover Repeat Callers of a Spoken Dialog System. 1317-1320 - Florian Pinault, Fabrice Lefèvre:

Semantic Graph Clustering for POMDP-Based Spoken Dialog Systems. 1321-1324 - Ryo Taguchi, Yuji Yamada, Koosuke Hattori, Taizo Umezaki, Masahiro Hoguro, Naoto Iwahashi, Kotaro Funakoshi, Mikio Nakano:

Learning Place-Names from Spoken Utterances and Localization Results by Mobile Robot. 1325-1328 - Björn Gambäck, Fredrik Olsson, Oscar Täckström:

Active Learning for Dialogue Act Classification. 1329-1332 - Thierry Bazillon, Benjamin Maza, Mickael Rouvier, Frédéric Béchet, Alexis Nasr:

Speaker Role Recognition Using Question Detection and Characterization. 1333-1336 - Qiang Huang, Stephen J. Cox:

Learning Score Structure from Spoken Language for a Tennis Game. 1337-1340 - Silke M. Witt:

Semi-Automated Classifier Adaptation for Natural Language Call Routing. 1341-1344 - Wei-Bin Liang, Chung-Hsien Wu, Chih-Hung Wang, Jhing-Fa Wang:

Interactional Style Detection for Versatile Dialogue Response Using Prosodic and Semantic Features. 1345-1348 - Christine Kühnel, Benjamin Weiss, Matthias Schulz, Sebastian Möller:

Quality Aspects of Multimodal Dialog Systems: Identity, Stimulation and Success. 1349-1352
Prosodic Structure
- Joseph Tepperman, Emily Nava:

Where Should Pitch Accents and Phrase Breaks Go? A Syntax Tree Transducer Solution. 1353-1356 - Giuliano Bocci, Cinzia Avesani:

Phrasal Prominences do not need Pitch Movements: Postfocal Phrasal Heads in Italian. 1357-1360 - David Le Gac, Hiyon Yoo:

Intonation of Left Dislocated Topics in Modern Greek. 1361-1364 - Laura Thompson, Catherine Inez Watson, Ray Harlow, Jeanette King, Margaret Maclagan, Helen Charters, Peter Keegan:

Phrases, Pitch and Perceived Prominence in Maori. 1365-1368 - Tomás Dubeda:

Perceptual Sensitivity to Prenuclear and Nuclear Intonational Patterns. 1369-1372 - Raya Kalaldeh:

Tonal Alignment Defined: The Case of Southern Irish English. 1373-1376 - Andrew Rosenberg:

Using Mutual Information to Identify Regions of Analysis for Prosodic Analysis. 1377-1380 - Chiu-yu Tseng, Zhao-yu Su, Chi-Feng Huang:

Prosodic Highlights in Mandarin Continuous Speech - Cross-Genre Attributes and Implications. 1381-1384 - Simone Sulpizio, James M. McQueen:

When Two Newly-Acquired Words are One: New Words Differing in Stress Alone are not Automatically Represented Differently. 1385-1388 - Shehui Bu, Zhenjie Zhuo, Lingling Yang, Shuichi Itahashi:

Automatic Determination of the Standard Chinese Prosodic Phrase Boundaries by F0 Generation Model. 1389-1392 - Céline De Looze, Stéphane Rauzy:

Measuring Speakers' Similarity in Speech by Means of Prosodic Cues: Methods and Potential. 1393-1396 - Li-chiung Yang:

Tonal Variations in Mandarin: New Evidence from Spontaneous and Read Speech. 1397-1400
Language Processing
- Camille Guinaudeau, Julia Hirschberg:

Accounting for Prosodic Information to Improve ASR-Based Topic Tracking for TV Broadcast News. 1401-1404 - Kenji Imamura, Tomoko Izumi, Kugatsu Sadamitsu, Kuniko Saito, Satoshi Kobashikawa, Hirokazu Masataki:

Morpheme Conversion for Connecting Speech Recognizer and Language Analyzers in Unsegmented Languages. 1405-1408 - Ren-Ying Fang, Bo-Wei Chen, Jhing-Fa Wang, Chung-Hsien Wu:

Emotion Detection Based on Concept Inference and Spoken Sentence Analysis for Customer Service. 1409-1412 - Christophe Cerisara, Pavel Král, Claire Gardent:

Commas Recovery with Syntactic Features in French and in Czech. 1413-1416 - Daniele Falavigna:

Redundancy Reduction in ASR of Spontaneous Speech Through Statistical Machine Translation. 1417-1420 - Chin-Chih Chiang:

From Interview to News Text: A Study of Taiwan TV Political Interviews in Newspaper Reports. 1421-1424
ASR - Language Models I
- Jeffrey Sorensen, Cyril Allauzen:

Unary Data Structures for Language Models. 1425-1428 - Cyril Allauzen, Michael Riley:

Bayesian Language Model Interpolation for Mobile Speech Input. 1429-1432 - Martin Sundermeyer, Ralf Schlüter, Hermann Ney:

On the Estimation of Discount Parameters for Language Model Smoothing. 1433-1436 - Patrick Lehnen, Stefan Hahn, Hermann Ney:

N-Grams for Conditional Random Fields or a Failure-Transition(f) Posterior for Acyclic FSTs. 1437-1440 - M. Ali Basha Shaik, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney:

Hybrid Language Models Using Mixed Types of Sub-Lexical Units for Open Vocabulary German LVCSR. 1441-1444 - Amr El-Desoky Mousa, M. Ali Basha Shaik, Ralf Schlüter, Hermann Ney:

Morpheme Based Factored Language Models for German LVCSR. 1445-1448 - Markus Nußbaum-Thom, Amr El-Desoky Mousa, Ralf Schlüter, Hermann Ney:

Compound Word Recombination for German LVCSR. 1449-1452 - Akio Kobayashi, Takahiro Oku, Shinichi Homma, Toru Imai, Seiichi Nakagawa:

Lattice-Based Risk Minimization Training for Unsupervised Language Model Adaptation. 1453-1456 - Christian Gillot, Christophe Cerisara:

Similarity Language Model. 1457-1460 - Erinç Dikici, Murat Semerci, Murat Saraclar, Ethem Alpaydin:

Data Sampling and Dimensionality Reduction Approaches for Reranking ASR Outputs Using Discriminative Language Models. 1461-1464 - Ryo Masumura, Seongjun Hahm, Akinori Ito:

Training a Language Model Using Webdata for Large Vocabulary Japanese Spontaneous Speech Recognition. 1465-1468 - Hai Son Le, Ilya Oparin, Abdelkhalek Messaoudi, Alexandre Allauzen, Jean-Luc Gauvain, François Yvon:

Large Vocabulary SOUL Neural Network Language Models. 1469-1472 - Jonathan Mamou, Abhinav Sethy, Bhuvana Ramabhadran, Ron Hoory, Paul Vozila:

Improved Spoken Query Transcription Using Co-Occurrence Information. 1473-1476 - Yik-Cheung Tam, Paul Vozila:

Unsupervised Latent Speaker Language Modeling. 1477-1480
Spoken Language Resources, Evaluation and Standardization I
- Nobuaki Minematsu, Koji Okabe, Keisuke Ogaki, Keikichi Hirose:

Measurement of Objective Intelligibility of Japanese Accented English Using ERJ (English Read by Japanese) Database. 1481-1484 - Sebastian Möller, Chihuy Bang, Teele Tamme, Markus Vaalgamaa, Benjamin Weiss:

From Single-Call to Multi-Call Quality: A Study on Long-Term Quality Integration in Audio-Visual Speech Communication. 1485-1488 - Hui Lin, Jeff A. Bilmes:

Optimal Selection of Limited Vocabulary Speech Corpora. 1489-1492 - Stephen A. Zahorian, Jiang Wu, Montri Karnjanadecha, Chandra Sekhar Vootkuri, Brian Wong, Andrew Hwang, Eldar Tokhtamyshev:

Open Source Multi-Language Audio Database for Spoken Language Processing Applications. 1493-1496 - Matthew Black, Daniel Bone, Marian E. Williams, Phillip Gorrindo, Pat Levitt, Shrikanth S. Narayanan:

The USC CARE Corpus: Child-Psychologist Interactions of Children with Autism Spectrum Disorders. 1497-1500 - Nelly Barbot, Vincent Barreaud, Olivier Boëffard, Laure Charonnat, Arnaud Delhay, Sébastien Le Maguer, Damien Lolive:

Towards a Versatile Multi-Layered Description of Speech Corpora Using Algebraic Relations. 1501-1504 - Korin Richmond, Phil Hoole, Simon King:

Announcing the Electromagnetic Articulography (Day 1) Subset of the mngu0 Articulatory Corpus. 1505-1508 - Gregor Pirker, Michael Wohlmayr, Stefan Petrik, Franz Pernkopf:

A Pitch Tracking Corpus with Evaluation on Multipitch Tracking Scenario. 1509-1512 - Taras Butko:

On Building and Evaluating a Broadcast-News Audio Segmentation System. 1513-1516 - Simon Dobrisek, France Mihelic:

Time- and Acoustic-Mediated Alignment Algorithms for Speech Recognition Evaluation. 1517-1520 - Julia Niemann, Kati Schulz, Ina Wechsung:

Effects of Shortening Speech Prompts of In-Car Voice User Interfaces on Users Mental Models. 1521-1524 - Laurens van der Werff, Wessel Kraaij, Franciska de Jong:

Speech Transcript Evaluation for Information Retrieval. 1525-1528 - Luis Javier Rodríguez, Mikel Peñagarikano, Amparo Varona, Mireia Díez, Germán Bordel:

The Albayzin 2010 Language Recognition Evaluation. 1529-1532 - Roger K. Moore:

Progress and Prospects for Speech Technology: Results from Three Sexennial Surveys. 1533-1536 - Josef R. Novak, Nobuaki Minematsu, Keikichi Hirose:

Painless WFST Cascade Construction for LVCSR - Transducersaurus. 1537-1540
Paralinguistic Information - Classification and Detection
- Catharine Oertel, Stefan Scherer, Nick Campbell:

On the Use of Multimodal Cues for the Prediction of Degrees of Involvement in Spontaneous Conversation. 1541-1544 - Narichika Nomoto, Masafumi Tamoto, Hirokazu Masataki, Osamu Yoshioka, Satoshi Takahashi:

Anger Recognition in Spoken Dialog Using Linguistic and Para-Linguistic Information. 1545-1548 - Alexei V. Ivanov, Giuseppe Riccardi, Adam J. Sporka, Jakub Franc:

Recognition of Personality Traits from Human Spoken Conversations. 1549-1552 - Björn W. Schuller, Zixing Zhang, Felix Weninger, Gerhard Rigoll:

Using Multiple Databases for Training in Emotion Recognition: To Unite or to Vote? 1553-1556 - Felix Burkhardt, Björn W. Schuller, Benjamin Weiss, Felix Weninger:

"Would You Buy a Car from Me?" - On the Likability of Telephone Voices. 1557-1560 - James Gibson, Athanasios Katsamanis, Matthew P. Black, Shrikanth S. Narayanan:

Automatic Identification of Salient Acoustic Instances in Couples' Behavioral Interactions Using Diverse Density Support Vector Machines. 1561-1564 - Daniel Neiberg, Joakim Gustafson:

Predicting Speaker Changes and Listener Responses with and without Eye-Contact. 1565-1568 - Senaka Amarakeerthi, Tin Lay Nwe, Liyanage C. De Silva, Michael Cohen:

Emotion Classification Using Inter- and Intra-Subband Energy Variation. 1569-1572 - Kazuki Kitahara, Shinzi Michiwiki, Miku Sato, Shoichi Matsunaga, Masaru Yamashita, Kazuyuki Shinohara:

Emotion Classification of Infants' Cries Using Duration Ratios of Acoustic Segments. 1573-1576 - Bogdan Vlasenko, Dmytro Prylipko, David Philippou-Hübner, Andreas Wendemuth:

Vowels Formants Analysis Allows Straightforward Detection of High Arousal Acted and Spontaneous Emotions. 1577-1580 - Daniel Neiberg, Petri Laukka, Hillary Anger Elfenbein:

Intra-, Inter-, and Cross-Cultural Classification of Vocal Affect. 1581-1584
Applications for Learning, Education, Aged and Handicapped Persons
- Sajad Shirali-Shahreza, Yashar Ganjali, Ravin Balakrishnan:

Verifying Human Users in Speech-Based Interactions. 1585-1588 - Jian Cheng:

Automatic Assessment of Prosody in High-Stakes English Tests. 1589-1592 - Dean Luo, Xuesong Yang, Lan Wang:

Improvement of Segmental Mispronunciation Detection with Prior Knowledge Extracted from Large L2 Speech Corpus. 1593-1596 - Jian Cheng, Jianqiang Shen:

Off-Topic Detection in Automated Speech Assessment Applications. 1597-1600 - Sebastian Stüker, Johanna Fay, Kay Berkling:

Towards Context-Dependent Phonetic Spelling Error Correction in Children's Freely Composed Text for Diagnostic and Pedagogical Purposes. 1601-1604 - Verónica López-Ludeña, Rubén San Segundo, Ricardo de Córdoba, Javier Ferreiros, Juan Manuel Montero, José Manuel Pardo:

Factored Translation Models for Improving a Speech into Sign Language Translation System. 1605-1608 - Kálmán Abari, Zsuzsanna Zsófia Rácz, Gábor Olaszy:

Formant Maps in Hungarian Vowels - Online Data Inventory for Research, and Education. 1609-1612 - Germán Bordel, Silvia Nieto, Mikel Peñagarikano, Luis Javier Rodríguez, Amparo Varona:

Automatic Subtitling of the Basque Parliament Plenary Sessions Videos. 1613-1616 - Yurie Iribe, Silasak Manosavanh, Kouichi Katsurada, Ryoko Hayashi, Chunyue Zhu, Tsuneo Nitta:

Generating Animated Pronunciation from Speech Through Articulatory Feature Extraction. 1617-1620 - Wei Chen, Jack Mostow:

A Tale of Two Tasks: Detecting Children's Off-Task Speech in a Reading Tutor. 1621-1624 - Toshiko Isei-Jaakkola, Takatoshi Naka, Keikichi Hirose:

Problems Encountered by Japanese EL2 with English Short Vowels as Illustrated on a 3D Vowel Chart. 1625-1628 - Thomas Pellegrini, Rui Correia, Isabel Trancoso, Jorge Baptista, Nuno J. Mamede:

Automatic Generation of Listening Comprehension Learning Material in European Portuguese. 1629-1632 - Chao-Hong Liu, Chung-Hsien Wu, David Sarwono, Jhing-Fa Wang:

Candidate Generation for ASR Output Error Correction Using a Context-Dependent Syllable Cluster-Based Confusion Matrix. 1633-1636 - Huynh Thai Hoa, An Vu Tran, Tran Huy Dat:

Semi-Supervised Tree Support Vector Machine for Online Cough Recognition. 1637-1640
Robust Speech Recognition I
- Volker Leutnant, Alexander Krueger, Reinhold Haeb-Umbach:

A Versatile Gaussian Splitting Approach to Non-Linear State Estimation and its Application to Noise-Robust ASR. 1641-1644 - Hilman Ferdinandus Pardede, Koichi Shinoda:

Generalized-Log Spectral Mean Normalization for Speech Recognition. 1645-1648 - Young-Ik Kim, Hoon-Young Cho, Sang-Hun Kim:

Zero-Crossing-Based Channel Attentive Weighting of Cepstral Features for Robust Speech Recognition: The ETRI 2011 CHiME Challenge System. 1649-1652 - Wooil Kim, John H. L. Hansen:

Feature Compensation for Speech Recognition in Severely Adverse Environments Due to Background Noise and Channel Distortion. 1653-1656 - Ning Ma, Jon Barker, Heidi Christensen, Phil D. Green:

Binaural Cues for Fragment-Based Speech Recognition in Reverberant Multisource Environments. 1657-1660 - Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, Luz García, M. Carmen Benítez:

Sub-Band Level Histogram Equalization for Robust Speech Recognition. 1661-1664 - Ulpu Remes, Yoshihiko Nankaku, Keiichi Tokuda:

GMM-Based Missing-Feature Reconstruction on Multi-Frame Windows. 1665-1668 - Yang Sun, Jort F. Gemmeke, Bert Cranen, Louis ten Bosch, Lou Boves:

Improvements of a Dual-Input DBN for Noise Robust ASR. 1669-1672 - Randy Gomez, Tatsuya Kawahara:

Denoising Using Optimized Wavelet Filtering for Automatic Speech Recognition. 1673-1676 - Florian Müller, Alfred Mertins:

Noise Robust Speaker-Independent Speech Recognition with Invariant-Integration Features Using Power-Bias Subtraction. 1677-1680
ASR - Acoustic Models I
- Michele Alessandrini, Giorgio Biagetti, Alessandro Curzi, Claudio Turchetti:

Semi-Automatic Acoustic Model Generation from Large Unsynchronized Audio and Text Chunks. 1681-1684 - Brian Strope, Doug Beeferman, Alexander Gruenstein, Xin Lei:

Unsupervised Testing Strategies for ASR. 1685-1688 - Gakuto Kurata, Nobuyasu Itoh, Masafumi Nishimura:

Acoustic Model Training with Detecting Transcription Errors in the Training Data. 1689-1692 - Aren Jansen, Kenneth Church:

Towards Unsupervised Training of Speaker Independent Acoustic Models. 1693-1696 - Xiaodong Cui, Xin Chen, Jian Xue, Peder A. Olsen, John R. Hershey, Bowen Zhou:

Acoustic Modeling with Bootstrap and Restructuring Based on Full Covariance. 1697-1700 - Jian Xu, Yu Zhang, Zhi-Jie Yan, Qiang Huo:

An i-vector Based Approach to Acoustic Sniffing for Irrelevant Variability Normalization Based Acoustic Model Training and Speech Recognition. 1701-1704 - Muhammad Ali Tahir, Ralf Schlüter, Hermann Ney:

Log-Linear Optimization of Second-Order Polynomial Features with Subsequent Dimension Reduction for Speech Recognition. 1705-1708 - Qingqing Zhang, Lori Lamel, Jean-Luc Gauvain:

Genre Categorization and Modeling for Broadcast Speech Transcription. 1709-1712 - Sunghwan Shin, Ho-Young Jung, Biing-Hwang Juang:

Individual Error Minimization Learning Framework and its Applications to Speech Recognition and Utterance Verification. 1713-1716 - Sakhia Darjaa, Milos Cernak, Marián Trnka, Milan Rusko, Róbert Sabo:

Effective Triphone Mapping for Acoustic Modeling in Speech Recognition. 1717-1720 - Udhyakumar Nallasamy, Michael Garbus, Florian Metze, Qin Jin, Thomas Schaaf, Tanja Schultz:

Analysis of Dialectal Influence in Pan-Arabic ASR. 1721-1724 - Azarakhsh Jalalvand, Fabian Triefenbach, David Verstraeten, Jean-Pierre Martens:

Connected Digit Recognition by Means of Reservoir Computing. 1725-1728 - Madhavi Vedula Ratnagiri, Biing-Hwang Juang, Lawrence R. Rabiner:

Large Margin - Minimum Classification Error Using Sum of Shifted Sigmoids as the Loss Function. 1729-1732 - Javier Mikel Olaso, M. Inés Torres, Raquel Justo:

Representing Phonological Features Through a Two-Level Finite State Model. 1733-1736 - Jan Vanek, Jan Trmal, Josef V. Psutka, Josef Psutka:

Optimization of the Gaussian Mixture Model Evaluation on GPU. 1737-1740
Source Separation and Speech Enhancement
- Xueliang Zhang, Wenju Liu:

Monaural Voiced Speech Segregation Based on Pitch and Comb Filter. 1741-1744 - Yasuharu Hirasawa, Naoki Yasuraoka, Toru Takahashi, Tetsuya Ogata, Hiroshi G. Okuno:

Fast and Simple Iterative Algorithm of Lp-Norm Minimization for Under-Determined Speech Separation. 1745-1748 - Azam Rabiee, Saeed Setayeshi, Soo-Young Lee:

Monaural Speech Separation Based on a 2D Processing and Harmonic Analysis. 1749-1752 - Ingrid Jafari, Serajul Haque, Roberto Togneri, Sven Nordholm:

Underdetermined Blind Source Separation with Fuzzy Clustering for Arbitrarily Arranged Sensors. 1753-1756 - Dang Hai Tran Vu, Reinhold Haeb-Umbach:

On Initial Seed Selection for Frequency Domain Blind Speech Separation. 1757-1760 - Nobuaki Tanaka, Tetsuji Ogawa, Tetsunori Kobayashi:

Spatial Filter Calibration Based on Minimization of Modified LSD. 1761-1764 - Toru Nakashika, Tetsuya Takiguchi, Yasuo Ariki:

Probabilistic Spectrum Envelope: Categorized Audio-Features Representation for NMF-Based Sound Decomposition. 1765-1768 - Jinho Choi, Chang D. Yoo:

A High Resolution Multiple Source Localization Based on Generalized Cumulant Structure (GCS) Matrix. 1769-1772 - Emad M. Grais, Hakan Erdogan:

Single Channel Speech Music Separation Using Nonnegative Matrix Factorization with Sliding Windows and Spectral Masks. 1773-1776 - Jorge I. Marin-Hurtado, David V. Anderson:

Perceptually-Inspired Processing for Multichannel Wiener Filter. 1777-1780 - Shoichi Nakano, Kazumasa Yamamoto, Seiichi Nakagawa:

Speech Recognition in Mixed Sound of Speech and Music Based on Vector Quantization and Non-Negative Matrix Factorization. 1781-1784 - Tomohiro Nakatani, Shoko Araki, Marc Delcroix, Takuya Yoshioka, Masakiyo Fujimoto:

Reduction of Highly Nonstationary Ambient Noise by Integrating Spectral and Locational Characteristics of Speech and Noise for Robust ASR. 1785-1788 - Carlo Drioli, Andrea Calanca:

Voice Processing by Dynamic Glottal Models with Applications to Speech Enhancement. 1789-1792 - Jinqiu Sang, Guoping Li, Hongmei Hu, Mark E. Lutman, Stefan Bleeck:

Supervised Sparse Coding Strategy in Cochlear Implants. 1793-1796
HMM-Based Speech Synthesis II
- Benjamin Picart, Thomas Drugman, Thierry Dutoit:

Continuous Control of the Degree of Articulation in HMM-Based Speech Synthesis. 1797-1800 - Ling-Hui Chen, Yoshihiko Nankaku, Heiga Zen, Keiichi Tokuda, Zhen-Hua Ling, Li-Rong Dai:

Estimation of Window Coefficients for Dynamic Feature Extraction for HMM-Based Speech Synthesis. 1801-1804 - Zhengqi Wen, Jianhua Tao:

Inverse Filtering Based Harmonic Plus Noise Excitation Model for HMM-Based Speech Synthesis. 1805-1808 - Daniel Erro, Iñaki Sainz, Eva Navas, Inma Hernáez:

Improved HNM-Based Vocoder for Statistical Synthesizers. 1809-1812 - Gopala Krishna Anumanchipalli, Luís C. Oliveira, Alan W. Black:

A Statistical Phrase/Accent Model for Intonation Modeling. 1813-1816 - Gustav Eje Henter, W. Bastiaan Kleijn:

Intermediate-State HMMs to Capture Continuously-Changing Signal Features. 1817-1820 - Norbert Braunschweiler, Sabine Buchholz:

Automatic Sentence Selection from Speech Corpora Including Diverse Speech for Improved HMM-TTS Synthesis Quality. 1821-1824 - Hui Liang, John Dines:

Phonological Knowledge Guided HMM State Mapping for Cross-Lingual Speaker Adaptation. 1825-1828 - Nicolas Obin, Pierre Lanchantin, Anne Lacheret, Xavier Rodet:

Reformulating Prosodic Break Model into Segmental HMMs and Information Fusion. 1829-1832 - Ranniery Maia, Heiga Zen, Kate M. Knill, Mark J. F. Gales, Sabine Buchholz:

Multipulse Sequences for Residual Signal Modeling. 1833-1836 - Cassia Valentini-Botinhao, Junichi Yamagishi, Simon King:

Can Objective Measures Predict the Intelligibility of Modified HMM-Based Synthetic Speech in Noise? 1837-1840 - Tsuneo Nitta, Takayuki Onoda, Masashi Kimura, Yurie Iribe, Kouichi Katsurada:

Speech Synthesis Based on Articulatory-Movement HMMs with Voice-Source Codebooks. 1841-1844 - Tsuneo Kato, Makoto Yamada, Nobuyuki Nishizawa, Keiichiro Oura, Keiichi Tokuda:

Large-Scale Subjective Evaluations of Speech Rate Control Methods for HMM-Based Speech Synthesizers. 1845-1848 - Yu Maeno, Takashi Nose, Takao Kobayashi, Yusuke Ijima, Hideharu Nakajima, Hideyuki Mizuno, Osamu Yoshioka:

HMM-Based Emphatic Speech Synthesis Using Unsupervised Context Labeling. 1849-1852
Phonetics and Phonology, Stress, Accent, Rhythm
- Chiara Bertini, Pier Marco Bertinetto, Na Zhi:

Chinese and Italian Speech Rhythm: Normalization and the CCI Algorithm. 1853-1856 - Paolo Mairano, Antonio Romano:

Rhythm Metrics on Syllables and Feet do not Work as Expected. 1857-1860 - Lei Chen, Klaus Zechner:

Applying Rhythm Features to Automatically Assess Non-Native Speech. 1861-1864 - Brian Vaughan:

Prosodic Synchrony in Co-Operative Task-Based Dialogues: A Measure of Agreement and Disagreement. 1865-1868 - Oliver Niebuhr, Astrid Wolf:

Low and High, Short and Long by Crook or by Hook? 1869-1872 - Christian Heinrich, Florian Schiel:

Estimating Speaking Rate by Means of Rhythmicity Parameters. 1873-1876 - Denis Arnold, Bernd Möbius, Petra Wagner:

Comparing Word and Syllable Prominence Rated by Naïve Listeners. 1877-1880 - Shinichi Tokuma, Yi Xu:

L1/L2 Perception of Lexical Stress with F0 Peak-Delay: Effect of an Extra Syllable Added. 1881-1884 - Kheang Seng, Yurie Iribe, Tsuneo Nitta:

Letter-to-Phoneme Conversion Based on Two-Stage Neural Network Focusing on Letter and Phoneme Contexts. 1885-1888 - Rosemary Orr, Hugo Quené, Roeland van Beek, Thari Diefenbach, David A. van Leeuwen, Marijn Huijbregts:

An International English Speech Corpus for Longitudinal Study of Accent Development. 1889-1892 - Sunhee Kim, Kyuwhan Lee, Minhwa Chung:

A Corpus-Based Study of English Pronunciation Variations. 1893-1896 - Hywel Stoakes, Andrew Butcher, Janet Fletcher, Marija Tabain:

Long Term Average Speech Spectra in Yolngu Matha and Pitjantjatjara Speaking Females and Males. 1897-1900 - Tekla Etelka Gráczi, Steven M. Lulich, Tamás Gábor Csapó, András Beke:

Context and Speaker Dependency in the Relation of Vowel Formants and Subglottal Resonances - Evidence from Hungarian. 1901-1904
ASR - Search, Keyword Spotting and Confidence Measures I
- Keith Kintzley, Aren Jansen, Hynek Hermansky:

Event Selection from Phone Posteriorgrams Using Matched Filters. 1905-1908 - Yaodong Zhang, James R. Glass:

A Piecewise Aggregate Approximation Lower-Bound Estimate for Posteriorgram-Based Dynamic Time Warping. 1909-1912 - Long Qin, Ming Sun, Alexander I. Rudnicky:

OOV Detection and Recovery Using Hybrid Models with Different Fragments. 1913-1916 - Haiyang Li, Jiqing Han, Tieran Zheng:

AUC Optimization Based Confidence Measure for Keyword Spotting. 1917-1920 - Zejun Ma, Xiaorui Wang, Bo Xu:

An Empirical Study of Multilingual Spoken Term Detection. 1921-1924 - Zejun Ma, Xiaorui Wang, Bo Xu:

Fusing Multiple Confidence Measures for Chinese Spoken Term Detection. 1925-1928 - Zhanlei Yang, Hao Chao, Wenju Liu:

Response Probability Based Decoding Algorithm for Large Vocabulary Continuous Speech Recognition. 1929-1932 - Yuxiang Shan, Yan Deng, Jia Liu:

Combining Lattice-Based Language Dependent and Independent Approaches for Out-of-Language Detection in LVCSR. 1933-1936 - Naoaki Ito, Yoshihiko Nankaku, Akinobu Lee:

Evaluation of Tree-Trellis Based Decoding in Over-Million LVCSR. 1937-1940 - Hao Huang, Bing Hu Li:

Lattice Based Discriminative Model Combination Using Automatically Induced Phonetic Contexts. 1941-1944 - Taniya Mishra, Andrej Ljolje, Mazin Gilbert:

Predicting Human Perceived Accuracy of ASR Systems. 1945-1948 - Ioana Vasilescu, Dahbia Yahia, Natalie D. Snoeren, Martine Adda-Decker, Lori Lamel:

Cross-Lingual Study of ASR Errors: On the Role of the Context in Human Perception of Near-Homophones. 1949-1952 - Tatsuhiko Saito, Takashi Nose, Takao Kobayashi, Yohei Okato, Akio Horii:

Performance Prediction of Speech Recognition Using Average-Voice-Based Speech Synthesis. 1953-1956 - Ali Haznedaroglu, Levent M. Arslan:

Confidence Measures for Turkish Call Center Conversations. 1957-1960 - Taichi Asami, Narichika Nomoto, Satoshi Kobashikawa, Yoshikazu Yamaguchi, Hirokazu Masataki, Satoshi Takahashi:

Spoken Document Confidence Estimation Using Contextual Coherence. 1961-1964
Pitch Processing - Singing Voice Analysis
- Alipah Pawi, Saeed Vaseghi, Ben Milner, Seyed Ghorshi:

Fundamental Frequency Estimation Using Modified Higher Order Moments and Multiple Windows. 1965-1968 - Michael Wohlmayr, Franz Pernkopf:

EM-Based Gain Adaptation for Probabilistic Multipitch Tracking. 1969-1972 - Thomas Drugman, Abeer Alwan:

Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics. 1973-1976 - D. Govind, S. R. Mahadeva Prasanna, Debadatta Pati:

Epoch Extraction in High Pass Filtered Speech Using Hilbert Envelope. 1977-1980 - Alexander Pavlovets, Alexander A. Petrovsky:

Robust HNR-Based Closed-Loop Pitch and Harmonic Parameters Estimation. 1981-1984 - Chetana Prakash, N. Dhananjaya, Suryakanth V. Gangashetty:

Exploring Bessel Features for Detection of Glottal Closure Instants. 1985-1988 - João P. Cabral, John Kane, Christer Gobl, Julie Carson-Berndsen:

Evaluation of Glottal Epoch Detection Algorithms on Different Voice Types. 1989-1992 - Antonio Origlia, Giovanni Abete, Francesco Cutugno, Iolanda Alfano, Renata Savy, Bogdan Ludusan:

A Divide et impera Algorithm for Optimal Pitch Stylization. 1993-1996 - Ricardo Teixeira Sousa, Aníbal J. S. Ferreira:

Singing Voice Analysis Using Relative Harmonic Delays. 1997-2000 - Siu Wa Lee, Minghui Dong:

Singing Voice Synthesis: Singer-Dependent Vibrato Modeling and Coherent Processing of Spectral Envelope. 2001-2004 - Sylvain Le Beux, Lionel Feugère, Christophe d'Alessandro:

Chorus Digitalis: Experiments in Chironomic Choir Singing. 2005-2008
Prosodic Modeling
- Kun Li, Shuang Zhang, Mingxing Li, Wai Kit Lo, Helen M. Meng:

Prominence Model for Prosodic Features in Automatic Lexical Stress and Pitch Accent Detection. 2009-2012 - Ya Li, Jianhua Tao, Xiaoying Xu:

Hierarchical Stress Modeling in Mandarin Text-to-Speech. 2013-2016 - Chong-Jia Ni, Wenju Liu, Bo Xu:

Automatic Prosodic Events Detection by Using Syllable-Based Acoustic, Lexical and Syntactic Features. 2017-2020 - Albert Rilliard, Alexandre Allauzen, Philippe Boula de Mareüil:

Using Dynamic Time Warping to Compute Prosodic Similarity Measures. 2021-2024 - Plínio Almeida Barbosa, Hansjörg Mixdorff, Sandra Madureira:

Applying the Quantitative Target Approximation Model (qTA) to German and Brazilian Portuguese. 2025-2028 - Nicolas Obin, Anne Lacheret, Xavier Rodet:

Stylization and Trajectory Modelling of Short and Long Term Speech Prosody Variations. 2029-2032 - Mathieu Avanzi, Nicolas Obin, Anne Lacheret-Dujour, Bernard Victorri:

Toward a Continuous Modeling of French Prosodic Structure: Using Acoustic Features to Predict Prominence Location and Prominence Degree. 2033-2036 - Tim Mahrt, Jui-Ting Huang, Yoonsook Mo, Margaret M. Fleck, Mark Hasegawa-Johnson, Jennifer Cole:

Optimal Models of Prosodic Prominence Using the Bayesian Information Criterion. 2037-2040 - Hussein Hussein, Hansjörg Mixdorff, Hue San Do, Rüdiger Hoffmann:

Quantitative Analysis of Tone Coarticulation in Mandarin. 2041-2044 - Daniel Neiberg, Gopal Ananthakrishnan, Joakim Gustafson:

Tracking Pitch Contours Using Minimum Jerk Trajectories. 2045-2048
Discourse and Dialogue
- Benjamin Maza, Marc El-Bèze, Georges Linarès, Renato de Mori:

On the Use of Linguistic Features in an Automatic System for Speech Analytics of Telephone Conversations. 2049-2052 - Abe Kazemzadeh, Sungbok Lee, Panayiotis G. Georgiou, Shrikanth S. Narayanan:

Determining what Questions to Ask, with the Help of Spectral Graph Theory. 2053-2056 - Hendrik Buschmeier, Zofia Malisz, Marcin Wlodarczak, Stefan Kopp, Petra Wagner:

'Are You Sure You're Paying Attention?' - 'Uh-Huh' Communicating Understanding as a Marker of Attentiveness. 2057-2060 - Yuichi Ishimoto, Mika Enomoto, Hitoshi Iida:

Projectability of Transition-Relevance Places Using Prosodic Features in Japanese Spontaneous Conversation. 2061-2064 - Anna Hjalmarsson, Kornel Laskowski:

Measuring Final Lengthening for Speaker-Change Prediction. 2065-2068 - Kornel Laskowski, Jens Edlund, Mattias Heldner:

Incremental Learning and Forgetting in Stochastic Turn-Taking Models. 2069-2072 - Kallirroi Georgila, David R. Traum:

Reinforcement Learning of Argumentation Dialogue Policies in Negotiation. 2073-2076 - Tobias Heinroth, Savina Koleva, Wolfgang Minker:

Topic Switching Strategies for Spoken Dialogue Systems. 2077-2080 - Ryuichiro Higashinaka, Noriaki Kawamae, Kugatsu Sadamitsu, Yasuhiro Minami, Toyomi Meguro, Kohji Dohsaka, Hirohito Inagaki:

Unsupervised Clustering of Utterances Using Non-Parametric Bayesian Methods. 2081-2084
SLP for Speech Translation, Information Extraction and Retrieval
- Carolina Parada, Mark Dredze, Frederick Jelinek:

OOV Sensitive Named-Entity Recognition in Speech. 2085-2088 - Markus Saers, Dekai Wu, Chi-kiu Lo, Karteek Addanki:

Speech Translation with Grammar Driven Probabilistic Phrasal Bilexica Extraction. 2089-2092 - Christoph Tillmann, Sanjika Hewavitharana:

An Efficient Unified Extraction Algorithm for Bilingual Data. 2093-2096 - Songfang Huang, Bowen Zhou:

Using Features from Topic Models to Alleviate Over-Generation in Hierarchical Phrase-Based Translation. 2097-2100 - Songfang Huang, Bowen Zhou:

An Empirical Study on Improving Hierarchical Phrase-Based Translation Using Alignment Features. 2101-2104 - Xiaodong He, Li Deng:

Robust Speech Translation by Domain Adaptation. 2105-2108 - Emil Ettelaie, Panayiotis G. Georgiou, Shrikanth S. Narayanan:

Enhancements to the Training Process of Classifier-Based Speech Translator via Topic Modeling. 2109-2112 - Vivek Kumar Rangarajan Sridhar, Luciano Barbosa, Srinivas Bangalore:

A Scalable Approach to Building a Parallel Corpus from the Web. 2113-2116 - Yoshiaki Itoh, Kohei Iwata, Masaaki Ishigame, Kazuyo Tanaka, Shi-wook Lee:

Spoken Term Detection Results Using Plural Subword Models by Estimating Detection Performance for Each Query. 2117-2120 - Luciano Barbosa, Diamantino Caseiro, Giuseppe Di Fabbrizio:

SpeechForms: From Web to Speech and Back. 2121-2124 - Kazuyuki Noritake, Hiroaki Nanjo, Takehiko Yoshimi:

Image Processing Filters for Line Detection-Based Spoken Term Detection. 2125-2128 - Joe Polifroni, François Mairesse:

Using Latent Topic Features for Named Entity Extraction in Search Queries. 2129-2132 - Ryo Masumura, Seongjun Hahm, Akinori Ito:

Language Model Expansion Using Webdata for Spoken Document Retrieval. 2133-2136 - Tomoyosi Akiba, Koichiro Honda:

Effects of Query Expansion for Spoken Document Passage Retrieval. 2137-2140 - Chun-an Chan, Lin-Shan Lee:

Unsupervised Hidden Markov Modeling of Spoken Queries for Spoken Term Detection without Speech Recognition. 2141-2144 - Roberto Gemello, Franco Mana, Pier Domenico Batzu:

Topic Identification from Audio Recordings Using Rich Recognition Results and Neural Network Based Classifiers. 2145-2148
Speech Synthesis - Selected Topics
- Alok Parlikar, Alan W. Black:

A Grammar Based Approach to Style Specific Phrase Prediction. 2149-2152 - Oliver Watts, Bowen Zhou:

Unsupervised Features from Text for Speech Synthesis in a Speech-to-Speech Translation System. 2153-2156 - Oliver Watts, Junichi Yamagishi, Simon King:

Unsupervised Continuous-Valued Word Features for Phrase-Break Prediction without a Part-of-Speech Tagger. 2157-2160 - Francisco Campillo, Francisco Méndez Pazó, Montserrat Arza, Laura Docío Fernández, Antonio Bonafonte, Eva Navas, Iñaki Sainz:

Albayzín 2010: A Spanish Text to Speech Evaluation. 2161-2164 - Binbin Shen, Zhiyong Wu, Yongxin Wang, Lianhong Cai:

Combining Active and Semi-Supervised Learning for Homograph Disambiguation in Mandarin Text-to-Speech Synthesis. 2165-2168 - Thomas Ewender, Beat Pfister:

Automatically Creating a Diphone Set from a Speech Database. 2169-2172 - Wesley Mattheyses, Lukas Latacz, Werner Verhelst:

Automatic Viseme Clustering for Audiovisual Speech Synthesis. 2173-2176 - Florian Hinterleitner, Sebastian Möller, Christoph Norrenbrock, Ulrich Heute:

Perceptual Quality Dimensions of Text-to-Speech Systems. 2177-2180 - Shinsuke Mori, Graham Neubig:

A Pointwise Approach to Pronunciation Estimation for a TTS Front-End. 2181-2184 - Mohamed Abou-Zleikha, Julie Carson-Berndsen:

Correlating Text with Prosody. 2185-2188 - Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:

"What is... Dengue Fever?" - Modeling and Predicting Pronunciation Errors in a Text-to-Speech System. 2189-2192 - Christoph Norrenbrock, Ulrich Heute, Florian Hinterleitner, Sebastian Möller:

Aperiodicity Analysis for Quality Estimation of Text-to-Speech Signals. 2193-2196
Human Speech and Sound Perception I
- Eeva Klintfors, Ellen Marklund, Francisco Lacerda:

Parallels in Infants' Attention to Speech Articulation and to Physical Changes in Speech-Unrelated Objects. 2197-2200 - Daniel Duran, Jagoda Bruni, Grzegorz Dogil, Hinrich Schütze:

Speech Events are Recoverable from Unlabeled Articulatory Data: Using an Unsupervised Clustering Approach on Data Obtained from Electromagnetic Midsaggital Articulography (EMA). 2201-2204 - Sofia Strömbergsson:

Children's Recognition of their own Voice: Influence of Phonological Impairment. 2205-2208 - Takayuki Kagomiya, Seiji Nakagawa:

Evaluation of Bone-Conducted Ultrasonic Hearing-Aid Regarding Transmission of Speaker Discrimination Information. 2209-2212 - Christian Herff, Matthias Janke, Michael Wand, Tanja Schultz:

Impact of Different Feedback Mechanisms in EMG-Based Speech Recognition. 2213-2216 - Michael C. W. Yip:

Phonotactic Constraints and the Segmentation of Cantonese Speech. 2217-2220 - Katrin Schneider, Grzegorz Dogil, Bernd Möbius:

Reaction Time and Decision Difficulty in the Perception of Intonation. 2221-2224 - Ferenc Honbolygo, Valéria Csépe:

Processing of Stress Related Acoustic Cues as Indexed by ERPs. 2225-2228 - Marijt J. Witteman, Andrea Weber, James M. McQueen:

On the Relationship Between Perceived Accentedness, Acoustic Similarity, and Processing Difficulty in Foreign-Accented Speech. 2229-2232 - Shigeaki Amano, Yukari Hirata:

The Perception Boundary Between Single and Geminate Stops in 3- and 4-Mora Japanese Words. 2233-2236 - Yusuke Ijima, Mitsuaki Isogai, Hideyuki Mizuno:

Correlation Analysis of Acoustic Features with Perceptual Voice Quality Similarity for Similar Speaker Selection. 2237-2240
Multilingual and Multimodal Approaches to Spoken Language
- Vicent Alabau, Verónica Romero, Antonio L. Lagarda, Carlos D. Martínez-Hinarejos:

A Multimodal Approach to Dictation of Handwritten Historical Documents. 2245-2248 - Asterios Toutios, Utpala Musti, Slim Ouni, Vincent Colotte:

Weight Optimization for Bimodal Unit-Selection Talking Head Synthesis. 2249-2252 - Stefan Schaffer, Benjamin Jöckel, Ina Wechsung, Robert Schleicher, Sebastian Möller:

Modality Selection and Perceived Mental Effort in a Mobile Application. 2253-2256 - Jitendra Ajmera, Ashish Verma:

A Cross-Lingual Spoken Content Search System. 2257-2260 - Christian Girardi, Roberto Gretter, Daniele Falavigna, Fabio Brugnara, Diego Giuliani, Marcello Federico:

NeMo: A Platform for Multilingual News Monitoring. 2261-2264 - Sourish Chaudhuri, Mark Harvilla, Bhiksha Raj:

Unsupervised Learning of Acoustic Unit Descriptors for Audio Content Representation and Classification. 2265-2268 - Michael Glodek, Stefan Scherer, Friedhelm Schwenker:

Conditioned Hidden Markov Model Fusion for Multimodal Classification. 2269-2272 - Benjamin Lecouteux, Michel Vacher, François Portet:

Distant Speech Recognition in a Smart Home: Comparison of Several Multisource ASRs in Realistic Conditions. 2273-2276 - Jiansong Chen, Lei Zhu, Bailan Feng, Peng Ding, Bo Xu:

A Robust Approach to Mining Repeated Sequence in Audio Stream. 2277-2280
ASR - New Paradigms and Other Topics
- Dong Yu, Li Deng:

Accelerated Parallelizable Neural Network Learning Algorithm for Speech Recognition. 2281-2284 - Dong Yu, Li Deng:

Deep Convex Net: A Scalable Architecture for Speech Pattern Classification. 2285-2288 - Siwei Wang, Gina-Anne Levow:

Modeling Broad Context for Tone Recognition with Conditional Random Fields. 2289-2292 - Shang-wen Li, Yow-Bang Wang, Liang-Che Sun, Lin-Shan Lee:

Improved Tonal Language Speech Recognition by Integrating Spectro-Temporal Evidence and Pitch Information with Properly Chosen Tonal Acoustic Units. 2293-2296 - Evandro Gouvêa, Marelie H. Davel:

Kullback-Leibler Divergence-Based ASR Training Data Selection. 2297-2300 - Arild Brandrud Næss, Karen Livescu, Rohit Prabhavalkar:

Articulatory Feature Classification Using Nearest Neighbors. 2301-2304 - Sébastien Demange, Slim Ouni:

Continuous Episodic Memory Based Speech Recognition Using Articulatory Dynamics. 2305-2308 - T. Li, Philip C. Woodland, Frank Diehl, Mark J. F. Gales:

Graphone Model Interpolation and Arabic Pronunciation Generation. 2309-2312 - Irina Illina, Dominique Fohr, Denis Jouvet:

Grapheme-to-Phoneme Conversion Using Conditional Random Fields. 2313-2316 - Ching-feng Yeh, Chao-Yu Huang, Lin-Shan Lee:

Bilingual Acoustic Model Adaptation by Unit Merging on Different Levels and Cross-Level Integration. 2317-2320 - Marijn Schraagen, Gerrit Bloothooft:

A Qualitative Evaluation of Phoneme-to-Phoneme Technology. 2321-2324 - Daniele Falavigna, Roberto Gretter:

Cheap Bootstrap of Multi-Lingual Hidden Markov Models. 2325-2328 - Nima Mesgarani, Samuel Thomas, Hynek Hermansky:

Adaptive Stream Fusion in Multistream Recognition of Speech. 2329-2332 - Man-Hung Siu, Herbert Gish, Steve Lowe, Arthur Chan:

Unsupervised Audio Patterns Discovery Using HMM-Based Self-Organized Units. 2333-2336 - John Labiak, Karen Livescu:

Nearest Neighbors with Learned Distances for Phonetic Frame Classification. 2337-2340
Speaker Recognition - Modeling, Automatic Procedures, Analysis III
- Ahilan Kanagasundaram, Robbie Vogt, David Dean, Sridha Sridharan, Michael Mason:

i-vector Based Speaker Recognition on Short Utterances. 2341-2344 - Hanwu Sun, Bin Ma:

Study of Overlapped Speech Detection for NIST SRE Summed Channel Speaker Recognition. 2345-2348 - Zhanyu Ma, Arne Leijon:

Super-Dirichlet Mixture Models Using Differential Line Spectral Frequencies for Text-Independent Speaker Identification. 2349-2352 - Hon-Bill Yu, Man-Wai Mak:

Comparison of Voice Activity Detectors for Interview Speech in NIST Speaker Recognition Evaluation. 2353-2356 - Achintya Kumar Sarkar, Srinivasan Umesh:

Eigen-Voice Based Anchor Modeling System for Speaker Identification Using MLLR Super-Vector. 2357-2360 - Wen Wang, Andreas Kathol, Harry Bratt:

Automatic Detection of Speaker Attributes Based on Utterance Text. 2361-2364 - Sandro Cumani, Pier Domenico Batzu, Daniele Colibro, Claudio Vair, Pietro Laface, Vasileios Vasilakakis:

Comparison of Speaker Recognition Approaches for Real Applications. 2365-2368 - Tim Polzehl, Sebastian Möller, Florian Metze:

Modeling Speaker Personality Using Voice. 2369-2372 - Marc Ferras, Koichi Shinoda, Sadaoki Furui:

Structural Joint Factor Analysis for Speaker Recognition. 2373-2376 - Sangeeta Biswas, Marc Ferras, Koichi Shinoda, Sadaoki Furui:

Acoustic Forest for SMAP-Based Speaker Verification. 2377-2380 - Garimella S. V. S. Sivaram, Samuel Thomas, Hynek Hermansky:

Mixture of Auto-Associative Neural Networks for Speaker Verification. 2381-2384
Speech Audio Analysis and Classification
- Seppo Fagerlund, Unto K. Laine:

Stop Consonant Recognition by Temporal Fine Structure of Burst. 2385-2388 - Katrin Kirchhoff, Andrei Alexandrescu:

Phonetic Classification Using Controlled Random Walks. 2389-2392 - Luís Marujo, Márcio Viveiros, João Paulo Neto:

Keyphrase Cloud Generation of Broadcast News. 2393-2396 - Alfonso M. Canterla, Magne Hallstein Johnsen:

Optimized Feature Extraction and HMMs in Subword Detectors. 2397-2400 - Ziqiang Shi, Jiqing Han, Tieran Zheng:

Real-World Speech/Non-Speech Audio Classification Based on Sparse Representation Features and GPCs. 2401-2404 - Manas A. Pathak, Bhiksha Raj:

Privacy Preserving Speaker Verification Using Adapted GMMs. 2405-2408 - Éva Székely, João P. Cabral, Peter Cahill, Julie Carson-Berndsen:

Clustering Expressive Speech Styles in Audiobooks Using Glottal Source Parameters. 2409-2412 - Bogdan Ludusan, Antonio Origlia, Francesco Cutugno:

On the Use of the Rhythmogram for Automatic Syllabic Prominence Detection. 2413-2416 - Sethserey Sam, Xiong Xiao, Laurent Besacier, Eric Castelli, Haizhou Li, Chng Eng Siong:

Speech Modulation Features for Robust Nonnative Speech Accent Detection. 2417-2420 - Chi Zhang, John H. L. Hansen:

Frame-Level Vocal Effort Likelihood Space Modeling for Improved Whisper-Island Detection. 2421-2424 - Xing Fan, John H. L. Hansen:

Speaker Identification for Whispered Speech Using a Training Feature Transformation from Neutral to Whisper. 2425-2428 - Andrea DeMarco, Stephen J. Cox:

An Accurate and Robust Gender Identification Algorithm. 2429-2432 - Xiaohong Yang, Qingcai Chen, Shusen Zhou, Xiaolong Wang:

Deep Belief Networks for Automatic Music Genre Classification. 2433-2436 - Jonathan William Dennis, Tran Huy Dat, Haizhou Li:

Image Representation of the Subband Power Distribution for Robust Sound Classification. 2437-2440 - Bo Xiao, Viktor Rozgic, Athanasios Katsamanis, Brian R. Baucom, Panayiotis G. Georgiou, Shrikanth S. Narayanan:

Acoustic and Visual Cues of Turn-Taking Dynamics in Dyadic Interactions. 2441-2444
Human Speech and Sound Perception II
- Alexandra Jesse, Holger Mitterer:

Pointing Gestures do not Influence the Perception of Lexical Stress. 2445-2448 - Ian R. Cushing, Francis F. Li, Ken Worrall, Tim D. Jackson:

Relationships Between Phonetic Features and Speech Perception - A Statistical Investigation from a Large Anechoic British English Corpus. 2449-2452 - Guy J. Brown, Tim Jürgens, Ray Meddis, Matthew Robertson, Nicholas R. Clark:

The Representation of Speech in a Nonlinear Auditory Model: Time-Domain Analysis of Simulated Auditory-Nerve Firing Patterns. 2453-2456 - Luís Pinto Coelho, Daniela Braga, Miguel Sales Dias, Carmen García-Mateo:

An Automatic Voice Pleasantness Classification System Based on Prosodic and Acoustic Patterns of Voice Preference. 2457-2460 - René Carré, Pierre L. Divenyi, Willy Serniclaes, Emmanuel Ferragne, Egidio Marsico, Viet Son Nguyen:

Contributions of F1 and F2 (F2') to the Perception of Plosive Consonants. 2461-2464 - Jeesun Kim, Chris Davis:

Auditory Speech Processing is Affected by Visual Speech in the Periphery. 2465-2468 - Tim Paris, Jeesun Kim, Chris Davis:

Visual Speech Speeds Up Auditory Identification Responses. 2469-2472 - Ryoichi Takashima, Tohru Nagano, Ryuki Tachibana, Masafumi Nishimura:

Agglomerative Hierarchical Clustering of Emotions in Speech Based on Subjective Relative Similarity. 2473-2476 - Guangting Mai, Gang Peng:

Optimal Syllabic Rates and Processing Units in Perceiving Mandarin Spoken Sentences. 2477-2480 - Mirjam Wester, Hui Liang:

Cross-Lingual Speaker Discrimination Using Natural and Synthetic Speech. 2481-2484
Speech Audio Analysis
- Yongzhe Shi, Weiqiang Zhang, Jia Liu:

Robust Audio Fingerprinting Based on Local Spectral Luminance Maxima Scheme. 2485-2488 - Unto K. Laine:

Entropy-Rate Driven Inference of Stochastic Grammars. 2489-2492 - Sheng-Chieh Lee, K. Bharanitharan, Bo-Wei Chen, Jhing-Fa Wang, Chung-Hsien Wu, Min-Jian Liao:

An Efficient Pre-Processing Scheme to Improve the Sound Source Localization System in Noisy Environment. 2493-2496 - Guylaine Le Jan, Yannick Benezeth, Guillaume Gravier, Frédéric Bimbot:

A Study on Auditory Feature Spaces for Speech-Driven Lip Animation. 2497-2500 - Erfan Loweimi, Seyed Mohammad Ahadi, Hamid Sheikhzadeh:

Phase-Only Speech Reconstruction Using Very Short Frames. 2501-2504 - Trond Skogstad, Torbjørn Svendsen:

Frequency-Warped and Stabilized Time-Varying Cepstral Coefficients. 2505-2508 - Freddy William, Abhijeet Sangwan, John H. L. Hansen:

Using Human Perception for Automatic Accent Assessment. 2509-2512 - Carlos Molina, Sungbok Lee, Shrikanth S. Narayanan, Néstor Becerra Yoma:

A Study of the Effectiveness of Articulatory Strokes for Phonemic Recognition. 2513-2516 - Erika Okamoto, Toshio Irino, Ryuichi Nisimura, Hideki Kawahara:

Auditory Filterbank Improves Voice Morphing. 2517-2520 - Anna Katharina Fuchs, Christian Feldbauer, Michael Stark:

Monaural Sound Localization. 2521-2524
Speech Coding
- Masahiro Fukui, Shigeaki Sasaki, Yusuke Hiwasaki, Sachiko Kurihara, Yoichi Haneda:

Dual-Mode AVQ Coding Based on Spectral Masking and Sparseness Detection for ITU-T G.711.1/G.722 Super-Wideband Extensions. 2525-2528 - Azar Taufique, Kumaran Vijayasankar, Wooil Kim, John H. L. Hansen, Marco Tacca, Andrea Fumagalli:

Phone Impact Based Speech Transmission Technique for Reliable Speech Recognition in Poor Wireless Network Conditions. 2529-2532 - Jingting Zhou, Daniel Garcia-Romero, Carol Y. Espy-Wilson:

Automatic Speech Codec Identification with Applications to Tampering Detection of Speech Recordings. 2533-2536 - Chang-Heon Lee, Olivier Rosec, Yannis Stylianou:

A Hybrid Quasi-Harmonic/CELP Wideband Speech Coding Scheme for Unit Selection TTS Synthesis. 2537-2540 - Anssi Rämö, Henri Toukomaa:

Voice Quality Characterization of IETF Opus Codec. 2541-2544 - Christian Fischer Pedersen:

Leja Ordering LSFs for Accurate Estimation of Predictor Coefficients. 2545-2548 - Qipeng Gong, Peter Kabal:

Improved Quality for Conversational VoIP Using Path Diversity. 2549-2552 - Abdul Hannan Khan, Peter Kabal:

Tree Encoding for the ITU-T G.711.1 Speech Coder. 2553-2556 - Dong Wang, Ravichander Vipperla, Nicholas W. D. Evans:

Parallel and Hierarchical Decision Making for Sparse Coding in Speech Recognition. 2557-2560 - Chen-Yu Chiang, Jyh-Her Yang, Ming-Chieh Liu, Yih-Ru Wang, Yuan-Fu Liao, Sin-Horng Chen:

A New Model-Based Mandarin-Speech Coding System. 2561-2564
Robustness and Adaptation for ASR
- Petr Cerva, Karel Palecek, Jan Silovský, Jan Nouza:

Using Unsupervised Feature-Based Speaker Adaptation for Improved Transcription of Spoken Archives. 2565-2568 - Volker Fischer, Siegfried Kunzmann:

Online Speaker Adaptation with Pre-Computed FMLLR Transformations. 2569-2572 - Diego Giuliani, Fabio Brugnara:

Instantaneous Speaker Adaptation Through Selection and Combination of fMLLR Transformation Matrices. 2573-2576 - Hwa Jeon Song, Yunkeun Lee, Hyung Soon Kim:

Joint Bilinear Transformation Space Based Maximum a posteriori Linear Regression Adaptation Using Prior with Variance Function. 2577-2580 - Doddipatla Rama Sanand, Mikko Kurimo:

A Study on Combining VTLN and SAT to Improve the Performance of Automatic Speech Recognition. 2581-2584 - Yu Tsao, Paul R. Dixon, Chiori Hori, Hisashi Kawai:

Incorporating Regional Information to Enhance MAP-Based Stochastic Feature Compensation for Robust Speech Recognition. 2585-2588 - Shweta Ghai, Rohit Sinha:

A Study on the Effect of Pitch on LPCC and PLPC Features for Children's ASR in Comparison to MFCC. 2589-2592 - Denis Jouvet, Dominique Fohr, Irina Illina:

About Handling Boundary Uncertainty in a Speaking Rate Dependent Modeling Approach. 2593-2596 - Ji Wu, Zhiyang He, Ping Lv:

An Active Learning Approach to Task Adaptation. 2597-2600 - Vikas Joshi, Raghavendra Bilgi, Srinivasan Umesh, M. Carmen Benítez, Luz García:

Efficient Speaker and Noise Normalization for Robust Speech Recognition. 2601-2604 - Thomas Winkler:

How Realistic is Artificially Added Noise? 2605-2608
Voice Activity Detection
- Masashi Unoki, Xugang Lu, Rico Petrick, Shota Morita, Masato Akagi, Rüdiger Hoffmann:

Voice Activity Detection in MTF-Based Power Envelope Restoration. 2609-2612 - Miquel Espi, Shigeki Miyabe, Takuya Nishimoto, Nobutaka Ono, Shigeki Sagayama:

Using Spectral Fluctuation of Speech in Multi-Feature HMM-Based Voice Activity Detection. 2613-2616 - Kannu Mehta, Chau Khoa Pham, Chng Eng Siong:

Linear Dynamic Models for Voice Activity Detection. 2617-2620 - Jouni Pohjalainen, Tuomo Raitio, Paavo Alku:

Detection of Shouted Speech in the Presence of Ambient Noise. 2621-2624 - Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura:

Breath-Detection-Based Telephony Speech Phrasing. 2625-2628 - Gibak Kim:

Multi-Channel Voice Activity Detection Based on Conic Constraints. 2629-2632 - Theodore Petsatodis, Fotios Talantzis, Christos Boukis, Zheng-Hua Tan, Ramjee Prasad:

Multi-Sensor Voice Activity Detection Based on Multiple Observation Hypothesis Testing. 2633-2636 - Chao Gao, Guruprasad Saikumar, Saurabh Khanwalkar, Avi Herscovici, Anoop Kumar, Amit Srivastava, Premkumar Natarajan:

Online Speech Activity Detection in Broadcast News. 2637-2640 - Daniel Reich, Felix Putze, Dominic Heger, Joris IJsselmuiden, Rainer Stiefelhagen, Tanja Schultz:

Tue-SeA Real-Time Speech Command Detector for a Smart Control Room. 2641-2644 - Ekapol Chuangsuwanich, James R. Glass:

Robust Voice Activity Detector for Real World Applications Using Harmonicity and Modulation Frequency. 2645-2648 - Tomas Dekens, Werner Verhelst:

On Noise Robust Voice Activity Detection. 2649-2652 

