


INTERSPEECH/EUROSPEECH 2005: Lisbon, Portugal
9th European Conference on Speech Communication and Technology, INTERSPEECH-Eurospeech 2005, Lisbon, Portugal, September 4-8, 2005. ISCA, 2005.

Keynote Papers
- Graeme M. Clark: The multiple-channel cochlear implant: interfacing electronic technology to human consciousness. 1-4
Speech Recognition - Language Modelling I-III
- Yik-Cheung Tam, Tanja Schultz: Dynamic language model adaptation using variational Bayes inference. 5-8
- Vidura Seneviratne, Steve J. Young: The hidden vector state language model. 9-12
- Shinsuke Mori, Gakuto Kurata: Class-based variable memory length Markov model. 13-16
- Alexander Gruenstein, Chao Wang, Stephanie Seneff: Context-sensitive statistical language modeling. 17-20
- Chao Wang, Stephanie Seneff, Grace Chung: Language model data filtering via user simulation and dialogue resynthesis. 21-24
- Jen-Tzung Chien, Meng-Sung Wu, Chia-Sheng Wu: Bayesian learning for latent semantic analysis. 25-28
Prosody in Language Performance I, II
- Daniel Hirst, Caroline Bouzon: The effect of stress and boundaries on segmental duration in a corpus of authentic speech (British English). 29-32
- Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa: Investigation of the relationship between turn-taking and prosodic features in spontaneous dialogue. 33-36
- Michiko Watanabe, Keikichi Hirose, Yasuharu Den, Nobuaki Minematsu: Filled pauses as cues to the complexity of following phrases. 37-40
- Katrin Schneider, Bernd Möbius: Perceptual magnet effect in German boundary tones. 41-44
- Angela Grimm, Jochen Trommer: Constraints on the acquisition of simplex and complex words in German. 45-48
- Julien Meyer: Whistled speech: a natural phonetic description of languages adapted to human perception and to the acoustical environment. 49-52
Spoken Language Extraction / Retrieval I, II
- Olivier Siohan, Michiel Bacchiani: Fast vocabulary-independent audio search using path-based graph indexing. 53-56
- John Makhoul, Alex Baron, Ivan Bulyko, Long Nguyen, Lance A. Ramshaw, David Stallard, Richard M. Schwartz, Bing Xiang: The effects of speech recognition and punctuation on information extraction performance. 57-60
- Ciprian Chelba, Alex Acero: Indexing uncertainty for spoken document search. 61-64
- Tomoyosi Akiba, Hiroyuki Abe: Exploiting passage retrieval for n-best rescoring of spoken questions. 65-68
- BalaKrishna Kolluru, Heidi Christensen, Yoshihiko Gotoh: Multi-stage compaction approach to broadcast news summarisation. 69-72
- Chien-Lin Huang, Chia-Hsin Hsieh, Chung-Hsien Wu: Audio-video summarization of TV news using speech recognition and shot change detection. 73-76
The Blizzard Challenge 2005
- Alan W. Black, Keiichi Tokuda: The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets. 77-80
- Shinsuke Sakai, Han Shu: A probabilistic approach to unit selection for corpus-based speech synthesis. 81-84
- John Kominek, Christina L. Bennett, Brian Langner, Arthur R. Toth: The Blizzard Challenge 2005 CMU entry - a method for improving speech synthesis systems. 85-88
- H. Timothy Bunnell, Christopher A. Pennington, Debra Yarrington, John Gray: Automatic personal synthetic voice construction. 89-92
- Heiga Zen, Tomoki Toda: An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005. 93-96
- Wael Hamza, Raimo Bakis, Zhiwei Shuang, Heiga Zen: On building a concatenative speech synthesis system from the Blizzard Challenge speech databases. 97-100
- Robert A. J. Clark, Korin Richmond, Simon King: Multisyn voices from ARCTIC data for the Blizzard Challenge. 101-104
- Christina L. Bennett: Large scale evaluation of corpus-based synthesizers: results and lessons from the Blizzard Challenge 2005. 105-108
New Applications
- Berlin Chen, Yi-Ting Chen, Chih-Hao Chang, Hung-Bin Chen: Speech retrieval of Mandarin broadcast news via mobile devices. 109-112
- Michiaki Katoh, Kiyoshi Yamamoto, Jun Ogata, Takashi Yoshimura, Futoshi Asano, Hideki Asoh, Nobuhiko Kitawaki: State estimation of meetings by information fusion using Bayesian network. 113-116
- Roger K. Moore: Results from a survey of attendees at ASRU 1997 and 2003. 117-120
- Reinhold Haeb-Umbach, Basilis Kladis, Joerg Schmalenstroeer: Speech processing in the networked home environment - a view on the Amigo project. 121-124
- Masahide Sugiyama: Fixed distortion segmentation in efficient sound segment searching. 125-128
- Tin Lay Nwe, Haizhou Li: Identifying singers of popular songs. 129-132
- Jun Ogata, Masataka Goto: Speech repair: quick error correction just by using selection operation for speech input interfaces. 133-136
- Dirk Olszewski, Fransiskus Prasetyo, Klaus Linhard: Steerable highly directional audio beam loudspeaker. 137-140
- Hassan Ezzaidi, Jean Rouat: Automatic music genre classification using second-order statistical measures for the prescriptive approach. 141-144
- Alberto Abad, Dusan Macho, Carlos Segura, Javier Hernando, Climent Nadeu: Effect of head orientation on the speaker localization performance in smart-room environment. 145-148
- Corinne Fredouille, Gilles Pouchoulin, Jean-François Bonastre, M. Azzarello, Antoine Giovanni, Alain Ghio: Application of automatic speaker recognition techniques to pathological voice assessment (dysphonia). 149-152
- Upendra V. Chaudhari, Ganesh N. Ramaswamy, Edward A. Epstein, Sasha Caskey, Mohamed Kamal Omar: Adaptive speech analytics: system, infrastructure, and behavior. 153-156
E-learning and Spoken Language Processing
- Katherine Forbes-Riley, Diane J. Litman: Correlating student acoustic-prosodic profiles with student learning in spoken tutoring dialogues. 157-160
- Diane J. Litman, Katherine Forbes-Riley: Speech recognition performance and learning in spoken dialogue tutoring. 161-164
- Satoshi Asakawa, Nobuaki Minematsu, Toshiko Isei-Jaakkola, Keikichi Hirose: Structural representation of the non-native pronunciations. 165-168
- Fu-Chiang Chou: Ya-ya language box - a portable device for English pronunciation training with speech recognition technologies. 169-172
- Akinori Ito, Yen-Ling Lim, Motoyuki Suzuki, Shozo Makino: Pronunciation error detection method based on error rule clustering using a decision tree. 173-176
- Abhinav Sethy, Shrikanth S. Narayanan, Nicolaus Mote, W. Lewis Johnson: Modeling and automating detection of errors in Arabic language learner speech. 177-180
- Felicia Zhang, Michael Wagner: Effects of F0 feedback on the learning of Chinese tones by native speakers of English. 181-184
E-inclusion and Spoken Language Processing I, II
- Tom Brøndsted, Erik Aaskoven: Voice-controlled internet browsing for motor-handicapped users: design and implementation issues. 185-188
- Briony Williams, Delyth Prys, Ailbhe Ní Chasaide: Creating an ongoing research capability in speech technology for two minority languages: experiences from the WISPR project. 189-192
- Anestis Vovos, Basilis Kladis, Nikolaos D. Fakotakis: Speech operated smart-home control system for users with special needs. 193-196
- Takatoshi Jitsuhiro, Shigeki Matsuda, Yutaka Ashikari, Satoshi Nakamura, Ikuko Eguchi Yairi, Seiji Igi: Spoken dialog system and its evaluation of geographic information system for elderly persons' mobility support. 197-200
- Daniele Falavigna, Toni Giorgino, Roberto Gretter: A frame based spoken dialog system for home care. 201-204
Acoustic Processing for ASR I-III
- Matthias Wölfel: Frame based model order selection of spectral envelopes. 205-208
- Vivek Tyagi, Christian Wellekens, Hervé Bourlard: On variable-scale piecewise stationary spectral analysis of speech signals for ASR. 209-212
- Arlo Faria, David Gelbart: Efficient pitch-based estimation of VTLN warp factors. 213-216
- Yanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Daniel Jurafsky, Rebecca Starr, Su-Youn Yoon: Accent detection and speech recognition for Shanghai-accented Mandarin. 217-220
- Loïc Barrault, Renato de Mori, Roberto Gemello, Franco Mana, Driss Matrouf: Variability of automatic speech recognition systems using different features. 221-224
- Slavomír Lihan, Jozef Juhár, Anton Cizmar: Crosslingual and bilingual speech recognition with Slovak and Czech SpeechDat-E databases. 225-228
- Carmen Peláez-Moreno, Qifeng Zhu, Barry Y. Chen, Nelson Morgan: Automatic data selection for MLP-based feature extraction for ASR. 229-232
- Thilo Köhler, Christian Fügen, Sebastian Stüker, Alex Waibel: Rapid porting of ASR-systems to mobile devices. 233-236
- Hugo Meinedo, João Paulo Neto: A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN models. 237-240
- Etienne Marcheret, Karthik Visweswariah, Gerasimos Potamianos: Speech activity detection fusing acoustic phonetic and energy features. 241-244
- Zoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó: Robust voice activity detection based on the entropy of noise-suppressed spectrum. 245-248
- Masamitsu Murase, Shun'ichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno: Multiple moving speaker tracking by microphone array on mobile robot. 249-252
Speech Recognition - Adaptation I, II
- Yaxin Zhang, Bian Wu, Xiaolin Ren, Xin He: A speaker biased SI recognizer for embedded mobile applications. 253-256
- Bart Bakker, Carsten Meyer, Xavier L. Aubert: Fast unsupervised speaker adaptation through a discriminative eigen-MLLR algorithm. 257-260
- Rusheng Hu, Jian Xue, Yunxin Zhao: Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications. 261-264
- Giulia Garau, Steve Renals, Thomas Hain: Applying vocal tract length normalization to meeting recordings. 265-268
- Srinivasan Umesh, András Zolnay, Hermann Ney: Implementing frequency-warping and VTLN through linear transformation of conventional MFCC. 269-272
- Xiaodong Cui, Abeer Alwan: MLLR-like speaker adaptation based on linearization of VTLN with MFCC features. 273-276
- Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama: Model adaptation by state splitting of HMM for long reverberation. 277-280
- Daben Liu, Daniel Kiecza, Amit Srivastava, Francis Kubala: Online speaker adaptation and tracking for real-time speech recognition. 281-284
- Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa: Automatic speech recognition based on adaptation and clustering using temporal-difference learning. 285-288
- Hui Ye, Steve J. Young: Improving the speech recognition performance of beginners in spoken conversational interaction for language learning. 289-292
- Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano: Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments. 293-296
- Dong-jin Choi, Yung-Hwan Oh: Rapid speaker adaptation for continuous speech recognition using merging eigenvoices. 297-300
Signal Analysis, Processing and Feature Estimation I-III
- Jian Liu, Thomas Fang Zheng, Jing Deng, Wenhu Wu: Real-time pitch tracking based on combined SMDSF. 301-304
- András Bánhalmi, Kornél Kovács, András Kocsor, László Tóth: Fundamental frequency estimation by least-squares harmonic model fitting. 305-308
- Siu Wa Lee, Frank K. Soong, Pak-Chung Ching: Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input. 309-312
- Marián Képesi, Luis Weruaga: High-resolution noise-robust spectral-based pitch estimation. 313-316
- John-Paul Hosom: F0 estimation for adult and children's speech. 317-320
- Ben Milner, Xu Shao, Jonathan Darch: Fundamental frequency and voicing prediction from MFCCs for speech reconstruction from unconstrained speech. 321-324
- Nelly Barbot, Olivier Boëffard, Damien Lolive: F0 stylisation with a free-knot B-spline model and simulated-annealing optimization. 325-328
- Friedhelm R. Drepper: Voiced excitation as entrained primary response of a reconstructed glottal master oscillator. 329-332
- Damien Vincent, Olivier Rosec, Thierry Chonavel: Estimation of LF glottal source parameters based on an ARX model. 333-336
- Leigh D. Alsteris, Kuldip K. Paliwal: Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra. 337-340
- R. Muralishankar, Abhijeet Sangwan, Douglas D. O'Shaughnessy: Statistical properties of the warped discrete cosine transform cepstrum compared with MFCC. 341-344
- Aníbal J. S. Ferreira: New signal features for robust identification of isolated vowels. 345-348
- Jonathan Pincas, Philip J. B. Jackson: Amplitude modulation of frication noise by voicing saturates. 349-352
- Ron M. Hecht, Naftali Tishby: Extraction of relevant speech features using the information bottleneck method. 353-356
- Mohammad Firouzmand, Laurent Girin, Sylvain Marchand: Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech. 357-360
- Hynek Hermansky, Petr Fousek: Multi-resolution RASTA filtering for TANDEM-based ASR. 361-364
- Woojay Jeon, Biing-Hwang Juang: A category-dependent feature selection method for speech signals. 365-368
- Trausti T. Kristjansson, Sabine Deligne, Peder A. Olsen: Voicing features for robust speech detection. 369-372
Robust Speech Recognition I-IV
- Svein Gunnar Pettersen, Magne Hallstein Johnsen, Tor André Myrvoll: Joint Bayesian predictive classification and parallel model combination for robust speech recognition. 373-376
- Glauco F. G. Yared, Fábio Violaro, Lívio C. Sousa: Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems. 377-380
- Luis Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega: Robust speech recognition in cars using phoneme dependent multi-environment linear normalization. 381-384
- Yi Chen, Lin-Shan Lee: Energy-based frame selection for reliable feature normalization and transformation in robust speech recognition. 385-388
- Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell: Remodeling of the sensor for non-audible murmur (NAM). 389-392
- Amarnag Subramanya, Jeff A. Bilmes, Chia-Ping Chen: Focused word segmentation for ASR. 393-396
Speech Perception I, II
- Jennifer A. Alexander, Patrick C. M. Wong, Ann R. Bradlow: Lexical tone perception in musicians and non-musicians. 397-400
- Joan K.-Y. Ma, Valter Ciocca, Tara L. Whitehill: Contextual effect on perception of lexical tones in Cantonese. 401-404
- Hansjörg Mixdorff, Yu Hu, Denis Burnham: Visual cues in Mandarin tone perception. 405-408
- Hansjörg Mixdorff, Yu Hu: Cross-language perception of word stress. 409-412
- Anne Cutler: The lexical statistics of word recognition problems caused by L2 phonetic confusion. 413-416
- Chun-Fang Huang, Masato Akagi: A multi-layer fuzzy logical model for emotional speech perception. 417-420
Spoken Language Understanding I, II
- Ian R. Lane, Tatsuya Kawahara: Utterance verification incorporating in-domain confidence and discourse coherence measures. 421-424
- Constantinos Boulis, Mari Ostendorf: Using symbolic prominence to help design feature subsets for topic classification and clustering of natural human-human conversations. 425-428
- Katsuhito Sudoh, Hajime Tsukada: Tightly integrated spoken language understanding using word-to-concept translation. 429-432
- Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Vaibhava Goel, Yuqing Gao: Exploiting unlabeled data using multiple classifiers for improved natural language call-routing. 433-436
- Hong-Kwang Jeff Kuo, Vaibhava Goel: Active learning with minimum expected error for spoken language understanding. 437-440
- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske: Lexical out-of-vocabulary models for one-stage speech interpretation. 441-444
E-inclusion and Spoken Language Processing I, II
- Mark S. Hawley, Phil D. Green, Pam Enderby, Stuart P. Cunningham, Roger K. Moore: Speech technology for e-inclusion of people with physical disabilities and disordered speech. 445-448
- Björn Granström: Speech technology for language training and e-inclusion. 449-452
- Roger C. F. Tucker, Ksenia Shalonova: Supporting the creation of TTS for local language voice information systems. 453-456
- Ove Andersen, Christian Hjulmand: Access for all - a talking internet service. 457-460
- Knut Kvale, Narada D. Warakagoda: A speech centric mobile multimodal service useful for dyslectics and aphasics. 461-464
Paralinguistic and Nonlinguistic Information in Speech
- Nick Campbell, Hideki Kashioka, Ryo Ohara: No laughing matter. 465-468
- Christophe Blouin, Valérie Maffiolo: A study on the automatic detection and characterization of emotion in a voice service context. 469-472
- Raul Fernandez, Rosalind W. Picard: Classical and novel discriminant features for affect recognition from speech. 473-476
- Jaroslaw Cichosz, Krzysztof Slot: Low-dimensional feature space derivation for emotion recognition. 477-480
- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita: Proposal of acoustic measures for automatic detection of vocal fry. 481-484
- Khiet P. Truong, David A. van Leeuwen: Automatic detection of laughter. 485-488
- Anton Batliner, Stefan Steidl, Christian Hacker, Elmar Nöth, Heinrich Niemann: Tales of tuning - prototyping for automatic classification of emotional user states. 489-492
- Iker Luengo, Eva Navas, Inmaculada Hernáez, Jon Sánchez: Automatic emotion recognition using prosodic parameters. 493-496
- Sungbok Lee, Serdar Yildirim, Abe Kazemzadeh, Shrikanth S. Narayanan: An articulatory study of emotional speech production. 497-500
- Gregor Hofer, Korin Richmond, Robert A. J. Clark: Informed blending of databases for emotional speech synthesis. 501-504
- Fabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato: Emotional FESTIVAL-MBROLA TTS synthesis. 505-508
- Felix Burkhardt: Emofilt: the simulation of emotional speech by prosody-transformation. 509-512
- Andrew Rosenberg, Julia Hirschberg: Acoustic/prosodic and lexical correlates of charismatic speech. 513-516
- Yoko Greenberg, Minoru Tsuzaki, Hiroaki Kato, Yoshinori Sagisaka: Communicative speech synthesis using constituent word attributes. 517-520
- Angelika Braun, Matthias Katerbow: Emotions in dubbed speech: an intercultural approach with respect to F0. 521-524
- Nicolas Audibert, Véronique Aubergé, Albert Rilliard: The prosodic dimensions of emotion in speech: the relative weights of parameters. 525-528
- Susanne Schötz: Stimulus duration and type in perception of female and male speaker age. 529-532
- Cecilia Ovesdotter Alm, Richard Sproat: Perceptions of emotions in expressive storytelling. 533-536
- Hideki Kawahara, Alain de Cheveigné, Hideki Banno, Toru Takahashi, Toshio Irino: Nearly defect-free F0 trajectory extraction for expressive speech modifications based on STRAIGHT. 537-540
- Tomoko Yonezawa, Noriko Suzuki, Kenji Mase, Kiyoshi Kogure: Gradually changing expression of singing voice based on morphing. 541-544
Issues in Large Vocabulary Decoding
- I. Lee Hetherington: A multi-pass, dynamic-vocabulary approach to real-time, large-vocabulary speech recognition. 545-548
- George Saon, Daniel Povey, Geoffrey Zweig: Anatomy of an extremely fast LVCSR decoder. 549-552
- Dong Yu, Li Deng, Alex Acero: Evaluation of a long-contextual-span hidden trajectory model and phonetic recognizer using A* lattice search. 553-556
- Takaaki Hori, Atsushi Nakamura: Generalized fast on-the-fly composition algorithm for WFST-based speech recognition. 557-560
- Hiroaki Nanjo, Teruhisa Misu, Tatsuya Kawahara: Minimum Bayes-risk decoding considering word significance for information retrieval system. 561-564
- Arthur Chan, Mosur Ravishankar, Alexander I. Rudnicky: On improvements to CI-based GMM selection. 565-568
- Dominique Massonié, Pascal Nocera, Georges Linarès: Scalable language model look-ahead for LVCSR. 569-572
- Miroslav Novak: Memory efficient approximative lattice generation for grammar based decoding. 573-576
- Dong-Hoon Ahn, Su-Byeong Oh, Minhwa Chung: Improved semi-dynamic network decoding using WFSTs. 577-580
- Janne Pylkkönen: New pruning criteria for efficient decoding. 581-584
- Tibor Fábián, Robert Lieb, Günther Ruske, Matthias Thomae: A confidence-guided dynamic pruning approach - utilization of confidence measurement in speech recognition. 585-588
Spoken Language Extraction / Retrieval I, II
- Toru Taniguchi, Akishige Adachi, Shigeki Okawa, Masaaki Honda, Katsuhiko Shirai: Discrimination of speech, musical instruments and singing voices using the temporal patterns of sinusoidal segments in audio signals. 589-592
- Gabriel Murray, Steve Renals, Jean Carletta: Extractive summarization of meeting recordings. 593-596
- Arjan van Hessen, Jaap Hinke: IR-based classification of customer-agent phone calls. 597-600
- Benoît Favre, Frédéric Béchet, Pascal Nocera: Mining broadcast news data: robust information extraction from word lattices. 601-604
- Mikko Kurimo, Ville T. Turunen: To recover from speech recognition errors in spoken document retrieval. 605-608
- Edgar González, Jordi Turmo: Unsupervised clustering of spontaneous speech documents. 609-612
- Masahide Yamaguchi, Masaru Yamashita, Shoichi Matsunaga: Spectral cross-correlation features for audio indexing of broadcast news and meetings. 613-616
- Chiori Hori, Alex Waibel: Spontaneous speech consolidation for spoken language applications. 617-620
- Sameer Maskey, Julia Hirschberg: Comparing lexical, acoustic/prosodic, structural and discourse features for speech summarization. 621-624
- Te-Hsuan Li, Ming-Han Lee, Berlin Chen, Lin-Shan Lee: Hierarchical topic organization and visual presentation of spoken documents using probabilistic latent semantic analysis (PLSA) for efficient retrieval/browsing applications. 625-628
- Janez Zibert, France Mihelic, Jean-Pierre Martens, Hugo Meinedo, João Paulo Neto, Laura Docío Fernández, Carmen García-Mateo, Petr David, Jindrich Zdánský, Matús Pleva, Anton Cizmar, Andrej Zgank, Zdravko Kacic, Csaba Teleki, Klára Vicsi: The COST278 broadcast news segmentation and speaker clustering evaluation - overview, methodology, systems, results. 629-632
- Igor Szöke, Petr Schwarz, Pavel Matejka, Lukás Burget, Martin Karafiát, Michal Fapso, Jan Cernocký: Comparison of keyword spotting approaches for informal continuous speech. 633-636
- Teruhisa Misu, Tatsuya Kawahara: Dialogue strategy to clarify user's queries for document retrieval system with speech interface. 637-640
- Nicolas Moreau, Shan Jin, Thomas Sikora: Comparison of different phone-based spoken document retrieval methods with text and spoken queries. 641-644
Signal Analysis, Processing and Feature Estimation I-III
- Pedro Gómez, Francisco Díaz Pérez, Agustín Álvarez-Marquina, Rafael Martínez, Victoria Rodellar, Roberto Fernández-Baíllo, Alberto Nieto, Francisco J. Fernandez: PCA of perturbation parameters in voice pathology detection. 645-648
- Anindya Sarkar, T. V. Sreenivas: Dynamic programming based segmentation approach to LSF matrix reconstruction. 649-652
- T. Nagarajan, Douglas D. O'Shaughnessy: Explicit segmentation of speech based on frequency-domain AR modeling. 653-656
- Petr Motlícek, Lukás Burget, Jan Cernocký: Non-parametric speaker turn segmentation of meeting data. 657-660
- Petri Korhonen, Unto K. Laine: Unsupervised segmentation of continuous speech using vector autoregressive time-frequency modeling errors. 661-664
- P. Vijayalakshmi, M. Ramasubba Reddy: The analysis on band-limited hypernasal speech using group delay based formant extraction technique. 665-668
- Jindrich Zdánský, Jan Nouza: Detection of acoustic change-points in audio records via global BIC maximization and dynamic programming. 669-672
- Md. Khademul Islam Molla, Keikichi Hirose, Nobuaki Minematsu: Multi-band approach of audio source discrimination with empirical mode decomposition. 673-676
- Minoru Tsuzaki, Satomi Tanaka, Hiroaki Kato, Yoshinori Sagisaka: Application of auditory image model for speech event detection. 677-680
- José Anibal Arias: Unsupervised identification of speech segments using kernel methods for clustering. 681-684
- Georgios Evangelopoulos, Petros Maragos: Speech event detection using multiband modulation energy. 685-688
- John Kominek, Alan W. Black: Measuring unsupervised acoustic clustering through phoneme pair merge-and-split tests. 689-692
- Fabio Valente, Christian Wellekens: Variational Bayesian speaker change detection. 693-696
- Sarah Borys, Mark Hasegawa-Johnson: Distinctive feature based SVM discriminant features for improvements to phone recognition on telephone band speech. 697-700
- P. Vijayalakshmi, M. Ramasubba Reddy: Detection of hypernasality using statistical pattern classifiers. 701-704
- Luis Weruaga, Marián Képesi: Self-organizing chirp-sensitive artificial auditory cortical model. 705-708
- Sotiris Karabetsos, Pirros Tsiakoulis, Stavroula-Evita Fotinea, Ioannis Dologlou: On the use of a decimative spectral estimation method based on eigenanalysis and SVD for formant and bandwidth tracking of speech signals. 709-712
- Alexei V. Ivanov, Marek Parfieniuk, Alexander A. Petrovsky: Frequency-domain auditory suppression modelling (FASM) - a WDFT-based anthropomorphic noise-robust feature extraction algorithm for speech recognition. 713-716
Keynote Papers
- Fernando C. N. Pereira: Linear models for structure prediction. 717-720
Speech Recognition - Language Modelling I-III
- Chuang-Hua Chueh, To-Chang Chien, Jen-Tzung Chien: Discriminative maximum entropy language model for speech recognition. 721-724
- Maximilian Bisani, Hermann Ney: Open vocabulary speech recognition with flat hybrid models. 725-728
- Minwoo Jeong, Jihyun Eun, Sangkeun Jung, Gary Geunbae Lee: An error-corrective language-model adaptation for automatic speech recognition. 729-732
- Shiuan-Sung Lin, François Yvon: Discriminative training of finite state decoding graphs. 733-736
- Holger Schwenk, Jean-Luc Gauvain: Building continuous space language models for transcribing European languages. 737-740
- Peng Xu, Lidia Mangu: Using random forest language models in the IBM RT-04 CTS system. 741-744
Spoken Language Acquisition, Development and Learning I, II
- Willemijn Heeren: Perceptual development of the duration cue in Dutch /a-a:/. 745-748
- Hong You, Abeer Alwan, Abe Kazemzadeh, Shrikanth S. Narayanan: Pronunciation variations of Spanish-accented English spoken by young children. 749-752
- Willemijn Heeren: L2 development of quantity perception: Dutch listeners learning Finnish /t-t:/. 753-756
- Claudio Zmarich, Serena Bonifacio: Phonetic inventories in Italian children aged 18-27 months: a longitudinal study. 757-760
- Hiroko Hirano, Goh Kawai: Pitch patterns of intonational phrases and intonational phrase groups in native and non-native speech. 761-764
- Rebecca Hincks: Measuring liveliness in presentation speech. 765-768
Multi-modal / Multi-media Processing I, II
- Nick Campbell: Non-verbal speech processing for a communicative agent. 769-772
- Stuart N. Wrigley, Guy J. Brown: Physiologically motivated audio-visual localisation and tracking. 773-776
- Jing Huang, Daniel Povey: Discriminatively trained features using fMPE for multi-stream audio-visual speech recognition. 777-780
- Graziano Tisato, Piero Cosi, Carlo Drioli, Fabio Tesser: INTERFACE: a new tool for building emotive/expressive talking heads. 781-784
- Pascual Ejarque, Javier Hernando: Variance reduction by using separate genuine-impostor statistics in multimodal biometrics. 785-788
- Volker Schubert, Stefan W. Hamerich: The dialog application metalanguage GDialogXML. 789-792
- Jonas Beskow, Mikael Nordenberg: Data-driven synthesis of expressive visual speech using an MPEG-4 talking head. 793-796
- Oytun Türk, Marc Schröder, Baris Bozkurt, Levent M. Arslan: Voice quality interpolation for emotional text-to-speech synthesis. 797-800
- Murtaza Bulut, Carlos Busso, Serdar Yildirim, Abe Kazemzadeh, Chul Min Lee, Sungbok Lee, Shrikanth S. Narayanan: Investigating the role of phoneme-level modifications in emotional speech resynthesis. 801-804
- Björn W. Schuller, Ronald Müller, Manfred K. Lang, Gerhard Rigoll: Speaker independent emotion recognition by early fusion of acoustic and linguistic features within ensembles. 805-808
- Jonghwa Kim, Elisabeth André, Matthias Rehm, Thurid Vogt, Johannes Wagner: Integrating information from speech and physiological signals to achieve emotional sensitivity. 809-812
- Ellen Douglas-Cowie, Laurence Devillers, Jean-Claude Martin, Roddy Cowie, Suzie Savvidou, Sarkis Abrilian, Cate Cox: Multimodal databases of everyday emotion: facing up to complexity. 813-816
Spoken / Multi-modal Dialogue Systems I, II
- Francisco Torres, Emilio Sanchis, Encarna Segarra: Learning of stochastic dialog models through a dialog simulation technique. 817-820
- Lesley-Ann Black, Michael F. McTear, Norman D. Black, Roy Harper, Michelle Lemon: Evaluating the DI@l-log system on a cohort of elderly, diabetic patients: results from a preliminary study. 821-824
- Pavel Král, Christophe Cerisara, Jana Klecková: Combination of classifiers for automatic recognition of dialog acts. 825-828
- Xiaojun Wu, Thomas Fang Zheng, Michael Brasser, Zhanjiang Song: Rapidly developing spoken Chinese dialogue systems with the d-ear SDS SDK. 829-832
- Daniela Oria, Akos Vetek: Robust algorithms and interaction strategies for voice spelling. 833-836
- Ioannis Toptsis, Axel Haasch, Sonja Hüwel, Jannik Fritsch, Gernot A. Fink: Modality integration and dialog management for a robotic assistant. 837-840
- Norbert Reithinger, Daniel Sonntag: An integration framework for a mobile multimodal dialogue system accessing the semantic web. 841-844
- Ryuichi Nisimura, Akinobu Lee, Masashi Yamada, Kiyohiro Shikano: Operating a public spoken guidance system in real environment. 845-848
- Esa-Pekka Salonen, Markku Turunen, Jaakko Hakulinen, Leena Helin, Perttu Prusi, Anssi Kainulainen: Distributed dialogue management for smart terminal devices. 849-852
- Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen: Visualization of spoken dialogue systems for demonstration, debugging and tutoring. 853-856
- César González Ferreras, Valentín Cardeñoso-Payo: Development and evaluation of a spoken dialog system to access a newspaper web site. 857-860
- Olivier Pietquin, Richard Beaufort: Comparing ASR modeling methods for spoken dialogue simulation and optimal strategy learning. 861-864
- Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna, Michael F. McTear: An approach to multi-strategy dialogue management. 865-868
- Anna Hjalmarsson: Towards user modelling in conversational dialogue systems: a qualitative study of the dynamics of dialogue parameters. 869-872
- Kouichi Katsurada, Kazumine Aoki, Hirobumi Yamada, Tsuneo Nitta: Reducing the description amount in authoring MMI applications. 873-876
- Kazunori Komatani, Naoyuki Kanda, Tetsuya Ogata, Hiroshi G. Okuno: Contextual constraints based on dialogue models in database search task for spoken dialogue systems. 877-880
- Mihai Rotaru, Diane J. Litman: Using word-level pitch features to better predict student emotions during spoken tutoring dialogues. 881-884
- Antoine Raux, Brian Langner, Dan Bohus, Alan W. Black, Maxine Eskénazi: Let's go public! Taking a spoken dialog system to the real world. 885-888
- Shinya Fujie, Kenta Fukushima, Tetsunori Kobayashi: Back-channel feedback generation using linguistic and nonlinguistic information and its application to spoken dialogue system. 889-892
- Kallirroi Georgila, James Henderson, Oliver Lemon: Learning user simulations for information state update dialogue systems. 893-896
- Darío Martín-Iglesias, Yago Pereiro-Estevan, Ana I. García-Moral, Ascensión Gallardo-Antolín, Fernando Díaz-de-María: Design of a voice-enabled interface for real-time access to stock exchange from a PDA through GPRS. 897-900
- William Schuler, Tim Miller: Integrating denotational meaning into a DBN language model. 901-904
- Louis ten Bosch: Improving out-of-coverage language modelling in a multimodal dialogue system using small training sets. 905-908
- Olivier Galibert, Gabriel Illouz, Sophie Rosset: Ritel: an open-domain, human-computer dialog system. 909-912
Robust Speech Recognition I-IV
- Reinhold Haeb-Umbach, Joerg Schmalenstroeer:

A comparison of particle filtering variants for speech feature enhancement. 913-916 - Ilyas Potamitis, Nikolaos D. Fakotakis:

Enhancement of mel log-power spectrum of speech using particle filtering. 917-920 - Makoto Shozakai, Goshu Nagino:

Improving robustness of speech recognition performance to aggregate of noises by two-dimensional visualization. 921-924 - Woohyung Lim, Bong Kyoung Kim, Nam Soo Kim:

Feature compensation based on switching linear dynamic model and soft decision. 925-928 - Shilei Huang, Xiang Xie, Jingming Kuang:

Using output probability distribution for improving speech recognition in adverse environment. 929-932 - Eric H. C. Choi:

A generalized framework for compensation of mel-filterbank outputs in feature extraction for robust ASR. 933-936 - Hesham Tolba, Zili Li, Douglas D. O'Shaughnessy:

Robust automatic speech recognition using a perceptually-based optimal spectral amplitude estimator speech enhancement algorithm in various low-SNR environments. 937-940 - Stephen So, Kuldip K. Paliwal:

Improved noise-robustness in distributed speech recognition via perceptually-weighted vector quantisation of filterbank energies. 941-944 - Babak Nasersharif, Ahmad Akbari:

Sub-band weighted projection measure for robust sub-band speech recognition. 945-948 - Jianping Deng, Martin Bouchard, Tet Hin Yeap:

Noise compensation using interacting multiple Kalman filters. 949-952 - Veronique Stouten, Hugo Van hamme, Patrick Wambacq:

Kalman and unscented Kalman filter feature enhancement for noise robust ASR. 953-956 - Chia-Yu Wan, Lin-Shan Lee:

Histogram-based quantization (HQ) for robust and scalable distributed speech recognition. 957-960 - Yong-Joo Chung:

A data-driven approach for the model parameter compensation in noisy speech recognition. 961-964 - Satoshi Kobashikawa, Satoshi Takahashi, Yoshikazu Yamaguchi, Atsunori Ogawa:

Rapid response and robust speech recognition by preliminary model adaptation for additive and convolutional noise. 965-968 - Saurabh Prasad, Stephen A. Zahorian:

Nonlinear and linear transformations of speech features to compensate for channel and noise effects. 969-972 - Motoyuki Suzuki, Yusuke Kato, Akinori Ito, Shozo Makino:

Construction method of acoustic models dealing with various background noises based on combination of HMMs. 973-976 - Haitian Xu, Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg:

Robust speech recognition based on noise and SNR classification - a multiple-model framework. 977-980 - Hwa Jeon Song, Hyung Soon Kim:

Eigen-environment based noise compensation method for robust speech recognition. 981-984 - Martin Graciarena, Horacio Franco, Gregory K. Myers, Victor Abrash:

Robust feature compensation in nonstationary and multiple noise environments. 985-988 - Jasha Droppo, Alex Acero:

Maximum mutual information SPLICE transform for seen and unseen conditions. 989-992 - Sven E. Krüger, Martin Schafföner, Marcel Katz, Edin Andelic, Andreas Wendemuth:

Speech recognition with support vector machines in a hybrid system. 993-996 - Vincent Barreaud, Douglas D. O'Shaughnessy, Jean-Guy Dahan:

Experiments on speaker profile portability. 997-1000 - Daniele Colibro, Luciano Fissore, Claudio Vair, Emanuele Dalmasso, Pietro Laface:

A confidence measure invariant to language and grammar. 1001-1004 - Ken Schutte, James R. Glass:

Robust detection of sonorant landmarks. 1005-1008
Speech Production I
- Amélie Rochet-Capellan, Jean-Luc Schwartz:

The labial-coronal effect and CVCV stability during reiterant speech production: an acoustic analysis. 1009-1012 - Amélie Rochet-Capellan, Jean-Luc Schwartz:

The labial-coronal effect and CVCV stability during reiterant speech production: an articulatory analysis. 1013-1016 - Mitsuhiro Nakamura:

Articulatory constraints and coronal stops: an EPG study. 1017-1020 - Vincent Robert, Brigitte Wrobel-Dautcourt, Yves Laprie, Anne Bonneau:

Strategies of labial coarticulation. 1021-1024 - Jianwu Dang, Jianguo Wei, Takeharu Suzuki, Pascal Perrier:

Investigation and modeling of coarticulation during speech. 1025-1028 - Fang Hu:

Tongue kinematics in diphthong production in Ningbo Chinese. 1029-1032 - Takayuki Arai:

Comparing tongue positions of vowels in oral and nasal contexts. 1033-1036 - Slim Ouni:

Can we retrieve vocal tract dynamics that produced speech? toward a speaker articulatory strategy model. 1037-1040 - Pascal Perrier, Liang Ma, Yohan Payan:

Modeling the production of VCV sequences via the inversion of a biomechanical model of the tongue. 1041-1044 - Xiaochuan Niu, Alexander Kain, Jan P. H. van Santen:

Estimation of the acoustic properties of the nasal tract during the production of nasalized vowels. 1045-1048 - Kohichi Ogata:

A web-based articulatory speech synthesis system for distance education. 1049-1052 - Paavo Alku, Matti Airas, Tom Bäckström, Hannu Pulakka:

Group delay function as a means to assess quality of glottal inverse filtering. 1053-1056 - Eva Björkner, Johan Sundberg, Paavo Alku:

Subglottal pressure and NAQ variation in voice production of classically trained baritone singers. 1057-1060 - Gunnar Fant, Anita Kruckenberg:

Covariation of subglottal pressure, F0 and intensity. 1061-1064 - Javier Pérez, Antonio Bonafonte:

Automatic voice-source parameterization of natural speech. 1065-1068 - Chakir Zeroual, John H. Esling, Lise Crevier-Buchman:

Physiological study of whispered speech in Moroccan Arabic. 1069-1072 - Carla P. Moura, D. Andrade, Luis M. Cunha, Maria J. Cunha, Helena Vilarinho, Henrique Barros, Diamantino Freitas, M. Pais-Clemente:

Voice quality in Down syndrome children treated with rapid maxillary expansion. 1073-1076 - Julien Hanquinet, Francis Grenez, Jean Schoentgen:

Synthesis of disordered speech. 1077-1080 - Julie Fontecave, Frédéric Berthommier:

Quasi-automatic extraction of tongue movement from a large existing speech cineradiographic database. 1081-1084 - Shimon Sapir, Ravit Cohen Mimran:

The working memory token test (WMTT): preliminary findings in young adults with and without dyslexia. 1085-1088 - Sérgio Paulo, Luís C. Oliveira:

Reducing the corpus-based TTS signal degradation due to speaker's word pronunciations. 1089-1092 - Wai-Sum Lee:

A phonetic study of the "er-hua" rimes in Beijing Mandarin. 1093-1096
Acoustic Processing for ASR I-III
- Li Deng, Dong Yu, Alex Acero:

Learning statistically characterized resonance targets in a hidden trajectory model of speech coarticulation and reduction. 1097-1100 - Daniil Kocharov, András Zolnay, Ralf Schlüter, Hermann Ney:

Articulatory motivated acoustic features for speech recognition. 1101-1104 - Shinji Watanabe, Atsushi Nakamura:

Effects of Bayesian predictive classification using variational Bayesian posteriors for sparse training data in speech recognition. 1105-1108 - Yu Tsao, Jinyu Li, Chin-Hui Lee:

A study on separation between acoustic models and its applications. 1109-1112 - Mohamed Afify:

Extended Baum-Welch reestimation of Gaussian mixture models based on reverse Jensen inequality. 1113-1116 - Asela Gunawardana, Milind Mahajan, Alex Acero, John C. Platt:

Hidden conditional random fields for phone classification. 1117-1120
Signal Analysis, Processing and Feature Estimation I-III
- Francesco Gianfelici, Giorgio Biagetti, Paolo Crippa, Claudio Turchetti:

Asymptotically exact AM-FM decomposition based on iterated Hilbert transform. 1121-1124 - Athanassios Katsamanis, Petros Maragos:

Advances in statistical estimation and tracking of AM-FM speech components. 1125-1128 - Jonathan Darch, Ben P. Milner, Saeed Vaseghi:

Formant frequency prediction from MFCC vectors in noisy environments. 1129-1132 - S. R. Mahadeva Prasanna, B. Yegnanarayana:

Detection of vowel onset point events using excitation information. 1133-1136 - João P. Cabral, Luís C. Oliveira:

Pitch-synchronous time-scaling for prosodic and voice quality transformations. 1137-1140 - Yasunori Ohishi, Masataka Goto, Katunobu Itou, Kazuya Takeda:

Discrimination between singing and speaking voices. 1141-1144
Spoken Language Resources and Technology Evaluation I, II
- Douglas A. Jones, Wade Shen, Elizabeth Shriberg, Andreas Stolcke, Teresa M. Kamm, Douglas A. Reynolds:

Two experiments comparing reading with listening for human processing of conversational telephone speech. 1145-1148 - Sylvain Galliano, Edouard Geoffrois, Djamel Mostefa, Khalid Choukri, Jean-François Bonastre, Guillaume Gravier:

The ESTER phase II evaluation campaign for the rich transcription of French broadcast news. 1149-1152 - Takashi Saito:

A method of multi-layered speech segmentation tailored for speech synthesis. 1153-1156 - Sérgio Paulo, Luís C. Oliveira:

Generation of word alternative pronunciations using weighted finite state transducers. 1157-1160 - Helmer Strik, Diana Binnenpoorte, Catia Cucchiarini:

Multiword expressions in spontaneous speech: do we really speak like that? 1161-1164 - Jáchym Kolár, Jan Svec, Stephanie M. Strassel, Christopher Walker, Dagmar Kozlíková, Josef Psutka:

Czech spontaneous speech corpus with structural metadata. 1165-1168
Early Language Acquisition
- Kentaro Ishizuka, Ryoko Mugitani, Hiroko Kato Solvang, Shigeaki Amano:

A longitudinal analysis of the spectral peaks of vowels for a Japanese infant. 1169-1172 - Krisztina Zajdó, Jeannette M. van der Stelt, Ton G. Wempe, Louis C. W. Pols:

Cross-linguistic comparison of two-year-old children's acoustic vowel spaces: contrasting Hungarian with Dutch. 1173-1176 - Britta Lintfert, Katrin Schneider:

Acoustic correlates of contrastive stress in German children. 1177-1180 - Giampiero Salvi:

Ecological language acquisition via incremental model-based clustering. 1181-1184 - Tamami Sudo, Ken Mogi:

Perceptual and linguistic category formation in infants. 1185-1188
Multi-modal / Multi-media Processing I, II
- Raghunandan S. Kumaran, Karthik Narayanan, John N. Gowdy:

Myoelectric signals for multimodal speech recognition. 1189-1192 - Philippe Daubias:

Is color information really useful for lip-reading? (or what is lost when color is not used). 1193-1196 - Islam Shdaifat, Rolf-Rainer Grigat:

A system for audio-visual speech recognition. 1197-1200 - Norihide Kitaoka, Hironori Oshikawa, Seiichi Nakagawa:

Multimodal interface for organization name input based on combination of isolated word recognition and continuous base-word recognition. 1201-1204 - Yosuke Matsusaka:

Recognition of 3-party conversation using prosody and gaze. 1205-1208 - Dongdong Li, Yingchun Yang, Zhaohui Wu:

Combining voiceprint and face biometrics for speaker identification using SDWS. 1209-1212 - Neil Cooke, Martin J. Russell:

Using the focus of visual attention to improve spontaneous speech recognition. 1213-1216 - Sabri Gurbuz:

Real-time outer lip contour tracking for HCI applications. 1217-1220 - Jing Huang, Karthik Visweswariah:

Improving lip-reading with feature space transforms for multi-stream audio-visual speech recognition. 1221-1224 - Hansjörg Mixdorff, Denis Burnham, Guillaume Vignali, Patavee Charnvivit:

Are there facial correlates of Thai syllabic tones? 1225-1228 - Rowan Seymour, Ji Ming, Darryl Stewart:

A new posterior based audio-visual integration method for robust speech recognition. 1229-1232
Bridging the Gap ASR-HSR
- Sorin Dusan, Lawrence R. Rabiner:

On integrating insights from human speech perception into automatic speech recognition. 1233-1236 - Odette Scharenborg:

Parallels between HSR and ASR: how ASR can contribute to HSR. 1237-1240 - Louis ten Bosch, Odette Scharenborg:

ASR decoding in a computational model of human word recognition. 1241-1244 - Viktoria Maier, Roger K. Moore:

An investigation into a simulation of episodic memory for automatic speech recognition. 1245-1248 - Eric Fosler-Lussier, C. Anton Rytting, Soundararajan Srinivasan:

Phonetic ignorance is bliss: investigating the effects of phonetic information reduction on ASR performance. 1249-1252 - Marcus Holmberg, David Gelbart, Ulrich Ramacher, Werner Hemmert:

Automatic speech recognition with neural spike trains. 1253-1256 - Michael J. Carey, Tuan P. Quang:

A speech similarity distance weighting for robust recognition. 1257-1260 - Takao Murakami, Kazutaka Maruyama, Nobuaki Minematsu, Keikichi Hirose:

Japanese vowel recognition based on structural representation of speech. 1261-1264 - Soundararajan Srinivasan, DeLiang Wang:

Modeling the perception of multitalker speech. 1265-1268 - Sue Harding, Jon P. Barker, Guy J. Brown:

Binaural feature selection for missing data speech recognition. 1269-1272 - Thorsten Wesker, Bernd T. Meyer, Kirsten Wagener, Jörn Anemüller, Alfred Mertins, Birger Kollmeier:

Oldenburg logatome speech corpus (OLLO) for speech recognition experiments with humans and machines. 1273-1276
Speech Recognition - Language Modelling I-III
- Jen-Wei Kuo, Berlin Chen:

Minimum word error based discriminative training of language models. 1277-1280 - A. Ghaoui, François Yvon, Chafic Mokbel, Gérard Chollet:

On the use of morphological constraints in n-gram statistical language model. 1281-1284 - Elvira I. Sicilia-Garcia, Ji Ming, Francis Jack Smith:

A posteriori multiple word-domain language model. 1285-1288 - Javier Dieguez-Tirado, Carmen García-Mateo, Antonio Cardenal López:

Effective topic-tree based language model adaptation. 1289-1292 - Abhinav Sethy, Panayiotis G. Georgiou, Shrikanth S. Narayanan:

Building topic specific language models from webdata using competitive models. 1293-1296 - Carlos Troncoso, Tatsuya Kawahara:

Trigger-based language model adaptation for automatic meeting transcription. 1297-1300 - Jacques Duchateau, Dong Hoon Van Uytsel, Hugo Van hamme, Patrick Wambacq:

Statistical language models for large vocabulary spontaneous speech recognition in Dutch. 1301-1304 - Alexandre Allauzen, Jean-Luc Gauvain:

Diachronic vocabulary adaptation for broadcast news transcription. 1305-1308 - Vesa Siivola, Bryan L. Pellom:

Growing an n-gram language model. 1309-1312 - Harald Hüning, Manuel Kirschner, Fritz Class, André Berton, Udo Haiber:

Embedding grammars into statistical language models. 1313-1316 - Simo Broman, Mikko Kurimo:

Methods for combining language models in speech recognition. 1317-1320 - Airenas Vaiciunas, Gailius Raskinis:

Review of statistical modeling of highly inflected Lithuanian using very large vocabulary. 1321-1324 - Genevieve Gorrell, Brandyn Webb:

Generalized Hebbian algorithm for incremental latent semantic analysis. 1325-1328 - Arnar Thor Jensson, Edward W. D. Whittaker, Koji Iwano, Sadaoki Furui:

Language model adaptation for resource deficient languages using translated data. 1329-1332 - Petra Witschel, Sergey Astrov, Gabriele Bakenecker, Josef G. Bauer, Harald Höge:

POS-based language models for large vocabulary speech recognition on embedded systems. 1333-1336
Speech Recognition - Pronunciation Modelling
- Je Hun Jeon, Minhwa Chung:

Automatic generation of domain-dependent pronunciation lexicon with data-driven rules and rule adaptation. 1337-1340 - Michael Tjalve, Mark A. Huckvale:

Pronunciation variation modelling using accent features. 1341-1344 - Khiet P. Truong, Ambra Neri, Febe de Wet, Catia Cucchiarini, Helmer Strik:

Automatic detection of frequent pronunciation errors made by L2-learners. 1345-1348 - Josef Psutka, Pavel Ircing, Josef V. Psutka, Jan Hajic, William J. Byrne, Jirí Mírovský:

Automatic transcription of Czech, Russian, and Slovak spontaneous speech in the MALACH project. 1349-1352 - Stéphane Dupont, Christophe Ris, Laurent Couvreur, Jean-Marc Boite:

A study of implicit and explicit modeling of coarticulation and pronunciation variation. 1353-1356 - Shinya Takahashi, Tsuyoshi Morimoto, Sakashi Maeda, Naoyuki Tsuruta:

Detection of coughs from user utterances using imitated phoneme model. 1357-1360 - V. Ramasubramanian, P. Srinivas, T. V. Sreenivas:

Stochastic pronunciation modeling by ergodic-HMM of acoustic sub-word units. 1361-1364 - Chen Liu, Lynette Melnar:

An automated linguistic knowledge-based cross-language transfer method for building acoustic models for a language without native training data. 1365-1368 - Ghazi Bouselmi, Dominique Fohr, Irina Illina, Jean Paul Haton:

Fully automated non-native speech recognition using confusion-based acoustic model integration. 1369-1372
Prosodic Structure
- Véronique Aubergé, Albert Rilliard:

The focus prosody: more than a simple binary function. 1373-1376 - Martha Dalton, Ailbhe Ní Chasaide:

Peak timing in two dialects of Connaught Irish. 1377-1380 - Janet Fletcher:

Compound rises and "uptalk" in spoken English. 1381-1384 - Li-chiung Yang:

Duration and the temporal structure of Mandarin discourse. 1385-1388 - Bei Wang:

Prosodic realization of split noun phrases in Mandarin Chinese compared in topic and focus contexts. 1389-1392 - Ziyu Xiong:

Downstep effect on disyllabic words of citation forms in standard Chinese. 1393-1396 - Jinfu Ni, Hisashi Kawai, Keikichi Hirose:

Estimation of intonation variation with constrained tone transformations. 1397-1400 - Ho-hsien Pan:

Voice quality of falling tones in Taiwan Min. 1401-1404 - Chiu-yu Tseng, Bau-Ling Fu:

Duration, intensity and pause predictions in relation to prosody organization. 1405-1408 - Jiahong Yuan, Jason M. Brenier, Daniel Jurafsky:

Pitch accent prediction: effects of genre and speaker. 1409-1412 - Hiroya Fujisaki, Sumio Ohno:

Analysis and modeling of fundamental frequency contours of Hindi utterances. 1413-1416 - Natasha Govender, Etienne Barnard, Marelie H. Davel:

Fundamental frequency and tone in isiZulu: initial experiments. 1417-1420 - Judith Bishop, Marc Peake, Dmitry Sityaev:

Intonational sequences in Tuscan Italian. 1421-1424 - Caterina Petrone:

Effects of raddoppiamento sintattico on tonal alignment in Italian. 1425-1428 - Tomás Dubeda, Jan Votrubec:

Acoustic analysis of Czech stress: intonation, duration and intensity revisited. 1429-1432 - Mohamed Yeou:

Variability of F0 peak alignment in Moroccan Arabic accentual focus. 1433-1436 - Anne Lacheret, Ch. Lyche, Michel Morel:

Phonological analysis of schwa and liaison within the PFC project (phonologie du français contemporain): how determinant are the prosodic factors? 1437-1440 - Plínio A. Barbosa, Pablo Arantes, Alexsandro R. Meireles, Jussara M. Vieira:

Abstractness in speech-metronome synchronisation: P-centres as cyclic attractors. 1441-1444
Applications of Confidence Related Measures to ASR
- Makoto Yamada, Tsuneo Kato, Masaki Naito, Hisashi Kawai:

Improvement of rejection performance of keyword spotting using anti-keywords derived from large vocabulary considering acoustical similarity to keywords. 1445-1448 - Ralf Schlüter, T. Scharrenbach, Volker Steinbiss, Hermann Ney:

Bayes risk minimization using metric loss functions. 1449-1452 - Akio Kobayashi, Kazuo Onoe, Shoei Sato, Toru Imai:

Word error rate minimization using an integrated confidence measure. 1453-1456 - Bin Dong, Qingwei Zhao, Yonghong Yan:

Fast confidence measure algorithm for continuous speech recognition. 1457-1460 - Hamed Ketabdar, Jithendra Vepa, Samy Bengio, Hervé Bourlard:

Developing and enhancing posterior based speech recognition systems. 1461-1464 - Peng Liu, Ye Tian, Jian-Lai Zhou, Frank K. Soong:

Background model based posterior probability for measuring confidence. 1465-1468
Multilingual TTS
- Laura Mayfield Tomokiyo, Alan W. Black, Kevin A. Lenzo:

Foreign accents in synthetic speech: development and evaluation. 1469-1472 - Raul Fernandez, Wei Zhang, Ellen Eide, Raimo Bakis, Wael Hamza, Yi Liu, Michael Picheny, John F. Pitrelli, Yong Qing, Zhiwei Shuang, Li Qin Shen:

Toward multiple-language TTS: experiments in English and Mandarin. 1473-1476 - Javier Latorre, Koji Iwano, Sadaoki Furui:

Cross-language synthesis with a polyglot synthesizer. 1477-1480 - Mucemi Gakuru, Frederick K. Iraki, Roger C. F. Tucker, Ksenia Shalonova, Kamanda Ngugi:

Development of a Kiswahili text to speech system. 1481-1484 - Jaime Botella Ordinas, Volker Fischer, Claire Waast-Richard:

Multilingual models in the IBM bilingual text-to-speech systems. 1485-1488 - Artur Janicki, Piotr Herman:

Reconstruction of Polish diacritics in a text-to-speech system. 1489-1492
Speech Bandwidth Extension
- Hiroyuki Ehara, Toshiyuki Morii, Masahiro Oshikiri, Koji Yoshida, Kouichi Honma:

Design of bandwidth scalable LSF quantization using interframe and intraframe prediction. 1493-1496 - Bernd Geiser, Peter Jax, Peter Vary:

Artificial bandwidth extension of speech supported by watermark-transmitted side information. 1497-1500 - Rongqiang Hu, Venkatesh Krishnan, David V. Anderson:

Speech bandwidth extension by improved codebook mapping towards increased phonetic classification. 1501-1504 - Dhananjay Bansal, Bhiksha Raj, Paris Smaragdis:

Bandwidth expansion of narrowband speech using non-negative matrix factorization. 1505-1508 - Michael L. Seltzer, Alex Acero, Jasha Droppo:

Robust bandwidth extension of noise-corrupted narrowband speech. 1509-1512 - João P. Cabral, Luís C. Oliveira:

Pitch-synchronous time-scaling for high-frequency excitation regeneration. 1513-1516
Spoken Language Resources and Technology Evaluation I, II
- Felix Burkhardt, Astrid Paeschke, M. Rolfes, Walter F. Sendlmeier, Benjamin Weiss:

A database of German emotional speech. 1517-1520 - Philippe Boula de Mareüil, Christophe d'Alessandro, Gérard Bailly, Frédéric Béchet, Marie-Neige Garcia, Michel Morel, Romain Prudon, Jean Véronis:

Evaluating the pronunciation of proper names by four French grapheme-to-phoneme converters. 1521-1524 - Filip Jurcícek, Jirí Zahradil, Libor Jelínek:

A human-human train timetable dialogue corpus. 1525-1528 - Gloria Branco, Luís Almeida, Rui Gomes, Nuno Beires:

A Portuguese spoken and multi-modal dialog corpora. 1529-1532 - Joyce Y. C. Chan, P. C. Ching, Tan Lee:

Development of a Cantonese-English code-mixing speech corpus. 1533-1536 - Andrej Zgank, Darinka Verdonik, Aleksandra Zögling Markus, Zdravko Kacic:

BNSI Slovenian broadcast news database - speech and text corpus. 1537-1540 - Jan Volín, Radek Skarnitzl, Petr Pollák:

Confronting HMM-based phone labelling with human evaluation of speech production. 1541-1544 - Stephanie M. Strassel, Jáchym Kolár, Zhiyi Song, Leila Barclay, Meghan Lammie Glenn:

Structural metadata annotation: moving beyond English. 1545-1548 - Delphine Charlet, Sacha Krstulovic, Frédéric Bimbot, Olivier Boëffard, Dominique Fohr, Odile Mella, Filip Korkmazsky, Djamel Mostefa, Khalid Choukri, Arnaud Vallée:

Neologos: an optimized database for the development of new speech processing algorithms. 1549-1552 - Cheng-Yuan Lin, Kuan-Ting Chen, Jyh-Shing Roger Jang:

A hybrid approach to automatic segmentation and labeling for Mandarin Chinese speech corpus. 1553-1556 - Yuang-Chin Chiang, Min-Siong Liang, Hong-Yi Lin, Ren-Yuan Lyu:

The multiple pronunciations in Taiwanese and the automatic transcription of Buddhist sutra with augmented read speech. 1557-1560 - Marelie H. Davel, Etienne Barnard:

Bootstrapping pronunciation dictionaries: practical issues. 1561-1564 - Nigel G. Ward, Anais G. Rivera, Karen Ward, David G. Novick:

Root causes of lost time and user stress in a simple dialog system. 1565-1568 - Julie A. Parisi, Douglas Brungart:

Evaluating communication effectiveness in team collaboration. 1569-1572 - David Conejero, Alan Lounds, Carmen García-Mateo, Leandro Rodríguez Liñares, Raquel Mochales, Asunción Moreno:

Bilingual aligned corpora for speech to speech translation for Spanish, English and Catalan. 1573-1576 - Hynek Boril, Petr Pollák:

Design and collection of Czech Lombard speech database. 1577-1580 - Abe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Andersen, Shrikanth S. Narayanan, Abeer Alwan:

TBALL data collection: the making of a young children's speech corpus. 1581-1584 - Hitomi Tohyama, Shigeki Matsubara, Nobuo Kawaguchi, Yasuyoshi Inagaki:

Construction and utilization of bilingual speech corpus for simultaneous machine interpretation research. 1585-1588 - Rebecca A. Bates, Patrick Menning, Elizabeth Willingham, Chad Kuyper:

Meeting acts: a labeling system for group interaction in meetings. 1589-1592 - Marius-Calin Silaghi, Rachna Vargiya:

A new evaluation criteria for keyword spotting techniques and a new algorithm. 1593-1596 - Christoph Draxler, Alexander Steffen:

Ph@ttSessionz: recording 1000 adolescent speakers in schools in Germany. 1597-1600 - Solomon Teferra Abate, Wolfgang Menzel, Bairu Tafila:

An Amharic speech corpus for large vocabulary continuous speech recognition. 1601-1604 - Hans Dolfing, David Reitter, Luís Almeida, Nuno Beires, Michael Cody, Rui Gomes, Kerry Robinson, Roman Zielinski:

The FASil speech and multimodal corpora. 1605-1608 - Karin Müller:

Revealing phonological similarities between German and Dutch. 1609-1612
Large Vocabulary Speech Recognition Systems
- Dimitra Vergyri, Katrin Kirchhoff, Venkata Ramana Rao Gadde, Andreas Stolcke, Jing Zheng:

Development of a conversational telephone speech recognizer for Levantine Arabic. 1613-1616 - Bhuvana Ramabhadran:

Exploiting large quantities of spontaneous speech for unsupervised training of acoustic models. 1617-1620 - Che-Kuang Lin, Lin-Shan Lee:

Improved spontaneous Mandarin speech recognition by disfluency interruption point (IP) detection using prosodic features. 1621-1624 - Jeff Z. Ma, Spyros Matsoukas:

Improvements to the BBN RT04 Mandarin conversational telephone speech recognition system. 1625-1628 - Sakriani Sakti, Satoshi Nakamura, Konstantin Markov:

Incorporating a Bayesian wide phonetic context model for acoustic rescoring. 1629-1632 - Abdelkhalek Messaoudi, Lori Lamel, Jean-Luc Gauvain:

Modeling vowels for Arabic BN transcription. 1633-1636 - Mohamed Afify, Long Nguyen, Bing Xiang, Sherif M. Abdou, John Makhoul:

Recent progress in Arabic broadcast news transcription at BBN. 1637-1640 - Spyros Matsoukas, Rohit Prasad, Srinivas Laxminarayan, Bing Xiang, Long Nguyen, Richard M. Schwartz:

The 2004 BBN 1xRT recognition systems for English broadcast news and conversational telephone speech. 1641-1644 - Rohit Prasad, Spyros Matsoukas, Chia-Lin Kao, Jeff Z. Ma, Dongxin Xu, Thomas Colthurst, Owen Kimball, Richard M. Schwartz, Jean-Luc Gauvain, Lori Lamel, Holger Schwenk, Gilles Adda, Fabrice Lefèvre:

The 2004 BBN/LIMSI 20xRT English conversational telephone speech recognition system. 1645-1648 - Bing Xiang, Long Nguyen, Xuefeng Guo, Dongxin Xu:

The BBN Mandarin broadcast news transcription system. 1649-1652 - Paul Deléglise, Yannick Estève, Sylvain Meignier, Téva Merlin:

The LIUM speech transcription system: a CMU Sphinx III-based system for French broadcast news. 1653-1656 - Lori Lamel, Gilles Adda, Éric Bilinski, Jean-Luc Gauvain:

Transcribing lectures and seminars. 1657-1660 - Thomas Hain, John Dines, Giulia Garau, Martin Karafiát, Darren Moore, Vincent Wan, Roeland Ordelman, Steve Renals:

Transcription of conference room meetings: an investigation. 1661-1664 - Jean-Luc Gauvain, Gilles Adda, Martine Adda-Decker, Alexandre Allauzen, Véronique Gendner, Lori Lamel, Holger Schwenk:

Where are we in transcribing French broadcast news? 1665-1668 - Odette Scharenborg, Stephanie Seneff:

Two-pass strategy for handling OOVs in a large vocabulary recognition task. 1669-1672 - Long Nguyen, Bing Xiang, Mohamed Afify, Sherif M. Abdou, Spyros Matsoukas, Richard M. Schwartz, John Makhoul:

The BBN RT04 English broadcast news transcription system. 1673-1676 - Rong Zhang, Ziad Al Bawab, Arthur Chan, Ananlada Chotimongkol, David Huggins-Daines, Alexander I. Rudnicky:

Investigations on ensemble based semi-supervised acoustic model training. 1677-1680 - Jan Nouza, Jindrich Zdánský, Petr David, Petr Cerva, Jan Kolorenc, Dana Nejedlová:

Fully automated system for Czech spoken broadcast transcription with very large (300k+) lexicon. 1681-1684 - Mike Schuster, Takaaki Hori, Atsushi Nakamura:

Experiments with probabilistic principal component analysis in LVCSR. 1685-1688 - Thang Tat Vu, Dung Tien Nguyen, Chi Mai Luong, John-Paul Hosom:

Vietnamese large vocabulary continuous speech recognition. 1689-1692 - Takahiro Shinozaki, Mari Ostendorf, Les E. Atlas:

Data sampling for improved speech recognizer training. 1693-1696
Speech Perception I, II
- Do Dat Tran, Eric Castelli, Jean-François Serignat, Van Loan Trinh, Le Xuan Hung:

Influence of F0 on Vietnamese syllable perception. 1697-1700 - Barbara Schwanhäußer, Denis Burnham:

Lexical tone and pitch perception in tone and non-tone language speakers. 1701-1704 - Isabel Falé, Isabel Hub Faria:

Intonational contrasts in EP: a categorical perception approach. 1705-1708 - Bettina Braun, Andrea Weber, Matthew W. Crocker:

Does narrow focus activate alternative referents? 1709-1712 - Kiyoaki Aikawa, Hayato Hashimoto:

Audiovisual interaction on the perception of frequency glide of linear sweep tones. 1713-1716 - Kei Omata, Ken Mogi:

Audiovisual integration in dichotic listening. 1717-1720 - Gunilla Svanfeldt, Dirk Olszewski:

Perception experiment combining a parametric loudspeaker and a synthetic talking head. 1721-1724 - Catherine Mayo, Robert A. J. Clark, Simon King:

Multidimensional scaling of listener responses to synthetic speech. 1725-1728 - Hiroko Terasawa, Malcolm Slaney, Jonathan Berger:

A timbre space for speech. 1729-1732 - Abdellah Kacha, Francis Grenez, Jean Schoentgen:

Voice quality assessment by means of comparative judgments of speech tokens. 1733-1736 - Toshio Irino, Satoru Satou, Shunsuke Nomura, Hideki Banno, Hideki Kawahara:

Speech intelligibility derived from time-frequency and source smearing. 1737-1740 - Nahoko Hayashi, Takayuki Arai, Nao Hodoshima, Yusuke Miyauchi, Kiyohiro Kurisu:

Steady-state pre-processing for improving speech intelligibility in reverberant environments: evaluation in a hall with an electrical reverberator. 1741-1744 - Patrick C. M. Wong, Kiara M. Lee, Todd B. Parrish:

Neural bases of listening to speech in noise. 1745-1748 - P. Jongmans, Frans J. M. Hilgers, Louis C. W. Pols, Corina J. van As-Brooks:

The intelligibility of tracheoesophageal speech: first results. 1749-1752 - Guy J. Brown, Kalle J. Palomäki:

A computational model of the speech reception threshold for laterally separated speech and noise. 1753-1756 - Esther Janse:

Lexical inhibition effects in time-compressed speech. 1757-1760 - Caroline Jacquier, Fanny Meunier:

Perception of time-compressed rapid acoustic cues in French CV syllables. 1761-1764 - Claire-Léonie Grataloup, Michel Hoen, François Pellegrino, E. Veuillet, Lionel Collet, Fanny Meunier:

Reversed speech comprehension depends on the auditory efferent system functionality. 1765-1768 - Won Tokuma, Shinichi Tokuma:

Perceptual space of English fricatives for Japanese learners. 1769-1772 - Ioana Vasilescu, Maria Candea, Martine Adda-Decker:

Perceptual salience of language-specific acoustic differences in autonomous fillers across eight languages. 1773-1776 - Marc D. Pell:

Effects of cortical and subcortical brain damage on the processing of emotional prosody. 1777-1780
Keynote Papers
- Elizabeth Shriberg:

Spontaneous speech: how people really talk and why engineers should care. 1781-1784
Speech Recognition - Adaptation I, II
- Karthik Visweswariah, Peder A. Olsen:

Feature adaptation using projection of Gaussian posteriors. 1785-1788 - Xiao Li, Jeff A. Bilmes, Jonathan Malkin:

Maximum margin learning and adaptation of MLP classifiers. 1789-1792 - Arindam Mandal, Mari Ostendorf, Andreas Stolcke:

Leveraging speaker-dependent variation of adaptation. 1793-1796 - Roger Wend-Huu Hsiao, Brian Kan-Wing Mak:

A comparative study of two kernel eigenspace-based speaker adaptation methods on large vocabulary continuous speech recognition. 1797-1800 - Xuechuan Wang, Douglas D. O'Shaughnessy:

Environmental compensation using ASR model adaptation by a Bayesian parametric representation method. 1801-1804 - Jun Luo, Zhijian Ou, Zuoying Wang:

Discriminative speaker adaptation with eigenvoices. 1805-1808
Prosody Modelling and Speech Technology I, II
- Gina-Anne Levow:

Context in multi-lingual tone and pitch accent recognition. 1809-1812 - Fabio Tamburini:

Automatic prominence identification and prosodic typology. 1813-1816 - Tommy Ingulfsen, Tina Burrows, Sabine Buchholz:

Influence of syntax on prosodic boundary prediction. 1817-1820 - Roberto Gretter, Dino Seppi:

Using prosodic information for disambiguation purposes. 1821-1824 - Wentao Gu, Keikichi Hirose, Hiroya Fujisaki:

Analysis of the effects of word emphasis and echo question on F0 contours of Cantonese utterances. 1825-1828 - Tina Burrows, Peter Jackson, Katherine M. Knill, Dmitry Sityaev:

Combining models of prosodic phrasing and pausing. 1829-1832
Detecting and Synthesizing Speaker State
- Julia Hirschberg, Stefan Benus, Jason M. Brenier, Frank Enos, Sarah Friedman, Sarah Gilman, Cynthia Girand, Martin Graciarena, Andreas Kathol, Laura A. Michaelis, Bryan L. Pellom, Elizabeth Shriberg, Andreas Stolcke:

Distinguishing deceptive from non-deceptive speech. 1833-1836 - Jackson Liscombe, Julia Hirschberg, Jennifer J. Venditti:

Detecting certainness in spoken tutorial dialogues. 1837-1840 - Laurence Vidrascu, Laurence Devillers:

Detection of real-life emotions in call centers. 1841-1844 - Jackson Liscombe, Giuseppe Riccardi, Dilek Hakkani-Tür:

Using context to improve emotion detection in spoken dialog systems. 1845-1848 - Irena Yanushevskaya, Christer Gobl, Ailbhe Ní Chasaide:

Voice quality and f0 cues for affect expression: implications for synthesis. 1849-1852 - Toru Takahashi, Takeshi Fujii, Masashi Nishi, Hideki Banno, Toshio Irino, Hideki Kawahara:

Voice and emotional expression transformation based on statistics of vowel parameters in an emotional speech database. 1853-1856
Rapid Development of Spoken Dialogue Systems
- Giuseppe Di Fabbrizio, Gökhan Tür, Dilek Hakkani-Tür:

Automated wizard-of-oz for spoken dialogue systems. 1857-1860 - Kouichi Katsurada, Kunitoshi Sato, Hiroaki Adachi, Hirobumi Yamada, Tsuneo Nitta:

A rapid prototyping tool for constructing web-based MMI applications. 1861-1864 - Philip Hanna, Ian M. O'Neill, Xingkun Liu, Michael F. McTear:

Developing extensible and reusable spoken dialogue components: an examination of the Queen's communicator. 1865-1868 - Ye-Yi Wang, Alex Acero:

SGStudio: rapid semantic grammar development for spoken language understanding. 1869-1872 - Murat Akbacak, Yuqing Gao, Liang Gu, Hong-Kwang Jeff Kuo:

Rapid transition to new spoken dialogue domains: language model training using knowledge from previous domain applications and web text resources. 1873-1876 - Manny Rayner, Pierrette Bouillon, Nikos Chatzichrisafis, Beth Ann Hockey, Marianne Santaholma, Marianne Starlander, Hitoshi Isahara, Kyoko Kanzaki, Yukie Nakao:

A methodology for comparing grammar-based and robust approaches to speech understanding. 1877-1880
Text-to-Speech I, II
- François Mairesse, Marilyn A. Walker:

Learning to personalize spoken generation for dialogue systems. 1881-1884 - S. Revelin, Didier Cadic, Claire Waast-Richard:

Optimization of text-to-speech phonetic transcriptions using a-posteriori signal comparison. 1885-1888 - Özgül Salor, Mübeccel Demirekler:

Voice transformation using principle component analysis based LSF quantization and dynamic programming approach. 1889-1892 - Hai Ping Li, Wei Zhang:

Adapt Mandarin TTS system to Chinese dialect TTS systems. 1893-1896 - Min Zheng, Qin Shi, Wei Zhang, Lianhong Cai:

Grapheme-to-phoneme conversion based on TBL algorithm in Mandarin TTS system. 1897-1900 - Paolo Massimino, Alberto Pacchiotti:

An automaton-based machine learning technique for automatic phonetic transcription. 1901-1904 - Tasanawan Soonklang, Robert I. Damper, Yannick Marchand:

Comparative objective and subjective evaluation of three data-driven techniques for proper name pronunciation. 1905-1908 - Olov Engwall:

Articulatory synthesis using corpus-based estimation of line spectrum pairs. 1909-1912 - Aoju Chen, Els den Os:

Effects of pitch accent type on interpreting information status in synthetic speech. 1913-1916 - Perttu Prusi, Anssi Kainulainen, Jaakko Hakulinen, Markku Turunen, Esa-Pekka Salonen, Leena Helin:

Towards generic spatial object model and route guidance grammar for speech-based systems. 1917-1920 - Chi-Chun Hsia, Chung-Hsien Wu, Te-Hsien Liu:

Duration-embedded bi-HMM for expressive voice conversion. 1921-1924 - Toshio Hirai, Hisashi Kawai, Minoru Tsuzaki, Nobuyuki Nishizawa:

Analysis of major factors of naturalness degradation in concatenative synthesis. 1925-1928 - Jilei Tian, Jani Nurminen, Imre Kiss:

Duration modeling and memory optimization in a Mandarin TTS system. 1929-1932 - Min-Siong Liang, Ke-Chun Chuang, Rhuei-Cheng Yang, Yuang-Chin Chiang, Ren-Yuan Lyu:

A bi-lingual Mandarin-to-taiwanese text-to-speech system. 1933-1936 - Uwe D. Reichel, Florian Schiel:

Using morphology and phoneme history to improve grapheme-to-phoneme conversion. 1937-1940 - Olga Goubanova, Simon King:

Predicting consonant duration with Bayesian belief networks. 1941-1944 - Per-Anders Jande:

Inducing decision tree pronunciation variation models from annotated speech data. 1945-1948 - Lijuan Wang, Yong Zhao, Min Chu, Frank K. Soong, Zhigang Cao:

Phonetic transcription verification with generalized posterior probability. 1949-1952 - Hua Cheng, Fuliang Weng, Niti Hantaweepant, Lawrence Cavedon, Stanley Peters:

Training a maximum entropy model for surface realization. 1953-1956 - Tomoki Toda, Kiyohiro Shikano:

NAM-to-speech conversion with Gaussian mixture models. 1957-1960 - Michelina Savino, Mario Refice, Massimo Mitaritonna:

Which Italian do current systems speak? a first step towards pronunciation modelling of Italian varieties. 1961-1964 - Dominika Oliver, Robert A. J. Clark:

Modelling pitch accent types for Polish speech synthesis. 1965-1968 - Chatchawarn Hansakunbuntheung, Ausdang Thangthai, Chai Wutiwiwatchai, Rungkarn Siricharoenchai:

Learning methods and features for corpus-based phrase break prediction on Thai. 1969-1972 - Paul Taylor:

Hidden Markov models for grapheme to phoneme conversion. 1973-1976
Speaker Characterization and Recognition I-IV
- Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:

Robust distant speaker recognition based on position dependent cepstral mean normalization. 1977-1980 - David A. van Leeuwen:

Speaker adaptation in the NIST speaker recognition evaluation 2004. 1981-1984 - Jacob Goldberger, Hagai Aronowitz:

A distance measure between GMMs based on the unscented transform and its application to speaker recognition. 1985-1988 - Sorin Dusan:

Estimation of speaker's height and vocal tract length from speech signal. 1989-1992 - Doroteo Torre Toledano, Carlos Fombella, Joaquin Gonzalez-Rodriguez, Luis A. Hernández Gómez:

On the relationship between phonetic modeling precision and phonetic speaker recognition accuracy. 1993-1996 - J. Fortuna, P. Sivakumaran, Aladdin M. Ariyaeeinia, Amit S. Malegaonkar:

Open-set speaker identification using adapted Gaussian mixture models. 1997-2000 - James McAuley, Ji Ming, Pat Corr:

Speaker verification in noisy conditions using correlated subband features. 2001-2004 - Mikaël Collet, Yassine Mami, Delphine Charlet, Frédéric Bimbot:

Probabilistic anchor models approach for speaker verification. 2005-2008 - Mijail Arcienega, Anil Alexander, Philipp Zimmermann, Andrzej Drygajlo:

A Bayesian network approach combining pitch and spectral envelope features to reduce channel mismatch in speaker verification and forensic speaker recognition. 2009-2012 - Kwok-Kwong Yiu, Man-Wai Mak, Sun-Yuan Kung:

Channel robust speaker verification via Bayesian blind stochastic feature transformation. 2013-2016 - Tomoko Matsui, Kunio Tanabe:

dPLRM-based speaker identification with log power spectrum. 2017-2020 - Xianxian Zhang, John H. L. Hansen, Pongtep Angkititrakul, Kazuya Takeda:

Speaker verification using Gaussian mixture models within changing real car environments. 2021-2024 - Kanae Amino, Tsutomu Sugawara, Takayuki Arai:

The correspondences between the perception of the speaker individualities contained in speech sounds and their acoustic properties. 2025-2028 - Samuel Kim, Sung-Wan Yoon, Thomas Eriksson, Hong-Goo Kang, Dae Hee Youn:

A noise-robust pitch synchronous feature extraction algorithm for speaker recognition systems. 2029-2032 - Jing Deng, Thomas Fang Zheng, Zhanjiang Song, Jian Liu:

Modeling high-level information by using Gaussian mixture correlation for GMM-UBM based speaker recognition. 2033-2036 - Xianxian Zhang, John H. L. Hansen:

In-set/out-of-set speaker identification based on discriminative speech frame selection. 2037-2040 - Zhenchun Lei, Yingchun Yang, Zhaohui Wu:

Mixture of support vector machines for text-independent speaker recognition. 2041-2044 - Shilei Zhang, Junmei Bai, Shuwu Zhang, Bo Xu:

Optimal model order selection based on regression tree in speaker identification. 2045-2048 - Marcos Faúndez-Zanuy, Jordi Solé-Casals:

Speaker verification improvement using blind inversion of distortions. 2049-2052
Single-channel Speech Enhancement
- Israel Cohen:

Supergaussian GARCH models for speech signals. 2053-2056 - Athanasios Mouchtaris, Jan Van der Spiegel, Paul Mueller, Panagiotis Tsakalides:

A spectral conversion approach to feature denoising and speech enhancement. 2057-2060 - Alfonso Ortega, Eduardo Lleida, Enrique Masgrau, Luis Buera, Antonio Miguel:

Acoustic feedback cancellation in speech reinforcement systems for vehicles. 2061-2064 - Julien Bourgeois, Jürgen Freudenberger, Guillaume Lathoud:

Implicit control of noise canceller for speech enhancement. 2065-2068 - T. M. Sunil Kumar, T. V. Sreenivas:

Speech enhancement using Markov model of speech segments. 2069-2072 - Vladimir Braquet, Takao Kobayashi:

A wavelet based noise reduction algorithm for speech signal corrupted by coloured noise. 2073-2076 - Esfandiar Zavarehei, Saeed Vaseghi:

Speech enhancement in temporal DFT trajectories using Kalman filters. 2077-2080 - Qin Yan, Saeed Vaseghi, Esfandiar Zavarehei, Ben P. Milner:

Formant-tracking linear prediction models for speech processing in noisy environments. 2081-2084 - Hui Jiang, Qian-Jie Fu:

Statistical noise compensation for cochlear implant processing. 2085-2088 - Tuan Van Pham, Gernot Kubin:

WPD-based noise suppression using nonlinearly weighted threshold quantile estimation and optimal wavelet shrinking. 2089-2092 - Weifeng Li, Katunobu Itou, Kazuya Takeda, Fumitada Itakura:

Subjective and objective quality assessment of regression-enhanced speech in real car environments. 2093-2096 - Masashi Unoki, Masaaki Kubo, Atsushi Haniu, Masato Akagi:

A model for selective segregation of a target instrument sound from the mixed sound of various instruments. 2097-2100 - Richard C. Hendriks, Richard Heusdens, Jesper Jensen:

Improved decision directed approach for speech enhancement using an adaptive time segmentation. 2101-2104 - Heinrich W. Löllmann, Peter Vary:

Generalized filter-bank equalizer for noise reduction with reduced signal delay. 2105-2108 - Nicoleta Roman, DeLiang Wang:

A pitch-based model for separation of reverberant speech. 2109-2112 - David Yuheng Zhao, W. Bastiaan Kleijn

:
On noise gain estimation for HMM-based speech enhancement. 2113-2116 - Om Deshmukh, Carol Y. Espy-Wilson:

Speech enhancement using auditory phase opponency model. 2117-2120
Acoustic Modelling for LVCSR
- Brian Mak, Jeff Siu-Kei Au-Yeung, Yiu-Pong Lai, Man-Hung Siu:

High-density discrete HMM with the use of scalar quantization indexing. 2121-2124 - Jing Zheng, Andreas Stolcke:

Improved discriminative training using phone lattices. 2125-2128 - Qifeng Zhu, Barry Y. Chen, Frantisek Grézl, Nelson Morgan:

Improved MLP structures for data-driven feature extraction for ASR. 2129-2132 - Wolfgang Macherey, Lars Haferkamp, Ralf Schlüter, Hermann Ney:

Investigations on error minimizing training criteria for discriminative training in automatic speech recognition. 2133-2136 - Khe Chai Sim, Mark J. F. Gales:

Temporally varying model parameters for large vocabulary continuous speech recognition. 2137-2140 - Qifeng Zhu, Andreas Stolcke, Barry Y. Chen, Nelson Morgan:

Using MLP features in SRI's conversational speech recognition system. 2141-2144
Speech Production I
- Matti Airas, Hannu Pulakka, Tom Bäckström, Paavo Alku:

A toolkit for voice inverse filtering and parametrisation. 2145-2148 - Denisse Sciamarella, Christophe d'Alessandro:

Stylization of glottal-flow spectra produced by a mechanical vocal-fold model. 2149-2152 - Hideyuki Nomura, Tetsuo Funada:

Numerical glottal sound source model as coupled problem between vocal cord vibration and glottal flow. 2153-2156 - Marianne Pouplier, Maureen Stone:

A tagged-cine MRI investigation of German vowels. 2157-2160 - Antoine Serrurier, Pierre Badin:

A three-dimensional linear articulatory model of velum based on MRI data. 2161-2164 - Anne Cros, Didier Demolin, Ana Georgina Flesia, Antonio Galves:

On the relationship between intra-oral pressure and speech sonority. 2165-2168
Speaker Characterization and Recognition I-IV
- Mohamed Kamal Omar, Jirí Navrátil, Ganesh N. Ramaswamy:

Maximum conditional mutual information modeling for speaker verification. 2169-2172 - Luciana Ferrer, M. Kemal Sönmez, Sachin S. Kajarekar:

Class-dependent score combination for speaker recognition. 2173-2176 - Hagai Aronowitz, Dror Irony, David Burshtein:

Modeling intra-speaker variability for speaker recognition. 2177-2180 - Girija Chetty, Michael Wagner:

Liveness detection using cross-modal correlations in face-voice person authentication. 2181-2184 - Taichi Asami, Koji Iwano, Sadaoki Furui:

Stream-weight optimization by LDA and adaboost for multi-stream speaker verification. 2185-2188 - Yosef A. Solewicz, Moshe Koppel:

Considering speech quality in speaker verification fusion. 2189-2192
Gender and Age Issues in Speech and Language Research I, II
- Matteo Gerosa, Diego Giuliani, Fabio Brugnara:

Speaker adaptive acoustic modeling with mixture of adult and children's speech. 2193-2196 - Shona D'Arcy

, Martin J. Russell:
A comparison of human and computer recognition accuracy for children's speech. 2197-2200 - Piero Cosi, Bryan L. Pellom:

Italian children's speech recognition for advanced interactive literacy tutors. 2201-2204 - Martine Adda-Decker, Lori Lamel:

Do speech recognizers prefer female speakers? 2205-2208 - Serdar Yildirim, Chul Min Lee, Sungbok Lee, Alexandros Potamianos, Shrikanth S. Narayanan:

Detecting Politeness and frustration state of a child in a conversational computer game. 2209-2212 - Diana Binnenpoorte, Christophe Van Bael, Els den Os, Lou Boves:

Gender in everyday speech and language: a corpus-based study. 2213-2216
Spoken Language Acquisition, Development and Learning I, II
- Shigeaki Amano:

Developmental change of phoneme duration in a Japanese infant and mother. 2217-2220 - Haiping Jia, Hiroki Mori, Hideki Kasuya:

Mora timing organization in producing contrastive geminate/single consonants and long/short vowels by native and non-native speakers of Japanese: effects of speaking rate. 2221-2224 - Hongyan Wang, Vincent J. van Heuven:

Mutual intelligibility of american, Chinese and dutch-accented speakers of English. 2225-2228 - Peter Juel Henrichsen:

Deriving a bi-lingual dictionary from raw transcription data. 2229-2232 - Kei Ohta, Seiichi Nakagawa:

A statistical method of evaluating pronunciation proficiency for Japanese words. 2233-2236
Language and Dialect Identification I, II
- Pavel Matejka, Petr Schwarz, Jan Cernocký, Pavel Chytil:

Phonotactic language identification using high quality phoneme recognition. 2237-2240 - Rongqing Huang, John H. L. Hansen:

Advances in word based dialect/accent classification. 2241-2244 - Rym Hamdi, Salem Ghazali, Melissa Barkat-Defradas:

Syllable structure in spoken Arabic: a comparative investigation. 2245-2248 - J. C. Marcadet, Volker Fischer, Claire Waast-Richard:

A transformation-based learning approach to language identification for mixed-lingual text-to-speech synthesis. 2249-2252 - Shuichi Itahashi, Shiwei Zhu, Mikio Yamamoto:

Constructing family trees of multilingual speech using Gaussian mixture models. 2253-2256 - Jean-Luc Rouas:

Modeling long and short-term prosody for language identification. 2257-2260
Spoken Language Translation I, II
- Matthias Paulik, Christian Fügen, Sebastian Stüker, Tanja Schultz, Thomas Schaaf, Alex Waibel:

Document driven machine translation enhanced ASR. 2261-2264 - Shahram Khadivi, András Zolnay, Hermann Ney:

Automatic text dictation in computer-assisted translation. 2265-2268 - Luis Rodríguez, Jorge Civera, Enrique Vidal, Francisco Casacuberta, César Ernesto Martínez:

On the use of speech recognition in computer assisted translation. 2269-2272 - Andreas Kathol, Kristin Precoda, Dimitra Vergyri, Wen Wang, Susanne Z. Riehemann:

Speech translation for low-resource languages: the case of Pashto. 2273-2276 - David Picó, Jorge González, Francisco Casacuberta, Diamantino Caseiro, Isabel Trancoso:

Finite-state transducer inference for a speech-input Portuguese-to-English machine translation system. 2277-2280 - Kenko Ohta, Keiji Yasuda, Gen-ichiro Kikui, Masuzo Yanagida:

Quantitative evaluation of effects of speech recognition errors on speech translation quality. 2281-2284
Multi-channel Speech Enhancement
- Thomas Lotter, Bastian Sauert, Peter Vary:

A stereo input-output superdirective beamformer for dual channel noise reduction. 2285-2288 - Ulrich Klee, Tobias Gehrig, John W. McDonough:

Kalman filters for time delay of arrival-based source localization. 2289-2292 - Osamu Ichikawa, Masafumi Nishimura:

Simultaneous adaptation of echo cancellation and spectral subtraction for in-car speech recognition. 2293-2296 - Rong Hu, Yunxin Zhao:

Variable step size adaptive decorrelation filtering for competing speech separation. 2297-2300 - Daisuke Saitoh, Atsunobu Kaminuma, Hiroshi Saruwatari, Tsuyoki Nishikawa, Akinobu Lee:

Speech extraction in a car interior using frequency-domain ICA with rapid filter adaptations. 2301-2304 - Rongqiang Hu, Sunil D. Kamath, David V. Anderson:

Speech enhancement using non-acoustic sensors. 2305-2308 - Marc Delcroix, Takafumi Hikichi, Masato Miyoshi:

Improved blind dereverberation performance by using spatial information. 2309-2312 - Junfeng Li, Masato Akagi:

A hybrid microphone array post-filter in a diffuse noise field. 2313-2316 - Venkatesh Krishnan, Phil Spencer Whitehead, David V. Anderson, Mark A. Clements:

A framework for estimation of clean speech by fusion of outputs from multiple speech enhancement systems. 2317-2320 - Yuki Denda, Takanobu Nishiura, Yoichi Yamashita:

A study of weighted CSP analysis with average speech spectrum for noise robust talker localization. 2321-2324 - Young-Ik Kim, Sung Jun An, Rhee Man Kil, Hyung-Min Park:

Sound segregation based on binaural zero-crossings. 2325-2328 - Jürgen Freudenberger, Klaus Linhard:

A two-microphone diversity system and its application for hands-free car kits. 2329-2332 - Takahiro Murakami, Kiyoshi Kurihara, Yoshihisa Ishida:

Directionally constrained minimization of power algorithm for speech signals. 2333-2336 - Alessio Brutti, Maurizio Omologo, Piergiorgio Svaizer:

Oriented global coherence field for the estimation of the head orientation in smart rooms equipped with distributed microphone arrays. 2337-2340 - Nilesh Madhu, Rainer Martin:

Robust speaker localization through adaptive weighted pair TDOA (AWEPAT) estimation. 2341-2344 - Guillaume Lathoud, Mathew Magimai-Doss, Bertrand Mesot:

A spectrogram model for enhanced source localization and noise-robust ASR. 2345-2348 - Sriram Srinivasan, Mattias Nilsson, W. Bastiaan Kleijn

:
Denoising through source separation and minimum tracking. 2349-2352 - Louisa Busca Grisoni, John H. L. Hansen:

Collaborative voice activity detection for hearing aids. 2353-2356 - Enrique Robledo-Arnuncio, Biing-Hwang Juang:

Using inter-frequency decorrelation to reduce the permutation inconsistency problem in blind source separation. 2357-2360 - Amarnag Subramanya, Zhengyou Zhang, Zicheng Liu, Jasha Droppo, Alex Acero:

A graphical model for multi-sensory speech processing in air-and-bone conductive microphones. 2361-2364
Prosody in Language Performance I, II
- Heejin Kim, Jennifer Cole:

The stress foot as a unit of planned timing: evidence from shortening in the prosodic phrase. 2365-2368 - Pauline Welby, Hélène Loevenbruck:

Segmental "anchorage" and the French late rise. 2369-2372 - Ivan Chow:

Prosodic cues for syntactically-motivated junctures. 2373-2376 - Isabel Falé, Isabel Hub Faria:

A glimpse of the time-course of intonation processing in European Portuguese. 2377-2380 - Petra Wagner:

Great expectations - introspective vs. perceptual prominence ratings and their acoustic correlates. 2381-2384 - Christian Jensen, John Tøndering:

Choosing a scale for measuring perceived prominence. 2385-2388 - Jens Edlund, David House, Gabriel Skantze:

The effects of prosodic features on the interpretation of clarification ellipses. 2389-2392 - Matthias Jilka:

Exploration of different types of intonational deviations in foreign-accented and synthesized speech. 2393-2396 - Jörg Bröggelwirth:

A rhythmic-prosodic model of poetic speech. 2397-2400 - Sonja Biersack, Vera Kempe, Lorna Knapton:

Fine-tuning speech registers: a comparison of the prosodic features of child-directed and foreigner-directed speech. 2401-2404 - Timothy Arbisi-Kelm:

An analysis of the intonational structure of stuttered speech. 2405-2408 - Britta Lintfert, Wolfgang Wokurek:

Voice quality dimensions of pitch accents. 2409-2412 - Marion Dohen, Hélène Loevenbruck:

Audiovisual production and perception of contrastive focus in French: a multispeaker study. 2413-2416 - Pashiera Barkhuysen, Emiel Krahmer, Marc Swerts:

Predicting end of utterance in multimodal and unimodal conditions. 2417-2420 - Saori Tanaka, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:

Production of prominence in Japanese sign language. 2421-2424
Speaker Characterization and Recognition I-IV
- Andreas Stolcke, Luciana Ferrer, Sachin S. Kajarekar, Elizabeth Shriberg, Anand Venkataraman:

MLLR transforms as features in speaker recognition. 2425-2428 - Brendan Baker, Robbie Vogt, Sridha Sridharan:

Gaussian mixture modelling of broad phonetic and syllabic events for text-independent speaker verification. 2429-2432 - Hagai Aronowitz, David Burshtein:

Efficient speaker identification and retrieval. 2433-2436 - Rohit Sinha, S. E. Tranter, Mark J. F. Gales, Philip C. Woodland:

The Cambridge University March 2005 speaker diarisation system. 2437-2440 - Xuan Zhu, Claude Barras, Sylvain Meignier, Jean-Luc Gauvain:

Combining speaker identification and BIC for speaker diarization. 2441-2444 - Dan Istrate, Nicolas Scheffer, Corinne Fredouille, Jean-François Bonastre:

Broadcast news speaker tracking for ESTER 2005 campaign. 2445-2448
Phonetics and Phonology I, II
- Sorin Dusan:

On the nature of acoustic information in identification of coarticulated vowels. 2449-2452 - Cédric Gendrot, Martine Adda-Decker:

Impact of duration on F1/F2 formant values of oral vowels: an automatic analysis of large broadcast news corpora in French and German. 2453-2456 - Hugo Quené:

Modeling of between-speaker and within-speaker variation in spontaneous speech tempo. 2457-2460 - Masahiko Komatsu, Makiko Aoyagi:

Vowel devoicing vs. mora-timed rhythm in spontaneous Japanese - inspection of phonetic labels of OGI_TS. 2461-2464 - Jalal-Eddin Al-Tamimi, Emmanuel Ferragne:

Does vowel space size depend on language vowel inventories? evidence from two Arabic dialects and French. 2465-2468 - Chilin Shih:

Understanding phonology by phonetic implementation. 2469-2472
Spoken / Multi-modal Dialogue Systems I, II
- Niels Ole Bernsen, Laila Dybkjær:

User evaluation of conversational agent h. c. Andersen. 2473-2476 - Silke Goronzy, Nicole Beringer:

Integrated development and on-the-fly simulation of multimodal dialogs. 2477-2480 - Mihai Rotaru, Diane J. Litman, Katherine Forbes-Riley:

Interactions between speech recognition problems and user emotions. 2481-2484 - Junlan Feng, Srihari Reddy, Murat Saraclar:

Webtalk: mining websites for interactively answering questions. 2485-2488 - Sebastian Möller:

Towards generic quality prediction models for spoken dialogue systems - a case study. 2489-2492 - S. Parthasarathy, Cyril Allauzen, R. Munkong:

Robust access to large structured data using voice form-filling. 2493-2496
Human factors, User Experience and Natural Language Application Design
- Esther Levin, Alex Levin:

Spoken dialog system for real-time data capture. 2497-2500 - Michael Pucher, Peter Fröhlich:

A user study on the influence of mobile device class, synthesis method, data rate and lexicon on speech synthesis quality. 2501-2504 - Fang Chen, Yael Katzenellenbogen:

User's experience of a commercial speech dialogue system. 2505-2508 - Esther Levin, Amir M. Mané:

Voice user interface design for automated directory assistance. 2509-2512 - Maria Gabriela Alvarez-Ryan, Narendra K. Gupta, Barbara Hollister, Tirso Alonso:

Optimizing user experience through design of the spoken language understanding (SLU) module. 2513-2516 - Jeremy H. Wright, David A. Kapilow, Alicia Abella:

Interactive visualization of human-machine dialogs. 2517-2520
TTS Inventory
- Matthew P. Aylett:

Synthesising hyperarticulation in unit selection TTS. 2521-2524 - Daniel Tihelka:

Symbolic prosody driven unit selection for highly natural synthetic speech. 2525-2528 - Jindrich Matousek, Zdenek Hanzlícek, Daniel Tihelka:

Hybrid syllable/triphone speech synthesis. 2529-2532 - Francisco Campillo Díaz, José Luis Alba, Eduardo Rodríguez Banga:

A neural network approach for the design of the target cost function in unit-selection speech synthesis. 2533-2536 - Christian Weiss:

FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesis. 2537-2540 - Gui-Lin Chen, Ke-Song Han, Zhen-Li Yu, Dong-Jian Yue, Yi-Qing Zu:

An embedded and concatenative approach to TTS of multiple languages. 2541-2544 - Tony Ezzat, Ethan Meyers, James R. Glass, Tomaso A. Poggio:

Morphing spectral envelopes using audio flow. 2545-2548 - Vincent Colotte, Richard Beaufort:

Linguistic features weighting for a text-to-speech system without prosody model. 2549-2552 - Ingunn Amdal, Torbjørn Svendsen:

Unit selection synthesis database development using utterance verification. 2553-2556 - Yong Zhao, Lijuan Wang, Min Chu, Frank K. Soong, Zhigang Cao:

Refining phoneme segmentations using speaker-adaptive context dependent boundary models. 2557-2560 - Yining Chen, Yong Zhao, Min Chu:

Customizing base unit set with speech database in TTS systems. 2561-2564 - Soufiane Rouibia, Olivier Rosec:

Unit selection for speech synthesis based on a new acoustic target cost. 2565-2568 - Dan Chazan, Ron Hoory, Zvi Kons, Ariel Sagi, Slava Shechtman, Alexander Sorin:

Small footprint concatenative text-to-speech synthesis system using complex spectral envelope modeling. 2569-2572 - Francesc Alías, Ignasi Iriondo Sanz, Lluís Formiga, Xavier Gonzalvo, Carlos Monzo, Xavier Sevillano:

High quality Spanish restricted-domain TTS oriented to a weather forecast application. 2573-2576 - Ingmund Bjrkan, Torbjørn Svendsen, Snorre Farner:

Comparing spectral distance measures for join cost optimization in concatenative speech synthesis. 2577-2580 - Maria João Barros, Ranniery Maia, Keiichi Tokuda, Fernando Gil Resende, Diamantino Freitas:

HMM-based european Portuguese TTS system. 2581-2584 - Wael Hamza, John F. Pitrelli:

Combining the flexibility of speech synthesis with the naturalness of pre-recorded audio: a comparison of two approaches to phrase-splicing TTS. 2585-2588 - Guntram Strecha, Oliver Jokisch, Matthias Eichner, Rüdiger Hoffmann:

Codec integrated voice conversion for embedded speech synthesis. 2589-2592 - David Sündermann, Guntram Strecha, Antonio Bonafonte, Harald Höge, Hermann Ney:

Evaluation of VTLN-based voice conversion for embedded speech synthesis. 2593-2596 - Juri Isogai, Junichi Yamagishi, Takao Kobayashi:

Model adaptation and adaptive training using ESAT algorithm for HMM-based speech synthesis. 2597-2600 - Tien Ying Fung, Yuk-Chi Li, Eddie Sio, Icarus Lee, Helen M. Meng, P. C. Ching:

Embedded Cantonese TTS for multi-device access to web content. 2601-2604 - Karl Schnell, Arild Lacroix:

Model based analysis of a diphone database for improved unit concatenation. 2605-2608
Robust Speech Recognition I-IV
- Ning Ma, Phil D. Green:

Context-dependent word duration modelling for robust speech recognition. 2609-2612 - Julien Epps, Eric H. C. Choi:

An energy search approach to variable frame rate front-end processing for robust ASR. 2613-2616 - Roberto Gemello, Franco Mana, Renato de Mori:

Non-linear estimation of voice activity to improve automatic recognition of noisy speech. 2617-2620 - Yusuke Kida, Tatsuya Kawahara:

Voice activity detection based on optimally weighted combination of multiple features. 2621-2624 - Pei Ding:

Soft decision strategy and adaptive compensation for robust speech recognition against impulsive noise. 2625-2628 - Nicolás Morales, Doroteo T. Toledano, John H. L. Hansen, José Colás, Javier Garrido Salas:

Statistical class-based MFCC enhancement of filtered and band-limited speech for robust ASR. 2629-2632 - Hemant Misra

, Hervé Bourlard:
Spectral entropy feature in full-combination multi-stream for robust ASR. 2633-2636 - Wooil Kim, Richard M. Stern, Hanseok Ko:

Environment-independent mask estimation for missing-feature reconstruction. 2637-2640 - André Coy, Jon Barker:

Soft harmonic masks for recognising speech in the presence of a competing speaker. 2641-2644 - Lech Szymanski, Martin Bouchard:

Comb filter decomposition for robust ASR. 2645-2648 - Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano:

Investigating the role of the Lombard reflex in non-audible murmur (NAM) recognition. 2649-2652 - Evan Ruzanski, John H. L. Hansen, Don Finan, James Meyerhoff, William Norris, Terry Wollert:

Improved "TEO" feature-based automatic stress detection using physiological and acoustic speech sensors. 2653-2656 - Takeshi S. Kobayakawa:

Spectral subtraction using elliptic integral for multiplication factor. 2657-2660 - Longbiao Wang, Norihide Kitaoka, Seiichi Nakagawa:

Robust distant speech recognition based on position dependent CMN using a novel multiple microphone processing technique. 2661-2664 - H. Tanaka, Hiroshi Fujimura, Chiyomi Miyajima, Takanori Nishino, Katunobu Itou, Kazuya Takeda:

Data collection and evaluation of speech recognition for motorbike riders. 2665-2668 - Agustín Álvarez-Marquina, Pedro Gómez, Victor Nieto Lluis, Rafael Martínez, Victoria Rodellar:

Application of a first-order differential microphone for efficient voice activity detection in a car platform. 2669-2672 - Panji Setiawan, Suhadi Suhadi, Tim Fingscheidt, Sorel Stan:

Robust speech recognition for mobile devices in car noise. 2673-2676 - Péter Mihajlik

, Zoltán Tobler, Zoltán Tüske, Géza Gordos:
Evaluation and optimization of noise robust front-end technologies for the automatic recognition of Hungarian telephone speech. 2677-2680 - Gang Chen, Douglas D. O'Shaughnessy, Hesham Tolba:

A performance investigation of noisy voice recognition over IP telephony networks. 2681-2684 - Akinori Ito, Takashi Kanayama, Motoyuki Suzuki, Shozo Makino:

Internal noise suppression for speech recognition by small robots. 2685-2688 - Florian Kraft, Robert G. Malkin, Thomas Schaaf, Alex Waibel:

Temporal ICA for classification of acoustic events in a kitchen environment. 2689-2692 - Jan Felix Krebber:

"hello - is anybody at home?" - about the minimum word accuracy of a smart home spoken dialogue system. 2693-2696 - Hans-Günter Hirsch, Harald Finster:

The simulation of realistic acoustic input scenarios for speech recognition systems. 2697-2700 - Michael Walsh, Gregory M. P. O'Hare, Julie Carson-Berndsen:

An agent-based framework for speech investigation. 2701-2704
Speech Coding
- Stephen So, Kuldip K. Paliwal:

Switched split vector quantisation of line spectral frequencies for wideband speech coding. 2705-2708 - Changchun Bao, Jason Lukasiak, Christian H. Ritz:

A novel voicing cut-off determination for low bit-rate harmonic speech coding. 2709-2712 - Hauke Krüger, Peter Vary:

A partial decorrelation scheme for improved predictive open loop quantization with noise shaping. 2713-2716 - Venkatesh Krishnan, Thomas P. Barnwell III, David V. Anderson:

Using dynamic codebook re-ordering to exploit inter-frame correlation in MELP coders. 2717-2720 - Adriane Swalm Durey, Venkatesh Krishnan, Thomas P. Barnwell III:

Enhanced speech coding based on phonetic class segmentation. 2721-2724 - Ali Erdem Ertan, Thomas P. Barnwell III:

A pitch-synchronous pitch-cycle modification method for designing a hybrid i-MELP/waveform-matching speech coder. 2725-2728 - Joon-Hyuk Chang, Jong Won Shin, Seung Yeol Lee, Nam Soo Kim:

A new structural preprocessor for low-bit rate speech coding. 2729-2732 - Tiago H. Falk, Wai-Yip Chan, Peter Kabal:

An improved GMM-based voice quality predictor. 2733-2736 - Jan S. Erkelens:

High-quality memoryless subband coding of impulse responses at 22 bits per frame. 2737-2740 - Shi-Han Chen, Kuo-Guan Wu, Chih-Chung Kuo:

A study of variable pulse allocation for MPE and CELP coders based on PESQ analysis. 2741-2744 - José L. Pérez-Córdoba, Antonio M. Peinado, Angel M. Gomez, Antonio J. Rubio:

Joint source-channel coding of LSP parameters for bursty channels. 2745-2748
Gender and Age Issues in Speech and Language Research I, II
- Daniel Elenius, Mats Blomberg:

Adaptation and normalization experiments in speech recognition for 4 to 8 year old children. 2749-2752 - Wim Jansen, Hugo Van hamme:

PROSPECT features and their application to missing data techniques for vocal tract length normalization. 2753-2756 - Andreas Hagen, Bryan L. Pellom:

Data driven subword unit modeling for speech recognition and its application to interactive reading tutors. 2757-2760 - Anton Batliner, Mats Blomberg, Shona D'Arcy, Daniel Elenius, Diego Giuliani, Matteo Gerosa, Christian Hacker, Martin J. Russell, Stefan Steidl, Michael Wong:

The PF_STAR children's speech corpus. 2761-2764 - Linda Bell, Johan Boye, Joakim Gustafson, Mattias Heldner, Anders Lindström, Mats Wirén:

The Swedish NICE corpus - spoken dialogues between children and embodied characters in a computer game scenario. 2765-2768 - Yusuke Miyauchi, Nao Hodoshima, Keiichi Yasu, Nahoko Hayashi, Takayuki Arai, Mitsuko Shindo:

A preprocessing technique for improving speech intelligibility in reverberant environments: the effect of steady-state suppression on elderly people. 2769-2772
Discourse and Dialogue I, II
- Norbert Pfleger, Markus Löckelt:

Synchronizing dialogue contributions of human users and virtual characters in a virtual reality environment. 2773-2776 - Anand Venkataraman, Yang Liu, Elizabeth Shriberg, Andreas Stolcke:

Does active learning help automatic dialog act tagging in meeting data? 2777-2780 - Dan Bohus, Alexander I. Rudnicky:

A principled approach for rejection threshold optimization in spoken dialog systems. 2781-2784 - David Pérez-Piñar López, Carmen García-Mateo:

Application of confidence measures for dialogue systems through the use of parallel speech recognizers. 2785-2788 - Sophie Rosset, Delphine Tribout:

Multi-level information and automatic dialog acts detection in human-human spoken dialogs. 2789-2792 - Rieks op den Akker, Harry Bunt, Simon Keizer, Boris W. van Schooten:

From question answering to spoken dialogue: towards an information search assistant for interactive multimodal information extraction. 2793-2796
Text-to-Speech I, II
- Ulrich Reubold, Alexander Steffen:

Pitch-effects in diphone recording: are logatomes inappropriate? 2797-2800 - Tomoki Toda, Keiichi Tokuda:

Speech parameter generation algorithm considering global variance for HMM-based speech synthesis. 2801-2804 - Makoto Tachibana, Junichi Yamagishi, Takashi Masuko, Takao Kobayashi:

Performance evaluation of style adaptation for hidden semi-Markov model based speech synthesis. 2805-2808 - Gabriel Webster, Tina Burrows, Katherine M. Knill:

A comparison of methods for speaker-dependent pronunciation tuning for text-to-speech synthesis. 2809-2812 - Ann K. Syrdal, Alistair Conkie:

Perceptually-based data-driven join costs: comparing join types. 2813-2816 - Yannis Pantazis, Yannis Stylianou, Esther Klabbers:

Discontinuity detection in concatenated speech synthesis based on nonlinear speech analysis. 2817-2820
Language and Dialect Identification I, II
- Tingyao Wu, Dirk Van Compernolle, Jacques Duchateau, Qian Yang, Jean-Pierre Martens:

Improving the discrimination between native accents when recorded over different channels. 2821-2824 - Isabel Trancoso, António Joaquim Serralheiro, Céu Viana, Diamantino Caseiro:

Aligning and recognizing spoken books in different varieties of Portuguese. 2825-2828 - Bin Ma, Haizhou Li, Chin-Hui Lee:

An acoustic segment modeling approach to automatic language identification. 2829-2832 - Dong Zhu, Martine Adda-Decker, Fabien Antoine:

Different size multilingual phone inventories and context-dependent acoustic models for language identification. 2833-2836 - Sheng Gao, Bin Ma, Haizhou Li, Chin-Hui Lee:

A text categorization approach to automatic language identification. 2837-2840 - Giampiero Salvi:

Advances in regional accent clustering in Swedish. 2841-2844
Speech Recognition in Ubiquitous Networking and Context-Aware Computing
- David Pearce, Jonathan Engelsma, James C. Ferrans, John Johnson:

An architecture for seamless access to distributed multimodal services. 2845-2848 - Zheng-Hua Tan, Paul Dalsgaard, Børge Lindberg, Haitian Xu:

Robust speech recognition in ubiquitous networking and context-aware computing. 2849-2852 - Valentin Ion, Reinhold Haeb-Umbach:

Unified probabilistic approach to error concealment for distributed speech recognition. 2853-2856 - Alastair Bruce James, Ben Milner:

Combining packet loss compensation methods for robust distributed speech recognition. 2857-2860 - Trond Skogstad, Torbjørn Svendsen:

Distributed ASR using speech coder data for efficient feature vector representation. 2861-2864 - Sadaoki Furui, Tomohisa Ichiba, Takahiro Shinozaki, Edward W. D. Whittaker, Koji Iwano:

Cluster-based modeling for ubiquitous speech recognition. 2865-2868
Phonetics and Phonology I, II
- Danny R. Moates, Zinny S. Bond, Russell Fox, Verna Stockmal:

The feature [sonorant] in lexical access. 2869-2872 - Simone Mikuteit:

Voice and aspiration in German and East Bengali stops: a cross-language study. 2873-2876 - Irene Jacobi, Louis C. W. Pols, Jan Stroop:

Polder Dutch: aspects of the /ei/-lowering in standard Dutch. 2877-2880 - Eric Castelli, René Carré:

Production and perception of Vietnamese vowels. 2881-2884 - Vu Ngoc Tuan, Christophe d'Alessandro, Alexis Michaud:

Using open quotient for the characterisation of vietnamese glottalised tones. 2885-2888 - John Hajek, Mary Stevens:

On the acoustic characterization of ejective stops in Waima'a. 2889-2892 - Mary Stevens, John Hajek:

Spirantization of /p t k/ in Sienese Italian and so-called semi-fricatives. 2893-2896 - Barbara Gili Fivela, Claudio Zmarich:

Italian geminates under speech rate and focalization changes: kinematic, acoustic, and perception data. 2897-2900 - Sunhee Kim:

Durational characteristics of Korean Lombard speech. 2901-2904 - Toshiko Isei-Jaakkola, Satoshi Asakawa:

A cross-linguistic study of vowel quantity in different word structures: Japanese, Finnish and Czech. 2905-2908 - Laura Mori, Melissa Barkat-Defradas:

Acoustic properties of foreign accent: VOT variations in Moroccan-accented Italian. 2909-2912 - Andréia S. Rauber, Paola Escudero, Ricardo Augusto Hoffmann Bion, Barbara O. Baptista:

The interrelation between the perception and production of English vowels by native speakers of Brazilian Portuguese. 2913-2916 - Julia Hoelterhoff:

Recognition of German obstruents. 2917-2920 - Radek Skarnitzl, Jan Volín:

Czech voiced labiodental continuant discrimination from basic acoustic data. 2921-2924 - Jean-Baptiste Maj, Anne Bonneau, Dominique Fohr, Yves Laprie:

An elitist approach for extracting automatically well-realized speech sounds with high confidence. 2925-2928 - Na'im R. Tyson:

Applying multiple regression models for predicting word duration in a corpus of spontaneous speech. 2929-2932 - Catarina Oliveira, Lurdes Castro Moutinho, António J. S. Teixeira:

On European Portuguese automatic syllabification. 2933-2936 - Aimilios Chalamandaris, Spyros Raptis, Pirros Tsiakoulis:

Rule-based grapheme-to-phoneme method for the Greek. 2937-2940 - Constandinos Kalimeris, George K. Mikros, Stelios Bakamidis:

Assimilation and deletion phenomena involving word-final /n/ and word-initial /p, t, k/ in modern Greek: a codification of the observed variation intended for use in TTS synthesis. 2941-2944 - Christian Weiss, Bianca Aschenberner:

A German viseme-set for automatic transcription of input text used for audio-visual speech synthesis. 2945-2948 - Johanna-Pascale Roy:

Visual perception of anticipatory rounding gestures in French. 2949-2952
Acoustic Processing for ASR I-III
- Michael Jonas, James G. Schmolze:

Hierarchical clustering of mixture tying using a partially observable Markov decision process. 2953-2956 - Pierre Ouellet, Gilles Boulianne, Patrick Kenny:

Flavors of Gaussian warping. 2957-2960 - Joseph Keshet, Shai Shalev-Shwartz, Yoram Singer, Dan Chazan:

Phoneme alignment based on discriminative learning. 2961-2964 - Jussi Leppänen, Imre Kiss:

Comparison of low footprint acoustic modeling techniques for embedded ASR systems. 2965-2968 - Atiwong Suchato, Proadpran Punyabukkana:

Factors in classification of stop consonant place of articulation. 2969-2972 - Arthur R. Toth, Alan W. Black:

Cross-speaker articulatory position data for phonetic feature prediction. 2973-2976 - Daniel Povey:

Improvements to fMPE for discriminative training of features. 2977-2980 - Xin Lei, Mei-Yuh Hwang, Mari Ostendorf:

Incorporating tone-related MLP posteriors in the feature representation for Mandarin ASR. 2981-2984 - Yan Han, Johan de Veth, Lou Boves:

Speech trajectory clustering for improved speech recognition. 2985-2988 - Andrey Temko, Dusan Macho, Climent Nadeu:

Selection of features and combination of classifiers using a fuzzy approach for acoustic event classification. 2989-2992 - Jan Stadermann, Wolfram Koska, Gerhard Rigoll:

Multi-task learning strategies for a recurrent neural net in a hybrid tied-posteriors acoustic model. 2993-2996 - Florian Hönig, Georg Stemmer, Christian Hacker, Fabio Brugnara:

Revising Perceptual Linear Prediction (PLP). 2997-3000 - Joel Pinto, R. N. V. Sitaram:

Confidence measures in speech recognition based on probability distribution of likelihoods. 3001-3004 - Frank Diehl, Asunción Moreno, Enric Monte:

Continuous local codebook features for multi- and cross-lingual acoustic phonetic modelling. 3005-3008 - Antonio Miguel, Eduardo Lleida, Richard C. Rose, Luis Buera, Alfonso Ortega:

Augmented state space acoustic decoding for modeling local variability in speech. 3009-3012 - Dimitrios Dimitriadis, Petros Maragos, Alexandros Potamianos:

Auditory Teager energy cepstrum coefficients for robust speech recognition. 3013-3016 - Yasser Hifny, Steve Renals, Neil D. Lawrence:

A hybrid Maxent/HMM based ASR system. 3017-3020 - Hakan Erdogan:

Regularizing linear discriminant analysis for speech recognition. 3021-3024 - Yadong Wang, Steven Greenberg, Jayaganesh Swaminathan, Ramdas Kumaresan, David Poeppel:

Comprehensive modulation representation for automatic speech recognition. 3025-3028 - Qiang Fu, Biing-Hwang Juang:

Segment-based phonetic class detection using minimum verification error (MVE) training. 3029-3032 - Yi Liu, Pascale Fung:

Acoustic and phonetic confusions in accented speech recognition. 3033-3036 - Mario E. Munich, Qiguang Lin:

Auditory image model features for automatic speech recognition. 3037-3040 - Panikos Heracleous, Tomomi Kaino, Hiroshi Saruwatari, Kiyohiro Shikano:

Applications of NAM microphones in speech recognition for privacy in human-machine communication. 3041-3044 - Joe Frankel, Simon King:

A hybrid ANN/DBN approach to articulatory feature recognition. 3045-3048
Speaker Characterization and Recognition I-IV
- Daniel Moraru, Mathieu Ben, Guillaume Gravier:

Experiments on speaker tracking and segmentation in radio broadcast news. 3049-3052 - Emanuele Dalmasso, Pietro Laface, Daniele Colibro, Claudio Vair:

Unsupervised segmentation and verification of multi-speaker conversational speech. 3053-3056 - Sacha Krstulovic, Frédéric Bimbot, Delphine Charlet, Olivier Boëffard:

Focal speakers: a speaker selection method able to deal with heterogeneous similarity criteria. 3057-3060 - Mathieu Ben, Guillaume Gravier, Frédéric Bimbot:

A model space framework for efficient speaker detection. 3061-3064 - Nicolas Scheffer, Jean-François Bonastre:

Speaker detection using acoustic event sequences. 3065-3068 - Wei-Ho Tsai, Hsin-Min Wang:

Speaker clustering of unknown utterances based on maximum purity estimation. 3069-3072 - Petra Zochová, Vlasta Radová:

Modified DISTBIC algorithm for speaker change detection. 3073-3076 - Gilles Gonon, Rémi Gribonval, Frédéric Bimbot:

Decision trees with improved efficiency for fast speaker verification. 3077-3080 - Nicolas Eveno, Laurent Besacier:

A speaker independent "liveness" test for audio-visual biometrics. 3081-3084 - Shingo Kuroiwa, Yoshiyuki Umeda, Satoru Tsuge, Fuji Ren:

Distributed speaker recognition using speaker-dependent VQ codebook and earth mover's distance. 3085-3088 - Ka-Yee Leung, Man-Wai Mak, Man-Hung Siu, Sun-Yuan Kung:

Speaker verification via articulatory feature-based conditional pronunciation modeling with vowel and consonant mixture models. 3089-3092 - Jixu Chen, Beiqian Dai, Jun Sun:

Prosodic features based on wavelet analysis for speaker verification. 3093-3096 - Mohamed Mihoubi, Douglas D. O'Shaughnessy, Pierre Dumouchel:

Relevant information extraction for discriminative training applied to speaker identification. 3097-3100 - Jérôme Louradour, Khalid Daoudi:

Conceiving a new sequence kernel and applying it to SVM speaker verification. 3101-3104 - Jing Deng, Thomas Fang Zheng, Jian Liu, Wenhu Wu:

The predictive differential amplitude spectrum for robust speaker recognition in stationary noises. 3105-3108 - Michael Mason, Robbie Vogt, Brendan Baker, Sridha Sridharan:

Data-driven clustering for blind feature mapping in speaker verification. 3109-3112 - Xi Zhou, Zhiqiang Yao, Beiqian Dai:

Improved covariance modeling for GMM in speaker identification. 3113-3116 - Robbie Vogt, Brendan Baker, Sridha Sridharan:

Modelling session variability in text-independent speaker verification. 3117-3120 - Mihalis Siafarikas, Todor Ganchev, Nikolaos D. Fakotakis, George K. Kokkinakis:

Overlapping wavelet packet features for speaker verification. 3121-3124 - An-rong Yin, Xiang Xie, Jingming Kuang:

Using Hadamard ECOC in multi-class problems based on SVM. 3125-3128
Robust Speech Recognition I-IV
- Hank Liao, Mark J. F. Gales:

Joint uncertainty decoding for noise robust speech recognition. 3129-3132 - Vincent Vanhoucke:

Confidence scoring and rejection using multi-pass speech recognition. 3133-3136 - Cheng-Lung Lee, Wen-Whei Chang:

Memory-enhanced MMSE-based channel error mitigation for distributed speech recognition. 3137-3140 - Takashi Fukuda, Muhammad Ghulam, Tsuneo Nitta:

Designing multiple distinctive phonetic feature extractors for canonicalization by using clustering technique. 3141-3144 - Keisuke Kinoshita, Tomohiro Nakatani, Masato Miyoshi:

Efficient blind dereverberation framework for automatic speech recognition. 3145-3148 - Matthias Wölfel, John W. McDonough:

Combining multi-source far distance speech recognition strategies: beamforming, blind channel and confusion network combination. 3149-3152
Speech Coding and Quality Assessment
- Akira Takahashi, Atsuko Kurashima, Chiharu Morioka, Hideaki Yoshino:

Objective quality assessment of wideband speech by an extension of ITU-T Recommendation P.862. 3153-3156 - Marc Werner, Peter Vary:

Quality control for UMTS-AMR speech channels. 3157-3160 - Wei Chen, Peter Kabal, Turaj Zakizadeh Shabestary:

Perceptual postfilter estimation for low bit rate speech coders using Gaussian mixture models. 3161-3164 - Kengo Fujita, Tsuneo Kato, Hideaki Yamada, Hisashi Kawai:

SNR-dependent background noise compensation of PESQ values for cellular phone speech. 3165 - Gil Ho Lee, Jae Sam Yoon, Hong Kook Kim:

A MFCC-based CELP speech coder for server-based speech recognition in network environments. 3169-3172 - Volodya Grancharov, Jonas Samuelsson, W. Bastiaan Kleijn:

Distortion measures for vector quantization of noisy spectrum. 3173-3176
Spoken Language Translation I, II
- Evgeny Matusov, Stephan Kanthak, Hermann Ney:

On the integration of speech recognition and statistical machine translation. 3177-3180 - V. H. Quan, Marcello Federico, Mauro Cettolo:

Integrated n-best re-ranking for spoken language translation. 3181-3184 - Josep Maria Crego, José B. Mariño, Adrià de Gispert:

An n-gram-based statistical machine translation decoder. 3185-3188 - Liang Gu, Yuqing Gao:

Use of maximum entropy in natural word generation for statistical concept-based speech-to-speech translation. 3189-3192 - Adrià de Gispert, José B. Mariño, Josep Maria Crego:

Improving statistical machine translation by classifying and generalizing inflected verb forms. 3193-3196 - Abdulvohid Bozarov, Yoshinori Sagisaka, Ruiqiang Zhang, Gen-ichiro Kikui:

Improved speech recognition word lattice translation by confidence measure. 3197-3200
Speech Inversion
- Parham Mokhtari, Tatsuya Kitamura, Hironori Takemoto, Kiyoshi Honda:

Vocal tract area function inversion by linear regression of cepstrum. 3201-3204 - Olov Engwall:

Introducing visual cues in acoustic-to-articulatory inversion. 3205-3208 - Victor N. Sorokin, Alexander S. Leonov, I. S. Makarov, A. I. Tsyplikhin:

Speech inversion and re-synthesis. 3209-3212 - Mark A. Huckvale, Ian S. Howard:

Teaching a vocal tract simulation to imitate stop consonants. 3213-3216 - Blaise Potard, Yves Laprie:

Using phonetic constraints in acoustic-to-articulatory inversion. 3217-3220 - Asterios Toutios, Konstantinos G. Margaritis:

A support vector approach to the acoustic-to-articulatory mapping. 3221-3224
Prosody Modelling and Speech Technology I, II
- Daniel Hirst, Cyril Auran:

Analysis by synthesis of speech prosody: the Prozed environment. 3225-3228 - Stephen Cox:

A discriminative approach to phrase break modelling. 3229-3232 - Ian Read, Stephen Cox:

Stochastic and syntactic techniques for predicting phrase breaks. 3233-3236 - Gerasimos Xydas, Panagiotis Zervas, Georgios Kouroupetroglou, Nikolaos D. Fakotakis, George K. Kokkinakis:

Tree-based prediction of prosodic phrase breaks on top of shallow textual features. 3237-3240 - Honghui Dong, Jianhua Tao, Bo Xu:

Chinese prosodic phrasing with a constraint-based approach. 3241-3244 - Minghui Dong, Kim-Teng Lua, Haizhou Li:

A probabilistic approach to prosodic word prediction for Mandarin Chinese TTS. 3245-3248 - João Paulo Ramos Teixeira, Diamantino Freitas, Hiroya Fujisaki:

Evaluation of a system for F0 contour prediction for European Portuguese. 3249-3252 - Ke Li, Yoshinori Sagisaka:

Analysis on command sequences of a F0 generation model for Mandarin speech and its application to their automatic extraction. 3253-3256 - Keikichi Hirose, Yusuke Furuyama, Nobuaki Minematsu:

Corpus-based extraction of F0 contour generation process model parameters. 3257-3260 - David Escudero Mancebo, Valentín Cardeñoso-Payo:

Optimized selection of intonation dictionaries in corpus based intonation modelling. 3261-3264 - Qinghua Sun, Keikichi Hirose, Wentao Gu, Nobuaki Minematsu:

Generation of fundamental frequency contours for Mandarin speech synthesis based on tone nucleus model. 3265-3268 - Chen-Yu Chiang, Yih-Ru Wang, Sin-Horng Chen:

On the inter-syllable coarticulation effect of pitch modeling for Mandarin speech. 3269-3272 - Matej Rojc, Pablo Daniel Agüero, Antonio Bonafonte, Zdravko Kacic:

Training the tilt intonation model using the JEMA methodology. 3273-3276 - Dagen Wang, Shrikanth S. Narayanan:

Piecewise linear stylization of pitch via wavelet analysis. 3277-3280 - Harald Romsdorfer, Beat Pfister:

Phonetic labeling and segmentation of mixed-lingual prosody databases. 3281-3284 - Edmilson Morais, Fábio Violaro:

Exploratory analysis of linguistic data based on genetic algorithm for robust modeling of the segmental duration of speech. 3285-3288 - Dafydd Gibbon, Flaviane Romani Fernandes:

Annotation-mining for rhythm model comparison in Brazilian Portuguese. 3289-3292 - Tohru Nagano, Shinsuke Mori, Masafumi Nishimura:

A stochastic approach to phoneme and accent estimation. 3293-3296 - Jason M. Brenier, Daniel M. Cer, Daniel Jurafsky:

The detection of emphatic words using acoustic and lexical features. 3297-3300 - Dinoj Surendran, Gina-Anne Levow, Yi Xu:

Tone recognition in Mandarin using focus. 3301-3304 - Mikolaj Wypych:

An automatic intonation recognizer for the Polish language based on machine learning and expert knowledge. 3305-3308 - Atsuhiro Sakurai:

Generalized envelope matching technique for time-scale modification of speech (GEM-TSM). 3309-3312
Topics in Speech Recognition
- Yang Liu, Elizabeth Shriberg, Andreas Stolcke, Mary P. Harper:

Comparing HMM, maximum entropy, and conditional random fields for disfluency detection. 3313-3316 - Bhiksha Raj, Rita Singh, Paris Smaragdis:

Recognizing speech from simultaneous speakers. 3317-3320 - Vincent Wan, James Carmichael:

Polynomial dynamic time warping kernel support vector machines for dysarthric speech recognition with sparse training data. 3321-3324 - R. Lejeune, J. Baude, C. Tchong, Hubert Crepy, Claire Waast-Richard:

Flavoured acoustic model and combined spelling to sound for asymmetrical bilingual environment. 3325-3328 - Chris D. Bartels, Kevin Duh, Jeff A. Bilmes, Katrin Kirchhoff, Simon King:

Genetic triangulation of graphical models for speech and language processing. 3329-3332 - Guillermo Aradilla, Jithendra Vepa, Hervé Bourlard:

Improving speech recognition using a data-driven approach. 3333-3336 - Shigeki Matsuda, Wolfgang Herbordt, Satoshi Nakamura:

Outlier detection for acoustic model training using robust statistics. 3337-3340 - Jonathan Le Roux, Erik McDermott:

Optimization methods for discriminative training. 3341-3344 - Patrick Cardinal, Gilles Boulianne, Michel Comeau:

Segmentation of recordings based on partial transcriptions. 3345-3348 - Hussien Seid, Björn Gambäck:

A speaker independent continuous speech recognizer for Amharic. 3349-3352 - Tetsuji Ogawa, Tetsunori Kobayashi:

Optimizing the structure of partly-hidden Markov models using weighted likelihood-ratio maximization criterion. 3353-3356 - C. Santhosh Kumar, V. P. Mohandas, Haizhou Li:

Multilingual speech recognition: a unified approach. 3357-3360 - Tomás Bartos, Ludek Müller:

Detection of recognition errors based on classifiers trained on artificially created data. 3361-3364 - Jinyu Li, Chin-Hui Lee:

On designing and evaluating speech event detectors. 3365-3368 - Joseph Razik, Odile Mella, Dominique Fohr, Jean Paul Haton:

Local word confidence measure using word graph and n-best list. 3369-3372 - Xiaolin Ren, Xin He, Yaxin Zhang:

Mandarin/English mixed-lingual name recognition for mobile phone. 3373-3376 - Javier Ferreiros, Rubén San Segundo, Fernando Fernández Martínez, Luis Fernando D'Haro, Valentín Sama, Roberto Barra-Chicote, Pedro Mellén:

New word-level and sentence-level confidence scoring using graph theory calculus and its evaluation on speech understanding. 3377-3380 - Masanobu Nakamura, Koji Iwano, Sadaoki Furui:

Analysis of spectral space reduction in spontaneous speech and its effects on speech recognition performances. 3381-3384 - Simon King, Chris D. Bartels, Jeff A. Bilmes:

SVitchboard 1: small vocabulary tasks from Switchboard. 3385-3388
Discourse and Dialogue I, II
- Wieneke Wesseling, R. J. J. H. van Son:

Timing of experimentally elicited minimal responses as quantitative evidence for the use of intonation in projecting TRPs. 3389-3392 - Shinya Yamada, Toshihiko Itoh, Kenji Araki:

Linguistic and acoustic features depending on different situations - the experiments considering speech recognition rate. 3393-3396 - Dirk Bühler, Stefan W. Hamerich:

Towards voiceXML compilation for portable embedded applications in ubiquitous environments. 3397-3400 - Eva Strangert:

Prosody in public speech: analyses of a news announcement and a political interview. 3401-3404 - Tanveer A. Faruquie, Pankaj Kankar, Nitendra Rajput, Abhishek Verma:

An architecture for pluggable disambiguation mechanism for RDC based voice applications. 3409-3412 - Nitendra Rajput, Amit Anil Nanavati, Abhishek Kumar, Neeraj Chaudhary:

Adapting dialog call-flows for pervasive devices. 3413-3416 - Ulf Krum, Hartwig Holzapfel, Alex Waibel:

Clarification questions to improve dialogue flow and speech recognition in spoken dialogue systems. 3417-3420 - Fernando Fernández Martínez, Javier Ferreiros, Valentín Sama, Juan Manuel Montero, Rubén San Segundo, Javier Macías Guarasa, Rafael García:

Speech interface for controlling an hi-fi audio system based on a Bayesian belief networks approach for dialog modeling. 3421-3424
Spoken Language Understanding I, II
- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske:

Hierarchical language models for one-stage speech interpretation. 3425-3428 - Nick J.-C. Wang:

Spoken language understanding using layered n-gram modeling. 3429-3432 - Mihai Surdeanu, Jordi Turmo, Eli Comelles:

Named entity recognition from spontaneous open-domain speech. 3433-3436 - Imed Zitouni, Hui Jiang, Qiru Zhou:

Discriminative training and support vector machine for natural language call routing. 3437-3440 - Jihyun Eun, Minwoo Jeong, Gary Geunbae Lee:

A multiple classifier-based concept-spotting approach for robust spoken language understanding. 3441-3444 - Robert Lieb, Matthias Thomae, Günther Ruske, Daniel Bobbert, Frank Althoff:

A flexible and integrated interface between speech recognition, speech interpretation and dialog management. 3445-3448 - Tomohiro Ohno, Shigeki Matsubara, Hideki Kashioka, Naoto Kato, Yasuyoshi Inagaki:

Incremental dependency parsing of Japanese spoken monologue based on clause boundaries. 3449-3452 - Atsushi Sako, Tetsuya Takiguchi, Yasuo Ariki:

Situation based speech recognition for structuring baseball live games. 3453-3456 - Hélène Bonneau-Maynard, Sophie Rosset, Christelle Ayache, Anne Kuhn, Djamel Mostefa:

Semantic annotation of the French media dialog corpus. 3457-3460 - Ralf Engel:

Robust and efficient semantic parsing of free word order languages in spoken dialogue systems. 3461-3464 - Catherine Kobus, Géraldine Damnati, Lionel Delphin-Poulat, Renato de Mori:

Conceptual language model design for spoken language understanding. 3465-3468 - Luís Seabra Lopes

, António J. S. Teixeira, Marcelo Quinderé, Mário Rodrigues:
From robust spoken language understanding to knowledge acquisition and management. 3469-3472 - Cheng Wu, Xiang Li, Hong-Kwang Jeff Kuo, E. E. Jan, Vaibhava Goel, David M. Lubensky:

Improving end-to-end performance of call classification through data confusion reduction and model tolerance enhancement. 3473-3476
