INTERSPEECH/EUROSPEECH 2005: Lisbon, Portugal
9th European Conference on Speech Communication and Technology, INTERSPEECH-Eurospeech 2005, Lisbon, Portugal, September 4-8, 2005. ISCA 2005
Keynote Papers
- Graeme M. Clark:
The multiple-channel cochlear implant: interfacing electronic technology to human consciousness. 1-4
Speech Recognition - Language Modelling I-III
- Yik-Cheung Tam, Tanja Schultz:
Dynamic language model adaptation using variational Bayes inference. 5-8
- Vidura Seneviratne, Steve J. Young:
The hidden vector state language model. 9-12
- Shinsuke Mori, Gakuto Kurata:
Class-based variable memory length Markov model. 13-16
- Alexander Gruenstein, Chao Wang, Stephanie Seneff:
Context-sensitive statistical language modeling. 17-20
- Chao Wang, Stephanie Seneff, Grace Chung:
Language model data filtering via user simulation and dialogue resynthesis. 21-24
- Jen-Tzung Chien, Meng-Sung Wu, Chia-Sheng Wu:
Bayesian learning for latent semantic analysis. 25-28
Prosody in Language Performance I, II
- Daniel Hirst, Caroline Bouzon:
The effect of stress and boundaries on segmental duration in a corpus of authentic speech (British English). 29-32
- Tomoko Ohsuga, Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Investigation of the relationship between turn-taking and prosodic features in spontaneous dialogue. 33-36
- Michiko Watanabe, Keikichi Hirose, Yasuharu Den, Nobuaki Minematsu:
Filled pauses as cues to the complexity of following phrases. 37-40
- Katrin Schneider, Bernd Möbius:
Perceptual magnet effect in German boundary tones. 41-44
- Angela Grimm, Jochen Trommer:
Constraints on the acquisition of simplex and complex words in German. 45-48
- Julien Meyer:
Whistled speech: a natural phonetic description of languages adapted to human perception and to the acoustical environment. 49-52
Spoken Language Extraction / Retrieval I, II
- Olivier Siohan, Michiel Bacchiani:
Fast vocabulary-independent audio search using path-based graph indexing. 53-56
- John Makhoul, Alex Baron, Ivan Bulyko, Long Nguyen, Lance A. Ramshaw, David Stallard, Richard M. Schwartz, Bing Xiang:
The effects of speech recognition and punctuation on information extraction performance. 57-60
- Ciprian Chelba, Alex Acero:
Indexing uncertainty for spoken document search. 61-64
- Tomoyosi Akiba, Hiroyuki Abe:
Exploiting passage retrieval for n-best rescoring of spoken questions. 65-68
- BalaKrishna Kolluru, Heidi Christensen, Yoshihiko Gotoh:
Multi-stage compaction approach to broadcast news summarisation. 69-72
- Chien-Lin Huang, Chia-Hsin Hsieh, Chung-Hsien Wu:
Audio-video summarization of TV news using speech recognition and shot change detection. 73-76
The Blizzard Challenge 2005
- Alan W. Black, Keiichi Tokuda:
The Blizzard Challenge - 2005: evaluating corpus-based speech synthesis on common datasets. 77-80
- Shinsuke Sakai, Han Shu:
A probabilistic approach to unit selection for corpus-based speech synthesis. 81-84
- John Kominek, Christina L. Bennett, Brian Langner, Arthur R. Toth:
The Blizzard Challenge 2005 CMU entry - a method for improving speech synthesis systems. 85-88
- H. Timothy Bunnell, Christopher A. Pennington, Debra Yarrington, John Gray:
Automatic personal synthetic voice construction. 89-92
- Heiga Zen, Tomoki Toda:
An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005. 93-96
- Wael Hamza, Raimo Bakis, Zhiwei Shuang, Heiga Zen:
On building a concatenative speech synthesis system from the Blizzard Challenge speech databases. 97-100
- Robert A. J. Clark, Korin Richmond, Simon King:
Multisyn voices from ARCTIC data for the Blizzard Challenge. 101-104
- Christina L. Bennett:
Large scale evaluation of corpus-based synthesizers: results and lessons from the Blizzard Challenge 2005. 105-108
New Applications
- Berlin Chen, Yi-Ting Chen, Chih-Hao Chang, Hung-Bin Chen:
Speech retrieval of Mandarin broadcast news via mobile devices. 109-112
- Michiaki Katoh, Kiyoshi Yamamoto, Jun Ogata, Takashi Yoshimura, Futoshi Asano, Hideki Asoh, Nobuhiko Kitawaki:
State estimation of meetings by information fusion using Bayesian network. 113-116
- Roger K. Moore:
Results from a survey of attendees at ASRU 1997 and 2003. 117-120
- Reinhold Haeb-Umbach, Basilis Kladis, Joerg Schmalenstroeer:
Speech processing in the networked home environment - a view on the Amigo project. 121-124
- Masahide Sugiyama:
Fixed distortion segmentation in efficient sound segment searching. 125-128
- Tin Lay Nwe, Haizhou Li:
Identifying singers of popular songs. 129-132
- Jun Ogata, Masataka Goto:
Speech repair: quick error correction just by using selection operation for speech input interfaces. 133-136
- Dirk Olszewski, Fransiskus Prasetyo, Klaus Linhard:
Steerable highly directional audio beam loudspeaker. 137-140
- Hassan Ezzaidi, Jean Rouat:
Automatic music genre classification using second-order statistical measures for the prescriptive approach. 141-144
- Alberto Abad, Dusan Macho, Carlos Segura, Javier Hernando, Climent Nadeu:
Effect of head orientation on the speaker localization performance in smart-room environment. 145-148
- Corinne Fredouille, Gilles Pouchoulin, Jean-François Bonastre, M. Azzarello, Antoine Giovanni, Alain Ghio:
Application of automatic speaker recognition techniques to pathological voice assessment (dysphonia). 149-152
- Upendra V. Chaudhari, Ganesh N. Ramaswamy, Edward A. Epstein, Sasha Caskey, Mohamed Kamal Omar:
Adaptive speech analytics: system, infrastructure, and behavior. 153-156
E-learning and Spoken Language Processing
- Katherine Forbes-Riley, Diane J. Litman:
Correlating student acoustic-prosodic profiles with student learning in spoken tutoring dialogues. 157-160
- Diane J. Litman, Katherine Forbes-Riley:
Speech recognition performance and learning in spoken dialogue tutoring. 161-164
- Satoshi Asakawa, Nobuaki Minematsu, Toshiko Isei-Jaakkola, Keikichi Hirose:
Structural representation of the non-native pronunciations. 165-168
- Fu-Chiang Chou:
Ya-ya language box - a portable device for English pronunciation training with speech recognition technologies. 169-172
- Akinori Ito, Yen-Ling Lim, Motoyuki Suzuki, Shozo Makino:
Pronunciation error detection method based on error rule clustering using a decision tree. 173-176
- Abhinav Sethy, Shrikanth S. Narayanan, Nicolaus Mote, W. Lewis Johnson:
Modeling and automating detection of errors in Arabic language learner speech. 177-180
- Felicia Zhang, Michael Wagner:
Effects of F0 feedback on the learning of Chinese tones by native speakers of English. 181-184
E-inclusion and Spoken Language Processing I, II
- Tom Brøndsted, Erik Aaskoven:
Voice-controlled internet browsing for motor-handicapped users. Design and implementation issues. 185-188
- Briony Williams, Delyth Prys, Ailbhe Ní Chasaide:
Creating an ongoing research capability in speech technology for two minority languages: experiences from the WISPR project. 189-192
- Anestis Vovos, Basilis Kladis, Nikolaos D. Fakotakis:
Speech operated smart-home control system for users with special needs. 193-196
- Takatoshi Jitsuhiro, Shigeki Matsuda, Yutaka Ashikari, Satoshi Nakamura, Ikuko Eguchi Yairi, Seiji Igi:
Spoken dialog system and its evaluation of geographic information system for elderly persons' mobility support. 197-200
- Daniele Falavigna, Toni Giorgino, Roberto Gretter:
A frame based spoken dialog system for home care. 201-204
Acoustic Processing for ASR I-III
- Matthias Wölfel:
Frame based model order selection of spectral envelopes. 205-208
- Vivek Tyagi, Christian Wellekens, Hervé Bourlard:
On variable-scale piecewise stationary spectral analysis of speech signals for ASR. 209-212
- Arlo Faria, David Gelbart:
Efficient pitch-based estimation of VTLN warp factors. 213-216
- Yanli Zheng, Richard Sproat, Liang Gu, Izhak Shafran, Haolang Zhou, Yi Su, Daniel Jurafsky, Rebecca Starr, Su-Youn Yoon:
Accent detection and speech recognition for Shanghai-accented Mandarin. 217-220
- Loïc Barrault, Renato de Mori, Roberto Gemello, Franco Mana, Driss Matrouf:
Variability of automatic speech recognition systems using different features. 221-224
- Slavomír Lihan, Jozef Juhár, Anton Cizmar:
Crosslingual and bilingual speech recognition with Slovak and Czech SpeechDat-E databases. 225-228
- Carmen Peláez-Moreno, Qifeng Zhu, Barry Y. Chen, Nelson Morgan:
Automatic data selection for MLP-based feature extraction for ASR. 229-232
- Thilo Köhler, Christian Fügen, Sebastian Stüker, Alex Waibel:
Rapid porting of ASR-systems to mobile devices. 233-236
- Hugo Meinedo, João Paulo Neto:
A stream-based audio segmentation, classification and clustering pre-processing system for broadcast news using ANN models. 237-240
- Etienne Marcheret, Karthik Visweswariah, Gerasimos Potamianos:
Speech activity detection fusing acoustic phonetic and energy features. 241-244
- Zoltán Tüske, Péter Mihajlik, Zoltán Tobler, Tibor Fegyó:
Robust voice activity detection based on the entropy of noise-suppressed spectrum. 245-248
- Masamitsu Murase, Shun'ichi Yamamoto, Jean-Marc Valin, Kazuhiro Nakadai, Kentaro Yamada, Kazunori Komatani, Tetsuya Ogata, Hiroshi G. Okuno:
Multiple moving speaker tracking by microphone array on mobile robot. 249-252
Speech Recognition - Adaptation I, II
- Yaxin Zhang, Bian Wu, Xiaolin Ren, Xin He:
A speaker biased SI recognizer for embedded mobile applications. 253-256
- Bart Bakker, Carsten Meyer, Xavier L. Aubert:
Fast unsupervised speaker adaptation through a discriminative eigen-MLLR algorithm. 257-260
- Rusheng Hu, Jian Xue, Yunxin Zhao:
Incremental largest margin linear regression and MAP adaptation for speech separation in telemedicine applications. 261-264
- Giulia Garau, Steve Renals, Thomas Hain:
Applying vocal tract length normalization to meeting recordings. 265-268
- Srinivasan Umesh, András Zolnay, Hermann Ney:
Implementing frequency-warping and VTLN through linear transformation of conventional MFCC. 269-272
- Xiaodong Cui, Abeer Alwan:
MLLR-like speaker adaptation based on linearization of VTLN with MFCC features. 273-276
- Chandra Kant Raut, Takuya Nishimoto, Shigeki Sagayama:
Model adaptation by state splitting of HMM for long reverberation. 277-280
- Daben Liu, Daniel Kiecza, Amit Srivastava, Francis Kubala:
Online speaker adaptation and tracking for real-time speech recognition. 281-284
- Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Automatic speech recognition based on adaptation and clustering using temporal-difference learning. 285-288
- Hui Ye, Steve J. Young:
Improving the speech recognition performance of beginners in spoken conversational interaction for language learning. 289-292
- Randy Gomez, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Rapid unsupervised speaker adaptation based on multi-template HMM sufficient statistics in noisy environments. 293-296
- Dong-jin Choi, Yung-Hwan Oh:
Rapid speaker adaptation for continuous speech recognition using merging eigenvoices. 297-300
Signal Analysis, Processing and Feature Estimation I-III
- Jian Liu, Thomas Fang Zheng, Jing Deng, Wenhu Wu:
Real-time pitch tracking based on combined SMDSF. 301-304
- András Bánhalmi, Kornél Kovács, András Kocsor, László Tóth:
Fundamental frequency estimation by least-squares harmonic model fitting. 305-308
- Siu Wa Lee, Frank K. Soong, Pak-Chung Ching:
Harmonic filtering for joint estimation of pitch and voiced source with single-microphone input. 309-312
- Marián Képesi, Luis Weruaga:
High-resolution noise-robust spectral-based pitch estimation. 313-316
- John-Paul Hosom:
F0 estimation for adult and children's speech. 317-320
- Ben Milner, Xu Shao, Jonathan Darch:
Fundamental frequency and voicing prediction from MFCCs for speech reconstruction from unconstrained speech. 321-324
- Nelly Barbot, Olivier Boëffard, Damien Lolive:
F0 stylisation with a free-knot B-spline model and simulated-annealing optimization. 325-328
- Friedhelm R. Drepper:
Voiced excitation as entrained primary response of a reconstructed glottal master oscillator. 329-332
- Damien Vincent, Olivier Rosec, Thierry Chonavel:
Estimation of LF glottal source parameters based on an ARX model. 333-336
- Leigh D. Alsteris, Kuldip K. Paliwal:
Some experiments on iterative reconstruction of speech from STFT phase and magnitude spectra. 337-340
- R. Muralishankar, Abhijeet Sangwan, Douglas D. O'Shaughnessy:
Statistical properties of the warped discrete cosine transform cepstrum compared with MFCC. 341-344
- Aníbal J. S. Ferreira:
New signal features for robust identification of isolated vowels. 345-348
- Jonathan Pincas, Philip J. B. Jackson:
Amplitude modulation of frication noise by voicing saturates. 349-352
- Ron M. Hecht, Naftali Tishby:
Extraction of relevant speech features using the information bottleneck method. 353-356
- Mohammad Firouzmand, Laurent Girin, Sylvain Marchand:
Comparing several models for perceptual long-term modeling of amplitude and phase trajectories of sinusoidal speech. 357-360
- Hynek Hermansky, Petr Fousek:
Multi-resolution RASTA filtering for TANDEM-based ASR. 361-364
- Woojay Jeon, Biing-Hwang Juang:
A category-dependent feature selection method for speech signals. 365-368
- Trausti T. Kristjansson, Sabine Deligne, Peder A. Olsen:
Voicing features for robust speech detection. 369-372
Robust Speech Recognition I-IV
- Svein Gunnar Pettersen, Magne Hallstein Johnsen, Tor André Myrvoll:
Joint Bayesian predictive classification and parallel model combination for robust speech recognition. 373-376
- Glauco F. G. Yared, Fábio Violaro, Lívio C. Sousa:
Gaussian elimination algorithm for HMM complexity reduction in continuous speech recognition systems. 377-380
- Luis Buera, Eduardo Lleida, Antonio Miguel, Alfonso Ortega:
Robust speech recognition in cars using phoneme dependent multi-environment linear normalization. 381-384
- Yi Chen, Lin-Shan Lee:
Energy-based frame selection for reliable feature normalization and transformation in robust speech recognition. 385-388
- Yoshitaka Nakajima, Hideki Kashioka, Kiyohiro Shikano, Nick Campbell:
Remodeling of the sensor for non-audible murmur (NAM). 389-392
- Amarnag Subramanya, Jeff A. Bilmes, Chia-Ping Chen:
Focused word segmentation for ASR. 393-396
Speech Perception I, II
- Jennifer A. Alexander, Patrick C. M. Wong, Ann R. Bradlow:
Lexical tone perception in musicians and non-musicians. 397-400
- Joan K.-Y. Ma, Valter Ciocca, Tara L. Whitehill:
Contextual effect on perception of lexical tones in Cantonese. 401-404
- Hansjörg Mixdorff, Yu Hu, Denis Burnham:
Visual cues in Mandarin tone perception. 405-408
- Hansjörg Mixdorff, Yu Hu:
Cross-language perception of word stress. 409-412
- Anne Cutler:
The lexical statistics of word recognition problems caused by L2 phonetic confusion. 413-416
- Chun-Fang Huang, Masato Akagi:
A multi-layer fuzzy logical model for emotional speech perception. 417-420
Spoken Language Understanding I, II
- Ian R. Lane, Tatsuya Kawahara:
Utterance verification incorporating in-domain confidence and discourse coherence measures. 421-424
- Constantinos Boulis, Mari Ostendorf:
Using symbolic prominence to help design feature subsets for topic classification and clustering of natural human-human conversations. 425-428
- Katsuhito Sudoh, Hajime Tsukada:
Tightly integrated spoken language understanding using word-to-concept translation. 429-432
- Ruhi Sarikaya, Hong-Kwang Jeff Kuo, Vaibhava Goel, Yuqing Gao:
Exploiting unlabeled data using multiple classifiers for improved natural language call-routing. 433-436
- Hong-Kwang Jeff Kuo, Vaibhava Goel:
Active learning with minimum expected error for spoken language understanding. 437-440
- Matthias Thomae, Tibor Fábián, Robert Lieb, Günther Ruske:
Lexical out-of-vocabulary models for one-stage speech interpretation. 441-444
E-inclusion and Spoken Language Processing I, II
- Mark S. Hawley, Phil D. Green, Pam Enderby, Stuart P. Cunningham, Roger K. Moore:
Speech technology for e-inclusion of people with physical disabilities and disordered speech. 445-448
- Björn Granström:
Speech technology for language training and e-inclusion. 449-452
- Roger C. F. Tucker, Ksenia Shalonova:
Supporting the creation of TTS for local language voice information systems. 453-456
- Ove Andersen, Christian Hjulmand:
Access for all - a talking internet service. 457-460
- Knut Kvale, Narada D. Warakagoda:
A speech centric mobile multimodal service useful for dyslectics and aphasics. 461-464
Paralinguistic and Nonlinguistic Information in Speech
- Nick Campbell, Hideki Kashioka, Ryo Ohara:
No laughing matter. 465-468
- Christophe Blouin, Valérie Maffiolo:
A study on the automatic detection and characterization of emotion in a voice service context. 469-472
- Raul Fernandez, Rosalind W. Picard:
Classical and novel discriminant features for affect recognition from speech. 473-476
- Jaroslaw Cichosz, Krzysztof Slot:
Low-dimensional feature space derivation for emotion recognition. 477-480
- Carlos Toshinori Ishi, Hiroshi Ishiguro, Norihiro Hagita:
Proposal of acoustic measures for automatic detection of vocal fry. 481-484
- Khiet P. Truong, David A. van Leeuwen:
Automatic detection of laughter. 485-488
- Anton Batliner, Stefan Steidl, Christian Hacker, Elmar Nöth, Heinrich Niemann:
Tales of tuning - prototyping for automatic classification of emotional user states. 489-492
- Iker Luengo, Eva Navas, Inmaculada Hernáez, Jon Sánchez:
Automatic emotion recognition using prosodic parameters. 493-496
- Sungbok Lee, Serdar Yildirim, Abe Kazemzadeh, Shrikanth S. Narayanan:
An articulatory study of emotional speech production. 497-500
- Gregor Hofer, Korin Richmond, Robert A. J. Clark:
Informed blending of databases for emotional speech synthesis. 501-504
- Fabio Tesser, Piero Cosi, Carlo Drioli, Graziano Tisato:
Emotional FESTIVAL-MBROLA TTS synthesis. 505-508