John H. L. Hansen, Bryan L. Pellom:
7th International Conference on Spoken Language Processing, ICSLP2002 - INTERSPEECH 2002, Denver, Colorado, USA, September 16-20, 2002. ISCA 2002
W. Tecumseh Fitch:
The evolution of spoken language: a comparative approach.
Steve J. Young:
Talking to machines (statistically speaking).
Duncan Macho, Laurent Mauuary, Bernhard Noé, Yan Ming Cheng, Douglas Ealey, Denis Jouvet, Holly Kelleher, David Pearce, Fabien Saadoun:
Evaluation of a noise-robust DSR front-end on Aurora databases.
André Gustavo Adami, Lukás Burget, Stéphane Dupont, Harinath Garudadri, Frantisek Grézl, Hynek Hermansky, Pratibha Jain, Sachin S. Kajarekar, Nelson Morgan, Sunil Sivadas:
Qualcomm-ICSI-OGI features for ASR.
Michael Kleinschmidt, David Gelbart:
Improving word accuracy with Gabor feature extraction.
Jasha Droppo, Li Deng, Alex Acero:
Evaluation of SPLICE on the Aurora 2 and 3 tasks.
Brian Kan-Wing Mak, Yik-Cheung Tam:
Performance of discriminatively trained auditory features on Aurora2 and Aurora3.
José C. Segura, M. Carmen Benítez, Ángel de la Torre, Antonio J. Rubio:
Feature extraction combining spectral noise reduction and cepstral histogram equalization for robust ASR.
Jingdong Chen, Dimitris Dimitriadis, Hui Jiang, Qi Li, Tor André Myrvoll, Olivier Siohan, Frank K. Soong:
Bell Labs approach to Aurora evaluation on connected digit recognition.
Hong Kook Kim, Richard C. Rose:
Algorithms for distributed speech recognition in a noisy automobile environment.
Florian Hilger, Sirko Molau, Hermann Ney:
Quantile based histogram equalization for online applications.
Chia-Ping Chen, Karim Filali, Jeff A. Bilmes:
Frontend post-processing and backend model enhancement on the Aurora 2.0/3.0 databases.
Masaki Ida, Satoshi Nakamura:
HMM composition-based rapid model adaptation using a priori noise GMM adaptation evaluation on Aurora2 corpus.
Jeih-Weih Hung, Lin-Shan Lee:
Data-driven temporal filters obtained via different optimization criteria evaluated on Aurora2 database.
Bojan Kotnik, Damjan Vlaj, Zdravko Kacic, Bogomir Horvat:
Efficient additive and convolutional noise reduction procedures.
Markus Lieb, Alexander Fischer:
Progress with the Philips continuous ASR system on the Aurora 2 noisy digits database.
Jian Wu, Qiang Huo:
An environment compensated minimum classification error training approach and its evaluation on Aurora2 database.
Kaisheng Yao, Donglai Zhu, Satoshi Nakamura:
Evaluation of a noise adaptive speech recognition system on the Aurora 3 database.
Laura Docío Fernández, Carmen García-Mateo:
Distributed speech recognition over IP networks on the Aurora 3 database.
Masakiyo Fujimoto, Yasuo Ariki:
Evaluation of noisy speech recognition based on noise reduction and acoustic model adaptation on the Aurora2 tasks.
George Saon, Juan M. Huerta:
Improvements to the IBM Aurora 2 multi-condition system.
Pratibha Jain, Hynek Hermansky, Brian Kingsbury:
Distributed speech recognition using noise-robust MFCC and TRAPS-estimated manner features.
Norihide Kitaoka, Seiichi Nakagawa:
Evaluation of spectral subtraction with smoothing of time direction on the Aurora 2 task.
Xiaodong Cui, Markus Iseli, Qifeng Zhu, Abeer Alwan:
Evaluation of noise robust features on the Aurora databases.
Nicholas W. D. Evans, John S. D. Mason:
Computationally efficient noise compensation for robust automatic speech recognition assessed under the Aurora 2/3 framework.
Omar Farooq, Sekharjit Datta:
Mel-scaled wavelet filter based features for noisy unvoiced phoneme recognition.
Kazuo Onoe, Hiroyuki Segi, Takeshi Kobayakawa, Shoei Sato, Toru Imai, Akio Ando:
Filter bank subtraction for robust speech recognition.
Andrew C. Morris, Simon Payne, Hervé Bourlard:
Low cost duration modelling for noise robust speech recognition.
A comparative study of approximations for parallel model combination of static and dynamic parameters.
Petr Motlícek, Lukás Burget:
Noise estimation for efficient speech enhancement and robust speech recognition.
Özgür Çetin, Harriet J. Nock, Katrin Kirchhoff, Jeff A. Bilmes, Mari Ostendorf:
The 2001 GMTK-based SPINE ASR system.
Using adaptive signal limiter together with weighting techniques for noisy speech recognition.
Shingo Yamade, Kanako Matsunami, Akira Baba, Akinobu Lee, Hiroshi Saruwatari, Kiyohiro Shikano:
Spectral subtraction in noisy environments applied to speaker adaptation based on HMM sufficient statistics.
Man-Hung Siu, Yu-Chung Chan:
Robust speech recognition against short-time noise.
Mario Toma, Andrea Lodi, Roberto Guerrieri:
Word endpoints detection in the presence of non-stationary noise.
Pere Pujol Marsal, Susagna Pol, Astrid Hagen, Hervé Bourlard, Climent Nadeu:
Comparison and combination of RASTA-PLP and FF features in a hybrid HMM/MLP speech recognition system.
Tao Xu, Zhigang Cao:
Robust MMSE-FW-LAASR scheme at low SNRs.
András Zolnay, Ralf Schlüter, Hermann Ney:
Robust speech recognition using a voiced-unvoiced feature.
Febe de Wet, Johan de Veth, Bert Cranen, Lou Boves:
Accumulated Kullback divergence for analysis of ASR performance in the presence of noise.
Brian Kingsbury, Pratibha Jain, André Gustavo Adami:
A hybrid HMM/TRAPS model for robust voice activity detection.
Chengyi Zheng, Yonghong Yan:
Run time information fusion in speech recognition.
Jon A. Arrowood, Mark A. Clements:
Using observation uncertainty in HMM decoding.
Matthew N. Stuttle, M. J. F. Gales:
Combining a Gaussian mixture model front end with MFCC parameters.
Jasha Droppo, Alex Acero, Li Deng:
Noise from corrupted speech log mel-spectral energies.
Carlos Lima, Luís B. Almeida, João L. Monteiro:
Improving the role of unvoiced speech segments by spectral normalisation in robust speech recognition.
Venkata Ramana Rao Gadde, Andreas Stolcke, Dimitra Vergyri, Jing Zheng, M. Kemal Sönmez, Anand Venkataraman:
Building an ASR system for noisy environments: SRI's 2001 SPINE evaluation system.
R. J. J. H. van Son, Louis C. W. Pols:
Evidence for efficiency in vowel production.
Matthew P. Aylett:
Stochastic suprasegmentals: relationship between the spectral characteristics of vowels, redundancy and prosodic structure.
Jihène Serkhane, Jean-Luc Schwartz, Louis-Jean Boë, Barbara L. Davis, Christine L. Matyear:
Motor specifications of a baby robot via the analysis of infants' vocalizations.
Laura L. Koenig, Jorge C. Lucero:
Oral-laryngeal control patterns for fricatives in 5-year-olds and adults.
Véronique Delvaux, Thierry Metens, Alain Soquet:
French nasal vowels: acoustic and articulatory properties.
Patrick Kenny, Gilles Boulianne, Pierre Dumouchel:
Maximum likelihood estimation of eigenvoices and residual variances for large vocabulary speech recognition tasks.
Ernest Pusateri, Timothy J. Hazen:
Rapid speaker adaptation using speaker clustering.
Chao Huang, Tao Chen, Eric Chang:
Adaptive model combination for dynamic speaker selection training.
Ka-Yan Kwan, Tan Lee, Chen Yang:
Unsupervised n-best based model adaptation using model-level confidence measures.
Patrick Nguyen, Luca Rigazio, Christian Wellekens, Jean-Claude Junqua:
LU factorization for feature transformation.
Guo-Hong Ding, Yi-Fei Zhu, Chengrong Li, Bo Xu:
Implementing vocal tract length normalization in the MLLR framework.
Dong Kook Kim, Nam Soo Kim:
Markov models based on speaker space model evolution.
Baojie Li, Keikichi Hirose, Nobuaki Minematsu:
Robust speech recognition using inter-speaker and intra-speaker adaptation.
Carlos Lima, Luís B. Almeida, João L. Monteiro:
Continuous environmental adaptation of a speech recogniser in telephone line conditions.
Tree-structured maximum a posteriori adaptation for a segment-based speech recognition system.
Thomas Plötz, Gernot A. Fink:
Robust time-synchronous environmental adaptation for continuous speech recognition systems.
Thomas Niesler, Daniel Willett:
Unsupervised language model adaptation for lecture speech transcription.
Yongxin Li, Hakan Erdogan, Yuqing Gao, Etienne Marcheret:
Incremental on-line feature space MLLR adaptation for telephony speech recognition.
Sirko Molau, Florian Hilger, Daniel Keysers, Hermann Ney:
Enhanced histogram normalization in the acoustic feature space.
David N. Levin:
Blind normalization of speech from different channels and speakers.
Jun Ogata, Yasuo Ariki:
Unsupervised acoustic model adaptation based on phoneme error minimization.
Bowen Zhou, John H. L. Hansen:
Improved structural maximum likelihood eigenspace mapping for rapid speaker adaptation.
Ángel de la Torre, Dominique Fohr, Jean Paul Haton:
Statistical adaptation of acoustic models to noise conditions for robust speech recognition.
Fabio Brugnara, Mauro Cettolo, Marcello Federico, Diego Giuliani:
Issues in automatic transcription of historical audio data.
Verna Stockmal, Zinny S. Bond:
Same talker, different language: a replication.
A. K. V. Sai Jayram, V. Ramasubramanian, T. V. Sreenivas:
Automatic language identification using acoustic sub-word units.
Ian Maddieson, Ioana Vasilescu:
Factors in human language identification.
Pedro A. Torres-Carrasquillo, Elliot Singer, Mary A. Kohler, Richard J. Greene, Douglas A. Reynolds, John R. Deller Jr.:
Approaches to language identification using Gaussian mixture models and shifted delta cepstral features.
Eddie Wong, Sridha Sridharan:
Methods to improve Gaussian mixture model based language identification system.
Hongyan Jing, Evelyne Tzoukermann:
Part-of-speech tagging in French text-to-speech synthesis: experiments in tagset selection.
Grapheme-to-phoneme conversion using pseudo-morphological units.
Maximilian Bisani, Hermann Ney:
Investigations on joint-multigram models for grapheme-to-phoneme conversion.
Lucian Galescu, James F. Allen:
Pronunciation of proper names with a joint n-gram model for bi-directional grapheme-to-phoneme conversion.
Matthias Jilka, Ann K. Syrdal:
The AT&T German text-to-speech system: realistic linguistic description.
Haiping Li, Fangxin Chen, Liqin Shen:
Generating script using statistical information of the context variation unit vector.
Chih-Chung Kuo, Jing-Yi Huang:
Efficient and scalable methods for text script generation in corpus-based TTS design.
Peter Rutten, Matthew P. Aylett, Justin Fackrell, Paul Taylor:
A statistically motivated database pruning technique for unit selection synthesis.
Yi-Jian Wu, Yu Hu, Xiaoru Wu, Ren-Hua Wang:
A new method of building decision tree based on target information.
Junichi Yamagishi, Masatsune Tamura, Takashi Masuko, Keiichi Tokuda, Takao Kobayashi:
A context clustering technique for average voice model in HMM-based speech synthesis.
Minoru Tsuzaki, Hisashi Kawai:
Feature extraction for unit selection in concatenative speech synthesis: comparison between AIM, LPC, and MFCC.
Francisco Campillo Díaz, Eduardo Rodríguez Banga:
Combined prosody and candidate unit selections for corpus-based text-to-speech systems.
Yeon-Jun Kim, Alistair Conkie:
Automatic segmentation combining an HMM-based approach and spectral boundary correction.
Abhinav Sethy, Shrikanth S. Narayanan:
Refined speech segmentation for concatenative speech synthesis.
Andrew P. Breen, Barry Eggleton, Peter Dion, Steve Minnis:
Refocussing on the text normalisation process in text-to-speech systems.
Jithendra Vepa, Jahnavi Ayachitam, K. V. K. Kalpana Reddy:
A text-to-speech synthesis system for Telugu.
Diamantino Freitas, Daniela Braga:
Towards an intonation module for a Portuguese TTS system.
Takashi Saito, Masaharu Sakamoto:
Applying a hybrid intonation model to a seamless speech synthesizer.
Toshio Hirai, Seiichi Tenpaku, Kiyohiro Shikano:
Using start/end timings of spectral transitions between phonemes in concatenative speech synthesis.
Jinfu Ni, Hisashi Kawai:
Design of a Mandarin sentence set for corpus-based speech synthesis by use of a multi-tier algorithm taking account of the varied prosodic and spectral characteristics.
Hiroki Mori, Takahiro Ohtsuka, Hideki Kasuya:
A data-driven approach to source-formant type text-to-speech system.
Yu Shi, Eric Chang, Hu Peng, Min Chu:
Power spectral density based channel equalization of large speech database for concatenative TTS system.
Helen M. Meng, Chi-Kin Keung, Kai-Chung Siu, Tien Ying Fung, P. C. Ching:
CU VOCAL: corpus-based syllable concatenation for Chinese speech synthesis across domains and dialects.
Jinlin Lu, Hisashi Kawai:
Perceptual evaluation of naturalness due to substitution of Chinese syllable for concatenative speech synthesis.
Dan Chazan, Ron Hoory, Zvi Kons, Dorel Silberstein, Alexander Sorin:
Reducing the footprint of the IBM trainable speech synthesis system.
Sung-Joo Lee, Hyung Soon Kim:
Computationally efficient time-scale modification of speech using 3 level clipping.
Zhiwei Shuang, Yu Hu, Zhen-Hua Ling, Ren-Hua Wang:
A miniature Chinese TTS system based on tailored corpus.
Hoeun Song, Jaein Kim, Kyongrok Lee, Jinyoung Kim:
Phonetic normalization using z-score in segmental prosody estimation for corpus-based TTS system.
Hideki Kawahara, Parham Zolfaghari, Alain de Cheveigné:
On F0 trajectory optimization for very high-quality speech manipulation.
Tan Lee, Greg Kochanski, Chilin Shih, Yujia Li:
Modeling tones in continuous Cantonese speech.
Minghui Dong, Kim-Teng Lua:
Pitch contour model for Chinese text-to-speech using CART and statistical model.
Phuay Hui Low, Saeed Vaseghi:
Application of microprosody models in text to speech synthesis.
Sheng Zhao, Jianhua Tao, Lianhong Cai:
Prosodic phrasing with inductive learning.
Ben Milner, Xu Shao:
Speech reconstruction from mel-frequency cepstral coefficients using a source-filter model.
Hiromichi Kawanami, Tsuyoshi Masuda, Tomoki Toda, Kiyohiro Shikano:
Designing Japanese speech database covering wide range in prosody for hybrid speech synthesizer.
Dirk Bühler, Wolfgang Minker, Jochen Häußler, Sven Krüger:
Flexible multimodal human-machine interaction in mobile environments.
Edward C. Kaiser, Philip R. Cohen:
Implementation testing of a hybrid symbolic/statistical multimodal architecture.
Yoko Yamakata, Tatsuya Kawahara, Hiroshi G. Okuno:
Belief network based disambiguation of object reference in spoken dialogue system for robot.
Jonas Beskow, Jens Edlund, Magnus Nordstrand:
Specification and realisation of multimodal output in dialogue systems.
Francis K. H. Quek, Yingen Xiong, David McNeill:
Gestural trajectory symmetries and discourse segmentation.
Francis K. H. Quek, David McNeill, Robert K. Bryll, Mary P. Harper:
Gestural spatialization in natural discourse segmentation.
Kazuhiro Nakadai, Hiroshi G. Okuno, Hiroaki Kitano:
Real-time sound source localization and separation for robot audition.
Jiyong Ma, Jie Yan, Ronald A. Cole:
CU animate tools for enabling conversations with animated characters.
Philip R. Cohen, Rachel Coulston, Kelly Krout:
Multiparty multimodal interaction: a preliminary analysis.
Peter Poller, Jochen Müller:
Distributed audio-visual speech synchronization.
Philippe Daubias, Paul Deléglise:
Lip-reading based on a fully automatic statistical model.
Xiaoxing Liu, Yibao Zhao, Xiaobo Pi, Luhong Liang, Ara V. Nefian:
Audio-visual continuous speech recognition using a coupled hidden Markov model.
Laila Dybkjær, Niels Ole Bernsen:
Data, annotation schemes and coding tools for natural interactivity.
Francis K. H. Quek, Yang Shi, Cemil Kirbas, Shunguang Wu:
VisSTA: a tool for analyzing multimodal discourse data.
Stephen G. Lambacher, William L. Martens, Kazuhiko Kakehi:
The influence of identification training on identification and production of the American English mid and low vowels by native speakers of Japanese.
Keiichi Tajima, Reiko Akahane-Yamada, Tsuneo Yamada:
Perceptual learning of second-language syllable rhythm by elderly listeners.
Constance M. Clarke:
Perceptual adjustment to foreign-accented English with short term exposure.
Denis K. Burnham, Ron Brooker:
Absolute pitch and lexical tones: tone perception by non-musician, musician, and absolute pitch non-tonal language speakers.
Comprehension of non-native speech: inaccurate phoneme processing and activation of lexical competitors.
Overview on recent activities in speech understanding and dialogue systems evaluation.
Marilyn A. Walker, Alexander I. Rudnicky, Rashmi Prasad, John S. Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Wright Hastie, Audrey N. Le, Bryan L. Pellom, Alexandros Potamianos, Rebecca J. Passonneau, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard:
DARPA Communicator: cross-system results for the 2001 evaluation.
Marilyn A. Walker, Alexander I. Rudnicky, John S. Aberdeen, Elizabeth Owen Bratt, John S. Garofolo, Helen Wright Hastie, Audrey N. Le, Bryan L. Pellom, Alexandros Potamianos, Rebecca J. Passonneau, Rashmi Prasad, Salim Roukos, Gregory A. Sanders, Stephanie Seneff, David Stallard:
DARPA Communicator evaluation: progress from 2000 to 2001.
Gregory A. Sanders, Audrey N. Le, John S. Garofolo:
Effects of word error rate in the DARPA Communicator data during 2000 and 2001.
Candace L. Sidner, Clifton Forlines:
Subset languages for conversing with collaborative interface agents.