


INTERSPEECH 2003: Geneva, Switzerland
- 8th European Conference on Speech Communication and Technology, EUROSPEECH 2003 - INTERSPEECH 2003, Geneva, Switzerland, September 1-4, 2003. ISCA 2003
Plenary Talks
- Kenneth Ward Church:
Speech and language processing: where have we been and where are we going?
- Birger Kollmeier:
Auditory principles in speech processing - do computers need silicon ears?
Aurora Noise Robustness on SMALL Vocabulary Databases
- Kaisheng Yao, Erik M. Visser, Oh-Wook Kwon, Te-Won Lee:
A speech processing front-end with eigenspace normalization for robust speech recognition in noisy automobile environments.
- Yiu-Pong Lai, Man-Hung Siu:
Maximum likelihood normalization for robust speech recognition.
- Veronique Stouten, Hugo Van hamme, Kris Demuynck, Patrick Wambacq:
Robust speech recognition using model-based feature enhancement.
- Jian Wu, Qiang Huo:
Several HKU approaches for robust speech recognition and their evaluation on Aurora connected digit recognition tasks.
- Yadong Wang, Jesse Hansen, Gopi Krishna Allu, Ramdas Kumaresan:
Average instantaneous frequency (AIF) and average log-envelopes (ALE) for ASR with the Aurora 2 database.
- Akira Sasou, Futoshi Asano, Kazuyo Tanaka, Satoshi Nakamura:
Adaptation of acoustic model using the gain-adapted HMM decomposition method.
ISCA Special Interest Group Session: "Hot Topics" in Speech Science and Technology
- Jean-François Bonastre, Frédéric Bimbot, Louis-Jean Boë, Joseph P. Campbell, Douglas A. Reynolds, Ivan Magrin-Chagnolleau:
Person authentication by voice: a need for caution.
- Gérard Bailly, Nick Campbell, Bernd Möbius:
ISCA special session: hot topics in speech synthesis.
- Béatrice de Gelder:
Perceiving emotions by ear and by eye.
- Steven Greenberg:
Strategies for automatic multi-tier annotation of spoken language corpora.
- Lin-Shan Lee, Yuan Ho, Jia-fu Chen, Shun-Chuan Chen:
Why is the special structure of the language important for Chinese spoken language processing? - examples on spoken document retrieval, segmentation and summarization.
Speech Signal Processing 1-4
- Luis Weruaga, Marián Képesi:
Speech analysis with the short-time chirp transform.
- Ixone Arroabarren, Alfonso Carlosena:
Glottal spectrum based inverse filtering.
- G. V. Kiran, Thippur V. Sreenivas:
A novel method of analysing and comparing responses of hearing aid algorithms using auditory time-frequency representation.
- Kuldip K. Paliwal, Bishnu S. Atal:
Frequency-related representation of speech.
- Vikas C. Raykar, Ramani Duraiswami, B. Yegnanarayana, S. R. Mahadeva Prasanna:
Tracking a moving speaker using excitation source information.
- Li Deng, Issam Bazzi, Alex Acero:
Tracking vocal tract resonances using an analytical nonlinear predictor and a target-guided temporal constraint.
- Khosrow Lashkari, Toshio Miki:
Optimization of the CELP model in the LSP domain.
- Ben Gillett, Simon King:
Transforming voice quality.
- Yusuke Hioka, Nozomu Hamada:
DOA estimation of speech signal using equilateral-triangular microphone array.
- Ilyas Potamitis, George Tremoulis, Nikos Fakotakis, George Kokkinakis:
Multi-array fusion for beamforming and localization of moving speakers.
- Xu Shao, Ben P. Milner, Stephen J. Cox:
Integrated pitch and MFCC extraction for speech reconstruction and speech recognition applications.
- Lasse Laaksonen, Sakari Himanen, Ari Heikkinen, Jani Nurminen:
Exploiting time warping in AMR-NB and AMR-WB speech coders.
- Stephan Grashey:
A new approach to voice activity detection based on self-organizing maps.
- Yoshinori Shiga, Simon King:
Estimating the spectral envelope of voiced speech using multi-frame analysis.
- Essa Jafer, Abdulhussain E. Mahdi:
Adaptive noise estimation using second generation and perceptual wavelet transforms.
- Julien Bourgeois:
A clustering approach to on-line audio source separation.
- Yoshinori Shiga, Simon King:
Estimation of voice source and vocal tract characteristics based on multi-frame analysis.
- Taoufik En-Najjary, Olivier Rosec, Thierry Chonavel:
A new method for pitch prediction from spectral envelope and its application in voice conversion.
- Marco Orlandi, Alfiero Santarelli, Daniele Falavigna:
Maximum likelihood endpoint detection with time-domain features.
- Ixone Arroabarren, Alfonso Carlosena:
Unified analysis of glottal source spectrum.
- Aïcha Bouzid, Noureddine Ellouze:
Local regularity analysis at glottal opening and closure instants in electroglottogram signal using wavelet transform modulus maxima.
- Martin Schafföner, Marcel Katz, Sven E. Krüger, Andreas Wendemuth:
Improved robustness of automatic speech recognition using a new class definition in linear discriminant analysis.
- Oytun Türk, Levent M. Arslan:
Voice conversion methods for vocal tract and pitch contour modification.
- Olaf Schreiner:
Modulation spectrum for pitch and speech pause detection.
- Dimitrios Dimitriadis, Petros Maragos:
Robust energy demodulation based on continuous models with application to speech recognition.
- Jong Uk Kim, Sang-Gyun Kim, Chang D. Yoo:
A robust and sensitive word boundary decision algorithm.
- Seongho Seo, Dalwon Jang, Sunil Lee, Chang D. Yoo:
A novel transcoding algorithm for SMV and G.723.1 speech coders via direct parameter transformation.
- Dalwon Jang, Seongho Seo, Sunil Lee, Chang D. Yoo:
A novel rate selection algorithm for transcoding CELP-type codec and SMV.
- Gary Choy, David Hermann, Robert L. Brennan, Todd Schneider, Hamid Sheikhzadeh, Etienne Cornu:
Subband-based acoustic shock limiting algorithm on a low-resource DSP system.
- Patricia A. Pelle, Matias L. Capeletto:
Pitch estimation using phase locked loops.
- Dhany Arifianto, Takao Kobayashi:
Performance evaluation of IFAS-based fundamental frequency estimator in noisy environment.
- Hans Kruschke, Michael Lenz:
Estimation of the parameters of the quantitative intonation model with continuous wavelet analysis.
- Francisco Romero Rodriguez, Wei Ming Liu, Nicholas W. D. Evans, John S. D. Mason:
Morphological filtering of speech spectrograms in the context of additive noise.
- Guillaume Lathoud, Iain McCowan, Darren Moore:
Segmenting multiple concurrent speakers using microphone arrays.
- T. Nagarajan, Hema A. Murthy, Rajesh M. Hegde:
Segmentation of speech into syllable-like units.
- Massimo Petrillo, Francesco Cutugno:
A syllable segmentation algorithm for English and Italian.
- Ashish Verma, Arun Kumar:
Modeling speaking rate for voice fonts.
- Jouni Pohjalainen:
A new HMM-based approach to broad phonetic classification of speech.
- Xin Zhong, Mark A. Clements, Sung Lim:
Acoustic change detection and segment clustering of two-way telephone conversations.
- David N. Levin:
Blind normalization of speech from different channels.
- Aparna Gurijala, John R. Deller Jr.:
Speech watermarking by parametric embedding with an ℓ∞ fidelity criterion.
Phonology and Phonetics I
- Shu-Chuan Tseng:
Features of contracted syllables of spontaneous Mandarin.
- K. Samudravijaya:
Durational characteristics of Hindi stop consonants.
- Toshiko Isei-Jaakkola:
Quantity comparison of Japanese and Finnish in various word structures.
- Mary Baltazani:
Broad focus across sentence types in Greek.
- Chatchawarn Hansakunbuntheung, Virongrong Tesprasit, Rungkarn Siricharoenchai, Yoshinori Sagisaka:
Analysis and modeling of syllable duration for Thai speech synthesis.
- Aoju Chen:
Reaction time as an indicator of discrete intonational contrasts in English.
- Dafydd Gibbon:
Corpus-based syntax-prosody tree matching.
- D. W. Ying, W. Gao, W. Q. Wang:
A new approach to segment and detect syllables from high-speed speech.
- R. J. J. H. van Son, Louis C. W. Pols:
Information structure and efficiency in speech production.
- Anna Corazza, Louis ten Bosch:
Learning rule ranking by dynamic construction of context-free grammars using AND/OR graphs.
- Elena Zvonik, Fred Cummins:
The effect of surrounding phrase lengths on pause duration.
- Shigeki Okawa, Katsuhiko Shirai:
Statistical estimation of phoneme's most stable point based on universal constraint.
- Nicole Beringer:
Independent automatic segmentation by self-learning categorial pronunciation rules.
- Bettina Braun, D. Robert Ladd:
Prosodic correlates of contrastive and non-contrastive themes in German.
- Yiya Chen:
Accentual lengthening in Standard Chinese: evidence from four-syllable constituents.
- Supphanat Kanokphara:
Syllable structure based phonetic units for context-dependent continuous Thai speech recognition.
- Fang Hu:
An acoustic phonetic analysis of diphthongs in Ningbo Chinese.
- Takashi Otake, Yoko Sakamoto:
Latent ability to manipulate phonemes by Japanese preliterates in Roman alphabet.
- Hartmut R. Pfitzinger:
The /i/-/a/-/u/-ness of spoken vowels.
Topics in Prosody and Emotional Speech
- Ben Gillett, Simon King:
Transforming F0 contours.
- Norman D. Cook, Takeshi Fujisawa, Kazuaki Takami:
Evaluation of the affect of speech intonation using a model of the perception of interval dissonance and harmonic tension.
- Wen-Hsing Lai, Yih-Ru Wang, Sin-Horng Chen:
A new pitch modeling approach for Mandarin speech.
- Panagiotis Zervas, Manolis Maragoudakis, Nikos Fakotakis, George Kokkinakis:
Bayesian induction of intonational phrase breaks.
- Thibaut Ehrette, Noël Chateau, Christophe d'Alessandro, Valérie Maffiolo:
Predicting the perceptive judgment of voices in a telecom context: selection of acoustic parameters.
- Sven L. Mattys:
Stress-based speech segmentation revisited.
- Oh-Wook Kwon, Kwokleung Chan, Jiucang Hao, Te-Won Lee:
Emotion recognition by speech signals.
- Fabio Tamburini:
Automatic prosodic prominence detection in speech using acoustic features: an unsupervised system.
- Vladimir Hozjan, Zdravko Kacic:
Improved emotion recognition with large set of statistical features.
- Patavee Charnvivit, Nuttakorn Thubthong, Ekkarit Maneenoi, Sudaporn Luksaneeyanawin, Somchai Jitapunkul:
Recognition of intonation patterns in Thai utterance.
- Keikichi Hirose, Yusuke Furuyama, Shuichi Narusawa, Nobuaki Minematsu, Hiroya Fujisaki:
Use of linguistic information for automatic extraction of F0 contour generation process model parameters.
- Marion Dohen, Hélène Loevenbruck, Marie-Agnès Cathiard, Jean-Luc Schwartz:
Potential audiovisual correlates of contrastive focus in French.
- Toshie Hatano, Yasuo Horiuchi, Akira Ichikawa:
How does human segment the speech by prosody?
- Brenton D. Walker, Bradley C. Lackey, Jennifer S. Muller, Patrick John Schone:
Language-reconfigurable universal phone recognition.
- Chul Min Lee, Shrikanth S. Narayanan:
Emotion recognition using a data-driven fuzzy inference system.
- Noriko Suzuki, Yohei Yabuta, Yugo Takeuchi, Yasuhiro Katagiri:
Effects of voice prosody by computers on human behaviors.
- Oliver Jokisch, Marco Kühne:
An investigation of intensity patterns for German.
- João Paulo Ramos Teixeira, Diamantino Freitas:
Segmental durations predicted with a neural network.
- Takumi Yamashita, Yoshinori Sagisaka:
Generation and perception of F0 markedness in conversational speech with adverbs expressing degrees.
- Hansjörg Mixdorff, Nguyen Hung Bach, Hiroya Fujisaki, Chi Mai Luong:
Quantitative analysis and synthesis of syllabic tones in Vietnamese.
- Shinya Kiriyama, Yoshifumi Mitsuta, Yuta Hosokawa, Yoshikazu Hashimoto, Toshihiko Itoh, Shigeyoshi Kitazawa:
Japanese prosodic labeling support system utilizing linguistic information.
- Véronique Aubergé, Nicolas Audibert, Albert Rilliard:
Why and how to control the authentic emotional speech corpora.
- Laurence Devillers, Ioana Vasilescu:
Prosodic cues for emotion characterization in real-life spoken dialogs.
Language Modeling, Discourse and Dialog
- Joseph Polifroni, Grace Chung, Stephanie Seneff:
Towards the automatic generation of mixed-initiative dialogue systems from web content.
- Edward Filisko, Stephanie Seneff:
A context resolution server for the Galaxy conversational systems.
- Hilda Hardy, Kirk Baker, Hélène Bonneau-Maynard, Laurence Devillers, Sophie Rosset, Tomek Strzalkowski:
Semantic and dialogic annotation for automated multilingual customer service.
- H. B. M. Nicholson, Ellen Gurman Bard, Anne H. Anderson, María L. Flecha-García, David Kenicer, Lucy Smallwood, Jim Mullin, Robin J. Lickley, Yiya Chen:
Disfluency under feedback and time-pressure.
- Peter A. Heeman, Fan Yang, Susan E. Strayer:
Control in task-oriented dialogues.
- Kevin McTait, Martine Adda-Decker:
The 300k LIMSI German broadcast news transcription system.
- Jilei Tian, Janne Suontausta, Juha Häkkinen:
Weighted entropy training for the decision tree based text-to-phoneme mapping.
- Yoshihiko Ogawa, Hirofumi Yamamoto, Yoshinori Sagisaka, Gen-ichiro Kikui:
Word class modeling for speech recognition with out-of-task words using a hierarchical language model.
- Roeland Ordelman, Arjan van Hessen, Franciska de Jong:
Compound decomposition in Dutch large vocabulary speech recognition.
- Guergana K. Savova, Joan Bachenko:
Designing for errors: similarities and differences of disfluency rates and prosodic characteristics across domains.
- Mirjam Wester:
Syllable classification using articulatory-acoustic features.
- Imed Zitouni, Olivier Siohan, Chin-Hui Lee:
Hierarchical class n-gram language models: towards better estimation of unseen events in speech recognition.
- Sergio Barrachina, Juan Miguel Vilar:
Incremental and iterative monolingual clustering algorithms.
- Anand Venkataraman, Wen Wang:
Techniques for effective vocabulary selection.
- Lucian Galescu:
Recognition of out-of-vocabulary words with sub-lexical language models.
- Hélène Bonneau-Maynard, Sophie Rosset:
A semantic representation for spoken dialogs.
- Martine Adda-Decker:
A corpus-based decompounding algorithm for German lexical modeling in LVCSR.
- Kyong-Nim Lee, Minhwa Chung:
Modeling cross-morpheme pronunciation variations for Korean large vocabulary continuous speech recognition.
Speech Synthesis: Unit Selection 1, 2
- Yi Zhou, Yiqing Zu:
Unit selection based on voice recognition.
- Jun Xu, Thomas Choy, Minghui Dong, Cuntai Guan, Haizhou Li:
On unit analysis for Cantonese corpus-based TTS.
- Tanya Lambert, Andrew P. Breen, Barry Eggleton, Stephen J. Cox, Ben P. Milner:
Unit selection in concatenative TTS synthesis systems based on mel filter bank amplitudes and phonetic context.
- Baris Bozkurt, Özlem Öztürk, Thierry Dutoit:
Text design for TTS speech corpus building using a modified greedy selection.
- Seung Seop Park, Chong Kyu Kim, Nam Soo Kim:
Discriminative weight training for unit-selection based speech synthesis.
- Peter Rutten, Justin Fackrell:
The application of interactive speech unit selection in TTS systems.
- Francisco Campillo Díaz, Eduardo Rodríguez Banga:
On the design of cost functions for unit-selection speech synthesis.
- Jithendra Vepa, Simon King:
Kalman-filter based join cost for unit-selection speech synthesis.
- Tomoki Toda, Hisashi Kawai, Minoru Tsuzaki:
Optimizing integrated cost function for segment selection in concatenative speech synthesis based on perceptual evaluations.
- Jindrich Matousek, Daniel Tihelka, Josef Psutka:
Automatic segmentation for Czech concatenative speech synthesis using statistical approach with boundary-specific correction.
- Chih-Chung Kuo, Chi-Shiang Kuo, Jau-Hung Chen, Sen-Chia Chang:
Automatic speech segmentation and verification for concatenative synthesis.
- Sérgio Paulo, Luís C. Oliveira:
DTW-based phonetic alignment using multiple acoustic features.
- John Kominek, Christina L. Bennett, Alan W. Black:
Evaluating and correcting phoneme segmentation for unit selection synthesis.
- Esther Klabbers, Jan P. H. van Santen:
Control and prediction of the impact of pitch modification on synthetic speech quality.
- Matthew P. Aylett, Justin Fackrell, Peter Rutten:
My voice, your prosody: sharing a speaker specific prosody model across speakers in unit selection TTS.
- Virongrong Tesprasit, Paisarn Charoenpornsawat, Virach Sornlertlamvanich:
Learning phrase break detection in Thai text-to-speech.
- Alexander Kain, Jan P. H. van Santen:
A speech model of acoustic inventories based on asynchronous interpolation.
- Keikichi Hirose, Takayuki Ono, Nobuaki Minematsu:
Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model.
- S. Prahallad Kishore, Alan W. Black:
Unit size in unit selection speech synthesis.
- Antje Schweitzer, Norbert Braunschweiler, Tanja Klankert, Bernd Möbius, Bettina Säuberlich:
Restricted unlimited domain synthesis.