


INTERSPEECH 2007: Antwerp, Belgium
- INTERSPEECH 2007, 8th Annual Conference of the International Speech Communication Association, Antwerp, Belgium, August 27-31, 2007. ISCA 2007
Keynotes 1-4
- Victor Zue:
On organic interfaces. 1-8 - Sophie K. Scott:
The neural basis of speech perception - a view from functional imaging. 9-13 - Alex Waibel, Keni Bernardin, Matthias Wölfel:
Computer-supported human-human multilingual communication. 14-21 - Pierre-Yves Oudeyer:
Self-organization in the evolution of shared systems of speech sounds: a computational study. 22-29
Discriminative and Large Margin Techniques in Acoustic Modeling
- Jinyu Li, Chin-Hui Lee:
Soft margin feature extraction for automatic speech recognition. 30-33 - Yan Yin, Hui Jiang:
A fast optimization method for large margin estimation of HMMs based on second order cone programming. 34-37 - Hao-Zheng Li, Douglas D. O'Shaughnessy:
Frame margin probability discriminative training algorithm for noisy speech recognition. 38-41 - Fabio Valente, Jithendra Vepa, Christian Plahl, Christian Gollan, Hynek Hermansky, Ralf Schlüter:
Hierarchical neural networks feature extraction for LVCSR system. 42-45 - Peder A. Olsen, John R. Hershey:
Bhattacharyya error and divergence using variational importance sampling. 46-49 - Tingyao Wu, Jacques Duchateau, Dirk Van Compernolle:
Phoneme dependent frame selection preference. 50-53
Speech Production I, II
- Xinhui Zhou, Carol Y. Espy-Wilson, Mark Tiede, Suzanne Boyce:
An articulatory and acoustic study of "retroflex" and "bunched" American English rhotic sound based on MRI. 54-57 - Paula Martins, Inês Carbone, Augusto Silva, António J. S. Teixeira:
An MRI study of European Portuguese nasals. 58-61 - Sayoko Takano, Hiroki Matsuzaki, Kunitoshi Motoki:
A four-cube FEM model of the extrinsic and intrinsic tongue muscles to simulate the production of vowel /i/. 62-65 - Juan F. Torres, Elliot Moore:
Performance evaluation of glottal quality measures from the perspective of vocal tract filter consistency. 66-69 - Veena D. Singampalli, Philip J. B. Jackson:
Statistical identification of critical, dependent and redundant articulators. 70-73 - Chao Qin, Miguel Á. Carreira-Perpiñán:
An empirical investigation of the nonuniqueness in the acoustic-to-articulatory mapping. 74-77
Phonetic Segmentation and Classification I, II
- Peter Karsmakers, Kristiaan Pelckmans, Johan A. K. Suykens, Hugo Van hamme:
Fixed-size kernel logistic regression for phoneme classification. 78-81 - Seung Seop Park, Jong Won Shin, Jong Kyu Kim, Nam Soo Kim:
A multiple-model based framework for automatic speech segmentation. 82-85 - Aren Jansen, Partha Niyogi:
Semi-supervised learning of speech sounds. 86-89 - Abhinav Parate, Ashish Verma, Jayanta Basak:
Evaluation of syllable stress using single class classifier. 90-93 - Mohammad Nurul Huda, Muhammad Ghulam, Junsei Horikawa, Tsuneo Nitta:
Distinctive phonetic feature (DPF) based phone segmentation using hybrid neural networks. 94-97 - Jean-Philippe Goldman, Mathieu Avanzi, Anne-Catherine Simon, Anne Lacheret, Antoine Auchlin:
A methodology for the automatic detection of perceived prominent syllables in spoken French. 98-101
Discourse, Dialog and Conversation
- Hiroki Mori, Hideki Kasuya:
Voice source and vocal tract variations as cues to emotional states perceived from expressive conversational speech. 102-105 - Fan Yang, Peter A. Heeman:
Exploring initiative strategies using computer simulation. 106-109 - Chiu-yu Tseng, Zhao-yu Su:
From one base form to multiple output styles - predicting stylistic dynamics of discourse prosody. 110-113 - Claudia Crocco, Renata Savy:
Topic in dialogue: prosodic and syntactic features. 114-117 - Michiko Watanabe, Yasuharu Den, Keikichi Hirose, Shusaku Miwa, Nobuaki Minematsu:
Features of pauses and conjunctions at syntactic and discourse boundaries in Japanese monologues. 118-121
Spoken Dialog Systems I, II
- Craig Wootton, Michael F. McTear, Terry Anderson:
Utilizing online content as domain knowledge in a multi-domain dynamic dialogue system. 122-125 - Boris W. van Schooten, Sophie Rosset, Olivier Galibert, Aurélien Max, Rieks op den Akker, Gabriel Illouz:
Handling speech input in the RITEL QA dialogue system. 126-129 - Woosung Kim:
Online call quality monitoring for automating agent-based call centers. 130-133 - Sebastian Möller, Klaus-Peter Engelbrecht, Antti Oulasvirta:
Analysis of communication failures for spoken dialogue systems. 134-137 - Sandra Mann, André Berton, Ute Ehrlich:
How to access audio files of large data bases using in-car speech dialogue systems. 138-141 - Kazunori Komatani, Tatsuya Kawahara, Hiroshi G. Okuno:
Analyzing temporal transition of real user's behaviors in a spoken dialogue system. 142-145 - J. Sherwani, Dong Yu, Tim Paek, Mary Czerwinski, Yun-Cheng Ju, Alex Acero:
Voicepedia: towards speech-based access to unstructured information. 146-149 - Vivek Kumar Rangarajan Sridhar, Srinivas Bangalore, Shrikanth S. Narayanan:
Exploiting prosodic features for dialog act tagging in a discriminative modeling framework. 150-153 - Hua Ai, Antonio Roque, Anton Leuski, David R. Traum:
Using information state to improve dialogue move identification in a spoken dialogue system. 154-157 - Shiu-Wah Chu, Ian M. O'Neill, Philip Hanna:
Using multiple strategies to manage spoken dialogue. 158-161 - Marcelo Quinderé, Luís Seabra Lopes, António J. S. Teixeira:
An information state based dialogue manager for a mobile robot. 162-165
Accent and Language Identification I, II
- Josef G. Bauer, Bernt Andrassy, Ekaterina Timoshenko:
Discriminative optimization of language adapted HMMs for a language identification system based on parallel phoneme recognizers. 166-169 - Khe Chai Sim, Haizhou Li:
Fusion of contrastive acoustic models for parallel phonotactic spoken language identification. 170-173 - Liang Wang, Eliathamby Ambikairajah, Eric H. C. Choi:
Multi-layer Kohonen self-organizing feature map for language identification. 174-177 - Bo Yin, Eliathamby Ambikairajah, Fang Chen:
Hierarchical language identification based on automatic language clustering. 178-181 - Ekaterina Timoshenko, Harald Höge:
Using speech rhythm for acoustic language identification. 182-185 - Kakeung Wong, Man-Hung Siu, Brian Mak:
A model-based estimation of phonotactic language verification performance. 186-189 - Mike Rosner, Paulseph-John Farrugia:
A tagging algorithm for mixed language identification in a noisy domain. 190-193 - Doroteo T. Toledano, Javier Gonzalez-Dominguez, Alejandro Abejón-Gonzalez, Danilo Spada, Ismael Mateos-Garcia, Joaquin Gonzalez-Rodriguez:
Improved language recognition using better phonetic decoders and fusion with MFCC and SDC features. 194-197
Education and Training
- Daniel Bolaños, Wayne H. Ward, Sarel van Vuuren, Javier Garrido Salas:
Syllable lattices as a basis for a children's speech reading tracker. 198-201 - Fuping Pan, Qingwei Zhao, Yonghong Yan:
Mandarin vowel pronunciation quality evaluation by using formant pattern recognition. 202-205 - Matthew Black, Joseph Tepperman, Sungbok Lee, Patti Price, Shrikanth S. Narayanan:
Automatic detection and classification of disfluent reading miscues in young children's speech for the purpose of assessment. 206-209 - Nobuaki Minematsu, K. Kamata, Satoshi Asakawa, Takehiko Makino, Tazuko Nishimura, Keikichi Hirose:
Structural assessment of language learners' pronunciation. 210-213 - Abdurrahman Samir, Sherif Mahdy Abdou, Ahmed Husien Khalil, Mohsen A. Rashwan:
Enhancing usability of CAPL system for Qur'an recitation learning. 214-217 - Febe de Wet, Christa van der Walt, Thomas Niesler:
Automatic large-scale oral language proficiency assessment. 218-221
Robust ASR I, II
- Yuki Denda, Takamasa Tanaka, Masato Nakayama, Takanobu Nishiura, Yoichi Yamashita:
Noise-robust hands-free voice activity detection with adaptive zero crossing detection using talker direction estimation. 222-225 - Agustín Álvarez Marquina, Rafael Martínez, Pedro Gómez, Victor Nieto Lluis, V. Rodellar:
A robust mel-scale subband voice activity detector for a car platform. 226-229 - Kentaro Ishizuka, Tomohiro Nakatani, Masakiyo Fujimoto, Noboru Miyazaki:
Noise robust front-end processing with voice activity detection based on periodic to aperiodic component ratio. 230-233 - A. M. Toh, Roberto Togneri, Sven Nordholm:
Feature and distribution normalization schemes for statistical mismatch reduction in reverberant speech recognition. 234-237 - Matthew Gibson, Thomas Hain:
Temporal masking for unsupervised minimum Bayes risk speaker adaptation. 238-241 - Tsung-hsueh Hsieh, Jeih-Weih Hung:
Speech feature compensation based on pseudo stereo codebooks for robust speech recognition in additive noise environments. 242-245 - Dimitrios Dimitriadis, Petros Maragos, Stamatios Lefkimmiatis:
Multiband, multisensor robust features for noisy speech recognition. 246-249 - Akira Sasou, Hiroaki Kojima:
Noise robust speech recognition for voice driven wheelchair. 250-253
Adaptation in ASR I, II
- Yun Tang, Richard C. Rose:
Clustered maximum likelihood linear basis for rapid speaker adaptation. 254-257 - Wen Xuan Teng, Guillaume Gravier, Frédéric Bimbot, Frédéric Soufflet:
Rapid speaker adaptation by reference model interpolation. 258-261 - Randy Gomez, Tomoki Toda, Hiroshi Saruwatari, Kiyohiro Shikano:
Rapid unsupervised speaker adaptation using single utterance based on MLLR and speaker selection. 262-265 - Brian Kan-Wing Mak, Roger Wend-Huu Hsiao:
Robustness of several kernel-based fast adaptation methods on noisy LVCSR. 266-269 - Janne Pylkkönen:
Estimating VTLN warping factors by distribution matching. 270-273 - Ming Liu, Xi Zhou, Mark Hasegawa-Johnson, Thomas S. Huang, Zhengyou Zhang:
Frequency domain correspondence for speaker normalization. 274-277 - Masafumi Nishida, Yasuo Horiuchi, Akira Ichikawa:
Unsupervised training of adaptation rate using q-learning in large vocabulary continuous speech recognition. 278-281 - Martin Karafiát, Lukás Burget, Jan Cernocký, Thomas Hain:
Application of CMLLR in narrow band wide band adapted systems. 282-285 - Christophe Lévy, Georges Linarès, Jean-François Bonastre:
Fast adaptation of GMM-based compact models. 286-289
Speaker Verification & Identification I-IV
- Zahi N. Karam, William M. Campbell:
A new kernel for SVM MLLR based speaker recognition. 290-293 - Kong-Aik Lee, Changhuai You, Haizhou Li, Tomi Kinnunen:
A GMM-based probabilistic sequence kernel for speaker verification. 294-297 - Hagai Aronowitz:
Speaker recognition using kernel-PCA and intersession variability modeling. 298-301 - Réda Dehak, Najim Dehak, Patrick Kenny, Pierre Dumouchel:
Linear and non linear kernel GMM supervector machines for speaker verification. 302-305 - Ignacio Lopez-Moreno, Ismael Mateos-Garcia, Daniel Ramos, Joaquin Gonzalez-Rodriguez:
Support vector regression for speaker verification. 306-309 - Chris Longworth, Mark J. F. Gales:
Derivative and parametric kernels for speaker verification. 310-313
Spoken Data Retrieval I, II
- David R. H. Miller, Michael Kleber, Chia-Lin Kao, Owen Kimball, Thomas Colthurst, Stephen A. Lowe, Richard M. Schwartz, Herbert Gish:
Rapid and accurate spoken term detection. 314-317 - Yi-Cheng Pan, Hung-lin Chang, Berlin Chen, Lin-Shan Lee:
Subword-based position specific posterior lattices (s-PSPL) for indexing speech information. 318-321 - Andreas Merkel, Dietrich Klakow:
Improved methods for language model based question classification. 322-325 - Tomoyosi Akiba, Hirofumi Tsujimura:
Error-tolerant question answering for spoken documents. 326-329 - Dilek Hakkani-Tür, Gökhan Tür, Michael Levit:
Exploiting information extraction annotations for document retrieval in distillation tasks. 330-333 - Kishan Thambiratnam, Frank Seide:
Learning spoken document similarity and recommendation using supervised probabilistic latent semantic analysis. 334-337
Accent and Language Identification I, II
- David A. van Leeuwen, Khiet P. Truong:
An open-set detection evaluation methodology applied to language and emotion recognition. 338-341 - Xi Yang, Man-Hung Siu, Herbert Gish, Brian Mak:
Boosting with anti-models for automatic language identification. 342-345 - Fabio Castaldo, Daniele Colibro, Emanuele Dalmasso, Pietro Laface, Claudio Vair:
Acoustic language identification using fast discriminative training. 346-349 - Ming Li, Hongbin Suo, Xiao Wu, Ping Lu, Yonghong Yan:
Spoken language identification using score vector modeling and support vector machine. 350-353 - Ricardo de Córdoba, Luis Fernando D'Haro, Fernando Fernández Martínez, Javier Macías Guarasa, Javier Ferreiros:
Language identification based on n-gram frequency ranking. 354-357 - Wade Shen, Douglas A. Reynolds:
Improving phonotactic language recognition with acoustic adaptation. 358-361
Speech Perception I, II
- Michael C. W. Yip:
Spoken word recognition of Chinese homophones: a further investigation. 362-365 - Maria K. Wolters, Pauline Campbell, Christine DePlacido, Amy Liddell, David Owens:
The role of outer hair cell function in the perception of synthetic versus natural speech. 366-369 - Akiko Kusumoto, Alexander Kain, John-Paul Hosom, Jan P. H. van Santen:
Hybridizing conversational and clear speech. 370-373 - Sophie Dufour, Ulrich H. Frauenfelder:
Neighborhood density and neighborhood frequency effects in French spoken word recognition. 374-377 - Toshio Irino, Yoshie Aoki, Yoshie Hayashi, Hideki Kawahara, Roy D. Patterson:
Discrimination and recognition of scaled word sounds. 378-381 - László Tóth:
Benchmarking human performance on the acoustic and linguistic subtasks of ASR systems. 382-385 - Lin Yang, Jianping Zhang, Yonghong Yan:
Contributions of temporal fine structure cues to Chinese speech recognition in cochlear implant simulation. 386-389 - Xihong Wu, Jing Chen, Zhigang Yang, Qiang Huang, Mengyuan Wang, Liang Li:
Effect of number of masking talkers on speech-on-speech masking in Chinese. 390-393 - Odile Bagou, Sophie Dufour, Cécile Fougeron, Alain Content, Ulrich H. Frauenfelder:
Do different boundary types induce subtle acoustic cues to which French listeners are sensitive? 394-397 - Svante Stadler, Arne Leijon, Björn Hagerman:
An information theoretic approach to predict speech intelligibility for listeners with normal and impaired hearing. 398-401 - Travis Wade, Bernd Möbius:
Speaking rate effects in a landmark-based phonetic exemplar model. 402-405 - Kazumi Maniwa, Allard Jongman, Travis Wade:
Acoustic correlates of intelligibility enhancements in clearly produced fricatives. 406-409 - Tim Jürgens, Thomas Brand, Birger Kollmeier:
Modelling the human-machine gap in speech reception: microscopic speech intelligibility prediction for normal-hearing subjects with an auditory model. 410-413 - Ayako Ikeno, John H. L. Hansen:
Lombard speech impact on perceptual speaker recognition. 414-417 - Huiwen Goy, Kathleen Pichora-Fuller, Pascal van Lieshout, Gurjit Singh, Bruce Schneider:
Effect of within- and between-talker variability on word identification in noise by younger and older adults. 418-421 - H. Timothy Bunnell, N. Carolyn Schanen, Linda D. Vallino, Thierry G. Morlet, James B. Polikoff, Jennette D. Driscoll, James T. Mantell:
Speech perception in children with speech sound disorder. 422-425 - Huan Wang, Werner Hemmert:
Speech coding and information processing by auditory neurons. 426-429 - Annie C. Gilbert, Victor J. Boucher:
What do listeners attend to in hearing prosodic structures? Investigating the human speech-parser using short-term recall. 430-433
Prosody: Prosodic Structure
- Yosuke Igarashi:
Pitch pattern alternation in Goshogawara Japanese: evidence for a prosodic phrase above the domain for downstep. 434-437 - Irina Nesterenko, Pavel A. Skrelin:
Some evidence on the phonetics and phonology of prosodic phrasing in Russian. 438-441 - Jan Volín, Radek Skarnitzl:
Temporal downtrends in Czech read speech. 442-445 - Hyongsil Cho, Daniel Hirst:
Empirical evidence for prosodic phrasing: pauses as linguistic annotation in Korean read speech. 446-449 - Markus Dreyer, Izhak Shafran:
Exploiting prosody for PCFGs with latent annotations. 450-453 - Qin Shi, Danning Jiang, Fanping Meng, Yong Qin:
Combining length distribution model with decision tree in prosodic phrase prediction. 454-457 - Li-chiung Yang:
Duration and pauses as boundary-markers in speech: a cross-linguistic study. 458-461
Prosodic Modeling I, II
- Jian Yu, Lixing Huang, Jianhua Tao, Xia Wang:
Modeling incompletion phenomenon in Mandarin dialog prosody. 462-465 - Anne Tamm, Kálmán Abari, Gábor Olaszy:
Accent assignment algorithm in Hungarian, based on syntactic analysis. 466-469 - Cheng-Yuan Lin, Pei-Chi Jao, Jyh-Shing Roger Jang:
An effective initial/final duration prediction method for corpus-based singing voice synthesis of Mandarin Chinese. 470-473 - Géza Németh, Márk Fék, Tamás Gábor Csapó:
Increasing prosodic variability of text-to-speech synthesizers. 474-477 - Damien Lolive, Nelly Barbot, Olivier Boëffard:
Unsupervised HMM classification of F0 curves. 478-481 - Ian Read, Stephen Cox:
Automatic pitch accent prediction for text-to-speech synthesis. 482-485 - Xinqiang Ni, Yining Chen, Frank K. Soong, Min Chu, Ping Zhang:
An unsupervised approach to automatic prosodic annotation. 486-489 - Zeynep Inanoglu, Steve J. Young:
A system for transforming the emotion in speech: combining data-driven conversion techniques for prosody and voice quality. 490-493 - Chen-Yu Chiang, Hsiu-Min Yu, Yih-Ru Wang, Sin-Horng Chen:
An automatic prosody labeling method for Mandarin speech. 494-497
Speech Analysis
- Koby Crammer:
A conservative aggressive subspace tracker. 498-501 - Mattias Nilsson, W. Bastiaan Kleijn:
Mutual information and the speech signal. 502-505 - Tony Ezzat, Jake V. Bouvrie, Tomaso A. Poggio:
Spectro-temporal analysis of speech using 2-d Gabor filters. 506-509 - Tomas Dekens, Mike Demol, Werner Verhelst, Piet Verhoeve:
A comparative study of speech rate estimation techniques. 510-513 - Tiago H. Falk, Hua Yuan, Wai-Yip Chan:
Spectro-temporal processing for blind estimation of reverberation time and single-ended quality measurement of reverberant speech. 514-517
Spectral Analysis, Formants and Vocal Tract Models
- Toon van Waterschoot, Marc Moonen:
Linear prediction of audio signals. 518-521 - Carlo Magi, Tomas Bäckström, Paavo Alku:
Stabilised weighted linear prediction - a robust all-pole method for speech processing. 522-525