


ICASSP 1999: Phoenix, Arizona, USA
- Proceedings of the 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP '99, Phoenix, Arizona, USA, March 15-19, 1999. IEEE Computer Society 1999, ISBN 0-7803-5041-3
Volume 1
CELP Coding
- Paul Mermelstein, Yasheng Qian:
Analysis by synthesis speech coding with generalized pitch prediction. 1-4 - Pierre Combescure, Jürgen Schnitzler, Kyrill A. Fischer, Ralf Kirchherr, Claude Lamblin, Alain Le Guyader, Dominique Massaloux, Catherine Quinquis, Joachim Stegmann, Peter Vary:
A 16, 24, 32 kbit/s wideband speech codec based on ATCELP. 5-8 - Stefan Heinen, Marc Adrat, Oliver Steil, Peter Vary, Wen Xu:
A 6.1 to 13.3-kb/s variable rate CELP codec (VR-CELP) for AMR speech coding. 9-12 - Tadashi Amada, Kimio Miseki, Masami Akamine:
CELP speech coding based on an adaptive pulse position codebook. 13-16 - Miguel Arjona Ramírez, Max Gerken:
A multistage search of algebraic CELP codebooks. 17-20 - Nam Kyu Ha:
A fast search method of algebraic codebook by reordering search sequence. 21-24 - Roar Hagen, Erik Ekudden:
An 8 kbit/s ACELP coder with improved background noise performance. 25-28 - Harald Pobloth, W. Bastiaan Kleijn:
On phase perception in speech. 29-32
Large Vocabulary Recognition
- Steven Wegmann, Puming Zhan, Larry Gillick:
Progress in Broadcast News transcription at Dragon Systems. 33-36 - Scott Saobing Chen, Ellen Eide, Mark J. F. Gales, Ramesh A. Gopinath, Dimitri Kanevsky, Peder A. Olsen:
Recent improvements to IBM's speech recognition system for automatic transcription of broadcast news. 37-40 - Jayadev Billa, Thomas Colthurst, Amro El-Jaroudi, Rukmini Iyer, Kristine W. Ma, Spyridon Matsoukas, Carl Quillen, Fred Richardson, Man-Hung Siu, George Zavaliagkos, Herbert Gish:
Recent experiments in large vocabulary conversational speech recognition. 41-44 - Martine Adda-Decker, Gilles Adda, Jean-Luc Gauvain, Lori Lamel:
Large vocabulary speech recognition in French. 45-48 - Sue E. Johnson, Pierre Jourlin, Gareth L. Moore, Karen Spärck Jones, Philip C. Woodland:
The Cambridge University spoken document retrieval system. 49-52 - Barbara Peskin, Michael Newman, Don McAllaster, Venkatesh Nagesha, Hywel B. Richards, Steven Wegmann, Melvyn J. Hunt, Larry Gillick:
Improvements in recognition of conversational telephone speech. 53-56 - Thomas Hain, Philip C. Woodland, Thomas Niesler, Edward W. D. Whittaker:
The 1998 HTK system for transcription of conversational telephone speech. 57-60 - James R. Glass, Timothy J. Hazen, I. Lee Hetherington:
Real-time telephone-based speech recognition in the Jupiter domain. 61-64
Speech Analysis and Enhancement
- Chung-Hsien Hu, Jau-Hung Chen:
Template-driven generation of prosodic information for Chinese concatenative synthesis. 65-68 - Hiroshi Saruwatari, Shoji Kajita, Kazuya Takeda, Fumitada Itakura:
Speech enhancement using nonlinear microphone array with complementary beamforming. 69-72 - David C. Smith, Jeffrey Townsend, Douglas J. Nelson, Dan Richman:
A multivariate speech activity detector based on the syllable rate. 73-76 - Jeff Kuo, Eva B. Holmberg, Robert E. Hillman:
Discriminating speakers with vocal nodules using aerodynamic and acoustic features. 77-80 - Kenji Matsui, Noriyo Hara:
Enhancement of esophageal speech using formant synthesis. 81-84 - Helen M. Hanson, Richard S. McGowan, Kenneth N. Stevens, Robert E. Beaudoin:
Development of rules for controlling the HLsyn speech synthesizer. 85-88 - Daniel Tapias, Carlos García, Christophe Cazassus:
On the characteristics and effects of loudness during utterance production in continuous speech recognition. 89-92 - Francesco Beritelli, Salvatore Casale, Alfredo Cavallaro:
A multichannel speech/silence detector based on time delay estimation and fuzzy classification. 93-96 - Toshio Irino:
Noise suppression using a time-varying, analysis/synthesis gamma chirp filterbank. 97-100 - Peter Søren Kirk Hansen, Per Christian Hansen, Steffen Duus Hansen, John Aasted Sørensen:
Experimental comparison of signal subspace based noise reduction methods. 101-104
Acoustic Modeling I
- Man-Hung Siu, Michael Jonas, Herbert Gish:
Using a large vocabulary continuous speech recognizer for a constrained domain with limited training. 105-108 - Joseph Picone, Sandi Pike, Roland Reagan, Terri Kamm, John S. Bridle, Li Deng, Z. Ma, Hywel B. Richards, Mike Schuster:
Initial evaluation of hidden dynamic models on conversational speech. 109-112 - Spyros Matsoukas, George Zavaliagkos:
Convolutional density estimation in hidden Markov models for speech recognition. 113-116 - Rita Singh, Bhiksha Raj, Richard M. Stern:
Automatic clustering and generation of contextual questions for tied states in hidden Markov models. 117-120 - Tetsunori Kobayashi, Junko Furuyama, Ken Masumitsu:
Partly hidden Markov model and its application to speech recognition. 121-124 - Jae H. Kim, Raziel Haimi-Cohen, Frank K. Soong:
Hidden Markov models with divergence based vector quantized variances. 125-128 - Yuqing Gao, Ea-Ee Jan, Mukund Padmanabhan, Michael Picheny:
HMM training based on quality measurement. 129-132 - Koji Iwano, Keikichi Hirose:
Prosodic word boundary detection using statistical modeling of moraic fundamental frequency contours and its use for continuous speech recognition. 133-136
ASR Systems and Applications
- Yukikuni Nishida, Yoshio Nakadai, Yoshitake Suzuki, Tetsuma Sakurai, Toshihide Kurokawa, Hirokazu Sato:
Voice recognition focusing on vowel strings on a fixed-point 20-MIPS DSP board. 137-140 - Makoto Shozakai:
Speech interface VLSI for car applications. 141-144 - Stephen W. Anderson, Natalie Liberman, Erica G. Bernstein, Stephen Foster, Erin Cate, Brenda Levin, Randy Hudson:
Recognition of elderly speech and voice-driven document retrieval. 145-148 - Michael J. Carey, Eluned S. Parris, Harvey Lloyd-Thomas:
A comparison of features for speech, music discrimination. 149-152 - Mazin G. Rahim:
Recognizing connected digits in a natural spoken dialog. 153-156 - Dong-Suk Yuk, James L. Flanagan:
Telephone speech recognition using neural networks and hidden Markov models. 157-160 - Ji Ming, Philip Hanna, Darryl Stewart, Marie Owens, Francis Jack Smith:
Improving speech recognition performance by using multi-model approaches. 161-164 - Coimbatore S. Ramalingam, Yifan Gong, Lorin Netsch, Wallace W. Anderson, John J. Godfrey, Yu-Hung Kao:
Speaker-dependent name dialing in a car environment with out-of-vocabulary rejection. 165-168 - Qing Guo, Fang Zheng, Jian Wu, Wenhu Wu:
An new method used in HMM for modeling frame correlation. 169-172 - Patrick Nguyen, Philippe Gelin, Jean-Claude Junqua, Jen-Tzung Chien:
N-best based supervised and unsupervised adaptation for native and non-native speakers in cars. 173-176
Topics in Speech Coding
- Simão Ferraz de Campos Neto, Franklin L. Corcoran, Ara Karahisar:
Performance assessment of tandem connection of enhanced cellular coders. 177-180 - Ki-Seung Lee, Richard V. Cox:
TTS based very low bit rate speech coder. 181-184 - Ling Kok Ng, Gang Li, Xiao Lin, Guoan Bi:
Wideband speech coding with toll quality based on IA-model. 185-188 - Kazunori Ozawa:
4 kb/s multi-pulse based CELP speech coding using excitation switching. 189-192 - Erdal Paksoy, Juan Carlos De Martin, Alan McCree, Christian G. Gerlach, Anand K. Anandakumar, Wai-Ming Lai, Vishu Viswanathan:
An adaptive multi-rate speech coder for digital cellular telephony. 193-196 - Azhar Mustapha, Suat Yeldener:
An adaptive post-filtering technique based on the modified Yule-Walker filter. 197-200 - Anthony J. Accardi, Richard V. Cox:
A modular approach to speech enhancement with an application to speech coding. 201-204 - Gernot Kubin, W. Bastiaan Kleijn:
On speech coding in a perceptual domain. 205-208
Speech Analysis
- Panuthat Boonpramuk, Tetsuo Funada, Noboru Kanedera:
Speech analysis/synthesis/conversion by using sequential processing. 209-212 - Mike Brookes, Han Pin Loke:
Modelling energy flow in the vocal tract with applications to glottal closure and opening detection. 213-216 - Srinivasan Umesh, Leon Cohen, Douglas J. Nelson:
Fitting the Mel scale. 217-220 - Wai Kat Liu, Pascale Fung:
Fast accent identification and accented speech recognition. 221-224 - Howard Hua Yang, Sarel van Vuuren, Hynek Hermansky:
Relevancy of time-frequency features for phonetic classification measured by mutual information. 225-228 - Keiichi Tokuda, Takashi Masuko, Noboru Miyazaki, Takao Kobayashi:
Hidden Markov models based on multi-space probability distribution for pitch pattern modeling. 229-232 - Ashraf Alkhairy:
An algorithm for glottal volume velocity estimation. 233-236 - Khaled El-Maleh, Ara Samouelian, Peter Kabal:
Frame level noise classification in mobile environments. 237-240
Low Bit Rate Speech Coding I
- Nicola R. Chong, Ian S. Burnett, Joe F. Chicharo:
Low delay multi-level decomposition and quantisation techniques for WI coding. 241-244 - Takahiro Unno, Thomas P. Barnwell III, Kwan K. Truong:
An improved mixed excitation linear prediction (MELP) coder. 245-248 - Stephane Villette, Milos Stefanovic, Ahmet M. Kondoz:
Split band LPC based adaptive multi-rate GSM candidate. 249-252 - Milan Jelinek, Jean-Pierre Adoul:
Frequency-domain spectral envelope estimation for low rate coding of speech. 253-256 - Chunyan Li, Vladimir Cuperman, Allen Gersho:
Robust closed-loop pitch estimation for harmonic coders by time scale modification. 257-260 - Hong-Goo Kang, Dipanjan Sen:
Phase adjustment in waveform interpolation. 261-264 - Jongseo Sohn, Wonyong Sung:
A low resolution pulse position coding method for improved excitation modeling of speech transition. 265-268 - Oded Gottesman:
Dispersion phase vector quantization for enhancement of waveform interpolative coder. 269-272
Robust Speech Recognition in Noisy Environments
- Firas Jabloun, A. Enis Çetin:
The Teager energy based feature parameters for robust speech recognition in car noise. 273-276 - Ascensión Gallardo-Antolín, Fernando Díaz-de-María, Francisco J. Valverde-Albacete:
Avoiding distortions due to speech coding and transmission errors in GSM ASR tasks. 277-280 - Mike Peters:
Binaural bark subband pre-processing of nonstationary signals for noise robust speech feature extraction. 281-284 - Satoru Tsuge, Toshiaki Fukuda, Harald Singer:
Speaker normalized spectral subband parameters for noise robust speech recognition. 285-288 - Hynek Hermansky, Sangita Sharma:
Temporal patterns (TRAPs) in ASR of noisy speech. 289-292 - Montri Karnjanadecha, Stephen A. Zahorian:
Signal modeling for isolated word recognition. 293-296 - Yifan Gong, John J. Godfrey:
Transforming HMMs for speaker-independent hands-free speech recognition in the car. 297-300 - Shuen Kong Wong, Bertram E. Shi:
Channel and noise adaptation via HMM mixture mean transform and stochastic matching. 301-304
Speaker Recognition
- Jialong He, Li Liu:
Speaker verification performance and the length of test sentence. 305-308 - Rivarol Vergin, Douglas D. O'Shaughnessy:
On the use of some divergence measures in speaker recognition. 309-312 - Roland Auckenthaler, Eluned S. Parris, Michael J. Carey:
Improving a GMM speaker verification system by phonetic weighting. 313-316 - Yong Gu, Trevor Thomas:
A hybrid score measurement for HMM-based speaker verification. 317-320 - William M. Campbell, Khaled T. Assaleh:
Polynomial classifier techniques for speaker verification. 321-324 - Alvin Garcia, Richard J. Mammone:
Channel-robust speaker identification using modified-mean cepstral mean normalization with frequency warping. 325-328 - Mübeccel Demirekler, Ali Haydar:
Feature selection using genetics-based algorithm and its application to speaker identification. 329-332
Acoustic Modeling II
- Daniel Povey, Philip C. Woodland:
Frame discrimination training for HMMs for large vocabulary speech recognition. 333-336 - Françoise Beaufays, Mitchel Weintraub, Yochai Konig:
Discriminative mixture weight estimation for large Gaussian mixture models. 337-340 - Richard C. Rose, Giuseppe Riccardi:
Modeling disfluency and background events in ASR for a natural language understanding task. 341-344 - Wu Chou, Wolfgang Reichl:
Decision tree state tying based on penalized Bayesian information criterion. 345-348 - Jiayu Li, Alejandro Murua:
A 2D extended HMM for speech recognition. 349-352 - Xiaoqiang Luo, Frederick Jelinek:
Probabilistic classification of HMM states for large vocabulary continuous speech recognition. 353-356 - Hywel B. Richards, John S. Bridle:
The HDM: a segmental hidden dynamic model of coarticulation. 357-360 - Sankar Basu, Charles A. Micchelli, Peder A. Olsen:
Maximum likelihood estimates for exponential type density families. 361-364
Speech Production and Synthesis
- Stephen D. Peters, Peter Stubley, Jean-Marc Valin:
On the limits of speech recognition in noise. 365-368 - Qian-Jie Fu, Robert V. Shannon:
Recognition of spectrally degraded speech in noise with nonlinear amplitude mapping. 369-372 - Robert E. Donovan, Martin Franz, Jeffrey S. Sorensen, Salim Roukos:
Phrase splicing and variable substitution using the IBM trainable speech synthesis system. 373-376 - Yannis Stylianou:
Assessment and correction of voice quality variabilities in large speech databases for concatenative speech synthesis. 377-380 - Darragh O'Brien, Alex I. C. Monaghan:
Shape invariant time-scale modification of speech using a harmonic model. 381-384 - Kim E. A. Silverman, Jerome R. Bellegarda:
Using a sigmoid transformation for improved modeling of phoneme duration. 385-388 - Karthik Narasimhan, José C. Príncipe, Donald G. Childers:
Nonlinear dynamic modeling of the voiced excitation for improved speech synthesis. 389-392 - Arnaud Robert:
Results on perceptual invariants to transformations on speech. 393-396
Feature Extraction
- Reinhold Haeb-Umbach:
Investigations on inter-speaker variability in the feature space. 397-400 - Seung Ho Choi, Hong Kook Kim, Hwang Soo Lee:
LSP weighting functions based on spectral sensitivity and mel-frequency warping for speech recognition in digital communication. 401-404 - Chun-Ping Chan, Yiu Wing Wong, Tan Lee, Pak-Chung Ching:
Two-dimensional multi-resolution analysis of speech signals and its application to speech recognition. 405-408 - Rathinavelu Chengalvarayan:
Hierarchical subband linear predictive cepstral (HSLPC) features for HMM-based speech recognition. 409-412 - Douglas D. O'Shaughnessy, Hesham Tolba:
Towards a robust/fast continuous speech recognition system using a voiced-unvoiced decision. 413-416 - Jhing-Fa Wang, Shi-Huang Chen:
A C/V segmentation algorithm for Mandarin speech signal based on wavelet transforms. 417-420 - Tsuneo Nitta:
Feature extraction for speech recognition based on orthogonal acoustic-feature planes and LDA. 421-424 - Partha Niyogi, Chris Burges, Padma Ramesh:
Distinctive feature detection using support vector machines. 425-428
Robust Speech Recognition and Adaption
- Nam Soo Kim:
Time-varying noise compensation using multiple Kalman filters. 429-432 - Wei-Tyng Hong, Sin-Horng Chen:
A segment-based C0 adaptation scheme for PMC-based noisy Mandarin speech recognition. 433-436 - Jeih-weih Hung, Jia-Lin Shen, Lin-Shan Lee:
Improved parallel model combination techniques with split Gaussian mixtures for speech recognition under noisy conditions. 437-440 - Günther Ruske, Ki Yong Lee:
Speech recognition and enhancement by a nonstationary AR HMM with gain adaptation under unknown noise. 441-444 - Alexander Fischer, Volker Stahl:
Database and online adaptation for improved speech recognition in car environments. 445-448 - Diego Giuliani, Marco Matassoni, Maurizio Omologo, Piergiorgio Svaizer:
Training of HMM with filtered speech material for hands-free recognition. 449-452 - Chafic Mokbel, Olivier Collin:
Incremental enrolment of speech recognizers. 453-456 - Bishnu S. Atal:
Automatic speech recognition: a communication perspective. 457-460
Low Bit Rate Speech Coding II
- Mohammad Reza Nakhai, Farokh A. Marvasti:
Split band CELP (SB-CELP) speech coder. 461-464 - Najam Malik, W. Harvey Holmes:
Log amplitude modeling of sinusoids in voiced speech. 465-468 - Minoru Kohata:
1.2 kbit/s harmonic coder using auditory filters. 469-472 - Jesper Jensen, Søren Holdt Jensen, Egon Hansen:
Exponential sinusoidal modeling of transitional speech segments. 473-476 - Eric W. M. Yu, Cheung-Fat Chan:
Harmonic+noise coding using improved V/UV mixing and efficient spectral quantization. 477-480 - Suat Yeldener:
A 4 kb/s toll quality harmonic excitation linear predictive speech coder. 481-484