


default search action
Bhuvana Ramabhadran
Person information
Refine list

refinements active!
zoomed in on ?? of ?? records
view refined list in
export refined list as
showing all ?? records
2020 – today
- 2024
- [c211]Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. ICASSP 2024: 11546-11550 - [c210]Gowtham Ramesh, Kartik Audhkhasi, Bhuvana Ramabhadran:
Task Vector Algebra for ASR Models. ICASSP 2024: 12256-12260 - [i44]Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. CoRR abs/2402.18932 (2024) - [i43]Zhong Meng, Zelin Wu, Rohit Prabhavalkar, Cal Peyser, Weiran Wang, Nanxin Chen, Tara N. Sainath, Bhuvana Ramabhadran:
Text Injection for Neural Contextual Biasing. CoRR abs/2406.02921 (2024) - [i42]Neeraj Gaur, Rohan Agrawal, Gary Wang, Parisa Haghani, Andrew Rosenberg, Bhuvana Ramabhadran:
ASTRA: Aligning Speech and Text Representations for Asr without Sampling. CoRR abs/2406.06664 (2024) - [i41]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Neeraj Gaur, Zhong Meng:
Speech Prefix-Tuning with RNNT Loss for Improving LLM Predictions. CoRR abs/2406.14701 (2024) - [i40]Bolaji Yusuf, Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran:
Speculative Speech Recognition by Audio-Prefixed Low-Rank Adaptation of Language Models. CoRR abs/2407.04641 (2024) - [i39]Shikhar Vashishth, Harman Singh, Shikhar Bharadwaj, Sriram Ganapathy, Chulayuth Asawaroengchai, Kartik Audhkhasi, Andrew Rosenberg, Ankur Bapna, Bhuvana Ramabhadran:
STAB: Speech Tokenizer Assessment Benchmark. CoRR abs/2409.02384 (2024) - [i38]Fadi Biadsy, Youzheng Chen, Isaac Elias, Kyle Kastner, Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran:
Zero-shot Cross-lingual Voice Transfer for TTS. CoRR abs/2409.13910 (2024) - [i37]Christopher Richardson, Roshan Sharma, Neeraj Gaur, Parisa Haghani, Anirudh Sundar, Bhuvana Ramabhadran:
Schema Augmentation for Zero-Shot Domain Adaptation in Dialogue State Tracking. CoRR abs/2411.00150 (2024) - 2023
- [j19]Dong Yu
, Yifan Gong, Michael A. Picheny, Bhuvana Ramabhadran
, Dilek Hakkani-Tür, Rohit Prasad, Heiga Zen
, Jan Skoglund
, Jan Honza Cernocký
, Lukás Burget
, Abdelrahman Mohamed:
Twenty-Five Years of Evolution in Speech and Language Processing. IEEE Signal Process. Mag. 40(5): 27-39 (2023) - [c209]Yosuke Higuchi, Andrew Rosenberg, Yuan Wang, Murali Karthick Baskar, Bhuvana Ramabhadran:
Mask-Conformer: Augmenting Conformer with Mask-Predict Decoder. ASRU 2023: 1-8 - [c208]Kartik Audhkhasi, Brian Farris, Bhuvana Ramabhadran, Pedro J. Moreno:
Modular Conformer Training for Flexible End-to-End ASR. ICASSP 2023: 1-5 - [c207]Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel S. Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley:
Large-Scale Language Model Rescoring on Long-Form Data. ICASSP 2023: 1-5 - [c206]Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. ICASSP 2023: 1-5 - [c205]Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. ICASSP 2023: 1-5 - [c204]Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. ICASSP 2023: 1-5 - [c203]Mohammad Zeineldeen, Kartik Audhkhasi, Murali Karthick Baskar, Bhuvana Ramabhadran:
Robust Knowledge Distillation from RNN-T Models with Noisy Training Labels Using Full-Sum Loss. ICASSP 2023: 1-5 - [c202]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi:
O-1: Self-training with Oracle and 1-best Hypothesis. INTERSPEECH 2023: 77-81 - [c201]Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. INTERSPEECH 2023: 191-195 - [i36]Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. CoRR abs/2302.08583 (2023) - [i35]Yu Zhang, Wei Han, James Qin, Yongqiang Wang, Ankur Bapna, Zhehuai Chen, Nanxin Chen, Bo Li, Vera Axelrod, Gary Wang, Zhong Meng, Ke Hu, Andrew Rosenberg, Rohit Prabhavalkar, Daniel S. Park, Parisa Haghani, Jason Riesa, Ginger Perng, Hagen Soltau, Trevor Strohman, Bhuvana Ramabhadran, Tara N. Sainath, Pedro J. Moreno, Chung-Cheng Chiu, Johan Schalkwyk, Françoise Beaufays, Yonghui Wu:
Google USM: Scaling Automatic Speech Recognition Beyond 100 Languages. CoRR abs/2303.01037 (2023) - [i34]Mohammad Zeineldeen, Kartik Audhkhasi, Murali Karthick Baskar, Bhuvana Ramabhadran:
Robust Knowledge Distillation from RNN-T Models With Noisy Training Labels Using Full-Sum Loss. CoRR abs/2303.05958 (2023) - [i33]Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. CoRR abs/2304.14514 (2023) - [i32]Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel S. Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley:
Large-scale Language Model Rescoring on Long-form Data. CoRR abs/2306.08133 (2023) - [i31]Yochai Blau, Rohan Agrawal, Lior Madmony, Gary Wang, Andrew Rosenberg, Zhehuai Chen, Zorik Gekhman, Genady Beryozkin, Parisa Haghani, Bhuvana Ramabhadran:
Using Text Injection to Improve Recognition of Personal Identifiers in Speech. CoRR abs/2308.07393 (2023) - [i30]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Kartik Audhkhasi:
O-1: Self-training with Oracle and 1-best Hypothesis. CoRR abs/2308.07486 (2023) - 2022
- [j18]Murali Karthick Baskar
, Andrew Rosenberg, Bhuvana Ramabhadran
, Yu Zhang
, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. IEEE J. Sel. Top. Signal Process. 16(6): 1357-1366 (2022) - [j17]Yu Zhang
, Daniel S. Park
, Wei Han
, James Qin, Anmol Gulati, Joel Shor
, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li
, Min Ma
, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim
, Bhuvana Ramabhadran
, Tara N. Sainath
, Françoise Beaufays, Zhifeng Chen
, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. IEEE J. Sel. Top. Signal Process. 16(6): 1519-1532 (2022) - [c200]Neeraj Gaur, Tongzhou Chen, Ehsan Variani, Parisa Haghani, Bhuvana Ramabhadran, Pedro J. Moreno:
Multilingual Second-Pass Rescoring for Automatic Speech Recognition Systems. ICASSP 2022: 6407-6411 - [c199]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. ICASSP 2022: 7677-7681 - [c198]Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran, Pedro J. Moreno:
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition. INTERSPEECH 2022: 1026-1030 - [c197]Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prabhavalkar, W. Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Cal Peyser, Trevor Strohman, Yanzhang He, David Rybach:
Improving Rare Word Recognition with LM-aware MWER Training. INTERSPEECH 2022: 1031-1035 - [c196]Ehsan Variani, Michael Riley, David Rybach, Cyril Allauzen, Tongzhou Chen, Bhuvana Ramabhadran:
On Adaptive Weight Interpolation of the Hybrid Autoregressive Transducer. INTERSPEECH 2022: 1646-1650 - [c195]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Nicolás Serrano:
Reducing Domain mismatch in Self-supervised speech pre-training. INTERSPEECH 2022: 3028-3032 - [c194]Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Jesse Emond, Yinghui Huang, Pedro J. Moreno:
Non-Parallel Voice Conversion for ASR Augmentation. INTERSPEECH 2022: 3408-3412 - [c193]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen
:
MAESTRO: Matched Speech Text Representations through Modality Matching. INTERSPEECH 2022: 4093-4097 - [c192]Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park:
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR. SLT 2022: 23-30 - [c191]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging Joint Speech-Text Representation Learning for Zero Supervised Speech ASR. SLT 2022: 68-75 - [c190]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. SLT 2022: 197-204 - [i29]Murali Karthick Baskar, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang, Pedro J. Moreno:
Ask2Mask: Guided Data Selection for Masked Speech Modeling. CoRR abs/2202.12719 (2022) - [i28]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Ankur Bapna, Heiga Zen
:
MAESTRO: Matched Speech Text Representations through Modality Matching. CoRR abs/2204.03409 (2022) - [i27]Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prabhavalkar, W. Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Cal Peyser, Trevor Strohman, Yanzhang He, David Rybach:
Improving Rare Word Recognition with LM-aware MWER Training. CoRR abs/2204.07553 (2022) - [i26]Alëna Aksënova, Zhehuai Chen, Chung-Cheng Chiu, Daan van Esch, Pavel Golik
, Wei Han, Levi King, Bhuvana Ramabhadran, Andrew Rosenberg, Suzan Schwartz, Gary Wang:
Accented Speech Recognition: Benchmarking, Pre-training, and Diverse Data. CoRR abs/2205.08014 (2022) - [i25]Kartik Audhkhasi, Yinghui Huang, Bhuvana Ramabhadran, Pedro J. Moreno:
Analysis of Self-Attention Head Diversity for Conformer-based Automatic Speech Recognition. CoRR abs/2209.06096 (2022) - [i24]Gary Wang, Andrew Rosenberg, Bhuvana Ramabhadran, Fadi Biadsy, Yinghui Huang, Jesse Emond, Pedro Moreno Mengibar:
Non-Parallel Voice Conversion for ASR Augmentation. CoRR abs/2209.06987 (2022) - [i23]Zhehuai Chen, Ankur Bapna, Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno, Nanxin Chen:
Maestro-U: Leveraging joint speech-text representation learning for zero supervised speech ASR. CoRR abs/2210.10027 (2022) - [i22]Gary Wang, Ekin D. Cubuk, Andrew Rosenberg, Shuyang Cheng, Ron J. Weiss, Bhuvana Ramabhadran, Pedro J. Moreno, Quoc V. Le, Daniel S. Park:
G-Augment: Searching for the Meta-Structure of Data Augmentation Policies for ASR. CoRR abs/2210.10879 (2022) - [i21]Takaaki Saeki, Heiga Zen
, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-To-Speech. CoRR abs/2210.15447 (2022) - [i20]Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno:
Modular Hybrid Autoregressive Transducer. CoRR abs/2210.17049 (2022) - 2021
- [c189]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. ASRU 2021: 251-258 - [c188]Hainan Xu, Yinghui Huang, Yun Zhu, Kartik Audhkhasi, Bhuvana Ramabhadran:
Convolutional Dropout and Wordpiece Augmentation for End-to-End Speech Recognition. ICASSP 2021: 5984-5988 - [c187]Neeraj Gaur, Brian Farris, Parisa Haghani, Isabel Leal, Pedro J. Moreno, Manasa Prasad, Bhuvana Ramabhadran, Yun Zhu:
Mixture of Informed Experts for Multilingual Speech Recognition. ICASSP 2021: 6234-6238 - [c186]Rohan Doshi, Youzheng Chen, Liyang Jiang, Xia Zhang, Fadi Biadsy, Bhuvana Ramabhadran, Fang Chu, Andrew Rosenberg, Pedro J. Moreno:
Extending Parrotron: An End-to-End, Speech Conversion and Speech Recognition Model for Atypical Speech. ICASSP 2021: 6988-6992 - [c185]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Heiga Zen
, Mohammadreza Ghodsi, Yinghui Huang, Jesse Emond, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Semi-Supervision in ASR: Sequential MixMatch and Factorized TTS-Based Augmentation. Interspeech 2021: 736-740 - [c184]Kartik Audhkhasi, Tongzhou Chen, Bhuvana Ramabhadran, Pedro J. Moreno:
Mixture Model Attention: Flexible Streaming and Non-Streaming Automatic Speech Recognition. Interspeech 2021: 1812-1816 - [c183]Isabel Leal, Neeraj Gaur, Parisa Haghani, Brian Farris, Pedro J. Moreno, Manasa Prasad, Bhuvana Ramabhadran, Yun Zhu:
Self-Adaptive Distillation for Multilingual Speech Recognition: Leveraging Student Independence. Interspeech 2021: 2556-2560 - [c182]Hainan Xu, Kartik Audhkhasi, Yinghui Huang, Jesse Emond, Bhuvana Ramabhadran:
Regularizing Word Segmentation by Creating Misspellings. Interspeech 2021: 2561-2565 - [c181]Zhehuai Chen, Bhuvana Ramabhadran, Fadi Biadsy, Xia Zhang, Youzheng Chen, Liyang Jiang, Fang Chu, Rohan Doshi, Pedro J. Moreno:
Conformer Parrotron: A Faster and Stronger End-to-End Speech Conversion and Recognition Model for Atypical Speech. Interspeech 2021: 4828-4832 - [i19]Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Gary Wang, Pedro J. Moreno:
Injecting Text in Self-Supervised Speech Pretraining. CoRR abs/2108.12226 (2021) - [i18]Yu Zhang, Daniel S. Park, Wei Han, James Qin, Anmol Gulati, Joel Shor, Aren Jansen, Yuanzhong Xu, Yanping Huang, Shibo Wang, Zongwei Zhou, Bo Li, Min Ma, William Chan, Jiahui Yu, Yongqiang Wang, Liangliang Cao, Khe Chai Sim, Bhuvana Ramabhadran, Tara N. Sainath, Françoise Beaufays, Zhifeng Chen, Quoc V. Le, Chung-Cheng Chiu, Ruoming Pang, Yonghui Wu:
BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition. CoRR abs/2109.13226 (2021) - 2020
- [c180]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen
, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating Diverse and Natural Text-to-Speech Samples Using a Quantized Fine-Grained VAE and Autoregressive Prosody Prior. ICASSP 2020: 6699-6703 - [c179]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Yonghui Wu, Pedro J. Moreno:
Improving Speech Recognition Using Consistent Predictions on Synthesized Speech. ICASSP 2020: 7029-7033 - [c178]Ehsan Variani, Tongzhou Chen, James Apfel, Bhuvana Ramabhadran, Seungji Lee, Pedro J. Moreno:
Neural Oracle Search on N-BEST Hypotheses. ICASSP 2020: 7824-7828 - [c177]Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Anjuli Kannan, Brian Roark:
Language-Agnostic Multilingual Modeling. ICASSP 2020: 8239-8243 - [c176]Zhehuai Chen, Andrew Rosenberg, Yu Zhang, Gary Wang, Bhuvana Ramabhadran, Pedro J. Moreno:
Improving Speech Recognition Using GAN-Based Speech Synthesis and Contrastive Unspoken Text Selection. INTERSPEECH 2020: 556-560 - [c175]Gary Wang, Andrew Rosenberg, Zhehuai Chen, Yu Zhang, Bhuvana Ramabhadran, Pedro J. Moreno:
SCADA: Stochastic, Consistent and Adversarial Data Augmentation to Improve ASR. INTERSPEECH 2020: 2832-2836 - [c174]Yun Zhu, Parisa Haghani, Anshuman Tripathi, Bhuvana Ramabhadran, Brian Farris, Hainan Xu, Han Lu, Hasim Sak, Isabel Leal, Neeraj Gaur, Pedro J. Moreno, Qian Zhang:
Multilingual Speech Recognition with Self-Attention Structured Parameterization. INTERSPEECH 2020: 4741-4745 - [i17]Guangzhi Sun, Yu Zhang, Ron J. Weiss, Yuan Cao, Heiga Zen, Andrew Rosenberg, Bhuvana Ramabhadran, Yonghui Wu:
Generating diverse and natural text-to-speech samples using a quantized fine-grained VAE and auto-regressive prosody prior. CoRR abs/2002.03788 (2020) - [i16]Arindrima Datta, Bhuvana Ramabhadran, Jesse Emond, Anjuli Kannan, Brian Roark:
Language-agnostic Multilingual Modeling. CoRR abs/2004.09571 (2020) - [i15]Arindrima Datta, Guanlong Zhao, Bhuvana Ramabhadran, Eugene Weinstein:
LSTM Acoustic Models Learn to Align and Pronounce with Graphemes. CoRR abs/2008.06121 (2020)
2010 – 2019
- 2019
- [c173]Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. ASRU 2019: 996-1002 - [c172]Min Ma, Bhuvana Ramabhadran, Jesse Emond, Andrew Rosenberg, Fadi Biadsy:
Comparison of Data Augmentation and Adaptation Strategies for Code-switched Automatic Speech Recognition. ICASSP 2019: 6081-6085 - [c171]Yu Zhang, Ron J. Weiss, Heiga Zen
, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. INTERSPEECH 2019: 2080-2084 - [c170]Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee:
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model. INTERSPEECH 2019: 2130-2134 - [i14]Yu Zhang, Ron J. Weiss, Heiga Zen, Yonghui Wu, Zhifeng Chen, R. J. Skerry-Ryan, Ye Jia, Andrew Rosenberg, Bhuvana Ramabhadran:
Learning to Speak Fluently in a Foreign Language: Multilingual Speech Synthesis and Cross-Language Voice Cloning. CoRR abs/1907.04448 (2019) - [i13]Anjuli Kannan, Arindrima Datta, Tara N. Sainath, Eugene Weinstein, Bhuvana Ramabhadran, Yonghui Wu, Ankur Bapna, Zhifeng Chen, Seungji Lee:
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model. CoRR abs/1909.05330 (2019) - [i12]Andrew Rosenberg, Yu Zhang, Bhuvana Ramabhadran, Ye Jia, Pedro J. Moreno, Yonghui Wu, Zelin Wu:
Speech Recognition with Augmented Synthesized Speech. CoRR abs/1909.11699 (2019) - 2018
- [c169]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon
, Michael Picheny:
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition. ICASSP 2018: 4759-4763 - [c168]Andrew Rosenberg, Raul Fernandez, Bhuvana Ramabhadran:
Measuring the Effect of Linguistic Resources on Prosody Modeling for Speech Synthesis. ICASSP 2018: 5114-5118 - [c167]Xuesong Yang
, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Mark Hasegawa-Johnson:
Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. ICASSP 2018: 5989-5993 - [c166]Yinghui Huang, Abhinav Sethy, Kartik Audhkhasi, Bhuvana Ramabhadran:
Whole Sentence Neural Language Models. ICASSP 2018: 6089-6093 - [c165]Bhuvana Ramabhadran:
Open Problems in Speech Recognition. INTERSPEECH 2018: 1766 - [c164]Takashi Fukuda, Raul Fernandez, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Alexander Sorin, Gakuto Kurata:
Data Augmentation Improves Recognition of Foreign Accented Speech. INTERSPEECH 2018: 2409-2413 - [c163]Jesse Emond, Bhuvana Ramabhadran, Brian Roark, Pedro J. Moreno, Min Ma:
Transliteration Based Approaches to Improve Code-Switched Speech Recognition Performance. SLT 2018: 448-455 - [i11]Xuesong Yang, Kartik Audhkhasi, Andrew Rosenberg, Samuel Thomas, Bhuvana Ramabhadran, Mark Hasegawa-Johnson:
Joint Modeling of Accents and Acoustics for Multi-Accent Speech Recognition. CoRR abs/1802.02656 (2018) - 2017
- [j16]Kartik Audhkhasi, Andrew Rosenberg, George Saon
, Abhinav Sethy, Bhuvana Ramabhadran, Stanley F. Chen, Michael Picheny:
Recent progress in deep end-to-end models for spoken language processing. IBM J. Res. Dev. 61(4-5): 2:1-2:10 (2017) - [j15]Bhuvana Ramabhadran, Nancy F. Chen
, Mary P. Harper, Brian Kingsbury, Kate M. Knill:
Introduction to the Special Issue on End-to-End Speech and Language Processing. IEEE J. Sel. Top. Signal Process. 11(8): 1237-1239 (2017) - [j14]Kartik Audhkhasi
, Andrew Rosenberg
, Abhinav Sethy, Bhuvana Ramabhadran
, Brian Kingsbury
:
End-to-End ASR-Free Keyword Search From Speech. IEEE J. Sel. Top. Signal Process. 11(8): 1351-1359 (2017) - [j13]I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. IEEE Trans. Parallel Distributed Syst. 28(6): 1703-1714 (2017) - [c162]Gakuto Kurata, Bhuvana Ramabhadran, George Saon
, Abhinav Sethy:
Language modeling with highway LSTM. ASRU 2017: 244-251 - [c161]Ewout van den Berg, Bhuvana Ramabhadran, Michael Picheny:
Training variance and performance evaluation of neural networks in speech. ICASSP 2017: 2287-2291 - [c160]Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, George Saon
, Tom Sercu, Kartik Audhkhasi, Abhinav Sethy, Markus Nußbaum-Thom, Andrew Rosenberg:
Knowledge distillation across ensembles of multilingual models for low-resource languages. ICASSP 2017: 4825-4829 - [c159]Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-end ASR-free keyword search from speech. ICASSP 2017: 4840-4844 - [c158]Takashi Fukuda, Osamu Ichikawa, Gakuto Kurata, Ryuki Tachibana, Samuel Thomas, Bhuvana Ramabhadran:
Effective joint training of denoising feature space transforms and Neural Network based acoustic models. ICASSP 2017: 5190-5194 - [c157]Osamu Ichikawa, Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Bhuvana Ramabhadran:
Harmonic feature fusion for robust neural network-based acoustic modeling. ICASSP 2017: 5195-5199 - [c156]Andrew Rosenberg, Kartik Audhkhasi, Abhinav Sethy, Bhuvana Ramabhadran, Michael Picheny:
End-to-end speech recognition and keyword search on low-resource languages. ICASSP 2017: 5280-5284 - [c155]Tom Sercu, George Saon
, Jia Cui, Xiaodong Cui, Bhuvana Ramabhadran, Brian Kingsbury, Abhinav Sethy:
Network architectures for multilingual speech representation learning. ICASSP 2017: 5295-5299 - [c154]Raul Fernandez, Andrew Rosenberg, Alexander Sorin, Bhuvana Ramabhadran, Ron Hoory:
Voice-transformation-based data augmentation for prosodic classification. ICASSP 2017: 5530-5534 - [c153]George Saon
, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall:
English Conversational Telephone Speech Recognition by Humans and Machines. INTERSPEECH 2017: 132-136 - [c152]Yinghui Huang, Abhinav Sethy, Bhuvana Ramabhadran:
Fast Neural Network Language Model Lookups at N-Gram Speeds. INTERSPEECH 2017: 274-278 - [c151]Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, George Saon
:
Empirical Exploration of Novel Architectures and Objectives for Language Models. INTERSPEECH 2017: 279-283 - [c150]Asaf Rendel, Raul Fernandez, Zvi Kons, Andrew Rosenberg, Ron Hoory, Bhuvana Ramabhadran:
Weakly-Supervised Phrase Assignment from Text in a Speech-Synthesis System Using Noisy Labels. INTERSPEECH 2017: 759-763 - [c149]Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon
, Michael Picheny, David Nahamoo:
Direct Acoustics-to-Word Models for English Conversational Speech Recognition. INTERSPEECH 2017: 959-963 - [c148]Masayuki Suzuki, Gakuto Kurata, Abhinav Sethy, Bhuvana Ramabhadran, Kenneth Ward Church
, Mark Drake:
Symbol Sequence Search from Telephone Conversation. INTERSPEECH 2017: 3612-3616 - [c147]Takashi Fukuda, Masayuki Suzuki, Gakuto Kurata, Samuel Thomas, Jia Cui, Bhuvana Ramabhadran:
Efficient Knowledge Distillation from an Ensemble of Teachers. INTERSPEECH 2017: 3697-3701 - [c146]Andrew Rosenberg, Bhuvana Ramabhadran:
Bias and Statistical Significance in Evaluating Speech Synthesis with Mean Opinion Scores. INTERSPEECH 2017: 3976-3980 - [i10]Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-free Keyword Search from Speech. CoRR abs/1701.04313 (2017) - [i9]George Saon, Gakuto Kurata, Tom Sercu, Kartik Audhkhasi, Samuel Thomas, Dimitrios Dimitriadis, Xiaodong Cui, Bhuvana Ramabhadran, Michael Picheny, Lynn-Li Lim, Bergul Roomi, Phil Hall:
English Conversational Telephone Speech Recognition by Humans and Machines. CoRR abs/1703.02136 (2017) - [i8]Kartik Audhkhasi, Bhuvana Ramabhadran, George Saon, Michael Picheny, David Nahamoo:
Direct Acoustics-to-Word Models for English Conversational Speech Recognition. CoRR abs/1703.07754 (2017) - [i7]Gakuto Kurata, Bhuvana Ramabhadran, George Saon, Abhinav Sethy:
Language Modeling with Highway LSTM. CoRR abs/1709.06436 (2017) - [i6]Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building competitive direct acoustics-to-word models for English conversational speech recognition. CoRR abs/1712.03133 (2017) - 2016