Brian Kingsbury
2020 – today
- 2024
- [c129] A F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen:
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization. ICASSP 2024: 10931-10935
- [c128] Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR with Label Context. ICASSP 2024: 11681-11685
- [i50] A F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen:
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization. CoRR abs/2401.06980 (2024)
- [i49] Ankit Gupta, George Saon, Brian Kingsbury:
Exploring the limits of decoder-only models trained on public speech recognition corpora. CoRR abs/2402.00235 (2024)
- 2023
- [c127] Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. ICASSP 2023: 1-5
- [c126] Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, Eric Fosler-Lussier:
Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding. ICASSP 2023: 1-5
- [c125] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Brian Kingsbury:
Multi-Speaker Data Augmentation for Improved end-to-end Automatic Speech Recognition. ICASSP 2023: 1-5
- [c124] Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury:
ConvKT: Conversation-Level Knowledge Transfer for Context Aware End-to-End Spoken Language Understanding. INTERSPEECH 2023: 1129-1133
- [c123] Xiaodong Cui, George Saon, Brian Kingsbury:
Improving RNN Transducer Acoustic Models for English Conversational Speech Recognition. INTERSPEECH 2023: 1299-1303
- [c122] Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. INTERSPEECH 2023: 2268-2272
- [c121] Kristjan H. Greenewald, Brian Kingsbury, Yuancheng Yu:
High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction. ISIT 2023: 2613-2618
- [i48] Kristjan H. Greenewald, Brian Kingsbury, Yuancheng Yu:
High-Dimensional Smoothed Entropy Estimation via Dimensionality Reduction. CoRR abs/2305.04712 (2023)
- [i47] Andrew Rouditchenko, Sameer Khurana, Samuel Thomas, Rogério Feris, Leonid Karlinsky, Hilde Kuehne, David Harwath, Brian Kingsbury, James R. Glass:
Comparison of Multilingual Self-Supervised and Weakly-Supervised Speech Pre-Training for Adaptation to Unseen Languages. CoRR abs/2305.12606 (2023)
- [i46] Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR With Label Context. CoRR abs/2309.10926 (2023)
- [i45] Xiaodong Cui, Ashish R. Mittal, Songtao Lu, Wei Zhang, George Saon, Brian Kingsbury:
Soft Random Sampling: A Theoretical and Empirical Analysis. CoRR abs/2311.12727 (2023)
- 2022
- [c120] Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CVPR 2022: 19988-19997
- [c119] Songtao Lu, Xiaodong Cui, Mark S. Squillante, Brian Kingsbury, Lior Horesh:
Decentralized Bilevel Optimization for Personalized Client Learning. ICASSP 2022: 5543-5547
- [c118] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-end Models for Set Prediction in Spoken Language Understanding. ICASSP 2022: 7162-7166
- [c117] Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier:
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding. ICASSP 2022: 7497-7501
- [c116] Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury:
A New Data Augmentation Method for Intent Classification Enhancement and its Application on Spoken Conversation Datasets. ICASSP 2022: 7632-7636
- [c115] Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems. ICASSP 2022: 7932-7936
- [c114] Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models. ICASSP 2022: 8127-8131
- [c113] Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. INTERSPEECH 2022: 1656-1660
- [c112] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. INTERSPEECH 2022: 2038-2042
- [c111] Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. INTERSPEECH 2022: 2638-2642
- [c110] Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Kuo, Brian Kingsbury:
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems. INTERSPEECH 2022: 2683-2687
- [c109] Takashi Fukuda, Samuel Thomas, Masayuki Suzuki, Gakuto Kurata, George Saon, Brian Kingsbury:
Global RNN Transducer Models For Multi-dialect Speech Recognition. INTERSPEECH 2022: 3138-3142
- [c108] Songtao Lu, Siliang Zeng, Xiaodong Cui, Mark S. Squillante, Lior Horesh, Brian Kingsbury, Jia Liu, Mingyi Hong:
A Stochastic Linearized Augmented Lagrangian Method for Decentralized Bilevel Optimization. NeurIPS 2022
- [i44] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-End Models for Set Prediction in Spoken Language Understanding. CoRR abs/2201.12105 (2022)
- [i43] Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury:
A new data augmentation method for intent classification enhancement and its application on spoken conversation datasets. CoRR abs/2202.10137 (2022)
- [i42] Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs For Training and Adapting RNN Transducer ASR Models. CoRR abs/2202.13155 (2022)
- [i41] Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data To Build Spoken Language Understanding Systems. CoRR abs/2203.00006 (2022)
- [i40] Xiaodong Cui, George Saon, Tohru Nagano, Masayuki Suzuki, Takashi Fukuda, Brian Kingsbury, Gakuto Kurata:
Improving Generalization of Deep Neural Network Acoustic Models with Length Perturbation and N-best Based Label Smoothing. CoRR abs/2203.15176 (2022)
- [i39] Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier:
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding. CoRR abs/2204.05169 (2022)
- [i38] Vishal Sunder, Eric Fosler-Lussier, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury:
Tokenwise Contrastive Pretraining for Finer Speech-to-BERT Alignment in End-to-End Speech-to-Intent Systems. CoRR abs/2204.05188 (2022)
- [i37] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Kailash Gopalakrishnan:
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. CoRR abs/2206.07882 (2022)
- [i36] Jiatong Shi, George Saon, David Haws, Shinji Watanabe, Brian Kingsbury:
VQ-T: RNN Transducers using Vector-Quantized Prediction Network States. CoRR abs/2208.01818 (2022)
- [i35] Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. CoRR abs/2210.03625 (2022)
- 2021
- [j16] Xiaodong Cui, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. IEEE ACM Trans. Audio Speech Lang. Process. 29: 3565-3576 (2021)
- [c107] George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. ICASSP 2021: 5654-5658
- [c106] Xiaodong Cui, Songtao Lu, Brian Kingsbury:
Federated Acoustic Modeling for Automatic Speech Recognition. ICASSP 2021: 6748-6752
- [c105] Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-End Spoken Language Understanding Using Transformer Networks and Self-Supervised Pre-Trained Features. ICASSP 2021: 7483-7487
- [c104] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models for Spoken Language Understanding. ICASSP 2021: 7493-7497
- [c103] Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. ICCV 2021: 7992-8001
- [c102] Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. Interspeech 2021: 1254-1258
- [c101] Andrew Rouditchenko, Angie W. Boggust, David Harwath, Brian Chen, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Hilde Kuehne, Rameswar Panda, Rogério Schmidt Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. Interspeech 2021: 1584-1588
- [c100] Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. Interspeech 2021: 1802-1806
- [c99] Gakuto Kurata, George Saon, Brian Kingsbury, David Haws, Zoltán Tüske:
Improving Customization of Neural Transducers by Mitigating Acoustic Mismatch of Synthesized Audio. Interspeech 2021: 2027-2031
- [c98] Zoltán Tüske, George Saon, Brian Kingsbury:
On the Limit of English Conversational Speech Recognition. Interspeech 2021: 2062-2066
- [c97] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-Bit Quantization of LSTM-Based Speech Recognition Models. Interspeech 2021: 2586-2590
- [c96] Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. Interspeech 2021: 3006-3010
- [i34] Xiaodong Cui, Songtao Lu, Brian Kingsbury:
Federated Acoustic Modeling For Automatic Speech Recognition. CoRR abs/2102.04429 (2021)
- [i33] George Saon, Zoltán Tüske, Daniel Bolaños, Brian Kingsbury:
Advancing RNN Transducer Technology for Speech Recognition. CoRR abs/2103.09935 (2021)
- [i32] Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Zoltán Tüske, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory:
RNN Transducer Models For Spoken Language Understanding. CoRR abs/2104.03842 (2021)
- [i31] Brian Chen, Andrew Rouditchenko, Kevin Duarte, Hilde Kuehne, Samuel Thomas, Angie W. Boggust, Rameswar Panda, Brian Kingsbury, Rogério Schmidt Feris, David Harwath, James R. Glass, Michael Picheny, Shih-Fu Chang:
Multimodal Clustering Networks for Self-supervised Learning from Unlabeled Videos. CoRR abs/2104.12671 (2021)
- [i30] Zoltán Tüske, George Saon, Brian Kingsbury:
On the limit of English conversational speech recognition. CoRR abs/2105.00982 (2021)
- [i29] Ashish R. Mittal, Samarth Bharadwaj, Shreya Khare, Saneem A. Chemmengath, Karthik Sankaranarayanan, Brian Kingsbury:
Representation based meta-learning for few-shot spoken intent recognition. CoRR abs/2106.15238 (2021)
- [i28] Jatin Ganhotra, Samuel Thomas, Hong-Kwang Jeff Kuo, Sachindra Joshi, George Saon, Zoltán Tüske, Brian Kingsbury:
Integrating Dialog History into End-to-End Spoken Language Understanding Systems. CoRR abs/2108.08405 (2021)
- [i27] Xiaodong Cui, Brian Kingsbury, George Saon, David Haws, Zoltán Tüske:
Reducing Exposure Bias in Training Recurrent Neural Network Transducers. CoRR abs/2108.10803 (2021)
- [i26] Andrea Fasoli, Chia-Yu Chen, Mauricio J. Serrano, Xiao Sun, Naigang Wang, Swagath Venkataramani, George Saon, Xiaodong Cui, Brian Kingsbury, Wei Zhang, Zoltán Tüske, Kailash Gopalakrishnan:
4-bit Quantization of LSTM-based Speech Recognition Models. CoRR abs/2108.12074 (2021)
- [i25] Xiaodong Cui, Wei Zhang, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung:
Asynchronous Decentralized Distributed Training of Acoustic Models. CoRR abs/2110.11199 (2021)
- [i24] Andrew Rouditchenko, Angie W. Boggust, David Harwath, Samuel Thomas, Hilde Kuehne, Brian Chen, Rameswar Panda, Rogério Feris, Brian Kingsbury, Michael Picheny, James R. Glass:
Cascaded Multilingual Audio-Visual Learning from Videos. CoRR abs/2111.04823 (2021)
- [i23] Wei Zhang, Mingrui Liu, Yu Feng, Xiaodong Cui, Brian Kingsbury, Yuhai Tu:
Loss Landscape Dependent Self-Adjusting Learning Rates in Decentralized Stochastic Gradient Descent. CoRR abs/2112.01433 (2021)
- [i22] Nina Shvetsova, Brian Chen, Andrew Rouditchenko, Samuel Thomas, Brian Kingsbury, Rogério Feris, David Harwath, James R. Glass, Hilde Kuehne:
Everything at Once - Multi-modal Fusion Transformer for Video Retrieval. CoRR abs/2112.04446 (2021)
- 2020
- [c95] Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. ICASSP 2020: 3022-3026
- [c94] Guojing Cong, Brian Kingsbury, Chih-Chieh Yang, Tianyi Liu:
Fast Training of Deep Neural Networks for Speech Recognition. ICASSP 2020: 6884-6888
- [c93] Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny:
Leveraging Unpaired Text Data for Training End-To-End Speech-to-Intent Systems. ICASSP 2020: 7984-7988
- [c92] Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single Headed Attention Based Sequence-to-Sequence Model for State-of-the-Art Results on Switchboard. INTERSPEECH 2020: 551-555
- [c91] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. INTERSPEECH 2020: 906-910
- [c90] Ashish R. Mittal, Samarth Bharadwaj, Shreya Khare, Saneem A. Chemmengath, Karthik Sankaranarayanan, Brian Kingsbury:
Representation Based Meta-Learning for Few-Shot Spoken Intent Recognition. INTERSPEECH 2020: 4283-4287
- [c89] Samuel Thomas, Kartik Audhkhasi, Brian Kingsbury:
Transliteration Based Data Augmentation for Training Multilingual ASR Acoustic Models in Low Resource Settings. INTERSPEECH 2020: 4736-4740
- [i21] Zoltán Tüske, George Saon, Kartik Audhkhasi, Brian Kingsbury:
Single headed attention based sequence-to-sequence model for state-of-the-art results on Switchboard-300. CoRR abs/2001.07263 (2020)
- [i20] Wei Zhang, Xiaodong Cui, Abdullah Kayi, Mingrui Liu, Ulrich Finkler, Brian Kingsbury, George Saon, Youssef Mroueh, Alper Buyuktosunoglu, Payel Das, David S. Kung, Michael Picheny:
Improving Efficiency in Large-Scale Decentralized Distributed Training. CoRR abs/2002.01119 (2020)
- [i19] Andrew Rouditchenko, Angie W. Boggust, David Harwath, Dhiraj Joshi, Samuel Thomas, Kartik Audhkhasi, Rogério Feris, Brian Kingsbury, Michael Picheny, Antonio Torralba, James R. Glass:
AVLnet: Learning Audio-Visual Language Representations from Instructional Videos. CoRR abs/2006.09199 (2020)
- [i18] Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Yinghui Huang, Kartik Audhkhasi, Brian Kingsbury, Gakuto Kurata, Zvi Kons, Ron Hoory, Luis A. Lastras:
End-to-End Spoken Language Understanding Without Full Transcripts. CoRR abs/2009.14386 (2020)
- [i17] Yinghui Huang, Hong-Kwang Kuo, Samuel Thomas, Zvi Kons, Kartik Audhkhasi, Brian Kingsbury, Ron Hoory, Michael Picheny:
Leveraging Unpaired Text Data for Training End-to-End Speech-to-Intent Systems. CoRR abs/2010.04284 (2020)
- [i16] Edmilson da Silva Morais, Hong-Kwang Jeff Kuo, Samuel Thomas, Zoltán Tüske, Brian Kingsbury:
End-to-end spoken language understanding using transformer networks and self-supervised pre-trained features. CoRR abs/2011.08238 (2020)
2010 – 2019
- 2019
- [j15] Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel Hsu, Brian Kingsbury, Michael Picheny, Fei Sha:
Kernel Approximation Methods for Speech Recognition. J. Mach. Learn. Res. 20: 59:1-59:36 (2019)
- [c88] George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury, Michael Picheny, Samuel Thomas:
Simplified LSTMS for Speech Recognition. ASRU 2019: 547-553
- [c87] Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies for Automatic Speech Recognition. ICASSP 2019: 5706-5710
- [c86] George Saon, Zoltán Tüske, Kartik Audhkhasi, Brian Kingsbury:
Sequence Noise Injected Training for End-to-end Speech Recognition. ICASSP 2019: 6261-6265
- [c85] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. ICASSP 2019: 6455-6459
- [c84] Anna Choromanska, Benjamin Cowen, Sadhana Kumaravel, Ronny Luss, Mattia Rigotti, Irina Rish, Paolo Diachille, Viatcheslav Gurev, Brian Kingsbury, Ravi Tejwani, Djallel Bouneffouf:
Beyond Backprop: Online Alternating Minimization with Auxiliary Variables. ICML 2019: 1193-1202
- [c83] Ziv Goldfeld, Ewout van den Berg, Kristjan H. Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy:
Estimating Information Flow in Deep Neural Networks. ICML 2019: 2299-2308
- [c82] Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. INTERSPEECH 2019: 326-330
- [c81] Kartik Audhkhasi, George Saon, Zoltán Tüske, Brian Kingsbury, Michael Picheny:
Forget a Bit to Learn Better: Soft Forgetting for CTC-Based Automatic Speech Recognition. INTERSPEECH 2019: 2618-2622
- [c80] Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System for Automatic Speech Recognition. INTERSPEECH 2019: 2628-2632
- [i15] Wei Zhang, Xiaodong Cui, Ulrich Finkler, Brian Kingsbury, George Saon, David S. Kung, Michael Picheny:
Distributed Deep Learning Strategies For Automatic Speech Recognition. CoRR abs/1904.04956 (2019)
- [i14] Samuel Thomas, Masayuki Suzuki, Yinghui Huang, Gakuto Kurata, Zoltán Tüske, George Saon, Brian Kingsbury, Michael Picheny, Tom Dibert, Alice Kaiser-Schatzlein, Bern Samko:
English Broadcast News Speech Recognition by Humans and Machines. CoRR abs/1904.13258 (2019)
- [i13] Wei Zhang, Xiaodong Cui, Ulrich Finkler, George Saon, Abdullah Kayi, Alper Buyuktosunoglu, Brian Kingsbury, David S. Kung, Michael Picheny:
A Highly Efficient Distributed Deep Learning System For Automatic Speech Recognition. CoRR abs/1907.05701 (2019)
- [i12] Michael Picheny, Zoltán Tüske, Brian Kingsbury, Kartik Audhkhasi, Xiaodong Cui, George Saon:
Challenging the Boundaries of Speech Recognition: The MALACH Corpus. CoRR abs/1908.03455 (2019)
- 2018
- [c79] Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building Competitive Direct Acoustics-to-Word Models for English Conversational Speech Recognition. ICASSP 2018: 4759-4763
- [i11] Anna Choromanska, Sadhana Kumaravel, Ronny Luss, Irina Rish, Brian Kingsbury, Ravi Tejwani, Djallel Bouneffouf:
Beyond Backprop: Alternating Minimization with co-Activation Memory. CoRR abs/1806.09077 (2018)
- [i10] Ziv Goldfeld, Ewout van den Berg, Kristjan H. Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy:
Estimating Information Flow in Neural Networks. CoRR abs/1810.05728 (2018)
- [i9] Vidya Muthukumar, Tejaswini Pedapati, Nalini K. Ratha, Prasanna Sattigeri, Chai-Wah Wu, Brian Kingsbury, Abhishek Kumar, Samuel Thomas, Aleksandra Mojsilovic, Kush R. Varshney:
Understanding Unequal Gender Classification Accuracy from Face Images. CoRR abs/1812.00099 (2018)
- 2017
- [j14] Bhuvana Ramabhadran, Nancy F. Chen, Mary P. Harper, Brian Kingsbury, Kate M. Knill:
Introduction to the Special Issue on End-to-End Speech and Language Processing. IEEE J. Sel. Top. Signal Process. 11(8): 1237-1239 (2017)
- [j13] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-Free Keyword Search From Speech. IEEE J. Sel. Top. Signal Process. 11(8): 1351-1359 (2017)
- [j12] I-Hsin Chung, Tara N. Sainath, Bhuvana Ramabhadran, Michael Picheny, John A. Gunnels, Vernon Austel, Upendra V. Chaudhari, Brian Kingsbury:
Parallel Deep Neural Network Training for Big Data on Blue Gene/Q. IEEE Trans. Parallel Distributed Syst. 28(6): 1703-1714 (2017)
- [c78] Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Tom Sercu, Kartik Audhkhasi, Abhinav Sethy, Markus Nußbaum-Thom, Andrew Rosenberg:
Knowledge distillation across ensembles of multilingual models for low-resource languages. ICASSP 2017: 4825-4829
- [c77] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-end ASR-free keyword search from speech. ICASSP 2017: 4840-4844
- [c76] Tom Sercu, George Saon, Jia Cui, Xiaodong Cui, Bhuvana Ramabhadran, Brian Kingsbury, Abhinav Sethy:
Network architectures for multilingual speech representation learning. ICASSP 2017: 5295-5299
- [c75] Guojing Cong, Brian Kingsbury, Soumyadip Gosh, George Saon, Fan Zhou:
Accelerating deep neural network learning for speech recognition on a cluster of GPUs. MLHPC@SC 2017: 3:1-3:8
- [i8] Avner May, Alireza Bagheri Garakani, Zhiyun Lu, Dong Guo, Kuan Liu, Aurélien Bellet, Linxi Fan, Michael Collins, Daniel J. Hsu, Brian Kingsbury, Michael Picheny, Fei Sha:
Kernel Approximation Methods for Speech Recognition. CoRR abs/1701.03577 (2017)
- [i7] Kartik Audhkhasi, Andrew Rosenberg, Abhinav Sethy, Bhuvana Ramabhadran, Brian Kingsbury:
End-to-End ASR-free Keyword Search from Speech. CoRR abs/1701.04313 (2017)
- [i6] Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny:
Building competitive direct acoustics-to-word models for English conversational speech recognition. CoRR abs/1712.03133 (2017)
- 2016
- [c74] Avner May, Michael Collins, Daniel J. Hsu, Brian Kingsbury:
Compact kernel models for acoustic modeling via random feature selection. ICASSP 2016: 2424-2428
- [c73] Jie Chen, Lingfei Wu, Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran:
Efficient one-vs-one kernel ridge regression for speech recognition. ICASSP 2016: 2454-2458
- [c72] Tom Sercu, Christian Puhrsch, Brian Kingsbury, Yann LeCun:
Very deep multilingual convolutional neural networks for LVCSR. ICASSP 2016: 4955-4959
- [c71] Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha:
A comparison between deep neural nets and kernel acoustic models for speech recognition. ICASSP 2016: 5070-5074
- [c70] Gakuto Kurata, Brian Kingsbury:
Improved Neural Network Initialization by Grouping Context-Dependent Targets for Acoustic Modeling. INTERSPEECH 2016: 27-31
- [c69] Samuel Thomas, Kartik Audhkhasi, Jia Cui, Brian Kingsbury, Bhuvana Ramabhadran:
Multilingual Data Selection for Low Resource Speech Recognition. INTERSPEECH 2016: 3853-3857
- [i5] Zhiyun Lu, Dong Guo, Alireza Bagheri Garakani, Kuan Liu, Avner May, Aurélien Bellet, Linxi Fan, Michael Collins, Brian Kingsbury, Michael Picheny, Fei Sha:
A Comparison between Deep Neural Nets and Kernel Acoustic Models for Speech Recognition. CoRR abs/1603.05800 (2016)
- 2015
- [j11] Tara N. Sainath, Brian Kingsbury, George Saon, Hagen Soltau, Abdel-rahman Mohamed, George E. Dahl, Bhuvana Ramabhadran:
Deep Convolutional Neural Networks for Large-scale Speech Tasks. Neural Networks 64: 39-48 (2015)
- [j10]