
Yashesh Gaur
2020 – today
- 2021
- [i17] Xuankai Chang, Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka:
Hypothesis Stitcher for End-to-End Speaker-attributed ASR on Long-form Multi-talker Recordings. CoRR abs/2101.01853 (2021)
- 2020
- [c19] Hirofumi Inaguma, Yashesh Gaur, Liang Lu, Jinyu Li, Yifan Gong:
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR. ICASSP 2020: 6064-6068
- [c18] Jinyu Li, Yu Wu, Yashesh Gaur, Chengyi Wang, Rui Zhao, Shujie Liu:
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition. INTERSPEECH 2020: 1-5
- [c17] Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka:
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of any Number of Speakers. INTERSPEECH 2020: 36-40
- [c16] Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez:
A Federated Approach in Training Acoustic Models. INTERSPEECH 2020: 981-985
- [c15] Jeremy H. M. Wong, Yashesh Gaur, Rui Zhao, Liang Lu, Eric Sun, Jinyu Li, Yifan Gong:
Combination of End-to-End and Hybrid Models for Speech Recognition. INTERSPEECH 2020: 1783-1787
- [c14] Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka:
Serialized Output Training for End-to-End Overlapped Speech Recognition. INTERSPEECH 2020: 2797-2801
- [c13] Ken'ichi Kumatani, Dimitrios Dimitriadis, Yashesh Gaur, Robert Gmyr, Sefik Emre Eskimez, Jinyu Li, Michael Zeng:
Sequence-Level Self-Learning with Multiple Hypotheses. INTERSPEECH 2020: 3775-3779
- [i16] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Character-Aware Attention-Based End-to-End Speech Recognition. CoRR abs/2001.01795 (2020)
- [i15] Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong:
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition. CoRR abs/2001.01798 (2020)
- [i14] Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Takuya Yoshioka:
Serialized Output Training for End-to-End Overlapped Speech Recognition. CoRR abs/2003.12687 (2020)
- [i13] Hirofumi Inaguma, Yashesh Gaur, Liang Lu, Jinyu Li, Yifan Gong:
Minimum Latency Training Strategies for Streaming Sequence-to-Sequence ASR. CoRR abs/2004.05009 (2020)
- [i12] Jinyu Li, Yu Wu, Yashesh Gaur, Chengyi Wang, Rui Zhao, Shujie Liu:
On the Comparison of Popular End-to-End Models for Large Scale Speech Recognition. CoRR abs/2005.14327 (2020)
- [i11] Naoyuki Kanda, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Tianyan Zhou, Takuya Yoshioka:
Joint Speaker Counting, Speech Recognition, and Speaker Identification for Overlapped Speech of Any Number of Speakers. CoRR abs/2006.10930 (2020)
- [i10] Dimitrios Dimitriadis, Ken'ichi Kumatani, Robert Gmyr, Yashesh Gaur, Sefik Emre Eskimez:
Federated Transfer Learning with Dynamic Gradient Aggregation. CoRR abs/2008.02452 (2020)
- [i9] Naoyuki Kanda, Xuankai Chang, Yashesh Gaur, Xiao-Fei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka:
Investigation of End-To-End Speaker-Attributed ASR for Continuous Multi-Talker Recordings. CoRR abs/2008.04546 (2020)
- [i8] Zhong Meng, Sarangarajan Parthasarathy, Eric Sun, Yashesh Gaur, Naoyuki Kanda, Liang Lu, Xie Chen, Rui Zhao, Jinyu Li, Yifan Gong:
Internal Language Model Estimation for Domain-Adaptive End-to-End Speech Recognition. CoRR abs/2011.01991 (2020)
- [i7] Naoyuki Kanda, Zhong Meng, Liang Lu, Yashesh Gaur, Xiaofei Wang, Zhuo Chen, Takuya Yoshioka:
Minimum Bayes Risk Training for End-to-End Speaker-Attributed ASR. CoRR abs/2011.02921 (2020)
- [i6] Xiaofei Wang, Naoyuki Kanda, Yashesh Gaur, Zhuo Chen, Zhong Meng, Takuya Yoshioka:
Exploring End-to-End Multi-channel ASR with Bias Information for Meeting Transcription. CoRR abs/2011.03110 (2020)
- [i5] Shahram Ghorbani, Yashesh Gaur, Yu Shi, Jinyu Li:
Listen, Look and Deliberate: Visual context-aware speech recognition using pre-trained text-video representations. CoRR abs/2011.04084 (2020)
2010 – 2019
- 2019
- [c12] Zhong Meng, Jinyu Li, Yashesh Gaur, Yifan Gong:
Domain Adaptation via Teacher-Student Learning for End-to-End Speech Recognition. ASRU 2019: 268-275
- [c11] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Character-Aware Attention-Based End-to-End Speech Recognition. ASRU 2019: 949-955
- [c10] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Speaker Adaptation for Attention-Based End-to-End Speech Recognition. INTERSPEECH 2019: 241-245
- [c9] Yashesh Gaur, Jinyu Li, Zhong Meng, Yifan Gong:
Acoustic-to-Phrase Models for Speech Recognition. INTERSPEECH 2019: 2240-2244
- [i4] Zhong Meng, Yashesh Gaur, Jinyu Li, Yifan Gong:
Speaker Adaptation for Attention-Based End-to-End Speech Recognition. CoRR abs/1911.03762 (2019)
- 2018
- [c8] Anuroop Sriram, Heewoo Jun, Yashesh Gaur, Sanjeev Satheesh:
Robust Speech Recognition Using Generative Adversarial Networks. ICASSP 2018: 5639-5643
- 2017
- [c7] Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, Anuroop Sriram, Zhenyao Zhu:
Exploring neural transducers for end-to-end speech recognition. ASRU 2017: 206-213
- [i3] Eric Battenberg, Rewon Child, Adam Coates, Christopher Fougner, Yashesh Gaur, Jiaji Huang, Heewoo Jun, Ajay Kannan, Markus Kliegl, Atul Kumar, Hairong Liu, Vinay Rao, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu:
Reducing Bias in Production Speech Models. CoRR abs/1705.04400 (2017)
- [i2] Eric Battenberg, Jitong Chen, Rewon Child, Adam Coates, Yashesh Gaur, Yi Li, Hairong Liu, Sanjeev Satheesh, David Seetapun, Anuroop Sriram, Zhenyao Zhu:
Exploring Neural Transducers for End-to-End Speech Recognition. CoRR abs/1707.07413 (2017)
- [i1] Anuroop Sriram, Heewoo Jun, Yashesh Gaur, Sanjeev Satheesh:
Robust Speech Recognition Using Generative Adversarial Networks. CoRR abs/1711.01567 (2017)
- 2016
- [c6] Yashesh Gaur, Florian Metze, Jeffrey P. Bigham:
Manipulating Word Lattices to Incorporate Human Corrections. INTERSPEECH 2016: 3062-3065
- [c5] Yashesh Gaur, Walter S. Lasecki, Florian Metze, Jeffrey P. Bigham:
The effects of automatic speech recognition quality on human transcription latency. W4A 2016: 23:1-23:8
- 2015
- [c4] Yashesh Gaur:
The Effects of Automatic Speech Recognition Quality on Human Transcription Latency. ASSETS 2015: 367-368
- [c3] Yashesh Gaur, Florian Metze, Yajie Miao, Jeffrey P. Bigham:
Using keyword spotting to help humans correct captioning faster. INTERSPEECH 2015: 2829-2833
- 2013
- [c2] Hemant A. Patil, Tanvina B. Patel, Swati Talesara, Nirmesh J. Shah, Hardik B. Sailor, Bhavik B. Vachhani, Janki Akhani, Bhargav Kanakiya, Yashesh Gaur, Vibha Prajapati:
Algorithms for speech segmentation at syllable-level for text-to-speech synthesis system in Gujarati. O-COCOSDA/CASLRE 2013: 1-7
- [c1] Yashesh Gaur, Maulik C. Madhavi, Hemant A. Patil:
Speaker Recognition Using Sparse Representation via Superimposed Features. PReMI 2013: 140-147
last updated on 2021-01-24 22:46 CET by the dblp team
all metadata released as open data under CC0 1.0 license