default search action
Michael L. Seltzer
- > Home > Persons > Michael L. Seltzer
Publications
- 2024
- [c82]Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen:
End-to-End Speech Recognition Contextualization with Large Language Models. ICASSP 2024: 12406-12410 - 2023
- [i30]Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen:
End-to-End Speech Recognition Contextualization with Large Language Models. CoRR abs/2309.10917 (2023) - 2022
- [c74]Suyoun Kim, Duc Le, Weiyi Zheng, Tarun Singh, Abhinav Arora, Xiaoyu Zhai, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric. INTERSPEECH 2022: 3978-3982 - 2021
- [c73]Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le:
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer. ICASSP 2021: 7333-7337 - [c72]Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra:
Memory-Efficient Speech Recognition on Smart Devices. ICASSP 2021: 8368-8372 - [c71]Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer:
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion. Interspeech 2021: 1772-1776 - [c70]Suyoun Kim, Abhinav Arora, Duc Le, Ching-Feng Yeh, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding. Interspeech 2021: 1977-1981 - [c69]Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Dynamic Encoder Transducer: A Flexible Solution for Trading Off Accuracy for Latency. Interspeech 2021: 2042-2046 - [c68]Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Flexi-Transducer: Optimizing Latency, Accuracy and Compute for Multi-Domain On-Device Scenarios. Interspeech 2021: 2107-2111 - [c67]Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition. Interspeech 2021: 4553-4557 - [c64]Jay Mahadeokar, Yuan Shangguan, Duc Le, Gil Keren, Hang Su, Thong Le, Ching-Feng Yeh, Christian Fuegen, Michael L. Seltzer:
Alignment Restricted Streaming Recurrent Neural Network Transducer. SLT 2021: 52-59 - [c63]Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. SLT 2021: 251-257 - [i24]Ganesh Venkatesh, Alagappan Valliappan, Jay Mahadeokar, Yuan Shangguan, Christian Fuegen, Michael L. Seltzer, Vikas Chandra:
Memory-efficient Speech Recognition on Smart Devices. CoRR abs/2102.11531 (2021) - [i23]Suyoun Kim, Abhinav Arora, Duc Le, Ching-Feng Yeh, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Semantic Distance: A New Metric for ASR Performance Analysis Towards Spoken Language Understanding. CoRR abs/2104.02138 (2021) - [i22]Yangyang Shi, Varun Nagaraja, Chunyang Wu, Jay Mahadeokar, Duc Le, Rohit Prabhavalkar, Alex Xiao, Ching-Feng Yeh, Julian Chan, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Dynamic Encoder Transducer: A Flexible Solution For Trading Off Accuracy For Latency. CoRR abs/2104.02176 (2021) - [i21]Duc Le, Mahaveer Jain, Gil Keren, Suyoun Kim, Yangyang Shi, Jay Mahadeokar, Julian Chan, Yuan Shangguan, Christian Fuegen, Ozlem Kalinli, Yatharth Saraf, Michael L. Seltzer:
Contextualized Streaming End-to-End Speech Recognition with Trie-Based Deep Biasing and Shallow Fusion. CoRR abs/2104.02194 (2021) - [i20]Yuan Shangguan, Rohit Prabhavalkar, Hang Su, Jay Mahadeokar, Yangyang Shi, Jiatong Zhou, Chunyang Wu, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Dissecting User-Perceived Latency of On-Device E2E Speech Recognition. CoRR abs/2104.02207 (2021) - [i19]Jay Mahadeokar, Yangyang Shi, Yuan Shangguan, Chunyang Wu, Alex Xiao, Hang Su, Duc Le, Ozlem Kalinli, Christian Fuegen, Michael L. Seltzer:
Flexi-Transducer: Optimizing Latency, Accuracy and Compute forMulti-Domain On-Device Scenarios. CoRR abs/2104.02232 (2021) - [i17]Suyoun Kim, Duc Le, Weiyi Zheng, Tarun Singh, Abhinav Arora, Xiaoyu Zhai, Christian Fuegen, Ozlem Kalinli, Michael L. Seltzer:
Evaluating User Perception of Speech Recognition System Quality with Semantic Distance Metric. CoRR abs/2110.05376 (2021) - 2020
- [c62]Duc Le, Thilo Köhler, Christian Fuegen, Michael L. Seltzer:
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR. ICASSP 2020: 6869-6873 - [c61]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-Based Acoustic Modeling for Hybrid Speech Recognition. ICASSP 2020: 6874-6878 - [c59]Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer:
Weak-Attention Suppression for Transformer Based Speech Recognition. INTERSPEECH 2020: 4996-5000 - [i16]Yangyang Shi, Yongqiang Wang, Chunyang Wu, Christian Fuegen, Frank Zhang, Duc Le, Ching-Feng Yeh, Michael L. Seltzer:
Weak-Attention Suppression For Transformer Based Speech Recognition. CoRR abs/2005.09137 (2020) - [i14]Suyoun Kim, Yuan Shangguan, Jay Mahadeokar, Antoine Bruguier, Christian Fuegen, Michael L. Seltzer, Duc Le:
Improved Neural Language Model Fusion for Streaming Recurrent Neural Network Transducer. CoRR abs/2010.13878 (2020) - [i13]Jay Mahadeokar, Yuan Shangguan, Duc Le, Gil Keren, Hang Su, Thong Le, Ching-Feng Yeh, Christian Fuegen, Michael L. Seltzer:
Alignment Restricted Streaming Recurrent Neural Network Transducer. CoRR abs/2011.03072 (2020) - [i11]Duc Le, Gil Keren, Julian Chan, Jay Mahadeokar, Christian Fuegen, Michael L. Seltzer:
Deep Shallow Fusion for RNN-T Personalization. CoRR abs/2011.07754 (2020) - 2019
- [c58]Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fügen, Geoffrey Zweig, Michael L. Seltzer:
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition. ASRU 2019: 457-464 - [c57]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end Contextual Speech Recognition Using Class Language Models and a Token Passing Decoder. ICASSP 2019: 6186-6190 - [c56]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
Joint Grapheme and Phoneme Embeddings for Contextual End-to-End ASR. INTERSPEECH 2019: 3490-3494 - [i10]Duc Le, Xiaohui Zhang, Weiyi Zheng, Christian Fügen, Geoffrey Zweig, Michael L. Seltzer:
From Senones to Chenones: Tied Context-Dependent Graphemes for Hybrid Speech Recognition. CoRR abs/1910.01493 (2019) - [i9]Yongqiang Wang, Abdelrahman Mohamed, Duc Le, Chunxi Liu, Alex Xiao, Jay Mahadeokar, Hongzhao Huang, Andros Tjandra, Xiaohui Zhang, Frank Zhang, Christian Fuegen, Geoffrey Zweig, Michael L. Seltzer:
Transformer-based Acoustic Modeling for Hybrid Speech Recognition. CoRR abs/1910.09799 (2019) - [i8]Duc Le, Thilo Köhler, Christian Fuegen, Michael L. Seltzer:
G2G: TTS-Driven Pronunciation Learning for Graphemic Hybrid ASR. CoRR abs/1910.12612 (2019) - [i7]Ching-Feng Yeh, Jay Mahadeokar, Kaustubh Kalgaonkar, Yongqiang Wang, Duc Le, Mahaveer Jain, Kjell Schubert, Christian Fuegen, Michael L. Seltzer:
Transformer-Transducer: End-to-End Speech Recognition with Self-Attention. CoRR abs/1910.12977 (2019) - [i6]Mahaveer Jain, Kjell Schubert, Jay Mahadeokar, Ching-Feng Yeh, Kaustubh Kalgaonkar, Anuroop Sriram, Christian Fuegen, Michael L. Seltzer:
RNN-T For Latency Controlled ASR With Improved Beam Search. CoRR abs/1911.01629 (2019) - 2018
- [i4]Zhehuai Chen, Mahaveer Jain, Yongqiang Wang, Michael L. Seltzer, Christian Fuegen:
End-to-end contextual speech recognition using class language models and a token passing decoder. CoRR abs/1812.02142 (2018)
manage site settings
To protect your privacy, all features that rely on external API calls from your browser are turned off by default. You need to opt-in for them to become active. All settings here will be stored as cookies with your web browser. For more information see our F.A.Q.
Unpaywalled article links
Add open access links from to the list of external document links (if available).
Privacy notice: By enabling the option above, your browser will contact the API of unpaywall.org to load hyperlinks to open access articles. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Unpaywall privacy policy.
Archived links via Wayback Machine
For web page which are no longer available, try to retrieve content from the of the Internet Archive (if available).
Privacy notice: By enabling the option above, your browser will contact the API of archive.org to check for archived content of web pages that are no longer available. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Internet Archive privacy policy.
Reference lists
Add a list of references from , , and to record detail pages.
load references from crossref.org and opencitations.net
Privacy notice: By enabling the option above, your browser will contact the APIs of crossref.org, opencitations.net, and semanticscholar.org to load article reference information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the Crossref privacy policy and the OpenCitations privacy policy, as well as the AI2 Privacy Policy covering Semantic Scholar.
Citation data
Add a list of citing articles from and to record detail pages.
load citations from opencitations.net
Privacy notice: By enabling the option above, your browser will contact the API of opencitations.net and semanticscholar.org to load citation information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the OpenCitations privacy policy as well as the AI2 Privacy Policy covering Semantic Scholar.
OpenAlex data
Load additional information about publications from .
Privacy notice: By enabling the option above, your browser will contact the API of openalex.org to load additional information. Although we do not have any reason to believe that your call will be tracked, we do not have any control over how the remote server uses your data. So please proceed with care and consider checking the information given by OpenAlex.
last updated on 2024-08-08 19:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license
see also: Terms of Use | Privacy Policy | Imprint