


Florian Metze
Person information

- affiliation: Carnegie Mellon University, Pittsburgh, USA
2020 – today
- 2022
- [i69]Juncheng B. Li, Shuhui Qu, Xinjian Li, Po-Yao Huang, Florian Metze:
On Adversarial Robustness of Large-scale Audio Visual Learning. CoRR abs/2203.12122 (2022) - [i68]Juncheng B. Li, Shuhui Qu, Po-Yao Huang, Florian Metze:
AudioTagging Done Right: 2nd comparison of deep learning methods for environmental sound classification. CoRR abs/2203.13448 (2022) - [i67]Juncheng B. Li, Shuhui Qu, Florian Metze:
Robustness of Neural Architectures for Audio Event Detection. CoRR abs/2205.03268 (2022) - 2021
- [c201]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. ACL/IJCNLP (Findings) 2021: 4227-4239 - [c200]Amanda Cardoso Duarte, Shruti Palaskar, Lucas Ventura, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language. CVPR 2021: 2735-2744 - [c199]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. EACL 2021: 2976-2992 - [c198]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. EMNLP (1) 2021: 6787-6800 - [c197]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition Through the Lens of Adversary. ICASSP 2021: 616-620 - [c196]Xinjian Li, David R. Mortensen, Florian Metze, Alan W. Black:
Multilingual Phonetic Dataset for Low Resource Speech Recognition. ICASSP 2021: 6958-6962 - [c195]Xinjian Li, Juncheng Li, Jiali Yao, Alan W. Black, Florian Metze:
Phone Distribution Estimation for Low Resource Languages. ICASSP 2021: 7233-7237 - [c194]Mandela Patrick, Po-Yao Huang, Ishan Misra, Florian Metze, Andrea Vedaldi, Yuki M. Asano, João F. Henriques:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. ICCV 2021: 10540-10552 - [c193]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. ICLR 2021 - [c192]Shruti Palaskar, Ruslan Salakhutdinov, Alan W. Black, Florian Metze:
Multimodal Speech Summarization Through Semantic Concept Learning. Interspeech 2021: 791-795 - [c191]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. Interspeech 2021: 1264-1268 - [c190]Xinjian Li, Juncheng Li, Florian Metze, Alan W. Black:
Hierarchical Phone Recognition with Compositional Phonetics. Interspeech 2021: 2461-2465 - [c189]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. Interspeech 2021: 2471-2475 - [c188]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. NAACL-HLT 2021: 1882-1896 - [c187]Poyao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alex Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. NAACL-HLT 2021: 2443-2459 - [c186]Mandela Patrick, Dylan Campbell, Yuki M. Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. NeurIPS 2021: 12493-12506 - [e5]Heng Tao Shen, Yueting Zhuang, John R. Smith, Yang Yang, Pablo Cesar, Florian Metze, Balakrishnan Prabhakaran:
MM '21: ACM Multimedia Conference, Virtual Event, China, October 20 - 24, 2021. ACM 2021, ISBN 978-1-4503-8651-7 - [i66]Abhilasha Ravichander, Siddharth Dalmia, Maria Ryskina, Florian Metze, Eduard H. Hovy, Alan W. Black:
NoiseQA: Challenge Set Evaluation for User-Centric Question Answering. CoRR abs/2102.08345 (2021) - [i65]Po-Yao Huang, Mandela Patrick, Junjie Hu, Graham Neubig, Florian Metze, Alexander G. Hauptmann:
Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models. CoRR abs/2103.08849 (2021) - [i64]Mandela Patrick, Yuki Markus Asano, Bernie Huang, Ishan Misra, Florian Metze, João F. Henriques, Andrea Vedaldi:
Space-Time Crop & Attend: Improving Cross-modal Video Representation Learning. CoRR abs/2103.10211 (2021) - [i63]Triantafyllos Afouras, Yuki Markus Asano, Francois Fagan, Andrea Vedaldi, Florian Metze:
Self-supervised object detection from audio-visual correspondence. CoRR abs/2104.06401 (2021) - [i62]Siddharth Dalmia, Brian Yan, Vikas Raunak, Florian Metze, Shinji Watanabe:
Searchable Hidden Intermediates for End-to-End Models of Decomposable Sequence Tasks. CoRR abs/2105.00573 (2021) - [i61]Hu Xu, Gargi Ghosh, Po-Yao Huang, Prahal Arora, Masoumeh Aminzadeh, Christoph Feichtenhofer, Florian Metze, Luke Zettlemoyer:
VLM: Task-agnostic Video-Language Model Pre-training for Video Understanding. CoRR abs/2105.09996 (2021) - [i60]Mandela Patrick, Dylan Campbell, Yuki Markus Asano, Ishan Misra, Florian Metze, Christoph Feichtenhofer, Andrea Vedaldi, João F. Henriques:
Keeping Your Eye on the Ball: Trajectory Attention in Video Transformers. CoRR abs/2106.05392 (2021) - [i59]Siddhant Arora, Alissa Ostapenko, Vijay Viswanathan, Siddharth Dalmia, Florian Metze, Shinji Watanabe, Alan W. Black:
Rethinking End-to-End Evaluation of Decomposable Tasks: A Case Study on Spoken Language Understanding. CoRR abs/2106.15065 (2021) - [i58]Brian Yan, Siddharth Dalmia, David R. Mortensen, Florian Metze, Shinji Watanabe:
Differentiable Allophone Graphs for Language-Universal Speech Recognition. CoRR abs/2107.11628 (2021) - [i57]Hu Xu, Gargi Ghosh, Po-Yao Huang, Dmytro Okhonko, Armen Aghajanyan, Florian Metze, Luke Zettlemoyer, Christoph Feichtenhofer:
VideoCLIP: Contrastive Pre-training for Zero-shot Video-Text Understanding. CoRR abs/2109.14084 (2021) - [i56]Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
Speech Summarization using Restricted Self-Attention. CoRR abs/2110.06263 (2021) - 2020
- [j15]Shruti Palaskar, Ramon Sanabria, Florian Metze:
Transfer learning for multimodal dialog. Comput. Speech Lang. 64: 101093 (2020) - [j14]Lucia Specia, Loïc Barrault, Ozan Caglayan, Amanda Cardoso Duarte, Desmond Elliott, Spandana Gella, Nils Holzenberger, Chiraag Lala, Sun Jae Lee, Jindrich Libovický, Pranava Madhyastha, Florian Metze, Karl Mulligan, Alissa Ostapenko, Shruti Palaskar, Ramon Sanabria, Josiah Wang, Raman Arora:
Grounded Sequence to Sequence Transduction. IEEE J. Sel. Top. Signal Process. 14(3): 577-591 (2020) - [j13]Odette Scharenborg, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller:
Speech Technology for Unwritten Languages. IEEE ACM Trans. Audio Speech Lang. Process. 28: 964-975 (2020) - [j12]Fengquan Dong, Kun Qian, Zhao Ren, Alice Baird, Xinjian Li, Zhenyu Dai, Bo Dong, Florian Metze, Yoshiharu Yamamoto, Björn W. Schuller:
Machine Listening for Heart Status Monitoring: Introducing and Benchmarking HSS - The Heart Sounds Shenzhen Corpus. IEEE J. Biomed. Health Informatics 24(7): 2082-2092 (2020) - [c185]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-Shot Learning for Automatic Phonemic Transcription. AAAI 2020: 8261-8268 - [c184]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. EMNLP (Findings) 2020: 2667-2677 - [c183]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. EMNLP (Findings) 2020: 3088-3095 - [c182]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. ICASSP 2020: 6304-6308 - [c181]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. ICASSP 2020: 6344-6348 - [c180]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. ICASSP 2020: 8249-8253 - [c179]Mahaveer Jain, Gil Keren, Jay Mahadeokar, Geoffrey Zweig, Florian Metze, Yatharth Saraf:
Contextual RNN-T for Open Domain ASR. INTERSPEECH 2020: 11-15 - [c178]Zimeng Qiu, Yiyuan Li, Xinjian Li, Florian Metze, William M. Campbell:
Towards Context-Aware End-to-End Code-Switching Speech Recognition. INTERSPEECH 2020: 4776-4780 - [c177]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. LREC 2020: 5329-5336 - [c176]Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze:
On Dimensional Linguistic Properties of the Word Embedding Space. RepL4NLP@ACL 2020: 156-165 - [i55]Zhong Zhou, Isak Czeresnia Etinger, Florian Metze, Alexander G. Hauptmann, Alexander Waibel:
Gun Source and Muzzle Head Detection. CoRR abs/2001.11120 (2020) - [i54]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Looking Enhances Listening: Recovering Missing Speech Using Images. CoRR abs/2002.05639 (2020) - [i53]Xinjian Li, Siddharth Dalmia, David R. Mortensen, Juncheng Li, Alan W. Black, Florian Metze:
Towards Zero-shot Learning for Automatic Phonemic Transcription. CoRR abs/2002.11781 (2020) - [i52]Xinjian Li, Siddharth Dalmia, Juncheng Li, Matthew Lee, Patrick Littell, Jiali Yao, Antonios Anastasopoulos, David R. Mortensen, Graham Neubig, Alan W. Black, Florian Metze:
Universal Phone Recognition with a Multilingual Allophone System. CoRR abs/2002.11800 (2020) - [i51]Anirudh Mani, Shruti Palaskar, Nimshi Venkat Meripo, Sandeep Konam, Florian Metze:
ASR Error Correction and Domain Adaptation Using Machine Translation. CoRR abs/2003.07692 (2020) - [i50]David R. Mortensen, Xinjian Li, Patrick Littell, Alexis Michaud, Shruti Rijhwani, Antonios Anastasopoulos, Alan W. Black, Florian Metze, Graham Neubig:
AlloVera: A Multilingual Allophone Database. CoRR abs/2004.08031 (2020) - [i49]Amanda Cardoso Duarte, Shruti Palaskar, Deepti Ghadiyaram, Kenneth DeHaan, Florian Metze, Jordi Torres, Xavier Giró-i-Nieto:
How2Sign: A Large-scale Multimodal Dataset for Continuous American Sign Language. CoRR abs/2008.08143 (2020) - [i48]Ze Cheng, Juncheng Li, Chenxu Wang, Jixuan Gu, Hao Xu, Xinjian Li, Florian Metze:
Revisiting Factorizing Aggregated Posterior in Learning Disentangled Representations. CoRR abs/2009.05739 (2020) - [i47]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Fine-Grained Grounding for Multimodal Speech Recognition. CoRR abs/2010.02384 (2020) - [i46]Mandela Patrick, Po-Yao Huang, Yuki Markus Asano, Florian Metze, Alexander G. Hauptmann, João F. Henriques, Andrea Vedaldi:
Support-set bottlenecks for video-text representation learning. CoRR abs/2010.02824 (2020) - [i45]Vikas Raunak, Siddharth Dalmia, Vivek Gupta, Florian Metze:
On Long-Tailed Phenomena in Neural Machine Translation. CoRR abs/2010.04924 (2020) - [i44]Tejas Srinivasan, Ramon Sanabria, Florian Metze, Desmond Elliott:
Multimodal Speech Recognition with Unstructured Audio Masking. CoRR abs/2010.08642 (2020) - [i43]Juncheng B. Li, Kaixin Ma, Shuhui Qu, Po-Yao Huang, Florian Metze:
Audio-Visual Event Recognition through the lens of Adversary. CoRR abs/2011.07430 (2020)
2010 – 2019
- 2019
- [j11]Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Joint embeddings with multimodal cues for video-text retrieval. Int. J. Multim. Inf. Retr. 8(1): 3-18 (2019) - [j10]Okko Räsänen, Shreyas Seshadri, Julien Karadayi, Eric Riebling, John Bunce, Alejandrina Cristià, Florian Metze, Marisa Casillas, Celia Rosemberg, Elika Bergelson, Melanie Soderstrom:
Automatic word count estimation from daylong child-centered recordings in various language environments using language-independent syllabification of speech. Speech Commun. 113: 63-80 (2019) - [c175]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. ACL (1) 2019: 1131-1141 - [c174]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. ACL (1) 2019: 6587-6596 - [c173]Yun Wang, Juncheng Li, Florian Metze:
A Comparison of Five Multiple Instance Learning Pooling Functions for Sound Event Detection with Weak Labeling. ICASSP 2019: 31-35 - [c172]Yun Wang, Florian Metze:
Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling. ICASSP 2019: 745-749 - [c171]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. ICASSP 2019: 6091-6095 - [c170]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned in Speech Recognition: Contextual Acoustic Word Embeddings. ICASSP 2019: 6530-6534 - [c169]Nils Holzenberger, Shruti Palaskar, Pranava Madhyastha, Florian Metze, Raman Arora:
Learning from Multiview Correlations in Open-domain Videos. ICASSP 2019: 8628-8632 - [c168]Ozan Caglayan, Ramon Sanabria, Shruti Palaskar, Loïc Barrault, Florian Metze:
Multimodal Grounding for Sequence-to-sequence Speech Recognition. ICASSP 2019: 8648-8652 - [c167]Vikas Raunak, Sang Keun Choe, Quanyang Lu, Yi Xu, Florian Metze:
On Leveraging the Visual Modality for Neural Machine Translation. INLG 2019: 147-151 - [c166]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. INTERSPEECH 2019: 2120-2124 - [c165]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. INTERSPEECH 2019: 3681-3682 - [c164]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. INTERSPEECH 2019: 4380-4384 - [c163]Florian Metze:
Survey Talk: Multimodal Processing of Speech and Language. INTERSPEECH 2019 - [c162]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
CMU's Machine Translation System for IWSLT 2019. IWSLT 2019 - [c161]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Multitask Learning For Different Subword Segmentations In Neural Machine Translation. IWSLT 2019 - [c160]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
MediaEval 2019: Eyes and Ears Together. MediaEval 2019 - [c159]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. NAACL-HLT (1) 2019: 2766-2771 - [c158]Juncheng Li, Shuhui Qu, Xinjian Li, Joseph Szurley, J. Zico Kolter, Florian Metze:
Adversarial Music: Real world Audio Adversary against Wake-word Detection System. NeurIPS 2019: 11908-11918 - [c157]Vikas Raunak, Vivek Gupta, Florian Metze:
Effective Dimensionality Reduction for Word Embeddings. RepL4NLP@ACL 2019: 235-243 - [i42]Eduard H. Hovy, Jaime G. Carbonell, Hans Chalupsky, Anatole Gershman, Alex Hauptmann, Florian Metze, Teruko Mitamura, Zaid Sheikh, Ankit Dangi, Aditi Chaudhary, Xianyang Chen, Xiang Kong, Bernie Huang, Salvador Medina, Hector Liu, Xuezhe Ma, Maria Ryskina, Ramon Sanabria, Varun Gangal:
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis. TAC 2019 - [i41]Shruti Palaskar, Vikas Raunak, Florian Metze:
Learned In Speech Recognition: Contextual Acoustic Word Embeddings. CoRR abs/1902.06833 (2019) - [i40]Siddharth Dalmia, Xinjian Li, Alan W. Black, Florian Metze:
Phoneme Level Language Models for Sequence Based Low Resource ASR. CoRR abs/1902.07613 (2019) - [i39]Aditi Chaudhary, Siddharth Dalmia, Junjie Hu, Xinjian Li, Austin Matthews, Aldrian Obaja Muis, Naoki Otani, Shruti Rijhwani, Zaid Sheikh, Nidhi Vyas, Xinyi Wang, Jiateng Xie, Ruochen Xu, Chunting Zhou, Peter J. Jansen, Yiming Yang, Lori S. Levin, Florian Metze, Teruko Mitamura, David R. Mortensen, Graham Neubig, Eduard H. Hovy, Alan W. Black, Jaime G. Carbonell, Graham Horwood, Shabnam Tafreshi, Mona T. Diab, Efsun Sarioglu, Noura Farra, Kathleen R. McKeown:
The ARIEL-CMU Systems for LoReHLT18. CoRR abs/1902.08899 (2019) - [i38]Suyoun Kim, Florian Metze:
Acoustic-to-Word Models with Conversational Context Information. CoRR abs/1905.08796 (2019) - [i37]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
Grounding Object Detections With Transcriptions. CoRR abs/1906.06147 (2019) - [i36]Shruti Palaskar, Jindrich Libovický, Spandana Gella, Florian Metze:
Multimodal Abstractive Summarization for How2 Videos. CoRR abs/1906.07901 (2019) - [i35]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Gated Embeddings in End-to-End Speech Recognition for Conversational-Context Fusion. CoRR abs/1906.11604 (2019) - [i34]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Analyzing Utility of Visual Context in Multimodal Speech Recognition Under Noisy Conditions. CoRR abs/1907.00477 (2019) - [i33]Suyoun Kim, Siddharth Dalmia, Florian Metze:
Cross-Attention End-to-End ASR for Two-Party Conversations. CoRR abs/1907.10726 (2019) - [i32]Xinjian Li, Siddharth Dalmia, Alan W. Black, Florian Metze:
Multilingual Speech Recognition with Corpus Relatedness Sampling. CoRR abs/1908.01060 (2019) - [i31]Xinjian Li, Zhong Zhou, Siddharth Dalmia, Alan W. Black, Florian Metze:
SANTLR: Speech Annotation Toolkit for Low Resource Languages. CoRR abs/1908.01067 (2019) - [i30]Vikas Raunak, Vaibhav Kumar, Vivek Gupta, Florian Metze:
On Dimensional Linguistic Properties of the Word Embedding Space. CoRR abs/1910.02211 (2019) - [i29]Vikas Raunak, Sang Keun Choe, Quanyang Lu, Yi Xu, Florian Metze:
On Leveraging the Visual Modality for Neural Machine Translation. CoRR abs/1910.02754 (2019) - [i28]Tejas Srinivasan, Ramon Sanabria, Florian Metze:
Multitask Learning For Different Subword Segmentations In Neural Machine Translation. CoRR abs/1910.12368 (2019) - [i27]Juncheng B. Li, Shuhui Qu, Xinjian Li, J. Zico Kolter, Florian Metze:
Adversarial Music: Real World Audio Adversary Against Wake-word Detection System. CoRR abs/1911.00126 (2019) - [i26]Vikas Raunak, Vaibhav Kumar, Florian Metze:
On Compositionality in Neural Machine Translation. CoRR abs/1911.01497 (2019) - [i25]Siddharth Dalmia, Abdelrahman Mohamed, Mike Lewis, Florian Metze, Luke Zettlemoyer:
Enforcing Encoder-Decoder Modularity in Sequence-to-Sequence Models. CoRR abs/1911.03782 (2019) - 2018
- [c156]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-Based Multi-Lingual Low Resource Speech Recognition. ICASSP 2018: 4909-4913 - [c155]Odette Scharenborg, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux:
Linguistic Unit Discovery from Multi-Modal Inputs in Unwritten Languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop. ICASSP 2018: 4979-4983 - [c154]Neville Ryant, Elika Bergelson, Kenneth Church, Alejandrina Cristià, Jun Du, Sriram Ganapathy, Sanjeev Khudanpur, Diana Kowalski, Mahesh Krishnamoorthy, Rajat Kulshreshta, Mark Liberman, Yu-Ding Lu, Matthew Maciejewski, Florian Metze, Ján Profant, Lei Sun, Yu Tsao, Zhou Yu:
Enhancement and Analysis of Conversational Speech: JSALT 2017. ICASSP 2018: 5154-5158 - [c153]Shruti Palaskar, Ramon Sanabria, Florian Metze:
End-to-end Multimodal Speech Recognition. ICASSP 2018: 5774-5778 - [c152]Juncheng Li, Yun Wang, Joseph Szurley, Florian Metze, Samarjit Das:
A Light-Weight Multimodal Framework for Improved Environmental Audio Tagging. ICASSP 2018: 6832-6836 - [c151]Thomas Zenkel, Ramon Sanabria, Florian Metze, Alex Waibel:
Subword and Crossword Units for CTC Acoustic Models. INTERSPEECH 2018: 396-400 - [c150]Yun Wang, Juncheng Li, Florian Metze:
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks. INTERSPEECH 2018: 1339-1343 - [c149]Adrien Le Franc, Eric Riebling, Julien Karadayi, Yun Wang, Camila Scaff, Florian Metze, Alejandrina Cristià:
The ACLEW DiViMe: An Easy-to-use Diarization Tool. INTERSPEECH 2018: 1383-1387 - [c148]Shao-Yen Tseng, Juncheng Li, Yun Wang, Florian Metze, Joseph Szurley, Samarjit Das:
Multiple Instance Deep Learning for Weakly Supervised Small-Footprint Audio Event Detection. INTERSPEECH 2018: 3279-3283 - [c147]Boyang Li, Beth Cardier, Tong Wang, Florian Metze:
Annotating High-Level Structures of Short Stories and Personal Anecdotes. LREC 2018 - [c146]Yasufumi Moriya, Ramon Sanabria, Florian Metze, Gareth J. F. Jones:
Eyes and Ears Together: New Task for Multimodal Spoken Content Analysis. MediaEval 2018 - [c145]Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury:
Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval. ICMR 2018: 19-27 - [c144]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. SLT 2018: 258-265 - [c143]Shruti Palaskar, Florian Metze:
Acoustic-to-Word Recognition with Sequence-to-Sequence Models. SLT 2018: 397-404 - [c142]Suyoun Kim, Florian Metze:
Dialog-Context Aware end-to-end Speech Recognition. SLT 2018: 434-440 - [c141]Ramon Sanabria, Florian Metze:
Hierarchical Multitask Learning With CTC. SLT 2018: 485-490 - [i24]Eduard H. Hovy, Taylor Berg-Kirkpatrick, Jaime G. Carbonell, Hans Chalupsky, Anatole Gershman, Alexander G. Hauptmann, Florian Metze, Teruko Mitamura, Aditi Chaudhary, Xianyang Chen, Bernie Po-Yao Huang, Hector Zhengzhong Liu, Xuezhe Ma, Shruti Palaskar, Dheeraj Rajagopal, Maria Ryskina, Ramon Sanabria:
OPERA: Operations-oriented Probabilistic Extraction, Reasoning, and Analysis. TAC 2018 - [i23]Odette Scharenborg, Laurent Besacier, Alan W. Black, Mark Hasegawa-Johnson, Florian Metze, Graham Neubig, Sebastian Stüker, Pierre Godard, Markus Müller, Lucas Ondel, Shruti Palaskar, Philip Arthur, Francesco Ciannella, Mingxing Du, Elin Larsen, Danny Merkx, Rachid Riad, Liming Wang, Emmanuel Dupoux:
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop. CoRR abs/1802.05092 (2018) - [i22]Siddharth Dalmia, Ramon Sanabria, Florian Metze, Alan W. Black:
Sequence-based Multi-lingual Low Resource Speech Recognition. CoRR abs/1802.07420 (2018) - [i21]Yun Wang, Juncheng Li, Florian Metze:
Comparing the Max and Noisy-Or Pooling Functions in Multiple Instance Learning for Weakly Supervised Sequence Learning Tasks. CoRR abs/1804.01146 (2018) - [i20]Shruti Palaskar, Ramon Sanabria, Florian Metze:
End-to-End Multimodal Speech Recognition. CoRR abs/1804.09713 (2018) - [i19]Ramon Sanabria, Florian Metze:
Hierarchical Multi Task Learning With CTC. CoRR abs/1807.07104 (2018) - [i18]Shruti Palaskar, Florian Metze:
Acoustic-to-Word Recognition with Sequence-to-Sequence Models. CoRR abs/1807.09597 (2018) - [i17]Siddharth Dalmia, Xinjian Li, Florian Metze, Alan W. Black:
Domain Robust Feature Extraction for Rapid Low Resource ASR Development. CoRR abs/1807.10984 (2018) - [i16]Suyoun Kim, Florian Metze:
Dialog-context aware end-to-end speech recognition. CoRR abs/1808.02171 (2018) - [i15]