Thomas Drugman
2020 – today
- 2024
- [i60] Mateusz Lajszczak, Guillermo Cámbara, Yang Li, Fatih Beyhan, Arent van Korlaar, Fan Yang, Arnaud Joly, Álvaro Martín-Cortinas, Ammar Abbas, Adam Michalski, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Haohan Guo, Bartosz Putrycz, Soledad López Gambino, Kayeon Yoo, Elena Sokolova, Thomas Drugman:
BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data. CoRR abs/2402.08093 (2024)
- 2023
- [c77] Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer. INTERSPEECH 2023: 3387-3391
- [c76] Marcel Granero Moya, Penny Karanasou, Sri Karlapati, Bastian Schnell, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
A Comparative Analysis of Pretrained Language Models for Text-to-Speech. SSW 2023: 14-20
- [c75] Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova:
Controllable Emphasis with zero data for text-to-speech. SSW 2023: 113-119
- [i59] Ammar Abbas, Sri Karlapati, Bastian Schnell, Penny Karanasou, Marcel Granero Moya, Amith Nagaraj, Ayman Boustati, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
eCat: An End-to-End Model for Multi-Speaker TTS & Many-to-Many Fine-Grained Prosody Transfer. CoRR abs/2306.11327 (2023)
- [i58] Arnaud Joly, Marco Nicolis, Ekaterina Peterova, Alessandro Lombardi, Ammar Abbas, Arent van Korlaar, Aman Hussain, Parul Sharma, Alexis Moinet, Mateusz Lajszczak, Penny Karanasou, Antonio Bonafonte, Thomas Drugman, Elena Sokolova:
Controllable Emphasis with zero data for text-to-speech. CoRR abs/2307.07062 (2023)
- [i57] Marcel Granero Moya, Penny Karanasou, Sri Karlapati, Bastian Schnell, Nicole Peinelt, Alexis Moinet, Thomas Drugman:
A Comparative Analysis of Pretrained Language Models for Text-to-Speech. CoRR abs/2309.01576 (2023)
- 2022
- [j25] Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Bozena Kostek:
Computer-assisted pronunciation training - Speech synthesis is almost all you need. Speech Commun. 142: 22-33 (2022)
- [c74] Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova:
Distribution Augmentation for Low-Resource Expressive Text-To-Speech. ICASSP 2022: 8307-8311
- [c73] Sri Karlapati, Penny Karanasou, Mateusz Lajszczak, Syed Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman:
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer. INTERSPEECH 2022: 3363-3367
- [c72] Peter Makarov, Syed Ammar Abbas, Mateusz Lajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou:
Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody. INTERSPEECH 2022: 3368-3372
- [c71] Syed Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman:
Expressive, Variable, and Controllable Duration Modelling in TTS. INTERSPEECH 2022: 4546-4550
- [i56] Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova:
Distribution augmentation for low-resource expressive text-to-speech. CoRR abs/2202.06409 (2022)
- [i55] Sri Karlapati, Penny Karanasou, Mateusz Lajszczak, Ammar Abbas, Alexis Moinet, Peter Makarov, Ray Li, Arent van Korlaar, Simon Slangen, Thomas Drugman:
CopyCat2: A Single Model for Multi-Speaker TTS and Many-to-Many Fine-Grained Prosody Transfer. CoRR abs/2206.13443 (2022)
- [i54] Ammar Abbas, Thomas Merritt, Alexis Moinet, Sri Karlapati, Ewa Muszynska, Simon Slangen, Elia Gatti, Thomas Drugman:
Expressive, Variable, and Controllable Duration Modelling in TTS. CoRR abs/2206.14165 (2022)
- [i53] Peter Makarov, Ammar Abbas, Mateusz Lajszczak, Arnaud Joly, Sri Karlapati, Alexis Moinet, Thomas Drugman, Penny Karanasou:
Simple and Effective Multi-sentence TTS with Expressive and Coherent Prosody. CoRR abs/2206.14643 (2022)
- [i52] Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Bozena Kostek:
Computer-assisted Pronunciation Training - Speech synthesis is almost all you need. CoRR abs/2207.00774 (2022)
- 2021
- [c70] Sri Karlapati, Ammar Abbas, Zack Hodari, Alexis Moinet, Arnaud Joly, Penny Karanasou, Thomas Drugman:
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech. ICASSP 2021: 6573-6577
- [c69] Zack Hodari, Alexis Moinet, Sri Karlapati, Jaime Lorenzo-Trueba, Thomas Merritt, Arnaud Joly, Ammar Abbas, Penny Karanasou, Thomas Drugman:
Camp: A Two-Stage Approach to Modelling Prosody in Context. ICASSP 2021: 6578-6582
- [c68] Daniel Korzekwa, Jaime Lorenzo-Trueba, Szymon Zaporowski, Shira Calamaro, Thomas Drugman, Bozena Kostek:
Mispronunciation Detection in Non-Native (L2) English with Uncertainty Modeling. ICASSP 2021: 7738-7742
- [c67] Penny Karanasou, Sri Karlapati, Alexis Moinet, Arnaud Joly, Ammar Abbas, Simon Slangen, Jaime Lorenzo-Trueba, Thomas Drugman:
A Learned Conditional Prior for the VAE Acoustic Space of a TTS System. Interspeech 2021: 3620-3624
- [c66] Daniel Korzekwa, Roberto Barra-Chicote, Szymon Zaporowski, Grzegorz Beringer, Jaime Lorenzo-Trueba, Alicja Serafinowicz, Jasha Droppo, Thomas Drugman, Bozena Kostek:
Detection of Lexical Stress Errors in Non-Native (L2) English with Data Augmentation and Attention. Interspeech 2021: 3915-3919
- [c65] Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Shira Calamaro, Bozena Kostek:
Weakly-Supervised Word-Level Pronunciation Error Detection in Non-Native English Speech. Interspeech 2021: 4408-4412
- [c64] Bastian Schnell, Goeric Huybrechts, Bartek Perz, Thomas Drugman, Jaime Lorenzo-Trueba:
EmoCat: Language-agnostic Emotional Voice Conversion. SSW 2021: 72-77
- [c63] Alejandro Mottini, Jaime Lorenzo-Trueba, Sri Vishnu Kumar Karlapati, Thomas Drugman:
Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments. SSW 2021: 113-117
- [c62] Ammar Abbas, Bajibabu Bollepalli, Alexis Moinet, Arnaud Joly, Penny Karanasou, Peter Makarov, Simon Slangen, Sri Karlapati, Thomas Drugman:
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech. SSW 2021: 177-182
- [i51] Bastian Schnell, Goeric Huybrechts, Bartek Perz, Thomas Drugman, Jaime Lorenzo-Trueba:
EmoCat: Language-agnostic Emotional Voice Conversion. CoRR abs/2101.05695 (2021)
- [i50] Daniel Korzekwa, Jaime Lorenzo-Trueba, Szymon Zaporowski, Shira Calamaro, Thomas Drugman, Bozena Kostek:
Mispronunciation Detection in Non-native (L2) English with Uncertainty Modeling. CoRR abs/2101.06396 (2021)
- [i49] Daniel Korzekwa, Jaime Lorenzo-Trueba, Thomas Drugman, Shira Calamaro, Bozena Kostek:
Weakly-supervised word-level pronunciation error detection in non-native English speech. CoRR abs/2106.03494 (2021)
- [i48] Alejandro Mottini, Jaime Lorenzo-Trueba, Sri Vishnu Kumar Karlapati, Thomas Drugman:
Voicy: Zero-Shot Non-Parallel Voice Conversion in Noisy Reverberant Environments. CoRR abs/2106.08873 (2021)
- [i47] Penny Karanasou, Sri Karlapati, Alexis Moinet, Arnaud Joly, Ammar Abbas, Simon Slangen, Jaime Lorenzo-Trueba, Thomas Drugman:
A learned conditional prior for the VAE acoustic space of a TTS system. CoRR abs/2106.10229 (2021)
- [i46] Ammar Abbas, Bajibabu Bollepalli, Alexis Moinet, Arnaud Joly, Penny Karanasou, Peter Makarov, Simon Slangen, Sri Karlapati, Thomas Drugman:
Multi-Scale Spectrogram Modelling for Neural Text-to-Speech. CoRR abs/2106.15649 (2021)
- 2020
- [j24] Marius Cotescu, Thomas Drugman, Goeric Huybrechts, Jaime Lorenzo-Trueba, Alexis Moinet:
Voice Conversion for Whispered Speech Synthesis. IEEE Signal Process. Lett. 27: 186-190 (2020)
- [c61] Orazio Angelini, Alexis Moinet, Kayoko Yanagisawa, Thomas Drugman:
Singing Synthesis: With a Little Help from my Attention. INTERSPEECH 2020: 1221-1225
- [c60] Sri Karlapati, Alexis Moinet, Arnaud Joly, Viacheslav Klimkov, Daniel Sáez-Trigueros, Thomas Drugman:
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech. INTERSPEECH 2020: 4387-4391
- [c59] Shubhi Tyagi, Marco Nicolis, Jonas Rohnke, Thomas Drugman, Jaime Lorenzo-Trueba:
Dynamic Prosody Generation for Speech Synthesis Using Linguistics-Driven Acoustic Embedding Selection. INTERSPEECH 2020: 4407-4411
- [i45] Thomas Drugman, Thomas Dubuisson, Thierry Dutoit:
Phase-based Information for Voice Pathology Detection. CoRR abs/2001.00372 (2020)
- [i44] Thomas Drugman, Abeer Alwan:
Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics. CoRR abs/2001.00459 (2020)
- [i43] Thomas Drugman, Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor, Thierry Dutoit:
Detection of Glottal Closure Instants from Speech Signals: a Quantitative Review. CoRR abs/2001.00473 (2020)
- [i42] Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Carlos Valderrama, Patrick Lebecque, Thierry Dutoit:
Objective Study of Sensor Relevance for Automatic Cough Detection. CoRR abs/2001.00537 (2020)
- [i41] Thomas Drugman, Thierry Dutoit:
A Comparative Evaluation of Pitch Modification Techniques. CoRR abs/2001.00579 (2020)
- [i40] Thomas Drugman, Jérôme Urbain, Thierry Dutoit:
Assessment of Audio Features for Automatic Cough Detection. CoRR abs/2001.00580 (2020)
- [i39] Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit:
Eigenresiduals for improved Parametric Speech Synthesis. CoRR abs/2001.00581 (2020)
- [i38] Thomas Drugman, Thierry Dutoit, Baris Bozkurt:
Excitation-based Voice Quality Analysis and Modification. CoRR abs/2001.00582 (2020)
- [i37] Thomas Drugman, Thomas Dubuisson, Thierry Dutoit:
On the Mutual Information between Source and Filter Contributions for Voice Pathology Detection. CoRR abs/2001.00583 (2020)
- [i36] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
A Comparative Study of Glottal Source Estimation Techniques. CoRR abs/2001.00840 (2020)
- [i35] Thomas Drugman, Thierry Dutoit:
Glottal Closure and Opening Instant Detection from Speech Signals. CoRR abs/2001.00841 (2020)
- [i34] Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit:
A Deterministic plus Stochastic Model of the Residual Signal for Improved Parametric Speech Synthesis. CoRR abs/2001.00842 (2020)
- [i33] Thomas Drugman, Thierry Dutoit:
The Deterministic plus Stochastic Model of the Residual Signal and its Applications. CoRR abs/2001.01000 (2020)
- [i32] Sri Karlapati, Alexis Moinet, Arnaud Joly, Viacheslav Klimkov, Daniel Saez-Trigueros, Thomas Drugman:
CopyCat: Many-to-Many Fine-Grained Prosody Transfer for Neural Text-to-Speech. CoRR abs/2004.14617 (2020)
- [i31] Thomas Drugman, Thierry Dutoit:
Chirp Complex Cepstrum-based Decomposition for Asynchronous Glottal Analysis. CoRR abs/2005.04724 (2020)
- [i30] Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Anne-Sophie Aubriot, Patrick Lebecque, Thierry Dutoit:
Audio and Contact Microphones for Cough Detection. CoRR abs/2005.05313 (2020)
- [i29] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Glottal Source Estimation using an Automatic Chirp Decomposition. CoRR abs/2005.07897 (2020)
- [i28] Thomas Drugman, Thierry Dutoit:
Oscillating Statistical Moments for Speech Polarity Detection. CoRR abs/2005.07901 (2020)
- [i27] Thomas Drugman, Thomas Dubuisson, Alexis Moinet, Nicolas D'Alessandro, Thierry Dutoit:
Glottal source estimation robustness: A comparison of sensitivity of voice source estimation techniques. CoRR abs/2005.11682 (2020)
- [i26] Thomas Drugman, John Kane, Christer Gobl:
Data-driven Detection and Analysis of the Patterns of Creaky Voice. CoRR abs/2006.00518 (2020)
- [i25] Thomas Drugman, Yannis Stylianou:
Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra. CoRR abs/2006.00521 (2020)
- [i24] Thomas Drugman:
Residual Excitation Skewness for Automatic Speech Polarity Detection. CoRR abs/2006.00525 (2020)
- [i23] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Analysis and Synthesis of Hypo and Hyperarticulated Speech. CoRR abs/2006.04136 (2020)
- [i22] Thomas Drugman:
Maximum Phase Modeling for Sparse Linear Prediction of Speech. CoRR abs/2006.04138 (2020)
- [i21] Onur Babacan, Thomas Drugman, Tuomo Raitio, Daniel Erro, Thierry Dutoit:
Parametric Representation for Singing Voice Synthesis: a Comparative Evaluation. CoRR abs/2006.04142 (2020)
- [i20] Sri Karlapati, Ammar Abbas, Zack Hodari, Alexis Moinet, Arnaud Joly, Penny Karanasou, Thomas Drugman:
Prosodic Representation Learning and Contextual Sampling for Neural Text-to-Speech. CoRR abs/2011.02252 (2020)
- [i19] Daniel Korzekwa, Roberto Barra-Chicote, Szymon Zaporowski, Grzegorz Beringer, Jaime Lorenzo-Trueba, Alicja Serafinowicz, Jasha Droppo, Thomas Drugman, Bozena Kostek:
Detection of Lexical Stress Errors in Non-native (L2) English with Data Augmentation and Attention. CoRR abs/2012.14788 (2020)
2010 – 2019
- 2019
- [c58] Javier Latorre, Jakub Lachowicz, Jaime Lorenzo-Trueba, Thomas Merritt, Thomas Drugman, Srikanth Ronanki, Viacheslav Klimkov:
Effect of Data Reduction on Sequence-to-sequence Neural TTS. ICASSP 2019: 7075-7079
- [c57] Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote, Alexis Moinet, Vatsal Aggarwal:
Towards Achieving Robust Universal Neural Vocoding. INTERSPEECH 2019: 181-185
- [c56] Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak:
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech. INTERSPEECH 2019: 3890-3894
- [c55] Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman:
Fine-Grained Robust Prosody Transfer for Single-Speaker Neural Text-To-Speech. INTERSPEECH 2019: 4440-4444
- [c54] Nishant Prateek, Mateusz Lajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, Srikanth Ronanki, Trevor Wood:
In Other News: a Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data. NAACL-HLT (2) 2019: 205-213
- [i18] Thomas Drugman, Goeric Huybrechts, Viacheslav Klimkov, Alexis Moinet:
Traditional Machine Learning for Pitch Detection. CoRR abs/1903.01290 (2019)
- [i17] Thomas Drugman, Yannis Stylianou, Yusuke Kida, Masami Akamine:
Voice Activity Detection: Merging Source and Filter-based Information. CoRR abs/1903.02844 (2019)
- [i16] Thomas Drugman, Janne Pylkkönen, Reinhard Kneser:
Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models. CoRR abs/1903.02852 (2019)
- [i15] Nishant Prateek, Mateusz Lajszczak, Roberto Barra-Chicote, Thomas Drugman, Jaime Lorenzo-Trueba, Thomas Merritt, Srikanth Ronanki, Trevor Wood:
In Other News: A Bi-style Text-to-speech Model for Synthesizing Newscaster Voice with Limited Data. CoRR abs/1904.02790 (2019)
- [i14] Viacheslav Klimkov, Srikanth Ronanki, Jonas Rohnke, Thomas Drugman:
Fine-grained robust prosody transfer for single-speaker neural text-to-speech. CoRR abs/1907.02479 (2019)
- [i13] Daniel Korzekwa, Roberto Barra-Chicote, Bozena Kostek, Thomas Drugman, Mateusz Lajszczak:
Interpretable Deep Learning Model for the Detection and Reconstruction of Dysarthric Speech. CoRR abs/1907.04743 (2019)
- [i12] Shubhi Tyagi, Marco Nicolis, Jonas Rohnke, Thomas Drugman, Jaime Lorenzo-Trueba:
Dynamic Prosody Generation for Speech Synthesis using Linguistics-Driven Acoustic Embedding Selection. CoRR abs/1912.00955 (2019)
- [i11] Marius Cotescu, Thomas Drugman, Goeric Huybrechts, Jaime Lorenzo-Trueba, Alexis Moinet:
Voice Conversion for Whispered Speech Synthesis. CoRR abs/1912.05289 (2019)
- [i10] Orazio Angelini, Alexis Moinet, Kayoko Yanagisawa, Thomas Drugman:
Singing Synthesis: with a little help from my attention. CoRR abs/1912.05881 (2019)
- [i9] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Complex Cepstrum-based Decomposition of Speech for Glottal Source Estimation. CoRR abs/1912.12602 (2019)
- [i8] Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana:
Glottal Source Processing: from Analysis to Applications. CoRR abs/1912.12604 (2019)
- [i7] Onur Babacan, Thomas Drugman, Nicolas D'Alessandro, Nathalie Henrich, Thierry Dutoit:
A Comparative Study of Pitch Extraction Algorithms on a Large Variety of Singing Sounds. CoRR abs/1912.12609 (2019)
- [i6] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Causal-Anticausal Decomposition of Speech using Complex Cepstrum for Glottal Source Estimation. CoRR abs/1912.12843 (2019)
- [i5] Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart:
Using a Pitch-Synchronous Residual Codebook for Hybrid HMM/Frame Selection Speech Synthesis. CoRR abs/1912.12887 (2019)
- 2018
- [j23] Thomas Drugman, Goeric Huybrechts, Viacheslav Klimkov, Alexis Moinet:
Traditional Machine Learning for Pitch Detection. IEEE Signal Process. Lett. 25(11): 1745-1749 (2018)
- [c53] Zeynab Raeesy, Kellen Gillespie, Chengyuan Ma, Thomas Drugman, Jiacheng Gu, Roland Maas, Ariya Rastrow, Björn Hoffmeister:
LSTM-Based Whisper Detection. SLT 2018: 139-144
- [c52] Thomas Merritt, Bartosz Putrycz, Adam Nadolski, Tianjun Ye, Daniel Korzekwa, Wiktor Dolecki, Thomas Drugman, Viacheslav Klimkov, Alexis Moinet, Andrew Breen, Rafal Kuklinski, Nikko Strom, Roberto Barra-Chicote:
Comprehensive Evaluation of Statistical Speech Waveform Synthesis. SLT 2018: 325-331
- [c51] Viacheslav Klimkov, Alexis Moinet, Adam Nadolski, Thomas Drugman:
Parameter Generation Algorithms for Text-To-Speech Synthesis with Recurrent Neural Networks. SLT 2018: 626-631
- [i4] Zeynab Raeesy, Kellen Gillespie, Chengyuan Ma, Thomas Drugman, Jiacheng Gu, Roland Maas, Ariya Rastrow, Björn Hoffmeister:
LSTM-based Whisper Detection. CoRR abs/1809.07832 (2018)
- [i3] Jaime Lorenzo-Trueba, Thomas Drugman, Javier Latorre, Thomas Merritt, Bartosz Putrycz, Roberto Barra-Chicote:
Robust universal neural vocoding. CoRR abs/1811.06292 (2018)
- [i2] Thomas Merritt, Bartosz Putrycz, Adam Nadolski, Tianjun Ye, Daniel Korzekwa, Wiktor Dolecki, Thomas Drugman, Viacheslav Klimkov, Alexis Moinet, Andrew Breen, Rafal Kuklinski, Nikko Strom, Roberto Barra-Chicote:
Comprehensive evaluation of statistical speech waveform synthesis. CoRR abs/1811.06296 (2018)
- [i1] Javier Latorre, Jakub Lachowicz, Jaime Lorenzo-Trueba, Thomas Merritt, Thomas Drugman, Srikanth Ronanki, Klimkov Viacheslav:
Effect of data reduction on sequence-to-sequence neural TTS. CoRR abs/1811.06315 (2018)
- 2017
- [c50] Viacheslav Klimkov, Adam Nadolski, Alexis Moinet, Bartosz Putrycz, Roberto Barra-Chicote, Thomas Merritt, Thomas Drugman:
Phrase Break Prediction for Long-Form Reading TTS: Exploiting Text Structure Information. INTERSPEECH 2017: 1064-1068
- 2016
- [j22] Thomas Drugman, Yannis Stylianou, Yusuke Kida, Masami Akamine:
Voice Activity Detection: Merging Source and Filter-based Information. IEEE Signal Process. Lett. 23(2): 252-256 (2016)
- [j21] Sandrine Brognaux, Thomas Drugman:
HMM-Based Speech Segmentation: Improvements of Fully Automatic Approaches. IEEE ACM Trans. Audio Speech Lang. Process. 24(1): 5-15 (2016)
- [c49] Thomas Drugman, Janne Pylkkönen, Reinhard Kneser:
Active and Semi-Supervised Learning in ASR: Benefits on the Acoustic and Language Models. INTERSPEECH 2016: 2318-2322
- [c48] Janne Pylkkönen, Thomas Drugman, Max Bisani:
Optimizing Speech Recognition Evaluation Using Stratified Sampling. INTERSPEECH 2016: 3106-3110
- [p1] Anna Esposito, Marcos Faúndez-Zanuy, Antonietta Maria Esposito, Gennaro Cordasco, Thomas Drugman, Jordi Solé-Casals, Francesco Carlo Morabito:
Recent Advances in Nonlinear Speech Processing: Directions and Challenges. Recent Advances in Nonlinear Speech Processing 2016: 5-11
- [e2] Anna Esposito, Marcos Faúndez-Zanuy, Antonietta Maria Esposito, Gennaro Cordasco, Thomas Drugman, Jordi Solé-Casals, Francesco Carlo Morabito:
Recent Advances in Nonlinear Speech Processing. Smart Innovation, Systems and Technologies 48, Springer 2016, ISBN 978-3-319-28107-0
- 2015
- [j20] Thomas Drugman:
Non-Linear Speech Processing (NOLISP 2013). Comput. Speech Lang. 30(1): 1-2 (2015)
- [j19] Thomas Drugman, Myriam Rijckaert, Claire Janssens, Marc Remacle:
Tracheoesophageal speech: A dedicated objective acoustic assessment. Comput. Speech Lang. 30(1): 16-31 (2015)
- [c47] Thomas Drugman, Yannis Stylianou, Langzhou Chen, Xie Chen, Mark J. F. Gales:
Robust excitation-based features for Automatic Speech Recognition. ICASSP 2015: 4664-4668
- [c46] Thomas Drugman, Yannis Stylianou:
Fast and accurate phase unwrapping. INTERSPEECH 2015: 1171-1175
- 2014
- [j18] Thomas Drugman:
Using mutual information in supervised temporal event detection: Application to cough detection. Biomed. Signal Process. Control. 10: 50-57 (2014)
- [j17] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Analysis and HMM-based synthesis of hypo and hyperarticulated speech. Comput. Speech Lang. 28(2): 687-707 (2014)
- [j16] Thomas Drugman, Paavo Alku, Abeer Alwan, Bayya Yegnanarayana:
Glottal source processing: From analysis to applications. Comput. Speech Lang. 28(5): 1117-1138 (2014)
- [j15] Thomas Drugman, John Kane, Christer Gobl:
Data-driven detection and analysis of the patterns of creaky voice. Comput. Speech Lang. 28(5): 1233-1253 (2014)
- [j14] Soheil Khorram, Hossein Sameti, Fahimeh Bahmaninezhad, Simon King, Thomas Drugman:
Context-dependent acoustic modeling based on hidden maximum entropy model for statistical parametric speech synthesis. EURASIP J. Audio Speech Music. Process. 2014: 12 (2014)
- [j13] Thomas Drugman, Thierry Dutoit:
Speech polarity determination: A comparative evaluation. Neurocomputing 132: 121-125 (2014)
- [j12] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
HMM-based speech synthesis with various degrees of articulation: A perceptual study. Neurocomputing 132: 142-147 (2014)
- [j11] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Automatic Variation of the Degree of Articulation in New HMM-Based Voices. IEEE J. Sel. Top. Signal Process. 8(2): 307-322 (2014)
- [j10] Thomas Drugman, Yannis Stylianou:
Maximum Voiced Frequency Estimation: Exploiting Amplitude and Phase Spectra. IEEE Signal Process. Lett. 21(10): 1230-1234 (2014)
- [j9] Thomas Drugman, Yannis Stylianou:
Fast Inter-Harmonic Reconstruction for Spectral Envelope Estimation in High-Pitched Voices. IEEE Signal Process. Lett. 21(11): 1418-1422 (2014)
- [c45] Thomas Drugman, Tuomo Raitio:
Excitation modeling for HMM-based speech synthesis: Breaking down the impact of periodic and aperiodic components. ICASSP 2014: 260-264
- [c44] Gilles Degottex, John Kane, Thomas Drugman, Tuomo Raitio, Stefan Scherer:
COVAREP - A collaborative voice analysis repository for speech technologies. ICASSP 2014: 960-964
- [c43] Onur Babacan, Thomas Drugman, Tuomo Raitio, Daniel Erro, Thierry Dutoit:
Parametric representation for singing voice synthesis: A comparative evaluation. ICASSP 2014: 2564-2568
- [c42] Sandrine Brognaux, Benjamin Picart, Thomas Drugman:
Speech synthesis in various communicative situations: impact of pronunciation variations. INTERSPEECH 2014: 1524-1528
- 2013
- [j8] Thomas Drugman, Thierry Dutoit:
Detecting Speech Polarity with High-Order Statistics. Cogn. Comput. 5(4): 442-447 (2013)
- [j7] John Kane, Thomas Drugman, Christer Gobl:
Improved automatic detection of creak. Comput. Speech Lang. 27(4): 1028-1047 (2013)
- [j6] Thomas Drugman:
Residual Excitation Skewness for Automatic Speech Polarity Detection. IEEE Signal Process. Lett. 20(4): 387-390 (2013)
- [j5] Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Carlos Valderrama, Patrick Lebecque, Thierry Dutoit:
Objective Study of Sensor Relevance for Automatic Cough Detection. IEEE J. Biomed. Health Informatics 17(3): 699-707 (2013)
- [c41] Erfan Loweimi, Seyed Mohammad Ahadi, Thomas Drugman:
A new phase-based feature representation for robust speech recognition. ICASSP 2013: 7155-7159
- [c40] Onur Babacan, Thomas Drugman, Nicolas D'Alessandro, Nathalie Henrich, Thierry Dutoit:
A comparative study of pitch extraction algorithms on a large variety of singing sounds. ICASSP 2013: 7815-7819
- [c39] Thomas Drugman, John Kane, Tuomo Raitio, Christer Gobl:
Prediction of creaky voice from contextual factors. ICASSP 2013: 7967-7971
- [c38] Sandrine Brognaux, Benjamin Picart, Thomas Drugman:
A new prosody annotation protocol for live sports commentaries. INTERSPEECH 2013: 1554-1558
- [c37] Onur Babacan, Thomas Drugman, Nicolas D'Alessandro, Nathalie Henrich, Thierry Dutoit:
A quantitative comparison of glottal closure instant estimation algorithms on a large variety of singing sounds. INTERSPEECH 2013: 1702-1706
- [c36] Tuomo Raitio, John Kane, Thomas Drugman, Christer Gobl:
HMM-based synthesis of creaky voice. INTERSPEECH 2013: 2316-2320
- [c35] Thomas Drugman, Myriam Rijckaert, George Lawson, Marc Remacle:
Analysis and Quantification of Acoustic Artefacts in Tracheoesophageal Speech. NOLISP 2013: 104-111
- [c34] Erfan Loweimi, Seyed Mohammad Ahadi, Thomas Drugman, Samira Loveymi:
On the Importance of Pre-emphasis and Window Shape in Phase-Based Speech Recognition. NOLISP 2013: 160-167
- [c33] Benjamin Picart, Sandrine Brognaux, Thomas Drugman:
HMM-based speech synthesis of live sports commentaries: integration of a two-layer prosody annotation. SSW 2013: 19-24
- [e1] Thomas Drugman, Thierry Dutoit:
Advances in Nonlinear Speech Processing - 6th International Conference, NOLISP 2013, Mons, Belgium, June 19-21, 2013. Proceedings. Lecture Notes in Computer Science 7911, Springer 2013, ISBN 978-3-642-38846-0
- 2012
- [j4] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
A comparative study of glottal source estimation techniques. Comput. Speech Lang. 26(1): 20-34 (2012)
- [j3] Thomas Drugman, Thierry Dutoit:
The Deterministic Plus Stochastic Model of the Residual Signal and Its Applications. IEEE Trans. Speech Audio Process. 20(3): 968-981 (2012)
- [j2] Thomas Drugman, Mark R. P. Thomas, Jón Guðnason, Patrick A. Naylor, Thierry Dutoit:
Detection of Glottal Closure Instants From Speech Signals: A Quantitative Review. IEEE Trans. Speech Audio Process. 20(3): 994-1006 (2012)
- [c32] Thomas Drugman, Jérôme Urbain, Nathalie Bauwens, Ricardo Chessini, Anne-Sophie Aubriot, Patrick Lebecque, Thierry Dutoit:
Audio and Contact Microphones for Cough Detection. INTERSPEECH 2012: 1303-1306
- [c31] Thomas Drugman, John Kane, Christer Gobl:
Modeling the Creaky Excitation for Parametric Speech Synthesis. INTERSPEECH 2012: 1424-1427
- [c30] Thomas Drugman, John Kane, Christer Gobl:
Resonator-based creaky voice detection. INTERSPEECH 2012: 1592-1595
- [c29] Loïc Reboursière, Otso Lähdeoja, Thomas Drugman, Stéphane Dupont, Cécile Picard-Limpens, Nicolas Riche:
Left and right-hand guitar playing techniques detection. NIME 2012
- [c28] Maria Astrinaki, Nicolas D'Alessandro, Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Reactive and continuous control of HMM-based speech synthesis. SLT 2012: 252-257
- [c27] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Statistical methods for varying the degree of articulation in new HMM-based voices. SLT 2012: 291-296
- [c26] Sandrine Brognaux, Thomas Drugman, Richard Beaufort:
Automatic detection and correction of syntax-based prosody annotation errors. SLT 2012: 410-415
- [c25] Sandrine Brognaux, Sophie Roekhaut, Thomas Drugman, Richard Beaufort:
Train&align: A new online tool for automatic phonetic alignment. SLT 2012: 416-421
- [c24] Sandrine Brognaux, Sophie Roekhaut, Thomas Drugman, Richard Beaufort:
Automatic Phone Alignment - A Comparison between Speaker-Independent Models and Models Trained on the Corpus to Align. JapTAL 2012: 300-311
- 2011
- [j1] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Causal-anticausal decomposition of speech using complex cepstrum for glottal source estimation. Speech Commun. 53(6): 855-866 (2011)
- [c23] Thomas Drugman, Jérôme Urbain, Thierry Dutoit:
Assessment of audio features for automatic cough detection. EUSIPCO 2011: 1289-1293
- [c22] Thomas Drugman, Thomas Dubuisson, Thierry Dutoit:
Phase-based information for voice pathology detection. ICASSP 2011: 4612-4615
- [c21] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Continuous Control of the Degree of Articulation in HMM-Based Speech Synthesis. INTERSPEECH 2011: 1797-1800
- [c20] Thomas Drugman, Abeer Alwan:
Joint Robust Voicing Detection and Pitch Estimation Based on Residual Harmonics. INTERSPEECH 2011: 1973-1976
- [c19] Thomas Drugman, Thierry Dutoit:
Oscillating Statistical Moments for Speech Polarity Detection. NOLISP 2011: 48-54
- [c18] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Perceptual Effects of the Degree of Articulation in HMM-Based Speech Synthesis. NOLISP 2011: 177-182
- 2010
- [c17] Thomas Drugman, Thierry Dutoit:
A comparative evaluation of pitch modification techniques. EUSIPCO 2010: 756-760
- [c16] Thomas Drugman, Thierry Dutoit:
Chirp complex cepstrum-based decomposition for asynchronous glottal analysis. INTERSPEECH 2010: 657-660
- [c15] Thomas Drugman, Thierry Dutoit:
On the potential of glottal signatures for speaker recognition. INTERSPEECH 2010: 2106-2109
- [c14] Thomas Drugman, Thierry Dutoit:
Glottal-based analysis of the lombard effect. INTERSPEECH 2010: 2610-2613
- [c13] Benjamin Picart, Thomas Drugman, Thierry Dutoit:
Analysis and synthesis of hypo- and hyperarticulated speech. SSW 2010: 270-275
2000 – 2009
- 2009
- [c12] Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit:
Eigenresiduals for improved parametric speech synthesis. EUSIPCO 2009: 2176-2180
- [c11] Thomas Drugman, Alexis Moinet, Thierry Dutoit, Geoffrey Wilfart:
Using a pitch-synchronous residual codebook for hybrid HMM/frame selection speech synthesis. ICASSP 2009: 3793-3796
- [c10] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Complex cepstrum-based decomposition of speech for glottal source estimation. INTERSPEECH 2009: 116-119
- [c9] Thomas Drugman, Thomas Dubuisson, Thierry Dutoit:
On the mutual information between source and filter contributions for voice pathology detection. INTERSPEECH 2009: 1463-1466
- [c8] Thomas Drugman, Geoffrey Wilfart, Thierry Dutoit:
A deterministic plus stochastic model of the residual signal for improved parametric speech synthesis. INTERSPEECH 2009: 1779-1782
- [c7] Thomas Drugman, Thierry Dutoit:
Glottal closure and opening instant detection from speech signals. INTERSPEECH 2009: 2891-2894
- [c6] Thomas Dubuisson, Thomas Drugman, Thierry Dutoit:
On the mutual information of glottal source estimation techniques for the automatic detection of speech pathologies. MAVEBA 2009: 53-56
- [c5] Thomas Drugman, Baris Bozkurt, Thierry Dutoit:
Glottal Source Estimation Using an Automatic Chirp Decomposition. NOLISP 2009: 35-42
- 2008
- [c4] Thomas Drugman, Thomas Dubuisson, Nicolas D'Alessandro, Alexis Moinet, Thierry Dutoit:
Voice source parameters estimation by fitting the glottal formant and the inverse filtering open phase. EUSIPCO 2008: 1-5
- [c3] Mihai Gurban, Jean-Philippe Thiran, Thomas Drugman, Thierry Dutoit:
Dynamic modality weighting for multi-stream HMMs in audio-visual speech recognition. ICMI 2008: 237-240
- [c2] Thomas Drugman, Thomas Dubuisson, Alexis Moinet, Nicolas D'Alessandro, Thierry Dutoit:
Glottal Source Estimation Robustness - A Comparison of Sensitivity of Voice Source Estimation Techniques. SIGMAP 2008: 202-207
- 2007
- [c1] Thomas Drugman, Mihai Gurban, Jean-Philippe Thiran:
Relevant Feature Selection for Audio-Visual Speech Recognition. MMSP 2007: 179-182
last updated on 2024-08-03 20:17 CEST by the dblp team
all metadata released as open data under CC0 1.0 license