


Hsin-Min Wang
Person information

- affiliation: Academia Sinica, Taipei, Taiwan
- affiliation (PhD 1995): National Taiwan University, Taipei, Taiwan
2020 – today
- 2023
- [j73]Chin-Yi Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Multi-Target Extractor and Detector for Unknown-Number Speaker Diarization. IEEE Signal Process. Lett. 30: 638-642 (2023) - [j72]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Deep Learning-Based Non-Intrusive Multi-Objective Speech Assessment Model With Cross-Domain Features. IEEE ACM Trans. Audio Speech Lang. Process. 31: 54-70 (2023) - [j71]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang:
Generalization Ability Improvement of Speaker Representation and Anti-Interference for Speaker Verification. IEEE ACM Trans. Audio Speech Lang. Process. 31: 486-499 (2023) - [j70]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang:
Decomposition and Reorganization of Phonetic Information for Speaker Embedding Learning. IEEE ACM Trans. Audio Speech Lang. Process. 31: 1745-1757 (2023) - [c232]Chi-Chang Lee, Yu Tsao, Hsin-Min Wang, Chu-Song Chen:
D4AM: A General Denoising Framework for Downstream Acoustic Models. ICLR 2023 - [i72]Yu-Wen Chen, Hsin-Min Wang, Yu Tsao:
BASPRO: a balanced script producer for speech corpus collection based on the genetic algorithm. CoRR abs/2301.04120 (2023) - [i71]Yung-Lun Chien, Hsin-Hao Chen, Ming-Chi Yen, Shu-Wei Tsai, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi:
Audio-Visual Mandarin Electrolaryngeal Speech Voice Conversion. CoRR abs/2306.06652 (2023) - [i70]Hsin-Hao Chen, Yung-Lun Chien, Ming-Chi Yen, Shu-Wei Tsai, Yu Tsao, Tai-Shih Chi, Hsin-Min Wang:
Mandarin Electrolaryngeal Speech Voice Conversion using Cross-domain Features. CoRR abs/2306.06653 (2023) - [i69]Ryandhimas E. Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model. CoRR abs/2308.09262 (2023) - [i68]Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Utilizing Whisper to Enhance Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. CoRR abs/2309.09548 (2023) - [i67]Shafique Ahmed, Chia-Wei Chen, Wenze Ren, Chin-Jou Li, Ernie Chu, Jun-Cheng Chen, Amir Hussain, Hsin-Min Wang, Yu Tsao, Jen-Cheng Hou:
Deep Complex U-Net with Conformer for Audio-Visual Speech Enhancement. CoRR abs/2309.11059 (2023) - [i66]Ryandhimas E. Zezario, Yu-Wen Chen, Szu-Wei Fu, Yu Tsao, Hsin-Min Wang, Chiou-Shann Fuh:
A Study on Incorporating Whisper for Robust Speech Assessment. CoRR abs/2309.12766 (2023) - [i65]Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang:
AVTENet: Audio-Visual Transformer-based Ensemble Network Exploiting Multiple Experts for Video Deepfake Detection. CoRR abs/2310.13103 (2023) - [i64]Sahibzada Adil Shahzad, Ammarah Hashmi, Yan-Tsung Peng, Yu Tsao, Hsin-Min Wang:
AV-Lip-Sync+: Leveraging AV-HuBERT to Exploit Multimodal Inconsistency for Video Deepfake Detection. CoRR abs/2311.02733 (2023) - [i63]Hsin-Tien Chiang, Szu-Wei Fu, Hsin-Min Wang, Yu Tsao, John H. L. Hansen:
Multi-objective Non-intrusive Hearing-aid Speech Assessment Model. CoRR abs/2311.08878 (2023) - [i62]Chi-Chang Lee, Hong-Wei Chen, Chu-Song Chen, Hsin-Min Wang, Tsung-Te Liu, Yu Tsao:
LC4SV: A Denoising Framework Learning to Compensate for Unseen Speaker Verification Models. CoRR abs/2311.16604 (2023)
- 2022
- [j69]Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
SVSNet: An End-to-End Speaker Voice Similarity Assessment Model. IEEE Signal Process. Lett. 29: 767-771 (2022) - [j68]Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. IEEE ACM Trans. Audio Speech Lang. Process. 30: 1345-1359 (2022) - [c231]Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Yu Tsao:
EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement. ICASSP 2022: 1116-1120 - [c230]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery. ICASSP 2022: 9236-9240 - [c229]Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao:
NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling. INTERSPEECH 2022: 1183-1187 - [c228]Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang:
Chain-based Discriminative Autoencoders for Speech Recognition. INTERSPEECH 2022: 2078-2082 - [c227]Ryandhimas Edo Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. INTERSPEECH 2022: 3944-3948 - [c226]Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi:
The VoiceMOS Challenge 2022. INTERSPEECH 2022: 4536-4540 - [c225]Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks. INTERSPEECH 2022: 5343-5347 - [c224]Ryandhimas Edo Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model. INTERSPEECH 2022: 5463-5467 - [c223]Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang:
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition. ISCSLP 2022: 145-149 - [c222]Shang-Bao Luo, Cheng-Chung Fan, Kuan-Yu Chen, Yu Tsao, Hsin-Min Wang, Keh-Yih Su:
Chinese Movie Dialogue Question Answering Dataset. ROCLING 2022: 7-14 - [c221]Aleksandra Smolka, Hsin-Min Wang, Jason S. Chang, Keh-Yih Su:
Is Character Trigram Overlapping Ratio Still the Best Similarity Measure for Aligning Sentences in a Paraphrased Corpus? ROCLING 2022: 49-60 - [i61]Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Yu Tsao:
EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement. CoRR abs/2202.06507 (2022) - [i60]Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-attention-based Fake Span Discovery. CoRR abs/2202.06684 (2022) - [i59]Wen-Chin Huang, Erica Cooper, Yu Tsao, Hsin-Min Wang, Tomoki Toda, Junichi Yamagishi:
The VoiceMOS Challenge 2022. CoRR abs/2203.11389 (2022) - [i58]Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang:
Chain-based Discriminative Autoencoders for Speech Recognition. CoRR abs/2203.13687 (2022) - [i57]Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang:
Speech-enhanced and Noise-aware Networks for Robust Speech Recognition. CoRR abs/2203.13696 (2022) - [i56]Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang:
Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition. CoRR abs/2203.15576 (2022) - [i55]Chin-Yi Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Multi-Target Filter and Detector for Speaker Diarization. CoRR abs/2203.16007 (2022) - [i54]Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks. CoRR abs/2203.16040 (2022) - [i53]Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang:
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly. CoRR abs/2203.16646 (2022) - [i52]Chiang-Lin Tai, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
Filter-based Discriminative Autoencoders for Children Speech Recognition. CoRR abs/2204.00164 (2022) - [i51]Ryandhimas E. Zezario, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids. CoRR abs/2204.03305 (2022) - [i50]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
MTI-Net: A Multi-Target Speech Intelligibility Prediction Model. CoRR abs/2204.03310 (2022) - [i49]Shih-Kuang Lee, Yu Tsao, Hsin-Min Wang:
A Study of Using Cepstrogram for Countermeasure Against Replay Attacks. CoRR abs/2204.04333 (2022) - [i48]Chi-Chang Lee, Cheng-Hung Hu, Yu-Chen Lin, Chu-Song Chen, Hsin-Min Wang, Yu Tsao:
NASTAR: Noise Adaptive Speech Enhancement with Target-Conditional Resampling. CoRR abs/2206.09058 (2022) - [i47]Yin-Ping Cho, Yu Tsao, Hsin-Min Wang, Yi-Wen Liu:
Mandarin Singing Voice Synthesis with Denoising Diffusion Probabilistic Wasserstein GAN. CoRR abs/2209.10446 (2022) - [i46]Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference. CoRR abs/2210.15368 (2022) - [i45]Fan-Lin Wang, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang:
CasNet: Investigating Channel Robustness for Speech Separation. CoRR abs/2210.15370 (2022)
- 2021
- [j67]Wen-Li Wei, Jen-Chun Lin, Tyng-Luh Liu, Hsiao-Rong Tyan, Hsin-Min Wang, Hong-Yuan Mark Liao:
Learning to Visualize Music Through Shot Sequence for Automatic Concert Video Mashup. IEEE Trans. Multim. 23: 1731-1743 (2021) - [c220]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving. ACL/IJCNLP (2) 2021: 964-972 - [c219]Qian-Bei Hong, Chung-Hsien Wu, Thanh Binh Nguyen, Hsin-Min Wang:
Improvement of Spatial Ambiguity in Multi-Channel Speech Separation Using Channel Attention. APSIPA ASC 2021: 619-623 - [c218]Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang:
Generation of Speaker Representations Using Heterogeneous Training Batch Assembly. APSIPA ASC 2021: 719-724 - [c217]Yi-Syuan Liou, Wen-Chin Huang, Ming-Chi Yen, Shu-Wei Tsai, Yu-Huai Peng, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion. APSIPA ASC 2021: 1234-1238 - [c216]Ming-Chi Yen, Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Shu-Wei Tsai, Yu Tsao, Tomoki Toda, Jyh-Shing Roger Jang, Hsin-Min Wang:
Mandarin Electrolaryngeal Speech Voice Conversion with Sequence-to-Sequence Modeling. ASRU 2021: 650-657 - [c215]Hsin-Tien Chiang, Yi-Chiao Wu, Cheng Yu, Tomoki Toda, Hsin-Min Wang, Yih-Chun Hu, Yu Tsao:
HASA-Net: A Non-Intrusive Hearing-Aid Speech Assessment Network. ASRU 2021: 907-913 - [c214]Ryandhimas E. Zezario, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Speech Enhancement with Zero-Shot Model Selection. EUSIPCO 2021: 491-495 - [c213]Chung-En Sun, Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
Melody Harmonization Using Orderless NADE, Chord Balancing, and Blocked Gibbs Sampling. ICASSP 2021: 4145-4149 - [c212]Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, Tomoki Toda:
Speech Recognition by Simply Fine-Tuning BERT. ICASSP 2021: 7343-7347 - [c211]Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Ching-Feng Liu, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion. Interspeech 2021: 1329-1333 - [c210]Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang:
AlloST: Low-Resource Speech Translation Without Source Transcription. Interspeech 2021: 2252-2256 - [c209]Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee, Hsin-Min Wang:
Dual-Path Filter Network: Speaker-Aware Modeling for Speech Separation. Interspeech 2021: 3061-3065 - [c208]Yi-Chiao Wu, Cheng-Hung Hu, Hung-Shin Lee, Yu-Huai Peng, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
Relational Data Selection for Data Augmentation of Speaker-Dependent Multi-Band MelGAN Vocoder. Interspeech 2021: 3630-3634 - [c207]Yu-Tao Chang, Yuan-Hong Yang, Yu-Huai Peng, Syu-Siang Wang, Tai-Shih Chi, Yu Tsao, Hsin-Min Wang:
MoEVC: A Mixture of Experts Voice Conversion System With Sparse Gating Mechanism for Online Computation Acceleration. ISCSLP 2021: 1-5 - [c206]Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours. ISMIR 2021: 105-112 - [c205]Md Mahbub E. Noor, Yen-Ju Lu, Syu-Siang Wang, Supratip Ghose, Chia-Yu Chang, Ryandhimas E. Zezario, Shafique Ahmed, Wei-Ho Chung, Yu Tsao, Hsin-Min Wang:
Investigation of a Single-Channel Frequency-Domain Speech Enhancement Network to Improve End-to-End Bengali Automatic Speech Recognition Under Unseen Noisy Conditions. O-COCOSDA 2021: 7-12 - [c204]Cheng-Chung Fan, Chia-Chih Kuo, Shang-Bao Luo, Pei-Jun Liao, Kuang-Yu Chang, Chiao-Wei Hsu, Meng-Tse Wu, Shih-Hong Tsai, Tzu-Man Wu, Aleksandra Smolka, Chao-Chun Liang, Hsin-Min Wang, Kuan-Yu Chen, Yu Tsao, Keh-Yih Su:
A Flexible and Extensible Framework for Multiple Answer Modes Question Answering. ROCLING 2021: 33-42 - [c203]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Mining Commonsense and Domain Knowledge from Math Word Problems. ROCLING 2021: 111-117 - [i44]Wen-Chin Huang, Chia-Hua Wu, Shang-Bao Luo, Kuan-Yu Chen, Hsin-Min Wang, Tomoki Toda:
Speech Recognition by Simply Fine-tuning BERT. CoRR abs/2102.00291 (2021) - [i43]Cheng-Hung Hu, Yi-Chiao Wu, Wen-Chin Huang, Yu-Huai Peng, Yu-Wen Chen, Pin-Jui Ku, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
The AS-NU System for the M2VoC Challenge. CoRR abs/2104.03009 (2021) - [i42]Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang:
AlloST: Low-resource Speech Translation without Source Transcription. CoRR abs/2105.00171 (2021) - [i41]Shih-hung Tsai, Chao-Chun Liang, Hsin-Min Wang, Keh-Yih Su:
Sequence to General Tree: Knowledge-Guided Geometry Word Problem Solving. CoRR abs/2106.00990 (2021) - [i40]Wen-Chin Huang, Kazuhiro Kobayashi, Yu-Huai Peng, Ching-Feng Liu, Yu Tsao, Hsin-Min Wang, Tomoki Toda:
A Preliminary Study of a Two-Stage Paradigm for Preserving Speaker Identity in Dysarthric Voice Conversion. CoRR abs/2106.01415 (2021) - [i39]Cheng-Hung Hu, Yu-Huai Peng, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
SVSNet: An End-to-end Speaker Voice Similarity Assessment Model. CoRR abs/2107.09392 (2021) - [i38]Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
SurpriseNet: Melody Harmonization Conditioning on User-controlled Surprise Contours. CoRR abs/2108.00378 (2021) - [i37]Yi-Syuan Liou, Wen-Chin Huang, Ming-Chi Yen, Shu-Wei Tsai, Yu-Huai Peng, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Time Alignment using Lip Images for Frame-based Electrolaryngeal Voice Conversion. CoRR abs/2109.03551 (2021) - [i36]Yun-Ju Chan, Chiang-Jen Peng, Syu-Siang Wang, Hsin-Min Wang, Yu Tsao, Tai-Shih Chi:
Speech Enhancement-assisted StarGAN Voice Conversion in Noisy Environments. CoRR abs/2110.09923 (2021) - [i35]Ryandhimas E. Zezario, Szu-Wei Fu, Fei Chen, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Deep Learning-based Non-Intrusive Multi-Objective Speech Assessment Model with Cross-Domain Features. CoRR abs/2111.02363 (2021) - [i34]Hsin-Tien Chiang, Yi-Chiao Wu, Cheng Yu, Tomoki Toda, Hsin-Min Wang, Yih-Chun Hu, Yu Tsao:
HASA-net: A non-intrusive hearing-aid speech assessment network. CoRR abs/2111.05691 (2021)
- 2020
- [j66]Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Héctor Delgado, Andreas Nautsch, Nicholas W. D. Evans, Md. Sahidullah, Ville Vestman, Tomi Kinnunen, Kong Aik Lee, Lauri Juvela, Paavo Alku, Yu-Huai Peng, Hsin-Te Hwang, Yu Tsao, Hsin-Min Wang, Sébastien Le Maguer, Markus Becker, Zhen-Hua Ling:
ASVspoof 2019: A large-scale public database of synthesized, converted and replayed speech. Comput. Speech Lang. 64: 101114 (2020) - [j65]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-End Speech Enhancement. IEEE Signal Process. Lett. 27: 2149-2153 (2020) - [j64]Chang-Le Liu, Sze-Wei Fu, You-Jin Li, Jen-Wei Huang, Hsin-Min Wang, Yu Tsao:
Multichannel Speech Enhancement by Raw Waveform-Mapping Using Fully Convolutional Networks. IEEE ACM Trans. Audio Speech Lang. Process. 28: 1888-1900 (2020) - [j63]Cheng Yu, Ryandhimas E. Zezario, Syu-Siang Wang, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement Based on Denoising Autoencoder With Multi-Branched Encoders. IEEE ACM Trans. Audio Speech Lang. Process. 28: 2756-2769 (2020) - [j62]Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang:
Subspace-Based Representation and Learning for Phonotactic Spoken Language Recognition. IEEE ACM Trans. Audio Speech Lang. Process. 28: 3065-3079 (2020) - [j61]Wen-Chin Huang, Hao Luo, Hsin-Te Hwang, Chen-Chou Lo, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Unsupervised Representation Disentanglement Using Cross Domain Features and Adversarial Learning in Variational Autoencoder Based Voice Conversion. IEEE Trans. Emerg. Top. Comput. Intell. 4(4): 468-479 (2020) - [c202]Ryandhimas E. Zezario, Szu-Wei Fu, Chiou-Shann Fuh, Yu Tsao, Hsin-Min Wang:
STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model. APSIPA 2020: 482-486 - [c201]Hao Yen, Pin-Jui Ku, Ming-Chi Yen, Hung-Shin Lee, Hsin-Min Wang:
Joint Training of Guided Learning and Mean Teacher Models for Sound Event Detection. DCASE 2020: 235-239 - [c200]Ryandhimas E. Zezario, Tassadaq Hussain, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Self-Supervised Denoising Autoencoder with Linear Regression Decoder for Speech Enhancement. ICASSP 2020: 6669-6673 - [c199]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang:
Statistics Pooling Time Delay Neural Network Based on X-Vector for Speaker Verification. ICASSP 2020: 6849-6853 - [c198]Qian-Bei Hong, Chung-Hsien Wu, Hsin-Min Wang, Chien-Lin Huang:
Combining Deep Embeddings of Acoustic and Articulatory Features for Speaker Identification. ICASSP 2020: 7589-7593 - [c197]Shang-Yi Chuang, Yu Tsao, Chen-Chou Lo, Hsin-Min Wang:
Lite Audio-Visual Speech Enhancement. INTERSPEECH 2020: 1131-1135 - [c196]Chi-Chang Lee, Yu-Chen Lin, Hsuan-Tien Lin, Hsin-Min Wang, Yu Tsao:
SERIL: Noise Adaptive Speech Enhancement Using Regularization-Based Incremental Learning. INTERSPEECH 2020: 2432-2436 - [c195]Pin-Yuan Chen, Chia-Hua Wu, Hung-Shin Lee, Shao-Kang Tsao, Ming-Tat Ko, Hsin-Min Wang:
Using Taigi Dramas with Mandarin Chinese Subtitles to Improve Taigi Speech Recognition. O-COCOSDA 2020: 71-76 - [i33]Cheng Yu, Ryandhimas E. Zezario, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders. CoRR abs/2001.01538 (2020) - [i32]Wen-Chin Huang, Hao Luo, Hsin-Te Hwang, Chen-Chou Lo, Yu-Huai Peng, Yu Tsao, Hsin-Min Wang:
Unsupervised Representation Disentanglement using Cross Domain Features and Adversarial Learning in Variational Autoencoder based Voice Conversion. CoRR abs/2001.07849 (2020) - [i31]Tsun-An Hsieh, Hsin-Min Wang, Xugang Lu, Yu Tsao:
WaveCRN: An Efficient Convolutional Recurrent Neural Network for End-to-end Speech Enhancement. CoRR abs/2004.04098 (2020) - [i30]Chi-Chang Lee, Yu-Chen Lin, Hsuan-Tien Lin, Hsin-Min Wang, Yu Tsao:
SERIL: Noise Adaptive Speech Enhancement using Regularization-based Incremental Learning. CoRR abs/2005.11760 (2020) - [i29]Shang-Yi Chuang, Yu Tsao, Chen-Chou Lo, Hsin-Min Wang:
Lite Audio-Visual Speech Enhancement. CoRR abs/2005.11769 (2020) - [i28]Shang-Yi Chuang, Hsin-Min Wang, Yu Tsao:
Improved Lite Audio-Visual Speech Enhancement. CoRR abs/2008.13222 (2020) - [i27]Yu-Huai Peng, Cheng-Hung Hu, Alexander Chao-Fu Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang:
The Academia Sinica Systems of Voice Conversion for VCC2020. CoRR abs/2010.02669 (2020) - [i26]Chung-En Sun, Yi-Wei Chen, Hung-Shin Lee, Yen-Hsing Chen, Hsin-Min Wang:
Melody Harmonization Using Orderless NADE, Chord Balancing, and Blocked Gibbs Sampling. CoRR abs/2010.13468 (2020) - [i25]Ryandhimas E. Zezario, Szu-Wei Fu, Chiou-Shann Fuh, Yu Tsao, Hsin-Min Wang:
STOI-Net: A Deep Learning based Non-Intrusive Speech Intelligibility Assessment Model. CoRR abs/2011.04292 (2020) - [i24]Ryandhimas E. Zezario, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Speech Enhancement with Zero-Shot Model Selection. CoRR abs/2012.09359 (2020)
2010 – 2019
- 2019
- [c194]Hsiao-Tzu Hung, Chung-Yang Wang, Yi-Hsuan Yang, Hsin-Min Wang:
Improving Automatic Jazz Melody Generation by Transfer Learning Techniques. APSIPA 2019: 339-346 - [c193]Tassadaq Hussain, Yu Tsao, Hsin-Min Wang, Jia-Ching Wang, Sabato Marco Siniscalchi, Wen-Hung Liao:
Compressed Multimodal Hierarchical Extreme Learning Machine for Speech Enhancement. APSIPA 2019: 678-683 - [c192]Qian-Bei Hong, Chung-Hsien Wu, Ming-Hsiang Su, Hsin-Min Wang:
Sequential Speaker Embedding and Transfer Learning for Text-Independent Speaker Identification. APSIPA 2019: 827-832 - [c191]Yueh-Ting Lee, Xuan-Bo Chen, Hung-Shin Lee, Jyh-Shing Roger Jang, Hsin-Min Wang:
Multi-task Learning for Acoustic Modeling Using Articulatory Attributes. APSIPA 2019: 855-861 - [c190]Wei-Cheng Lin, Yu Tsao, Fei Chen, Hsin-Min Wang:
Investigation of Neural Network Approaches for Unified Spectral and Prosodic Feature Enhancement. APSIPA 2019: 1179-1184 - [c189]Shang-Bao Luo, Hung-Shin Lee, Kuan-Yu Chen, Hsin-Min Wang:
Spoken Multiple-Choice Question Answering Using Multimodal Convolutional Neural Networks. ASRU 2019: 772-778 - [c188]Wen-Chin Huang, Yi-Chiao Wu, Hsin-Te Hwang, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Refined WaveNet Vocoder for Variational Autoencoder Based Voice Conversion. EUSIPCO 2019: 1-5 - [c187]Tassadaq Hussain, Yu Tsao, Hsin-Min Wang, Jia-Ching Wang, Sabato Marco Siniscalchi, Wen-Hung Liao:
Audio-Visual Speech Enhancement using Hierarchical Extreme Learning Machine. EUSIPCO 2019: 1-5 - [c186]Yih-Liang Shen, Chao-Yuan Huang, Syu-Siang Wang, Yu Tsao, Hsin-Min Wang, Tai-Shih Chi:
Reinforcement Learning Based Speech Enhancement for Robust Speech Recognition. ICASSP 2019: 6750-6754 - [c185]Wen-Chin Huang, Yi-Chiao Wu, Chen-Chou Lo, Patrick Lumban Tobing, Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda, Yu Tsao, Hsin-Min Wang:
Investigation of F0 Conditioning and Fully Convolutional Networks in Variational Autoencoder Based Voice Conversion. INTERSPEECH 2019: 709-713 - [c184]Chen-Chou Lo, Szu-Wei Fu, Wen-Chin Huang, Xin Wang, Junichi Yamagishi, Yu Tsao, Hsin-Min Wang:
MOSNet: Deep Learning-Based Objective Assessment for Voice Conversion. INTERSPEECH 2019: 1541-1545 - [c183]Pin-Tuan Huang, Hung-Shin Lee, Syu-Siang Wang, Kuan-Yu Chen, Yu Tsao, Hsin-Min Wang:
Exploring the Encoder Layers of Discriminative Autoencoders for LVCSR. INTERSPEECH 2019: 1631-1635 - [c182]Chien-Feng Liao, Yu Tsao, Hung-yi Lee, Hsin-Min Wang:
Noise Adaptive Speech Enhancement Using Domain Adversarial Training. INTERSPEECH 2019: 3148-3152 - [c181]Ryandhimas E. Zezario, Szu-Wei Fu, Xugang Lu, Hsin-Min Wang, Yu Tsao:
Specialized Speech Enhancement Model Selection Based on Learned Non-Intrusive Quality Assessment Metric. INTERSPEECH 2019: 3168-3172 - [c180]Tassadaq Hussain, Yu Tsao, Sabato Marco Siniscalchi, Jia-Ching Wang, Hsin-Min Wang