ICASSP 2022: Virtual and Singapore
- IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Virtual and Singapore, 23-27 May 2022. IEEE 2022, ISBN 978-1-6654-0541-6
- Shibo Zhang, Ebrahim Nemati, Minh Dinh, Nathan Folkman, Tousif Ahmed, Md. Mahbubur Rahman, Jilong Kuang, Nabil Alshurafa, Alex Gao: Coughtrigger: Earbuds IMU Based Cough Detection Activator Using An Energy-Efficient Sensitivity-Prioritized Time Series Classifier. 1-5
- Hoang Truong, Alessandro Montanari, Fahim Kawsar: Non-Invasive Blood Pressure Monitoring with Multi-Modal In-Ear Sensing. 6-10
- Xiaolu Zeng, Beibei Wang, Chenshu Wu, Sai Deepika Regani, K. J. Ray Liu: Intelligent Wi-Fi Based Child Presence Detection System. 11-15
- Wenxuan Li, Dongheng Zhang, Yadong Li, Zhi Wu, Jinbo Chen, Dong Zhang, Yang Hu, Qibin Sun, Yan Chen: Real-Time Fall Detection Using Mmwave Radar. 16-20
- Dae Yon Hwang, Pai Chet Ng, Yuanhao Yu, Yang Wang, Petros Spachos, Dimitrios Hatzinakos, Konstantinos N. Plataniotis: Hierarchical Deep Learning Model with Inertial and Physiological Sensors Fusion for Wearable-Based Human Activity Recognition. 21-25
- Yu-Chen Lin, Tsun-An Hsieh, Kuo-Hsuan Hung, Cheng Yu, Harinath Garudadri, Yu Tsao, Tei-Wei Kuo: Speech Recovery For Real-World Self-Powered Intermittent Devices. 26-30
- Ai Okano, Yoshinobu Kajikawa: Phase Control of Parametric Array Loudspeaker by Optimizing Sideband Weights. 31-35
- Florian Scalvini, Camille Bordeau, Maxime Ambard, Cyrille Migniot, Julien Dubois: Low-Latency Human-Computer Auditory Interface Based on Real-Time Vision Analysis. 36-40
- Akihiko Sugiyama: Robust Adaptive Noise Canceller Algorithm with Snr-Based Stepsize Control and Noise-Path Gain Compensation. 41-45
- Chao Liu, Linlin Gao, Ruobing Jiang: Neartracker: Acoustic 2-D Target Tracking with Nearby Reflector in Siso System. 46-50
- Harinarayanan. E. V, Sachin Ghanekar: An Efficient Method For Generic Dsp Implementation Of Dilated Convolution. 51-55
- Yu-Shan Tai, Chieh-Fang Teng, Cheng-Yang Chang, An-Yeu Andy Wu: Compression-Aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations. 56-60
- Simon Narduzzi, Siavash Arjomand Bigdeli, Shih-Chii Liu, L. Andrea Dunbar: Optimizing The Consumption Of Spiking Neural Networks With Activity Regularization. 61-65
- Sujan Kumar Gonugondla, Naresh R. Shanbhag: IMPQ: Reduced Complexity Neural Networks Via Granular Precision Assignment. 66-70
- Youngeun Kim, Hyoungseob Park, Abhishek Moitra, Abhiroop Bhattacharjee, Yeshwanth Venkatesha, Priyadarshini Panda: Rate Coding Or Direct Coding: Which One Is Better For Accurate, Robust, And Energy-Efficient Spiking Neural Networks? 71-75
- Linghao Song, Yuze Chi, Jason Cong: PYXIS: An Open-Source Performance Dataset Of Sparse Accelerators. 76-80
- Zuozhou Pan, Zhiping Lin, Yuanjin Zheng, Zong Meng: Fast Fault Diagnosis Method Of Rolling Bearings In Multi-Sensor Measurement Enviroment. 81-85
- Diaa Badawi, Ishaan Bassi, Sule Ozev, Ahmet Enis Çetin: Detecting Anomaly in Chemical Sensors via Regularized Contrastive Learning. 86-90
- Cheng Tang, Junkai Ji, Qiuzhen Lin, Yan Zhou: Evolutionary Neural Architecture Design of Liquid State Machine for Image Classification. 91-95
- Huy Phan, Yi Xie, Jian Liu, Yingying Chen, Bo Yuan: Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks. 96-100
- Cheng-Hung Lo, Pei-Yun Tsai: Tensor-Based Orthogonal Matching Pursuit with Phase Rotation for Channel Estimation In Hybrid Beamforming Mimo-Ofdm Systems. 101-105
- Darius Petermann, Minje Kim: Spain-Net: Spatially-Informed Stereophonic Music Source Separation. 106-110
- Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy: Improved Singing Voice Separation with Chromagram-Based Pitch-Aware Remixing. 111-115
- Haici Yang, Shivani Firodiya, Nicholas J. Bryan, Minje Kim: Don't Separate, Learn To Remix: End-To-End Neural Remixing With Joint Optimization. 116-120
- Yu Wang, Daniel Stoller, Rachel M. Bittner, Juan Pablo Bello: Few-Shot Musical Source Separation. 121-125
- Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo: Source Separation By Steering Pretrained Music Models. 126-130
- Xuewen Yao, Megan Micheletti, Mckensey Johnson, Edison Thomaz, Kaya de Barbaro: Infant Crying Detection In Real-World Environments. 131-135
- Qin Zhang, Qingming Tang, Chieh-Chi Kao, Ming Sun, Yang Liu, Chao Wang: Wikitag: Wikipedia-Based Knowledge Embeddings Towards Improved Acoustic Event Classification. 136-140
- Magdalena Fuentes, Bea Steers, Pablo Zinemanas, Martín Rocamora, Luca Bondi, Julia Wilkins, Qianyi Shi, Yao Hou, Samarjit Das, Xavier Serra, Juan Pablo Bello: Urban Sound & Sight: Dataset And Benchmark For Audio-Visual Urban Scene Understanding. 141-145
- Sai Srinadhu Katta, Kide Vuojärvi, Sivaprasad Nandyala, Ulla-Maria Kovalainen, Lauren Baddeley: Real-World On-Board Uav Audio Data Set For Propeller Anomalies. 146-150
- Yuan Gong, Jin Yu, James R. Glass: Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. 151-155
- Kento Nagatomo, Masahiro Yasuda, Kohei Yatabe, Shoichiro Saito, Yasuhiro Oikawa: Wearable Seld Dataset: Dataset For Sound Event Localization And Detection Using Wearable Devices Around Head. 156-160
- Viet-Anh Nguyen, Anh H. T. Nguyen, Andy W. H. Khong: Tunet: A Block-Online Bandwidth Extension Model Based On Transformers And Self-Supervised Pretraining. 161-165
- Jinjiang Liu, Xueliang Zhang: DRC-NET: Densely Connected Recurrent Convolutional Neural Network for Speech Dereverberation. 166-170
- Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann: Customizable End-To-End Optimization Of Online Neural Network-Supported Dereverberation For Hearing Devices. 171-175
- Naoyuki Kamo, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani: Importance of Switch Optimization Criterion in Switching WPE Dereverberation. 176-180
- Ziyu Wang, Dejing Xu, Gus Xia, Ying Shan: Audio-To-Symbolic Arrangement Via Cross-Modal Music Representation Learning. 181-185
- Shiqi Wei, Gus Xia, Yixiao Zhang, Liwei Lin, Weiguo Gao: Music Phrase Inpainting Using Long-Term Representation and Contrastive Loss. 186-190
- Yi Zou, Pei Zou, Yi Zhao, Kaixiang Zhang, Ran Zhang, Xiaorui Wang: Melons: Generating Melody With Long-Term Structure Using Transformers And Structure Graph. 191-195
- Moyu Terao, Yuki Hiramatsu, Ryoto Ishizuka, Yiming Wu, Kazuyoshi Yoshii: Difficulty-Aware Neural Band-to-Piano Score Arrangement based on Note- and Statistic-Level Criteria. 196-200
- Pedro Ramoneda, Nazif Can Tamer, Vsevolod Eremenko, Xavier Serra, Marius Miron: Score Difficulty Analysis for Piano Performance Education based on Fingering. 201-205
- Zhipeng Chen, Yiya Hao, Yaobin Chen, Gong Chen, Liang Ruan: A Neural Network-based Howling Detection Method for Real-Time Communication Applications. 206-210
- Tomer Fireaizen, Saar Ron, Omer Bobrowski: Alarm Sound Detection Using Topological Signal Processing. 211-215
- Osamu Ichikawa, Yuuto Shima, Takahiro Nakayama, Hajime Shirouzu: A Method For Estimating The Grouping Of Participants In Classroom Group Work Using Only Audio Information. 216-220
- Yuki Okamoto, Shota Horiguchi, Masaaki Yamamoto, Keisuke Imoto, Yohei Kawaguchi: Environmental Sound Extraction Using Onomatopoeic Words. 221-225
- Masahiro Yasuda, Yasunori Ohishi, Shoichiro Saito: Echo-Aware Adaptation of Sound Event Localization and Detection in Unknown Environments. 226-230
- Juncheng B. Li, Shuhui Qu, Xinjian Li, Bernie Po-Yao Huang, Florian Metze: On Adversarial Robustness Of Large-Scale Audio Visual Learning. 231-235
- Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee: Adversarial Sample Detection for Speaker Verification by Neural Vocoders. 236-240
- Naoya Takahashi, Yuki Mitsufuji: Amicable Examples for Informed Source Separation. 241-245
- David M. Chan, Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister: Multi-Modal Pre-Training for Automated Speech Recognition. 246-250
- Ryota Tsunoda, Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yoshie Imai: Speaker-Targeted Audio-Visual Speech Recognition Using a Hybrid CTC/Attention Model with Interference Loss. 251-255
- Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu, Yanmin Qian: Time-Domain Audio-Visual Speech Separation on Low Quality Videos. 256-260
- Mhd Modar Halimeh, Walter Kellermann: Complex-Valued Spatial Autoencoders for Multichannel Speech Enhancement. 261-265
- Zhi-Wei Tan, Anh H. T. Nguyen, Yuan Liu, Andy W. H. Khong: Multichannel Noise Reduction Using Dilated Multichannel U-Net and Pre-Trained Single-Channel Network. 266-270
- Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang: One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement. 271-275
- Cong Han, Emine Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile: Multi-Channel Speech Denoising for Machine Ears. 276-280
- Zhong-Qiu Wang, DeLiang Wang: Localization based Sequential Grouping for Continuous Speech Separation. 281-285
- Mieszko Fras, Marcin Witkowski, Konrad Kowalczyk: Convolutional Weighted Minimum Mean Square Error Filter for Joint Source Separation and Dereverberation. 286-290
- Ethan Manilow, Curtis Hawthorne, Cheng-Zhi Anna Huang, Bryan Pardo, Jesse H. Engel: Improving Source Separation by Explicitly Modeling Dependencies between Sources. 291-295
- Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji: Music Source Separation With Deep Equilibrium Models. 296-300
- Natsuki Akaishi, Kohei Yatabe, Yasuhiro Oikawa: Harmonic and Percussive Sound Separation Based on Mixed Partial Derivative of Phase Spectrogram. 301-305
- Enric Gusó, Jordi Pons, Santiago Pascual, Joan Serrà: On Loss Functions and Evaluation Metrics for Music Source Separation. 306-310
- Sangwook Park, Mounya Elhilali: Time-Balanced Focal Loss for Audio Event Detection. 311-315
- Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji: Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. 316-320
- Arman Zharmagambetov, Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Viktor Rozgic, Jasha Droppo, Chao Wang: Improved Representation Learning For Acoustic Event Classification Using Tree-Structured Ontology. 321-325
- Sandeep Kothinti, Mounya Elhilali: Temporal Contrastive-Loss for Audio Event Detection. 326-330
- Xu Wang, Xiangjinzi Zhang, Yunfei Zi, Shengwu Xiong: A Frame Loss of Multiple Instance Learning for Weakly Supervised Sound Event Detection. 331-335
- Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang: Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging. 336-340
- Wenyu Jin, Tim Schoof, Henning F. Schepker: Individualized Hear-Through For Acoustic Transparency Using PCA-Based Sound Pressure Estimation At The Eardrum. 341-345
- Benjamin Lentz, Rainer Martin, Kirsten Oberländer, Christiane Völter: On Spectral and Temporal Sparsification of Speech Signals for the Improvement of Speech Perception in CI Listeners. 346-350
- Fotios Drakopoulos, Sarah Verhulst: A Differentiable Optimisation Framework for The Design of Individualised DNN-based Hearing-Aid Strategies. 351-355
- Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang: Personalized speech enhancement: new models and Comprehensive evaluation. 356-360
- Jinxu Xiang, Yuyang Zhu, Rundi Wu, Ruilin Xu, Yuko Ishiwaka, Changxi Zheng: Dynamic Sliding Window for Realtime Denoising Networks. 361-365
- Sunwoo Kim, Minje Kim: Bloom-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement. 366-370
- Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang: HGCN: Harmonic Gated Compensation Network for Speech Enhancement. 371-375
- Wenbin Jiang, Zhijun Liu, Kai Yu, Fei Wen: Speech Enhancement with Neural Homomorphic Synthesis. 376-380
- Yang Xiang, Jesper Lisby Højvang, Morten Højfeldt Rasmussen, Mads Græsbøll Christensen: A Bayesian Permutation Training Deep Representation Learning Method for Speech Enhancement with Variational Autoencoder. 381-385
- Huajian Fang, Tal Peer, Stefan Wermter, Timo Gerkmann: Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement. 386-390
- Viet Anh Trinh, Sebastian Braun: Unsupervised Speech Enhancement with Speech Recognition Embedding and Disentanglement Losses. 391-395
- Xianke Wang, Wei Xu, Weiming Yang, Wenqing Cheng: Musicyolo: A Sight-Singing Onset/Offset Detection Framework Based on Object Detection Instead of Spectrum Frames. 396-400
- Yun-Ning Hung, Ju-Chiang Wang, Xuchen Song, Wei Tsung Lu, Minz Won: Modeling Beats and Downbeats with a Time-Frequency Transformer. 401-405
- Michael Krause, Meinard Müller: Hierarchical Classification of Singing Activity, Gender, and Type in Complex Music Recordings. 406-410
- Qiqi He, Xiaoheng Sun, Yi Yu, Wei Li: Deepchorus: A Hybrid Model of Multi-Scale Convolution And Self-Attention for Chorus Detection. 411-415
- Ju-Chiang Wang, Yun-Ning Hung, Jordan B. L. Smith: To Catch A Chorus, Verse, Intro, or Anything Else: Analyzing a Song with Structural Functions. 416-420
- Mojtaba Heydari, Matthew C. McCallum, Andreas F. Ehmann, Zhiyao Duan: A Novel 1D State Space for Efficient Music Rhythmic Analysis. 421-425
- Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim: Upmixing Via Style Transfer: A Variational Autoencoder for Disentangling Spatial Images And Musical Content. 426-430
- Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji: Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection. 431-435
- Tobias Kabzinski, Peter Jax: Towards Faster Continuous Multi-Channel HRTF Measurements Based On Learning System Models. 436-440
- Bowen Zhi, Dmitry N. Zotkin, Ramani Duraiswami: Towards Fast And Convenient End-To-End HRTF Personalization. 441-445
- Mateusz Guzik, Konrad Kowalczyk: Wishart Localization Prior On Spatial Covariance Matrix In Ambisonic Source Separation Using Non-Negative Tensor Factorization. 446-450
- Jiawen Huang, Emmanouil Benetos, Sebastian Ewert: Improving Lyrics Alignment Through Joint Pitch Detection. 451-455
- Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas: Learning Music Audio Representations Via Weak Language Supervision. 456-460
- David Giuseppe Badiane, Raffaele Malvermi, Sebastian Gonzalez, Fabio Antonacci, Augusto Sarti: On the Prediction of the Frequency Response of a Wooden Plate from Its Mechanical Parameters. 461-465
- Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang: Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks. 466-470
- Han Chen, Yan Song, Li-Rong Dai, Ian McLoughlin, Lin Liu: Self-Supervised Representation Learning for Unsupervised Anomalous Sound Detection Under Domain Shift. 471-475
- Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi: Federated Self-Training for Data-Efficient Audio Recognition. 476-480
- Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang: Federated Self-Supervised Learning for Acoustic Event Classification. 481-485
- Kwanghee Choi, Martin Kersner, Jacob Morton, Buru Chang: Temporal Knowledge Distillation for on-device Audio Classification. 486-490
- Ognjen (Oggi) Rudovic, Akanksha Bindal, Vineet Garg, Pramod Simha, Pranay Dighe, Sachin Kajarekar: Streaming on-Device Detection of Device Directed Speech from Voice and Touch-Based Invocation. 491-495
- Hiroshi Sawada, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani: Multi-Frame Full-Rank Spatial Covariance Analysis for Underdetermined BSS in Reverberant Environments. 496-500
- Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii: Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation. 501-505
- Yudong He, He Wang, Qifeng Chen, Richard Hau Yue So: Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique. 506-510
- Shogo Seki, Hirokazu Kameoka, Li Li: Investigation And Comparison of Optimization Methods for Variational Autoencoder-Based Underdetermined Multichannel Source Separation. 511-515
- Li Li, Hirokazu Kameoka, Shogo Seki: HBP: An Efficient Block Permutation Solver Using Hungarian Algorithm and Spectrogram Inpainting for Multichannel Audio Source Separation. 516-520
- Chenxing Li, Yang Wang, Feng Deng, Zhuo Zhang, Xiaorui Wang, Zhongyuan Wang: EAD-Conformer: a Conformer-Based Encoder-Attention-Decoder-Network for Multi-Task Audio Source Separation. 521-525
- Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux: The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. 526-530
- Félix Mathieu, Thomas Courtat, Gaël Richard, Geoffroy Peeters: Phase Shifted Bedrosian Filterbank: An Interpretable Audio Front-End for Time-Domain Audio Source Separation. 531-535
- Rahil Parikh, Ilya Kavalerov, Carol Y. Espy-Wilson, Shihab A. Shamma: Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems. 536-540
- Changsheng Quan, Xiaofei Li: Multi-Channel Narrow-Band Deep Speech Separation with Full-Band Permutation Invariant Training. 541-545
- Cunhang Fan, Zhao Lv, Shengbing Pei, Mingyue Niu: Csenet: Complex Squeeze-and-Excitation Network for Speech Depression Level Prediction. 546-550
- Ebrahim Nemati, Xuhai Xu, Viswam Nathan, Korosh Vatanparvar, Tousif Ahmed, Md. Mahbubur Rahman, Dan McCaffrey, Jilong Kuang, Alex Gao: Ubilung: Multi-Modal Passive-Based Lung Health Assessment. 551-555
- Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Debarpan Bhattacharya, Debottam Dutta, Pravin Mote, Sriram Ganapathy: The Second Dicova Challenge: Dataset and Performance Analysis for Diagnosis of Covid-19 Using Acoustics. 556-560
- Xing-Yu Chen, Qiu-Shi Zhu, Jie Zhang, Li-Rong Dai: Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals. 561-565
- Madhu R. Kamble, Jose Patino, Maria A. Zuluaga, Massimiliano Todisco: Exploring Auditory Acoustic Features for The Diagnosis of Covid-19. 566-570
- Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu: Fast-Rir: Fast Neural Diffuse Room Impulse Response Generator. 571-575
- Juliano G. C. Ribeiro, Shoichi Koyama, Hiroshi Saruwatari: Region-to-Region Kernel Interpolation of Acoustic Transfer Function with Directional Weighting. 576-580
- Philipp Götz, Cagdas Tuna, Andreas Walther, Emanuël A. P. Habets: Blind Reverberation Time Estimation in Dynamic Acoustic Conditions. 581-585
- Maozhong Fu, Jesper Rindom Jensen, Yuhan Li, Mads Græsbøll Christensen: Sparse Modeling of The Early Part of Noisy Room Impulse Responses with Sparse Bayesian Learning. 586-590