ICASSP 2023: Rhodes Island, Greece
- IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP 2023, Rhodes Island, Greece, June 4-10, 2023. IEEE 2023, ISBN 978-1-7281-6327-7
- Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel S. Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley:
Large-Scale Language Model Rescoring on Long-Form Data. 1-5
- Haibo Ye, Fangyu Zhou, Xinjie Li, Qingheng Zhang:
Balanced Mixup Loss for Long-Tailed Visual Recognition. 1-5
- Hanbing Liu, Yanru Wu, Yang Liu, Ercan E. Kuruoglu, Xuan Zhang:
SDG-L: A Semiparametric Deep Gaussian Process based Framework for Battery Capacity Prediction. 1-5
- Harshat Kumar, Alejandro Parada-Mayorga, Alejandro Ribeiro:
Algebraic Convolutional Filters on Lie Group Algebras. 1-5
- Atsushi Miyashita, Tomoki Toda:
Representation of Vocal Tract Length Transformation Based on Group Theory. 1-5
- Aochuan Chen, Peter Lorenz, Yuguang Yao, Pin-Yu Chen, Sijia Liu:
Visual Prompting for Adversarial Robustness. 1-5
- Yuzhou Chen, Sotiris Batsakis, H. Vincent Poor:
Higher-Order Spatio-Temporal Neural Networks for Covid-19 Forecasting. 1-5
- Domenico Mattia Cinque, Claudio Battiloro, Paolo Di Lorenzo:
Pooling Strategies for Simplicial Convolutional Networks. 1-5
- Jerry Gu, Liam Collins, Debashri Roy, Aryan Mokhtari, Sanjay Shakkottai, Kaushik R. Chowdhury:
Meta-Learning for Image-Guided Millimeter-Wave Beam Selection in Unseen Environments. 1-5
- Amlu Anna Joshy, P. N. Parameswaran, Siddharth R. Nair, Rajeev Rajan:
Statistical Analysis of Speech Disorder Specific Features to Characterise Dysarthria Severity Level. 1-5
- Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe:
Towards Zero-Shot Code-Switched Speech Recognition. 1-5
- Jian Chen, Wei Wang, Junxin Chen, Ming Cai:
Dynamic Vehicle Graph Interaction for Trajectory Prediction Based on Video Signals. 1-5
- Thien-Phuc Doan, Long Nguyen-Vu, Souhwan Jung, Kihun Hong:
BTS-E: Audio Deepfake Detection Using Breathing-Talking-Silence Encoder. 1-5
- Yahong Zhang, Sheng Shi, Chenchen Fan, Yixin Wang, Wenli Ouyang, WeiFan, Jianping Fan:
Long-Tailed Recognition with Causal Invariant Transformation. 1-5
- Xiu Zheng, Yuan Huang, Jie Tang:
Reliable Cluster-Based Framework for Open Set Domain Adaptation. 1-5
- Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu:
Self-Supervised Audio-Visual Speech Representations Learning by Multimodal Self-Distillation. 1-5
- Weiquan Huang, Fu Zhang:
Semi-Supervised Semantic Segmentation with Structured Output Space Adaption. 1-5
- Gaopeng Xu, Xianliang Wang, Sang Wang, Junfeng Yuan, Wei Guo, Wei Li, Jie Gao:
The NIO System for Audio-Visual Diarization and Recognition in MISP Challenge 2022. 1-2
- Chenghu Du, Shengwu Xiong:
CF-VTON: Multi-Pose Virtual Try-on with Cross-Domain Fusion. 1-5
- Subhashini Venugopalan, Jimmy Tobin, Samuel J. Yang, Katie Seaver, Richard J. N. Cave, Pan-Pan Jiang, Neil Zeghidour, Rus Heywood, Jordan R. Green, Michael P. Brenner:
Speech Intelligibility Classifiers from 550k Disordered Speech Samples. 1-5
- Kassem Kallas, Teddy Furon:
Mixer: DNN Watermarking using Image Mixup. 1-5
- Kaushani Majumder, Sibi Raj B. Pillai, Satish Mulleti:
Clustered Greedy Algorithm For Large-Scale Sensor Selection. 1-5
- Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman:
Massively Multilingual Shallow Fusion with Large Language Models. 1-5
- Dazhao Du, Bing Su, Zhewei Wei:
Preformer: Predictive Transformer with Multi-Scale Segment-Wise Correlations for Long-Term Time Series Forecasting. 1-5
- Ziyue Wang, Ya-Feng Liu, Zhaorui Wang, Wei Yu:
Scaling Law Analysis for Covariance Based Activity Detection in Cooperative Multi-Cell Massive Mimo. 1-5
- Michael Chan, Li Zhu, Korosh Vatanparvar, Hewon Jung, Jilong Kuang, Jun Alex Gao:
Improving Heart Rate and Heart Rate Variability Estimation from Video Through a HR-RR-Tuned Filter. 1-5
- Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
Intermpl: Momentum Pseudo-Labeling With Intermediate CTC Loss. 1-5
- Jiewen Zhu, Shengjia Chen, Lexiao Li, Luping Ji:
Sanet: Spatial Attention Network with Global Average Contrast Learning for Infrared Small Target Detection. 1-5
- Manila Kodali, Sudarsana Reddy Kadiri, Laura Laaksonen, Paavo Alku:
Automatic Classification of Vocal Intensity Category from Speech. 1-5
- Xingming Wang, Hao Wu, Chen Ding, Chuanzeng Huang, Ming Li:
Exploring Universal Singing Speech Language Identification Using Self-Supervised Learning Based Front-End Features. 1-5
- Jochen Fink, Renato L. G. Cavalcante, Zoran Utkovski, Slawomir Stanczak:
Deep-Unfolded Adaptive Projected Subgradient Method For Mimo Detection. 1-5
- Sofia Suvorova, Ali Pezeshki, Ross Kyprianou, Bill Moran:
A Radar-Jammer Zero-Sum Repeated Bayesian Game. 1-5
- Shuo Feng, Piji Li:
Ancient Chinese Word Segmentation and Part-of-Speech Tagging Using Distant Supervision. 1-5
- Yao Lu, Zhiyi Chen, Zehui Chen, Jie Hu, Liujuan Cao, Shengchuan Zhang:
CANDY: Category-Kernelized Dynamic Convolution for Instance Segmentation. 1-5
- Liuyin Wang, Mingchao Li, Hai-Tao Zheng:
High-Level Feature Fusion Network for Session-Based Social Recommendation. 1-5
- Mingliang Dai, Zhizhong Huang, Jiaqi Gao, Hongming Shan, Junping Zhang:
Cross-Head Supervision for Crowd Counting with Noisy Annotations. 1-5
- Liana Khamidullina, André L. F. de Almeida, Martin Haardt:
Rate Splitting and Precoding Strategies for Multi-User MIMO Broadcast Channels with Common and Private Streams. 1-5
- Lei Zhang, Jie Liu, Yanqi Bao, Jie Wang:
Region-Awared Transformer with Asymmetric Loss in Multi-Label Classification. 1-5
- Mehul Kumar, Jiyeon Kim, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim:
Self-Supervised Accent Learning for Under-Resourced Accents Using Native Language Data. 1-5
- Jun Wang, Peng Yao, Feng Deng, Jianchao Tan, Chengru Song, Xiaorui Wang:
NAS-DYMC: NAS-Based Dynamic Multi-Scale Convolutional Neural Network for Sound Event Detection. 1-5
- Xianyu Wang, Yuhan Zhang, Weihua He, Yaoyuan Wang, Minglei Li, Yuchen Wang, Jingyi Zhang, Shunbo Zhou, Ziyang Zhang:
Audio-Driven High Definetion and Lip-Synchronized Talking Face Generation Based on Face Reenactment. 1-5
- Han Ding, Wenjing Song, Cui Zhao, Fei Wang, Ge Wang, Wei Xi, Jizhong Zhao:
Knowledge-Graph Augmented Music Representation for Genre Classification. 1-5
- Da Li, Bo Tang, Lei Xue:
Co-Design for Mimo Radar and Mimo Communication Aided by Reconfigurable Intelligent Surface. 1-5
- Daizong Liu, Pan Zhou:
Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos. 1-5
- Yudong Zhang, Wei Lu, Xu Wang, Pengkun Wang, Yang Wang:
Pondering About Task Spatial Misalignment: Classification-Localization Equilibrated Object Detection. 1-5
- Andrea Marinoni, Marine Mercier, Qian Shi, Sivasakthy Selvakumaran, Mark Girolami:
Incorporating Reliability in Graph Information Propagation by Fluid Dynamics Diffusion: A case of Multimodal Semisupervised Deep Learning. 1-5
- Zhao Ren, Thanh Tam Nguyen, Yi Chang, Björn W. Schuller:
Fast Yet Effective Speech Emotion Recognition with Self-Distillation. 1-5
- Marco A. Oliveira, Vitor Almeida, João Silva, Aníbal J. S. Ferreira:
Analysis and Re-Synthesis of Natural Cricket Sounds Assessing the Perceptual Relevance of Idiosyncratic Parameters. 1-5
- Yikang Wei, Yahong Han:
Exploring Instance Relation for Decentralized Multi-Source Domain Adaptation. 1-5
- Yihong Wu, Yuwen Heng, Mahesan Niranjan, Hansung Kim:
Depth Estimation for a Single Omnidirectional Image with Reversed-Gradient Warming-up Thresholds Discriminator. 1-5
- Ysobel Sims, Alexandre Mendes, Stephan K. Chalup:
Enhanced Embeddings in Zero-Shot Learning for Environmental Audio. 1-5
- Youngki Kwon, Hee-Soo Heo, Bong-Jin Lee, You Jin Kim, Jee-Weon Jung:
Absolute Decision Corrupts Absolutely: Conservative Online Speaker Diarisation. 1-5
- Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf:
Hiding Speaker's Sex in Speech Using Zero-Evidence Speaker Representation in an Analysis/Synthesis Pipeline. 1-5
- Seyed Saman Saboksayr, Gonzalo Mateos:
Dual-Based Online Learning of Dynamic Network Topologies. 1-5
- Benjamin Z. Reichman, Anirudh Sundar, Christopher Richardson, Tamara Zubatiy, Prithwijit Chowdhury, Aaryan Shah, Jack Truxal, Micah Grimes, Dristi Shah, Woo Ju Chee, Saif Punjwani, Atishay Jain, Larry Heck:
Outside Knowledge Visual Question Answering Version 2.0. 1-5
- Zihui Cai, Hongwei Ding, Xuemeng Wu, Mohan Xu, Xiaohui Cui:
Hierarchical Transformer for Multi-Label Trailer Genre Classification. 1-5
- Georgios Rizos, Rafael A. Calvo, Björn W. Schuller:
Positive-Pair Redundancy Reduction Regularisation for Speech-Based Asthma Diagnosis Prediction. 1-5
- Xunmeng Wu, Zai Yang, Jian-Feng Cai, Zongben Xu:
Spectral Super-Resolution on the Unit Circle Via Gradient Descent. 1-5
- Seongyeon Park, Myungseo Song, Bohyung Kim, Tae-Hyun Oh:
Unsupervised Pre-Training for Data-Efficient Text-to-Speech on Low Resource Languages. 1-5
- Fengming Liang, Changlin Fan, Bo Xiao, Kongming Liang:
Semantic Centralized Contrastive Learning for Unsupervised Hashing. 1-5
- Chia-Sheng Liu, Jia-Fong Yeh, Hao Hsu, Hung-Ting Su, Ming-Sui Lee, Winston H. Hsu:
BIRD-PCC: Bi-Directional Range Image-Based Deep Lidar Point Cloud Compression. 1-5
- Zengrui Jin, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shujie Hu, Jiajun Deng, Guinan Li, Xunying Liu:
Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition. 1-5
- Guanjun Li, Wei Xue, Wenju Liu, Jiangyan Yi, Jianhua Tao:
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios. 1-5
- Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Jin Liu, Xin Jiang, Qun Liu:
History, Present and Future: Enhancing Dialogue Generation with Few-Shot History-Future Prompt. 1-5
- Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg:
Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio. 1-5
- Dan Berrebbi, Brian Yan, Shinji Watanabe:
Avoid Overthinking in Self-Supervised Models for Speech Recognition. 1-5
- Sarah Miller, Christina Karam, Achour Idoughi, Kodai Kikuchi, Keigo Hirakawa:
A Bayesian Perspective on Noise2Noise: Theory and Extensions. 1-5
- Yuhongze Zhou, Liguang Zhou, Issam Hadj Laradji, Tin Lun Lam, Yangsheng Xu:
Affinity Learning With Blind-Spot Self-Supervision for Image Denoising. 1-5
- Tzeviya Sylvia Fuchs, Yedid Hoshen:
Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels. 1-5
- Nauman Dawalatabad, Sameer Khurana, Antoine Laurent, James R. Glass:
On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration. 1-5
- Kisoo Kwon, Kuhwan Jeong, Junghyun Park, Hwidong Na, Jinwoo Shin:
String-Based Molecule Generation Via Multi-Decoder VAE. 1-5
- Xinzhou Xu, Jun Deng, Zixing Zhang, Zhen Yang, Björn W. Schuller:
Zero-Shot Speech Emotion Recognition Using Generative Learning with Reconstructed Prototypes. 1-5
- Roberto Pereira, Xavier Mestre, David Gregoratti:
Consistent Estimators of a New Class of Covariance Matrix Distances in the Large Dimensional Regime. 1-5
- Yu Bai, Ruian He, Weimin Tan, Bo Yan, Yangle Lin:
Fine-Grained Blind Face Inpainting with 3D Face Component Disentanglement. 1-5
- Yibin Tang, Ying Chen, Yuan Gao, Aimin Jiang, Lin Zhou:
ADHD Classification with Biomarker Identification Using a Triplet Loss Attention Auto-Encoding Network. 1-5
- Rakib Hyder, M. Salman Asif:
Compressive Sensing with Tensorized Autoencoder. 1-5
- Zhengzhuo Xu, Shuo Yang, Xingjun Wang, Chun Yuan:
Rethink Long-Tailed Recognition with Vision Transforms. 1-5
- Ruoyu Wang, Jun Du, Tian Gao:
Quantum Transfer Learning Using the Large-Scale Unsupervised Pre-Trained Model Wavlm-Large for Synthetic Speech Detection. 1-5
- Daeun Kyung, Kyungmin Jo, Jaegul Choo, Joonseok Lee, Edward Choi:
Perspective Projection-Based 3d CT Reconstruction from Biplanar X-Rays. 1-5
- Yanan Lin, Keyu Chen, Shihao Zhou, Yunan Huang, Yunqi Lei:
CO-NET: Classification-Oriented Point Cloud Sampling via Informative Feature Learning and Non-Overlapped Local Adjustment. 1-5
- Rémi Delogne, Vincent Schellekens, Laurent Daudet, Laurent Jacques:
Signal Processing with Optical Quadratic Random Sketches. 1-5
- Ferdinand Jost, Vassillen Chizhov, Joachim Weickert:
Optimising Different Feature Types for Inpainting-Based Image Representations. 1-5
- Fuyan Ma, Bin Sun, Shutao Li:
Logo-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition. 1-5
- Yun-Ning Hung, Chao-Han Huck Yang, Pin-Yu Chen, Alexander Lerch:
Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming. 1-5
- Jiukai Sun, Ganchao Liu, Xuelong Li, Yuan Yuan:
Difference Guided VHR Remote Sensing Image Change Detection. 1-5
- Ryuichi Yamamoto, Reo Yoneyama, Tomoki Toda:
NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit. 1-5
- Yuya Nishi, Takumi Takahashi, Hiroki Iimori, Giuseppe Abreu, Shinsuke Ibi, Seiichi Sampei:
Wireless Location Tracking via Complex-Domain Super MDS with Time Series Self-Localization Information. 1-5
- Adarsh M. Subramaniam, Akshayaa Magesh, Venugopal V. Veeravalli:
Adaptive Step-Size Methods for Compressed SGD. 1-5
- Khoa Anh Ngo, Kyuhong Shim, Byonghyo Shim:
Spatial Cross-Attention for Transformer-Based Image Captioning. 1-5
- Tong Lei, Zhongshu Hou, Yuxiang Hu, Wanyu Yang, Tianchi Sun, Xiaobin Rong, Dahan Wang, Kai Chen, Jing Lu:
A Low-Latency Hybrid Multi-Channel Speech Enhancement System For Hearing Aids. 1-2
- Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-End Spoken Language Understanding with Tree-Constrained Pointer Generator. 1-5
- Anastasia Kuznetsova, Aswin Sivaraman, Minje Kim:
The Potential of Neural Speech Synthesis-Based Data Augmentation for Personalized Speech Enhancement. 1-5
- Sarbani Ghose, Deepak Mishra, Santi P. Maity, George C. Alexandropoulos:
RIS Reflection and Placement Optimisation for Underlay D2D Communications in Cognitive Cellular Networks. 1-5
- Tianyu Geng, Feng Ji, Pratibha, Wee Peng Tay:
Modulo EEG Signal Recovery Using Transformer. 1-5
- Bach-Tung Pham, Ting-Yu Wang, Phuong Le Thi, Khai-Thinh Nguyen, Yuan-Shan Lee, Tzu-Chiang Tai, Jia-Ching Wang:
Dense Adversarial Transfer Learning Based On Class-Invariance. 1-5
- Yuang Li, Xianrui Zheng, Philip C. Woodland:
Self-Supervised Learning-Based Source Separation for Meeting Data. 1-5
- Gerrit Maus, Dieter Brückmann:
Joint Angle and Respiration Estimation for Passive and Device-Free Respiration Monitoring. 1-5
- Yingting Li, Ambuj Mehrish, Rishabh Bhardwaj, Navonil Majumder, Bo Cheng, Shuai Zhao, Amir Zadeh, Rada Mihalcea, Soujanya Poria:
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding. 1-5
- Kartik Audhkhasi, Brian Farris, Bhuvana Ramabhadran, Pedro J. Moreno:
Modular Conformer Training for Flexible End-to-End ASR. 1-5
- Zihan Zhang, Shimin Zhang, Mingshuai Liu, Yanhong Leng, Zhe Han, Li Chen, Lei Xie:
Two-Step Band-Split Neural Network Approach For Full-Band Residual Echo Suppression. 1-2
- Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
Learning Speech Representations with Flexible Hidden Feature Dimensions. 1-5
- Vikram Krishnamurthy:
Adaptive Filtering Algorithms For Set-Valued Observations-Symmetric Measurement Approach To Unlabeled And Anonymized Data. 1-5
- Dianlong You, Houlin Wang, Bingxin Liu, Yang Yu, Zhiming Li:
DL-NET: Dilation Location Network for Temporal Action Detection. 1-5
- Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang:
Visual Information Matters for ASR Error Correction. 1-5
- Xiangping Zheng, Xun Liang, Bo Wu, Junlan Feng, Yuhui Guo, Sensen Zhang:
Intent Does Matter! Propagating High-Order Relations for Exploring Interest Preferences. 1-5
- Tom O'Malley, Shaojin Ding, Arun Narayanan, Quan Wang, Rajeev Rikhye, Qiao Liang, Yanzhang He, Ian McGraw:
Conditional Conformer: Improving Speaker Modulation For Single And Multi-User Speech Enhancement. 1-5
- Qin Lu, Konstantinos D. Polyzos:
Gaussian Process Dynamical Modeling for Adaptive Inference Over Graphs. 1-5
- Sakila S. Jayaweera, Beibei Wang, Xiaolu Zeng, Wei-Hsiang Wang, K. J. Ray Liu:
WIFI-Based Robust Child Presence Detection for Smart Cars. 1-5
- Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
The Pipeline System of ASR and NLU with MLM-based data Augmentation Toward Stop Low-Resource Challenge. 1-2
- Steven Vander Eeckt, Hugo Van hamme:
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition. 1-5
- Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao:
Deep Implicit Distribution Alignment Networks for cross-Corpus Speech Emotion Recognition. 1-5
- Byeonggeun Kim, Jun-Tae Lee, Seunghan Yang, Simyung Chang:
Scalable Weight Reparametrization for Efficient Transfer Learning. 1-5
- Mohammad Reza Hasanabadi, Majid Behdad, Davood Gharavian:
MFCCGAN: A Novel MFCC-Based Speech Synthesizer Using Adversarial Learning. 1-5
- Mingming Zhang, Ye Du, Zhenghui Hu, Qingjie Liu, Yunhong Wang:
BISVP: Building Footprint Extraction Via Bidirectional Serialized Vertex Prediction. 1-5
- Naman Khetan, Tushar Arora, Samee Ur Rehman, Deepak K. Gupta:
Implicitly Rotation Equivariant Neural Networks. 1-5
- Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion for Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. 1-5
- Yukun Zhang, Chuan Wang, Sanyi Zhang, Xiaochun Cao:
A Database for Multi-Modal Short Video Quality Assessment. 1-5
- Chakka Sai Pradeep, Neelam Sinha, Banibrata Mukhopadhyay:
Measuring Deviation from Stochasticity in Time-Series Using Autoencoder Based Time-Invariant Representation: Application to Black Hole Data. 1-5