


APSIPA 2024: Macau
- Asia Pacific Signal and Information Processing Association Annual Summit and Conference, APSIPA ASC 2024, Macau, December 3-6, 2024. IEEE 2025, ISBN 979-8-3503-6733-1
- Tsun-Hin Cheung, Ka-Chun Fung, Songjiang Lai, Kwan-Ho Lin, Vincent T. Y. Ng, Kin-Man Lam:
Automatic Prompt Generation and Grounding Object Detection for Zero-Shot Image Anomaly Detection. 1-6 - Vu-An Hoang, Minh-Hanh Tran, Viet Hang Dao, Thanh-Hai Tran:
GILED: Lesion Detection of Gastrointestinal Tract from Endoscopic Images and Medical Notes. 1-6 - Yutsuki Takeuchi, Taishi Nakashima, Nobutaka Ono, Takashi Takazawa, Shuhei Shimanoe, Yoshinori Tsuchiya:
Experimental Evaluation of Speech Enhancement for In-Car Environment Using Blind Source Separation and DNN-based Noise Suppression. 1-6 - Wataru Nakata, Takaaki Saeki, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
NecoBERT: Self-Supervised Learning Model Trained by Masked Language Modeling on Rich Acoustic Features Derived from Neural Audio Codec. 1-6 - Gen Sato, Yusuke Ikeda:
Data-Driven Physics-Informed Neural Network for Sound Field Estimation in Rooms of Arbitrary Size. 1-5 - Jong In Kim, Sunhee Kim, Minhwa Chung:
Generating Phonetic Transcriptions for Korean English L2 Learners Using Multiple Self-Supervised-Model-Based ASR Systems and Rover Method. 1-6 - Jiaming Zhang, Jijie Wu, Xiaoxu Li:
Visual semantic alignment network based on pre-trained ViT for few-shot image classification. 1-6 - Song-Jiang Lai, Tsun-Hin Cheung, Ka-Chun Fung, Tian-Shan Liu, Kin-Man Lam:
An End-to-End Two-Stream Network Based on RGB Flow and Representation Flow for Human Action Recognition. 1-6 - Libo Zhang, Yuxuan Han, Wenbin Lin, Jingwang Ling, Feng Xu:
PRTGaussian: Efficient Relighting Using 3D Gaussians with Precomputed Radiance Transfer. 1-6 - Woon-Seng Gan, Santi Peksi, Chung Kwan Lai, Yen Theng Lee, Dongyuan Shi, Bhan Lam:
A Real-Time Platform for Portable and Scalable Active Noise Mitigation for Construction Machinery. 1-6 - Daisuke Minami, Kiyoshi Nishikawa:
YOLO for High Resolution Images without Retraining. 1-6 - Huiyong Bak, Changhyeon Jeong:
Effective Speech Data Augmentation Method To Improve Customer Service Representative Speech Recognition System Performance. 1-5 - Arth J. Shah, Hemant A. Patil:
Significance of Lower Frequency Regions for Audio Deepfake Detection. 1-6 - Sibusiso Reuben Bakana, Yongfei Zhang, Bhekisipho Twala:
WildPose: HRNet-based Lightweight and Efficient Wildlife Pose Estimation. 1-6 - Shogo Seki, Li Li:
Inference Efficient Source Separation Using Input-dependent Convolutions. 1-5 - Pham Minh Tuan, Mouloud Adel, Nguyen Linh Trung, Eric Guedj:
Does Brain Atlas Choice Matter? An Empirical Study in Alzheimer's Diagnosis Using FDG-PET Images. 1-6 - Fan Zhang, Jacob Benesty, Chao Pan, Jingdong Chen:
New Perspectives and Insights on Distortionless Microphone Array Beamforming. 1-5 - Kyungjune Lee, Mingyu Jang, Jungwoo Huh, Jeonghaeng Lee, Seokkeun Choi, Sanghoon Lee:
MYMV: A Music Video Generation System with User-preferred Interaction. 1-4 - Ruxin Zheng, Saeid Sanei:
Separation of Cardiopulmonary Sound Signals for Classification of Respiratory Diseases. 1-6 - Benjamin Yen, Kazuhiro Nakadai:
Drone audition: implementation of an indoor multi-drone system for sound source tracking. 1-6 - Chung-Wen Wu, Berlin Chen:
Layer-Wise Feature Distillation with Unsupervised Multi-Aspect Optimization for Improved Automatic Speech Assessment. 1-5 - Jingyuan Tang, Songlin Sun:
Forward Prediction-Guided Cross-Partition Targeted Pruning for VVenC. 1-6 - Pengyu Cheng, Zhenhua Ling, Meng Meng, Yujun Wang:
Disentangling Speaker Representations from Intuitive Prosodic Features for Speaker-Adaptative and Prosody-Controllable Speech Synthesis. 1-6 - Mingjun Zhang, Yan Feng, Yu Gao, Longting Xu:
Non-Target Conversion Based Speech Steganography for Secure Speech Communication System. 1-6 - Aoto Yasue, Benjamin Yen, Katsutoshi Itoyama, Kazuhiro Nakadai:
LCMV-based Scan-and-Sum Beamforming for Region Source Extraction. 1-6 - Rintaro Takata, Yoshikazu Washizawa:
Complex CNN incorporating Hilbert transform for steady-state visual evoked potential BCI. 1-6 - Van-De Nguyen, Minh-Huong Hoang Dang, Quang-Huy Nguyen, Manh Cuong Dinh, Thanh-Ha Do:
Enhancing Cell Segmentation using Deep Learning Models by Custom Processing Techniques. 1-5 - Changsheng Chen, Xijin Li:
Robust Image Watermarking Scheme under Halftone Distortion with Surrogate Model. 1-6 - Thi-Loan Pham, Gia-Minh Pham, Tien-Dat Nguyen, Van-Hung Le, Thi-Lan Le, Duy-Hai Vu, Hai Vu, Chi-Mai Pham, Thanh-Hai Tran:
Data Augmentation and Assessment for Enhanced Ovarian Tumor Classification. 1-6 - Haopeng Geng, Daisuke Saito, Nobuaki Minematsu:
A Pilot Study of Applying Sequence-to-Sequence Voice Conversion to Evaluate the Intelligibility of L2 Speech Using a Native Speaker's Shadowings. 1-6 - Hengyi Zou, Sayaka Shiota:
Vocal Tract Length Perturbation-based Pseudo-Speaker Augmentation Considering Speaker Variability for Speaker Verification. 1-6 - Kangjian Huang, Yan Yang, Yongquan Jiang, Xiaobo Zhang, Zhuyi Angelina Li:
AFSDet: Video Small Object Detection Based on Adaptive Focused Slicing. 1-6 - Yizhou Peng, Eng Siong Chng:
Optimizing Multi-Speaker Speech Recognition with Online Decoding and Data Augmentation. 1-6 - Chihiro Watanabe, Hirokazu Kameoka:
GE2E-AC: Generalized End-to-End Loss Training for Accent Classification. 1-6 - Shingo Takemoto, Shunsuke Ono:
Rotation Invariant Spatio-Spectral Total Variation for Hyperspectral Image Denoising. 1-6 - Weiyi Xia, Satoru Fujita:
Cuisine Image Synthesis with Improved Multiscale GANs Guided by CLIP. 1-6 - Nanako Imaichi, Toru Nakashika:
Gamma-VAE: Speech representation based on VAE assuming gamma distribution for both latent variables and observation. 1-6 - Naijian Cao, Renjie He, Yuchao Dai, Mingyi He:
LoFLAT: Local Feature Matching using Focused Linear Attention Transformer. 1-6 - Hiroto Horimoto, Ryusei Kimura, Takahiro Tanaka, Shogo Okada:
Psychological Driving Style Estimation from GPS Sensor Data Alone. 1-6 - Seyun Um, Miseul Kim, Doyeon Kim, Hong-Goo Kang:
Bluemarble: Bridging Latent Uncertainty in Articulatory-to-Speech Synthesis with a Learned Codebook. 1-6 - Mingzhou He, Haojie Wang, Shuchang Zhou, Qingbo Wu, King Ngi Ngan, Fanman Meng, Hongliang Li:
Inertial Strengthened CLIP model for Zero-shot Multimodal Egocentric Activity Recognition. 1-6 - Guangwei Zhang, Yongping Xiong, Ruifan Li:
A Noisy Context Optimization Approach for Chinese Spelling Correction. 1-6 - Toru Takahashi, Eita Morigaki, Masato Nakayama:
Impulse response transforming method to control distance perception based on direct-to-reverberant energy ratio. 1-6 - Kenta Iwai, Takanobu Nishiura:
Performance Evaluation of Acoustic Echo and Noise Canceller with Variable-Step-Size Shared-Error NLMS Algorithm under Double-Talk Conditions. 1-5 - Longting Xu, Mingjun Zhang, Wenbin Zhang, Tianyi Wang, Jiawei Yin, Yu Gao:
Personal Voice Activity Detection With Ultra-Short Reference Speech. 1-6 - Naoki Koga, Yoshiaki Bando, Keisuke Imoto:
LEAD Dataset: How Can Labels for Sound Event Detection Vary Depending on Annotators? 1-6 - Yuhang Yang, Yizhou Peng, Hao Huang, Eng Siong Chng, Xionghu Zhong:
Adapting OpenAI's Whisper for Speech Recognition on Code-Switch Mandarin-English SEAME and ASRU2019 Datasets. 1-6 - Sota Hirata, Norihiro Takamune, Kouei Yamaoka, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo:
Auxiliary-Function-Based Steering Vector Estimation Method for Spatially Regularized Independent Low-Rank Matrix Analysis. 1-6 - Siddharth Harsh Y. Malhotra, Sapan H. Mankad:
Audio Similarity Detection. 1-6 - So-Yeon Jang, Jong-Ok Kim:
Color Guided Disease Segmentation for Plant Images. 1-6 - Xuping Huang, Akinori Ito:
A Study on Variable Embedding Locations of Reversible Spectral Speech Watermarking. 1-6 - Kai-Wei Huang, Chia-Ping Chen:
Long Audio File Speaker Diarization with Feasible End-to-End Models. 1-6 - Hong-Jie Hu, Yu-Chiao Lai, Chia-Ping Chen:
Enhancing Branchformer with Dynamic Branch Merging Module for Code-Switching Speech Recognition. 1-6 - Satoru Fujita, Keizo Oyama:
Learning a Sequence of Cursive-Style Japanese Characters in Classical Literary Works. 1-6 - Yike Chen, Yuru Song, Peijia Zheng, Yusong Du, Weiqi Luo:
Privacy-Preserving Anomaly Detection in Bitstream Video based on Gaussian Mixture Model. 1-6 - Tsugumasa Yutani, Yuya Yamamoto, Shuyo Nakatani, Hiroko Terasawa:
Wavetable Synthesis Using CVAE for Timbre Control Based on Semantic Label. 1-6 - Zuhai Zhang, Luheng Jia, Li Song, Shuyuan Zhu, Yuanfang Guo, Kebin Jia:
Dictionary Learning Based Two-stage Near-lossless Video Compression. 1-6 - Shogo Mito, Miho Miyajima, Hirofumi Tomioka, Hitomi Sato, Takashi Takeuchi, Hitoshi Muto, Yuji Kabasawa, Hiroyuki Harada, Kana Eguchi, Shota Kato, Manabu Kano:
Postoperative Delirium Prediction Based on Preoperative Electrocardiogram and Electroencephalogram. 1-5 - Mengting Chen, Ziping Zhao:
Sparse Blind Deconvolution and Demixing via Block Majorization-Minimization. 1-6 - Xingyu Shen, Wei-Ping Zhu:
Multichannel Speech Enhancement Using Complex-Valued Graph Convolutional Networks and Triple-Path Attentive Recurrent Networks. 1-6 - Po-Cheng Chan, Chung-Li Lu, Jia-Ching Wang:
Detecting Abnormal Machine Sounds Using An Ensemble Approach with Data Augmentation Techniques. 1-4 - Rei Aso, Sayaka Shiota, Hitoshi Kiya:
Disposable-key-based image encryption for collaborative learning of Vision Transformer. 1-6 - Geeta Sai Sahasra, Kadwasra Swapna, Arushi Srivastava, Aditya Pusuluri, Hemant A. Patil:
Comparative Analysis of Glottal and Vocal Tract Features in Dysarthria. 1-6 - Keitaro Yamashita, Kazuki Naganuma, Shunsuke Ono:
Generalized Graph Signal Sampling under Subspace Priors by Difference-of-Convex Minimization. 1-6 - Yuanxi Lin, Yuriy Evgenyevich Gapanyuk:
Frequency & Channel Attention Network for Small Footprint Noisy Spoken Keyword Spotting. 1-6 - Jia-Liang Lu, Bi-Cheng Yan, Yi-Cheng Wang, Tien-Hong Lo, Hsin-Wei Wang, Li-Ting Pai, Berlin Chen:
EADSum: Element-Aware Distillation for Enhancing Low-Resource Abstractive Summarization. 1-6 - Tsubasa Yano, Benjamin Yen, Kazuhiro Nakadai:
Drone audition: dataset and methods for ground surface material classification using drone noise in outdoor environment. 1-6 - Yuto Ishikawa, Osamu Take, Tomohiko Nakamura, Norihiro Takamune, Yuki Saito, Shinnosuke Takamichi, Hiroshi Saruwatari:
Real-Time Noise Estimation for Lombard-Effect Speech Synthesis in Human-Avatar Dialogue Systems. 1-6 - Sara Kashiwagi, Keitaro Tanaka, Shigeo Morishima:
Capturing Dynamic Identity Features for Speaker-Adaptive Visual Speech Recognition. 1-6 - Sheng Li, Yuka Ko, Akinori Ito:
LLM as decoder: Investigating Lattice-based Speech Recognition Hypotheses Rescoring Using LLM. 1-5 - Sifan Wu, Li Dong, Diqun Yan, Rangding Wang:
Normalizing Flows-Based Latent Variable Rearrangement for Generative Image Steganography. 1-6 - Beizuo Zhu, Kazunori Hayashi, Hiroki Mori:
Reduced-dimensional MUSIC Algorithm for Frequency Diverse Array in MIMO Radar System. 1-8 - Yen-Chou Pan, Yih-Liang Shen, Yuan-Fu Liao, Tai-Shih Chi:
Band-Split Inter-SubNet: Band-Split with Subband Interaction for Monaural Speech Enhancement. 1-6 - Akumalla Brahma Reddy, Bach-Tung Pham, Tung-Yu Zhuang, Bima Paristao, Pao-Chi Chang, Jia-Ching Wang:
Leveraging Attention Mechanisms for Breast Cancer Diagnosis. 1-4 - Zekun Yang, Jiajun He, Tomoki Toda:
Multi-Modal Video Summarization Based on Two-Stage Fusion of Audio, Visual, and Recognized Text Information. 1-6 - Hana Lebeta Goshu, Jun Xiao, Kin-Chung Chan, Cong Zhang, Mulugeta Tegegn Gemeda, Kin-Man Lam:
NeRF-FCM: Feature Calibration Mechanisms for NeRF-based 3D Object Detection. 1-6 - Jing Liang, Libo Wang, Peiya Li:
Fine-Grained Privacy-Preserving Image Retrieval in Cloud Environment. 1-6 - Hang Sheng, Qinji Shu, Hui Feng, Bo Hu:
Subset Random Sampling of Finite Time-vertex Graph Signals. 1-6 - Rohini Sri Mannepalli, Aditya Pusuluri, Hemant A. Patil:
Dysarthria Severity Classification Using Phase Based Features of LP Residual. 1-5 - Divesh Lala, Koji Inoue, Haruki Kawai, Zi Haur Pang, Mikey Elmers, Tatsuya Kawahara:
Development and evaluation of a semi-autonomous parallel attentive listening system. 1-6 - Michaël Antonie van Wyk, André Martin McDonald, David M. Rubin, Fangfang Zhang:
Novel Estimators for the Number of Susceptible Individuals in SIR Models of Infectious Epidemics. 1-6 - Fan Zhang, Chao Pan, Jingdong Chen, Jacob Benesty:
Low-Complexity Adaptive Beamformer for Joint Reverberation and Noise Suppression. 1-5 - Keiji Yamadera, Michiharu Niimi:
Improved Ultimate Link without Markers for Projective Transformation. 1-6 - Haibin Wu, Huang-Cheng Chou, Kai-Wei Chang, Lucas Goncalves, Jiawei Du, Jyh-Shing Roger Jang, Chi-Chun Lee, Hung-Yi Lee:
Empower Typed Descriptions by Large Language Models for Speech Emotion Recognition. 1-6 - Hau Joan, Yiqi Tew, Li Peng Tan:
Innovative Information Hiding in H.266/VVC using Sub-Block Transform Technique. 1-6 - Duhyun Kim, Jae-Young Sim:
Confidence-Aware Learning for Person Re-identification with Noisy Labels. 1-5 - Koichi Nishikawa, Shinsuke Ibi, Takumi Takahashi, Hisato Iwai:
Blind Self-Interference Analog Canceller with Differential Delay for Backscatter Communications. 1-6 - Trio Adiono, Erwin Setiawan, Michael Jonathan, Rahmat Mulyawan, Nana Sutisna, Infall Syafalni, Wasiu O. Popoola:
A Configurable OFDM Baseband Processor for RF-UOWC System-on-Chip. 1-4 - Justin Tomoya Wulf, Tetsuro Kitahara:
Analyzing House Music: Relations of Audio Features and Musical Structure. 1-5 - Jinzhuo Yao, Hongqing Liu, Yi Zhou, Lu Gan, Junkang Yang:
Diverse Time-Frequency Attention Neural Network for Acoustic Echo Cancellation. 1-6 - Quoc Anh Le, Hong-Thinh Nguyen:
New approach for Alzheimer's disease classification using topographic maps and deep learning model. 1-6 - Malik Akbar Hashemi Rafsanjani, Candy Olivia Mawalim, Dessi Puji Lestari, Sakriani Sakti, Masashi Unoki:
Unsupervised Anomalous Sound Detection Using Timbral and Human Voice Disorder-Related Acoustic Features. 1-6 - Ryosuke Onizawa, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Physics-Informed Neural Networks for Estimation of Scattered Sound Fields with Boundary Condition. 1-5 - Nao Harada, Rinka Kawano, Masaki Kawamura:
Proposal of Blind Extractable Additive Video Watermarking Method. 1-6 - Daimin Shi, Xiaoyong Lu, Yang Liu, Jingyi Yuan, Tao Pan:
Speech Depression Recognition from the Selfreference Effect Using LSTM with ResNet. 1-5 - Hung-Phong Tran, Thi-Hoai Phan, Thuy-Binh Nguyen, Thi-Ngoc-Diep Do, Hong-Quan Nguyen, Thanh-Hai Tran, Hien-Thanh Duong, Thi-Lan Le:
M-IRRA: A multilingual model for Text-based Person Search. 1-6 - Ryota Imanaka, Yuting Geng, Masato Nakayama, Takanobu Nishiura:
Augmented sound-image perception using pre-virtual-leading ultrasounds based on precedence effect. 1-6 - Tsung-Shan Yang, Yun-Cheng Wang, Chengwei Wei, Suya You, C.-C. Jay Kuo:
GMA: Green Multi-Modal Alignment for Image-Text Retrieval. 1-6 - Junda Zhu, Shisheng Guo, Longzhen Tang, Guolong Cui:
Multi-Channel Fusion Human Activity Recognition Algorithm Based on Millimeter-Wave Radar. 1-6 - Kwok Chin Yuen, Sheng Li, Jia Qi Yip, Engsiong Chng:
Low-resource Language Adaptation with Ensemble of PEFT Approaches. 1-6 - Hiromi Shidara, Kanta Miura, Takuro Ishii, Koichi Ito, Takafumi Aoki, Yoshifumi Saijo, Jun Ohmiya:
Performance Improvement of Single Plane-Wave Imaging Using U-Net and Discrete Wavelet Transform. 1-6 - Hualin Ren, Christian H. Ritz, Jiahong Zhao, Xiguang Zheng, Daeyoung Jang:
Generating Room Impulse Responses Using Neural Networks Trained with Weighted Combinations of Acoustic Parameter Loss Functions. 1-6 - Liwen Tang, Dingchang Zheng, Fei Chen:
Iterative Demographic Attentional Feature Fusion-based CNN and Transformer Network for Accurate Cuffless Blood Pressure Estimation. 1-5 - Yuxin Wang, Shuolin Yang, Qianxi Wu, Zhishuo Zhang, Yunxia Liu, Yang Yang, Yakui Dong, Cheng Fei, Junliang Liu, Lili Wang, Shuzhen Fan, Yongfu Li:
A Semi-supervised Low-Light Image Enhancement with Color Guidance. 1-6 - Xiangjie Sui, Shiqi Wang, Yuming Fang:
A Survey on Objective Quality Assessment of Omnidirectional Images. 1-6 - Xingfeng Li, Xiaohan Shi, Yuke Si, Zilong Zhang, Feifei Cui, Yongwei Li, Yang Liu, Masashi Unoki, Masato Akagi:
BEES: A New Acoustic Task for Blended Emotion Estimation in Speech. 1-6 - Jiawei Yin, Wenbin Zhang, Mingjun Zhang, Yu Gao:
Self-Supervised Augmented Diffusion Model for Anomalous Sound Detection. 1-5 - Koki Aoyama, Koichi Adachi:
Collection of Correlated Information from Superimposed Multiple Chirp Signals. 1-6 - Hualin Ren, Christian H. Ritz, Jiahong Zhao, Xiguang Zheng, Daeyoung Jang:
Towards a B-format Ambisonic Room Impulse Response Generator Using Conditional Generative Adversarial Network. 1-6 - Shao-Yun Luo, Kuei-Chen Chen, Jian-Jiun Ding, Cheng-Che Lee, Hsin-Jung Lee:
High and Low Frequency Region Separation Method for Adaptive Image Expansion. 1-6 - Yujin Han, Taewan Kim:
New Abnormal Behavior Detection for Patient Surveillance System. 1-5 - Jae Hoon Shim, Min Woo Kim, Tae Gyu Lim, Byungseok Min, Sang Hwa Lee, Nam Ik Cho:
Enhancing Semiconductor X-RAY Images: A Framework Combining Denoising and Super-Resolution Modules With a Novel Dataset. 1-6 - Kyoka Kazama, Taishi Nakashima, Nobutaka Ono:
Measurement of Relative Transfer Function for Own Voice in Head-Mounted Microphone Array. 1-5 - Hayata Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura:
Sound Quality Improvement in Visual Microphone by Emphasizing Focused Area Based on Focal Rate. 1-6 - Aulia Adila, Candy Olivia Mawalim, Masashi Unoki:
Detecting Spoof Voices in Asian Non-Native Speech: An Indonesian and Thai Case Study. 1-6 - Junting Wang, Satoko Koganemaru, Atsushi Shima, Yedi Cao, Kana Hirakawa, Ken Iwagana, Atsushi Suehiro, Keiko Maekawa, Tatsuya Mima, Yumie Ono:
Effect of Phase-Locked Transcranial Alternating Current Stimulation on Vocal tremor. 1-6 - Kun-Lin Tsai, Chao-Ting Huang:
Optimizing Computational Efficiency: In-Memory Computing with Dynamic Switching. 1-6 - Kazuki Naganuma, Shunsuke Ono:
Hyperspectral Unmixing With Row-Sparsity Enhancement: A Difference-of-Convex Approach. 1-5 - Ryusei Terui, Takeshi Yamada:
Speech emotion recognition based on crossmodal transformer and attention weight correction. 1-5 - Yuanyang Qi, Saeid Sanei:
Murmur Separation and Classification from Heart Sound Using Constrained Singular Spectrum Analysis and Wavelet Transform. 1-5 - Po Cheng Chan, Wei-Yu Chen, Chung Li Lu, Hsiang-Feng Chuang, Yu-Han Cheng, Jia-Ching Wang:
Integrating VGGSK and BEATs for Enhanced Sound Event Detection: A Semi-Supervised GRU-Based System with Weak Labels and Synthetic Soundscapes. 1-5 - James Gong, Bruce Li, Waleed Abdulla:
Optimising Neural Networks with Fine-Grained Forward-Forward Algorithm: A Novel Backpropagation-Free Training Algorithm. 1-6 - Li Du, Chao Pan, Lijun Zhang:
Wind Noise Reduction with Orthogonal Polynomial Expansion. 1-5 - Lo-Ya Li, Tien-Hong Lo, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:
Few-Shot Open-Set Keyword Spotting with Multi-Stage Training. 1-5 - Kenta Takahashi, Wataru Nakamura:
A Quasilinear-Time CVP Algorithm for Triangular Lattice Based Fuzzy Extractors and Fuzzy Signatures. 1-4 - Ji Qi, Huisheng Wang, H. Vicky Zhao:
ViP-CBM: Reducing Parameters in Concept Bottleneck Models by Visual-Projected Embeddings. 1-6 - Takatoshi Obata, Osamu Takyu, Kei Inage, Takeo Fujii, Kohei Yoshida, Masayuki Ariyoshi:
Observation of the Terrestrial Radio Environment Using the Low Earth Orbit Satellite Constellation. 1-5 - Huang-Cheng Chou:
A Tiny Whisper-SER: Unifying Automatic Speech Recognition and Multi-label Speech Emotion Recognition Tasks. 1-6 - Nimol Thuon, Jun Du:
KhmerFormer: Multi-Scale CNNs-Transformer with External Attention for Ancient Khmer Palm Leaf Isolated Glyph Classification. 1-6 - Zihang Lyu, Jun Xiao, Cong Zhang, Kin-Man Lam:
AI-generated image detectors are surprisingly easy to mislead... for now. 1-5 - Meet H. Soni, Ashish Panda, Sunil Kumar Kopparapu:
Generalized SpecAugment: Robust Online Augmentation Technique for End-to-End Automatic Speech Recognition. 1-5 - Arth J. Shah, Prathav Kevadiya, Hemant A. Patil:
Pop Noise Detection Using Group Delay Cepstral Coefficients. 1-6 - Chengzhe Shi, Wensheng Pan, Wanzhi Ma, Ying Liu, Qiang Xu, Zhiya Zhang, Shihai Shao:
A High-Isolation Sub-6 GHz In-Band Full-Duplex Communication System. 1-6 - Zhentao Lin, Zihao Chen, Bi Zeng, Leqi Chen, Jia Cai:
Performance Optimization in the Cascade of VAD and ASR Systems: A Study on Evaluation and Alignment Strategies. 1-6 - Liyuan Zhang, Xianrui Wang, Yichen Yang, Tetsuya Ueda, Shoji Makino, Jingdong Chen:
Heavy-tailed Distributions-Based Online Semi-blind Source Separation for Nonlinear Echo Cancellation. 1-5 - Yue-Yang He, Bi-Cheng Yan, Tien-Hong Lo, Meng-Shin Lin, Yung-Chang Hsu, Berlin Chen:
JAM: A Unified Neural Architecture for Joint Multi-granularity Pronunciation Assessment and Phone-level Mispronunciation Detection and Diagnosis Towards a Comprehensive CAPT System. 1-6 - Nutchanon Siripool, Suradej Duangpummet, Jessada Karnjana, Waree Kongprawechnon, Masashi Unoki:
Blind Estimation of Room Volume from Reverberant Speech Based on the Modulation Transfer Function. 1-6 - Zhe Xiao, Zongqi He, Zhuoning Xu, Yunze Li, Zelin Song, Calvin Leighton, Li Wang, Shanru Liu, Shiun Yee Wong, Wenfeng Huang, Wenjing Jia, Kin-Man Lam:
A Multi-Perceptual Learning Network for Retina OCT Image Denoising and Classification. 1-6 - Xue Yang, Changchun Bao, Xu Zhang, Xianhong Chen:
Target Speaker Extraction Method by Emphasizing the Active Speech with an Additional Enhancer. 1-6 - Yuto Ashikawa, Takashi Ito, Shohei Ishizu, Yosuke Kurihara:
A method for classification NEO-FFI answers fabricated and advantageous due to psychological bias using brainwave specific brain activity networks. 1-4 - Jintang Xue, Yun-Cheng Wang, Chengwei Wei, C.-C. Jay Kuo:
Efficient Feature Selection for Word Embedding Dimension Reduction. 1-6 - Felix Ming-Fei Duan, Wan-Chi Siu, Chun Chuen Hui:
New approach on Smiling faces with Domain Transfer in Latent Space. 1-5 - Xinyu Wang, Hong-Shuo Chen, Zhiruo Zhou, Suya You, Azad M. Madni, C.-C. Jay Kuo:
Green Video Camouflaged Object Detection. 1-6 - Junkang Yang, Hongqing Liu, Lu Gan, Yi Zhou, Xing Li, Jie Jia, Jinzhuo Yao:
SDNet: Noise-Robust Bandwidth Extension under Flexible Sampling Rates. 1-6 - Yuhang Zhang, Yuanman Li, Li Dong, Xia Li:
Robust Watermarking via Dual Guidance. 1-6 - Shu-Ping Chang, Cheng-Che Lee, Hsin-Jung Lee, Chieh-Hsiung Kuan, Jason Gemsun Young, Chia-Yu Yao, Jian-Jiun Ding:
An Annealing-Inspired Gradient-Descent Based Suboptimal Solver for Combinatorial Problems. 1-6 - Tomoaki Mizuno, Takuya Kishida, Natsue Yoshimura, Toru Nakashika:
An Investigation on the Speech Recovery from EEG Signals Using Transformer. 1-6 - Xiaoqing Tong, Kazunori Hayashi:
Deep Unfolding Aided Parameter Optimization for Multi-task Diffusion LMS Algorithm. 1-6 - Tomohiro Ariga, Reo Minakawa, Kazunori Kojima, Shi-wook Lee, Yoshiaki Itoh:
Keyword spotting for dialectal speech and Introduction of wav2vec2.0. 1-5 - Yoto Ikezaki, Yuting Geng, Masato Nakayama, Takanobu Nishiura:
Virtual multi-boosted amplitude modulation toward high-pressure audible sound with parametric array loudspeakers. 1-6 - Chenxing Li, Manjie Xu, Dong Yu:
SRC-gAudio: Sampling-Rate-Controlled Audio Generation. 1-6 - Teng-Kuan Huang, Mei-Chen Yeh:
Improving Semi-Supervised Object Detection by ROI-Enhanced Contrastive Learning. 1-6 - Seyun Um, Yongju Lee, WooSeok Ko, Yuan Zhou, Sangyoun Lee, Hong-Goo Kang:
EavaNet: Enhancing Emotional Facial Expressions in 3D Avatars through Speech-Driven Animation. 1-6 - Junwu Huang, Zhexiong Wan, Zhicheng Lu, Juanjuan Zhu, Mingyi He, Yuchao Dai:
Ev3DGS: Event Enhanced 3D Gaussian Splatting from Blurry Images. 1-6 - Yuru Song, Yike Chen, Peijia Zheng, Yusong Du, Weiqi Luo:
Secure Moving Object Detection Transformer in Compressed Video with Feature Fusion. 1-6 - Rashed Iqbal, Christian H. Ritz, Jack Yang, Sarah K. Howard:
Few-Shot Audio Classification Model for Detecting Classroom Interactions Using LaSO Features in Prototypical Networks. 1-6 - Mary Josy John, Imad Barhumi:
Handling Missing Data in Limited-View Photoacoustic Tomography Using Compressive Sensing Algorithm-Based Deep Learning. 1-6 - Jonghwan Na, Yeseul Park, Bowon Lee:
A Comparative Study on the Biases of Age, Gender, Dialects, and L2 speakers of Automatic Speech Recognition for Korean Language. 1-6 - Jinyi Mi, Xiaohan Shi, Ding Ma, Jiajun He, Takuya Fujimura, Tomoki Toda:
Two-stage Framework for Robust Speech Emotion Recognition Using Target Speaker Extraction in Human Speech Noise Conditions. 1-6 - Ken Kalang Al Qalyubi, Nur Ahmadi, Dessi Puji Lestari:
Comparative Evaluation of Fine-Tuned Hybrid Transformer and Band-Split Recurrent Neural Networks for Music Source Separation. 1-5 - Wing-Ho Cheng, Wan-Chi Siu, H. Anthony Chan:
High-Quality Facial Pose Generation with Latent Space Processing. 1-7 - Haonan Hu, Ziye Yang, Jie Chen, Lijun Zhang:
Speech Dereverberation with Deconvolution Regularized by Denoising. 1-6 - Mas Ira Syafila Mohd Hilmi Tan, Lai-Kuan Wong, Yuen Peng Loh, Chih-Yang Pee:
Enhancing Early Plant Disease Detection: 1D to 2D Spectral Transformations. 1-6 - Tetsuya Asakawa, Masashi Hashimoto, Takeshi Miyaji, Kazuki Shimizu, Kei Nomura, Masaki Aono:
Real-time Segmentation of Coronary Artery Calcification Using Spatial Attention and Parallel Convolution. 1-5 - Kento Masuda, Kazumasa Yamamoto, Seiichi Nakagawa:
Data Augmentation Methods and Influence of Speech Recognition Performance for TED Talk's English to Japanese Speech Translation. 1-6 - Nana Sutisna, Aditya Prawira Nugroho, Christopher Jeffrey, Patrick Amadeus Irawan, Rizky Ramadhana, Ronggur Mahendra, Michael Jonathan, Infall Syafalni, Trio Adiono:
Leveraging IoT and Machine Learning for Efficient Rice Stock Monitoring and Prediction. 1-6 - Trong-Duc Nguyen, Tien-Dung Do, Thanh-Ha Do:
Automated Pseudo-Label Generation and Parallel Computing for Enhanced Few-Shot Medical Image Segmentation. 1-6 - Menghan Li, Zhihua Huang:
WavLM and Omni-Scale CNNs: Enhancing Boundary Detection in Partially Spoofed Audio. 1-5 - Qing Feng, Zhiqiang Wu, Xuebin Li, Heping Shen, Liushang, Tangmin, Shengquan Feng:
Temporal-Spatial Correlation Analysis for Ship-Radiated Noise Based on Random Matrix Theory. 1-6 - Jen-Tzung Chien, Yi-Chien Wu:
Empathetic Response Generation via Regularized Q-Learning. 1-6 - Shuting Hao, Daisuke Saito, Nobuaki Minematsu:
Enhancing Acoustic Scene Classification with Layer-wise Fine-Tuning on the SSAST Model. 1-6 - Tianwei Zhang, Lianru Gao, Xu Sun, Lina Zhuang:
Tiny Object Detection Enhancement for Large-Scale Remote Sensing Imagery. 1-5 - Xinqi Jiang, Jinyu Tian:
Source Attribution for Images Generated by Diffusion-Based Text-to-Image Models: Exploring the Forensics Approach. 1-6 - Jia Qi Yip, Kwok Chin Yuen, Bin Ma, Engsiong Chng:
Speech Separation using Neural Audio Codecs with Embedding Loss. 1-6 - Koki Horikoshi, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Pressure Matching Using Data-Driven Estimation for Sound Fields and Transfer Functions. 1-5 - Siyang Qi, Fei Wang, Hongzhi Sun, Yang Ge, Bo Xiao:
GVDIE: A Zero-Shot Generative Information Extraction Method for Visual Documents Based on Large Language Models. 1-6 - Kohei Hayashi, Soichiro Honda, Hirokazu Kamei, Yoshihiro Maeda, Norishige Fukushima:
Contrast-Aware DCT for Image Enhancement with JPEG Compatible Coding. 1-6 - Tatsuro Inaba, Kazuyoshi Yoshii, Eita Nakamura:
On the Importance of Time and Pitch Relativity for Transformer-Based Symbolic Music Generation. 1-6 - Kazuki Yamato, Satoshi Ito:
Sampling Pattern Augmentation to Enhance Deep Learning-based Image Reconstruction of MRI. 1-6 - Kuan-Hsun Ho, En-Lun Yu, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:
GLASS: Investigating Global and Local context Awareness in Speech Separation. 1-6 - Hiroto Sawada, Shoko Imaizumi, Hitoshi Kiya:
Enhancing Security Using Random Binary Weights in Privacy-Preserving Federated Learning. 1-6 - Aquib Iqbal, Abid Hasan Zim, Md Asaduzzaman Tonmoy, Limengnan Zhou, Asad Malik, Minoru Kuribayashi:
EAViT: External Attention Vision Transformer for Audio Classification. 1-6 - Bach-Tung Pham, Pao-Chi Chang, Jia-Ching Wang:
Seismic-ionospheric Precursor Prediction Using Deep Learning. 1-4 - Eun-bin An, Ayoung Kim, Soon-Heung Jung, Hyon-Gon Choo, Kwang-deok Seo:
Adaptive Spatial Re-sampling Method for Video Coding for Machines. 1-4 - Wenze Ren, Yi-Cheng Lin, Huang-Cheng Chou, Haibin Wu, Yi-Chiao Wu, Chi-Chun Lee, Hung-Yi Lee, Hsin-Min Wang, Yu Tsao:
EMO-Codec: An In-Depth Look at Emotion Preservation Capacity of Legacy and Neural Codec Models with Subjective and Objective Evaluations. 1-6 - Divesh Lala, Koji Inoue, Tatsuya Kawahara:
Prediction of negative user reactions towards system responses during attentive listening. 1-6 - Daishi Tanaka, Michiharu Niimi:
Detection of Diffusion-Generated Images Using Sparse Coding. 1-6 - Takumi Nagawaki, Keisuke Ikeda, Kohei Chike, Hiroyuki Nagano, Masaki Nose, Satoshi Tamura:
Targeted Representation with Information Disentanglement Encoding Networks in Tasks. 1-5 - Yoto Fujita, Aditya Arie Nugraha, Diego Di Carlo, Yoshiaki Bando, Mathieu Fontaine, Kazuyoshi Yoshii:
Run-Time Adaptation of Neural Beamforming for Robust Speech Dereverberation and Denoising. 1-6 - Toshihiro Tsukagoshi, Kazuhiro Koiwai, Masafumi Nishida, Masafumi Nishimura:
SSL-based Chewing and Swallowing Detection Using Multiple Skin-contact Microphones. 1-5 - Rinka Kawano, Masaki Kawamura:
Estimation of rotation angle and anisotropic scaling rate using pilot signals for watermarking. 1-6 - Juhwan Yoon, Hyungseob Lim, Hyeonjin Cha, Hong-Goo Kang:
StylebookTTS: Zero-Shot Text-to-Speech Leveraging Unsupervised Style Representation. 1-6 - Wageesha Manamperi, Thushara D. Abhayapala:
Successive Speaker Relative Transfer Function Estimation Through Relative Transfer Matrix in Noisy Reverberant Environments. 1-6 - Keigo Ichikawa, Sei Ueno, Akinobu Lee:
Data generation for speaker diarization by speaker transition information. 1-5 - Takehiro Imamura, Yuka Hashizume, Tomoki Toda:
Multi-Task Learning Approaches for Music Similarity Representation Learning Based on Individual Instrument Sounds. 1-6 - Kazuhiro Nakadai, Makoto Kumon, Yoko Sasaki, Kotaro Hoshiba, Benjamin Yen:
Swarm Active Audition System with Robots and Drones for a Search and Rescue Task. 1-6 - Infall Syafalni, Angelica Winasta Sinisuka, Dwi Kalam Amal Tauhid, Farrel Ahmad, Muhammad Alif Putra Yasa, Steven Alexander Wen, Erwin Setiawan, Nana Sutisna, Trio Adiono:
Exploration Robot Based On YOLOv8 Algorithm. 1-5 - Kai Guo, Xiang Xie, Fengrun Zhang:
Annotation-free Fine-tuning for Unsupervised Anomalous Sound Detection. 1-5 - Fauzan Maftuh Alwafi, Boby Mugi Pratama, Phuong Thi Le, Bima Prihasto, Jia-Ching Wang:
Enhanced Detection of Illegally Parked Vehicles Using YOLO and Good Feature to Track Methods. 1-6 - Mai Ohta, Hiroki Matsuura, Takeo Fujii:
A Study on Packet-Level Index Modulation Using Frequency Offsets within a LoRaWAN Channel. 1-6 - Yuki Sato, Yuya Chiba, Ryuichiro Higashinaka:
Investigating the Language Independence of Voice Activity Projection Models through Standardization of Speech Segmentation Labels. 1-6 - Ginji Ohashi, Shinsuke Ibi, Takumi Takahashi, Hisato Iwai:
Data-Driven Tuning for Weighted Least Squares of BLE-AoA-based Indoor Localization. 1-6 - Vu Hoang Dung, Nguyen Trung Kien, Do Thanh Ha:
Enhanced Sparse Convolutional Detection Model for 3D Object Detection in Autonomous Vehicles Adapted to Traffic Conditions in Vietnam. 1-6 - Trio Adiono, Clarence Amadeus, Sindy Novaria Cicilya Sinaga, Teuku Rafifsyah Thomi:
Implementation of Real-Time Oscillometric Based Algorithm for Blood Pressure Measurement in Patient Monitor. 1-6 - Trio Adiono, Rd Elviana La'salina Muhlis, Clarence Amadeus, Sindy Novaria Cicilya Sinaga:
Development of Simple Algorithm to Detect and Filter Motion Artifact Noise in Non Invasive Blood Pressure (NIBP) Measurement. 1-6 - Ryotaro Nagase, Takashi Sumiyoshi, Natsuo Yamashita, Kota Dohi, Yohei Kawaguchi:
Can We Estimate Purchase Intention Based on Zero-shot Speech Emotion Recognition? 1-6 - Aditya Raikar, Meet H. Soni, Ashish Panda, Sunil Kumar Kopparapu:
Acoustic model adaptation in noisy and reverberated scenarios using multi-task learned embeddings. 1-5 - Kanishq Singhal, Aditya Goyal, Priyanka Gupta:
Quefrency Approach to Audio Deepfake Detection. 1-6 - Zhen-Xun Lee, Jian-Jiun Ding:
PBJDT: Point-Based Joint Detection-and-Tracking. 1-6 - Hangjing Zhang, H. Vicky Zhao:
Modeling and Analysis of the Interaction between Opinions and Actions among Heterogeneous Agents. 1-6 - Hoang-Son Bui, Sy-Hoang Tran, Thuy-Binh Nguyen, Thanh-Hai Tran, Hai Vu, Thi-Lan Le:
Marker-Aware Ovarian Tumor Segmentation from Ultrasound Images. 1-6 - Yuki Nakano, Yuting Geng, Kenta Iwai, Takanobu Nishiura:
Deep-Learning-Based Speech Enhancement with Rough-Focused Optical Laser Microphone by Reconstructing Complex Spectrum. 1-5 - Nischay Purnekar, Benedetta Tondi, Mauro Barni:
Physical Domain Adversarial Attacks Against Source Printer Image Attribution. 1-6 - Shu Komatsu, Akira Kubota:
Color Enhancement for the Colorblind Using Color Correction Intensity Map and Pix2pix Image Conversion. 1-5 - Hiya Chaudhari, Arth J. Shah, Hemant A. Patil:
Cross Lingual Speech Representation for Infant Cry Classification. 1-5 - Yi Zhang, FangYuan Liu, JiaJia Song, Qi Zeng, Hui He:
MTFNet: Multi-Scale Transformer Framework for Robust Emotion Monitoring in Group Learning Settings. 1-8 - Zezhong Jin, Youzhi Tu, Man-Wai Mak:
Joseph: phonetic-aware speaker embedding for far-field speaker verification. 1-6 - Naotaka Kawata, Shota Orihashi, Satoshi Suzuki, Tomohiro Tanaka, Mana Ihori, Naoki Makishima, Taiga Yamane, Ryo Masumura:
Block Refinement Learning for Improving Early Exit in Autoregressive ASR. 1-6 - Gaurav Hirani, Kevin I-Kai Wang, Waleed H. Abdulla:
Continual Learning with Self-Organizing Maps: A Novel Group-Based Unsupervised Sequential Training Approach. 1-6 - Ryuichi Hatakeyama, Kohei Okuda, Toru Nakashika:
DDPMVC: Non-parallel any-to-many voice conversion using diffusion encoder. 1-6 - Tianqin Zheng, Hanchen Pei, Ningning Pan, Jilu Jin, Gongping Huang, Jingdong Chen, Jacob Benesty:
A Single-Input/Binaural-Output Perceptual Rendering Based Speech Separation Method in Noisy Environments. 1-5 - Dipanita Chakraborty, Werapon Chiracharit, Kosin Chamnongthai, Minoru Okada:
Camera Focal Length Prediction for Neural Novel View Synthesis from Monocular Video. 1-5 - Jun-Seok Lee, Yun-Sung Lee, Han-Jeong Hwang:
Effect of Dynamic Binaural Beats on Concentration Enhancement. 1-4 - Koki Maruyama, Shun Sawada, Hidefumi Ohmura, Kouichi Katsurada:
Speech Synthesis from IPA Sequences through EMA Data. 1-6 - Koichi Ito, Taito Tonosaki, Takafumi Aoki, Tetsushi Ohki, Masakatsu Nishigaki:
Multibiometrics Using a Single Face Image. 1-6 - Takumi Yamamoto, Kotaro Hoshiba, Benjamin Yen, Kazuhiro Nakadai:
Implementation of a Robot Operation System-based network for sound source localization using multiple drones. 1-6 - Arth J. Shah, Nandini V. Mandaviya, Hemant A. Patil:
Voice Liveness Detection Using Linear Frequency Residual Cepstral Coefficients. 1-6 - A. Sumarudin, Nana Sutisna, Infall Syafalni, Bambang Riyanto Trilaksono, Trio Adiono:
Optimizing Deep Q-Network for Shortest Path Computation of Mobile Robot Agents. 1-6 - Takuto Wada, Ryohei Sasaki, Katsumi Konishi:
Adaptive Subspace Clustering for Matrix Completion. 1-5 - Keiko Kawase, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Data-Driven Sound Field Reproduction for Higher-Order Mode Matching Using a Circular Loudspeaker Array. 1-5 - Jianan Chen, Chenhui Chu, Sheng Li, Tatsuya Kawahara:
Data Selection using Spoken Language Identification for Low-Resource and Zero-Resource Speech Recognition. 1-6 - Beom Jun Woo, Ji Won Yoon, Min Hyun Han, Chanyeong Moon, Nam Soo Kim:
EEND-EM: End-to-End Neural Speaker Diarization with EM-Network. 1-5 - Minh Vu, Zhou Wei, Binit Bhattarai, Kah Kuan Teh, Tran Huy Dat:
VietSing: A High-quality Vietnamese Singing Voice Corpus. 1-6 - Ngoc Son Tran, Pei-Chin Hsieh, Yih-Liang Shen, Yen-Hsun Chu, Tai-Shi Chi:
Real-Time Monophonic Dual-Pitch Extraction Model. 1-6 - Wageesha N. Manamperi, Thushara D. Abhayapala:
Relative Transfer Matrix for Drone Audition Applications: Source Enhancement. 1-6 - Asfa Jamil, Alessandro Artusi:
Ablation Study to Derive a Computationally Efficient Deep Learning-Based Super-Resolution Approach. 1-6 - Yoshifumi Shoji, Masahiro Yukawa:
Robust Quantile Regression Under Unreliable Data. 1-6 - Masaya Togashi, Ingon Chanpornpakdi, Toshihisa Tanaka:
Electroencenphalogram-Based Effective Features for Sustained Attention Assessment in Conversation. 1-6 - Ryu Takeda, Kazunori Komatani:
Scale-invariant Online Voice Activity Detection under Various Environments. 1-6 - Ritik Mahyavanshi, C. V. Mahesh Reddy, Arth J. Shah, Hemant A. Patil:
Teager Energy Cepstral Coefficients for Audio Deepfake Detection. 1-6 - Haruna Aoki, Sinan Zhang, Yumie Ono:
EEG-based Evaluation of Enjoyment Emotion during cognitive-motor task. 1-4 - Xusheng Yang, Zifeng Zhao, Yuexian Zou:
Peer Learning via Shared Speech Representation Prediction for Target Speech Separation. 1-7 - So Watanabe, Chee Siang Leow, Junichi Hoshino, Takehito Utsuro, Hiromitsu Nishizaki:
Assessment and Improvement of Customer Service Speech with Multiple Large Language Models. 1-6 - Arth J. Shah, Savita H. Yadav, Hemant A. Patil:
Teager Energy Cepstral Coefficients for Spoken Language Identification. 1-6 - Saki Nomura, Junya Hara, Hiroshi Higashi, Yuichi Tanaka:
Dynamic Sensor Placement on Graphs Based on Graph Signal Sampling Theory. 1-6 - Masora Okano, Koichi Ito, Masakatsu Nishigaki, Tetsushi Ohki:
Enhancing Remote Adversarial Patch Attacks on Face Detectors with Tiling and Scaling. 1-6 - Haeyoung Lee, Sunhee Kim, Minhwa Chung:
Analysis of Various Self-Supervised Learning Models for Automatic Pronunciation Assessment. 1-6 - Atsuya Emoto, Ryo Matsuoka:
Hyperspectral Anomaly Detection Using Robust Principal Component Analysis with Autoencoding Adversarial Network. 1-4 - Yu-Hsien Chung, Chi-Hsuan Lu, Jung-Hui Cho, Chih-Chang Yu:
Utilizing Cross Layer Attentions for Semantic Segmentation of Small Objects. 1-6 - Jinyi Mi, Sehun Kim, Tomoki Toda:
Improved Architecture for High-resolution Piano Transcription to Efficiently Capture Acoustic Characteristics of Music Signals. 1-6 - Hao Qin, Haoran Sun, Yi Wang:
A Byte-based GPT-2 Model for Bit-flip JPEG Bitstream Restoration. 1-6 - Xiaohan Shi, Yuan Gao, Jiajun He, Jinyi Mi, Xingfeng Li, Tomoki Toda:
A Study on Multimodal Fusion and Layer Adapter in Emotion Recognition. 1-6 - Xuan-Phuoc Nguyen, Thi-Huong Nguyen, Duc-Tan Tran, Tien-Son Bui, Van-Toi Nguyen:
An isolated Vietnamese Sign Language Recognition method using a fusion of Heatmap and Depth information based on Convolutional Neural Networks. 1-6 - Hongil Kim, Changwoo Han, Donghyun Kim, Sung-Chang Lim, Seung-Won Jung:
Test-Time Optimization for Post-Processing of Compressed Videos. 1-6 - Yanjun Li, Xiangyu Zhao, Zhengpeng Zha, Zhenhua Ling:
ET-SSM: Linear-Time Encrypted Traffic Classification Method Based On Structured State Space Model. 1-6 - Dongfei Chang, Jijie Wu, Xiaoxu Li:
Agent Attention Feature Reconstruction Network for Fine-Grained Few-Shot Image Classification. 1-6 - Yuan-Jhe Yin, Bo-Yu Chen, Berlin Chen:
A Novel LLM-based Two-stage Summarization Approach for Long Dialogues. 1-6 - Jin Xuan Teh, Norihiro Takamune, Hiroshi Saruwatari, Benjamin Yen, Michael Kingan, Yusuke Hioka:
Beamforming informed independent low-rank matrix analysis for sound source enhancement in unmanned aerial vehicles. 1-6 - Li-Ting Pai, Yi-Cheng Wang, Bi-Cheng Yan, Hsin-Wei Wang, Jia-Liang Lu, Chi-Han Lin, Juan-Wei Xu, Berlin Chen:
An Effective Contextualized Automatic Speech Recognition Approach Leveraging Self-Supervised Phoneme Features. 1-6 - Ximin Chen, Yuting Ding, Nan Yan, Changsheng Chen, Fei Chen:
Context-FFT: A Context Feed Forward Transformer Network for EEG-based Speech Envelope Decoding. 1-5 - Jiajin He, Chengxi Dong, Yunqi Cai, Dong Wang:
ComplexFace: A Public Visible-Thermal Face Dataset with Real-Life Complexity. 1-6 - Danqi Jin, Yitong Chen, Jie Chen, Gongping Huang:
Affine Combination of General Adaptive Filters. 1-5 - Takaaki Kojima, Norihiro Takamune, Daichi Kitamura, Hiroshi Saruwatari:
Design of Spectrogram-Consistency Regularization Term Dependent on Observation in Independent Low-Rank Matrix Analysis for Blind Source Separation. 1-6 - Jia-Yin Peng, Jian-Yi Chen, Bing-Zhao Li:
A Novel kind of WVD Associated with the Linear Canonical Transform. 1-6 - Haokun Cao, Yuanman Li, Xinyu Yang, Xia Li:
Region Aware Framework for Constrained Image Splicing Detection and Localization. 1-6 - Muhammad Sayyid Afif, Infall Syafalni, Nana Sutisna, Trio Adiono:
Transformer Attention Matrix Multiplication Design using 4 × 4 Systolic Arrays. 1-6 - Kapeleshh KS, Wei Chen, Prince Aldrin Domer, Hong Ji:
Exploring Brain Connectivity Patterns and Cognitive Resilience in Aging: A Study with the LEMON Dataset. 1-6 - Tsutahiro Fukuhara, Junya Hara, Hiroshi Higashi, Yuichi Tanaka:
Graph Filter Transfer for Time-Varying Signal Estimation Between Two Networks. 1-6 - Xianrui Wang, Shiqi Zhang, Bo He, Shoji Makino, Jingdong Chen:
Learnable Cross-Correlation based Filter-and-Sum Networks for Multi-channel Speech Separation. 1-5 - Sadahiro Yoshikawa, Ryo Ishii, Shogo Okada:
Is Corpus Suitable for Human Perception?: Quality Assessment of Voice Response Timing in Conversational Corpus through Timing Replacement. 1-6 - Seunghee Han, Sunhee Kim, Minhwa Chung:
Developing a Multilingual Spontaneous L2 Speech Corpus for Automated Proficiency Assessment. 1-6 - Daiki Sawada, Masahiro Yukawa:
Robust Adaptive Filtering Based on Adaptive Projected Subgradient Method: Moreau Enhancement of Distance Function. 1-6 - Jingyu Ren, Lei Yang:
Enhanced RefineDNet for Single Image Dehazing. 1-6 - Veron Zhen Liang Hii, Aaron Ken Kiat Lo, Ida Pei Xin Lee, Alec Vince Gonzales Abuan, Sue Han Lee, Patrick Hang Hui Then:
Two-Way Malaysian Sign Language Communication System for Inclusive Education. 1-6 - Doyeon Kim, Yanjue Song, Nilesh Madhu, Hong-Goo Kang:
Enhancing Neural Speech Embeddings for Generative Speech Models. 1-6 - Tsubasa Naito, Ryuto Ito, Yuichi Tanaka, Shogo Muramatsu:
Dictionary Learning for Directed Graph Signals via Augmented GFT. 1-6 - Masaki Aono, Tetsuya Asakawa, Kazuki Shimizu, Masashi Hahsimoto, Takeshi Miyaji, Kei Nomura:
Detecting Coronary Artery Stenosis from Cardiac CT Images using 3D CNNs. 1-6 - Cong Hieu Le, Lam Thai Nguyen, Trung Kien Pham, Le Khanh Nguyen, Tran Hiep Dinh, Stefan Jouannic, Helene Adam, Pierre Duhamel, Nguyen Linh Trung, Trong-Minh Hoang:
Structural Analysis of Asian and African Rice Panicles via Transfer Learning. 1-8 - Wu-Hao Li, Te-Hsin Liu, Chen-Yu Chiang:
A Preliminary Study on Analysing Mandarin Tone Values of Romance L2 Mandarin Learners. 1-6 - En-Lun Yu, Ruei-Xian Chang, Jeih-Weih Hung, Shih-Chieh Huang, Berlin Chen:
COIN-AT-PVAD: A Conditional Intermediate Attention PVAD. 1-5 - Jen-Tzung Chien, Wei-Yu Sun:
Adversarial Augmentation and Adaptation for Speech Recognition. 1-6 - Zongmei Chen, Xin Liao, Xiaoshuai Wu, Yanxiang Chen:
Compressed Deepfake Video Detection Based on 3D Spatiotemporal Trajectories. 1-8 - Rei Hamakawa, Michiharu Niimi:
Generation of target speech with speaker individuality based on accent conversion for English pronunciation learning. 1-6 - Koji Iwano, Wakana Komuro, Manami Gomi:
Comparative Analysis of Voice Mimicry Attacks by High- and Low-Skilled Imitators on Speaker Verification Systems. 1-6 - Duc Hai Nguyen, Trong Hiep Do, Hoang Linh Phuong Nguyen, Quoc Khanh Nguyen, Duc-Tan Tran, Tien-Son Bui, Van Toi Nguyen:
A Solution For Anomaly Detection of Red Beans In A Product Processing Line. 1-5 - Sayaka Toma, Tomoki Ariga, Yosuke Higuchi, Ichiju Hayasaka, Rie Shigyo, Tetsuji Ogawa:
Differences Between Singer and Speaker Verification: Training Singer Feature Representation Extractor Utilizing Singing Voice Characteristics. 1-5 - Yuma Kinoshita, Hitoshi Kiya:
Scene-Segmentation-Based Exposure Compensation for Tone Mapping of High Dynamic Range Scenes. 1-6 - Shaoxiang Dang, Tetsuya Matsumoto, Yoshinori Takeuchi, Hiroaki Kudo:
U-Mamba-Net: A highly efficient Mamba-based U-net style network for noisy and reverberant speech separation. 1-5 - Huisheng Wang, Mingxiao Liu, Ji Qi, H. Vicky Zhao:
Optimal Investment With Incomplete Information and Herd Effect. 1-6 - Ravindrakumar M. Purohit, Dharmendra H. Vaghera, Hemant A. Patil:
GPGAN-VC: Enhancing Voice Conversion using Gradient Penalty. 1-6 - Dahyun Kim, Dongkwon Jin, Chang-Su Kim:
Monocular Depth Estimation for Autonomous Driving Based on Instance Clustering Guidance. 1-6 - Kaibao Nie:
Incorporating Auditory Processing into Undergraduate Signal Processing Courses to Enhance Student Learning. 1-5 - Chuong Hoang Vo, Truong Thanh Nhat Mai, Chul Lee:
Cloud Removal in Hyperspectral Satellite Images Using Low-rank Tensor Completion. 1-6 - Yicheng Li, Xinghua Sun:
One-step Spectral Estimation for Euclidean Distance Matrix Approximation. 1-6 - Zhanxuan Mei, Yun-Cheng Wang, C.-C. Jay Kuo:
GSBIQA: Green Saliency-guided Blind Image Quality Assessment Method. 1-6 - Anindhita Nayazirly Sukarno, Yahwista Salomo, Trio Adiono, Infall Syafalni, Nana Sutisna, Rahmat Mulyawan:
Accelerated Real-Time Local Maxima Detection in Video Streams Using FPGA Technology. 1-6 - Mare Hirose, Shoko Imaizumi, Hitoshi Kiya:
On the Security of Bitstream-level JPEG Encryption with Restart Markers. 1-6 - Conghui Li, Chern Hong Lim, Xin Wang:
A Parameter-free model for long-term concrete creep prediction. 1-6 - Ravindra M. Purohit, Arth J. Shah, Hemant A. Patil:
GGMDDC: An Audio Deepfake Detection Multilingual Dataset. 1-6 - Davy Tec-Hinh Lau, Jian-Jiun Ding, Guillaume Muller:
Optimization of the Intensity Aware Loss for Dynamic Facial Expression Recognition. 1-5 - Sarah Shamina Abdul Rauf, Mas Ira Syafila Mohd Hilmi Tan, Yuen Peng Loh:
Multi-band Satellite Image Analysis for Multi-label Classification. 1-6 - Shintami Chusnul Hidayati, Muhammad Valda Rizky Nur Firdaus, Riki Wahyu Nur Dianto, Sarwosri:
Unleashing Attributes-content Adaptation with Multi-color Spaces for Food Photo Aesthetic Assessment. 1-6 - Ravindrakumar M. Purohit, Dharmendra H. Vaghera, Arth J. Shah, Hemant A. Patil:
PPHiFi-TTS: Phonetic Preserved High-Fidelity Text-to-Speech for Long-Term Speech Dependencies. 1-6 - Yiting Zhang, Kaien Mo, Tetsuya Ueda, Yichen Yang, Shoji Makino:
On Joint Dereverberation and Single Moving Source Separation with Online Source Steering. 1-4 - Mei Hashimoto, Michiharu Niimi:
Generation of Photo Slideshow with Song based on Closeness between Concept of Lyrics and That of Images. 1-6 - Meghana Avula, Aditya Pusuluri, Hemant A. Patil:
Significance of Entropy Based Features For Dysarthric Severity Level Classification. 1-6 - Rui Zhou, Akinori Ito, Takashi Nose:
Improving Speaker Consistency in Speech-to-Speech Translation Using Speaker Retention Unit-to-Mel Techniques. 1-6 - Meng-Shin Lin, Bi-Cheng Yan, Tien-Hong Lo, Hsin-Wei Wang, Yue-Yang He, Wei-Cheng Chao, Berlin Chen:
PG-MDD: Prompt-Guided Mispronunciation Detection and Diagnosis Leveraging Articulatory Features. 1-6 - Zhi-Wei Tan, Andy W. H. Khong:
SMoLnet-T: An Efficient Complex-spectral Mapping Speech Enhancement Approach with Frame-wise CNN and Spectral Combination Transformer for Drone Audition. 1-6 - Jing-Ming Guo, Lun-Da Yuan, Cian Huang, Yi-Chong Zeng:
Contrastive Learning Based Knowledge Distillation for Enhancing Defect Detection. 1-6 - Wataru Hatakeyama, Shinnosuke Nozaki, Ayumi Serizawa, Mizuho Yoshihira, Masahiro Fujita, Ayako Yoshimura, Tetsushi Ohki, Masakatsu Nishigaki:
Multi-Observed Authentication: A secure and usable authentication based on multi-point observation of a single physical credential. 1-6 - Umi Syamimi, Chern Hong Lim, Lillian Yee Kiaw Wang:
IoT-based Smart Attendance System using Face Recognition and Motion Detection. 1-6 - Ming Xuan Chai, Yao Deng Fam, Quinito Norman Octaviano, Chih-Yang Pee, Lai-Kuan Wong, Mas Ira Syafila Mohd Hilmi Tan, John See:
Improved Cassava Plant Disease Classification with Leaf Detection. 1-6 - Onhi Kato, Akira Kubota:
Zero-Shot Learning for Haze Removal Using Fusion of Near-Infrared and Color Images. 1-6 - Cuixin Yang, Rongkang Dong, Kin-Man Lam:
Efficient Adaptation for Real-World Omnidirectional Image Super-Resolution. 1-6 - Ken Kurata, Gen Sato, Izumi Tsunokuni, Yusuke Ikeda:
Noise-Robust Estimation of Early-part Room Impulse Responses based on Physics-Informed Neural Network with Dynamic Pulling Method. 1-5 - Rongkang Dong, Cuixin Yang, Kin-Man Lam:
Text-guided Visual Prompt Tuning with Masked Images for Facial Expression Recognition. 1-6 - Xiangyu Zhao, Yanjun Li, Zhengpeng Zha, Zhenhua Ling:
MGVul: a Multi-Granularity Detection Framework for Software Vulnerability. 1-6 - Joonyong Park, Daisuke Saito, Nobuaki Minematsu:
Analytic Study of Text-Free Speech Synthesis for Raw Audio using a Self-Supervised Learning Model. 1-6 - Roshan Birjais, Kevin I-Kai Wang, Waleed H. Abdulla:
Training Deep Neural Networks with HSIC and Backpropagation. 1-5 - Nitya Tiwari, Arjun Reddy Vadyala, K. S. Nataraj:
Automated prediction of loudness growth curve using EEG signals. 1-6 - Tianyu Gong, Tao Zhang, Ye Zhong, Mengmeng Zhang, Huihui Bai:
Screen Content Encoding Network Based on Deep Contextual Information. 1-6 - Dengyong Zhang, Runqi Lou, Jiaxin Chen, Xin Liao, Gaobo Yang, Xiangling Ding:
Dual Motion Attention and Enhanced Knowledge Distillation for Video Frame Interpolation. 1-6 - Yuanchen Niu, Yuanman Li, Guijia Zhang, Xia Li:
A Diffusion-Based Approach for Restoring Face-swapped Images. 1-5 - Raffaele Disabato, AprilPyone MaungMaung, Huy H. Nguyen, Isao Echizen:
Transfer-Based Adversarial Attack Against Multimodal Models by Exploiting Perturbed Attention Region. 1-6 - Yih-Liang Shen, Tai-Shih Chi:
Ensemble learning based head-related transfer function personalization using anthropometric features. 1-6 - Jinkai Zhang, Zijuan Han, Yunxia Liu, Yang Yang:
A Multi-Domain Camera Model Identification Feature Restoration Network to Counter AI Compression Attacks. 1-6 - Hayato Takeuchi, Takao Kawamura, Nobutaka Ono, Shoko Araki:
Synchronization of Signals with Sampling Rate Offset and Missing Data Using Dynamic Programming Matching. 1-6 - Muwei Jian, Yukun Ling, Rui Wang, Yanjie Zhong, Huihui Huang, Xiaoguang Li:
RepViT Based Lightweight Architecture for Distracted Driving Detection. 1-6 - Satoshi Shoji, Wataru Yata, Keita Kume, Isao Yamada:
A Discrete-Valued Signal Estimation by Nonconvex Enhancement of SOAV with cLiGME Model. 1-6 - Yuki Nishi, Koichi Shinoda, Koji Iwano:
LDMSE: Low Computational Cost Generative Diffusion Model for Speech Enhancement. 1-6 - Xiao Zhang, Haoran Xing, Mingxue Song, Daiki Takeuchi, Noboru Harada, Shoji Makino:
Prediction-error-based Adaptive SpecAugment for Fine-tuning the Masked Model on Audio Classification Tasks. 1-6 - Chun-Lin Liao, Jian-Jiun Ding, Chun-Jen Shih:
Non-blind Deblurring Using Probabilistic Models and Spatial Adaptive Restoration. 1-6 - Primanda Adyatma Hafiz, Candy Olivia Mawalim, Dessi Puji Lestari, Sakriani Sakti, Masashi Unoki:
Anomalous Machine Sound Detection Based on Time Domain Gammatone Spectrogram Feature and IDNN Model. 1-6 - Jiahao Zhang, Qi Liu, Le Hui, Yuchao Dai:
A Two-Stage Method for 3D Architecture Wireframe Reconstruction from Airborne LiDAR Point Cloud. 1-6 - Zhirun Li, Shisheng Guo, Jiahui Chen, Zhihao Zhu, Chen Qiu, Guolong Cui, Yutao Xiang:
A Two-Stage Wall Parameters Estimation Algorithm for MIMO Through-the-Wall Radar. 1-5 - Naohito Yoshikawa, Masaaki Ikehara:
Enhancing YOLOv7 with GLF-Trans for Precision in Small Object Detection. 1-5 - Shuhong Chen, Zewei Chen, Chen Li, Xianwei Zheng, Minfan He, Xutao Li:
Adaptive Time-Varying Graph Learning for Traffic Flow Data Based on Anomaly Moment Detection. 1-5 - Quang-Hai Luong, Duc-Nghia Tran, Sy-Hiep Nguyen, Lam Sinh Cong, Duc-Tan Tran:
Enhancing Shear Wave Propagation Analysis in Tissue with Directional Filtering of Reflected Waves. 1-6 - Zepeng Zhang, Ziping Zhao:
A Joint Graph Signal and Laplacian Denoising Network. 1-5 - Nichika Koyama, Nari Tanabe, Masaya Fujisawa:
Hammering Inspection System Using HPSS and Gradient Boosting with a Wall-Climbing Robot. 1-5 - Huiyun Hu, Junda Kong, Fei Wang, Hongzhi Sun, Yang Ge, Bo Xiao:
GMNER-LF: Generative Multi-modal Named Entity Recognition Based on LLM with Information Fusion. 1-6 - Zewei Chen, Shuhong Chen, Chen Li, Xianwei Zheng, Minfan He, Xutao Li:
Knowledge Augmented Attention Gating Embedding for Link Prediction. 1-5 - Seung-Won Lee, Jun-Seok Lee, Han-Jeong Hwang:
Effect of White Noise on Working Memory Using Event-Related Potentials. 1-4 - Natchira Dachoponchai, Yodchanan Wongsawat, Jetsada Arnin:
Predictive Analysis of Driver Drowsiness Progression: Multi-Level Drowsiness Classification Using Physiological Signals. 1-6 - Ryota Seo, Minoru Kuribayashi, Akinobu Ura, Antoine Mallet, Rémi Cogranne, Wojciech Mazurczyk, David Megías:
Toward Universal Detector for Synthesized Images by Estimating Generative AI Models. 1-6 - Joshua Murphy, Conor Rosato, Andrew Millard, Simon Maskell:
Parameterizing Hierarchical Particle Filters with Concept Drift for Time-varying Parameter Estimation. 1-6 - Eunsoo Hong, Sunhee Kim, Minhwa Chung:
Unsupervised Discovery of Non-Categorical L2 Error Patterns Using Wav2Vec2.0 Code Vectors. 1-6 - Jiachen Qiu, Yushen Zuo, Kin-Man Lam:
ACE-Flow: Auto Color Encoding for Enhanced Low-Light Image Restoration. 1-6 - Dengyong Zhang, Chuanzhen Xu, Jiaxin Chen, Bin Deng, Xin Liao:
YOLO-DC: Enhancing object detection with deformable convolutions and contextual mechanism. 1-6 - Yohei Horiguchi, Masaaki Ikehara, Kei Shibasaki:
More Direct and stage-wise network for Face Super Resolution. 1-6 - Z. Guo, Y. H. Chan, N. F. Law:
Deep Learning-based Intraoperative Video Analysis for Cataract Surgery Instrument Identification. 1-7 - Changsheng Chen, Wenyu Chen, Ximin Chen, Haodong Li:
A Document Presentation Attack Detection Scheme with Optical Flow under a Flashlight. 1-6 - Han Wang, Mingrui He, Mingjun Zhang, Longting Xu:
Semi-Supervised Far-Field Speaker Verification with Distance Metric Domain Adaptation. 1-6 - Alika Choo, Arghya Pal, Sailaja Rajanala, Arkendu Sen:
META: Text Detoxification by leveraging METAmorphic Relations and Deep Learning Methods. 1-6 - Wendi Zhu, KokSheik Wong, Minoru Kuribayashi:
A Permutation-based Reversible Data Hiding Method with Zero Visual Distortion. 1-6 - Yuan Hu, Yifan Zhang, Mingyang Ma, Shaohui Mei:
A Coarse-to-Fine Change Detection Method for Remote Sensing Sparse Cultivated Land. 1-6 - Chen-Jui Hsu, Jian-Jiun Ding, Chun-Jen Shih:
Tsnake: A Time-Embedded Recurrent Contour-Based Instance Segmentation Model. 1-6 - Minyoung Oh, Jae-Young Sim:
Lifelong Person Re-Identification with Backward-Compatibility. 1-6 - Kaito Takahashi, Yukoh Wakabayashi, Kengo Ohta, Akio Kobayashi, Norihide Kitaoka:
Domain Adaptation by Alternating Learning of Acoustic and Linguistic Information for Japanese Deaf and Hard-of-Hearing People. 1-7 - Xiaohan Fang, Peilin Chen, Meng Wang, Shiqi Wang:
How Accurate Can Large Vision Language Model Perform for Images with Compression Degradation? 1-6 - Guojian Lin, Yu Tsao, Fei Chen:
A Non-Intrusive Speech Quality Assessment Model using Whisper and Multi-Head Attention. 1-6 - Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Fine-Grained Quantitative Emotion Editing for Speech Generation. 1-6 - Tomohiro Hayashi, Riku Ogino, Kohei Saijo, Tetsuji Ogawa:
What to Refer and How? - Exploring Handling of Auxiliary Information in Target Speaker Extraction. 1-6
