


EACL 2026: Rabat, Morocco - Findings
- Vera Demberg, Kentaro Inui, Lluís Marquez:

Findings of the Association for Computational Linguistics: EACL 2026, Rabat, Morocco, March 24-29, 2026. Association for Computational Linguistics 2026, ISBN 979-8-89176-386-9 - Chong Zhang, Yixi Zhao, Yulu Xie, Chenshu Yuan, Yi Tu, Ya Guo, Mingxu Chai, Ziyu Shen, Yue Zhang, Qi Zhang:

Unveiling the Deficiencies of Pre-trained Text-and-Layout Models in Real-world Visually-rich Document Information Extraction. 1-16 - Rrubaa Panchendrarajan, Arkaitz Zubiaga:

Entity-aware Cross-lingual Claim Detection for Automated Fact-checking. 17-33 - Yuchen Zhuang, Di Jin, Jiaao Chen, Wenqi Shi, Hanrui Wang, Chao Zhang:

WorkForceAgent-R1: Incentivizing Reasoning Capability in LLM-based Web Agents via Reinforcement Learning. 34-49 - Sewon Kim, Jiwon Kim, Seungwoo Shin, Hyejin Chung, Daeun Moon, Yejin Kwon, Hyunsoo Yoon:

Being Kind Isn't Always Being Safe: Diagnosing Affective Hallucination in LLMs. 50-78 - Jiwon Kim, Hyunsoo Yoon:

Joint Multimodal Preference Optimization for Fine-Grained Visual-Textual Alignment. 79-94 - Kazutoshi Shinoda, Nobukatsu Hojo, Kyosuke Nishida, Yoshihiro Yamazaki, Keita Suzuki, Hiroaki Sugiyama, Kuniko Saito:

Let's Put Ourselves in Sally's Shoes: Shoes-of-Others Prefilling Improves Theory of Mind in Large Language Models. 95-109 - Seunghyuk Cho, Zhenyue Qin, Yang Liu, Youngbin Choi, Seungbeom Lee, Dongwoo Kim:

Plane Geometry Problem Solving with Multi-modal Reasoning: A Survey. 110-131 - Kieran Henderson, Kian Omoomi, Vasudha Varadarajan, Allison Lahnala, Charles Welch:

Examining the Utility of Self-disclosure Types for Modeling Annotators of Social Norms. 132-150 - Juhwan Choi, JungMin Yun, Changhun Kim, YoungBin Kim:

Position Paper: How Should We Responsibly Adopt LLMs in the Peer Review Process? 151-165 - Md. Tousin Akhter, Devansh Lalwani, Kshitij Sharad Jadhav, Pushpak Bhattacharyya:

Rad-Flamingo: A Multimodal Prompt driven Radiology Report Generation Framework with Patient-Centric Explanations. 166-188 - Zujie Liang, Feng Wei, Wujiang Xu, Yuxi Qian, Lin Chen, Xinhui Wu:

I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search. 189-210 - Zhipeng Xu, Zhenghao Liu, Yukun Yan, Shuo Wang, Shi Yu, Zheni Zeng, Chaojun Xiao, Zhiyuan Liu, Ge Yu, Chenyan Xiong:

ThinkNote: Enhancing Knowledge Integration and Utilization of Large Language Models via Constructivist Cognition Modeling. 211-229 - Ameen Ali, Lior Wolf, Ivan Titov:

Mitigating Copy Bias in In-Context Learning through Neuron Pruning. 230-251 - Zhe Xu, Kaveh Hassani, Si Zhang, Hanqing Zeng, Michihiro Yasunaga, Limei Wang, Dongqi Fu, Ning Yao, Bo Long, Hanghang Tong:

How to Make LMs Strong Node Classifiers? 252-274 - Yajiao Liu, Congliang Chen, Junchi Yang, Ruoyu Sun:

Rethinking Data Mixture for Large Language Models: A Comprehensive Survey and New Perspectives. 275-289 - Zachary William Hopton, Jannis Vamvas, Andrin Büchler, Anna Rutkiewicz, Rico Cathomas, Rico Sennrich:

The Mediomatix Corpus: Parallel Data for Romansh Language Varieties via Comparable Schoolbooks. 290-306 - Wei Zhao, Zhe Li, Yige Li, Jun Sun:

Unleashing the Unseen: Harnessing Benign Datasets for Jailbreaking Large Language Models. 307-330 - Karima Kadaoui, Hanin Atwany, Hamdan Al-Ali, Abdelrahman Mohamed, Ali Mekky, Sergei Tilga, Natalia Fedorova, Ekaterina Artemova, Hanan Aldarmaki, Yova Kementchedjhieva:

JEEM: Vision-Language Understanding in Four Arabic Dialects. 331-354 - Ghofrane Merhbene, Fabian Lecron, Philippe Fortemps, Bradford C. Dickerson, Mascha Kurpicz-Briki, Neguine Rezaii:

Detecting Primary Progressive Aphasia (PPA) from Text: A Benchmarking Study. 355-374 - Akshat Gupta, Atahan Ozdemir, Caoqinwei Gong, Gopala Anumanchipalli:

Geometric Interpretation of Layer Normalization and a Comparative Analysis with RMSNorm. 375-407 - Honghao Liu, Xuhui Jiang, Chengjin Xu, Cehao Yang, Yiran Cheng, Lionel Ni, Jian Guo:

Continual Pretraining on Encrypted Synthetic Data for Privacy-Preserving LLMs. 408-425 - Go Inoue, Bashar Alhafni, Nizar Habash, Timothy Baldwin:

Do Diacritics Matter? Evaluating the Impact of Arabic Diacritics on Tokenization and LLM Benchmarks. 426-442 - Ting-Rui Chiang, Dani Yogatama:

Pelican Soup Framework: A Theoretical Framework for Language Model Capabilities. 443-464 - Seonmin Koo, Jinsung Kim, Heuiseok Lim:

I Know, but I Don't Know! How Persona Conflict Undermines Instruction Adherence in Large Language Models. 465-489 - Maximilian Kreutner, Marlene Lutz, Markus Strohmaier:

Persona-driven Simulation of Voting Behavior in the European Parliament with Large Language Models. 490-511 - Sangwon Ryu, Heejin Do, Daehui Kim, Hwanjo Yu, Dongwoo Kim, Yunsu Kim, Gary Lee, Jungseul Ok:

Exploring Iterative Controllable Summarization with Large Language Models. 512-528 - Sherzod Hakimov, Roland Bernard, Tim Leiber, Karl Osswald, Kristina Richert, Ruilin Yang, Raffaella Bernardi, David Schlangen:

The Price of Thought: A Multilingual Analysis of Reasoning, Performance, and Cost of Negotiation in Large Language Models. 529-570 - Sahil Wadhwa, Himanshu Kumar, Guanqun Yang, Abbaas Alif Mohamed Nishar, Pranab Mohanty, Swapnil Shinde, Yue Wu:

ART: Adaptive Reasoning Trees for Explainable Claim Verification. 571-586 - Yu Cui, Sicheng Pan, Yifei Liu, Haibin Zhang, Cong Zuo:

VortexPIA: Indirect Prompt Injection Attack against LLMs for Efficient Extraction of User Privacy. 587-609 - Eunsoo Lee, Jeongwoo Lee, Minki Hong, Jangho Choi, Jihie Kim:

VisDoT: Enhancing Visual Reasoning through Human-Like Interpretation Grounding and Decomposition of Thought. 610-640 - Mingbo Song, Heming Xia, Jun Zhang, Chak Tou Leong, Qiancheng Xu, Wenjie Li, Sujian Li:

KNN-SSD: Enabling Dynamic Self-Speculative Decoding via Nearest Neighbor Layer Set Optimization. 641-655 - Louie Hong Yao, Nicholas Jarvis, Tianyu Jiang:

Towards Robust Evaluation of Visual Activity Recognition: Resolving Verb Ambiguity with Sense Clustering. 656-672 - Gio Paik, Yongbeom Kim, Soungmin Lee, Sangmin Ahn, Chan Woo Kim:

HiKE: Hierarchical Evaluation Framework for Korean-English Code-Switching Speech Recognition. 673-681 - Andrey Goncharov, Daniil Vyazhev, Petr Sychev, Edvard A. Khalafyan, Alexey Zaytsev:

Complexity-aware fine-tuning. 682-696 - Leonardo Ranaldi:

Multilingual Retrieval-Augmented Generation for Knowledge-Intensive Question Answering Task. 697-716 - Ashutosh Bajpai, Akshat Bhandari, Akshay Uttama Nambi, Tanmoy Chakraborty:

SpatialMath: Spatial Comprehension-Infused Symbolic Reasoning for Mathematical Problem-Solving. 717-742 - Paiheng Xu, Gang Wu, Xiang Chen, Tong Yu, Chang Xiao, Franck Dernoncourt, Tianyi Zhou, Wei Ai, Viswanathan (Vishy) Swaminathan:

Skill Discovery for Software Scripting Automation via Offline Simulations with LLMs. 743-759 - Patrícia Schmidtová, Niyati Bafna, Seth Aycock, Gianluca Vico, Wiktor Kamzela, Kathy Hämmerl, Vilém Zouhar:

How Important is 'Perfect' English for Machine Translation Prompts? 760-777 - Jiabin Fan, Guoqing Luo, Michael Bowling, Lili Mou:

KETCHUP: K-Step Return Estimation for Sequential Knowledge Distillation. 778-796 - Haiyan Zhao, Xuansheng Wu, Fan Yang, Bo Shen, Ninghao Liu, Mengnan Du:

Denoising Concept Vectors with Sparse Autoencoders for Improved Language Model Steering. 797-808 - Zara Siddique, Irtaza Khalid, Liam D. Turner, Luis Espinosa Anke:

Shifting Perspectives: Steering Vectors for Robust Bias Mitigation in LLMs. 809-820 - Omer Hofman, Jonathan Brokman, Oren Rachmil, Shamik Bose, Vikas Pahuja, Toshiya Shimizu, Trisha Starostina, Kelly Marchisio, Seraphina Goldfarb-Tarrant, Roman Vainshtein:

MAPS: A Multilingual Benchmark for Agent Performance and Security. 821-845 - Liwen Sun, Xiang Yu, Ming Tan, Zhuohao Chen, Anqi Cheng, Ashutosh Joshi, Chenyan Xiong:

Linking Knowledge to Care: Knowledge Graph-Augmented Medical Follow-Up Question Generation. 846-853 - Rongwu Xu, Xuan Qi, Zehan Qi, Wei Xu, Zhijiang Guo:

DebateQA: Evaluating Question Answering on Debatable Knowledge. 854-885 - Nishant Subramani, Kshitish Ghate, Mona T. Diab:

Personal Information Parroting in Language Models. 886-895 - Mingchen Li, Hanzhi Zhang, Heng Fan, Junhua Ding, Yunhe Feng:

Harmful Factuality: LLMs Correcting What They Shouldn't. 896-912 - Meiqing Jin, Liam Dugan, Chris Callison-Burch:

Toward Beginner-Friendly LLMs for Language Learning: Controlling Difficulty in Conversation. 913-936 - Nishat Raihan, Noah Erdachew, Jayoti Devi, Joanna C. S. Santos, Marcos Zampieri:

CodeGuard: Improving LLM Guardrails in CS Education. 937-949 - Yassir Lairgi, Ludovic Moncla, Khalid Benabdeslem, Rémy Cazabet, Pierre Cléau:

ATOM: AdapTive and OptiMized dynamic temporal knowledge graph construction using LLMs. 950-966 - Kemal Kurniawan, Meladel Mistica, Timothy Baldwin, Jey Han Lau:

On the Interplay between Human Label Variation and Model Fairness. 967-976 - Christophe Ye, Cassie S. Mitchell:

Where do LLMs currently stand on biomedical NER in both clean and noisy settings? 977-1001 - Hirokazu Kiyomaru, Yusuke Oda, Takashi Kodama, Chaoran Liu, Daisuke Kawahara:

Scaling Data-Constrained Language Models with Synthetic Data. 1002-1016 - Omar Mahmoud, Ali Khalil, Thommen George Karimpanal, Buddhika Laknath Semage, Santu Rana:

The Unintended Trade-off of AI Alignment: Balancing Hallucination Mitigation and Safety in LLMs. 1017-1037 - Abhishek K. Mishra, Antoine Boutet, Lucas Magnana:

The Model's Language Matters: A Comparative Privacy Analysis of LLMs. 1038-1048 - Ulin Nuha, Adam Jatowt:

Towards the First NLP Benchmark for Ladin - an Extremely Low-Resource Language. 1049-1064 - Haiyang Shen, Hang Yan, Zhongshi Xing, Mugeng Liu, Yue Li, Zhiyang Chen, Yuxiang Wang, Jiuzheng Wang, Yun Ma:

DRAGON: Domain-specific Robust Automatic Data Generation for RAG Optimization. 1065-1078 - Toan Doan, Uyen Le, Thin Nguyen:

Causal Activation Steering via Sparse Mediation. 1079-1097 - Uyen Le, Thin Nguyen, Toan Nguyen, Toan Doan, Trung Le, Bac Le:

Causal Direct Preference Optimization for Language Model Alignment. 1098-1113 - Keito Kudo, Yoichi Aoki, Tatsuki Kuribayashi, Shusaku Sone, Masaya Taniguchi, Ana Brassard, Keisuke Sakaguchi, Kentaro Inui:

LLMs Faithfully and Iteratively Compute Answers During CoT: A Systematic Analysis With Multi-step Arithmetics. 1114-1153 - Jinchao Ge, Tengfei Cheng, Biao Wu, Zeyu Zhang, Shiya Huang, Judith Bishop, Gillian Shepherd, Meng Fang, Ling Chen, Yang Zhao:

VaseVQA: Multimodal Agent and Benchmark for Ancient Greek Pottery. 1154-1167 - Sullam Jeoung, Yueyan Chen, Yi Zhang, Shuai Wang, Haibo Ding, Lin Lee Cheong:

PromptPrism: A Linguistically-Inspired Taxonomy for Prompts. 1168-1192 - Hung Luu, Long S. T. Nguyen, Trung Pham, Hieu Pham, Tho Quan:

HiGraAgent: Dual-Agent Adaptive Reasoning over Hierarchical Knowledge Graph for Open Domain Multi-hop Question Answering. 1193-1217 - Mai Alkhamissi, Yunze Xiao, Badr AlKhamissi, Mona T. Diab:

Hire Your Anthropologist! Rethinking Culture Benchmarks Through an Anthropological Lens. 1218-1235 - Keigo Shibata, Kazuki Yano, Ryosuke Takahashi, Jaesung Lee, Wataru Ikeda, Jun Suzuki:

Suppressing Final Layer Hidden State Jumps in Transformer Pretraining. 1236-1262 - Zhexiong Liu, Diane J. Litman:

Intention-Adaptive LLM Fine-Tuning for Text Revision Generation. 1263-1281 - Yuxing Tian, Fengran Mo, Weixu Zhang, Yiyan Qi, Jian-Yun Nie:

ReAttn: Improving Attention-based Re-ranking via Attention Re-weighting. 1282-1295 - Md Hasebul Hasan, Mahir Labib Dihan, Tanzima Hashem, Mohammed Eunus Ali, Md. Rizwan Parvez:

MapAgent: A Hierarchical Agent for Geospatial Reasoning with Dynamic Map Tool Integration. 1296-1322 - Takashi Kodama, Yusuke Oda:

Comprehensive Study of Bilingual and Multi-category Instruction Pre-training. 1323-1340 - Mengdie Flora Wang, Haochen Xie, Mun Young Kim, Baishali Chaudhury, Meghana Ashok, Suren Gunturu, Sungmin Hong, Jae Oh Woo:

Reflect, Rewrite, Repeat: How Simple Arithmetic Enables Advanced Reasoning in Small Language Models. 1341-1363 - Jiwon Moon, Yerin Hwang, Dongryeol Lee, Taegwan Kang, Yongil Kim, Kyomin Jung:

Don't Judge Code by Its Cover: Exploring Biases in LLM Judges for Code Evaluation. 1364-1389 - Rui Xing, Preslav Nakov, Timothy Baldwin, Jey Han Lau:

COMMUNITYNOTES: A Dataset for Exploring the Helpfulness of Fact-Checking Explanations. 1390-1411 - Jivnesh Sandhan, Fei Cheng, Tushar Sandhan, Yugo Murawaki:

Persona Jailbreaking in Large Language Models. 1412-1430 - Rayyan Merchant, Kevin Tang:

ParsTranslit: Truly Versatile Tajik-Farsi Transliteration. 1431-1443 - Kohei Oda, Po-Min Chuang, Kiyoaki Shirai, Natthawut Kertkeidkachorn:

One Sentence, Two Embeddings: Contrastive Learning of Explicit and Implicit Semantic Representations. 1444-1452 - Jia-Kai Dong, I-Wei Huang, Chun-Tin Wu, Yi-Tien Tsai:

ETOM: A Five-Level Benchmark for Evaluating Tool Orchestration within the MCP Ecosystem. 1453-1488 - Sina Bagheri Nezhad, Yao Li, Ameeta Agrawal:

SymCode: A Neurosymbolic Approach to Mathematical Reasoning via Verifiable Code Generation. 1489-1503 - Heejeong Jeon, Minsu Park, YunSeok Choi, Eunil Park:

Unsupervised Detection of LLM-Generated Text in Korean Using Syntactic and Semantic Cues. 1504-1518 - Dhiman Goswami, Jai Kruthunz Naveen Kumar, Sanchari Das:

NLP Privacy Risk Identification in Social Media (NLP-PRISM): A Survey. 1519-1541 - Yisen Li, Lingfeng Yang, Wenxuan Shen, Pan Zhou, Yao Wan, Weiwei Lin, Dongping Chen:

CrowdSelect: Synthetic Instruction Data Selection with Multi-LLM Wisdom. 1542-1569 - Sheng-Lun Wei, Yu-Ling Liao, Yen-Hua Chang, Hen-Hsen Huang, Hsin-Hsi Chen:

Bias in the Ear of the Listener: Assessing Sensitivity in Audio Language Models Across Linguistic, Demographic, and Positional Variations. 1570-1589 - Iffat Maab, Junichi Yamagishi:

Pushing the Frontiers of Scientific Fact-Checking: The SCINLP Dataset. 1590-1617 - Nobuhiro Ueda, Yuyang Dong, Krisztián Boros, Daiki Ito, Takuya Sera, Masafumi Oyamada:

SCAN: Semantic Document Layout Analysis for Textual and Visual Retrieval-Augmented Generation. 1618-1637 - Jaewoo Lee, Joonho Ko, Jinheon Baek, Soyeong Jeong, Sung Ju Hwang:

Unified Multimodal Interleaved Document Representation for Retrieval. 1638-1654 - Minjun Kim, Inho Won, HyeonSeok Lim, MinKyu Kim, Junghun Yuk, Wooyoung Go, Jongyoul Park, Jungyeul Park, KyungTae Lim:

TELLME: Test-Enhanced Learning for Language Model Enrichment. 1655-1677 - Jieun Park, KyungTae Lim, Joon-ho Lim:

Beyond Accuracy: Alignment and Error Detection across Languages in the Bi-GSM8K Math-Teaching Benchmark. 1678-1704 - Loc Pham, Tung Luu, Thu Vo, Minh Nguyen, Viet Hoang:

VN-MTEB: Vietnamese Massive Text Embedding Benchmark. 1705-1725 - Mingyu Jeon, Sungjin Han, Jinkwon Hwang, Minchol Kwon, Jonghee Kim, Junyeong Kim:

See More, Store Less: Memory-Efficient Resolution for Video Moment Retrieval. 1726-1736 - Sihyeon Ha, Yongjeong Oh, Yo-Seb Jeon:

RB-LoRA: Rank-Balanced Aggregation for Low-Rank Adaptation with Federated Fine-Tuning. 1737-1746 - Seyed Mahed Mousavi, Edoardo Cecchinato, Lucia Hornikova, Giuseppe Riccardi:

Garbage In, Reasoning Out? Why Benchmark Scores are Unreliable and What to Do About It. 1747-1759 - Bo-Wei Chen, Chung-Chi Chen, An-Zi Yen:

Confidence-Driven Multi-Scale Model Selection for Cost-Efficient Inference. 1760-1770 - Han Yuan, Yue Zhao, Li Zhang, Wuqiong Luo, Zheng Ma:

Quantifying the Impact of Structured Output Format on Large Language Models through Causal Inference. 1771-1795 - Dzmitry Pihulski, Mikolaj Langner, Jan Eliasz, Przemyslaw Kazienko, Jan Kocon, Teddy Ferdinan:

Breaking the Illusion of Reasoning in Polish LLMs: Quality over Quantity of Thought. 1796-1811 - Jiapeng Wang, Jinhao Jiang, Zhiqiang Zhang, Jun Zhou, Xin Zhao:

RV-Syn: Rational and Verifiable Mathematical Reasoning Data Synthesis based on Structured Function Library. 1812-1827 - Liangtao Lin, Jun Zheng, Haidong Wang:

WebNovelBench: Placing LLM Novelists on the Web Novel Distribution. 1828-1847 - Yusuke Yamauchi, Akiko Aizawa:

From Semantics to Style: A Cross-Dataset Comparative Framework for Sentence Similarity Predictions. 1848-1877 - Andrey V. Galichin, Anton Korznikov, Alexey Dontsov, Oleg Rogov, Elena Tutubalina, Ivan V. Oseledets:

Feature Drift: How Fine-Tuning Repurposes Representations in LLMs. 1878-1887 - Tiziano Labruna, Arkadiusz Modzelewski, Giorgio Satta, Giovanni Da San Martino:

Detecting Winning Arguments with Large Language Models and Persuasion Strategies. 1888-1915 - Mingkai Tian, Guorong Li, Yuankai Qi, Anton van den Hengel, Qingming Huang:

The Devil is in the Distributions: Explicit Modeling of Scene Content is Key in Zero-Shot Video Captioning. 1916-1929 - Sara Rajaee, Rochelle Choenni, Ekaterina Shutova, Christof Monz:

Best-of-L: Cross-Lingual Reward Modeling for Mathematical Reasoning. 1930-1939 - Alba María Mármol-Romero, Robiert Sepúlveda-Torres, Estela Saquete, María Teresa Martín Valdivia, L. Alfonso Ureña:

Nuanced Toxicity Detection in Spanish: A New Corpus and Benchmark Study. 1940-1954 - Junseok Kim, Nakyeong Yang, Kyomin Jung:

Persona Switch: Mixing Distinct Perspectives in Decoding Time. 1955-1967 - Gautam Siddharth Kashyap, Harsh Joshi, Niharika Jain, Ebad Shabbir, Jiechao Gao, Nipun Joshi, Usman Naseem:

Revealing the Truth with ConLLM for Detecting Multi-Modal Deepfakes. 1968-1978 - Franziska Rubenbauer, Sebastian Steindl, Patrick Levi, Daniel Loebenberger, Ulrich Schäfer:

Detection of Adversarial Prompts with Model Predictive Entropy. 1979-1993 - Ruiran Su, Markus Leippold, Janet B. Pierrehumbert:

Actors, Frames and Arguments: A Multi-Decade Computational Analysis of Climate Discourse in Financial News using Large Language Models. 1994-2014 - Kushan Mitra, Dan Zhang, Hannah Kim, Estevam Hruschka:

RECAP: REwriting Conversations for Intent Understanding in Agentic Planning. 2015-2033 - Varsha Suresh, Muhammad Hamza Mughal, Christian Theobalt, Vera Demberg:

Modeling Turn-Taking with Semantically Informed Gestures. 2034-2041 - Usman Naseem, Gautam Siddharth Kashyap, Sushant Kumar Ray, Rafiq Ali, Ebad Shabbir, Abdullah Mohammad:

Do Large Language Models Reflect Demographic Pluralism in Safety? 2042-2052 - Collin Zhang, Tingwei Zhang, Vitaly Shmatikov:

Adversarial Decoding: Generating Readable Documents for Adversarial Objectives. 2053-2068 - John Mendonça, Alon Lavie, Isabel Trancoso:

MEDAL: A Framework for Benchmarking LLMs as Multilingual Open-Domain Dialogue Evaluators. 2069-2097 - Long S. T. Nguyen, Tho Quan:

Which Works Best for Vietnamese? A Practical Study of Information Retrieval Methods across Domains. 2098-2119 - Paolo Italiani, David Gimeno-Gómez, Luca Ragazzi, Gianluca Moro, Paolo Rosso:

MemeWeaver: Inter-Meme Graph Reasoning for Sexism and Misogyny Detection. 2120-2134 - Junseok Oh, Ji-Hwan Kim:

SEAM: Bridging the Temporal-Semantic Granularity Gap for LLM-based Speech Recognition. 2135-2144 - Luca Giordano, Simon Razniewski:

Foundations of LLM Knowledge Materialization: Termination, Reproducibility, Robustness. 2145-2164 - Trung Hieu Ngo, Adrien Bazoge, Solen Quiniou, Pierre-Antoine Gourraud, Emmanuel Morin:

Investigating Gender Stereotypes in Large Language Models via Social Determinants of Health. 2165-2180 - Isabelle Lee, Sarah Liaw, Dani Yogatama:

FOL-Traces: Verified First-Order Logic Reasoning Traces at Scale. 2181-2203 - Ieva Staliunaite, Julius Cheng, Andreas Vlachos:

Uncertainty Quantification for Evaluating Gender Bias in Machine Translation. 2204-2225 - Yongfu Xue:

PIRA: Preference-Oriented Instruction-Tuned Reward Models with Dual Aggregation. 2226-2234 - Konrad Löhr, Shuzhou Yuan, Michael Färber:

The Hidden Bias: A Study on Explicit and Implicit Political Stereotypes in Large Language Models. 2235-2252 - Stef Accou, Wessel Poelman:

TIPA: Typologically Informed Parameter Aggregation. 2253-2267 - Tom Zehle, Matthias Aßenmacher:

Can Calibration of Positional Encodings Enhance Long Context Utilization? 2268-2280 - Jonas Golde, Patrick Haller, Alan Akbik:

FiNERweb: Datasets and Artifacts for Scalable Multilingual Named Entity Recognition. 2281-2300 - Ying Ying Lim, Paul Röttger:

Bias in the East, Bias in the West: A Bilingual Analysis of LLM Political Bias on U.S.- and China-Related Issues. 2301-2326 - Shaivi Malik, Hasnat Md Abdullah, Sriparna Saha, Amit P. Sheth:

Ask Me Again Differently: GRAS for Measuring Bias in Vision Language Models on Gender, Race, Age, and Skin Tone. 2327-2388 - Xuan Luo, Yue Wang, Zefeng He, Geng Tu, Jing Li, Ruifeng Xu:

A Simple and Efficient Learning-Style Prompting for LLM Jailbreaking. 2389-2406 - Jiacheng Liu, Xiaofeng Hou:

Aggregating Crowd of LLMs for Cost-Effective Data Annotation. 2407-2419 - Evgeniia Tokarchuk, Maya K. Nachesa, Sergey Troshin, Vlad Niculae:

Representation Collapse in Machine Translation Through the Lens of Angular Dispersion. 2420-2431 - Flavio Merenda, José Manuél Gómez-Pérez, German Rigau:

Can LLMs Reason Like Doctors? Exploring the Limits of Large Language Models in Complex Medical Reasoning. 2432-2452 - Cedric Lothritz, Jordi Cabot, Laura Bernardy:

Testing Low-Resource Language Support in LLMs Using Language Proficiency Exams: the Case of Luxembourgish. 2453-2476 - Mathis Le Bail, Jérémie Dentan, Davide Buscaldi, Sonia Vanier:

Unveiling Decision-Making in LLMs for Text Classification: Extraction of influential and interpretable concepts with Sparse Autoencoders. 2477-2504 - Chenyue Zhou, Gürkan Solmaz, Flavio Cirillo, Kiril Gashteovski, Jonathan Fürst:

TextMineX: Data, Evaluation Framework and Ontology-guided LLM Pipeline for Humanitarian Mine Action. 2505-2523 - Mahbub E. Sobhani, Md. Faiyaz Abdullah Sayeedi, Tasnim Mohiuddin, Md Mofijul Islam, Swakkhar Shatabda:

MathMist: A Parallel Multilingual Benchmark Dataset for Mathematical Problem Solving and Reasoning. 2524-2550 - Seyyede Zahra Aftabi, Saeed Farzi:

Enhancing Reliability in Community Question Answering with an Expert-Oriented RAG System. 2551-2569 - Shuhuan Gu, Wenbiao Tao, Xinchen Ma, Kangkang He, Ye Guo, Xiang Li, Yunshi Lan:

Unsupervised Text Style Transfer for Controllable Intensity. 2570-2584 - AmirHossein Safdarian, Milad Mohammadi, Ehsan Jahanbakhsh, Mona Shahamat Naderi, Heshaam Faili:

SchemaGraphSQL: Efficient Schema Linking with Pathfinding Graph Algorithms for Text-to-SQL on Large-Scale Databases. 2585-2599 - Diego Rossini, Lonneke van der Plas:

Binary Token-Level Classification with DeBERTa for All-Type MWE Identification: A Lightweight Approach with Linguistic Enhancement. 2600-2610 - David Lindevelt, Suzan Verberne, Joost Broekens:

The Correlation Between Emotion in Text and Speech Segments is Limited: A Cross-Modal Study. 2611-2621 - Benedetta Muscato, Yue Li, Gizem Gezici, Zhixue Zhao, Fosca Giannotti:

Seeing All Sides: Multi-Perspective In-Context Learning for Subjective NLP. 2622-2638 - Kumiko Nakajima, Jan Zuiderveld, Sandro Pezzelle:

Beyond Divergent Creativity: A Human-Based Evaluation of Creativity in Large Language Models. 2639-2660 - Carlo Bretti, Pascal Mettes, Nanne van Noord:

Are Multimodal LLMs Movie Buffs? 2661-2677 - Milan Gritta, Debjit Paul, Xiaoguang Li, Lifeng Shang, Jun Wang, Gerasimos Lampouras:

Process Evaluation for Agentic Systems. 2678-2692 - Gaetano Cimino, Giuseppe Carenini, Vincenzo Deufemia:

MIMIC: Multi-party Dialogue Augmentation via Speaker Stylistic Transfer. 2693-2719 - Tafazzul Nadeem, Bhavik Shangari, Manish Rai, Gagan Raj Gupta, Ashutosh Modi:

TechING: Towards Real World Technical Image Understanding via VLMs. 2720-2749 - Piyush Singh Pasi:

Multilingual-To-Multimodal (M2M): Unlocking New Languages with Monolingual Text. 2750-2771 - Surgan Jandial, Yinheng Li, Justin Wagle, Kazuhito Koishida:

Do GUI Grounders Truly Understand UI Elements? 2772-2785 - Haowei Fu, Bo Ni, Han Xu, Kunpeng Liu, Dan Lin, Tyler Derr:

Ensemble Privacy Defense for Knowledge-Intensive LLMs against Membership Inference Attacks. 2786-2799 - Qiusi Zhan, Angeline Budiman-Chan, Abdelrahman Zayed, Xingzhi Guo, Daniel Kang, Joo-Kyung Kim:

SafeSearch: Do Not Trade Safety for Utility in LLM Search Agents. 2800-2815 - Ryan Shea, Yunan Lu, Liang Qiu, Zhou Yu:

SAGE: A Top-Down Bottom-Up Knowledge-Grounded User Simulator for Multi-turn Agent Evaluation. 2816-2839 - Branislav Pecher, Ján Cegin, Róbert Belanec, Ivan Srba, Jakub Simko, Mária Bieliková:

Better as Generators Than Classifiers: Leveraging LLMs and Synthetic Data for Low-Resource Multilingual Classification. 2840-2857 - Zijie Liu, Xinyu Zhao, Jie Peng, Jinhao Duan, Zhuangdi Zhu, Qingyu Chen, Kaidi Xu, Xia Hu, Tianlong Chen:

Dialogue is Better Than Monologue: Instructing Medical LLMs via Strategic Conversations. 2858-2872 - Saadat Hasan Khan, Spencer Hong, Jingyu Wu, Kevin Lybarger, Youbing Yin, Erin Babinsky, Daben Liu:

DF-RAG: Query-Aware Diversity for Retrieval-Augmented Generation. 2873-2894 - Arjun Chandra, Kevin Miller, Venkatesh Ravichandran, Constantinos Papayiannis, Venkatesh Saligrama:

Hearing Between the Lines: Unlocking the Reasoning Power of LLMs for Speech Evaluation. 2895-2916 - Phuong Minh Nguyen, Dang Huu-Tien, Naoya Inoue:

Improving Chain-of-Thought for Logical Reasoning via Attention-Aware Intervention. 2917-2941 - Michael R. Metel, Yufei Cui, Boxing Chen, Prasanna Parthasarathi:

Thinking Long, but Short: Stable Sequential Test-Time Scaling for Large Reasoning Models. 2942-2951 - Boyang Zhang, Istemi Ekin Akkus, Ruichuan Chen, Alice Dethise, Klaus Satzke, Ivica Rimac, Yang Zhang:

Defeating Cerberus: Privacy-Leakage Mitigation in Vision Language Models. 2952-2965 - Mohammadamin Shafiei, Hamidreza Saffari, Mohammad Taher Pilehvar, Alessandro Raganato:

TruthTrap: A Bilingual Benchmark for Evaluating Factually Correct Yet Misleading Information in Question Answering. 2966-2987 - Jiayi Tian, Ryan Solgi, Jinming Lu, Yifan Yang, Hai Li, Zheng Zhang:

FLAT-LLM: Fine-grained Low-rank Activation Space Transformation for Large Language Model Compression. 2988-3002 - Laurin Wischounig, Abdelrahman Abdallah, Adam Jatowt:

Negative Sampling Techniques in Dense Retrieval: A Survey. 3003-3020 - Wangyang Ying, Yanchi Liu, Xujiang Zhao, Wei Cheng, Zhengzhang Chen, Wenchao Yu, Yanjie Fu, Haifeng Chen:

Multi-Agent Procedural Graph Extraction with Structural and Logical Refinement. 3021-3034 - Wei-Chieh Huang, Cornelia Caragea:

MADIAVE: Multi-Agent Debate for Implicit Attribute Value Extraction. 3035-3053 - Minghao Guo, Qingcheng Zeng, Xujiang Zhao, Yanchi Liu, Wenchao Yu, Mengnan Du, Haifeng Chen, Wei Cheng:

DeepSieve: Information Sieving via LLM-as-a-Knowledge-Router. 3054-3077 - Xiaotang Du, Giwon Hong, Wai-Chung Kwan, Rohit Saxena, Ivan Titov, Pasquale Minervini, Emily Allaway:

Analyzing LLM Instruction Optimization for Tabular Fact Verification. 3078-3108 - Ioan-Paul Ciobanu, Andrei-Iulian Hîji, Nicolae-Catalin Ristea, Paul Irofti, Cristian Rusu, Radu Tudor Ionescu:

XMAD-Bench: Cross-Domain Multilingual Audio Deepfake Benchmark. 3109-3120 - Naiming Liu, Richard G. Baraniuk, Shashank Sonkar:

CLEAR-3K: Assessing Causal Explanatory Capabilities in Language Models. 3121-3136 - Runzhe Wu, Ankur Samanta, Ayush Jain, Scott Fujimoto, Jeongyeol Kwon, Ben Kretzu, Youliang Yu, Kaveh Hassani, Boris Vidolov, Yonathan Efroni:

Imbalanced Gradients in RL Post-Training of Multi-Task LLMs. 3137-3150 - Bo Yuan, Yun Zhou, Zhichao Xu, Kiran Ramnath, Aosong Feng, Balasubramaniam Srinivasan:

BayesFlow: A Probability Inference Framework for Meta-Agent Assisted Workflow Generation. 3151-3179 - Guimin Hu, Daniel Hershcovich, Hasti Seifi:

HapticLLaMA: A Multimodal Sensory Language Model for Haptic Captioning. 3180-3192 - Zizhong Li, Haopeng Zhang, Jiawei Zhang:

Token-Level Precise Attack on RAG: Searching for the Best Alternatives to Mislead Generation. 3193-3206 - Krishna Teja Chitty-Venkata, Jie Ye, Siddhisanket Raskar, Anthony Kougkas, Xian-He Sun, Murali Emani, Venkatram Vishwanath, Bogdan Nicolae:

PagedEviction: Structured Block-wise KV Cache Pruning for Efficient Large Language Model Inference. 3207-3218 - Ishani Mondal, Meera Bharadwaj, Ayush Roy, Aparna Garimella, Jordan Lee Boyd-Graber:

SMART-Editor: A Multi-Agent Framework for Human-Like Design Editing with Structural Integrity. 3219-3245 - Amr Gomaa, Ahmed Salem, Sahar Abdelnabi:

ConVerse: Benchmarking Contextual Safety in Agent-to-Agent Conversations. 3246-3268 - Kaiwen Zhou, Ahmed Elgohary, A S. M. Iftekhar, Amin Saied:

SIRAJ: Diverse and Efficient Red-Teaming for LLM Agents via Distilled Structured Reasoning. 3269-3292 - Tazeek Bin Abdur Rakib, Lay-Ki Soon, Wern Han Lim:

Who You Are, What You Say: Intra- and Inter- Context Personality for Emotion Recognition in Conversation. 3293-3308 - Charles Corbière, Simon Roburin, Syrielle Montariol, Antoine Bosselut, Alexandre Alahi:

DRIVINGVQA: A Dataset for Interleaved Visual Chain-of-Thought in Real-World Driving Scenarios. 3309-3333 - Fangyuan Xu, Rujun Han, Yanfei Chen, Zifeng Wang, I-Hung Hsu, Jun Yan, Vishy Tirumalashetty, Eunsol Choi, Tomas Pfister, Chen-Yu Lee:

SAGE: Steerable Agentic Data Generation for Deep Search with Execution Feedback. 3334-3351 - Yanglei Gan, Peng He, Yuxiang Cai, Run Lin, Guanyu Zhou, Qiao Liu:

Negative-Aware Diffusion Process for Temporal Knowledge Graph Extrapolation. 3352-3367 - Ruiyao Xu, Noelle I. Samia, Han Liu:

DS2-Instruct: Domain-Specific Data Synthesis for Large Language Models Instruction Tuning. 3368-3384 - Aaryaman Kartha, Ahmed Masry, Mohammed Saidul Islam, Thinh Lang, Shadikur Rahman, Ridwan Mahbub, Mizanur Rahman, Mahir Ahmed, Md. Rizwan Parvez, Enamul Hoque, Shafiq Joty:

DashboardQA: Benchmarking Multimodal Agents for Question Answering on Interactive Dashboards. 3385-3407 - Zhiting Mei, Christina Zhang, Tenny Yin, Justin Lidard, Ola Shorinwa, Anirudha Majumdar:

Reasoning about Uncertainty: Do Reasoning Models Know When They Don't Know? 3408-3458 - Naome A. Etori, Kelechi Ezema, Nathaniel Romney Robinson, Davis David, Alfred Malengo Kondoro, Elisha Ondieki Makori, Michael S. Mollel, Maria L. Gini:

AfriMMT-EA: Multi-domain Machine Translation for Low-Resource East African Languages. 3459-3492 - Zheng Huang, Kiran Ramnath, Yueyan Chen, Aosong Feng, Sangmin Woo, Balasubramaniam Srinivasan, Zhichao Xu, Kang Zhou, Shuai Wang, Haibo Ding, Lin Lee Cheong:

Diffusion Language Model Inference with Monte Carlo Tree Search. 3493-3512 - Duc Trung Vu, Chi Pham Khanh, Phi Van Dat, Ngo Van Linh, Dinh Viet Sang, Trung Le:

DWA-KD: Dual-Space Weighting and Time-Warped Alignment for Cross-Tokenizer Knowledge Distillation. 3513-3527 - Zhichen Zeng, Qi Yu, Xiao Lin, Ruizhong Qiu, Xuying Ning, Tianxin Wei, Yuchen Yan, Jingrui He, Hanghang Tong:

Harnessing Consistency for Robust Test-Time LLM Ensemble. 3528-3545 - Suhee Yoon, Sanghyu Yoon, Ye Seul Sim, Seungdong Yoa, Dongmin Kim, Soonyoung Lee, Hankook Lee, Woohyung Lim:

AutoAnoEval: Semantic-Aware Model Selection via Tree-Guided LLM Reasoning for Tabular Anomaly Detection. 3546-3560 - Tiejin Chen, Xiaoou Liu, Vishnu Nandam, Kuanru Liou, Hua Wei:

Conformal Feedback Alignment: Quantifying Answer-Level Reliability for Robust LLM Alignment. 3561-3572 - Sunzhu Li, Zhiyu Lin, Jiale Zhao, Shuling Yang, Chen Wei:

ThinkPilot: Steering Reasoning Models via Automated Think-prefixes Optimization. 3573-3592 - Shengmin Piao, Jieun Lee, Sanghyun Park:

LitE-SQL: A Lightweight and Efficient Text-to-SQL Framework with Vector-based Schema Linking and Execution-Guided Self-Correction. 3593-3608 - Thanh Vinh Nguyen, Ngo Van Dong, Minh Chu Xuan, Tung Nguyen, Linh Ngo Van, Dinh Viet Sang, Trung Le:

Beyond Coherence: Improving Temporal Consistency and Interpretability in Dynamic Topic Models. 3609-3629 - Yiyang Li, Zehong Wang, Zhengqing Yuan, Zheyuan Zhang, Keerthiram Murugesan, Chuxu Zhang, Yanfang Ye:

Interpretable Graph-Language Modeling for Detecting Youth Illicit Drug Use. 3630-3647 - Peijun Qing, Xingjian Diao, Chiyu Ma, Saeed Hassanpour, Soroush Vosoughi:

Tailoring Memory Granularity for Multi-Hop Reasoning over Long Contexts. 3648-3666 - Hongfu Liu, Zhouying Cui, Xiangming Gu, Ye Wang:

Unlocking Large Audio-Language Models for Interactive Language Learning. 3667-3690 - Jinghan Zhang, Fengran Mo, Tharindu Cyril Weerasooriya, Xinyue Ye, Dongjie Wang, Yanjie Fu, Kunpeng Liu:

Blind Spot Navigation in Large Language Model Reasoning with Thought Space Explorer. 3691-3707 - Haohan Yuan, Sukhwa Hong, Haopeng Zhang:

StrucSum: Graph-Structured Reasoning for Long Document Extractive Summarization with LLMs. 3708-3721 - Zekun Hu, Yichu Xu, De-Chuan Zhan:

Logits-Based Block Pruning with Affine Transformations for Large Language Models. 3722-3736 - Martin Hyben, Sebastian Kula, Ján Cegin, Jakub Simko, Ivan Srba, Róbert Móro:

MultiCW: A Large-Scale Balanced Benchmark Dataset for Training Robust Check-Worthiness Detection Models. 3737-3754 - Naihao Deng, Sheng Zhang, Henghui Zhu, Shuaichen Chang, Jiani Zhang, Alexander Hanbo Li, Chung-Wei Hang, Hideo Kobayashi, Yiqun Hu, Patrick Ng:

What Really Matters for Table LLMs? A Meta-Evaluation of Model and Data Effects. 3755-3782 - Abishek Stephen, Jindrich Libovický:

Evaluating Morphological Plausibility of Subword Tokenization via Statistical Alignment with Morpho-Syntactic Features. 3783-3791 - Vera Pavlova, Mohammed Makhlouf:

MOSAIC: Masked Objective with Selective Adaptation for In-domain Contrastive Learning. 3792-3807 - Kalyan Nakka, Nitesh Saxena:

BitBypass: A New Direction in Jailbreaking Aligned Large Language Models with Bitstream Camouflage. 3808-3834 - Jorge Osés Grijalba, L. Alfonso Ureña, Eugenio Martínez-Cámara, José Camacho-Collados:

The Problem of Ambiguity in Table Question Answering. 3835-3848 - Joschka Braun, Carsten Eickhoff, Seyed Ali Bahrainian:

Beyond Multiple Choice: Evaluating Steering Vectors for Summarization. 3849-3884 - Al-Amin Sany, Mohaiminul Islam, Tanzima Hashem, Md. Ashraful Islam, Mohammed Eunus Ali:

Similar Region Search using LLMs on Spatial Feature Space. 3885-3898 - Arpan Phukan, Manish Gupta, Asif Ekbal:

Learning to Ask: Multi-Decoder Fine-Tuning for Multi-Hop Visual Question Generation with External Knowledge. 3899-3918 - Ifeoluwa Wuraola, Daniel Marciniak, Nina Dethlefs:

SLANG-GraphRAG: Multi-Layered Retrieval with Domain-Specific Knowledge for Low Resource Social Media Conversations. 3919-3931 - Hannah Calzi Kleidermacher, James Zou:

Science Across Languages: Assessing LLM Multilingual Translation of Scientific Papers. 3932-3947 - Minjae Lee, Wonjun Kang, Byeongkeun Ahn, Christian Classen, Kevin Galim, Seunghyuk Oh, Minghao Yan, Hyung Il Koo, Kangwook Lee:

TABED: Test-Time Adaptive Ensemble Drafting for Robust Speculative Decoding in LVLMs. 3948-3974 - Alex Robertson, Huizhi Liang, Mahbub Gani, Rohit Kumar, Srijith Rajamohan:

KGHaluBench: A Knowledge Graph-Based Hallucination Benchmark for Evaluating the Breadth and Depth of LLM Knowledge. 3975-3989 - Jungin Kim, Shinwoo Park, Yo-Sub Han:

Marking Code Without Breaking It: Code Watermarking for Detecting LLM-Generated Code. 3990-4002 - Diogo Glória-Silva, David Semedo, João Magalhães:

VIGiA: Instructional Video Guidance via Dialogue Reasoning and Retrieval. 4003-4030 - Inigo Jauregi Unanue, Najmeh Sadoughi, Vimal Bhat, Zhu Liu, Massimo Piccardi:

Attribute-Controlled Translation with Preference Optimization. 4031-4057 - Nuhu Ibrahim, Rishi Ravikumar, Robert Stevens, Riza Batista-Navarro:

ReciFine: Finely Annotated Recipe Dataset for Controllable Recipe Generation. 4058-4074 - Thomas Bauwens, Miryam de Lhoneux:

ReBPE: Iteratively Improving the Internal Structure of a Structured Tokeniser by Mining its Internal Structure. 4075-4090 - Xiao Li, Kotaro Funakoshi, Manabu Okumura:

Emotion Recognition in Multi-Speaker Conversations through Speaker Identification, Knowledge Distillation, and Hierarchical Fusion. 4091-4106 - Yusuke Nakamura, Hirokazu Kiyomaru, Chaoran Liu, Shuhei Kurita, Daisuke Kawahara:

Demystifying Mixed Outcomes of Self-Training: Pre-training Analyses on Non-Toy LLMs. 4107-4113 - Masaki Sashida, Takeshi Kojima, Yusuke Iwasawa, Yutaka Matsuo:

Revealing Redundant Syntax in Large Language Models through Multi-Hop Dependency Paths. 4114-4137 - Toqeer Ehsan, Thamar Solorio:

A Scalable Framework for Automated NER Annotation Correction in Low-Resource Languages. 4138-4151 - Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao:

Can ChatGPT Really Understand Modern Chinese Poetry? 4152-4162 - Akriti Jain, Aparna Garimella:

Knowing What's Missing: Assessing Information Sufficiency in Question Answering. 4163-4174 - Yue Zhou, Henry Peng Zou, Barbara Di Eugenio, Yang Zhang:

The Curse of Verbalization: How Presentation Order Constrains LLM Reasoning. 4175-4185 - Donya Rooein, Sankalan Pal Chowdhury, Mariia Eremeeva, Yuan Qin, Debora Nozza, Mrinmaya Sachan, Dirk Hovy:

PATS: Personality-Aware Teaching Strategies with Large Language Model Tutors. 4186-4211 - Yiheng Zhao, Yuanliang Li, Shreya Savant, Jun Yan:

Mitigating Causal Bias in LLMs via Potential Outcomes Framework and Actual Causality Theory. 4212-4222 - Niko Dalla Noce, Davide Colla, Sina Farhang Doust, Lorenzo De Mattei, Davide Bacciu:

JuriFindIT: an Italian legal retrieval dataset. 4223-4241 - Noopur Zambare, Kiana Aghakasiri, Carissa Lin, Carrie Ye, J. Ross Mitchell, Mohamed Abdalla:

Towards Fair and Efficient De-identification: Quantifying the Efficiency and Generalizability of De-identification Approaches. 4242-4257 - Christopher M. Homan, Flip Korn, Deepak Pandita, Chris Welty:

How Many Ratings per Item are Necessary for Reliable Significance Testing? 4258-4273 - David Beauchemin, Pier-Luc Veilleux, Richard Khoury, Johanna-Pascale Roy:

QFrBLiMP: a Quebec-French Benchmark of Linguistic Minimal Pairs. 4274-4304 - Mae Sosto, Delfina Sol Martinez Pandiani, Laura Hollink:

QueerGen: How LLMs Reflect Societal Norms on Gender and Sexuality in Sentence Completion Task. 4305-4326 - Zhuoyan Xu, Haoyang Fang, Boran Han, Bonan Min, Bernie Wang, Cuixiong Hu, Shuai Zhang:

Efficient Table Retrieval and Understanding with Multimodal Large Language Models. 4327-4340 - Fatema Siddika, Md. Anwar Hossen, Juan Pablo Muñoz, Tanya G. Roosta, Anuj Sharma, Ali Jannesari:

FedReFT: Federated Representation Fine-Tuning with All-But-Me Aggregation. 4341-4362 - Deepon Halder, Alan Saji, Thanmay Jayakumar, Anoop Kunchukuttan, Ratish Puduppully, Raj Dabre:

RiddleBench: A New Generative Reasoning Benchmark for LLMs. 4363-4372 - Abdul Hameed Azeemi, Ihsan Ayyub Qazi, Agha Ali Raza:

Language Model-Driven Data Pruning Enables Efficient Active Learning. 4373-4392 - Lorenzo Puppi Vecchi, Alceu de Souza Britto Jr., Emerson Cabrera Paraiso, Rafael M. O. Cruz:

HARM: Learning Hate-Aware Reward Model for Evaluating Natural Language Explanations of Offensive Content. 4393-4431 - Xiao Xiao, Iftitahu Ni'mah, Yuyun Wabula, Mykola Pechenizkiy, Meng Fang:

MATH-IDN: A Multilingual Mathematical Problem Solving Dataset Featuring Local Languages in Indonesia. 4432-4438 - Yilun Liu, Yunpu Ma, Yuetian Lu, Shuo Chen, Zifeng Ding, Volker Tresp:

Parameter-Efficient Routed Fine-Tuning: Mixture-of-Experts Demands Mixture of Adaptation Modules. 4439-4457 - Zheyuan Zhang, Lin Ge, Hongjiang Li, Weicheng Zhu, Chuxu Zhang, Yanfang Ye:

MAPRO: Recasting Multi-Agent Prompt Optimization as Maximum a Posteriori Inference. 4458-4480 - Bowen Li, Ziqi Xu, Jing Ren, Renqiang Luo, Xikun Zhang, Xiuzhen Zhang, Yongli Ren, Feng Xia:

Debiasing Large Language Models via Adaptive Causal Prompting with Sketch-of-Thought. 4481-4499 - Joshua Tint, Som Sagar, Aditya Taparia, Kelly Raines, Bimsara Pathiraja, Caleb Liu, Ransalu Senanayake:

ExpressivityBench: Can LLMs Communicate Implicitly? 4500-4515 - Md Mahadi Hasan Nahid, Davood Rafiei, Weiwei Zhang, Yong Zhang:

Rethinking Schema Linking: A Context-Aware Bidirectional Retrieval Approach for Text-to-SQL. 4516-4546 - Shen Dong, Mingxuan Zhang, Pengfei He, Li Ma, Bhavani Thuraisingham, Hui Liu, Yue Xing:

PEAR: Planner-Executor Agent Robustness Benchmark. 4547-4567 - Zheng Hui, Xiaokai Wei, Yexi Jiang, Kevin Gao, Chen Wang, Se-eun Yoon, Rachit Pareek, Michelle Gong:

Toward Safe and Human-Aligned Game Conversational Recommendation via Multi-Agent Decomposition. 4568-4584 - Yi Fan, Michael Strube, Wei Liu:

Linguistic Cues for LLM-based Implicit Discourse Relation Classification. 4585-4602 - Duy C. Hoang, Hung T. Q. Le, Rui Chu, Ping Li, Weijie Zhao, Yingjie Lao, Khoa D. Doan:

SpARK: An Embarrassingly Simple Sparse Watermarking in LLMs with Enhanced Text Quality. 4603-4626 - Elisabeth Fittschen, Sabrina Li, Tom Lippincott, Leshem Choshen, Craig Messner:

Pretraining Language Models for Diachronic Linguistic Change Discovery. 4627-4642 - Adyasha Patra, Dhiraj Kumar Sah, Preethi Jyothi:

Improving Language Identification for Code-Switched Speech: The Pivotal Role of Accented English. 4643-4656 - Aditya Sharma, Christopher Pal, Amal Zouaq:

Reducing Hallucinations in Language Model-based SPARQL Query Generation Using Post-Generation Memory Retrieval. 4657-4668 - Zhengyuan Jiang, Yuepeng Hu, Yuchen Yang, Yinzhi Cao, Neil Zhenqiang Gong:

Jailbreaking Safeguarded Text-to-Image Models via Large Language Models. 4669-4684 - Haoran Wang, Jiatong Shi, Jinchuan Tian, Bohan Li, Kai Yu, Shinji Watanabe:

BSCodec: A Band-Split Neural Codec for High-Quality Universal Audio Reconstruction. 4685-4697 - Jiwei Guan, Hai Jin, Haohan Wang:

Crafting Adversarial Inputs for Large Vision-Language Models Using Black-Box Optimization. 4698-4708 - Jiazheng Li, Yawei Wang, Qiaojing Yan, Yijun Tian, Zhichao Xu, Huan Song, Panpan Xu, Lin Lee Cheong:

SALT: Step-level Advantage Assignment for Long-horizon Agents via Trajectory Graph. 4709-4725 - Xiaojie Guo, Yang Zhang, Bing Zhang, Ryo Kawahara, Mikio Takeuchi, Yada Zhu:

UniToolBench: A Benchmark for Tool-Augmented LLMs in Cross-Domain, Universal Task Automation. 4726-4736 - Rohit Dutta, Paramita Koley, Soham Poddar, Janardan Misra, Sanjay Podder, Naveen Balani, Saptarshi Ghosh, Niloy Ganguly:

Benchmarking the Energy Savings with Speculative Decoding Strategies. 4737-4748 - Yeonkyoung So, Gyuseong Lee, Sungmok Jung, Joonhak Lee, JiA Kang, Sangho Kim, Jaejin Lee:

Thunder-NUBench: A Benchmark for LLMs' Sentence-Level Negation Understanding. 4749-4793 - William Watson, Nicole Cho, Sumitra Ganesh, Manuela Veloso:

What Makes a Good Query? Measuring the Impact of Human-Confusing Linguistic Features on LLM Performance. 4794-4827 - Zhe Cao, Yusuke Oda, Qianying Liu, Akiko Aizawa, Taro Watanabe:

Completely Modular Fine-tuning for Dynamic Language Adaptation. 4828-4845 - Mabrouka Bessghaier, Md. Rafiul Biswas, Shimaa Ibrahim, Wajdi Zaghouani:

A Multi-Task Learning Framework for Modeling Engagement and Topic-Sensitive Responses in Arabic Women's Discourse. 4846-4854 - Preston K. Robinette, Andrew Hard, Swaroop Ramaswamy, Ehsan Amid, Rajiv Mathews, Taylor T. Johnson:

We Are What We Repeatedly Do: Improving Long Context Instruction Following. 4855-4884 - Juseon-Do, Sungwoo Han, Jingun Kwon, Hidetaka Kamigaito, Manabu Okumura:

ConRAS: Contrastive In-context Learning Framework for Retrieval-Augmented Summarization. 4885-4900 - Juseon-Do, Sungwoo Han, Jingun Kwon, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe:

Beyond Sampling: Self-Sorting for Long-Context Ranking. 4901-4910 - Mike Zhou, Fenil Bardoliya, Vivek Gupta, Dan Roth:

Program-of-Thought Reveals LLM Abstraction Ceilings. 4911-4919 - Ahatsham Hayat, Hunter Tridle, Mohammad Rashedul Hasan:

From Numbers to Narratives: Efficient Language Model-Based Detection for Safety-Critical Minority Classes. 4920-4937 - Hyeonseok Kang, Hyuk Namgoong, Goun Pyeon, Sangkeun Jung:

R-GDA: Reflective Guidance Data Augmentation with Multi-Agent Feedback for Domain-Specific Named Entity Recognition. 4938-4953 - Daniel Israel, Aditya Grover, Guy Van den Broeck:

Enabling Autoregressive Models to Fill In Masked Tokens. 4954-4965 - Atsushi Shimizu, Shohei Taniguchi, Yutaka Matsuo:

Position Encoding with Random Float Sampling Enhances Length Generalization of Transformers. 4966-4980 - Di Wu, Siyue Liu, Zixiang Ji, Ya-Liang Chang, Zhe-Yu Liu, Andrew Pleffer, Kai-Wei Chang:

Open-Domain Safety Policy Construction. 4981-4999 - Junyeob Kim, Sang-goo Lee, Taeuk Kim:

Think Just Enough: Leveraging Self-Assessed Confidence for Adaptive Reasoning in Language Models. 5000-5006 - Zehui Jiang, Xin Zhao, Yuta Kumadaki, Naoki Yoshinaga:

CLICKER: Cross-Lingual Knowledge Editing via In-Context Learning with Adaptive Stepwise Reasoning. 5007-5022 - Shengqi Zhu, Jeffrey M. Rzeszotarski, David Mimno:

Show or Tell? Modeling the evolution of request-making in Human-LLM conversations. 5023-5034 - Carlo Alfano, Aymen Al Marjani, Zeno Jonke, Amin Mantrach, Saab Mansour, Marcello Federico:

Multilingual Self-Taught Faithfulness Evaluators. 5035-5051 - Dain Kim, Jiwoo Lee, Jaehoon Yun, Yonghoe Koo, Qingyu Chen, Hyunjae Kim, Jaewoo Kang:

Benchmarking Direct Preference Optimization for Medical Large Vision-Language Models. 5052-5067 - Jonas Becker, Lars Benedikt Kaesberg, Andreas Stephan, Jan Philip Wahle, Terry Ruas, Bela Gipp:

Stay Focused: Problem Drift in Multi-Agent Debate. 5068-5102 - Yulia Otmakhova, Thinh Hung Truong, Rahmad Mahendra, Zenan Zhai, Rongxin Zhu, Daniel Beck, Jey Han Lau:

FLUKE: A Linguistically-Driven and Task-Agnostic Framework for Robustness Evaluation. 5103-5123 - Yiheng Yang, Yujie Wang, Chi Ma, Lei Yu, Emmanuele Chersoni, Chu-Ren Huang:

Sparse Brains are Also Adaptive Brains: Cognitive-Load-Aware Dynamic Activation for LLMs. 5124-5138 - Anthony Hughes, Vasisht Duddu, N. Asokan, Nikolaos Aletras, Ning Ma:

PATCH: Mitigating PII Leakage in Language Models with Privacy-Aware Targeted Circuit PatcHing. 5139-5153 - Ettore Caputo, Sergio Greco, Lucio La Cava:

Argument Component Segmentation with Fine-Tuned Large Language Models. 5154-5167 - Yuliang Yan, Haochun Tang, Shuo Yan, Enyan Dai:

DuFFin: A Dual-Level Fingerprinting Framework for LLMs IP Protection. 5168-5184 - Asif Azad, Mohammad Sadat Hossain, MD Sadik Hossain Shanto, M. Saifur Rahman, Md. Rizwan Parvez:

The Art of Saying "Maybe": A Conformal Lens for Uncertainty Benchmarking in VLMs. 5185-5201 - Jihyeon Kim, Insung Lee, Myoung-Wan Koo:

Diagnosis of Dysarthria Severity and Explanation Generation Using XAI-Enhanced CLINIC-GENIE on Diadochokinetic Tasks. 5202-5222 - Raoyuan Zhao, Yihong Liu, Hinrich Schütze, Michael A. Hedderich:

A Comprehensive Evaluation of Multilingual Chain-of-Thought Reasoning: Performance, Consistency, and Faithfulness Across Languages. 5223-5247 - Andreea-Nicoleta Dutulescu, Stefan Ruseti, Mihai Dascalu, Danielle S. McNamara:

ORSO QGen: Odds-Ratio Steerable Optimization for Controlling Question Generation. 5248-5259 - Noam Dahan, Omer Kidron, Gabriel Stanovsky:

Leveraging Digitized Newspapers to Collect Summarization Data in Low-Resource Languages. 5260-5273 - Jingshen Zhang, Xin Ying Qiu, Lifang Lu, Zhuhua Huang, Yutao Hu, Yuechang Wu, JunYu Lu:

Let's Simplify Step by Step: Guiding LLM Towards Multilingual Unsupervised Proficiency-Controlled Sentence Simplification. 5274-5290 - Yupian Lin, Guangya Yu, Cheng Yuan, Huan Du, Hui Luo, Yuang Bian, Jingping Liu, Zhidong He, Wen Du, Tong Ruan:

LogToP: Logic Tree-of-Program with Table Instruction-tuned LLMs for Controlled Logical Table-to-Text Generation. 5291-5303 - Youngsoo Jang, Yu Jin Kim, Geon-Hyeong Kim, Honglak Lee, Moontae Lee:

IRPO: Implicit Policy Regularized Preference Optimization. 5304-5325 - Seffi Cohen, Nurit Cohen-Inger, Niv Goldshlager, Bracha Shapira, Lior Rokach:

DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance. 5326-5336 - Yiyang Wang, Chen Ding, Hangfeng He:

Ranking Human and LLM Texts Using Locality Statistics. 5337-5348 - Jeonghun Baek, Kazuki Egashira, Shota Onohara, Atsuyuki Miyai, Yuki Imajuku, Hikaru Ikuta, Kiyoharu Aizawa:

MangaVQA and MangaLMM: A Benchmark and Specialized Model for Multimodal Manga Understanding. 5349-5370 - Tzu-Cheng Peng, Chien Chin Chen, Yung-Chun Chang:

Hierarchical User Intent Inference with Knowledge Graph Grounding. 5371-5377 - Joe Stacey, Lisa Alazraki, Aran Ubhi, Beyza Ermis, Aaron Mueller, Marek Rei:

Improving the OOD Performance of Closed-Source LLMs on NLI Through Strategic Data Selection. 5378-5404 - Siwei Wu, King Zhu, Yu Bai, Yiming Liang, Yizhi Li, Haoning Wu, Jiaheng Liu, Ruibo Liu, Xingwei Qu, Xuxin Cheng, Ge Zhang, Wenhao Huang, Chenghua Lin:

MMRA: A Benchmark for Evaluating Multi-Granularity and Multi-Image Relational Association Capabilities in Large Visual Language Models. 5405-5419 - Siwei Wu, JinCheng Ren, Xeron Du, Shuyue Guo, Xingwei Qu, Yiming Liang, Jie Liu, Yunwen Li, Tyler Loakman, Tianyu Zheng, Boyu Feng, Huaqing Yuan, Zili Wang, Jiaheng Liu, Wenhao Huang, Chenglin Cai, Haoran Que, Jian Yang, Yuelin Bai, Zekun Moore Wang, Zhouliang Yu, Qunshu Lin, Ding Pan, Yuchen Eleanor Jiang, Tiannan Wang, Wangchunshu Zhou, Shenzhi Wang, Xingyuan Bu, Minghao Liu, Guoyin Wang, Ge Zhang, Chenghua Lin:

COIG-P: A High-Quality and Large-Scale Chinese Preference Dataset for Alignment with Human Values. 5420-5447 - Ningyuan Deng, Hanyu Duan, Yixuan Tang, Yi Yang:

Revealing the Numeracy Gap: An Empirical Investigation of Text Embedding Models. 5448-5461 - Yuliang Xu, Siming Huang, Mingmeng Geng, Yao Wan, Xuanhua Shi, Dongping Chen:

code-transformed: The Influence of Large Language Models on Code. 5462-5490 - Mukund Choudhary, Madhur Jindal, Gaurja Aeron, Monojit Choudhury:

Do LLMs model human linguistic variation? A case study in Hindi-English Verb code-mixing. 5491-5509 - Mohammed Bouri, Mohammed Erradi, Adnane Saoud:

ART: Attention-Regularized Transformers for Multi-Modal Robustness. 5510-5535 - Himanshu Chaudhary, Ruida Wang, Gowtham Ramesh, Junjie Hu:

GRAFF: GRaph-Augmented Fine-grained Fusion for Large Language Models. 5536-5547 - Jerry Huang, Siddarth Madala, Risham Sidhu, Cheng Niu, Hao Peng, Julia Hockenmaier, Tong Zhang:

Tackling Distractor Documents in Multi-Hop QA with Reinforcement and Curriculum Learning. 5548-5561 - Andrei Vlad Man, Razvan-Alexandru Smadu, Cristian-George Craciun, Dumitru-Clementin Cercel, Florin Pop, Mihaela-Claudia Cercel:

RoD-TAL: A Benchmark for Answering Questions in Romanian Driving License Exams. 5562-5602 - Albert Sawczyn, Jakub Binkowski, Denis Janiak, Bogdan Gabrys, Tomasz Kajdanowicz:

FactSelfCheck: Fact-Level Black-Box Hallucination Detection for LLMs. 5603-5621 - Sonakshi Chauhan, Maheep Chaudhary, Kwan Kiu Choy, Samuel Nellessen, Nandi Schoots:

Punctuations and Predicates in Language Models. 5622-5636 - Mandeep Rathee, Venktesh V, Sean MacAvaney, Avishek Anand:

Test-time Corpus Feedback: From Retrieval to RAG. 5637-5656 - Anku Rani, Aparna Garimella, Apoorv Saxena, Balaji Vasan Srinivasan, Paul Pu Liang:

RADAR: A Reasoning-Guided Attribution Framework for Explainable Visual Data Analysis. 5657-5677 - Rifat Rafiuddin:

MaskLoRA: Low-Rank Subspace-Induced Token Masking for Efficient and Faithful Language Models. 5678-5692 - Marco Martinelli, Stefano Marchesin, Vanessa Bonato, Giorgio Maria Di Nunzio, Nicola Ferro, Ornella Irrera, Laura Menotti, Federica Vezzani, Gianmaria Silvello:

A Domain-Specific Curated Benchmark for Entity and Document-Level Relation Extraction. 5693-5711 - Yongxin Zhou, Changshun Wu, Philippe Mulhem, Didier Schwab, Maxime Peyrard:

What Matters to an LLM? Behavioral and Computational Evidences from Summarization. 5712-5737 - Max Pellert, Clemens Lechner, Indira Sen, Markus Strohmaier:

Neural network embeddings recover value dimensions from psychometric survey items on par with human data. 5738-5752 - Dwip Dalal, Madhav Kanda, Zhenhailong Wang, Heng Ji, Unnat Jain:

Compositional Reasoning via Joint Image and Language Decomposition. 5753-5775 - Manan Roy Choudhury, Adithya Chandramouli, Mannan Anand, Vivek Gupta:

Better Call CLAUSE: A Discrepancy Benchmark for Auditing LLMs Legal Reasoning Capabilities. 5776-5818 - Kuangdai Leng, Jia Bi, Samuel Pinilla, Jaehoon Cha:

Token-Wise Kernels (TWiKers) for Vicinity-Aware Attention in Transformers. 5819-5835 - Jeffrey Li, Joshua P. Gardner, Doug Kang, Fangping Shi, Karanjeet Singh, Chun-Liang Li, Herumb Shandilya, David Leo Wright Hall, Oncel Tuzel, Percy Liang, Ludwig Schmidt, Hadi Pouransari, Fartash Faghri:

Beyond a Single Extractor: Re-thinking HTML-to-Text Extraction for LLM Pre-training. 5836-5861 - Michal Pietruszka, Lukasz Borchmann, Aleksander Jedrosz, Pawel Morawiecki:

Can Models Help Us Create Better Models? Evaluating LLMs as Data Scientists. 5862-5886 - Jabez Magomere, Elena Kochkina, Samuel Mensah, Simerjot Kaur, Fernando Acero, Arturo Oncevay, Charese Smiley, Xiaomo Liu, Manuela Veloso:

Distill and Align Decomposition for Enhanced Claim Verification. 5887-5912 - Ramaravind Kommiya Mothilal, Joanna Roy, Syed Ishtiaque Ahmed, Shion Guha:

Argument-Based Consistency in Toxicity Explanations of LLMs. 5913-5941 - Seyyed Saeid Cheshmi, Hahnemann Ortiz, James Mooney, Dongyeop Kang:

Reasoning Beyond Literal: Cross-style Multimodal Reasoning for Figurative Language Understanding. 5942-5956 - Arthur Satouf, Yuxuan Zong, Habiboulaye Amadou Boubacar, Pablo Piantanida, Benjamin Piwowarski:

QueStER: Query Specification for Generative Keyword-Based Retrieval. 5957-5968 - Moghis Fereidouni, Muhammad Umair Haider, Peizhong Ju, A. B. Siddique:

Evaluating Sparse Autoencoders for Monosemantic Representation. 5969-5984 - Abdullah Al Monsur, Nitesh Vamshi Bommisetty, Gene Louis Kim:

Event Detection with a Context-Aware Encoder and LoRA for Improved Performance on Long-Tailed Classes. 5985-6003 - Hyewon Suh, Chaojian Li, Cheng-Jhih Shih, Zheng Wang, Kejing Xia, Yonggan Fu, Yingyan Celine Lin:

Think Hard Only When Needed: A Hybrid Best-of-N and Beam Search for Efficient Test-Time Compute. 6004-6017 - Dorde Klisura, Joseph Khoury, Ashish Kundu, Ram Krishnan, Anthony Rios:

Role-Conditioned Refusals: Evaluating Access Control Reasoning in Large Language Models. 6018-6034 - Rizky Ramadhana Putra, Raihan Sultan Pasha Basuki, Yutong Cheng, Peng Gao:

NL2Logic: AST-Guided Translation of Natural Language into First-Order Logic with Large Language Models. 6035-6051 - Aditya Bharat Soni, Boxuan Li, Xingyao Wang, Valerie Chen, Graham Neubig:

Coding Agents with Multimodal Browsing are Generalist Problem Solvers. 6052-6069 - Jongwook Han, Woojung Song, Jonggeun Lee, Yohan Jo:

Quantifying Data Contamination in Psychometric Evaluations of LLMs. 6070-6088 - Song-ha Jo, Youngrok Ko, Sang-goo Lee, Jinseok Seol:

Task-aware Block Pruning with Output Distribution Signals for Large Language Models. 6089-6107 - Jishnu Warrier, Heqing Huang, Yuzhang Lin, Sai Qian Zhang:

LARA: LLM-based Agile Power Distribution Network Restoration from Disastrous Events. 6108-6116 - Mohammad Khodadad, Ali Shiraee Kasmaee, Mahdi Astaraki, Nicholas Sherck, Hamidreza Mahyar, Soheila Samiee:

Evaluating Multi-Hop Reasoning in Large Language Models: A Chemistry-Centric Benchmark. 6117-6143 - Kshitij Mishra, Nils Lukas, Salem Lahlou:

SD-E2: Semantic Exploration for Reasoning Under Token Budgets. 6144-6157 - Haiyun Huang, Yukun Li, Marco A Pretell, Jacob Naroian, Ebadah Khan, Liping Liu:

How to Contextualize Empirical Data for Risk Analysis with LLMs: A Case Study of Power Outages. 6158-6172 - Minghan Zhang, Shu Zhao, Zhen Yang, Hongsheng Wu, Yongxing Lin, Haodong Zou, Jie Chen, Zhen Duan:

Thinking Beyond the Local: Multi-View Instructed Adaptive Reasoning in KG-Enhanced LLMs. 6173-6188 - Lekkala Sai Teja, Siva Gopala Krishna Nuthakki, Ufaq Khan, Muhammad Haris Khan, Atul Mishra:

DAMASHA: Detecting AI in Mixed Adversarial Texts via Segmentation with Human-interpretable Attribution. 6189-6206 - Juhyun Oh, Nayeon Lee, Chani Jung, Jiho Jin, Junho Myung, Jongwon Lee, Taieui Song, Alice Oh:

FINEST: Improving LLM Responses to Sensitive Topics Through Fine-Grained Evaluation. 6207-6226 - Junbo Li, Peng Zhou, Rui Meng, Meet P. Vadera, Lihong Li, Yang Li:

Turn-PPO: Turn-Level Advantage Estimation with PPO for Improved Multi-Turn RL in Agentic LLMs. 6227-6243 - Minhua Lin, Zhengzhang Chen, Yanchi Liu, Xujiang Zhao, Zongyu Wu, Junxiang Wang, Xiang Zhang, Suhang Wang, Haifeng Chen:

Decoding Time Series with LLMs: A Multi-Agent Framework for Cross-Domain Annotation. 6244-6281 - Sello Ralethe, Jan Buys:

Multi-Hall-SA: A Cross-lingual Benchmark for Multi-Type Hallucination Detection in Low-Resource South African Languages. 6282-6296 - Joonghyuk Hahn, Yo-Sub Han:

Query4Regex: Verifiable Regex Transformation through Formal Operations from NL and DSL Queries. 6297-6305 - Sanjeev Kumar, Preethi Jyothi, Pushpak Bhattacharyya:

SrcMix: Mixing of Related Source Languages Benefits Extremely Low-resource Machine Translation. 6306-6323 - Yash Saxena, Ankur Padia, Kalpa Gunaratna, Manas Gaur:

IMRNNs: An Efficient Method for Interpretable Dense Retrieval via Embedding Modulation. 6324-6337 - Shuyi Zhang, Zhenbin Chen, Shuting Li, Kewei Tu, Li Jing, Zixia Jia, Zilong Zheng:

MMUIE: Massive Multi-Domain Universal Information Extraction for Long Documents. 6338-6370 - Clemencia Siro, Pourya Aliannejadi, Mohammad Aliannejadi:

Learning to Judge: LLMs Designing and Applying Evaluation Rubrics. 6371-6389 - Sohhyung Park, Hyunji Kang, Sungzoon Cho, Dongil Kim:

PsyProbe: Proactive and Interpretable Dialogue through User State Modeling for Exploratory Counseling. 6390-6411 - Liel Binyamin, Elior Sulem:

Learning from Child-directed Speech in Two-language Scenarios: A French-English Case-Study. 6412-6426 - Camila Zurdo Tagliabue, Heloísa Oss Boll, Aykut Erdem, Erkut Erdem, Iacer Calixto:

DeVisE: Towards the Behavioral Testing of Medical Large Language Models. 6427-6441 - Matija Luka Kukic, Marko Culjak, David Dukic, Martin Tutek, Jan Snajder:

Sequence Repetition Enhances Token Embeddings and Improves Sequence Labeling with Decoder-only Language Models. 6442-6456 - Omid Ghahroodi, Arshia Hemmat, Marzia Nouri, Seyed Mohammad Hadi Hosseini, Doratossadat Dastgheib, Mohammad V. Sanian, Alireza Sahebi, Reihaneh Zohrabi, Mohammad Hossein Rohban, Ehsaneddin Asgari, Mahdieh Soleymani Baghshah:

MEENA (PersianMMMU): Multimodal-Multilingual Educational Exams for N-level Assessment. 6457-6491 - Taido Purason, Pavel Chizhov, Ivan P. Yamshchikov, Mark Fishel:

Teaching Old Tokenizers New Words: Efficient Tokenizer Adaptation for Pretrained Models. 6492-6516 - Lekkala Sai Teja, Ashok Urlana, Pruthwik Mishra:

AGIC: Attention-Guided Image Captioning to Improve Caption Relevance. 6517-6528 - Jieun Kim, Yujin Jeong, Sung-Bae Cho:

Visual-Linguistic Abductive Reasoning with LLMs for Knowledge-based Visual Question Answering. 6529-6544 - Guy Mor-Lan, Tamir Sheafer, Shaul R. Shenhav:

FactAppeal: Identifying Epistemic Factual Appeals in News Media. 6545-6556 - Thi Vu, Linh The Nguyen, Dat Quoc Nguyen:

Vietnamese Automatic Speech Recognition: A Revisit. 6557-6568 - Woongkyu Lee, Junhee Cho, Jungwook Choi:

MapCoder-Lite: Distilling Multi-Agent Coding into a Single Small LLM. 6569-6596 - Keenan Samway, Miu Nicole Takagi, Rada Mihalcea, Bernhard Schölkopf, Ilias Chalkidis, Daniel Hershcovich, Zhijing Jin:

When Do Language Models Endorse Limitations on Human Rights Principles? 6597-6623 - Lamisa Bintee Mizan Deya, Farhatun Shama, Abdul Aziz, Md Kaykobad Reza, Md. Shahidul Salim:

Abstractive Summarization of Bengali Academic Videos Based on Audio Subtitles. 6624-6643 - Bonaventure F. P. Dossou, Ines Arous, Audrey Durand, Jackie Chi Kit Cheung:

Active Learning with Non-Uniform Costs for African Natural Language Processing. 6644-6656 - Giacomo Gonella, Gian Maria Campedelli, Stefano Menini, Marco Guerini:

CrisiText: A dataset of warning messages for LLM training in emergency communication. 6657-6677 - Lukas Christ, Shahin Amiriparian:

Training-Free Text Emotion Tagging via LLM-Based Best-Worst Scaling. 6678-6694 - Hayk Stepanyan, Aishwarya Verma, Andrew Zaldivar, Rutledge Chin Feman, Erin MacMurray van Liemt, Charu Kalia, Vinodkumar Prabhakaran, Sunipa Dev:

Scaling Cultural Resources for Improving Generative Models. 6695-6709 - Sultan AlRashed, Jianghui Wang, Francesco Orabona:

Cards Against Contamination: TCG-Bench for Difficulty-Scalable Multilingual LLM Reasoning. 6710-6724 - Shounak Paul, Raghav Dogra, Pawan Goyal, Saptarshi Ghosh:

ILSIC: Corpora for Identifying Indian Legal Statutes from Queries by Laymen. 6725-6746
