


LLM4Eval@SIGIR 2024: Washington DC, USA
- Clemencia Siro, Mohammad Aliannejadi, Hossein A. Rahmani, Nick Craswell, Charles L. A. Clarke, Guglielmo Faggioli, Bhaskar Mitra, Paul Thomas, Emine Yilmaz:

Proceedings of The First Workshop on Large Language Models for Evaluation in Information Retrieval (LLM4Eval 2024) co-located with the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024), Washington D.C., USA, July 18, 2024. CEUR Workshop Proceedings 3752, CEUR-WS.org 2024
LLMJudge Challenge Overview
- Hossein A. Rahmani, Emine Yilmaz, Nick Craswell, Bhaskar Mitra, Paul Thomas, Charles L. A. Clarke, Mohammad Aliannejadi, Clemencia Siro, Guglielmo Faggioli:

LLMJudge: LLMs for Relevance Judgments. 1-3
Research Papers
- Bhashithe Abeysinghe, Ruhan Circi:

The Challenges of Evaluating LLM Applications: An Analysis of Automated, Human, and LLM-Based Approaches. 4-18
- Gabriel de Jesus, Sérgio Nunes:

Exploring Large Language Models for Relevance Judgments in Tetun. 19-30
- Naghmeh Farzi, Laura Dietz:

EXAM++: LLM-based Answerability Metrics for IR Evaluation. 31-50
- Jia-Hong Huang, Hongyi Zhu, Yixian Shen, Stevan Rudinac, Alessio M. Pacces, Evangelos Kanoulas:

A Novel Evaluation Framework for Image2Text Generation. 51-65
- Hyunwoo Kim, Yoonseo Choi, Taehyun Yang, Honggu Lee, Chaneon Park, Yongju Lee, Jin Young Kim, Juho Kim:

Using LLMs to Investigate Correlations of Conversational Follow-up Queries with User Satisfaction. 66-91
- Zackary Rackauckas, Arthur Câmara, Jakub Zavrel:

Evaluating RAG-Fusion with RAGElo: an Automated Elo-based Framework. 92-112
- Jheng-Hong Yang, Jimmy Lin:

Toward Automatic Relevance Judgment using Vision-Language Models for Image-Text Retrieval Evaluation. 113-123
