


25th ISMIR 2024: San Francisco, CA, USA
- Blair Kaneshiro, Gautham J. Mysore, Oriol Nieto, Chris Donahue, Cheng-Zhi Anna Huang, Jin Ha Lee, Brian McFee, Matthew C. McCallum:
Proceedings of the 25th International Society for Music Information Retrieval Conference, ISMIR 2024, San Francisco, California, USA and Online, November 10-14, 2024. 2024, ISBN 978-1-7327299-4-0 - Zeng Ren, Yannis Rammos, Martin A. Rohrmeier:
Formal Modeling of Structural Repetition Using Tree Compression. 53-60 - Adithi Shankar, Genís Plaja-Roglans, Thomas Nuttall, Martín Rocamora, Xavier Serra:
Saraga Audiovisual: A Large Multimodal Open Data Collection for the Analysis of Carnatic Music. 61-69 - Xingjian Du, Mingyu Liu, Pei Zou, Xia Liang, Zijie Wang, Huidong Liang, Bilei Zhu:
X-Cover: Better Music Version Identification System by Integrating Pretrained ASR Model. 70-77 - Emmanuel Deruty:
Harmonic and Transposition Constraints Arising From the Use of the Roland TR-808 Bass Drum. 78-85 - Hitoshi Suda, Shunsuke Yoshida, Tomohiko Nakamura, Satoru Fukayama, Jun Ogata:
FruitsMusic: A Real-World Corpus of Japanese Idol-Group Songs. 86-94 - Marios Glytsos, Christos Garoufis, Athanasia Zlatintsi, Petros Maragos:
Classical Guitar Duet Separation Using GuitarDuets - A Dataset of Real and Synthesized Guitar Recordings. 95-102 - Ziya Zhou, Yuhang Wu, Zhiyue Wu, Xinyue Zhang, Ruibin Yuan, Yinghao Ma, Lu Wang, Emmanouil Benetos, Wei Xue, Yike Guo:
Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation. 103-110 - Marco Pasini, Stefan Lattner, George Fazekas:
Music2Latent: Consistency Autoencoders for Latent Audio Compression. 111-119 - Johannes Zeitler, Ben Maman, Meinard Müller:
Robust and Accurate Audio Synchronization Using Raw Features From Transcription Models. 120-127 - Takayuki Nakatsuka, Masahiro Hamasaki, Masataka Goto:
Harnessing the Power of Distributions: Probabilistic Representation Learning on Hypersphere for Multimodal Music Information Retrieval. 128-136 - Andrew M. Demetriou, Jaehun Kim, Sandy Manolios, Cynthia C. S. Liem:
Towards Automated Personal Value Estimation in Song Lyrics. 137-145 - Simon Rouard, Yossi Adi, Jade Copet, Axel Roebel, Alexandre Défossez:
Audio Conditioning for Music Generation via Discrete Bottleneck Features. 146-153 - Chenyu Gao, Federico Reuben, Tom Collins:
Variation Transformer: New Datasets, Models, and Comparative Evaluation for Symbolic Music Variation Generation. 154-163 - Vjosa Preniqi, Iacopo Ghinassi, Julia Ive, Kyriaki Kalimeri, Charalampos Saitis:
Automatic Detection of Moral Values in Music Lyrics. 164-172 - Sebastian Strahl, Meinard Müller:
Semi-Supervised Piano Transcription Using Pseudo-Labeling Techniques. 173-181 - Huiran Yu, Zhiyao Duan:
Note-Level Transcription of Choral Music. 182-188 - Tsung-Ping Chen, Kazuyoshi Yoshii:
Learning Multifaceted Self-Similarity Over Time and Frequency for Music Structure Analysis. 189-197 - Antonin Gagneré, Slim Essid, Geoffroy Peeters:
A Contrastive Self-Supervised Learning Scheme for Beat Tracking Amenable to Few-Shot Learning. 198-206 - Morgan Buisson, Brian McFee, Slim Essid:
Using Pairwise Link Prediction and Graph Attention Networks for Music Structure Analysis. 207-214 - Danbinaerin Han, Mark R. H. Gotham, Dongmin Kim, Hannah Park, Sihun Lee, Dasaem Jeong:
Six Dragons Fly Again: Reviving 15th-Century Korean Court Music With Transformers and Novel Encoding. 217-224 - David Rizo, Jorge Calvo-Zaragoza, Patricia García-Iasci, Teresa Delgado-Sánchez:
Lessons Learned From a Project to Encode Mensural Music on a Large Scale With Optical Music Recognition. 225-231 - Elena Georgieva, Pablo Ripollés, Brian McFee:
The Changing Sound of Music: An Exploratory Corpus Study of Vocal Trends Over Time. 232-239 - Pedro Ramoneda, Martín Rocamora, Taketo Akama:
Music Proofreading With RefinPaint: Where and How to Modify Compositions Given Context. 240-247 - Yigitcan Özer, Hans-Ulrich Berendes, Vlora Arifi-Müller, Fabian-Robert Stöter, Meinard Müller:
Notewise Evaluation for Music Source Separation: A Case Study for Separated Piano Tracks. 248-255 - Jyoti Narang, Nazif Can Tamer, Viviana De La Vega, Xavier Serra:
Automatic Estimation of Singing Voice Musical Dynamics. 256-263 - Or Tal, Alon Ziv, Itai Gat, Felix Kreuk, Yossi Adi:
Joint Audio and Symbolic Conditioning for Temporally Controlled Text-to-Music Generation. 264-271 - Javier Nistal, Marco Pasini, Cyran Aouameur, Maarten Grachten, Stefan Lattner:
Diff-a-Riff: Musical Accompaniment Co-Creation via Latent Diffusion Models. 272-280 - Ngan V. T. Nguyen, Elizabeth Acosta, Tommy Dang, David R. W. Sears:
Exploring Internet Radio Across the Globe With the MIRAGE Online Dashboard. 281-287 - Andrew C. Edwards, Xavier Riley, Pedro Sarmento, Simon Dixon:
MIDI-to-Tab: Guitar Tablature Inference via Masked Language Modeling. 288-294 - Jaehun Kim, Florian Henkel, Camilo Landau, Samuel E. Sandberg, Andreas F. Ehmann:
Transcription-Based Lyrics Embeddings: Simple Extraction of Effective Lyrics Embeddings From Audio. 295-303 - Hyon Kim, Xavier Serra:
A Method for MIDI Velocity Estimation for Piano Performance by a U-Net With Attention and FiLM. 304-310 - Yun-Han Lan, Wen-Yi Hsiao, Hao-Chung Cheng, Yi-Hsuan Yang:
MusiConGen: Rhythm and Chord Control for Transformer-Based Text-to-Music Generation. 311-318 - Tim Beyer, Angela Dai:
End-to-End Piano Performance-MIDI to Score Conversion With Transformers. 319-326 - Dorian Desblancs, Gabriel Meseguer-Brocal, Romain Hennequin, Manuel Moussallam:
From Real to Cloned Singer Identification. 327-334 - Jingyue Huang, Ke Chen, Yi-Hsuan Yang:
Emotion-Driven Piano Music Generation via Two-Stage Disentanglement and Functional Representation. 335-342 - Jiajun Deng, Yaolong Ju, Jing Yang, Simon Lui, Xunying Liu:
Efficient Adapter Tuning for Joint Singing Voice Beat and Downbeat Tracking With Self-Supervised Learning Features. 343-351 - Eun Ji Oh, Hyunjae Kim, Kyung Myun Lee:
Which Audio Features Can Predict the Dynamic Musical Emotions of Both Composers and Listeners? 352-359 - Julia Barnett, Hugo Flores García, Bryan Pardo:
Exploring Musical Roots: Applying Audio Embeddings to Empower Influence Attribution for a Generative Music Model. 360-368 - Andre Holzapfel, Anna-Kaisa Kaila, Petra Jääskeläinen:
Green MIR? Investigating Computational Cost of Recent Music-AI Research in ISMIR. 371-380 - Seikoh Fukuda, Yuko Fukuda, Masamichi Hosoda, Ami Motomura, Eri Sasao, Masaki Matsubara, Masahiro Niitsuma:
Field Study on Children's Home Piano Practice: Developing a Comprehensive System for Enhanced Student-Teacher Engagement. 381-388 - Brian Bemman, Justin Christensen:
Inner Metric Analysis as a Measure of Rhythmic Syncopation. 389-396 - Lidia Morris, Rebecca Leger, Michele Newman, John Ashley Burgoyne, Ryan Groves, Natasha Mangal, Jin Ha Lee:
HAISP: A Dataset of Human-AI Songwriting Processes From the AI Song Contest. 397-404 - Giulia Argüello, Luca A. Lanzendörfer, Roger Wattenhofer:
Cue Point Estimation Using Object Detection. 405-412 - Kartik Ohri, Robert Kaye:
The ListenBrainz Listens Dataset. 413-419 - Marco Comunità, Zhi Zhong, Akira Takahashi, Shiqi Yang, Mengjie Zhao, Koichi Saito, Yukara Ikemiya, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji:
SpecMaskGIT: Masked Generative Modeling of Audio Spectrogram for Efficient Audio Synthesis and Beyond. 420-428 - Zach Evans, Julian D. Parker, CJ Carr, Zachary Zukowski, Josiah Taylor, Jordi Pons:
Long-Form Music Generation With Latent Diffusion. 429-437 - Martin E. Malandro:
Composer's Assistant 2: Interactive Multi-Track MIDI Infilling With Fine-Grained User Control. 438-445 - Yu-Hua Chen, Yen-Tung Yeh, Yuan-Chiao Cheng, Jui-Te Wu, Yu-Hsiang Ho, Jyh-Shing Roger Jang, Yi-Hsuan Yang:
Towards Zero-Shot Amplifier Modeling: One-to-Many Amplifier Modeling via Tone Embedding Control. 446-453 - Ju-Chiang Wang, Wei-Tsung Lu, Jitong Chen:
Mel-RoFormer for Vocal Separation and Vocal Melody Transcription. 454-461 - Noelia N. Luna-Barahona, Adrian Rosello, María Alfaro-Contreras, David Rizo, Jorge Calvo-Zaragoza:
Unsupervised Synthetic-to-Real Adaptation for Optical Music Recognition. 462-469 - Jinlong Zhu, Keigo Sakurai, Ren Togo, Takahiro Ogawa, Miki Haseyama:
MMT-BERT: Chord-Aware Symbolic Music Generation Based on Multitrack Music Transformer and MusicBERT. 470-477 - Recep Oguz Araz, Xavier Serra, Dmitry Bogdanov:
Discogs-VI: A Musical Version Identification Dataset Based on Public Editorial Metadata. 478-485 - Nicholas Cornia, Bruno Forment:
Who's Afraid of the 'Artyfyshall Byrd'? Historical Notions and Current Challenges of Musical Artificiality. 486-492 - Yaolong Ju, Chun Yat Wu, Betty Cortiñas-Lorenzo, Jing Yang, Jiajun Deng, Fan Fan, Simon Lui:
End-to-End Automatic Singing Skill Evaluation Using Cross-Attention and Data Augmentation for Solo Singing and Singing With Accompaniment. 493-500 - Francesco Foscarin, Emmanouil Karystinaios, Eita Nakamura, Gerhard Widmer:
Cluster and Separate: A GNN Approach to Voice and Staff Prediction for Score Engraving. 503-510 - Huan Zhang, Jinhua Liang, Simon Dixon:
From Audio Encoders to Piano Judges: Benchmarking Performance Understanding for Solo Piano. 511-519 - Pedro Ramoneda, Vsevolod Eremenko, Alexandre D'Hooge, Emilia Parada-Cabaleiro, Xavier Serra:
Towards Explainable and Interpretable Musical Difficulty Estimation: A Parameter-Efficient Approach. 520-528 - Michele Newman, Lidia J. Morris, Jun Kato, Masataka Goto, Jason Yip, Jin Ha Lee:
Purposeful Play: Evaluation and Co-Design of Casual Music Creation Applications With Children. 529-539 - Nicholas Evans, Behzad Haki, Daniel Gómez-Marín, Sergi Jordà:
El Bongosero: A Crowd-Sourced Symbolic Dataset of Improvised Hand Percussion Rhythms Paired With Drum Patterns. 540-546 - Joanne Affolter, Martin A. Rohrmeier:
Utilizing Listener-Provided Tags for Music Emotion Recognition: A Data-Driven Approach. 547-554 - Chih-Pin Tan, Hsin Ai, Yi-Hsin Chang, Shuen-Huei Guan, Yi-Hsuan Yang:
PiCoGen2: Piano Cover Generation With Transfer Learning Approach and Weakly Aligned Data. 555-562 - Soumya Sai Vanka, Christian J. Steinmetz, Jean-Baptiste Rolland, Joshua D. Reiss, George Fazekas:
Diff-MST: Differentiable Mixing Style Transfer. 563-570 - Julien Guinot, Elio Quinton, George Fazekas:
Semi-Supervised Contrastive Learning of Musical Representations. 571-579 - Léo Géré, Nicolas Audebert, Philippe Rigaux:
Improved Symbolic Drum Style Classification With Grammar-Based Hierarchical Representations. 580-587 - Jiwoo Ryu, Hao-Wen Dong, Jongmin Jung, Dasaem Jeong:
Nested Music Transformer: Sequentially Decoding Compound Tokens in Symbolic Music and Audio Generation. 588-595 - Pedro González-Barrachina, María Alfaro-Contreras, Jorge Calvo-Zaragoza:
Continual Learning for Music Classification. 596-602 - Silvan Peter, Gerhard Widmer:
TheGlueNote: Learned Representations for Robust and Flexible Note Alignment. 603-610 - Xavier Riley, Zixun Guo, Andrew C. Edwards, Simon Dixon:
GAPS: A Large and Diverse Classical Guitar Dataset and Benchmark Transcription Model. 611-617 - Hugo T. Carvalho, Min Susan Li, Massimiliano Di Luca, Alan M. Wing:
A Kalman Filter Model for Synchronization in Musical Ensembles. 618-624 - Alain Riou, Stefan Lattner, Gaëtan Hadjeres, Michael Anslow, Geoffroy Peeters:
Stem-JEPA: A Joint-Embedding Predictive Architecture for Musical Stem Compatibility Estimation. 625-633 - Fang-Duo Tsai, Shih-Lun Wu, Haven Kim, Bo-Yu Chen, Hao-Chung Cheng, Yi-Hsuan Yang:
Audio Prompt Adapter: Unleashing Music Editing Abilities for Text-to-Music With Lightweight Finetuning. 634-641 - Shangda Wu, Yashan Wang, Xiaobing Li, Feng Yu, Maosong Sun:
MelodyT5: A Unified Score-to-Score Transformer for Symbolic Music Processing. 642-650 - Emmanouil Karystinaios, Gerhard Widmer:
GraphMuse: A Library for Symbolic Music Graph Processing. 651-658 - Christian J. Steinmetz, Shubhr Singh, Marco Comunità, Ilias Ibnyahya, Shanxin Yuan, Emmanouil Benetos, Joshua D. Reiss:
ST-ITO: Controlling Audio Effects for Style Transfer With Inference-Time Optimization. 661-668 - Qixin Deng, Qikai Yang, Ruibin Yuan, Yipeng Huang, Yi Wang, Xubo Liu, Zeyue Tian, Jiahao Pan, Ge Zhang, Hanfeng Lin, Yizhi Li, Yinghao Ma, Jie Fu, Chenghua Lin, Emmanouil Benetos, Wenwu Wang, Guangyu Xia, Wei Xue, Yike Guo:
ComposerX: Multi-Agent Symbolic Music Composition With LLMs. 669-679 - Megan Wei, Michael Freeman, Chris Donahue, Chen Sun:
Do Music Generation Models Encode Music Theory? 680-687 - Silas Antonisen, Iván López-Espejo:
PolySinger: Singing-Voice to Singing-Voice Translation From English to Japanese. 688-696 - Arthur Flexer:
On the Validity of Employing ChatGPT for Distant Reading of Music Similarity. 697-704 - Venkatakrishnan Vaidyanathapuram Krishnan, Noel Alben, Anish A. Nair, Nathaniel Condit-Schultz:
Sanidha: A Studio Quality Multi-Modal Dataset for Carnatic Music. 705-712 - Pedro Sarmento, Jackson Loth, Mathieu Barthet:
Between the AI and Me: Analysing Listeners' Perspectives on AI- and Human-Composed Progressive Metal Music. 713-720 - Nils Demerlé, Philippe Esling, Guillaume Doras, David Genova:
Combining Audio Control and Style Transfer Using Latent Diffusion. 721-728 - Mequanent Argaw Muluneh, Yan-Tsung Peng, Li Su:
Computational Analysis of Yaredawi YeZema Silt in Ethiopian Orthodox Tewahedo Church Chants. 729-736 - Ondrej Cífka, Hendrik Schreiber, Luke Miner, Fabian-Robert Stöter:
Lyrics Transcription for Humans: A Readability-Aware Benchmark. 737-744 - Owen Green, Bob L. T. Sturm, Georgina Born, Melanie Wald-Fuhrmann:
A Critical Survey of Research in Music Genre Recognition. 745-782 - Liwei Lin, Gus Xia, Junyan Jiang, Yixiao Zhang:
Content-Based Controls for Music Large Language Modeling. 783-790 - Marcel A. Vélez Vásquez, Charlotte Pouw, John Ashley Burgoyne, Willem H. Zuidema:
Exploring the Inner Mechanisms of Large Generative Music Models. 791-798 - Saebyul Park, Halla Kim, Jiye Jung, Juyong Park, Jeounghoon Kim, Juhan Nam:
Quantitative Analysis of Melodic Similarity in Music Copyright Infringement Cases. 799-806 - Hendrik Vincent Koops, Gianluca Micchi, Elio Quinton:
Robust Lossy Audio Compression Identification. 807-813 - Malcolm Sailor:
RNBert: Fine-Tuning a Masked Language Model for Roman Numeral Analysis. 814-821 - Benno Weck, Ilaria Manco, Emmanouil Benetos, Elio Quinton, George Fazekas, Dmitry Bogdanov:
MuChoMusic: Evaluating Music Understanding in Multimodal Audio-Language Models. 825-833 - Sujoy Roychowdhury, Preeti Rao, Sharat Chandran:
Human Pose Estimation for Expressive Movement Descriptors in Vocal Musical Performances. 834-841 - Seokbeom Park, Hyunjae Kim, Kyung Myun Lee:
Enhancing Predictive Models of Music Familiarity With EEG: Insights From Fans and Non-Fans of K-Pop Group NCT127. 842-849 - Robert Sowula, Peter Knees:
Mosaikbox: Improving Fully Automatic DJ Mixing Through Rule-Based Stem Modification and Precise Beat-Grid Estimation. 850-857 - Jan Melechovský, Abhinaba Roy, Dorien Herremans:
MidiCaps: A Large-Scale MIDI Dataset With Text Captions. 858-865 - Stephen Ni-Hahn, Weihan Xu, Zirui Yin, Rico Zhu, Simon Mak, Yue Jiang, Cynthia Rudin:
A New Dataset, Notation Software, and Representation for Computational Schenkerian Analysis. 866-873 - Zachary Novack, Julian J. McAuley, Taylor Berg-Kirkpatrick, Nicholas J. Bryan:
DITTO-2: Distilled Diffusion Inference-Time T-Optimization for Music Generation. 874-881 - Christopher J. Tralie, Ben Cantil:
The Concatenator: A Bayesian Approach to Real Time Concatenative Musaicing. 882-889 - Muhammad Taimoor Haseeb, Ahmad Hammoudeh, Gus Xia:
Deep Recombinant Transformer: Enhancing Loop Compatibility in Digital Music Production. 890-896 - Yannis Vasilakis, Rachel M. Bittner, Johan Pauwels:
I Can Listen but Cannot Read: An Evaluation of Two-Tower Multimodal Systems for Instrument Recognition. 897-905 - Weixing Wei, Jiahao Zhao, Yulun Wu, Kazuyoshi Yoshii:
Streaming Piano Transcription Based on Consistent Onset and Offset Decoding With Sustain Pedal Detection. 906-913 - Juan Carlos Martinez-Sevilla, David Rizo, Jorge Calvo-Zaragoza:
Towards Universal Optical Music Recognition: A Case Study on Notation Types. 914-921 - Mathias Rose Bjare, Stefan Lattner, Gerhard Widmer:
Controlling Surprisal in Music Generation via Information Content Curve Matching. 922-929 - Guang Yang, Muru Zhang, Lin Qiu, Yanming Wan, Noah A. Smith:
Toward a More Complete OMR Solution. 930-937 - Ilaria Manco, Justin Salamon, Oriol Nieto:
Augment, Drop & Swap: Improving Diversity in LLM Captions for Efficient Music-Text Representation Learning. 938-945 - Seungheon Doh, Keunwoo Choi, Daeyong Kwon, Taesoo Kim, Juhan Nam:
Music Discovery Dialogue Generation Using Human Intent Analysis and Large Language Models. 946-953 - Yuexuan Kong, Vincent Lostanlen, Gabriel Meseguer-Brocal, Stella Wong, Mathieu Lagrange, Romain Hennequin:
STONE: Self-Supervised Tonality Estimator. 954-961 - Francesco Foscarin, Jan Schlüter, Gerhard Widmer:
Beat This! Accurate Beat Tracking Without DBN Postprocessing. 962-969 - Yujia Yan, Zhiyao Duan:
Scoring Time Intervals Using Non-Hierarchical Transformer for Automatic Piano Transcription. 973-980 - Julian Lenz, Anirudh Mani:
PerTok: Expressive Encoding and Modeling of Symbolic Musical Ideas and Variations. 981-988 - Nathaniel Condit-Schultz:
Looking for Tactus in All the Wrong Places: Statistical Inference of Metric Alignment in Rap Flow. 989-995 - Kun Fang, Ziyu Wang, Gus Xia, Ichiro Fujinaga:
Exploring GPT's Ability as a Judge in Music Understanding. 996-1003 - Roser Batlle-Roca, Wei-Hsiang Liao, Xavier Serra, Yuki Mitsufuji, Emilia Gómez:
Towards Assessing Data Replication in Music Generation With Music Similarity Metrics on Raw Audio. 1004-1011 - Shahan Nercessian, Johannes Imort, Ninon Devis, Frederik Blang:
Generating Sample-Based Musical Instruments Using Neural Audio Codec Language Models. 1012-1019 - Nithya Nadig Shikarpur, Krishna Maneesha Dendukuri, Yusong Wu, Antoine Caillon, Cheng-Zhi Anna Huang:
Hierarchical Generative Modeling of Melodic Vocal Contours in Hindustani Classical Music. 1020-1028 - Haonan Chen, Jordan B. L. Smith, Janne Spijkervet, Ju-Chiang Wang, Pei Zou, Bochen Li, Qiuqiang Kong, Xingjian Du:
SymPAC: Scalable Symbolic Music Generation With Prompts and Constraints. 1029-1036 - Giovanni Bindi, Philippe Esling:
Unsupervised Composable Representations for Audio. 1037-1045 - Pavani Chowdary, Bhavyajeet Singh, Rajat Agarwal, Vinoo Alluri:
Lyrically Speaking: Exploring the Link Between Lyrical Emotions, Themes and Depression Risk. 1046-1050 - Karn N. Watcharasupat, Alexander Lerch:
A Stem-Agnostic Single-Decoder System for Music Source Separation Beyond Four Stems. 1051-1059 - Mickaël Zehren, Marco Alunno, Paolo Bientinesi:
In-Depth Performance Analysis of the ADTOF-Based Algorithm for Automatic Drum Transcription. 1060-1067 - Patricia Hu, Lukás Samuel Marták, Carlos Eduardo Cancino Chacón, Gerhard Widmer:
Towards Musically Informed Evaluation of Piano Transcription Models. 1068-1075 - Tomoyasu Nakano, Masataka Goto:
Using Item Response Theory to Aggregate Music Annotation Results of Multiple Annotators. 1076-1084 - Irmak Bukey, Michael Feffer, Chris Donahue:
Just Label the Repeats for In-the-Wild Audio-to-Score Alignment. 1085-1092 - Lucas Simões Maia, Richa Namballa, Martín Rocamora, Magdalena Fuentes, Carlos Guedes:
Investigating Time-Line-Based Music Traditions With Field Recordings: A Case Study of Candomblé Bell Patterns. 1093-1100
