LLMs for Commit Messages: A Survey and an Agent-Based Evaluation Protocol on CommitBench

Research output: Contribution to journal › Review article › peer-review

Abstract

Commit messages are vital for traceability, maintenance, and onboarding in modern software projects, yet their quality is frequently inconsistent. Recent large language models (LLMs) can transform code diffs into natural language summaries, offering a path to more consistent and informative commit messages. This paper makes two contributions: (i) it provides a systematic survey of automated commit message generation with LLMs, critically comparing prompt-only, fine-tuned, and retrieval-augmented approaches; and (ii) it specifies a transparent, agent-based evaluation blueprint centered on CommitBench. Unlike prior reviews, we include a detailed dataset audit, an analysis of preprocessing impacts, a review of evaluation metrics, and an error taxonomy. The protocol defines dataset usage and splits, prompting and context settings, scoring and selection rules, and reporting guidelines (results by project, language, and commit type), along with an error taxonomy to guide qualitative analysis. This work emphasizes methodology and design; it does not present new empirical benchmarking results. The blueprint is intended to support reproducibility and comparability in future studies.

Original language: English
Article number: 427
Journal: Computers
Volume: 14
Issue number: 10
State: Published - Oct 2025

Bibliographical note

Publisher Copyright:
© 2025 by the authors.

Keywords

  • CommitBench dataset
  • automated documentation
  • commit message generation
  • large language models
  • retrieval-augmented generation
  • software engineering automation
  • transformer architecture

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
  • Human-Computer Interaction
  • Computer Networks and Communications
