Abstract
API-driven chatbot systems are increasingly integral to software engineering applications, yet their effectiveness hinges on accurately generating and executing API calls. This is particularly challenging in scenarios requiring multi-step interactions with complex parameterization and nested API dependencies. Addressing these challenges, this work contributes to the evaluation and assessment of AI-based software development through three key advancements: (1) the introduction of a novel dataset specifically designed for benchmarking API function selection, parameter generation, and nested API execution; (2) an empirical evaluation of state-of-the-art language models, analyzing their performance across varying task complexities in API function generation and parameter accuracy; and (3) a hybrid approach to API routing, combining general-purpose large language models for API selection with fine-tuned models and prompt engineering for parameter generation. These innovations significantly improve API execution in chatbot systems, offering practical methodologies for enhancing software design, testing, and operational workflows in real-world software engineering contexts.
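The hybrid routing approach the abstract describes, a general-purpose model for API selection combined with a specialized step for parameter generation, can be sketched roughly as below. Everything here is a hypothetical stand-in: the API names, the schema, and both "model" functions are illustrative heuristics, not the authors' actual models or dataset.

```python
# Minimal sketch of two-stage hybrid API routing, assuming a simple schema.
# Stage 1 (function selection) stands in for a general-purpose LLM;
# stage 2 (parameter generation) stands in for a fine-tuned / prompt-engineered model.
from dataclasses import dataclass


@dataclass
class APICall:
    function: str
    params: dict


# Hypothetical registry of available API functions and their expected parameters.
API_SCHEMA = {
    "get_weather": ["city"],
    "create_ticket": ["title", "priority"],
}


def select_function(query: str) -> str:
    """Stand-in for the general-purpose LLM that picks which API to call."""
    if "weather" in query.lower():
        return "get_weather"
    return "create_ticket"


def generate_params(function: str, query: str) -> dict:
    """Stand-in for the specialized parameter-generation step."""
    if function == "get_weather":
        # Naive extraction: treat the last capitalized word as the city.
        words = [w.strip("?.,") for w in query.split()]
        city = next((w for w in reversed(words) if w[:1].isupper()), "Unknown")
        return {"city": city}
    return {"title": query, "priority": "normal"}


def route(query: str) -> APICall:
    fn = select_function(query)          # stage 1: API function selection
    params = generate_params(fn, query)  # stage 2: parameter generation
    # Validate generated parameters against the schema before execution.
    assert set(params) == set(API_SCHEMA[fn]), "parameters must match schema"
    return APICall(fn, params)
```

The separation lets each stage be evaluated and improved independently, which mirrors the paper's split between function-selection accuracy and parameter-generation accuracy.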
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025 |
| Editors | Muhammad Ali Babar, Ayse Tosun, Stefan Wagner, Viktoria Stray |
| Publisher | Association for Computing Machinery, Inc |
| Pages | 114-125 |
| Number of pages | 12 |
| ISBN (Electronic) | 9798400713859 |
| DOIs | |
| State | Published - 24 Dec 2025 |
| Event | 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025 - Istanbul, Turkey. Duration: 17 Jun 2025 → 20 Jun 2025 |
Publication series
| Name | Proceedings of the 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025 |
|---|
Conference
| Conference | 29th International Conference on Evaluation and Assessment in Software Engineering, EASE 2025 |
|---|---|
| Country/Territory | Turkey |
| City | Istanbul |
| Period | 17/06/25 → 20/06/25 |
Bibliographical note
Publisher Copyright: © 2025 Copyright held by the owner/author(s).
Keywords
- Benchmark
- Chatbot
- Function Calling
- Large Language Models
ASJC Scopus subject areas
- Software