dc.contributor.author | Augusto Alonso, Cristian | |
dc.contributor.author | Morán Barbón, Jesús | |
dc.contributor.author | Bertolino, Antonia | |
dc.contributor.author | Riva Álvarez, Claudio A. de la | |
dc.contributor.author | Tuya González, Pablo Javier | |
dc.date.accessioned | 2025-01-29T06:52:38Z | |
dc.date.available | 2025-01-29T06:52:38Z | |
dc.date.issued | 2025-01-25 | |
dc.identifier.citation | Augusto, C., Morán, J., Bertolino, A., de la Riva, C., & Tuya, J. (2025). Software system testing assisted by large language models: An exploratory study. In H. D. Menéndez, G. Bello-Orgaz, P. Barnard, J. R. Bautista, A. Farahi, S. Dash, D. Han, S. Fortz, & V. Rodriguez-Fernandez (Eds.), Testing software and systems: 36th IFIP WG 6.1 International Conference, ICTSS 2024, London, UK, October 30 – November 1, 2024, Proceedings (Lecture Notes in Computer Science, vol. 15383, Chap. 17). Springer. | spa |
dc.identifier.isbn | 978-3-031-80888-3 | |
dc.identifier.uri | https://hdl.handle.net/10651/76363 | |
dc.description.abstract | Large language models (LLMs) based on the transformer architecture have revolutionized natural language processing (NLP), demonstrating excellent capabilities in understanding and generating human-like text. In Software Engineering, LLMs have been applied to code generation, documentation, and report writing tasks to support the developer and reduce the amount of manual work. In Software Testing, one of the cornerstones of Software Engineering, LLMs have been explored for generating test code and test inputs, automating the oracle process, and generating test scenarios. However, their application to high-level testing stages such as system testing, in which deep knowledge of the business and the technological stack is needed, remains largely unexplored. This paper presents an exploratory study of how LLMs can support system test development. Given that LLM performance depends on input data quality, the study focuses on how to query general-purpose LLMs to first obtain test scenarios and then derive test cases from them. The study evaluates two popular LLMs (GPT-4o and GPT-4o-mini), using a European project demonstrator as a benchmark. The study compares two different prompt strategies and employs well-established prompt patterns, showing promising results as well as room for improvement in the application of LLMs to support system testing. | spa |
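The abstract describes querying general-purpose LLMs in two steps: first obtaining system test scenarios, then deriving test cases from them. The following is a minimal, hypothetical sketch of such a pipeline using the OpenAI Python SDK; the prompts, the placeholder system description, and the helper function `ask` are illustrative assumptions and do not reproduce the prompt strategies or prompt patterns evaluated in the paper.

```python
# Hypothetical two-step prompting sketch (not the paper's actual prompts).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

MODEL = "gpt-4o-mini"  # the study compares GPT-4o and GPT-4o-mini


def ask(system_prompt: str, user_prompt: str) -> str:
    """Send a single chat turn and return the model's text reply."""
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return response.choices[0].message.content


# Step 1: obtain high-level test scenarios from a system description.
system_description = "A web portal where registered users book lab equipment."  # placeholder
scenarios = ask(
    "You are an experienced software test engineer.",
    f"Given this system description, list the main system test scenarios:\n{system_description}",
)

# Step 2: derive concrete test cases (steps, data, expected results) per scenario.
test_cases = ask(
    "You are an experienced software test engineer.",
    "For each of these scenarios, derive detailed system test cases "
    f"with preconditions, steps, test data, and expected results:\n{scenarios}",
)

print(test_cases)
```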
dc.description.sponsorship | This work was supported in part by the project PID2022-137646OB-C32 under Grant MCIN/AEI/10.13039/501100011033/FEDER, UE, and in part by the project MASE RDS-PTR_22_24_P2.1 Cybersecurity (Italy). | spa |
dc.format.extent | p. 239-255 | spa |
dc.language.iso | eng | spa |
dc.publisher | Springer | spa |
dc.relation.ispartof | Testing Software and Systems | spa |
dc.rights | Attribution-NonCommercial-NoDerivatives 4.0 International | * |
dc.rights.uri | http://creativecommons.org/licenses/by-nc-nd/4.0/ | * |
dc.subject | Software Testing | spa |
dc.subject | E2E Testing | spa |
dc.subject | LLMs | spa |
dc.subject | Large Language Models | spa |
dc.subject | Software Engineering | spa |
dc.subject | End-to-End Testing | spa |
dc.subject | System Testing | spa |
dc.subject | Testing | spa |
dc.subject | Language Models | spa |
dc.title | Software System Testing Assisted by Large Language Models: An Exploratory Study | spa |
dc.type | conference output | spa |
dc.identifier.doi | 10.1007/978-3-031-80889-0_17 | |
dc.relation.projectID | info:eu-repo/grantAgreement/AEI/Plan Estatal de Investigación Científica y Técnica y de Innovación 2021-2023/PID2022-137646OB-C32/ES/ASEGURAMIENTO TEMPRANO DE LA CALIDAD EN ENTORNOS NOVEDOSOS DE PRODUCCION DE SOFTWARE/ | spa |
dc.relation.projectID | MASE RDS-PTR_22_24_P2.1 | spa |
dc.rights.accessRights | open access | spa |
dc.type.hasVersion | AM | spa |