Artificial Intelligence for Software Engineering

Articles in the Artificial Intelligence for Software Engineering category published in the Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024).

Recent submissions

  • Article
    Asignación dinámica de tareas en entornos de Inteligencia Híbrida
    Mestre, Antoni; Albert, Manoli; Gil, Miriam; Pelechano, Vicente. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Hybrid Intelligence has emerged as a revolutionary paradigm that proposes a synergistic integration of human capabilities with the abilities of intelligent systems and robotics. In this context, the efficient allocation of tasks between humans and automated systems plays a crucial role in the success of these environments. This article presents a novel proposal for task allocation in hybrid intelligence environments, highlighting its dynamic nature and its human-centered focus. The proposal aims to maximize time and resource efficiency while incorporating measures to address human fatigue and improve the well-being of participants.
  • Article
    ¿Sueñan los loros electrónicos (LLMs) con patrones de diseño?
    Parejo, José Antonio. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Although some describe them as electronic parrots because of their operating mechanism, based on probabilistically choosing the next word to generate, the ability of large language models to generate code has proven to be very powerful, especially in the largest and most recent versions of these models. In this article we pose a research question that, to the best of our knowledge, has not yet been addressed: are large language models capable of identifying interesting application scenarios for object-oriented design patterns and of applying the patterns correctly? We present a small experiment using ChatGPT4 and the Java programming language, with several application scenarios for some of the classic object-oriented design patterns.
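    A minimal sketch of how such an experiment can be posed programmatically through the OpenAI API follows; the model name, prompt wording and scenario are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch: asking an LLM to pick and apply a GoF design pattern.
# The scenario and prompt are hypothetical, not the paper's material.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

scenario = """A Java payment service must support credit card, PayPal and
crypto payments, selected at runtime. Which classic GoF design pattern fits,
and how would you apply it? Answer with refactored Java code."""

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": scenario}],
)
print(response.choices[0].message.content)
```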
  • Article
    A methodology for explaining Learning-to-rank models for test case prioritization
    Berrios, Mario; Ramírez, Aurora; Feldt, Robert; Romero, José Raúl. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    In machine learning-based test case prioritization (TCP), there is a need for explainable methods that can shed light on the internal mechanisms of the models and clarify why certain test cases are deemed more likely to fail. To address this need, we propose an experimental methodology designed to generate and analyze global explanations for models in TCP. This methodology can help testers and researchers to understand how the impact of different features shifts as the software system evolves.
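    As an illustration of what a global explanation can look like in this setting, the sketch below trains a toy failure-likelihood model and ranks features by permutation importance; the feature names and regressor are assumptions, and the paper's methodology additionally tracks how importances shift across software versions.

```python
# Hedged sketch: global explanation of an ML-based TCP model via
# permutation importance. Features and data are synthetic stand-ins.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import permutation_importance

rng = np.random.default_rng(0)
features = ["duration", "failures_last_10_runs", "churn_of_covered_code"]
X = rng.random((200, 3))
y = 0.6 * X[:, 1] + 0.3 * X[:, 2] + 0.1 * rng.random(200)  # failure likelihood

model = RandomForestRegressor(random_state=0).fit(X, y)
result = permutation_importance(model, X, y, n_repeats=10, random_state=0)
for name, mean in sorted(zip(features, result.importances_mean),
                         key=lambda p: -p[1]):
    print(f"{name}: {mean:.3f}")
```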
  • Article
    Towards an Extensible Architecture for LLM-based Programming Assistants in IDEs
    Contreras Romero, Albert; Guerra, Esther; de Lara, Juan. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Large Language Models (LLMs) are the backbone of chatbots like ChatGPT, and are used to assist in all sorts of domains. Following this trend, we are witnessing proposals of LLM-based assistants for coding tasks. However, current IDEs lack mechanisms tailored to facilitate the integration of such assistants, from how to interact with them to how to apply their suggestions without leaving the environment. To fill this gap, this short paper presents an extensible architecture for the definition of assistance tasks (e.g., method renaming) based on LLMs, and their binding to IDE commands and natural language prompts. We report on an ongoing effort to build a Java assistant within Eclipse based on this architecture, and illustrate its use.
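    The sketch below illustrates, in Python rather than the paper's Java/Eclipse setting, the kind of extension point such an architecture suggests: an assistance task bound to a command name and a prompt template. All identifiers are hypothetical, not the authors' API.

```python
# Hedged sketch of an extensible assistant-task registry.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class AssistanceTask:
    command: str                         # IDE command the task is bound to
    prompt_template: str                 # prompt sent to the LLM
    apply: Callable[[str, str], str]     # (source, llm_answer) -> new source

REGISTRY: Dict[str, AssistanceTask] = {}

def register(task: AssistanceTask) -> None:
    REGISTRY[task.command] = task

register(AssistanceTask(
    command="assistant.renameMethod",
    prompt_template="Suggest a better name for this Java method:\n{code}",
    apply=lambda source, answer: source,  # a real task would rewrite the AST
))
print(list(REGISTRY))
```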
  • Summary
    Requirements classification using fastText and BETO in Spanish documents
    Limaylla-Lunarejo, Maria Isabel; Condori-Fernández, Nelly; Rodríguez Luaces, Miguel. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Context and motivation: Machine Learning (ML) algorithms and Natural Language Processing (NLP) techniques have effectively supported the automatic classification of software requirements. The emergence of pre-trained language models, like BERT, provides promising results in several downstream NLP tasks, such as text classification. Question/problem: Most ML/DL approaches to requirements classification lack analysis of requirements written in Spanish. Moreover, there has not been much research on pre-trained language models, like fastText and BETO (BERT for the Spanish language), nor on validating the generalization of the models. Principal ideas/results: We aim to investigate the classification performance and generalization of fastText and BETO classifiers in comparison with other ML/DL algorithms. The findings show that shallow ML algorithms outperformed fastText and BETO when training and testing on the same dataset, but BETO outperformed the other classifiers in prediction performance on a dataset of different origin. Contribution: Our evaluation provides a quantitative analysis of the classification performance of fastText and BETO in comparison with ML/DL algorithms, the external validity of the trained models on another Spanish dataset, and a translation of the PROMISE NFR dataset into Spanish.
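    For readers unfamiliar with fastText's supervised mode, the sketch below shows the mechanics of training such a requirements classifier; the file name, labels and example sentences are illustrative assumptions, not the paper's dataset.

```python
# Hedged sketch: requirements classification with fastText. Assumes a file
# "reqs.train" with one "__label__<class> <text>" line per requirement, e.g.:
# __label__functional El sistema debe permitir registrar usuarios
# __label__nonfunctional La respuesta no debe superar los 2 segundos
import fasttext

model = fasttext.train_supervised(input="reqs.train", epoch=25, wordNgrams=2)
labels, probs = model.predict("El sistema cifrará los datos personales")
print(labels[0], probs[0])
```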
  • Summary
    Co-evolving Scenarios and Simulated Players to Locate Bugs that arise from the Interaction of Software Models of Video Games
    Roca, Isis; Pastor, Óscar; Cetina, Carlos; Arcega, Lorena. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Context: Game Software Engineering (GSE) is a field that focuses on developing and maintaining the software part of video games. A key component of video game development is the utilization of game engines, with many engines using software models to capture various aspects of the game. Objective: A challenge that GSE faces is the localization of bugs, mainly when working with large and intricate software models. Additionally, the interaction between software models (e.g., bosses, enemies, or environmental elements) during gameplay is often a significant source of bugs. In response to this challenge, we propose a co-evolution approach for bug localization in the software models of video games, called CoEBA. Methods: The CoEBA approach leverages Search-Based Software Engineering (SBSE) techniques to locate bugs in software models while considering their interactions. We conducted an evaluation in which we applied our approach to a commercial video game, Kromaia. We compared our approach with a state-of-the-art baseline that relied on the bug localization approach used by Kromaia’s developers, and with a random search used as a sanity check. Results: Our co-evolution approach outperforms the baseline approach in precision, recall, and F-measure. In addition, to provide evidence of the significance of our results, we conducted a statistical analysis, which shows significant differences in precision and recall values. Conclusion: The proposed approach, CoEBA, which considers the interaction between software models, can identify and locate bugs that other bug localization approaches may have overlooked.
  • Article
    Toward Trustworthy AI-Enabled Internet Search
    Romero-Arjona, Miguel; Segura, Sergio; Arrieta, Aitor. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Artificial Intelligence (AI), and particularly Large Language Models (LLMs), are poised to reshape how we search for information on the Internet. AI regulation initiatives (e.g., the EU AI Act) are rapidly emerging to ensure that AI products are trustworthy and safe. However, ensuring the regulatory compliance of an AI system currently relies on manual checklists, making the process tedious, time-consuming, and hardly scalable. In this work-in-progress paper, we outline our vision for a tool ecosystem aimed at automatically testing AI-driven search engines in accordance with EU trustworthiness compliance requirements. Specifically, we present some of the key quality characteristics to target, a brief summary of the state of the art in LLM testing, and the key features of the envisioned tool ecosystem.
  • Article
    Integración de Feedback Humano para Guiar la Adaptación Inteligente de Interfaces de Usuario
    Gaspar Figueiredo, Daniel; Fernandez-Diego, Marta; Abrahão, Silvia; Insfran, Emilio; Nuredini, Ruben. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    User interface (UI) adaptation seeks to improve user experience (UX) across a variety of contexts of use. However, users' changing preferences and needs pose challenges throughout the adaptation process. This highlights the need for machine learning techniques that learn from user interaction and facilitate UI adaptation. In this context, we adopt an innovative approach that integrates Human Feedback (HF) into the Reinforcement Learning (RL) process to achieve more precise and personalized adaptation. To this end, two tools have been developed: a training environment for RL agents, designed to create such agents with different RL algorithms and for different adaptive UIs, and a feedback-capture platform that allows users to express their preferences intuitively.
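    A bandit-style simplification of the feedback-driven learning loop is sketched below; the actions, reward source and update rule are toy assumptions, not the tools the paper presents.

```python
# Hedged sketch: learning UI adaptations from (simulated) human feedback.
import random

actions = ["enlarge_font", "reorder_menu", "switch_theme"]
q = {a: 0.0 for a in actions}          # estimated value of each adaptation
alpha, epsilon = 0.1, 0.2              # learning rate, exploration rate

def human_feedback(action):
    # Stand-in for the feedback-capture platform: pretend users in this
    # context prefer font enlargement.
    return 1.0 if action == "enlarge_font" else random.uniform(-0.5, 0.5)

for _ in range(200):
    a = random.choice(actions) if random.random() < epsilon else max(q, key=q.get)
    q[a] += alpha * (human_feedback(a) - q[a])   # incremental value update
print(q)  # the preferred adaptation should end up with the highest value
```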
  • Article
    Automatización del pipeline de aprendizaje automático para la priorización de pruebas
    Fuentes-Almoguera, Ángel; Ramírez, Aurora; García-Martínez, Carlos; Romero, José Raúl. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Test case prioritization (TCP) consists of ordering and selecting the most relevant test cases to verify that the current functionality of a software system is not affected by code changes. Recently, machine learning (ML) has been used in TCP to predict the failure probability of each test case. However, software engineers may find it difficult to identify and implement the most appropriate ad hoc predictive models for each TCP problem. Moreover, as new versions are produced and validated against the resulting test suite, model performance is likely to degrade. In this study, we address these challenges by applying automated pipeline composition and hyperparameter optimization, both considered tasks within automated ML (AutoML). To this end, our proposal uses grammar-guided genetic programming as the evolutionary method implementing the AutoML algorithm. Our experimental results show that our approach can adapt to the particularities of the system under test (SUT), selecting the most suitable pipeline and hyperparameters for each version. More importantly, our approach removes the need for testers to have extensive ML expertise, allowing them to generate pipelines tailored to successive changes in SUT versions. The paper also discusses, from a qualitative perspective, the suitability of the proposal as applied to the TCP problem.
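    To make the grammar-guided idea concrete, the sketch below derives candidate scikit-learn pipelines from a two-rule grammar; the grammar, components and hyperparameter choices are heavily simplified illustrations, not the authors' implementation, and the evolutionary search itself is omitted.

```python
# Hedged sketch: sampling ML pipelines from a tiny grammar, the building
# block that grammar-guided genetic programming evolves.
import random
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, MinMaxScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression

GRAMMAR = {
    "scaler": [StandardScaler, MinMaxScaler],
    "classifier": [
        lambda: RandomForestClassifier(n_estimators=random.choice([50, 100])),
        lambda: LogisticRegression(C=random.choice([0.1, 1.0, 10.0]),
                                   max_iter=500),
    ],
}

def sample_pipeline() -> Pipeline:
    """Derive one pipeline by picking one production per non-terminal."""
    return Pipeline([
        ("scale", random.choice(GRAMMAR["scaler"])()),
        ("clf", random.choice(GRAMMAR["classifier"])()),
    ])

print(sample_pipeline())  # one candidate for the evolutionary search to rate
```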
  • Article
    Una propuesta industrial para dar soporte a la elicitación de requisitos mediante análisis inteligente de datos: el caso de la gestión de subvenciones
    Escobar Montes, Manuel; Gimenez Medina, Manuel; González Enríquez, José; Escalona, M.J.. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Like practically every area of society, Software Engineering can benefit from the application of artificial intelligence techniques. This article presents preliminary work in which we aim to analyze how requirements engineering can be enriched with knowledge extracted from data banks. Specifically, the work focuses on an industrial context, that of the Andalusian public administrations, and is inspired by a real problem: the more effective management of public grants. The paper presents the problem that motivates the work and the first steps we are taking toward a solution applicable to Andalusian grant management programs.
  • Article
    Razonamiento semántico adaptado a los recursos
    Bobed, Carlos; Bobillo, Fernando; Mena, Eduardo. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    This short paper motivates the need to improve current semantic reasoning methods so that they take the available resources into account (for example, the hardware, privacy preferences, or the time available to the user). To this end, we present several use cases that justify this need and propose a series of tasks required to address the problem.
  • Summary
    A Deep Learning Model for Natural Language Querying in Cyber-Physical Systems
    Llopis, Juan Alberto; Fernández-García, Antonio Jesús; Criado, Javier; Iribarne, Luis; Ayala, Rosa; Z. Wang, James. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    As a result of technological advancements, the number of IoT devices and services is rapidly increasing. Due to the increasing complexity of IoT devices, the various ways they can operate and communicate, and the complex tasks they can perform, finding a specific device can be challenging. To help find devices in a timely and efficient manner, in environments where the user may not know what devices are available or how to access them, we propose a recommender system that uses deep learning to match user queries, in the form of natural language sentences, with Web of Things (WoT) devices or services. The deep learning model is based on the Transformer, a recent attention-based architecture that achieves superior results on natural language problems. Our study shows that the Transformer can serve as a recommendation tool for finding relevant WoT devices in Cyber-Physical Systems (CPSs). With hashing as an encoding technique, the proposed model returns the relevant devices with a high degree of confidence. After experimentation, the proposed model is validated by comparing it with our current search system, and the results are discussed. The work can potentially impact real-world application scenarios in which many different devices are involved.
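    The sketch below shows the core matching step with an off-the-shelf sentence encoder; the checkpoint and device descriptions are illustrative assumptions, whereas the paper trains its own Transformer and uses hashing as the encoding technique.

```python
# Hedged sketch: matching a natural-language query to WoT device
# descriptions by embedding similarity.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
devices = [
    "smart thermostat that reports room temperature",
    "security camera with motion detection",
    "smart plug that measures power consumption",
]
query = "which device tells me how warm the living room is"

scores = util.cos_sim(model.encode(query), model.encode(devices))[0]
best = scores.argmax().item()
print(devices[best], float(scores[best]))
```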
  • Article
    Sistema de recomendación distribuido para servicios de descubrimiento de la WoT
    Llopis, Juan Alberto; Iribarne, Luis; Criado, Javier; Fernández-García, Antonio Jesús. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    When searching for cyber-physical systems, one of the problems is adapting search or discovery systems to the dynamism of devices. Within the same time period, several devices may move between locations, causing the same query to return different results. In this work we propose a distributed, natural language-based recommender system for discovering cyber-physical systems within a federation of discovery services. The goal of the system is to adapt device discovery to this dynamism by using Artificial Intelligence models to retrain the recommender whenever the system detects that search metrics have degraded. In addition, the workload of the recommender is reduced by associating lightweight models with each discovery service instead of using a centralized recommender for all discovery services. Finally, each recommendation model is supported by a central server, which monitors its performance and retrains it whenever the quality of the returned recommendations drops below an established threshold.
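    The retraining trigger described above can be pictured as the loop sketched below, where a central server polls each discovery service's local model; all names and the random metric are hypothetical stand-ins, not the paper's implementation.

```python
# Hedged sketch: central monitoring of per-service recommenders, with
# retraining when search quality drops below a threshold.
import random

THRESHOLD = 0.80  # minimum acceptable search quality (e.g., precision@k)

class DiscoveryService:
    """One federated discovery service with its own lightweight model."""
    def __init__(self, name: str):
        self.name = name
    def evaluate_recent_queries(self) -> float:
        return random.random()   # stand-in for a metric over recent queries
    def retrain_local_model(self) -> None:
        print(f"retraining the recommender of {self.name}")

for service in [DiscoveryService("building-A"), DiscoveryService("building-B")]:
    if service.evaluate_recent_queries() < THRESHOLD:
        service.retrain_local_model()
```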
  • Summary
    T-FREX: A Transformer-based Feature Extraction Method from Mobile App Reviews
    Motger, Quim; Miaschi, Alessio; Dell'Orletta, Felice; Franch, Xavier; Marco, Jordi. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Mobile app reviews are a large-scale data source for software-related knowledge generation activities, including software maintenance, evolution and feedback analysis. Effective extraction of features (i.e., functionalities or characteristics) from these reviews is key to support analysis of the acceptance of these features, identification of relevant new feature requests and prioritization of feature development, among others. Traditional methods rely on syntactic pattern-based approaches that are typically context-agnostic, evaluated on a closed set of apps, difficult to replicate and limited to a reduced set and domain of apps. Meanwhile, the pervasiveness of Large Language Models (LLMs) based on the Transformer architecture in software engineering tasks lays the groundwork for empirical evaluation of the performance of these models in supporting feature extraction. In this study, we present T-FREX, a Transformer-based, fully automatic approach for mobile app review feature extraction. First, we collect a set of ground truth features from users of a real crowdsourced software recommendation platform and transfer them automatically into a dataset of app reviews. Then, we use this newly created dataset to fine-tune multiple LLMs on a named entity recognition task under different data configurations. We assess the performance of T-FREX with respect to this ground truth, and we complement our analysis by comparing T-FREX with a baseline method from the field. Finally, we assess the quality of new features predicted by T-FREX through an external human evaluation. Results show that T-FREX on average outperforms the traditional syntax-based method, especially when discovering new features from a domain on which the model has been fine-tuned.
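    Framing feature extraction as token classification looks roughly like the sketch below; the base checkpoint and the three-label scheme (O / B-feature / I-feature) are assumptions for illustration, and the model here is untrained, whereas T-FREX fine-tunes several encoder LLMs on its ground-truth dataset.

```python
# Hedged sketch: app-review feature extraction as token classification.
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=3)   # O, B-feature, I-feature

inputs = tokenizer("The offline maps drain the battery", return_tensors="pt")
logits = model(**inputs).logits          # shape (1, seq_len, 3); head untrained
print(logits.argmax(-1))                 # per-token label ids (meaningful only
                                         # after fine-tuning)
```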
  • Summary
    The software heritage license dataset (2022 edition)
    Gonzalez-Barahona, Jesus M.; Montes León, Sergio Raúl; Robles, Gregorio; Zacchiroli, Stefano. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Context: When software is released publicly, it is common to include with it either the full text of the license or licenses under which it is published, or a detailed reference to them. Public licenses, including FOSS (free, open source software) licenses, are therefore usually publicly available in source code repositories. Objective: To compile a dataset containing as many documents as possible that contain the text of software licenses, or references to license terms, and, once compiled, to characterize the dataset so that it can be used for further research or for practical purposes related to license analysis. Method: Retrieve from Software Heritage—the largest publicly available archive of FOSS source code—all versions of all files whose names are commonly used to convey licensing terms, and characterize the retrieved documents in various ways, using automated and manual analyses. Results: The dataset consists of 6.9 million unique license files. Additional metadata about shipped license files is also provided, making the dataset ready to use in various contexts, including: file length measures, MIME type, SPDX license (detected using ScanCode), and oldest appearance. The results of a manual analysis of 8102 documents are also included, providing a ground truth for further analysis. The dataset is released as open data as an archive file containing all deduplicated license files, plus several portable CSV files with metadata, referencing files via cryptographic checksums. Conclusions: Thanks to the extensive coverage of Software Heritage, the dataset presented in this paper covers a very large fraction of all software licenses for public code. We have assembled a large body of software licenses, characterized it quantitatively and qualitatively, and validated that it is mostly composed of licensing information and includes almost all known license texts. The dataset can be used to conduct empirical studies on open source licensing, to train automated license classifiers, for natural language processing (NLP) analyses of legal texts, and for historical and phylogenetic studies of FOSS licensing. It can also be used in practice to improve tools that detect licenses in source code.
  • Summary
    On the Generalizability of Deep Learning-based Code Completion Across Programming Language Versions
    Ciniselli, Matteo; Martin-Lopez, Alberto; Bavota, Gabriele. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Code completion is a key feature of Integrated Development Environments (IDEs), aimed at predicting the next tokens a developer is likely to write, helping them write code faster and with less effort. Modern code completion approaches are often powered by deep learning (DL) models. However, the swift evolution of programming languages poses a critical challenge to the performance of DL-based code completion models: Can these models generalize across different language versions? This paper delves into such a question. In particular, we assess the capabilities of a state-of-the-art model, CodeT5, to generalize across nine different Java versions, ranging from Java 2 to Java 17, while being exclusively trained on Java 8 code. Our evaluation spans three completion scenarios, namely, predicting tokens, constructs (e.g., the condition of an if statement) and entire code blocks. The results of our study reveal a noticeable disparity among language versions, with the worst performance obtained on Java 2 and 17---the versions furthest from Java 8. We investigate possible causes for the performance degradation and show that the adoption of a limited version-specific fine-tuning can partially alleviate the problem. Our work raises awareness of the importance of continuous model refinement, and it can inform the design of alternatives that make code completion models more robust to language evolution.
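    The completion scenarios studied in the paper can be reproduced in miniature with the public CodeT5 checkpoint, as sketched below; the Java snippet and generation settings are illustrative.

```python
# Hedged sketch: masked-span code completion with CodeT5.
from transformers import RobertaTokenizer, T5ForConditionalGeneration

tokenizer = RobertaTokenizer.from_pretrained("Salesforce/codet5-base")
model = T5ForConditionalGeneration.from_pretrained("Salesforce/codet5-base")

# Ask the model to fill in the condition of an if statement (<extra_id_0>).
code = "public int max(int a, int b) { if (<extra_id_0>) return a; return b; }"
ids = tokenizer(code, return_tensors="pt").input_ids
out = model.generate(ids, max_length=16)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```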
  • Summary
    FASTDIAGP: An Algorithm for Parallelized Direct Diagnosis
    Le, Viet-Man; Vidal-Silva, Cristian; Felfernig, Alexander; Benavides, David; Galindo, José A.; Trang Tran, Thi Ngoc. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Constraint-based applications attempt to identify a solution that meets all defined user requirements. If the requirements are inconsistent with the underlying constraint set, algorithms that compute diagnoses for inconsistent constraints should be implemented to help users resolve the “no solution could be found” dilemma. FastDiag is a typical direct diagnosis algorithm that supports diagnosis calculation without pre-determining conflicts. However, this approach faces runtime performance issues, especially when analyzing complex and large-scale knowledge bases. In this paper, we propose a novel algorithm, called FastDiagP, which is based on the idea of speculative programming. This algorithm extends FastDiag by integrating a parallelization mechanism that anticipates and pre-calculates consistency checks requested by FastDiag. This mechanism helps to provide consistency checks with fast answers and boosts the algorithm’s runtime performance. The performance improvements of our proposed algorithm have been shown through empirical results using the Linux-2.6.33.3 configuration knowledge base.
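    For reference, the sketch below implements the sequential FastDiag whose consistency checks FastDiagP speculatively pre-computes in parallel; the constraints and the brute-force consistency check over a small finite domain are invented for illustration.

```python
# Hedged sketch: sequential FastDiag (Felfernig et al.) on toy constraints.
from itertools import product

DOMAIN = range(1, 5)   # x, y range over {1, 2, 3, 4}

def consistent(constraints):
    """Is there an (x, y) assignment satisfying all constraints?"""
    return any(all(c(x, y) for c in constraints)
               for x, y in product(DOMAIN, DOMAIN))

def fastdiag(C, AC):
    """Minimal D within C such that AC - D is consistent ([] if none)."""
    if consistent(AC):
        return []                                  # nothing to repair
    if not C or not consistent([c for c in AC if c not in C]):
        return []                                  # no diagnosis within C
    return fd([], C, AC)

def fd(D, C, AC):
    if D and consistent(AC):
        return []
    if len(C) == 1:
        return list(C)
    half = len(C) // 2
    C1, C2 = C[:half], C[half:]
    D1 = fd(C1, C2, [c for c in AC if c not in C1])
    D2 = fd(D1, C1, [c for c in AC if c not in D1])
    return D1 + D2

def c1(x, y): return x > 3
def c2(x, y): return x < 2   # conflicts with c1: no x satisfies both
def c3(x, y): return y == x

print([c.__name__ for c in fastdiag([c1, c2, c3], [c1, c2, c3])])  # ['c1']
```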
  • Article
    Naming Methods with Large Language Models after Refactoring Operations
    Recio Abad, Juan Carlos; Saborido, Rubén; Chicano, Francisco. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    In Object-oriented Programming, the Cognitive Complexity (CC) of software is a metric of the difficulty associated with understanding and maintaining the source code. It is usually measured at the method level, taking into account the number of control flow statements and their nesting level. One way to reduce the CC associated with a method is to extract code into new methods without altering any existing functionality. However, this involves deciding on representative names for the new methods containing the extracted code. This work studies the capability of large language models to assign names to newly extracted methods during the evolution of a code base. Such evolution comprises continuous code extraction operations, allowing us to study how the semantics of the new method names evolve. We use the OpenAI Chat API with the text-davinci-003 model to perform the coding tasks. We found the precision of the model to be highly acceptable, in many cases achieving a level similar to that of a human, which paves the way for new tooling. However, there are also a few cases in which it fails to provide appropriate names or does not even provide a name within the indicated standards.
  • Article
    Adaptación proactiva en la monitorización de incendios con UAV: resultados preliminares
    Vílchez, Enrique; Troya, Javier; Cámara, Javier. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Smart Cyber-Physical Systems (sCPS) operate in dynamic environments under uncertainty, where anticipating adverse situations and decentralizing decision-making are crucial for reasons of scalability, resilience, and efficiency. In this article, we describe how Predictive Coordinate Descent (PCD) can be used to endow these systems with proactive, decentralized self-adaptation capabilities. Specifically, we compare the effectiveness of PCD with that of a non-anticipatory controller based on a Deep Q-Network (DQN), in scenarios involving forest fire monitoring with unmanned aerial vehicles.
  • Summary
    Summary of: "Automated Misconfiguration Repair of Configurable Cyber-Physical Systems with Search: an Industrial Case Study on Elevator Dispatching Algorithms"
    Valle, Pablo; Arrieta, Aitor; Arratibel, Maite. Actas de las XXVIII Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2024), 2024-06-17.
    Real-world Cyber-Physical Systems (CPSs) are usually configurable. Through parameters, it is possible to configure, select or unselect different system functionalities. While this provides high flexibility, it also becomes a source of failures due to misconfigurations. The large number of parameters these systems have, and the long test execution times caused by simulation-based testing, make the manual repair process a cumbersome activity. Consequently, automated repair methods are paramount in this context. In this paper, we propose an approach to automatically repair CPSs' misconfigurations. Our approach is evaluated with an industrial CPS case study from the elevation domain. Experiments with a real building and data obtained from operation suggest that our approach outperforms a baseline algorithm as well as the state of the practice (i.e., manual repair carried out by domain experts).
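    Reduced to its essence, such a repair loop resembles the hill-climbing sketch below, where a toy fitness function stands in for the simulation-based evaluation of candidate configurations; the parameter names and values are invented for illustration, not taken from the case study.

```python
# Hedged sketch: search-based repair of a misconfiguration by hill climbing.
import random

TARGET = {"door_dwell_s": 3, "max_cars_per_call": 2}   # hidden "good" config

def fitness(cfg):
    """Stand-in for simulation-based evaluation; lower is better."""
    return sum(abs(cfg[k] - TARGET[k]) for k in cfg)

cfg = {"door_dwell_s": 9, "max_cars_per_call": 6}      # the misconfiguration
while fitness(cfg) > 0:
    param = random.choice(list(cfg))                   # mutate one parameter
    candidate = dict(cfg, **{param: cfg[param] + random.choice([-1, 1])})
    if fitness(candidate) <= fitness(cfg):             # keep non-worsening moves
        cfg = candidate
print(cfg)   # converges to the repaired configuration
```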