A model-driven approach for the definition of reproducible and replicable data analysis projects





Published in

Actas de las XXVI Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2022)


CC BY 4.0


It is increasingly common to analyze the data collected by Information Systems and to draw conclusions that drive decisions in different research fields. In most cases these conclusions cannot be properly backed up, which has given rise to a reproducibility crisis in Data Science, the discipline that turns such data into knowledge, and in the research fields that apply it. In this paper we envision a Model-Driven framework to foster reproducible and replicable Data Science projects. The framework proposes the definition of systematic pipelines that can be (semi)automatically executed on concrete implementation platforms. Proprietary or third-party tools are also considered, so that flexibility is ensured without hindering reproducibility and replicability.


About González, Francisco Javier Melchor

Keywords

Data Science, Model-Driven Engineering, Process, Replicability, Reproducibility