The author Jose F Aldana Montes has published 6 article(s):

1 - tESA: using semantics of scientific articles to approximate semantic relatedness

Short abstract: Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts is an important part of many text and knowledge processing tasks of crucial importance in the ever-growing domain of biomedical informatics. In this paper we present tESA, an extension to the well-known Explicit Semantic Analysis (ESA) method, which leverages the semantics of a corpus of scientific documents to improve the quality of the relatedness approximation for the biomedical domain. In our extension we use two separate sets of vectors, corresponding to different sections of the articles from the underlying corpus of documents, as opposed to the original method, which uses a single vector space. Our findings suggest that extending the original ESA methodology with the use of title vectors of the documents of scientific corpora may be used to enhance the performance of distributional semantic relatedness measures.

Background: Semantic relatedness is a measure that quantifies the strength of a semantic link between two concepts. Often, it can be efficiently approximated with methods that operate on words, which represent these concepts. Approximating semantic relatedness between texts and the concepts these texts represent is an important part of many text and knowledge processing tasks of crucial importance in the ever-growing domain of biomedical informatics. The problem with most state-of-the-art methods for calculating semantic relatedness is their dependence on highly specialized, structured knowledge resources, which makes these methods poorly adaptable to many usage scenarios. On the other hand, domain knowledge in the Life Sciences has become more and more accessible, but mostly in unstructured form, as texts in large document collections, which makes its use more challenging for automated processing. In this paper we present tESA, an extension to the well-known Explicit Semantic Analysis (ESA) method.

Results: In our extension we use two separate sets of vectors, corresponding to different sections of the articles from the underlying corpus of documents, as opposed to the original method, which uses a single vector space. We present an evaluation of the Life Sciences domain-focused applicability of both tESA and domain-adapted Explicit Semantic Analysis. The methods are tested against a set of standard benchmarks established for the evaluation of biomedical semantic relatedness quality. Our experiments show that the proposed method achieves results comparable with or superior to the current state-of-the-art methods. Additionally, a comparative discussion of the results obtained with tESA and ESA is presented, together with a study of the adaptability of the methods to different corpora and of their performance with different input parameters.

Conclusions: Our findings suggest that the combined use of the semantics of different sections of the documents of scientific corpora (i.e. extending the original ESA methodology with the use of title vectors) may enhance the performance of distributional semantic relatedness measures, which can be observed in the largest reference datasets. We also present the impact of the proposed extension on the size of the distributional representations.
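
As a rough illustration of the distributional idea sketched above (terms represented as TF-IDF vectors over the documents of a corpus, texts compared by cosine similarity, with a separate title space alongside the abstract space), the following Python snippet works on a toy corpus; it is a simplified sketch, not the exact tESA formulation or evaluation setup of the paper.

```python
# Minimal sketch of an ESA-style distributional relatedness measure over two
# section spaces (titles and abstracts), using a toy corpus. Illustration only;
# not the exact tESA formulation from the paper.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    {"title": "insulin receptor signalling",
     "abstract": "insulin binds its receptor and triggers glucose uptake"},
    {"title": "glucose metabolism in diabetes",
     "abstract": "impaired glucose metabolism is a hallmark of diabetes"},
    {"title": "kinase inhibitors in cancer therapy",
     "abstract": "kinase inhibitors block signalling pathways in tumours"},
]

def term_space(texts):
    """Build an ESA-style 'concept' space: one TF-IDF document vector per term."""
    vectorizer = TfidfVectorizer()
    doc_term = vectorizer.fit_transform(texts)   # documents x terms
    return vectorizer, doc_term.T.toarray()      # terms x documents

def text_vector(text, vectorizer, term_doc):
    """Represent a text as the sum of the document vectors of its terms."""
    tokens = vectorizer.build_analyzer()(text)
    idx = [vectorizer.vocabulary_[t] for t in tokens if t in vectorizer.vocabulary_]
    return term_doc[idx].sum(axis=0) if idx else np.zeros(term_doc.shape[1])

spaces = [term_space([d["abstract"] for d in corpus]),
          term_space([d["title"] for d in corpus])]

def relatedness(a, b):
    """Average cosine relatedness across the abstract and title spaces."""
    scores = []
    for vectorizer, term_doc in spaces:
        va = text_vector(a, vectorizer, term_doc)
        vb = text_vector(b, vectorizer, term_doc)
        if va.any() and vb.any():
            scores.append(float(cosine_similarity([va], [vb])[0, 0]))
    return sum(scores) / len(scores) if scores else 0.0

print(relatedness("insulin", "glucose"))      # related biomedical terms
print(relatedness("insulin", "inhibitors"))   # less related pair
```
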
Publication details: The original paper, "tESA: a distributional measure for calculating semantic relatedness" (DOI: 10.1186/s13326-016-0109-6), authored by Maciej Rybinski and José Francisco Aldana-Montes, was published online in the Journal of Biomedical Semantics on 28 December 2016. The Journal of Biomedical Semantics currently holds (according to the latest JCR, for 2015) an impact factor of 1.62, with a five-year impact factor of 2.511. The main impact factor places the journal in the second quartile (Q2) of its JCR-SCI category, MATHEMATICAL & COMPUTATIONAL BIOLOGY.

Acknowledgments: The work presented here was partially supported by grants TIN2014-58304-R (Ministerio de Ciencia e Innovación), P11-TIC-7529 and P12-TIC-1519 (Plan Andaluz de Investigación, Desarrollo e Innovación), and EU FP7-KBBE-289126 (the EU 7th Framework Programme, BIOLEDGE).

Authors: Maciej Rybinski / Jose F Aldana Montes
Keywords: Bioinformatics - Biomedical semantics - Distributional linguistics - Explicit semantic analysis - Knowledge extraction - Semantic relatedness - Semantic similarity

2 - Enhancing semantic consistency in anti-fraud rule-based expert systems

This study proposes an ontology-guided service for the detection and classification of semantic inconsistency problems in expert systems based on decision-rule bases. It focuses on the critical case of anti-fraud rule repositories for the inspection of transactions in e-commerce environments. The main motivation is to examine and screen anti-fraud rule sets in order to avoid semantic conflicts that could lead the underlying expert system to behave incorrectly, e.g., by accepting fraudulent transactions and/or discarding harmless ones. A specific OWL ontology and a set of semantic reasoning rules (SWRL) have been developed to evaluate such anti-fraud rule bases. The three main contributions of this work are: first, the creation of a conceptual knowledge model to describe anti-fraud rules and their relationships; second, the development of semantic rules as conflict-detection methods for anti-fraud expert systems; and third, the collection of experimental data to evaluate and validate the proposed model. A real use case from the e-commerce industry (e-Tourism) is used to explain the design of the ontology and its use. The experiments show that ontological approaches can effectively discover and classify conflicts in rule-based expert systems for fraud detection. The proposal can also be applied in other domains that work with knowledge rule bases.

This work is presented as a relevant article, with reference: María del Mar Roldán-García, José García-Nieto, José F. Aldana-Montes. Enhancing semantic consistency in anti-fraud rule-based expert systems. Expert Systems with Applications, Volume 90, 2017, Pages 332-343, ISSN 0957-4174, https://doi.org/10.1016/j.eswa.2017.08.036.

The journal Expert Systems with Applications is indexed in JCR-ISI under the categories COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE (SCIE); ENGINEERING, ELECTRICAL & ELECTRONIC (SCIE); and OPERATIONS RESEARCH & MANAGEMENT SCIENCE (SCIE), with a Q1 ranking in all of them, and has a 2016 impact factor of 3.928.
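To give a flavour of the mechanism described (an OWL model of anti-fraud rules plus a SWRL rule that classifies contradictory ones), the following Python sketch uses owlready2; every class and property name is an illustrative assumption rather than the ontology published with the paper, and running the bundled Pellet reasoner requires Java.

```python
# Hedged sketch: an OWL model of anti-fraud rules plus one SWRL rule that flags
# two rules giving opposite verdicts on the same transaction as conflicting.
# Class/property names are illustrative assumptions, not the published ontology.
from owlready2 import get_ontology, Thing, ObjectProperty, Imp, sync_reasoner_pellet

onto = get_ontology("http://example.org/antifraud-demo.owl")

with onto:
    class Transaction(Thing): pass
    class AntiFraudRule(Thing): pass
    class ConflictingRule(AntiFraudRule): pass

    class accepts(ObjectProperty):
        domain = [AntiFraudRule]
        range = [Transaction]

    class rejects(ObjectProperty):
        domain = [AntiFraudRule]
        range = [Transaction]

    # SWRL: a rule that accepts a transaction rejected by another rule is conflicting.
    conflict = Imp()
    conflict.set_as_rule(
        "AntiFraudRule(?r1), AntiFraudRule(?r2), "
        "accepts(?r1, ?t), rejects(?r2, ?t) -> ConflictingRule(?r1)")

    # Toy rule base: rule_low_amount accepts the transaction that rule_foreign_card rejects.
    tx = Transaction("tx_001")
    r1 = AntiFraudRule("rule_low_amount")
    r1.accepts = [tx]
    r2 = AntiFraudRule("rule_foreign_card")
    r2.rejects = [tx]

# Pellet (bundled with owlready2, requires Java) applies the SWRL rule.
sync_reasoner_pellet(infer_property_values=True)
print(list(onto.ConflictingRule.instances()))   # expected: [antifraud-demo.rule_low_amount]
```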

Authors: Maria Del Mar Roldan-Garcia / José Manuel García-Nieto / Jose F Aldana Montes
Keywords: Semantic Model - Ontology - Reasoning - Anti-fraud Rules - SWRL Rules - Rule-based Expert Systems

3 - Automatic Enrichment of Biomedical Ontologies through the Use of Mappings

Dione is a logically consistent OWL representation of ICD-10-CM whose axioms define the inclusions and exclusions of ICD-10-CM through a methodology based on the ICD-10-CM/SNOMED-CT mappings provided by UMLS and BioPortal, which have been validated by a community of experts in the biomedical field. This article presents an automatic methodology that allows Dione to be populated with axioms derived from the mappings established between ICD-10-CM and another biomedical ontology available in BioPortal. To illustrate how this methodology works, the mappings between Dione and ORDO have been used; the latter is an ontology covering rare diseases, genes, and other characteristics, used here to populate Dione with new axioms. Once these axioms were included in Dione, their consistency was checked using the ELK reasoner, and a use case showed that the equivalent classes between the Dione and ORDO ontologies allowed the inference of axioms linking a class defining an ICD-10-CM concept in Dione to a class representing a rare disease in ORDO, and vice versa. This new methodology can be applied to any two biomedical ontologies whose mappings are already defined in BioPortal.
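As a minimal sketch of the axiom-population step described above, assuming a mapping table already exported from BioPortal, the snippet below turns ICD-10-CM/ORDO pairs into owl:equivalentClass axioms with rdflib; the example pairs and file name are illustrative placeholders, and consistency checking with ELK would be a separate step.

```python
# Minimal sketch: turn ICD-10-CM <-> ORDO mappings (e.g. exported from BioPortal)
# into owl:equivalentClass axioms that could be added to an ontology such as Dione.
# The identifiers below are illustrative placeholders, not the real mapping data.
from rdflib import Graph, Namespace
from rdflib.namespace import OWL, RDF

ICD = Namespace("http://purl.bioontology.org/ontology/ICD10CM/")
ORDO = Namespace("http://www.orpha.net/ORDO/")

# (ICD-10-CM code, ORDO identifier) pairs, as they might come from a mappings export.
mappings = [
    ("Q87.4", "Orphanet_558"),   # hypothetical example pair
    ("E75.2", "Orphanet_355"),   # hypothetical example pair
]

g = Graph()
g.bind("owl", OWL)
for icd_code, ordo_id in mappings:
    icd_class, ordo_class = ICD[icd_code], ORDO[ordo_id]
    g.add((icd_class, RDF.type, OWL.Class))
    g.add((ordo_class, RDF.type, OWL.Class))
    g.add((icd_class, OWL.equivalentClass, ordo_class))

# The resulting axioms can be merged into the target ontology and then checked
# for consistency with a reasoner (the paper uses ELK).
g.serialize(destination="dione_ordo_axioms.owl", format="xml")
print(g.serialize(format="turtle"))
```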

Authors: María Jesús García Godoy / Esteban López-Camacho / María Del Mar Roldán-García / Jose F Aldana Montes
Keywords: Rare Diseases - ICD-10-CM - Mappings - Biomedical Ontologies - Reasoning

4 - BIGOWL: Knowledge Centered Big Data Analytics

In recent decades, the growth of information sources in different areas of society, from healthcare to social networks, has highlighted the need for new analysis techniques, in what has come to be called Big Data. Classical optimization problems are not immune to this paradigm shift; for example, the Traveling Salesman Problem (TSP) can benefit from the data provided by the sensors deployed in cities, which can be accessed through Open Data portals. When performing Big Data analyses, whether for optimization or machine learning, one of the most common approaches is to use analysis workflows. These are composed of components, each of which carries out one step of the analysis. The flow of information in workflows can be annotated and stored using Semantic Web tools to facilitate the reuse of these components, or even of the complete workflow, in future analyses, thereby easing their reuse and, in turn, improving the process of building them. To this end, the BIGOWL ontology has been created; it makes it possible to trace the data value chain of workflows through semantics and also assists the analyst in workflow creation by guiding its composition with the information contained in its annotations of algorithms, data, components, and workflows. The problem that BIGOWL addresses and solves is that of giving structure to this information so that it can be integrated into the components. To validate the semantic model, a series of SPARQL queries and reasoning rules are presented to guide the creation and validation process in two case studies: first, the stream processing of real traffic data with Spark for route optimization in the urban environment of New York City; and second, the classification, using data mining algorithms, of an academic dataset, namely the Iris flower dataset.
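To convey how semantic annotations of components can guide workflow composition, the sketch below builds a tiny RDF graph of annotated components and uses a SPARQL query to find which components can consume a given output; the vocabulary is a made-up stand-in for illustration, not the actual BIGOWL ontology.

```python
# Toy sketch of annotation-guided workflow composition: components are annotated
# with the data types they consume/produce, and a SPARQL query suggests which
# component can follow another. Made-up vocabulary, not the BIGOWL ontology.
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, RDFS

EX = Namespace("http://example.org/workflow#")

g = Graph()
g.bind("ex", EX)

# Annotate three components with the data types they consume and produce.
components = [
    ("CSVReader",       None,           EX.TabularData),
    ("Normalizer",      EX.TabularData, EX.TabularData),
    ("KMeansClusterer", EX.TabularData, EX.ClusterModel),
]
for name, consumes, produces in components:
    comp = EX[name]
    g.add((comp, RDF.type, EX.Component))
    g.add((comp, RDFS.label, Literal(name)))
    if consumes is not None:
        g.add((comp, EX.hasInputType, consumes))
    g.add((comp, EX.hasOutputType, produces))

# Which components can be connected after CSVReader? (output type matches input type)
query = """
PREFIX ex: <http://example.org/workflow#>
SELECT ?candidate WHERE {
    ex:CSVReader ex:hasOutputType ?t .
    ?candidate ex:hasInputType ?t .
}
"""
for row in g.query(query):
    print(row.candidate)   # -> ex:Normalizer and ex:KMeansClusterer
```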

Authors: Cristóbal Barba-González / José García-Nieto / Maria Del Mar Roldan-Garcia / Ismael Navas-Delgado / Antonio J. Nebro / Jose F Aldana Montes
Keywords: Big Data - Machine Learning - Optimization - Semantic Web

5 - FIMED: Flexible management of biomedical data

In the last decade, clinical trial management systems have become an essential support tool for data management and analysis in clinical research. However, these clinical tools have design limitations, since they are currently not able to cover the need to adapt to the continuous changes in trial practice caused by the heterogeneous and dynamic nature of clinical research data. These systems are usually proprietary solutions provided by vendors for specific tasks. In this work, we propose FIMED, a software solution for the flexible management of clinical data from multiple trials, moving towards personalized medicine, which can contribute positively to clinical trials by improving the quality and ease of clinical researchers' work. This tool allows a dynamic and incremental design of patients' profiles in the context of clinical trials, providing a flexible user interface that hides the complexity of using databases. Clinical researchers can define personalized data schemas according to their needs and clinical study specifications. Thus, FIMED allows the incorporation of separate clinical data analyses from multiple trials. The efficiency of the software has been demonstrated with a real-world use case for a clinical trial in melanoma, which has been anonymized to provide a user demonstration. FIMED currently provides three data analysis and visualization components, enabling clinical exploration of gene expression data: heatmap visualization, cluster-heatmap visualization, and gene regulatory network inference and visualization. An instance of this tool is freely available on the web at https://khaos.uma.es/fimed. It can be accessed with a demo user account, "researcher", using the password "demo".

Journal: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE. Category: COMPUTER SCIENCE, THEORY & METHODS. Ranking: 13/110. Year: 2021. DOI: https://doi.org/10.1016/j.cmpb.2021.106496.
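As a hedged sketch of the flexible-schema idea (not FIMED's actual data model or API), the snippet below stores patient profiles from two hypothetical trials as schema-free MongoDB documents, each with the fields its own study defines; it assumes a local MongoDB instance is running.

```python
# Hedged sketch of the flexible-schema idea behind a clinical data manager:
# patient profiles from different trials are stored as schema-free documents,
# each following the data schema its own study defines. Collection and field
# names are illustrative assumptions, not FIMED's actual data model or API.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")   # assumes a local MongoDB instance
patients = client["clinical_demo"]["patients"]

# Trial A tracks tumour staging and a gene-expression panel.
patients.insert_one({
    "trial_id": "MEL-A",
    "patient_code": "P-001",
    "staging": {"T": "T2", "N": "N0", "M": "M0"},
    "gene_expression": {"BRAF": 7.3, "NRAS": 2.1},
})

# Trial B defines a different profile: treatment arm and follow-up visits.
patients.insert_one({
    "trial_id": "MEL-B",
    "patient_code": "P-104",
    "treatment_arm": "immunotherapy",
    "visits": [{"week": 4, "response": "partial"}, {"week": 12, "response": "stable"}],
})

# Both profiles live in the same collection and can be queried per trial.
for doc in patients.find({"trial_id": "MEL-A"}, {"_id": 0, "gene_expression": 1}):
    print(doc)
```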

Authors: Sandro Hurtado / José García-Nieto / Ismael Navas-Delgado / Jose F Aldana Montes
Keywords: Clinical Research - Clinical Trial Management Systems - Gene Expression Data Analysis - Gene Regulatory Network Inference - NoSQL Database

6 - Injecting domain knowledge in multi-objective optimization problems: A semantic approach

In the field of complex problem optimization with metaheuristics, semantics has been used for modeling different aspects, such as problem characterization, parameters, decision-maker's preferences, or algorithms. However, there is a lack of approaches in which ontologies are applied directly within the optimization process, with the aim of enhancing it by allowing the systematic incorporation of additional domain knowledge. This is due to the high level of abstraction of ontologies, which makes them difficult to map onto the code implementing the problems and/or the specific operators of metaheuristics. In this paper, we present a strategy to inject domain knowledge (by reusing existing ontologies or creating a new one) into a problem implementation that will be optimized using a metaheuristic. This approach, based on accepted ontologies, thus enables building and exploiting complex computing systems in optimization problems. We describe a methodology to automatically induce user choices (taken from the ontology) into the problem implementations provided by the jMetal optimization framework. To illustrate our proposal, we focus on the urban domain. Concretely, we start by defining an ontology representing the domain semantics for a city (e.g., buildings, bridges, points of interest, routes, etc.) that allows a decision maker to define a-priori preferences in a standard, reusable, and formal (logic-based) way. We validate our proposal with several instances of two use cases, consisting of bi-objective formulations of the Traveling Salesman Problem (TSP) and the Radio Network Design problem (RND), both in the context of an urban scenario. The results of the experiments conducted show how the semantic specification of domain constraints is effectively mapped into feasible solutions of the tackled TSP and RND scenarios. This proposal represents a step forward towards the automatic modeling and adaptation of optimization problems guided by semantics, where the annotations of a human expert can now be considered during the optimization process.
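The following toy sketch conveys the general idea in a much simplified form: a domain preference that could come from an urban ontology (here, a hard-coded restricted street) is injected into the evaluation of a bi-objective TSP as a penalty; it is an illustration only, not the jMetal-based methodology of the paper.

```python
# Toy sketch: injecting a domain preference (avoid a "restricted" street, as
# could be stated in an urban ontology) into a bi-objective TSP evaluation.
# A simplified illustration only, not the jMetal-based methodology of the paper.
import random

random.seed(1)
N = 8                                   # hypothetical city nodes
dist = [[0 if i == j else random.uniform(1, 10) for j in range(N)] for i in range(N)]
time_ = [[0 if i == j else random.uniform(2, 20) for j in range(N)] for i in range(N)]
restricted_edges = {(2, 5), (5, 2)}     # stand-in for ontology-derived constraints
PENALTY = 100.0                         # cost added per restricted street used

def evaluate(tour):
    """Bi-objective evaluation: (total distance, total travel time), both
    penalized when the tour uses a street the domain knowledge marks as restricted."""
    n = len(tour)
    edges = [(tour[i], tour[(i + 1) % n]) for i in range(n)]
    d = sum(dist[a][b] for a, b in edges)
    t = sum(time_[a][b] for a, b in edges)
    penalty = PENALTY * sum(1 for e in edges if e in restricted_edges)
    return d + penalty, t + penalty

# A metaheuristic would evolve tours; here we just compare two candidates.
tour_a = [0, 1, 2, 5, 3, 4, 6, 7]       # uses the restricted street 2 -> 5
tour_b = [0, 1, 2, 3, 5, 4, 6, 7]       # avoids it
print("violating tour:", evaluate(tour_a))
print("compliant tour:", evaluate(tour_b))
```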

Authors: Cristobal Barba-Gonzalez / Antonio J. Nebro / José García-Nieto / Maria Del Mar Roldan-Garcia / Ismael Navas-Delgado / Jose F Aldana Montes
Keywords: Decision Making - Domain Knowledge - Metaheuristics - Multi-objective Optimization - Ontology - Semantic Web Technologies