
Author Antonio Corral has published 8 articles:

1 - Creating datasets for data analysis through a cloud microservice-based architecture

Data analysis is a trending technique, driven by the growing interest in analyzing patterns and generating knowledge in different domains. However, it is difficult to know at design time what raw data should be collected, how it will be analyzed, or which analysis techniques will be applied to it. Service-oriented architectures can help solve these problems by providing flexible and reliable architectures. In this paper, we present a cloud-based microservice software architecture for generating datasets on which to carry out data analysis. This architecture facilitates acquiring data, which may be located in a data center, distributed, or even spread across different devices (ubiquitous computing) due to the rise of the IoT. It provides an infrastructure on which multiple developer groups can work in parallel on the microservices. These microservices also adapt reliably and affordably to the lack of specific requirements for some functionalities and to their fast evolution and variability, which stem from rapidly changing client needs.

Authors: Antonio Jesús Fernández-García / Javier Criado / Antonio Corral / Luis Iribarne
Keywords: architectures - datasets - microservices

2 - A First Approach towards Storage and Query Processing of Big Spatial Networks in Scalable and Distributed Systems

Due to the ubiquitous use of spatial data applications and the large amounts of spatial data that these applications generate, the processing of large-scale queries in distributed systems is becoming increasingly popular. Complex spatial systems are very often organized in the form of Spatial Networks, a type of graph where nodes and edges are embedded in space. Examples of these spatial networks are transportation and mobility networks, mobile phone networks, social and contact networks, etc. When these spatial networks are big enough to exceed the capacity of commonly used spatial computing technologies, we have Big Spatial Networks, and managing them requires distributed graph-parallel systems. In this paper, we describe our emerging work on the design of new storage methods and query processing algorithms over big spatial networks in scalable and distributed systems, which has been a very active research area in recent years.

Authors: Manel Mena / Antonio Corral / Luis Iribarne
Keywords: Distributed Systems - query processing - Spatial Networks - Storage Methods

3 - Efficient query processing on large spatial databases: A performance study

Processing of spatial queries has been studied extensively in the literature. In most cases, it is accomplished by indexing spatial data using spatial access methods. Spatial indexes, such as those based on the Quadtree, are important in spatial databases for efficient execution of queries involving spatial constraints and objects. In this paper, we study a recent balanced disk-based index structure for point data, called xBR+-tree, that belongs to the Quadtree family and hierarchically decomposes space in a regular manner. For the most common spatial queries, like Point Location, Window, Distance Range, Nearest Neighbor and Distance-based Join, the R-tree family is a very popular choice of spatial index, due to its excellent query performance. For this reason, we compare the performance of the xBR+-tree with respect to the R*-tree and the R+-tree for tree building and processing the most studied spatial queries. To perform this comparison, we utilize existing algorithms and present new ones. We demonstrate through extensive experimental performance results (I/O efficiency and execution time), based on medium and large real and synthetic datasets, that the xBR+-tree is a big winner in execution time in all cases and a winner in I/O in most cases.

Authors: George Roumelis / Michael Vassilakopoulos / Antonio Corral / Yannis Manolopoulos
Keywords: Performance evaluation - Quadtrees - query processing - R-trees - Spatial access methods - Spatial databases - xBR-trees

4 - Una arquitectura de microservicios para componentes digitales en la Web de las Cosas

Communication between Internet of Things (IoT) devices is highly heterogeneous, which leads to interoperability and integration problems between devices or platforms. In addition, because of the low computing power of these devices, bottlenecks are common when communicating with them. To solve these problems, we propose a microservice architecture for managing what we have called Digital Dices (DD). DDs are a virtual representation of IoT devices analogous to the Digital Twin concept, but they incorporate a set of new features that improve the management of physical devices. DDs aim to address the interoperability and scaling problems of IoT devices through a holistic approach. These elements will provide a solution that supports event management and input/output control using web technologies. Finally, we intend to make them compatible with the Web of Things (WoT) standards and to prepare them to form part of an Open Data system.

Authors: Manel Mena / Javier Criado / Luis Iribarne / Antonio Corral
Keywords: Digital Twin - Interoperability - IoT - Microservices - Open Data - WoT

5 - Efficient Large-scale Distance-Based Join Queries in SpatialHadoop

Efficient processing of Distance-Based Join Queries (DBJQs) in spatial databases is of paramount importance in many application domains (e.g. image processing, location-based systems, geographical information systems (GIS), continuous monitoring in streaming data settings, road network systems, etc.). The most representative and known DBJQs are the K Closest Pairs Query (KCPQ) and the ε Distance Join Query (εDJQ). These types of join queries are characterized by a number of desired pairs (K) or a distance threshold (ε) between the components of the pairs in the final result, over two spatial datasets. Both are expensive operations, since two spatial datasets are combined with additional constraints, and they become even more costly operations for large-scale data. Given the increasing volume of spatial data originating from multiple sources and stored in distributed servers, it is not always efficient to perform DBJQs on a centralized server. For this reason, this paper addresses the problem of computing DBJQs on big spatial datasets in SpatialHadoop, an extension of Hadoop-MapReduce that supports efficient processing of spatial queries in a cloud-based setting. SpatialHadoop injects spatial data awareness in each Hadoop layer, i.e. language, storage, MapReduce and operations layers. We propose novel algorithms, based on plane-sweep, to perform efficient parallel DBJQs on large-scale spatial datasets in SpatialHadoop. In addition to the plane-sweep base technique, we present a methodology for improving the performance of the KCPQ algorithms by the computation of an upper bound of the distance of the K-th closest pair. To demonstrate the benefits of our proposed methodologies, we present the results of the execution of an extensive set of experiments that demonstrate the efficiency and scalability of our proposals using big synthetic and real-world point datasets.

Authors: Antonio Corral / Francisco Garcia-Garcia / Luis Iribarne / Michael Vassilakopoulos / Yannis Manolopoulos
Keywords: eDJQ - KCPQ - MapReduce - Spatial Data Processing - Spatial Query Evaluation - SpatialHadoop

6 - Efficient Distance Join Query Processing in Distributed Spatial Data Management Systems

Due to the ubiquitous use of spatial data applications and the large amounts of such data these applications use, the processing of large-scale distance joins in distributed systems is becoming increasingly popular. Distance Join Queries (DJQs) are important and frequently used operations in numerous applications, including data mining, multimedia and spatial databases. DJQs (e.g., k Nearest Neighbor Join Query, k Closest Pair Query, ε Distance Join Query, etc.) are costly operations, since they involve both the join and distance-based search, and performing DJQs efficiently is a challenging task. Recent Big Data developments have motivated the emergence of novel technologies for distributed processing of large-scale spatial data in clusters of computers, leading to Distributed Spatial Data Management Systems (DSDMSs). Distributed cluster-based computing systems can be classified as Hadoop-based or Spark-based systems. Based on this classification, in this paper, we compare two of the most recent and leading DSDMSs, SpatialHadoop and LocationSpark, by evaluating the performance of several existing and newly proposed parallel and distributed DJQ algorithms under various settings with large spatial real-world datasets. A general conclusion arising from the execution of the distributed DJQ algorithms studied is that, while SpatialHadoop is a robust and efficient system when large spatial datasets are joined (since it is built on top of the mature Hadoop platform), LocationSpark is the clear winner in total execution time efficiency when medium spatial datasets are combined (due to in-memory processing provided by Spark). However, LocationSpark requires higher memory allocation when large spatial datasets are involved in DJQs (even more so when k and ε are large). Finally, this detailed performance study has demonstrated that the new distributed DJQ algorithms we have proposed are efficient, robust and scalable with respect to different parameters, such as dataset sizes, k, ε and number of computing nodes.

Authors: Francisco Garcia-Garcia / Antonio Corral / Luis Iribarne / Michael Vassilakopoulos / Yannis Manolopoulos
Keywords: Distance Join - LocationSpark - Space Partitioning - Spatial Data Processing - Spatial Query Evaluation - SpatialHadoop

7 - Improving Distance-Join Query Processing with Voronoi-Diagram based Partitioning in SpatialHadoop

SpatialHadoop is an extended MapReduce framework supporting global indexing techniques that partition spatial datasets across several machines and improve spatial query processing performance compared to traditional Hadoop systems. SpatialHadoop supports several spatial operations (e.g., K Nearest Neighbor search, range query, spatial intersection join, etc.) and seven spatial partitioning techniques (Grid, Quadtree, STR, STR+, k-d tree, Z-curve and Hilbert-curve). Distance-Join Queries (DJQs), like the K Nearest Neighbors Join Query (KNNJQ) and K Closest Pairs Query (KCPQ), are common operations used in numerous spatial applications. DJQs are costly operations, since they combine spatial joins with distance-based search. Data partitioning improves the management of large datasets and speeds up query performance. Therefore, performing DJQs efficiently with new partitioning methods in SpatialHadoop is a challenging task. In this paper, a new data partitioning technique based on Voronoi diagrams is designed and implemented in SpatialHadoop. Moreover, improved KNNJQ and KCPQ MapReduce algorithms, using the new partitioning mechanism, are also designed and developed for SpatialHadoop. Finally, the results of an extensive set of experiments with real-world datasets are presented, demonstrating that the new partitioning technique and the improved DJQ MapReduce algorithms are efficient, scalable and robust in SpatialHadoop.

Authors: Francisco Garcia-Garcia / Antonio Corral / Luis Iribarne / Michael Vassilakopoulos
Keywords: Data Partitioning - K Closest Pairs - K Nearest Neighbors Join - MapReduce - Spatial Query Evaluation - SpatialHadoop

8 - Hacia una Plataforma de Gestión Inteligente de Calidad de Aire en Puertos Marítimos

Currently, road and maritime traffic produces heavy environmental pollution in seaports, affecting the cities in which they are located. In particular, pollution is one of the most important problems to tackle, since it can seriously affect the health and quality of life of port staff and tourists as well as of citizens living near the ports, and it can trigger or worsen certain diseases or even cause death in certain risk groups. Although smart ports usually monitor environmental quality, they neither send automated contextual alerts according to the situations of interest detected in real time nor provide a repository of air quality software components that can be reused by other seaports sharing the same needs. This article presents an R&D&I project that proposes an innovative, reusable and adaptable platform for monitoring and managing air quality in different seaports more efficiently and in real time, as well as for automatically sending contextual alerts in order to reduce as far as possible the damage to the environment, to the cities in which the ports are located, and to their socioeconomic context. It is therefore a project with an innovative and sustainable contribution towards the digital transformation of ports, bringing together the fields of Smart Cities and Industry 4.0.

Authors: Juan Boubeta-Puig / Javier Criado / Guadalupe Ortiz / Nicolás Padilla / Alfonso García de Prado / Rosa Ayala / David Corral-Plaza / Antonio Corral / Inmaculada Medina-Bulo / Luis Iribarne
Keywords: service-oriented and event-driven architecture - air quality - complex event processing - Smart port - digital transformation - Web of Things