Improving Distance-Join Query Processing with Voronoi-Diagram based Partitioning in SpatialHadoop





Published in

Proceedings of the XXV Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2021)

Creative Commons License


SpatialHadoop is an extended MapReduce framework that supports global indexing techniques, which partition spatial datasets across several machines and improve spatial query processing performance compared to traditional Hadoop systems. SpatialHadoop supports several spatial operations (e.g., K Nearest Neighbor search, range query, spatial intersection join) and seven spatial partitioning techniques (Grid, Quadtree, STR, STR+ACs, k-d tree, Z-curve and Hilbert-curve). Distance-Join Queries (DJQs), such as the K Nearest Neighbors Join Query (KNNJQ) and the K Closest Pairs Query (KCPQ), are common operations in numerous spatial applications. DJQs are costly because they combine spatial joins with distance-based search. Data partitioning improves the management of large datasets and speeds up query processing. Therefore, performing DJQs efficiently with new partitioning methods in SpatialHadoop is a challenging task. In this paper, a new data partitioning technique based on Voronoi diagrams is designed and implemented in SpatialHadoop. Moreover, improved KNNJQ and KCPQ MapReduce algorithms that use the new partitioning mechanism are designed and developed for SpatialHadoop. Finally, the results of an extensive set of experiments with real-world datasets are presented, demonstrating that the new partitioning technique and the improved DJQ MapReduce algorithms are efficient, scalable and robust in SpatialHadoop.
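To make the terms in the abstract concrete, the following is a minimal single-machine sketch in Python, not the paper's MapReduce implementation and not SpatialHadoop's API. It assumes 2D points as tuples and Euclidean distance. The function names (`voronoi_partition`, `k_closest_pairs`, `knn_join`) and the way pivots are supplied are hypothetical, chosen only for illustration. It shows the general idea of Voronoi-based partitioning (each point goes to the cell of its nearest pivot) and the brute-force definitions of the two distance-join queries.

```python
import heapq
import math
from collections import defaultdict


def voronoi_partition(points, pivots):
    """Assign each point to the Voronoi cell of its nearest pivot.

    Ties are resolved in favor of the pivot with the lowest index.
    Returns a dict mapping pivot index -> list of points in that cell.
    """
    cells = defaultdict(list)
    for p in points:
        nearest = min(range(len(pivots)), key=lambda i: math.dist(p, pivots[i]))
        cells[nearest].append(p)
    return dict(cells)


def k_closest_pairs(P, Q, k):
    """Brute-force KCPQ: the k pairs (p, q), p in P and q in Q,
    with the smallest distances, as (distance, p, q) tuples in ascending order."""
    pairs = ((math.dist(p, q), p, q) for p in P for q in Q)
    return heapq.nsmallest(k, pairs)


def knn_join(P, Q, k):
    """Brute-force KNNJQ: for every p in P, its k nearest neighbors in Q."""
    return {p: heapq.nsmallest(k, Q, key=lambda q: math.dist(p, q)) for p in P}


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    pivots = [(0, 0), (10, 10)]
    points = [(0, 0), (1, 1), (9, 9), (10, 10)]
    print(voronoi_partition(points, pivots))

    P = [(0, 0), (5, 5)]
    Q = [(1, 0), (5, 6), (9, 9)]
    print(k_closest_pairs(P, Q, 2))
    print(knn_join(P, Q, 1))
```

The brute-force queries compare every point of one dataset with every point of the other, which illustrates why DJQs are costly. A partitioning scheme is meant to limit how many such comparisons each machine has to perform.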


About García-García, Francisco

Keywords

Data Partitioning, K Closest Pairs, K Nearest Neighbors Join, MapReduce, Spatial Query Evaluation, SpatialHadoop