Distance Range Queries in SpatialHadoop





Published in

Actas de las XXI Jornadas de Ingeniería del Software y Bases de Datos (JISBD 2016)


CC BY 4.0


Efficient processing of Distance Range Queries (DRQs) is of great importance in spatial databases because of their wide range of applications. This type of spatial query is characterized by a distance range over one or two datasets. The most representative and best-known DRQs are the εDistance Range Query (εDRQ) and the εDistance Range Join Query (εDRJQ). Given the increasing volume of spatial data, it is difficult to perform a DRQ efficiently on a centralized machine. Moreover, the εDRJQ is an expensive spatial operation, since it can be considered a combination of the εDR query and the spatial join query. For this reason, this paper addresses the problem of computing DRQs on big spatial datasets in SpatialHadoop, an extension of Hadoop that supports spatial operations efficiently, and proposes new algorithms in SpatialHadoop to perform efficient parallel DRQs on large-scale spatial datasets. We have evaluated the performance of the proposed algorithms in several situations with big synthetic and real-world datasets. The experiments have demonstrated the efficiency (in terms of total execution time and number of distance computations) and scalability (in terms of ε values, dataset sizes, and number of computing nodes) of our proposal.
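
To make the two query types named in the abstract concrete, the following is a minimal brute-force sketch in Python. It assumes the common definitions: a distance range query returns every point of one dataset within a distance epsilon of a query point, and a distance range join returns every pair of points, one from each dataset, within epsilon of each other. The function names, the Euclidean metric, and the sample data are illustrative assumptions. This is not the paper's parallel SpatialHadoop algorithm, only a reference for what the queries compute.

```python
from math import dist  # Euclidean distance, Python 3.8+


def distance_range_query(points, q, eps):
    """Return every point in `points` within distance `eps` of `q`.

    Hypothetical helper: a linear scan over a single dataset.
    """
    return [p for p in points if dist(p, q) <= eps]


def distance_range_join(P, Q, eps):
    """Return every pair (p, q), p in P and q in Q, with dist(p, q) <= eps.

    Hypothetical helper: a nested loop costing |P| * |Q| distance
    computations, which is why the join is the expensive operation.
    """
    return [(p, q) for p in P for q in Q if dist(p, q) <= eps]


# Illustrative sample data (not taken from the paper).
P = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
Q = [(0.0, 1.0), (6.0, 5.0)]

near_origin = distance_range_query(P, (0.0, 0.0), 1.5)
close_pairs = distance_range_join(P, Q, 1.5)
```

The nested loop grows quadratically with dataset size, which illustrates why a centralized machine struggles with large inputs and why partitioned, parallel processing is attractive.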


About Corral, Antonio
