Article: Distributed algorithms for big spatial and spatio-textual query processing
Date
Publisher
Published in
Creative Commons License
Abstract
A vast amount of geo-referenced data is generated daily by mobile devices, GPS-enabled devices, and other sensors, increasing the importance of spatio-textual analyses of such data. Big spatio-textual data requires new distributed processing technologies for its management, storage, analysis, and visualization. Distributed Spatio-Textual Data Management Systems (DSTDMSs) consist of shared-nothing clusters of computers specifically designed for distributed processing of large-scale spatio-textual data. This paper presents our emerging work on designing new storage methods and query processing algorithms for Apache Sedona (a recent open-source in-memory cluster computing system for spatial data processing) to support batch and streaming spatio-textual data processing. Our research aims to incorporate new partitioning methods and indexing mechanisms that will support the implementation of new static and continuous spatio-textual queries, especially distance-based spatio-textual joins. Finally, we will evaluate the new proposals through exhaustive experiments on Apache Sedona as a DSTDMS, analyzing the experimental results and drawing conclusions from them.
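To make the notion of a distance-based spatio-textual join concrete, the following is a minimal single-machine sketch in plain Python. It is not the paper's algorithm and does not use the Apache Sedona API. It assumes one common formulation of the query: report every pair of objects whose Euclidean distance is at most `eps` and whose keyword sets have Jaccard similarity of at least `tau`. The function names, the tuple layout `(id, x, y, keywords)`, and the uniform grid used for partitioning are all hypothetical choices made for illustration.

```python
from collections import defaultdict
from itertools import product
from math import floor, hypot


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cell_of(x, y, size):
    """Grid cell containing the point (x, y) for square cells of side `size`."""
    return (floor(x / size), floor(y / size))


def spatio_textual_distance_join(r_objs, s_objs, eps, tau):
    """Return sorted (r_id, s_id) pairs within distance eps and Jaccard >= tau.

    Assumed (hypothetical) layout: each object is (id, x, y, keywords),
    where keywords is a set of strings. Requires eps > 0.
    """
    # Partition S into a uniform grid whose cell side equals eps, so every
    # candidate for a point lies in its own cell or one of the 8 neighbours.
    grid = defaultdict(list)
    for s in s_objs:
        grid[cell_of(s[1], s[2], eps)].append(s)

    result = []
    for r_id, rx, ry, r_kw in r_objs:
        cx, cy = cell_of(rx, ry, eps)
        for dx, dy in product((-1, 0, 1), repeat=2):
            for s_id, sx, sy, s_kw in grid.get((cx + dx, cy + dy), ()):
                # Check the spatial predicate first, then the textual one.
                if hypot(rx - sx, ry - sy) <= eps and jaccard(r_kw, s_kw) >= tau:
                    result.append((r_id, s_id))
    return sorted(result)


# Illustrative data only; these points and keywords are made up.
places = [("r1", 0.0, 0.0, {"pizza", "italian"}), ("r2", 5.0, 5.0, {"sushi"})]
posts = [
    ("s1", 0.5, 0.5, {"pizza", "italian", "cheap"}),
    ("s2", 0.4, 0.0, {"sushi"}),
    ("s3", 5.2, 5.1, {"sushi", "japanese"}),
]
pairs = spatio_textual_distance_join(places, posts, eps=1.0, tau=0.5)
```

In a distributed setting, each grid cell (or group of cells) would roughly correspond to a partition processed by one worker, which is why the choice of partitioning method and local index matters for this class of joins.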