Búsqueda avanzada

El autor Gonzalo Navarro ha publicado 5 artículo(s):

1 - An index for moving objects with constant-time access to their compressed trajectories

As the number of vehicles and devices equipped with GPS technology has grown explosively, an urgent need has arisen for time- and space-efficient data structures to represent their trajectories. The most commonly desired queries are the following: queries about an object’s trajectory, range queries and nearest neighbor queries. In this paper we consider that the objects can move freely and we present a new compressed data structure for storing their trajectories, based on a combination of logs and snapshots, with the logs storing sequences of the objects’ relative movements and the snapshots storing their absolute positions sampled at regular time intervals. We call our data structure ContaCT because it provides Constant-time access to Compressed Trajectories. Its logs are based on a compact partial-sums data structure that returns cumulative displacement in constant time, and allows us to compute in constant time any object’s position at any instant, enabling a speedup when processing several other queries. We have compared ContaCT experimentally with another compact data structure for trajectories, called GraCT, and witha classic spatio-temporal index, the MVR-tree. Our results show that ContaCT outperforms the MVR-tree by orders of magnitude in space and also outperforms the compressed representation in time performance.

Autores: Nieves R. Brisaboa / Travis Gagie / Adrián Gómez-Brandón / Gonzalo Navarro / Jose R. Parama / 
Palabras Clave: Compact data structures - Spatio-temporal query - Trajectories representation

2 - Compact Representations of Event Sequences

Brisaboa, N. R.; de Bernardo, Guillermo; Navarro, G.; Varela Rodeiro, T.; Seco, D.: «Compact Representations of Event Sequences», en Proceedings of the 2018 Data Compression Conference (DCC 2018), IEEE Computer Society, Snowbird, Utah (Estados Unidos), 2018, pp. 237-246.GGS Class: 2DOI: 10.1109/DCC.2018.00032We introduce a new technique for the efficient management of large sequences of multidimensional data, which takes advantage of regularities that arise in real-world datasets and supports different types of aggregation queries. More importantly, our representation is flexible in the sense that the relevant dimensions and queries may be used to guide the construction process, easily providing a space-time tradeoff depending on the relevant queries in the domain. We provide two alternative representations for sequences of multidimensional data and describe the techniques to efficiently store the datasets and to perform aggregation queries over the compressed representation. We perform experimental evaluation on realistic datasets, showing the space efficiency and query capabilities of our proposal.

Autores: Tirso Varela Rodeiro / Nieves Rodríguez Brisaboa / Gonzalo Navarro / Diego Seco / Guillermo de Bernardo Roca / 
Palabras Clave: Aggregation - Multidimensional data - Succinct data structures

3 - Two-Dimensional Block Trees

Brisaboa, N. R.; Gagie, T.; Gomez Brandon, A.; Navarro, G.: «Two-Dimensional Block Trees», en Proceedings of the 2018 Data Compression Conference (DCC 2018), IEEE Computer Society, Snowbird, Utah (Estados Unidos), 2018, pp. 227-236.GGS Class: 2DOI: 10.1109/DCC.2018.00031The Block Tree (BT) is a novel compact data structure designed to compress sequence collections. It obtains compression ratios close to Lempel-Ziv and supports efficient direct access to any substring. The BT divides the text recursively into fixed-size blocks and those appearing earlier are represented with pointers. On repetitive collections, a few blocks can represent all the others, and thus the BT reduces the size by orders of magnitude. In this paper we extend the BT to two dimensions, to exploit repetitiveness in collections of images, graphs, and maps. This two-dimensional Block Tree divides the image regularly into subimages and replaces some of them by pointers to other occurrences thereof. We develop a specific variant aimed at compressing the adjacency matrices of Web graphs, obtaining space reductions of up to 50% compared with the k2-tree, which is the best alternative supporting direct and reverse navigation in the graph.

Autores: Nieves R. Brisaboa / Travis Gagie / Adrián Gómez Brandón  / Gonzalo Navarro / 
Palabras Clave: Compact data structures - Compression - Images - Web graphs

4 - Universal indexes for highly repetitive document collections

Abstract ======== Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others. Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space. We introduce new techniques for compressing inverted indexes that exploit this near-copy regularity. They are based on run-length, Lempel-Ziv, or grammar compression of the differential inverted lists, instead of the usual practice of gap-encoding them. We show that, in this highly repetitive setting, our compression methods significantly reduce the space obtained with classical techniques, at the price of moderate slowdowns. Moreover, our best methods are universal, that is, they do not need to know the versioning structure of the collection, nor that a clear versioning structure even exists. We also introduce compressed self-indexes in the comparison. These are designed for general strings (not only natural language texts) and represent the text collection plus the index structure (not an inverted index) in integrated form. We show that these techniques can compress much further, using a small fraction of the space required by our new inverted indexes. Yet, they are orders of magnitude slower. Publication Details =================== Francisco Claude, Antonio Fariña, Miguel A. Martínez-Prieto, Gonzalo Navarro. Universal indexes for highly repetitive document collections Information Systems, 61, pp. 1-23, 2016, DOI: Citations Google Scholar: 3 (2 self-citations)

Autores: Francisco Claude / Antonio Fariña / Miguel A. Martinez-Prieto / Gonzalo Navarro / 
Palabras Clave: Inverted index - Repetitive collections - Self-index

5 - Semantrix: A Compressed Semantic Matrix

We present a compact data structure to represent both the duration and length of homogeneous segments of trajectories from moving objects in a way that, as a data warehouse, it allows us to efficiently answer cumulative queries. The division of trajectories into relevant segments has been studied in the literature under the topic of Trajectory Segmentation. In this paper, we design a data structure to compactly represent them and the algorithms to answer the more relevant queries. We experimentally evaluate our proposal in the real context of an enterprise with mobile workers (truck drivers) where we aim at analyzing the time they spend in different activities. To test our proposal under higher stress conditions we generated a huge amount of synthetic realistic trajectories and evaluated our system with those data to have a good idea about its space needs and its efficiency when answering different types of queries.

Autores: Nieves R. Brisaboa / Antonio Fariña / Gonzalo Navarro / Tirso V. Rodeiro / 
Palabras Clave: Aggregation - Data Structures - Multidimensional data