Helsinki Algorithms Seminar: "Hybrid Indexing" Hector Ferrada, UH
Weekly meeting of researchers in the Helsinki area interested in the art of algorithms and algorithm design.
Map © OpenStreetMap. Some rights reserved.
Hector Ferrada
University of Helsinki
Hybrid Indexing
Abstract:
Hybrid indexing (Ferrada et al., Phil. Trans. R. Soc. A 372) is a recent approach to text indexing that allows the space-usage of conventional text indexes (e.g., suffix trees, suffix arrays, FM-indexes) to scale well with the text size, n, when z, the size of the Lempel-Ziv parsing of the text, is small relative to n. The price for this improved scalability is that an upper bound M on the pattern length that can be searched for must be declared at index construction time. Because the size of the resulting index contains an O(Mz) term, M must be kept reasonably small, though it has been shown that M \approx 100 leads to acceptable performance in some genomic applications. However, despite its promise, the practical performance of hybrid indexing relative to other compressed index data structures is poorly understood. This talk addresses that need, detailing extensive experiments showing hybrid indexing -- when carefully implemented -- to be significantly smaller and faster than alternative approaches on a broad range of data of different levels of compressibility. We also describe our work, ``Hybrid Indexing Revisited" (Ferrada, Kempa and Puglisi, to appear in ALENEX'18); in which we present practical extensions to hybrid indexing that obviate the restriction on M, supporting search for patterns of arbitrary length, as well as fast extraction (i.e. decompression) of arbitrary substrings from the original string.
**
Helsinki Algorithms Seminar is a weekly meeting of researchers in the Helsinki area interested in the art of algorithms and algorithm design, broadly interpreted to cover both theoretical ideas and algorithm engineering on concrete computing platforms. In most cases we have a presentation prepared for each meeting to communicate an idea, a recent result, work-in-progress, or demo, but this should not be at the expense of discussion and simply having fun with algorithms.
Our affiliations are with Aalto University and the University of Helsinki, and accordingly our activities alternate between the Otaniemi Campus of Aalto University and the Kumpula Campus of University of Helsinki, catalyzed by the Helsinki Institute for Information Technology HIIT, under the Algorithmic Data Analysis (ADA) programme.
Welcome!