CS Forum: Przemek Uznański, ETH Zurich

2017-06-27 14:15:00 2017-06-27 15:00:00 Europe/Helsinki CS Forum: Przemek Uznański, ETH Zurich CS department's public guest lecture on 'Optimal trade-offs for pattern matching with k mismatches'. The lecture is open to everyone free-of-charge. http://old.cs.aalto.fi/en/midcom-permalink-1e75590fa90866a559011e7a33a511d973291f291f2 Konemiehentie 2, 02150, Espoo

CS department's public guest lecture on 'Optimal trade-offs for pattern matching with k mismatches'. The lecture is open to everyone free-of-charge.

27.06.2017 / 14:15 - 15:00
seminar room T5, Konemiehentie 2, 02150, Espoo, FI

Speaker: Przemek Uznański
Affiliation: ETH Zurich

Host: Prof. Jukka Suomela
Time: 14:15 (coffee at 14:00)
Venue: T5, CS building

Optimal trade-offs for pattern matching with k mismatches

Abstract

Given a pattern of length m and a text of length n, the goal in mismatch pattern matching is to compute, for every m-substring of the text, the exact Hamming distance to the pattern. For this problem Abrahamson [SIAM J. Comput. 1987] provided an O~(n*m^0.5) solution. Due to the conditional lower bound by Indyk, we know that this is optimal in the regime of combinatorial algorithms (conditioned on combinatorial matrix multiplication conjecture).

A classical relaxation of this problem is pattern matching with k mismatches, where the goal is to compute exact Hamming distance if it is at most k, and output that it exceeds k. A series of algorithms was proposed over the last 30 years: starting with O(kn) by Landau and Vishkin [TCS 1986], improved to O~(n*k^0.5) and O~(n + n/m * k^3) by Amir, Lewenstein and Porat [SODA 2000] and finally O~(n + n/m * k^2) by Clifford et al. [SODA 2016].

Thus the best algorithm was due to the combination of results of Amir et. al. and Clifford et al.: O~(n + n/m * min(k^2, m*k^0.5)).

We improve upon that complexity to O~(n + n/m * k * m^0.5), providing smooth transition between O~(n) complexity when k=m^0.5 and O~(n*m^0.5) when k=m. Our algorithm carefully combines ideas from all the previous approaches, while providing new insights into the structure of the problem.

Perhaps slightly surprisingly, we show that this complexity is optimal (up to logarithmic factors, conditioned on combinatorial matrix multiplication conjecture), extending the reduction of Indyk to take into the account the dependency on k.

Bio

I am currently postdoc at ETH Zurich (since December 2015), in Peter Widmayer's group. Previously I was affiliated to Aalto University (Jukka Suomela's group) and University Aix-Marseille (Jeremie Chalopin's group) as a postdoc. I received PhD from Inria Bordeaux (advisor: Olivier Beaumont).

I am interested in distributed algorithms (i.e. self-organization, locality), graph algorithms (i.e. algebraic tools), stringology (i.e. computing in sublinear time/memory) and biological computation (i.e. population protocols).