String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)

Reading time: 3 minute
...

📝 Original Info

  • Title: String Matching with Inversions and Translocations in Linear Average Time (Most of the Time)
  • ArXiv ID: 1012.0280
  • Date: 2013-05-09
  • Authors: Researchers from original ArXiv paper

📝 Abstract

We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\bigO(nm\max(\alpha, \beta))$-time complexity in the worst case and $\bigO(\max(\alpha, \beta))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion. Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a $\bigO(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log^{1-\epsilon} m)$, where $\epsilon > 0$ and $\sigma$ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in practical cases.

💡 Deep Analysis

Deep Dive into String Matching with Inversions and Translocations in Linear Average Time (Most of the Time).

We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\bigO(nm\max(\alpha, \beta))$-time complexity in the worst case and $\bigO(\max(\alpha, \beta))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion. Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a $\bigO(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log^{1-\epsilon} m)$, where $\epsilon > 0$ and $\sigma$ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in practical cases.

📄 Full Content

We present an efficient algorithm for finding all approximate occurrences of a given pattern $p$ of length $m$ in a text $t$ of length $n$ allowing for translocations of equal length adjacent factors and inversions of factors. The algorithm is based on an efficient filtering method and has an $\bigO(nm\max(\alpha, \beta))$-time complexity in the worst case and $\bigO(\max(\alpha, \beta))$-space complexity, where $\alpha$ and $\beta$ are respectively the maximum length of the factors involved in any translocation and inversion. Moreover we show that under the assumptions of equiprobability and independence of characters our algorithm has a $\bigO(n)$ average time complexity, whenever $\sigma = \Omega(\log m / \log\log^{1-\epsilon} m)$, where $\epsilon > 0$ and $\sigma$ is the dimension of the alphabet. Experiments show that the new proposed algorithm achieves very good results in practical cases.

Reference

This content is AI-processed based on ArXiv data.

Start searching

Enter keywords to search articles

↑↓
ESC
⌘K Shortcut