Skip to Main content Skip to Navigation

Complexité de la recherche de motifs dans un texte aléatoire

Abstract : Let Ʃ be an alphabet of size s ⩾2. A pattern or dictionary is a set of words written with letters from Ʃ. The multiple pattern-matching problem consist in finding all occurrences of the words of a given dictionary in a text. In this thesis, we focuse on the estimation of the complexity of the exact or appromate multiple pattern-matching problem in termes of minimum ratio of text to be read in a random text of lenght n in order to find all the exact or approximate occurrences of the words of an arbitrary dictionary. This complexity is linked to the notion of content r = (rᵢ)I ≥1 of a dictionary, i.e., the vector of integers whose i-th entry rᵢ is the number of words of lenght i int the dictionary. On one hand, we prove that the complexity of the exact multiple pattern-matching problem for a random dictionary of content r when at most k editiong errors (insertion, deletion, substitution) are allowed is in Ɵ (αsΦ(r)+ βs K+1/ᵐmᵢn) with Φ(r)=1/ks (max/m In(smr m)/m + 1/2ˢᵐmᵢn), and αs, βs, et ks depend only on the size s of the alphabet. For the exact pattern-matching problem as well as for the approximate pattern-matching problem, the steps of proof are the same. For the upper bounds, algorithms that reached the required complexity for dictionary of content r are presented and analyzed. The lower bounds are established thanks to counting arguments.
Complete list of metadatas
Contributor : Abes Star :  Contact
Submitted on : Wednesday, September 23, 2020 - 7:36:13 PM
Last modification on : Wednesday, October 14, 2020 - 4:09:17 AM
Long-term archiving on: : Thursday, December 3, 2020 - 4:22:51 PM


Version validated by the jury (STAR)


  • HAL Id : tel-02947291, version 1



Tsinjo Tony Rakotoarimalala. Complexité de la recherche de motifs dans un texte aléatoire. Complexité [cs.CC]. Université Sorbonne Paris Cité, 2019. Français. ⟨NNT : 2019USPCD015⟩. ⟨tel-02947291⟩



Record views


Files downloads