Apostolico–Giancarlo algorithm: Difference between revisions

Content deleted Content added
Yobot (talk | contribs)
m References: WP:CHECKWIKI error fixes - Replaced endash with hyphen in sortkey per WP:MCSTJR using AWB (9100)
Line 1:
In [[computer science]], the '''Apostolico–Giancarlo algorithm''' is a variant of the [[Boyer-MooreBoyer–Moore string search algorithm]], the basic application of which is searching for occurrences of a pattern <math>P</math> in a text <math>T</math>. As with other comparison-based string searches, this is done by aligning <math>P</math> to a certain index of <math>T</math> and checking whether a match occurs at that index. <math>P</math> is then shifted relative to <math>T</math> according to the rules of the Boyer-Moore algorithm, and the process repeats until the end of <math>T</math> has been reached. Application of the Boyer-Moore shift rules often results in large chunks of the text being skipped entirely.
 
With regard to the shift operation, Apostolico-Giancarlo is exactly equivalent in functionality to Boyer-Moore. The utility of Apostolico-Giancarlo is to speed up the match-checking operation at any index. With Boyer-Moore, finding an occurrence of <math>P</math> in <math>T</math> requires that all <math>n</math> characters of <math>P</math> be explicitly matched. For certain patterns and texts, this is very inefficient - a simple example is when both pattern and text consist of the same repeated character, in which case Boyer-Moore runs in <math>O(n m)</math> where <math>m</math> is the length in characters of <math>T</math>. Apostolico-Giancarlo speeds this up by recording the number of characters matched at the alignments of <math>T</math> in a table, which is combined with data gathered during the pre-processing of <math>P</math> to avoid redundant equality checking for sequences of characters that are known to match.