Substring: Difference between revisions

Content deleted Content added
top: suggest to use double quotes, so no nowiki tags at all are needed
Jepin (talk | contribs)
updated definitions of substring, prefix and suffix, which contained several errors
Line 9:
== Substring ==
 
A string <math>u</math> is a substring (or factor)<ref name=Lot97/> of a string <math>T = t_1 \dots t_nt</math> isif athere stringexists two strings <math>\hat T = t_{1+i} \dots t_{m+i}p</math>, whereand <math>0 \leq ip</math> andsuch that <math>mt + i \leq= npus</math>. In particular, the empty string is a substring of every string. A substring of a string is a prefix of a suffix of the string, and equivalently a suffix of a prefix. If <math>\hat Ts</math> is a substring of <math>Tt</math>, it is also a [[subsequence]], which is a more general concept. Given a pattern <math>P</math>, you can find its occurrences in a string <math>T</math> with a [[string searching algorithm]]. Finding the longest string which is equal to a substring of two or more strings is known as the [[longest common substring problem]].
 
Example: The string <code>ana</code> is equal to substrings (and subsequences) of <code>banana</code> at two different offsets:
Line 20:
 
In the mathematical literature, substrings are also called '''subwords''' (in America) or '''factors''' (in Europe).
 
Not including the empty substring, the number of substrings of a string of length <math>n</math> where symbols only occur once, is the number of ways to choose two distinct places between symbols to start/end the substring. Including the very beginning and very end of the string, there are <math>n+1</math> such places. So there are <math>\tbinom{n+1}{2} = \tfrac{n(n+1)}{2}</math> non-empty substrings.
 
== Prefix ==
{{see also|String operations#Prefixes}}
A prefix of a string <math>Tp</math> =is t_1a \dotsprefix<ref t_n<name=Lot97/math> isof a string <math>\widehatt</math> Tif =there t_1exists \dotsa t_{m}string <math>s</math>, wheresuch that <math>mt \leq= nps</math>. A ''proper prefix'' of a string is not equal to the string itself (<math>0 \leq m < n</math>);<ref name=Kel95/> some sources<ref name=Gus97/> in addition restrict a proper prefix to be non-empty (<math>0 < m < n</math>). A prefix can be seen as a special case of a substring.
 
Example: The string <code>ban</code> is equal to a prefix (and substring and subsequence) of the string <code>banana</code>:
Line 33 ⟶ 31:
ban
 
The square subset symbol is sometimes used to indicate a prefix, so that <math>\widehat Tp \sqsubseteq Tt</math> denotes that <math>\widehat Tp</math> is a prefix of <math>Tt</math>. This defines a [[binary relation]] on strings, called the [[prefix relation]], which is a particular kind of [[prefix order]].
 
In [[formal language theory]], the term ''prefix of a string'' is also commonly understood to be the set of all prefixes of a string, with respect to that language.
 
== Suffix ==
 
A string <math>s</math> is a suffix<ref name=Lot97/> of a string is<math>t</math> anyif substringthere ofexists thea string which includes<math>p</math> itssuch lastthat letter,<math>t including= itselfps</math>. A ''proper suffix'' of a string is not equal to the string itself. A more restricted interpretation is that it is also not empty{{ref|Gus97}}. A suffix can be seen as a special case of a substring.
 
Example: The string <code>nana</code> is equal to a suffix (and substring and subsequence) of the string <code>banana</code>:
Line 82 ⟶ 78:
| ___location = London
| isbn = 0-13-497777-7
}}</ref>
<ref name=Lot97>{{cite book
| last = Lothaire
| first = M.
| year = 1997
| title = Combinatorics on words
| publisher = Cambridge University Press
| ___location = Cambridge
| isbn = 0-521-59924-5
}}</ref>
}}