Talk:Comparison of regular expression engines: Difference between revisions

 
I have not found any evidence that Python supports Unicode properties (like <code>\p{L}</code>). I'm not sure about the other implementations, so I am fixing only the Python item. See e.g. [http://regular-expressions.mobi/refflavors.html]. [[User:Mykhal|Mykhal]] ([[User talk:Mykhal|talk]]) 21:10, 9 January 2008 (UTC)
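
For anyone who wants to check this against their own Python install, here is a minimal sketch. It assumes the standard-library <code>re</code> module is the engine in question and uses <code>unicodedata</code> as a rough stand-in for <code>\p{L}</code>. The helper names <code>supports_property_escape</code> and <code>is_letter</code> are made up for illustration and are not part of any library.

```python
import re
import unicodedata


def supports_property_escape(pattern=r"\p{L}"):
    """Probe whether the stdlib re module treats the pattern as a letter class.

    Hypothetical helper: returns True only if the pattern compiles AND
    matches a non-ASCII letter. Engines that reject the escape, or read
    it as a literal "p{L}", both yield False.
    """
    try:
        return re.compile(pattern).fullmatch("\u00e9") is not None
    except re.error:
        return False


def is_letter(ch):
    """Approximate \\p{L}: General_Category is one of Lu, Ll, Lt, Lm, Lo."""
    return unicodedata.category(ch).startswith("L")


if __name__ == "__main__":
    print("re handles \\p{L}:", supports_property_escape())
    print("is_letter('\u00e9'):", is_letter("\u00e9"))
```

The probe reports what the installed interpreter does rather than assuming a result, so it stays valid if a later release changes the behaviour.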
 
Only ICU and Perl offer full Unicode property support as of this writing; notes added. I cannot find any evidence that vim supports Unicode properties (like <code>\pL</code>, <code>\p{Lu}</code>, <code>\p{Alphabetic}</code>, <code>\p{Script=Latin}</code>, or <code>\p{Line_Break=A_Letter}</code>). I have removed the claim that it does.
 
I strongly suggest that just mentioning ''Unicode property support'' paints with far too broad a brush to be useful. The most important thing is whether or not a regex system complies with the requirements spelt out in [http://unicode.org/reports/tr18/ UTS#18's Unicode Regular Expressions]. That document is quite specific about formal requirements, such as Level 1, Level 2, or Level 3. Suggestions? Standards compliance is easy to reference through specific claims in each language's documentation.
 
Even mentioning whether things like <code>\w</code>, <code>\s</code>, and <code>\b</code> work with Unicode or whether they're ASCII-only would be much more useful than the current column features. 17:50, 5 February 2010 (UTC)
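
To illustrate the kind of distinction such a column could record, here is a small sketch using Python 3's standard <code>re</code> module, where <code>str</code> patterns are Unicode-aware by default and the <code>re.ASCII</code> flag restricts the shorthand classes to ASCII. The sample strings and variable names are only illustrative, and other engines may draw the line differently.

```python
import re

# Sample text with precomposed non-ASCII letters (written as escapes
# so the example does not depend on the file's encoding).
text = "na\u00efve caf\u00e9"

# Default for str patterns in Python 3: \w uses Unicode word characters.
unicode_words = re.findall(r"\w+", text)

# With re.ASCII, \w is limited to [a-zA-Z0-9_], so the accented
# letters act as word breaks.
ascii_words = re.findall(r"\w+", text, re.ASCII)

# The same flag affects \s: NO-BREAK SPACE counts as whitespace
# only in Unicode mode.
nbsp = "a\u00a0b"
unicode_space = re.findall(r"\s", nbsp)
ascii_space = re.findall(r"\s", nbsp, re.ASCII)

if __name__ == "__main__":
    print(unicode_words, ascii_words)
    print(unicode_space, ascii_space)
```

A table cell could then say something like "Unicode by default, ASCII via flag" instead of a bare yes or no.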
 
== Languages? ==