HTML sanitization: Difference between revisions

Citation bot (talk | contribs)
Add: website. | Use this bot. Report bugs. | Suggested by BrownHairedGirl | Linked from User:BrownHairedGirl/Articles_with_bare_links | #UCB_webform_linked 778/2189
sep lede, destub, link to parent per LEDE
{{Refimprove|date=December 2009}}
In [[data sanitization]], '''HTML sanitization''' is the process of examining an [[HTML]] document and producing a new HTML document that preserves only whatever tags are designated "safe" and desired. HTML sanitization can be used to protect against attacks such as [[cross-site scripting]] (XSS) by sanitizing any HTML code submitted by a user.
 
== Details ==
Basic text-formatting tags are often allowed, such as <code>&lt;b&gt;</code>, <code>&lt;i&gt;</code>, <code>&lt;u&gt;</code>, <code>&lt;em&gt;</code>, and <code>&lt;strong&gt;</code>, while more advanced tags such as <code>&lt;script&gt;</code>, <code>&lt;object&gt;</code>, <code>&lt;embed&gt;</code>, and <code>&lt;link&gt;</code> are removed by the sanitization process. Potentially dangerous attributes, such as <code>onclick</code>, are also removed to prevent malicious code from being injected.
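
The allowlist approach described above can be illustrated with a minimal sketch in [[Python (programming language)|Python]], using only the standard-library <code>html.parser</code> module. The names <code>ALLOWED_TAGS</code>, <code>AllowlistSanitizer</code>, and <code>sanitize</code> are hypothetical and chosen for illustration; the set of allowed tags simply mirrors the examples given above. The sketch discards every attribute and does not balance unmatched tags, so it should not be taken as a production-ready sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist for illustration; a real policy is application-specific.
ALLOWED_TAGS = {"b", "i", "u", "em", "strong"}


class AllowlistSanitizer(HTMLParser):
    """Rebuilds the input, keeping only allowlisted tags and no attributes."""

    # Elements whose text content is dropped together with the tag itself.
    DROP_CONTENT = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a dropped element

    def handle_starttag(self, tag, attrs):
        if tag in self.DROP_CONTENT:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            # All attributes (onclick, style, ...) are discarded.
            self.out.append(f"<{tag}>")

    def handle_endtag(self, tag):
        if tag in self.DROP_CONTENT:
            if self.skip_depth:
                self.skip_depth -= 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so it cannot be reinterpreted as markup.
            self.out.append(escape(data))


def sanitize(html_text):
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


if __name__ == "__main__":
    print(sanitize('<b onclick="evil()">hi</b><script>alert(1)</script> <i>ok</i>'))
```

In this sketch, tags outside the allowlist are removed while their text is kept, whereas the contents of <code>&lt;script&gt;</code> and <code>&lt;style&gt;</code> elements are dropped entirely.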
 
 
== Implementations ==
 
In [[PHP]], HTML sanitization can be performed using the <code>strip_tags()</code> function, at the risk of removing all textual content that follows an unclosed less-than symbol or angle bracket.<ref>{{cite web|url=http://us3.php.net/manual/en/function.strip-tags.php|title=strip_tags|publisher=PHP.NET}}</ref> The HTML Purifier library is another popular option for PHP applications.<ref>http://www.htmlpurifier.org</ref>
 
In [[.NET Framework|.NET]], a number of sanitizers use the Html Agility Pack, an HTML parser.<ref>http://htmlagilitypack.codeplex.com/</ref><ref>{{Cite web|url=http://eksith.wordpress.com/2011/06/14/whitelist-santize-htmlagilitypack/|title = Whitelist santize with HtmlAgilityPack|date = 14 June 2011}}</ref><ref name="HtmlRuleSanitizer" />
 
In [[JavaScript]], there are "JS-only" sanitizers for the [[front and back ends|back end]], and browser-based<ref>{{Cite web|url=https://github.com/jitbit/HtmlSanitizer|title=JS HTML Sanitizer|website=[[GitHub]]|date=14 October 2021}}</ref> implementations that use the browser's own [[Document Object Model]] (DOM) parser to parse the HTML (for better performance).
 
== See also ==
* [[Data sanitization]]
 
== References ==
 
[[Category:HTML]]
 
 
{{web-software-stub}}