{{Short description|Process of removing undesirable parts of an HTML document}}
In [[data sanitization]], '''HTML sanitization''' is the process of examining an [[HTML]] document and producing a new HTML document that preserves only the tags and attributes designated "safe" and desired. HTML sanitization can be used to protect against attacks such as [[cross-site scripting]].
Basic text-formatting tags are often allowed, such as <code>&lt;b&gt;</code>, <code>&lt;i&gt;</code>, <code>&lt;u&gt;</code>, <code>&lt;em&gt;</code>, and <code>&lt;strong&gt;</code>, while more advanced tags such as <code>&lt;script&gt;</code>, <code>&lt;object&gt;</code>, <code>&lt;embed&gt;</code>, and <code>&lt;link&gt;</code> are removed by the sanitization process. Potentially dangerous [[HTML attribute|attributes]] such as <code>onclick</code> are also removed to prevent malicious code from being injected.
Sanitization is typically performed using either a [[whitelist]] or a [[Blacklist (computing)|blacklist]] approach. Leaving a safe HTML element off a whitelist is not serious; it simply means that the feature will not be available after sanitization. If an unsafe element is left off a blacklist, however, the vulnerability will not be sanitized out of the HTML output. An out-of-date blacklist can therefore be dangerous if new, unsafe features have been introduced to the HTML Standard.
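The whitelist approach can be illustrated with a minimal sketch in Python, using only the standard library's <code>html.parser</code> module. The names <code>ALLOWED_TAGS</code>, <code>DROP_CONTENT_TAGS</code>, <code>WhitelistSanitizer</code>, and <code>sanitize</code> are hypothetical and chosen for this example; the sketch is not taken from any particular sanitization library and should not be relied on for production use.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist for illustration; a real policy would be larger.
ALLOWED_TAGS = {"b", "i", "u", "em", "strong"}
# Elements whose text content is discarded together with the tag itself.
DROP_CONTENT_TAGS = {"script", "style"}


class WhitelistSanitizer(HTMLParser):
    """Re-emits only whitelisted tags, without any attributes."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []        # pieces of the sanitized document
        self.skip_depth = 0  # > 0 while inside a <script> or <style> element

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT_TAGS:
            self.skip_depth += 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            # Attributes (onclick, style, ...) are dropped unconditionally.
            self.out.append(f"<{tag}>")
        # Any other tag is not on the whitelist and is silently omitted.

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag in ALLOWED_TAGS and self.skip_depth == 0:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.skip_depth == 0:
            # Re-escape text so it cannot be reinterpreted as markup.
            self.out.append(escape(data))


def sanitize(html_text):
    """Return a new HTML string containing only whitelisted tags."""
    parser = WhitelistSanitizer()
    parser.feed(html_text)
    parser.close()  # flush any buffered text
    return "".join(parser.out)


print(sanitize('<b onclick="evil()">Hi</b><script>alert(1)</script> <i>there</i>'))
```

Because the output is built only from tags on the whitelist, an element the policy author has never heard of is omitted by default, which is the property that makes a whitelist fail safely. This sketch does not balance unmatched closing tags or permit any attributes; both would need handling in a fuller implementation.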
== Implementations ==
In [[PHP]], HTML sanitization can be performed using the <code>strip_tags()</code> function, at the risk of removing all text that follows an unclosed less-than symbol (opening angle bracket).<ref>{{cite web|url=http://us3.php.net/manual/en/function.strip-tags.php|title=strip_tags|publisher=PHP.NET}}</ref> The HTML Purifier library is another popular option for PHP applications.<ref>{{Cite web|url=http://www.htmlpurifier.org|title=HTML Purifier}}</ref>
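The kind of data loss described above can be reproduced with a deliberately naive tag stripper. The following Python sketch is a hypothetical illustration and is not PHP's implementation of <code>strip_tags()</code>; the function name <code>naive_strip_tags</code> is an assumption made for this example. It treats everything between a less-than symbol and the next greater-than symbol as a tag, so a stray less-than symbol in ordinary text swallows the rest of the input.

```python
def naive_strip_tags(text):
    """Drop every character between '<' and the next '>' (hypothetical sketch)."""
    out = []
    in_tag = False
    for ch in text:
        if ch == "<":
            in_tag = True       # assume a tag starts here
        elif ch == ">" and in_tag:
            in_tag = False      # tag ends; resume copying text
        elif not in_tag:
            out.append(ch)      # ordinary text outside any tag
    return "".join(out)


print(repr(naive_strip_tags("<b>bold</b> text")))
print(repr(naive_strip_tags("if a < b then stop")))
```

In the second call, the less-than symbol is never closed, so all text after it is discarded. Parser-based sanitizers avoid this by distinguishing real tags from literal characters in text.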
In [[Java (programming language)|Java]] (and [[.NET Framework|.NET]]), sanitization can be achieved by using the [[OWASP]] Java HTML Sanitizer Project.<ref>{{Cite web|url=https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project|title = OWASP Java HTML Sanitizer}}</ref>
In [[.NET Framework|.NET]], a number of sanitizers use the Html Agility Pack, an HTML parser.<ref>{{Cite web |url=http://htmlagilitypack.codeplex.com/ |title=HTML Agility Pack - Home |access-date=2013-01-04 |archive-date=2013-01-01 |archive-url=https://web.archive.org/web/20130101170916/http://htmlagilitypack.codeplex.com/ |url-status=dead }}</ref><ref>{{Cite web|url=http://eksith.wordpress.com/2011/06/14/whitelist-santize-htmlagilitypack/|title = Whitelist santize with HtmlAgilityPack|date = 14 June 2011}}</ref><ref name="HtmlRuleSanitizer" /> Another library is HtmlSanitizer.<ref>{{cite web |last1=Ganss |first1=Michael |title=HtmlSanitizer |url=https://github.com/mganss/HtmlSanitizer/ |access-date=7 December 2023 |date=5 December 2023}}</ref>
In [[JavaScript]] there are "JS-only" sanitizers for the [[
== See also ==
== References ==
[[Category:HTML]]