HTML sanitization

{{Short description|Process of removing undesirable parts of an HTML document}}
{{More citations needed|date=December 2009}}
In [[data sanitization]], '''HTML sanitization''' is the process of examining an [[HTML]] document and producing a new HTML document that preserves only the tags and attributes designated "safe" and desired. HTML sanitization can be used to protect against attacks such as [[cross-site scripting]] (XSS) by sanitizing any HTML code submitted by a user.
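
The allowlist approach described above can be illustrated with a minimal sketch using only the Python standard library. The allowed tags, attributes and URL schemes below are hypothetical choices made for illustration, and the names <code>AllowlistSanitizer</code> and <code>sanitize</code> are not taken from any of the libraries mentioned in this article. Python's <code>html.parser</code> does not reproduce every browser parsing rule, so this is an illustration of the idea rather than a hardened sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical allowlist, chosen only for illustration.
ALLOWED_TAGS = {"b", "i", "em", "strong", "p", "a"}
ALLOWED_ATTRS = {"a": {"href"}}
# Elements whose text content is discarded along with the tag.
DROP_CONTENT = {"script", "style"}
# Assumed set of acceptable URL schemes for href values.
SAFE_SCHEMES = ("http://", "https://", "mailto:")


class AllowlistSanitizer(HTMLParser):
    """Re-emits only allowlisted tags and attributes; escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in DROP_CONTENT:
            self.skip_depth += 1
            return
        if self.skip_depth or tag not in ALLOWED_TAGS:
            return  # the tag is dropped; its text content is still kept
        kept = []
        for name, value in attrs:
            if value is None or name not in ALLOWED_ATTRS.get(tag, set()):
                continue
            if name == "href" and not value.strip().lower().startswith(SAFE_SCHEMES):
                continue  # rejects schemes such as javascript:
            kept.append(f' {name}="{escape(value, quote=True)}"')
        self.out.append(f"<{tag}{''.join(kept)}>")

    def handle_endtag(self, tag):
        if tag in DROP_CONTENT:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth and tag in ALLOWED_TAGS:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(escape(data, quote=False))


def sanitize(html_text):
    """Return a new HTML string containing only allowlisted markup."""
    parser = AllowlistSanitizer()
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <b>there</b><script>alert(1)</script></p>'))
```

In this sketch the <code>onclick</code> attribute and the whole <code>script</code> element are removed, while the <code>p</code> and <code>b</code> elements pass through unchanged. The sketch builds a new document from the parsed input instead of deleting "bad" substrings from the original text, which mirrors the definition given above.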
 
== Details ==
In [[Java (programming language)|Java]] (and [[.NET Framework|.NET]]), sanitization can be achieved by using the [[OWASP]] Java HTML Sanitizer Project.<ref>{{Cite web|url=https://www.owasp.org/index.php/OWASP_Java_HTML_Sanitizer_Project|title = OWASP Java HTML Sanitizer}}</ref>
 
In [[.NET Framework|.NET]], a number of sanitizers use the Html Agility Pack, an HTML parser.<ref>{{Cite web |url=http://htmlagilitypack.codeplex.com/ |title=HTML Agility Pack - Home |access-date=2013-01-04 |archive-date=2013-01-01 |archive-url=https://web.archive.org/web/20130101170916/http://htmlagilitypack.codeplex.com/ |url-status=dead }}</ref><ref>{{Cite web|url=http://eksith.wordpress.com/2011/06/14/whitelist-santize-htmlagilitypack/|title = Whitelist santize with HtmlAgilityPack|date = 14 June 2011}}</ref><ref name="HtmlRuleSanitizer" /> Another library is HtmlSanitizer.<ref>{{cite web |last1=Ganss |first1=Michael |title=HtmlSanitizer |url=https://github.com/mganss/HtmlSanitizer/ |access-date=7 December 2023 |date=5 December 2023}}</ref>
 
In [[JavaScript]] there are "JS-only" sanitizers for the [[front and back ends|back end]], and browser-based<ref>{{Cite web|url=https://github.com/jitbit/HtmlSanitizer|title=JS HTML Sanitizer|website=[[GitHub]]|date=14 October 2021}}</ref> implementations that use the browser's own [[Document Object Model]] (DOM) parser to parse the HTML (for better performance).