m Replace em-dash with en-dash.
 
(17 intermediate revisions by 14 users not shown)
Line 1:
{{Short description|HyperText Markup language for documents}}
{{Redirect2|.htm|.html||HTM (disambiguation){{!}}HTM}}
{{pp-vandalism|small=yes}}
Line 27:
}}
{{HTML}}
'''Hypertext Markup Language''' ('''HTML''') is the standard [[markup language]]{{efn|Even though HTML can be run in a browser, it is not viewed as a [[programming language]] in programming language discourse.<ref>{{Cite book |author-link=Felienne Hermans|last1=Hermans |first1=Felienne |last2=Schlesinger |first2=Ari |chapter=A Case for Feminism in Programming Language Design |date=2024-10-17 |title=Proceedings of the 2024 ACM SIGPLAN International Symposium on New Ideas, New Paradigms, and Reflections on Programming and Software |chapter-url=https://dl.acm.org/doi/10.1145/3689492.3689809 |language=en |publisher=ACM |pages=205–222 |doi=10.1145/3689492.3689809 |isbn=979-8-4007-1215-9}}</ref>}} for documents designed to be displayed in a [[web browser]]. It defines the content and structure of [[web content]]. It is often assisted by technologies such as [[Cascading Style Sheets]] (CSS) and [[scripting language]]s such as [[JavaScript]], a programming language.
 
[[Web browser]]s receive HTML documents from a [[web server]] or from local storage and [[browser engine|render]] the documents into multimedia web pages. HTML describes the structure of a [[web page]] [[Semantic Web|semantically]] and originally included cues for its appearance.
 
[[HTML element]]s are the building blocks of HTML pages. With HTML constructs, [[HTML element#Images and objects|images]] and other objects such as [[Fieldset|interactive forms]] may be embedded into the rendered page. HTML provides a means to create [[structured document]]s by denoting structural [[semantics]] for text such as headings, paragraphs, lists, [[Hyperlink|links]], quotes, and other items. HTML elements are delineated by ''tags'', written using [[Bracket#Angle brackets|angle brackets]]. Tags such as {{code|lang=html|code=<img>}} and {{code|lang=html|<input>}} directly introduce content into the page. Other tags such as {{code|lang=html|code=<p>}} and {{code|lang=html|code=</p>}} surround and provide information about document text and may include sub-element tags. [[Web browser|Browsers]] do not display the HTML tags, but use them to interpret the content of the page.
 
HTML can embed programs written in a [[scripting language]] such as [[JavaScript]], which affects the behavior and content of web pages. The inclusion of CSS defines the look and layout of content. The [[World Wide Web Consortium]] (W3C), former maintainer of the HTML and current maintainer of the CSS standards, has encouraged the use of [[CSS]] over explicit presentational HTML {{as of|1997|lc=y|since=y|post=.}}<ref name="deprecated">{{cite web|title=HTML 4.0 Specification — W3C Recommendation — Conformance: requirements and recommendations |url=https://www.w3.org/TR/REC-html40-971218/conform.html#deprecated|date=December 18, 1997|publisher=World Wide Web Consortium|url-status=live|archive-url=https://web.archive.org/web/20150705040855/http://www.w3.org/TR/REC-html40-971218/conform.html|archive-date=July 5, 2015|access-date=July 6, 2015}}</ref> A form of HTML, known as [[HTML5]], is used to display video and audio, primarily using the {{code|lang=html|<canvas>}} element, together with JavaScript.
Line 95:
; : Although its syntax closely resembles that of [[SGML]], [[HTML5]] has abandoned any attempt to be an SGML application and has explicitly defined its own "html" serialization, in addition to an alternative XML-based XHTML5 serialization.<ref>{{cite web|url=https://www.w3.org/blog/2008/01/html5-is-html-and-xml/|title=HTML5, one vocabulary, two serializations|date=15 January 2008 |access-date=February 25, 2009}}</ref>
; 2011&nbsp;HTML5 – Last Call :
; : On 14 February 2011, the W3C extended the charter of its HTML Working Group with clear milestones for HTML5. In May 2011, the working group advanced HTML5 to "Last Call", an invitation to communities inside and outside W3C to confirm the technical soundness of the specification. The W3C developed a comprehensive test suite to achieve broad interoperability for the full specification by 2014, which was the target date for recommendation.<ref name="w3c2014">{{cite web|url=https://www.w3.org/2011/02/htmlwg-pr.html|title=W3C Confirms May 2011 for HTML5 Last Call, Targets 2014 for HTML5 Standard|publisher=[[World Wide Web Consortium]]|access-date=18 February 2011|date=14 February 2011}}</ref> In January 2011, the WHATWG renamed its "HTML5" living standard to "HTML". The W3C nevertheless continued its project to release HTML5.<ref>{{cite web|url=http://blog.whatwg.org/html-is-the-new-html5|title=HTML Is the New HTML5|author=Hickson, Ian |website=The WHATWG Blog |date=January 19, 2011 |access-date=21 January 2011|archive-date=6 October 2019|archive-url=https://web.archive.org/web/20191006023430/https://blog.whatwg.org/html-is-the-new-html5}}</ref>
; 2012&nbsp;HTML5 – Candidate Recommendation :
; : In July 2012, WHATWG and [[W3C]] decided on a degree of separation. W3C will continue the HTML5 specification work, focusing on a single definitive standard, which is considered a "snapshot" by WHATWG. The WHATWG organization will continue its work with HTML5 as a "Living Standard". The concept of a living standard is that it is never complete and is always being updated and improved. New features can be added but functionality will not be removed.<ref>{{cite web|url=http://www.netmagazine.com/news/html5-gets-splits-122102|title=HTML5 gets the splits|publisher=Net magazine |first1=Craig |last1=Grannell |date=July 23, 2012 |access-date=23 July 2012 |archive-url=https://web.archive.org/web/20120725214739/http://www.netmagazine.com/news/html5-gets-splits-122102 |url-status=dead |archive-date=Jul 25, 2012 }}</ref>
Line 147:
{{Main|HTML element}}
[[File:HTML element content categories.svg|thumb|HTML element content categories]]
HTML documents imply a structure of nested [[HTML element]]s. These are indicated in the document by HTML ''tags'', enclosed in angle brackets thus: {{code|lang=html|code=<p>}}.<ref>{{cite web|title=HTML Elements|url=https://www.w3schools.com/html/html_elements.asp|publisher=w3schools|access-date=16 March 2015}}</ref>{{better source needed|date=February 2019}}
 
In the simple, general case, the extent of an element is indicated by a pair of tags: a "start tag" {{code|lang=html|code=<p>}} and "end tag" {{code|lang=html|code=</p>}}. The text content of the element, if any, is placed between these tags.
Line 161:
The general form of an HTML element is therefore: {{code|lang=html|code=<tag attribute1="value1" attribute2="value2">''content''</tag>}}. Some HTML elements are defined as ''empty elements'' and take the form {{code|lang=html|code=<tag attribute1="value1" attribute2="value2">}}. Empty elements may enclose no content, for instance, the {{code|lang=html|code=<br>}} tag or the inline {{code|lang=html|code=<img>}} tag.
The name of an HTML element is the name used in the tags.
The end tag's name is preceded by a slash character, <code>&#47;</code>. If an element is defined as empty, an end tag is not allowed.
If attributes are not mentioned, default values are used in each case.
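
As a minimal illustration of the general form, the sketch below shows an element with attributes and content, followed by an empty element. The attribute values and the file name are placeholders chosen for this example only.

<syntaxhighlight lang="html">
<!-- Start tag with two attributes, content, and a matching end tag -->
<p id="intro" class="lead">This is the content of the paragraph.</p>

<!-- Empty element: attributes are allowed, but no content and no end tag -->
<img src="example.png" alt="Placeholder image">
</syntaxhighlight>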
 
==== Element examples ====
Line 200 ⟶ 199:
===== Line breaks =====
 
{{code|lang=html|code=<br>}}. The difference between {{code|lang=html|code=<br>}} and {{code|lang=html|code=<p>}} is that {{code|lang=html|code=<br>}} [[line breaking character|breaks a line]] without altering the semantic structure of the page, whereas {{code|lang=html|code=<p>}} sections the page into [[paragraph]]s. The element {{code|code=<br>|lang=html}} is an ''empty element'' in that, although it may have attributes, it can take no content and it must not have an end tag.
<syntaxhighlight lang="html"><p>This <br> is a paragraph <br> with <br> line breaks</p></syntaxhighlight>
 
Line 250 ⟶ 249:
Escaping also allows for characters that are not easily typed, or that are not available in the document's [[character encoding]], to be represented within the element and attribute content. For example, the acute-accented <code>e</code> (<code>é</code>), a character typically found only on Western European and South American keyboards, can be written in any HTML document as the entity reference <code>&amp;eacute;</code> or as the numeric references <code>&amp;#xE9;</code> or <code>&amp;#233;</code>, using characters that are available on all keyboards and are supported in all character encodings. [[Unicode]] character encodings such as [[UTF-8]] are compatible with all modern browsers and allow direct access to almost all the characters of the world's writing systems.<ref>{{cite web|title=''The Unicode Standard'': A Technical Introduction |publisher=Unicode |url=https://www.unicode.org/standard/principles.html|access-date=2010-03-16}}</ref>
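
As a short sketch of the three forms described above, each of the following paragraphs would be expected to render as "café". The word itself is chosen only for illustration.

<syntaxhighlight lang="html">
<p>caf&eacute;</p> <!-- named entity reference -->
<p>caf&#xE9;</p>   <!-- hexadecimal numeric reference -->
<p>caf&#233;</p>   <!-- decimal numeric reference -->
</syntaxhighlight>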
{| class="wikitable"
|+HTML escape sequence examples
!Named
!Decimal
Line 340 ⟶ 339:
 
=== Document type declaration ===
HTML documents are required to start with a [[document type declaration]] (informally, a "doctype"). In browsers, the doctype helps to define the rendering mode, particularly whether to use [[quirks mode]].
 
The original purpose of the doctype was to enable the parsing and validation of HTML documents by SGML tools based on the [[document type definition]] (DTD). The DTD to which the DOCTYPE refers contains a machine-readable grammar specifying the permitted and prohibited content for a document conforming to such a DTD. Browsers, on the other hand, do not implement HTML as an application of SGML and as a consequence do not read the DTD.
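
For illustration, two alternative doctypes are sketched below; a document would start with only one of them. The first is the short HTML5 form, which names no DTD. The second is the HTML 4.01 Strict form, which carries a public identifier and the URL of a DTD intended for SGML-based tools.

<syntaxhighlight lang="html">
<!DOCTYPE html> <!-- HTML5: no DTD reference -->

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> <!-- HTML 4.01 Strict -->
</syntaxhighlight>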
Line 364 ⟶ 363:
Semantic HTML is a way of writing HTML that emphasizes the meaning of the encoded information over its presentation (look). HTML has included semantic markup from its inception,<ref>{{cite book|last1=Berners-Lee|first1=Tim|last2=Fischetti|first2=Mark|title=Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor|url=https://archive.org/details/weavingweborigin00bern_0|url-access=registration|isbn=978-0-06-251587-2|publisher=Harper|location=San Francisco|year=2000}}</ref> but has also included presentational markup, such as {{code|lang=html|code=<font>}}, {{code|lang=html|code=<i>}} and {{code|lang=html|code=<center>}} tags. There are also the semantically neutral [[div and span]] tags. Since the late 1990s, when [[Cascading Style Sheets]] were beginning to work in most browsers, web authors have been encouraged to avoid the use of presentational HTML markup with a view to the [[separation of content and presentation]].<ref>{{cite web|url=https://www.w3.org/MarkUp/Guide/Style.html|title=Adding a touch of style|last=Raggett|first=Dave|year=2002|publisher=W3C|access-date=October 2, 2009}} This article notes that presentational HTML markup may be useful when targeting browsers "before Netscape 4.0 and Internet Explorer 4.0". See the [[list of web browsers]] to confirm that these were both released in 1997.</ref>
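
The contrast can be sketched with a single chapter title marked up in two ways. The title text is invented for this example, and the styling of the semantic version is assumed to be supplied separately by CSS.

<syntaxhighlight lang="html">
<!-- Presentational: describes only how the text should look -->
<center><font size="6"><i>Chapter One</i></font></center>

<!-- Semantic: states that the text is a top-level heading -->
<h1>Chapter One</h1>
</syntaxhighlight>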
 
In a 2001 discussion of the [[Semantic Web]], [[Tim Berners-Lee]] and others gave examples of ways in which intelligent software "agents" may one day automatically crawl the web and find, filter, and correlate previously unrelated, published facts for the benefit of human users.<ref>{{cite magazine |author=Berners-Lee |first1=Tim |last2=Hendler |first2=James |last3=Lassila |first3=Ora |date=May 1, 2001 |title=The Semantic Web |url=http://www.scientificamerican.com/article.cfm?id=the-semantic-web |magazine=Scientific American |access-date=October 2, 2009}}</ref> Such agents are not commonplace even now, but some of the ideas of [[Web 2.0]], [[Mashup (web application hybrid)|mashups]] and [[Price comparison service|price comparison websites]] may be coming close{{citation needed|date=February 2025}}. The main difference between these web application hybrids and Berners-Lee's semantic agents lies in the fact that the current [[Feed aggregator|aggregation]] and hybridization of information is usually designed by [[web developer]]s, who already know the web locations and the [[Application programming interface|API semantics]] of the specific data they wish to mash, compare and combine.
 
An important type of web agent that does crawl and read web pages automatically, without prior knowledge of what it might find, is the [[web crawler]] or search-engine spider. These software agents are dependent on the semantic clarity of web pages they find as they use various techniques and [[algorithm]]s to read and index millions of web pages a day and provide web users with [[Web search engine|search facilities]] without which the World Wide Web's usefulness would be greatly reduced.
Line 424 ⟶ 423:
* Use the empty-element syntax only for elements specified as empty in HTML.
* Remove the closing slash in empty-element tags: for example {{code|lang=html|code=<br>}} instead of {{code|lang=html|code=<br/>}}.
* Include explicit close tags for elements that permit content but are left empty (for example, {{code|lang=html|code=<div></div>}}, not {{code|lang=html|code=<div />}}).
* Omit the XML declaration.
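
As a rough sketch of how the guidelines in this list might apply, consider the following XML-style fragment. The class name and text are placeholders invented for this example.

<syntaxhighlight lang="html">
<?xml version="1.0" encoding="UTF-8"?>
<div class="spacer" />
<p>First line<br/>second line</p>
</syntaxhighlight>

After omitting the XML declaration, removing the closing slash from the empty element, and giving the content-bearing element an explicit close tag, the fragment would read:

<syntaxhighlight lang="html">
<div class="spacer"></div>
<p>First line<br>second line</p>
</syntaxhighlight>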
 
Line 477 ⟶ 476:
 
== WHATWG HTML versus HTML5 ==
{{Main|#Transition of HTML publication to WHATWG}}
The HTML Living Standard, which is developed by WHATWG, is the official version, while W3C HTML5 is no longer separate from WHATWG.
 
Line 484 ⟶ 483:
There are some [[WYSIWYG]] editors (''what you see is what you get''), in which the user lays out everything as it is to appear in the HTML document using a [[graphical user interface]] (GUI), often similar to [[word processor]]s. The editor renders the document rather than showing the code, so authors do not require extensive knowledge of HTML.
 
The WYSIWYG editing model has been criticized,<ref>Sauer, C.: WYSIWIKI&nbsp;– Questioning WYSIWYG in the Internet Age. In: Wikimania (2006)</ref><ref>Spiesser, J., Kitchen, L.: Optimization of HTML automatically generated by WYSIWYG programs. In: 13th International Conference on World Wide Web, pp. 355–364. WWW '04. ACM, New York, NY (New York, NY, U.S., May 17–20, 2004)</ref> primarily because of the low quality of the generated code; there are voices{{who|date=June 2020}} advocating a change to the [[WYSIWYM]] model (''what you see is what you mean'').
 
WYSIWYG editors remain a controversial topic because of their perceived flaws such as: