Semantic HTML: Difference between revisions

 
== History ==
HTML has included semantic markup since its inception.<ref>{{cite book|last1=Berners-Lee|first1=Tim|author-link1=Tim Berners-Lee|last2=Fischetti|first2=Mark|title=Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor|url=https://archive.org/details/weavingweborigin00bern_0|url-access=registration|isbn=978-0062515872 |publisher=Harper|location=San Francisco|year=2000}}</ref> In an HTML document, the author may, among other things, "start with a title; add headings and paragraphs; add emphasis to [the] text; add images; add links to other pages; [and] use various kinds of lists".<ref>{{cite web| url=http://www.w3.org/MarkUp/Guide/Overview.html|title=Getting started with HTML|last=Raggett|first=Dave|author-link=Dave Raggett|date=24 April 2005|publisher=[[World Wide Web Consortium]]|access-date=8 December 2010}}</ref>
 
Various versions of the HTML standard have included [[HTML element#Presentation|presentational markup]] such as <code>&lt;font&gt;</code> (added in HTML 3.2; removed in HTML 4.0 Strict), <code>&lt;i&gt;</code> (all versions) and <code>&lt;center&gt;</code> (added in HTML 3.2). There are also the semantically neutral [[span and div]] elements. Since the late 1990s, when [[Cascading Style Sheets]] began to work in most browsers, web authors have been encouraged to avoid presentational HTML markup in favour of the [[separation of presentation and content]].<ref>{{cite web|url=http://www.w3.org/MarkUp/Guide/Style.html|title=Adding a touch of style|last=Raggett|first=Dave|date=8 April 2002|publisher=World Wide Web Consortium|access-date=8 December 2010}} This article notes that presentational HTML markup may be useful when targeting browsers "before [[Netscape Communicator|Netscape 4.0]] and [[Internet Explorer 4|Internet Explorer 4.0]]", both of which were released in 1997.</ref>
 
In 2001, [[Tim Berners-Lee]] co-authored an article on the [[Semantic Web]], which proposed that intelligent software 'agents' might one day automatically trawl the Web and find, filter and correlate previously unrelated, published facts for the benefit of end users.<ref>{{cite web | url=http://www.scientificamerican.com/article.cfm?id=the-semantic-web|title=The Semantic Web|first1=Tim|last1=Berners-Lee|first2=James|last2=Hendler|first3=Ora|last3=Lassila|publisher=Scientific American|year=2001|access-date=2009-10-02}}</ref> Such agents are not commonplace even now, but some of the ideas of [[Web 2.0]], [[Mashup (web application hybrid)|mashups]] and [[Price comparison service|price comparison websites]] may be coming close. The main difference between these web application hybrids and Berners-Lee's semantic agents is that the current [[news aggregator|aggregation]] and hybridisation of information is usually designed in by web developers, who already know the web locations and the [[Application programming interface|API semantics]] of the specific data they wish to mash, compare and combine.
 
An important type of web agent that does crawl and read web pages automatically, without prior knowledge of what it might find, is the [[web crawler]] or search-engine spider. These software agents depend on the semantic clarity of the web pages they find, as they use various techniques and [[algorithm]]s to read and index millions of web pages a day and provide web users with [[Web search engine|search facilities]].
 
For search-engine spiders to rate the significance of the pieces of text they find in HTML documents, the semantic structures that exist in HTML need to be widely and uniformly applied to bring out the meaning of published information. The same holds for those creating mashups and other hybrids, and for more automated agents as they are developed.<ref name="Semantic_Web_Revisted">{{cite web|url=http://eprints.ecs.soton.ac.uk/12614/1/Semantic_Web_Revisted.pdf|title=The Semantic Web Revisited|first1=Nigel|last1=Shadbolt|first3=Wendy|last3=Hall|first2=Tim|last2=Berners-Lee|publisher=IEEE Intelligent Systems|date=May–June 2006|access-date=8 December 2010}}</ref>
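
As an illustration only, the way an automated agent might use semantic structure to rate pieces of text can be sketched with Python's standard <code>html.parser</code> module. The weights in <code>ELEMENT_WEIGHTS</code>, the class name <code>WeightedTextExtractor</code> and the helper <code>rate_text</code> are hypothetical and are not taken from any real search engine; the sketch simply assumes that text inside a title or heading counts for more than text inside a plain paragraph.

```python
from html.parser import HTMLParser

# Hypothetical weights chosen for illustration; real crawlers do not publish theirs.
ELEMENT_WEIGHTS = {"title": 5, "h1": 4, "h2": 3, "h3": 2, "em": 2, "strong": 2}


class WeightedTextExtractor(HTMLParser):
    """Pairs each run of text with the highest weight among its enclosing elements."""

    def __init__(self):
        super().__init__()
        self._open_tags = []      # stack of currently open element names
        self.weighted_text = []   # list of (text, weight) tuples

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching start tag, tolerating unclosed elements.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Unlisted elements (p, div, span, ...) get the neutral weight 1.
        weight = max((ELEMENT_WEIGHTS.get(t, 1) for t in self._open_tags), default=1)
        self.weighted_text.append((text, weight))


def rate_text(html):
    """Return (text, weight) pairs for an HTML string."""
    parser = WeightedTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.weighted_text


SAMPLE = """<html><head><title>Tea</title></head>
<body><h1>Green tea</h1><p>Brewed at <em>80 degrees</em>.</p></body></html>"""

for text, weight in rate_text(SAMPLE):
    print(weight, text)
```

If the same page marked up its heading with a styled <code>div</code> instead of <code>h1</code>, the sketch would give "Green tea" the same weight as ordinary body text, which is the loss of meaning that uniform semantic markup is meant to avoid.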
 
While the true semantic web may depend on complex [[Resource Description Framework|RDF]] [[Ontology (information science)|ontologies]] and [[metadata]], every HTML document contributes to the meaningfulness of the Web through the correct use of headings, lists, titles and other semantic markup wherever possible. This "plain" use of HTML has been called "Plain Old Semantic HTML" or POSH.<ref>{{cite web |url=http://microformats.org/wiki/posh |title=Plain Old Semantic HTML (POSH) |date=April 20, 2007 |website=Microformats Wiki |publisher=microformats community |access-date=May 4, 2013}}</ref> The correct use of Web 2.0 'tagging' creates [[Folksonomy|folksonomies]] that may be equally or even more meaningful to many.<ref name="Semantic_Web_Revisted"/> [[HTML 5]] introduced new semantic elements such as <code>section</code>, <code>article</code>, <code>footer</code>, <code>progress</code>, <code>nav</code>, <code>aside</code>, <code>mark</code>, and <code>time</code>.<ref>{{cite web|last1=Robinson|first1=Mike|title=Let's Talk about Semantics|url=http://html5doctor.com/lets-talk-about-semantics/|publisher=HTML 5 Doctor|access-date=26 October 2015}}</ref> Overall, the goal of the [[W3C]] is to gradually introduce more ways for browsers, developers, and crawlers to distinguish between different types of data, allowing for benefits such as better display in browsers on different devices.
 
Presentational elements were not formally [[Deprecation|deprecated]] in the HTML 4.01 and XHTML recommendations, but were recommended against. In HTML 5, some of those elements, such as <code>i</code><ref>{{cite web|title=HTML5 |at=Section 4.5.17: The i element |url=https://www.w3.org/TR/html5/text-level-semantics.html#the-i-element |publisher=World Wide Web Consortium }}</ref> and <code>b</code>,<ref>{{cite web|title=HTML5 |at=Section 4.5.18: The b element |url=https://www.w3.org/TR/html5/text-level-semantics.html#the-b-element |publisher=World Wide Web Consortium }}</ref> are still specified, as their meaning has been clearly defined "as to be stylistically offset from the normal prose without conveying any extra importance".{{cite quote|date=July 2019}}
 
== Google "rich snippets" ==
In 2010, [[Google]] specified three forms of structured metadata that its systems use to find structured semantic content within webpages. Such information, when related to reviews, people profiles, business listings, and events, is used by Google to enhance the "snippet", the short piece of quoted text that is shown when the page appears in search listings. Google specifies that this data may be given using [[Microdata (HTML5)|microdata]], [[microformat]]s or [[RDFa]].<ref>{{cite web|title=Rich snippets|url=http://www.google.com/support/webmasters/bin/answer.py?answer=99170|work=Webmaster Central|access-date=26 May 2010}}</ref> Microdata is specified inside <code>itemtype</code> and <code>itemprop</code> attributes added to existing HTML elements; microformat keywords are added inside <code>class</code> attributes as discussed above; and RDFa relies on <code>rel</code>, <code>[[typeof]]</code> and <code>property</code> attributes added to existing elements.<ref>{{cite web|title=Businesses and organizations - About organization information|url=http://www.google.com/support/webmasters/bin/answer.py?answer=146861|work=Webmaster Central|access-date=26 May 2010}}</ref>
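
The microdata case can be illustrated with a minimal sketch, again using Python's standard <code>html.parser</code> module. It is not Google's implementation and does not follow the full microdata extraction algorithm: it assumes a single, non-nested item whose property values are plain element text. The class <code>MicrodataExtractor</code>, the helper <code>extract_microdata</code>, and the sample person are hypothetical; the vocabulary URL is only an example of an <code>itemtype</code> value.

```python
from html.parser import HTMLParser


class MicrodataExtractor(HTMLParser):
    """Collects itemtype values and itemprop text from a flat (non-nested) item."""

    def __init__(self):
        super().__init__()
        self.item_types = []   # values of itemtype attributes, in document order
        self.properties = {}   # itemprop name -> text content
        self._open = []        # stack of (tag, itemprop name or None)

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if "itemtype" in attributes:
            self.item_types.append(attributes["itemtype"])
        self._open.append((tag, attributes.get("itemprop")))

    def handle_endtag(self, tag):
        # Discard everything back to the most recent matching start tag.
        for index in range(len(self._open) - 1, -1, -1):
            if self._open[index][0] == tag:
                del self._open[index:]
                break

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attach the text to the innermost enclosing element that has itemprop.
        for _, prop in reversed(self._open):
            if prop:
                previous = self.properties.get(prop, "")
                self.properties[prop] = (previous + " " + text).strip()
                break


def extract_microdata(html):
    """Return (item_types, properties) for an HTML string."""
    parser = MicrodataExtractor()
    parser.feed(html)
    parser.close()
    return parser.item_types, parser.properties


# Hypothetical profile markup: itemtype names the vocabulary, itemprop the fields.
SAMPLE_PROFILE = """
<div itemscope itemtype="https://schema.org/Person">
  <span itemprop="name">Ada Example</span> works as an
  <span itemprop="jobTitle">Editor</span>.
</div>
"""

types, props = extract_microdata(SAMPLE_PROFILE)
print(types)
print(props)
```

Text that is not inside an element carrying <code>itemprop</code>, such as "works as an" in the sample, is ignored by the sketch, which reflects how the attributes single out the structured parts of otherwise ordinary prose.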
 
== See also ==