Semantic HTML is a way of writing HTML that emphasizes the meaning of the encoded information over its presentation (look). HTML has included semantic markup from its inception,<ref>{{cite book|last1=Berners-Lee|first1=Tim|last2=Fischetti|first2=Mark|title=Weaving the Web: The Original Design and Ultimate Destiny of the World Wide Web by Its Inventor|url=https://archive.org/details/weavingweborigin00bern_0|url-access=registration|isbn=978-0-06-251587-2|publisher=Harper|___location=San Francisco|year=2000}}</ref> but has also included presentational markup, such as {{code|lang=html|code=<font>}}, {{code|lang=html|code=<i>}} and {{code|lang=html|code=<center>}} tags. There are also the semantically neutral [[div and span]] tags. Since the late 1990s, when [[Cascading Style Sheets]] were beginning to work in most browsers, web authors have been encouraged to avoid the use of presentational HTML markup with a view to the [[separation of content and presentation]].<ref>{{cite web|url=https://www.w3.org/MarkUp/Guide/Style.html|title=Adding a touch of style|last=Raggett|first=Dave|year=2002|publisher=W3C|access-date=October 2, 2009}} This article notes that presentational HTML markup may be useful when targeting browsers "before Netscape 4.0 and Internet Explorer 4.0". See the [[list of web browsers]] to confirm that these were both released in 1997.</ref>
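For example, a presentational approach encodes only how text should look, while a semantic approach encodes what the text means and leaves its appearance to a style sheet (the {{code|class=}} name below is illustrative):

<syntaxhighlight lang="html">
<!-- Presentational markup: describes only appearance -->
<p><font color="red">Warning:</font> do <i>not</i> unplug the device.</p>

<!-- Semantic markup: describes meaning; appearance is left to CSS -->
<p><strong class="warning">Warning:</strong> do <em>not</em> unplug the device.</p>
</syntaxhighlight>

Both versions may render identically, but only the second tells software that the first phrase is important and the second is emphasized.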
In a 2001 discussion of the [[Semantic Web]], [[Tim Berners-Lee]] and others gave examples of ways in which intelligent software "agents" may one day automatically crawl the web and find, filter, and correlate previously unrelated, published facts for the benefit of human users.<ref>{{cite magazine |last1=Berners-Lee |first1=Tim |last2=Hendler |first2=James |last3=Lassila |first3=Ora |date=May 1, 2001 |title=The Semantic Web |url=http://www.scientificamerican.com/article.cfm?id=the-semantic-web |magazine=Scientific American |access-date=October 2, 2009}}</ref> Such agents are still not commonplace, but some of the ideas of [[Web 2.0]], [[Mashup (web application hybrid)|mashups]] and [[Price comparison service|price comparison websites]] may be coming close.{{citation needed}} The main difference between these web application hybrids and Berners-Lee's semantic agents is that the current [[Feed aggregator|aggregation]] and hybridization of information is usually designed by [[web developer]]s, who already know the web locations and the [[Application programming interface|API semantics]] of the specific data they wish to mash, compare and combine.
An important type of web agent that does crawl and read web pages automatically, without prior knowledge of what it might find, is the [[web crawler]] or search-engine spider. These software agents depend on the semantic clarity of the web pages they find, since they use various techniques and [[algorithm]]s to read and index millions of web pages a day and to provide web users with [[Web search engine|search facilities]] without which the World Wide Web's usefulness would be greatly reduced.