Web Archive (file format): Difference between revisions

Alternatives: Removed uninformative, redundant clause ("and must not be confused with ... webarchive")
hatnote
 
(47 intermediate revisions by 37 users not shown)
{{short description|Safari's Web archive file format}}
{{Cleanup|date=April 2008}}
{{redirect|Webarchive|other uses|Web archive (disambiguation)}}
{{Refimprove|date=April 2008}}
{{distinguish|Web ARChive}}
{{About|webarchive file format|web archiving|web archiving|web.archive.org website|Internet Archive}}
{{Infobox file format
| name = Web Archive
| logo =
| icon =
| latest release date = <!-- {{Start date and age|YYYY|mm|dd|df=yes/no}} -->
| genre = [[web page]] [[file archive]]
| container for = [[websites]]
| contained by =
| extended from = [[plist|Apple Binary Property List]]
| extended to =
| standard =
| url =
}}
'''Web Archive''' (stylized by [[Apple Inc.|Apple]] as '''Web archive''', extension '''.webarchive''') is a [[Web archive file]] format available on [[macOS]] and [[Windows]] for saving and reviewing complete web pages using the [[Safari (web browser)|Safari web browser]].<ref name="folderize">{{cite web|last1=Frakes|first1=Dan|title=De-archive Web Archives|url=https://www.macworld.com/article/1050198/webarchivefolderizer.html|website=Macworld|publisher=IDG Communications|accessdate=15 June 2018}}</ref> The Web Archive format differs from a standalone [[HTML]] file because it also saves linked files such as images, [[Cascading Style Sheets|CSS]], and [[JavaScript]].<ref>{{cite web|last1=Arnott|first1=Nick|title=Apple declines to fix vulnerability in Safari's Web Archive files, likely because it requires user action to exploit|url=http://www.imore.com/apple-declines-fix-vulnerability-safaris-webarchive-files-likely-because-it-requires-user-action|website=iMore|date=28 April 2013|publisher=Mobile Nations|accessdate=7 February 2015}}</ref> The Web Archive format is a concatenation of source files with filenames saved in the binary [[Property list|plist]] format using NSKeyedArchiver.{{Citation needed|date=October 2008}}
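
Because the container is a binary property list, a generic plist reader can inspect it. The following is a minimal sketch rather than a reference implementation: it assumes the key names commonly seen in Safari-produced files (<code>WebMainResource</code>, <code>WebSubresources</code>, <code>WebResourceURL</code>, <code>WebResourceMIMEType</code>, <code>WebResourceData</code>), which this article does not itself document, and it builds a small synthetic archive in memory instead of reading a real file. The helper name <code>list_resources</code> is hypothetical.

```python
import io
import plistlib


def list_resources(archive):
    """Return (url, mime_type, size_in_bytes) for each resource in a parsed archive.

    The key names are an assumption based on commonly observed .webarchive files.
    """
    resources = [archive["WebMainResource"]] + list(archive.get("WebSubresources", []))
    return [
        (
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", ""),
            len(res.get("WebResourceData", b"")),
        )
        for res in resources
    ]


# Synthetic stand-in for a real file, so the sketch runs anywhere.
sample = {
    "WebMainResource": {
        "WebResourceURL": "https://example.org/",
        "WebResourceMIMEType": "text/html",
        "WebResourceData": b"<html><img src='logo.png'></html>",
    },
    "WebSubresources": [
        {
            "WebResourceURL": "https://example.org/logo.png",
            "WebResourceMIMEType": "image/png",
            "WebResourceData": b"\x89PNG",
        }
    ],
}
raw = plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)

# plistlib.load detects the binary plist format from the file header.
archive = plistlib.load(io.BytesIO(raw))
for url, mime, size in list_resources(archive):
    print(url, mime, size)
```

For a file on disk, the same helper could be applied to the result of <code>plistlib.load(open(path, "rb"))</code>, where <code>path</code> is a placeholder.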
 
Support for Web Archive documents was added in Safari 4 Beta on Windows and was included in subsequent versions, until its discontinuation in 2012. Safari on [[iOS]] and [[iPadOS]] ([[iPhone]] and [[iPad]]) has supported Web Archive files since at least [[iOS 13]].<ref name="iOS support iOS13">{{cite web|title=iOS and iPadOS 13 Review|url=https://www.macstories.net/stories/ios-and-ipados-13-the-macstories-review/9/#content|website=MacStories|accessdate=25 September 2019}}</ref> Previously, a third-party iOS app called Web Archive Viewer provided this functionality.
 
== Usage ==
* A version of the Web Archive format is used to bundle whole music albums and movies with extra content and menus inside [[iTunes LP|iTunes LP and Extras]].{{Citation needed|date=March 2012}}
* Web Archive files were automatically generated for ads submitted to Apple's [[iAd]] advertising platform.<ref>{{cite web|title=iAd JS Programming Guide: Web Archives and Manifest Files|url=https://developer.apple.com/library/iad/documentation/UserExperience/Conceptual/iAdJSProgGuide/CreatingBundles/CreatingBundles.html#//apple_ref/doc/uid/TP40010301-CH15-SW6|website=Mac Developer Library|publisher=Apple|accessdate=7 February 2015}}</ref>
* The [[WebKit]] framework's WebArchive class is used to simplify cutting-and-pasting with whole or partial web pages.<ref>{{cite web|title=WebArchive Class Reference|url=https://developer.apple.com/library/mac/documentation/Cocoa/Reference/WebKit/Classes/WebArchive_Class/index.html|website=Mac Developer Library|publisher=Apple|accessdate=7 February 2015}}</ref>
 
== Vulnerability ==
In February 2013, a vulnerability in the Web Archive format was discovered and reported by Joe Vennix, a [[Metasploit Project]] developer. The [[exploit (computer security)|exploit]] allows an attacker to send a user a crafted Web Archive file containing code to access [[HTTP cookie|cookies]], local files, and other data. Apple responded that it would not fix the bug, most likely because exploiting it requires the user to open the file.<ref>{{cite web|last1=Vennix|first1=Joe|title=Abusing Safari's webarchive file format|url=https://community.rapid7.com/community/metasploit/blog/2013/04/25/abusing-safaris-webarchive-file-format|website=Rapid7 Metasploit|date=25 April 2013|publisher=Rapid7|accessdate=7 February 2015}}</ref>
 
== Converting for other browsers ==
Workarounds to allow the file to be viewed in other browsers are possible, though specific webpage contents may hinder this process. This requires using one of the free tools [[WebArchive Folderizer]] (for OS X 10.2 and higher)<ref name="folderize"/> or [[WebArchive Extractor]] (for OS X 10.4.3 and higher).<ref>[https://robrohan.github.io/WebArchiveExtractor/ WebArchive Extractor]</ref> Web Archive files can be converted to WARC using the [[National Library of Norway]]'s Warchaeology set of tools.<ref>[https://nlnwa.github.io/warchaeology/cmd/warc_convert/ Warchaeology convert documentation]</ref>
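
The general idea behind such a conversion can be sketched as unpacking the archive into a folder of ordinary files. The example below is an illustration only and does not reproduce how WebArchive Folderizer or WebArchive Extractor actually work. It assumes the commonly observed property-list keys (<code>WebMainResource</code>, <code>WebSubresources</code>, <code>WebResourceURL</code>, <code>WebResourceData</code>), uses a hypothetical helper named <code>unpack</code>, and does not rewrite links inside the HTML or handle files that share the same name.

```python
import io
import plistlib
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def unpack(archive, out_dir):
    """Write the main resource as index.html and each subresource under its URL basename.

    Key names are assumed, not taken from a published specification.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []

    (out / "index.html").write_bytes(archive["WebMainResource"]["WebResourceData"])
    written.append("index.html")

    for i, res in enumerate(archive.get("WebSubresources", [])):
        # Keep only the last path component so nothing is written outside out_dir.
        name = Path(urlparse(res["WebResourceURL"]).path).name
        if not name or name == "..":
            name = f"resource_{i}"
        (out / name).write_bytes(res["WebResourceData"])
        written.append(name)
    return written


# Synthetic archive standing in for a real .webarchive file.
sample = {
    "WebMainResource": {
        "WebResourceURL": "https://example.org/",
        "WebResourceData": b"<html><img src='logo.png'></html>",
    },
    "WebSubresources": [
        {"WebResourceURL": "https://example.org/logo.png", "WebResourceData": b"\x89PNG"}
    ],
}
archive = plistlib.load(io.BytesIO(plistlib.dumps(sample, fmt=plistlib.FMT_BINARY)))

with tempfile.TemporaryDirectory() as folder:
    print(unpack(archive, folder))
```

A practical converter would also need to rewrite resource URLs in the saved HTML so that they point at the extracted files, which is one way specific page contents can hinder the process.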
 
== Alternatives ==
[[Mozilla Archive Format|MAFF]] is an open format (with a published specification) that enables saving of whole webpages in a single file. It is currently supported by [[Firefox]], using an extension.<ref>{{cite web |url=https://addons.mozilla.org/en-US/firefox/addon/mozilla-archive-format/ |title=Mozilla Archive Format, with MHT and Faithful Save |accessdate=8 December 2011|url-status=dead |archive-url=https://web.archive.org/web/20171102005204/https://addons.mozilla.org/en-US/firefox/addon/mozilla-archive-format/ |archive-date=2 November 2017}}</ref><ref>{{cite web |url=https://addons.mozilla.org/en-US/android/addon/webscrapbook/ |title=WebScrapBook |accessdate=17 November 2019 }}</ref> Other web browsers use the [[MHTML]] format or do the equivalent by saving a directory of inline resources (usually images) alongside the [[HTML]] file, sometimes compressed, like the [[KDE WAR (file format)|.war]] format used by [[Konqueror]] (tar+gzip or tar+bzip2). Safari does not support these alternative archive formats.
 
For archiving entire websites, the [[Internet Archive]] has developed the [[Web ARChive]] (WARC) format, which was standardized by the [[International Organization for Standardization|ISO]].
 
[[HTMLD]] (HTML Directory) is a NeXT-developed format for saving web pages and their dependencies in a [[Archive file|bundle]] that may also be served by a web server.<ref>{{cite web|url=http://xent.com/~rohit/HTMLD-Spec.htmld/index.html|title=.htmld Discussion}}</ref>
 
Chrome offers the "webpage, complete" format which saves the page with a folder containing the required resources.
 
==See also==
* [[Web archiving]] – the general process of archiving web pages
* [[List of web archiving file formats]] – file formats for archiving web pages
 
== References ==
<references/>
 
[[Category:Web archives]]
[[Category:Archive formats]]
[[Category:Web browsers]]
 
 
{{Mac-stub}}