Web Archive (file format)

{{short description|Safari's Web archive file format}}
{{About|Safari’s webarchive file format|web archiving|web archiving|other uses|Web archive (disambiguation)}}
{{redirect|Webarchive|other uses|Web archive (disambiguation)}}
{{distinguish|Web ARChive}}
{{More citations needed|date=April 2008}}
{{Infobox file format
| name = Web Archive
| logo =
| icon =
| url =
}}
'''Web Archive''' (stylized by [[Apple Inc.|Apple]] as '''Web archive''', extension '''.webarchive''') is a [[Web archive file]] format available on [[macOS]] and [[Windows]] for saving and reviewing complete web pages using the [[Safari (web browser)|Safari web browser]].<ref name="folderize">{{cite web |last1=Frakes |first1=Dan |title=De-archive Web Archives |url=https://www.macworld.com/article/1050198/webarchivefolderizer.html |website=Macworld |publisher=IDG Communications |accessdate=15 June 2018}}</ref> The Web Archive format differs from a standalone [[HTML]] file because it also saves linked files such as images, [[Cascading Style Sheets|CSS]], and [[JavaScript]].<ref>{{cite web|last1=Arnott|first1=Nick|title=Apple declines to fix vulnerability in Safari's Web Archive files, likely because it requires user action to exploit|url=http://www.imore.com/apple-declines-fix-vulnerability-safaris-webarchive-files-likely-because-it-requires-user-action|website=iMore|date=28 April 2013|publisher=Mobile Nations|accessdate=7 February 2015}}</ref> The Web Archive format is a concatenation of source files with filenames saved in the binary [[Property list|plist]] format using NSKeyedArchiver.{{Citation needed|date=October 2008}}
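
As a rough illustration of the property-list structure described above, the following Python sketch lists the resources stored in an archive. It assumes the file is a binary property list whose top-level dictionary holds a <code>WebMainResource</code> entry and a <code>WebSubresources</code> array, each resource carrying <code>WebResourceURL</code>, <code>WebResourceMIMEType</code>, and <code>WebResourceData</code>. These key names are the ones commonly seen in Safari-produced files and are an assumption here, not part of a published specification. The sample archive is synthetic.

```python
import plistlib

# Key names below are the ones commonly seen in Safari-produced files;
# they are an assumption here, not taken from a published specification.
MAIN_KEY = "WebMainResource"
SUB_KEY = "WebSubresources"


def list_resources(archive_bytes):
    """Return (url, mime_type, size_in_bytes) for each resource in the archive."""
    root = plistlib.loads(archive_bytes)  # auto-detects binary or XML plists
    resources = [root[MAIN_KEY]] + list(root.get(SUB_KEY, []))
    return [
        (
            res.get("WebResourceURL", ""),
            res.get("WebResourceMIMEType", ""),
            len(res.get("WebResourceData", b"")),
        )
        for res in resources
    ]


def make_sample_archive():
    """Build a tiny synthetic archive (illustrative data, not a real capture)."""
    root = {
        MAIN_KEY: {
            "WebResourceURL": "https://example.com/",
            "WebResourceMIMEType": "text/html",
            "WebResourceData": b"<html><img src='a.png'></html>",
        },
        SUB_KEY: [
            {
                "WebResourceURL": "https://example.com/a.png",
                "WebResourceMIMEType": "image/png",
                "WebResourceData": b"\x89PNG",
            }
        ],
    }
    return plistlib.dumps(root, fmt=plistlib.FMT_BINARY)


if __name__ == "__main__":
    for url, mime, size in list_resources(make_sample_archive()):
        print(url, mime, size)
```

To inspect a real file, its bytes could be passed in directly, for example <code>list_resources(open("page.webarchive", "rb").read())</code>, where the file name is hypothetical.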

Support for Web Archive documents was added in Safari 4 Beta on Windows and was included in subsequent versions, until Safari for Windows was discontinued in 2012. Safari on [[iOS]] and [[iPadOS]] ([[iPhone]] and [[iPad]]) has supported Web Archive files since at least [[iOS 13]].<ref name="iOS support iOS13">{{cite web|title=iOS and IPadOS 13 Review|url=https://www.macstories.net/stories/ios-and-ipados-13-the-macstories-review/9/#content|website=MacStories|publisher=MacStories|accessdate=25 September 2019}}</ref> Previously, a third-party iOS app called Web Archive Viewer provided this functionality.
 
==Usage==
* A version of the Web Archive format is used to bundle whole music albums and movies with extra content and menus inside [[iTunes LP|iTunes LP and Extras]].{{Citation needed|date=March 2012}}
* Web Archive files were automatically generated for ads submitted to Apple's [[iAd]] advertising platform.<ref>{{cite web|title=iAd JS Programming Guide: Web Archives and Manifest Files|url=https://developer.apple.com/library/iad/documentation/UserExperience/Conceptual/iAdJSProgGuide/CreatingBundles/CreatingBundles.html#//apple_ref/doc/uid/TP40010301-CH15-SW6|website=Mac Developer Library|publisher=Apple|accessdate=7 February 2015}}</ref>
* The [[WebKit]] framework's WebArchive class is used to simplify cutting-and-pasting with whole or partial web pages.<ref>{{cite web|title=WebArchive Class Reference|url=https://developer.apple.com/library/mac/documentation/Cocoa/Reference/WebKit/Classes/WebArchive_Class/index.html|website=Mac Developer Library|publisher=Apple|accessdate=7 February 2015}}</ref>
 
==Vulnerability==
In February 2013, a vulnerability in the Web Archive format was discovered and reported by Joe Vennix, a [[Metasploit Project]] developer. The [[exploit (computer security)|exploit]] allows an attacker to send a user a crafted Web Archive containing code to access [[HTTP cookie|cookies]], local files, and other data. Apple's response to the report was that it would not fix the bug, most likely because it requires action on the user's part in opening the file.<ref>{{cite web|last1=Vennix|first1=Joe|title=Abusing Safari's webarchive file format|url=https://community.rapid7.com/community/metasploit/blog/2013/04/25/abusing-safaris-webarchive-file-format|website=Rapid7 Metasploit|date=25 April 2013|publisher=Rapid7|accessdate=7 February 2015}}</ref>
 
==Converting for other browsers==
Workarounds to allow the file to be viewed in other browsers are possible, though specific webpage contents may hinder this process. This requires one of the free tools [[WebArchive Folderizer]] (for OS X 10.2 and higher)<ref name="folderize"/> or [[WebArchive Extractor]] (for OS X 10.4.3 and higher).<ref>[https://robrohan.github.io/WebArchiveExtractor/ WebArchive Extractor]</ref> Webarchives can be converted to WARC using the [[National Library of Norway]]'s Warchaeology set of tools.<ref>[https://nlnwa.github.io/warchaeology/cmd/warc_convert/ Warchaeology convert documentation]</ref>
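
The general idea behind such a conversion can be sketched in Python as follows. This is a minimal illustration, not the method used by any of the tools named above. It assumes the archive is a property list with the commonly observed <code>WebMainResource</code> and <code>WebSubresources</code> keys (an assumption, not a documented specification), and the helper names <code>safe_name</code> and <code>extract_archive</code> are hypothetical.

```python
import plistlib
from pathlib import Path
from urllib.parse import urlsplit


def safe_name(url, index):
    """Derive a flat, filesystem-safe file name from a resource URL."""
    name = Path(urlsplit(url).path).name or "index.html"
    # Prefix with the index so two resources with the same base name do not collide.
    return f"{index:03d}_{name}"


def extract_archive(archive_bytes, out_dir):
    """Write every resource in a Web Archive to out_dir; return the written paths."""
    # Key names are assumed from commonly observed Safari-produced files.
    root = plistlib.loads(archive_bytes)
    resources = [root["WebMainResource"]] + list(root.get("WebSubresources", []))
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i, res in enumerate(resources):
        target = out / safe_name(res.get("WebResourceURL", ""), i)
        target.write_bytes(res.get("WebResourceData", b""))
        written.append(target)
    return written
```

This sketch only unpacks the stored files. Links inside the saved HTML still point at the original URLs, so a page extracted this way may not display correctly offline unless those references are rewritten.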
 
==Alternatives==
For archiving entire websites, the [[Internet Archive]] has developed the [[Web ARChive]] (WARC) format, which was standardized by the [[International Organization for Standardization|ISO]].
 
[[HTMLD]] (HTML Directory) is a NeXT-developed format for saving web pages and their dependencies in a [[Archive file|bundle]] that may also be served by a web server.<ref>{{cite web|url=http://xent.com/~rohit/HTMLD-Spec.htmld/index.html|title=.htmld Discussion}}</ref>
 
Chrome offers the "webpage, complete" format which saves the page with a folder containing the required resources.
 
==See also==
* [[Web archiving]] – the general process of archiving web pages
* [[List of web archiving file formats]] – file formats for archiving web pages
 
==References==
<references/>
 
[[Category:Web archives]]
[[Category:Archive formats]]
[[Category:Web browsers]]