Microsoft Compiled HTML Help

This is an old revision of this page, as edited by 82.76.6.48 (talk) at 12:49, 13 June 2006 (External links). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Microsoft Compiled HTML Help (CHM) is a proprietary format for online help files. A CHM file contains a set of web pages written in a subset of HTML, together with a hyperlinked table of contents. The format is optimized for reading, as files are heavily indexed. All files are compressed together with LZX compression. Most CHM browsers display the table of contents on the left.

A CHM help file has a ".chm" or ".CHM" extension and is often referred to as a "chum" file. The file starts with the bytes "ITSF" (in ASCII), for "Info-Tech Storage Format". The format has been partially reverse engineered, and specifications of both the container and the internal files are available.
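
As an illustration of the signature mentioned above, here is a minimal Python sketch that checks whether a file begins with the ASCII bytes "ITSF". The function name looks_like_chm is hypothetical, and the check only inspects the first four bytes; it does not validate the rest of the container.

```python
CHM_SIGNATURE = b"ITSF"  # leading bytes of a CHM file, as described in the text


def looks_like_chm(path):
    """Return True if the file at `path` starts with the ITSF signature.

    This is only a quick signature test, not a full format validation.
    """
    with open(path, "rb") as f:
        return f.read(len(CHM_SIGNATURE)) == CHM_SIGNATURE
```

A file that passes this test may still be damaged or truncated, so a tool would normally go on to parse the header before trusting it.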

Some open source tools, such as xCHM and GnoCHM, can read and explore these files, but they lack various features of the Microsoft Windows tools.

HTML Help files are made with help authoring tools such as PowerCHM or HTML Help Workshop.

Microsoft Compiled HTML Help is more complex than Microsoft WinHelp, which is based on Rich Text Format.

For more information, see the HTML Help web page on MSDN.

Advantages

  • file size smaller than plain HTML
  • range of formatting options that HTML gives for text presentation
  • ability to search the full text
  • ability to assemble several CHM files into one file with common TOC, index and search (see MSDN)

Applications

This format was originally intended only for encoding help files, but other uses have since been found. It is very handy for packing saved HTML pages in one compact and browsable archive and for creating compact ebooks. Some people use it to keep personal notes, because it can organize them in an ordered hierarchical table and allows quick text searching.

Extracting to HTML

On Windows, a CHM file can be extracted to plain HTML either with HTML Help Workshop or with the command:

hh.exe -decompile extracted filename.chm

This decompresses all files embedded in filename.chm into the folder extracted.

On Unix systems that use apt for package management, a CHM file can be extracted to plain HTML with:

 $ sudo apt-get install libchm-bin
 $ extract_chmLib tero.chm tero/

Another useful set of tools for CHM files in non-Windows environments is the CHM Tools Package. It is available as source code and includes a program, chmdump, which extracts the HTML from a CHM file into a separate directory.

It is also available via Fink on Mac OS X. If Fink is installed on your system, you can install the package by typing the following at a Terminal prompt:

  $ sudo fink install chmtools

You can then extract a CHM file with:

  $ chmdump chmfile.chm outdir
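
The three extraction commands in this section take their arguments in different orders, which is easy to get wrong in a script. The following Python sketch collects them in one place. The function names build_extract_command and extract_chm, and the tool labels "hh", "extract_chmLib" and "chmdump", are hypothetical names chosen for this example; the sketch assumes the chosen program is installed and on the search path.

```python
import subprocess


def build_extract_command(tool, chm_path, out_dir):
    """Return the argument list for one of the extractors named in this section."""
    if tool == "hh":
        # hh.exe takes the output folder first, then the CHM file.
        return ["hh.exe", "-decompile", out_dir, chm_path]
    if tool == "extract_chmLib":
        # extract_chmLib takes the CHM file first, then the output directory.
        return ["extract_chmLib", chm_path, out_dir]
    if tool == "chmdump":
        # chmdump takes the CHM file first, then the output directory.
        return ["chmdump", chm_path, out_dir]
    raise ValueError("unknown extraction tool: %r" % (tool,))


def extract_chm(tool, chm_path, out_dir):
    """Run the chosen extractor; raises CalledProcessError if it exits with an error."""
    subprocess.run(build_extract_command(tool, chm_path, out_dir), check=True)
```

For example, extract_chm("chmdump", "chmfile.chm", "outdir") would run the same command as the last example above.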