Multimedia Container Format: Difference between revisions

SmackBot (talk | contribs)
m Date/fix the maintenance tags or gen fixes using AWB
complete rewrite
Line 1:
{{ infobox file format
{{Cleanup|date=December 2006}}
| name = Multimedia Container Format (MCF)
<!-- | icon = [[Image:Matroska-logo-128x128.png]]
| logo = -->
| extension = <tt>.mcf</tt>
| owner = [http://mcf.sourceforge.net mcf.sourceforge.net]
| genre = [[Container format (digital)|Container format]]
| container for = [[Multimedia]]
}}
 
'''MCF''' is an open (the [[specification]]s are available for everybody, free of charge), free (no [[royalties]]) data storage format called '''Multimedia Container Format''' (or '''Movie Container Format'''). The group originating it has promised that the format and all software developed by them for it will also stay free; it won't be turned into a commercial project once it's popular.
 
'''Multimedia Container Format''', abbreviated MCF, is a [[container format (digital)|container format]] and a predecessor of [[Matroska]]. The project has been abandoned since early 2004, but many of its innovative features found their way into Matroska.
Essentially it is a file format, like [[Microsoft]]'s [[AVI]]. However, it is also much more than that, having streaming and broadcasting features in the same format. It is not a video or audio [[compression algorithm]], but instead just a [[Container format (digital)|container]] that can hold any media inside it. This includes MPEG-4 ([[XviD]] and [[DivX]]), [[AC3]], [[Vorbis]], [[MP3]] and others.
 
==History==
MCF was the first project to create an open and flexible media container format that could encapsulate a video stream, multiple audio streams and subtitle data in one file. The project was started in 2000 by the developer [[Lasse Kärkkäinen]] (Tronic) as an attempt to improve the aging AVI format, which has no support for multiple audio streams or embedded subtitles. The first draft specification was published in 2001. At first the project generated some confusion about its intended goals. This was solved when the lead developer created a simple player for the format which supported embedded subtitles, which sparked interest and the community began to grow. Several new features were added and the specification refined.
{{Inappropriate tone|date=December 2007}}
The reason this project was started is that in the year 2000 the only usable movie format on PC was AVI, which had seen its best days and couldn't really be extended any further. The project started with brainstorming on what an optimal video format should do. At that time, the project was just a text file on the home computer of the current project leader, [[Lasse Kärkkäinen]] (Tronic). He wrote a very primitive specification and then contacted the definite number one AVI hobbyist, [[Avery Lee]], best known for his free video editing application [[VirtualDub]], for comments. After a few e-mails back and forth, the format began to take shape. At that time the format was basically just ''a better AVI'', with only some simple improvements such as timecodes, tagging and resistance to incomplete file transfers and other errors.
 
The crucial event in the project's history was the invention of [[Extensible Binary Meta Language|EBML]], a binary meta-format inspired by XML, by the programmer [[Steve Lhomme]] in the fall of 2002, quickly followed by a six-month coding break by Kärkkäinen due to military service. Since MCF was deemed nearly release-ready at the time, EBML was not accepted, which led Lhomme to set up his own [[Matroska]] project based on EBML. Due to the absence of the lead developer, most of the interest quickly shifted to the new Matroska project; by the time Kärkkäinen returned from the army, the developer community around MCF had completely disintegrated. Lack of manpower and educational commitments caused Kärkkäinen's attempts at reviving the project to fail. The final specifications were never published, and the last news entry on the project's SourceForge web page is dated September 6th, 2003.
In 2001, the specification was written in [[HTML]] and the format was published for the first time, with a request for comments. Nobody really seemed to understand what the project was about, because there had not been any alternative container formats on the PC before. It was very difficult to get any developers even to have a look at it. Finally, on a web forum that no longer exists, some people paid attention and asked questions. However, they were quite cautious about such a new format, and many asked why not use [[QuickTime]], [[Ogg]] or simply [[AVI]] instead. At this point everything was only on paper and thus it was difficult to gain respect for the project. This led to Lasse Kärkkäinen writing the first software for the format. In two days, he had released a parser library and a simple [[ASCII]] art video player supporting subtitles. This attracted a team, and the real development began. One of those to join was a [[C++]] coder named [[Ingo Ralf Blum]] (ingo). Another talented C++ programmer, [[Steve Lhomme]] (robUx4), heard of the project a few months later ([[January 23]], [[2002]] is the exact date, found on his [http://forum.hardware.fr/hardwarefr/VideoSon/divx-4-5-kel-avance-et-pour-quand--sujet-25662-2.htm forum post]). This man was to have an important role in the project.
 
==Features==
Ingo Ralf Blum and Steve Lhomme were soon writing another version of the library, now in C++ (called, more properly, [[libmcf]]). Parts of this library are still in use in [[libebml]]. However, the format itself really started evolving, as the project was exponentially gathering more interest and thus good ideas. Because of that, the library never quite reflected the specifications. At some point, Ingo disappeared from the net for personal reasons, leaving the code in Steve's hands.
One of the objectives of the new format was to simplify its handling by players. This was to be done by making it feature-complete, eliminating the need for third-party extensions and actively discouraging them. Because of the simple, fixed structure, the time required to read and parse the header information was minimal. The small size of the header (2.5 kB), which at the same time contained all the important data, facilitated quick scanning of collections of MCF files, even over slow network links.
 
The key feature of MCF was being able to store several chapters of video, menus, subtitles in several languages and multiple audio streams (e.g. for different languages) in the same file. At the same time, the content could be split between several files called segments; assembling the segments into a complete movie was automatic, given the segments were all present. Segments could also be played separately, and overlap between segments was customizable. The format also allowed for variable frame rate video. To verify integrity, [[cyclic redundancy check|CRC32]] checksums were embedded into the file, and [[digital signature]]s were supported. A degree of resilience was built into the parser, allowing for playback of partially corrupted movies.
During 2002, the format matured a lot, and the team started looking to get it running and thus into proper testing. However, at the same time, features were still pouring in and the specification could not be frozen. Pretty soon libmcf was so far behind the specifications (especially thanks to the invention of Elements, which changed the format a lot) that it was decided to discontinue its development and to eventually write a new library, once the specification became final.
 
MCF's per-frame overhead (7 bytes) was considerably lower than that of AVI (40 bytes), and comparable to that of Matroska (10 bytes).
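
To put the per-frame figures above in perspective, the following minimal sketch multiplies them out for one hypothetical case: a two-hour movie at 25 frames per second, counting video frames only. The movie length, the frame rate and the names <code>PER_FRAME_OVERHEAD</code> and <code>total_overhead_bytes</code> are illustrative assumptions, not part of any MCF specification.

```python
# Per-frame container overhead in bytes, as quoted in the text above.
PER_FRAME_OVERHEAD = {"MCF": 7, "Matroska": 10, "AVI": 40}


def total_overhead_bytes(frames: int, per_frame: int) -> int:
    """Total container overhead for a given number of frames."""
    return frames * per_frame


# Hypothetical example: 2 hours of video at 25 frames per second.
FRAMES = 2 * 60 * 60 * 25  # 180,000 frames

overhead = {
    name: total_overhead_bytes(FRAMES, per_frame)
    for name, per_frame in PER_FRAME_OVERHEAD.items()
}

for name, total in overhead.items():
    print(f"{name}: {total} bytes ({total / 1024:.0f} KiB)")
```

Under these assumptions the totals are 1,260,000 bytes for MCF, 1,800,000 bytes for Matroska and 7,200,000 bytes for AVI. Audio and subtitle frames would add to each total in proportion.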
At the end of 2002, Steve Lhomme started to experiment with a completely new way of storing data, [[Extensible Binary Meta Language]] (EBML). He replaced most of the simple binary fields of the MCF format with these EBML fields. After some discussion (he was proposing to replace the almost-ready-for-release MCF with this new system), he started his own project, [[Matroska]], to experiment more with his system (the date was [[December 6]], [[2002]]). Around the same time, Lasse Kärkkäinen, still leading the project and writing the specifications, had to leave for military service, and because the remaining developers weren't comfortable enough coding software, MCF development stalled for six months.
 
After a six-month hiatus, in the summer of 2003, Matroska had gained a four-month lead. All those who were in a hurry had shifted their interest to Matroska. This relieved the pressure for a rushed MCF release, and enabled the format to be fine-tuned further. The experiments with Matroska provided useful feedback, and thus MCF received many improvements that it would not have had without the split. These include proper multi-segment support (virtual addressing) and even lower overhead.
 
As of [[April 8]], [[2004]], the format was finalizing its specifications, and the C++ library for handling it ([[libmcf]]) was roughly half done. However, development was very slow, as university took most of Lasse's time.
 
==Key features==
{{Inappropriate tone|date=December 2007}}
Strict standard: doesn't allow users to add extensions on their own, like AVI, OGM and Matroska do. There is no technical reason to limit this, but it increases overall compatibility a lot, which is very important because the project is really aiming for hardware support. It's much better to have a single format that has the important features of all others. To make writing software easy, MCF uses fixed binary structures or otherwise simple methods whenever possible. One thing not found in MCF is [[XML]] (however, there might still be some stored inside MCF).
 
Also, because of MCF's strictly defined binary structure and the fixed offsets of a few headers, software on a machine of the appropriate [[endianness]] can read them with a single read command; this is very fast and really counts when you want to make an inventory of all the files on your [[hard disk]], on a network drive, or even of those on some [[file transfer protocol|FTP]] site (just read the first 2.5&nbsp;[[kilobyte|KB]] of each file and you have all the data you need: no seeking and only very little transfer).
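
The inventory idea described above can be sketched as follows. This is a minimal illustration, not an MCF parser: it only fetches the leading block of each file with one read call and does not decode any fields, because the header layout is not described here. The constant <code>HEADER_BYTES</code> (2.5&nbsp;KB taken as 2560 bytes) and the function names are assumptions made for the example.

```python
from pathlib import Path

# Assumption: "2.5 KB" is taken to mean 2560 bytes.
HEADER_BYTES = 2560


def read_header(path: Path) -> bytes:
    """Return the leading header block of one file using a single read call."""
    with path.open("rb") as f:
        return f.read(HEADER_BYTES)


def scan_collection(directory: Path) -> dict[str, bytes]:
    """Collect the raw header block of every .mcf file in a directory.

    No seeking is needed, and at most HEADER_BYTES are transferred per file.
    """
    return {p.name: read_header(p) for p in sorted(directory.glob("*.mcf"))}
```

Calling <code>scan_collection(Path("movies"))</code> on a hypothetical directory would return one raw header block per <code>.mcf</code> file, regardless of how large the files are.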
 
Menus, chapters, subtitles and everything in the same file/stream.
 
ASCII-looking binary: MCF uses a lot of human-readable ASCII as identifiers instead of some more opaque binary. This allows easy debugging of files with any hex editor. Also, the beginning of each MCF file contains some information readable with a [[text editor]].
 
Variable framerates: MCF allows changing the frame rate of video in each scene. For instance, [[NTSC]] and [[PAL]] video use different frame rates, and MCF allows them to be intercut. Also, some codecs are tuned for a specific amount of motion per frame. An MCF encoder can lower the frame rate in scenes with less motion, reducing the amount of bandwidth required and improving quality with such codecs.
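
As a rough illustration of variable frame rates combined with millisecond timecodes, the sketch below assigns a timecode to every frame of a sequence of scenes, each with its own frame rate. The function name and the scene representation are assumptions for the example and do not reflect how MCF actually stores timing data.

```python
from fractions import Fraction


def frame_timecodes_ms(scenes):
    """Return one millisecond timecode per frame.

    scenes is a list of (frames_per_second, frame_count) pairs, played in order.
    Exact fractions are used so rounding happens only once, at the 1 ms step.
    """
    timecodes = []
    scene_start = Fraction(0)  # start of the current scene, in seconds
    for fps, count in scenes:
        for i in range(count):
            timecodes.append(round((scene_start + Fraction(i, fps)) * 1000))
        scene_start += Fraction(count, fps)
    return timecodes


# Three frames at 25 frame/s followed by two frames at 10 frame/s.
timecodes = frame_timecodes_ms([(25, 3), (10, 2)])
print(timecodes)  # [0, 40, 80, 120, 220]
```

The second scene simply continues from where the first one ended, so a player only needs the timecodes, not a global frame rate.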
 
Seamless multi-segment: MCF supports dividing a long movie into several files, which can be recorded on several discs. If all segments are present at the same time, the user won't even notice the crossing of boundaries (most of the dirty work is done transparently in the parser, so players or codecs don't really need to bother with it). The user can also join the segments into a single big file or split it into segments of different sizes. Also, every segment is playable alone too, and you can define any overlap time you wish on them.
 
Full [[cyclic redundancy check|CRC32]] protection: data can be protected, and if an error occurs (broken resume on download being the most common reason), the broken part is skipped (no more frozen frames or unplayable files). The parser can also tell where exactly the error happened, so you can re-download this specific part of the file. Another related feature is the playback of incomplete downloads without slow index regeneration or a requirement for a smart player with a smart parser.
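
The skip-and-report behaviour described above can be sketched with Python's standard <code>zlib.crc32</code>. The block layout used here (a list of checksum and payload pairs) is an assumption made for the example; the real MCF layout is not described in this article.

```python
import zlib


def protect(blocks):
    """Pair each data block with its CRC32 checksum."""
    return [(zlib.crc32(data), data) for data in blocks]


def read_skipping_errors(protected):
    """Return (good_blocks, bad_indices).

    A block whose checksum does not match is skipped instead of aborting
    playback, and its index is reported so that part can be fetched again.
    """
    good, bad = [], []
    for index, (checksum, data) in enumerate(protected):
        if zlib.crc32(data) == checksum:
            good.append(data)
        else:
            bad.append(index)
    return good, bad


stream = protect([b"frame-0", b"frame-1", b"frame-2"])
stream[1] = (stream[1][0], b"frame-X")  # simulate corruption in transit
good, bad = read_skipping_errors(stream)
print(good, bad)  # [b'frame-0', b'frame-2'] [1]
```

Only the damaged block is lost, and the reported index tells the user which part of the file to download again.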
 
Digital signatures: allow [[authentication|authenticating]] an author's releases against changes. If anyone changes the content en route, the digital signature can tell which parts have been changed, removed or added after signing it. One movie can also be signed by several different people, signing different parts of it, with different keys. This system uses commonly known [[Public-key cryptography|public key]] algorithms, so it should be hard to break. Unlike a signature on an entire AVI file, MCF signs only the content, not the container. So, one can [[remux]] it (to better suit streaming, maybe), divide it into segments in a different way (or combine all segments), or basically do anything that doesn't change the actual contents, and the signature will stay valid. One can also add or remove tracks (some languages that aren't needed, for instance), and the digital signature still validates the parts that haven't been removed (but also tells which are missing or added).
 
No standardized [[digital rights management]] (DRM) or copy protection bits. The maintainers intend never to add support for features that try to limit what users can do with the files they have. If you want to protect your content from being freely distributed, don't give copies of it to people you can't trust in the first place! The technical reasons for this are obvious too: if you can view something and you have specifications to the copy limiting system or sources for the player, you can break any protection in no time. The only systems that could work even for a moment (and probably not much longer) would be closed-source or hardware protection. Both of those also limit who can watch it (i.e. do I need a Fritz chip on my computer, with a complete Fritz-protected hardware/OS/software chain, or maybe [[Microsoft Windows]] with some commercial player). And in the end, someone will still break it, as happened with [[DVD]]'s [[content-scrambling system|CSS]]. Not to mention that one could just make a copy through [[analog hole|analog reconversion]] no matter how strong the digital protection used is. However, implementors are free to use a proprietary DRM layer around the whole MCF file.
 
==Limits==
 
The limits of the MCF format were based on human perception and expectations of progress in bitrates of video. The [[time code]] precision of the format is limited to 1 ms. The addressing in the file is limited to 64 bits, which is extremely large. Frame size is stored as a 32-bit number, which limits a single frame to 4 [[gibibyte|GiB]]. Time codes are stored as 40-bit integers, which caps maximum movie length at approximately 35 years. The number of distinct streams in one file is 2<sup>16</sup>, or 65536. A movie can be split into a maximum of 255 segments.
Some limits of MCF are based on the limits of human perception, which cannot change. Humans can't notice a timing error of 0.5&nbsp;ms between video and audio, and they don't benefit from frame rates over 1000 frame/s either. These bounds can therefore be built into the format: it can be simplified by using one-millisecond precision in timecodes, because we can be absolutely sure that it won't cause trouble in future either (except maybe for writers of some special editing software which needs 100% accuracy).
 
==Overhead==
File sizes (this also applies to the size of a single stream without any breaks, or to the total size of a movie split over several files) are limited by the 64-bit addressing, which cannot be changed without breaking compatibility. If we assume that the current [[exponential growth]] of 1000 times the amount every 10 years will continue, this should be enough for another 30 years. However, it is likely that even that growth cannot be sustained.
 
Frame sizes are limited by 32-bit addressing, so it is possible to have single frames up to four gigaoctets in size.
 
The timecode system allows a single movie to last for almost 35 years, without breaks. This corresponds to a 40-bit integer count of milliseconds, but actually a combination of 32-bit and 16-bit integers is used, to save space.
 
The number of different tracks that can be put in a single stream/file is 65536. It is possible to fit a video stream in one and multi-channel audio in a second one, and there will still be 65534 to spare.
 
The number of segments one movie can be divided into is 255.
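
The limits quoted above follow directly from the stated field widths, as the short calculation below shows. The variable names are illustrative only, and a year is taken as 365.25 days. The last figure is merely what the 30-year estimate implies under the stated growth rate of 1000 times per decade; it is not a number given in the text.

```python
MS_PER_YEAR = 1000 * 60 * 60 * 24 * 365.25  # milliseconds in a 365.25-day year

max_duration_years = 2**40 / MS_PER_YEAR  # 40-bit millisecond timecode
max_frame_bytes = 2**32                   # 32-bit frame size
max_tracks = 2**16                        # 16-bit track number
max_file_bytes = 2**64                    # 64-bit addressing

# Growth of 1000x every 10 years gives 1000**3 over 30 years, so the 64-bit
# limit would be reached in 30 years by data that is this large today:
implied_size_today = max_file_bytes // 1000**3

print(f"longest movie: {max_duration_years:.2f} years")
print(f"largest frame: {max_frame_bytes // 1024**3} GiB")
print(f"tracks: {max_tracks}")
print(f"implied size today: {implied_size_today / 1024**3:.1f} GiB")
```

The first three results match the figures in the text: just under 35 years, 4 GiB per frame and 65536 tracks.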
 
Every MCF file requires around 3&nbsp;KB of headers, so you really need to store at least 20&nbsp;KB of data per file for it to be even remotely efficient (however, file systems would lose even much larger amounts of data because of allocation unit waste space).
Line 60 ⟶ 35:
 
When streaming and looking for the lowest possible latency, the smallest possible transfer unit is one frame (with 25 octets of overhead each, if using such minimal size units). The overhead is still only half that of an OpenDML AVI file, and MCF can offer checksum protection in that! However, Matroska can do this with just about 16 octets.
 
==Design goals==
 
* User-oriented: if there is a conflict between content producers' and users' view on something, users will win.
* Feature rich: supports almost everything you really need and more features are added if required.
* Compatible and easy to implement: wild extensions not allowed, simple data structures used whenever possible.
* Platform independent: streams, broadcasts and files are no different from each other.
* Extensible: nearly everything in MCF can be extended by adding new fields, still keeping the high space efficiency (most other extensible formats suffer from severe overhead when some extensions are added).
* Efficient: general space efficiency and the speed of features such as seeking are important
* Safe: the data structures used aren't as vulnerable to "unchecked buffer" or other common programming flaws as other systems could be, but also protect against some problems that cannot be fixed in parsers. MCF's design goal is that the parser should never hang for a long time, not even when reading broken data.
* Primarily targeted at high quality movie distribution: extremely low bitrate, low latency streaming or editing capabilities aren't MCF's priorities, but MCF can pretty much get them for free too.
 
==See also==
Line 80 ⟶ 44:
==External links==
 
* [http://mcf.sourceforge.net/Docs/MCF/ Unfinalized MCF format specification]
* [http://mcf.sourceforge.net/oldpage/ - Historic website and specifications (summer 2002)]
 
[[Category:Computer file formats]]