This page details the XAD de-archiving system, and the "clients" I have written for XAD that decompress various disk and file archive formats.
- Introduction to file and disk archivers and XAD
- AMOS sample banks
- ar(1) file archives
- bzip and bzip2 decompressors
- Microsoft CAB archives
- COP disk archives
- Disk Imploder archives
- id WAD, WAD2 and PAK files
- Magic Shadow Archives
- MakeSFX self-extracting archives
- RPM Package Manager
- SOS disk format
- Microsoft TNEF attachments
- Unreal packages
- Wrapster archives
- ZAP disk archives
File archivers are programs that will coalesce a group of files into one single file, making management and distribution of the files simpler. Similarly, disk archivers turn the raw contents of floppy disks into files, making it simple to archive floppy disks on a hard drive, or distribute floppy disks over a network. Disk archivers are now generally relics of the past, as are many file archivers. However, some file archivers from the past are still in heavy use today, such as ZIP or tar.
After a few years of being involved in the XFD project, Dirk Stöcker decided that where XFD solved the problem of unpacking individual compressed files and executables, the greater problem of compressed archives needed to be addressed. Many of the Amiga's file and disk archivers were abandoned. If the archiving software itself stopped working, people would not be able to retrieve their archives. Many archivers only exist as executables, with no source code available. So Dirk took the successful "slave" system from XFD, and built XAD, the eXternal Archive Decompressor. This has a simple but powerful interface that allows any programmer to list and extract file and disk archives. Already, Andrew Bell has produced an excellent graphical interface for this system called Voodoo-X.
XAD is currently available for the Amiga, but it is also being converted to a UNIX library, and has been incorporated into Mac OS X software. It is written solely in endian-neutral C and cleanly separates implementation-specific functionality. The most popular formats supported are ACE, ADF, ARC, Arj, bzip, bzip2, CAB, compress, DMS, gzip, LhA, LZX, RAR, Stuffit, tar, ZIP and Zoo. This includes all their variants and self-extracting forms too.
This is a standard AMOS bank format. All this client does is extract the individual raw Amiga sound samples from the bank with a standard IFF 8SVX header, so they can easily be imported to other applications.
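As a rough illustration of what "adding a standard IFF 8SVX header" involves, here is a minimal Python sketch. It is not the client's code: the function name `wrap_8svx` and the default sample rate are assumptions made for the example, and the field layout follows the published 8SVX `VHDR` chunk.

```python
import struct

def wrap_8svx(samples: bytes, rate: int = 8363) -> bytes:
    """Wrap raw signed 8-bit mono samples in a minimal IFF 8SVX container.

    `rate` is an assumed default; a real converter would use the rate
    stored alongside each sample in the bank.
    """
    def chunk(tag: bytes, data: bytes) -> bytes:
        # An IFF chunk is a 4-byte tag, a big-endian length, the data,
        # and one pad byte if the data length is odd.
        pad = b"\x00" if len(data) & 1 else b""
        return tag + struct.pack(">I", len(data)) + data + pad

    vhdr = struct.pack(
        ">IIIHBBI",
        len(samples),  # oneShotHiSamples: the whole sample plays once
        0,             # repeatHiSamples: no looped part
        0,             # samplesPerHiCycle: unknown
        rate,          # samplesPerSec
        1,             # ctOctave: a single octave
        0,             # sCompression: uncompressed
        0x10000,       # volume: 1.0 in 16.16 fixed point
    )
    body = chunk(b"VHDR", vhdr) + chunk(b"BODY", samples)
    return chunk(b"FORM", b"8SVX" + body)
```

Writing the returned bytes to a file should give something other Amiga-aware tools recognise as an 8SVX sample.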
If you've ever looked in /usr/lib/ on a UNIX system, you'll find many static link libraries -- these are created in a standard format by a program called ar. This program is limited in its application to data archiving, especially compared to cpio and tar. However, the object code linker in UNIX can read from ar archives, so this became the standard format for creating static link libraries. Furthermore, the Debian GNU/Linux distribution uses the ar format to bundle files together.
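The common ar layout is simple enough to walk by hand: an 8-byte magic string, then for each member a 60-byte ASCII header followed by the data, padded to an even length. The sketch below is illustrative only. The function name `list_ar` is made up for this example, and it ignores the GNU and BSD long-filename extensions.

```python
AR_MAGIC = b"!<arch>\n"

def list_ar(data: bytes):
    """Yield (name, size, payload) for each member of an ar archive."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + 60 <= len(data):
        header = data[pos:pos + 60]
        # Every member header ends with the two bytes "`\n".
        if header[58:60] != b"`\n":
            raise ValueError("bad member header")
        name = header[0:16].decode("ascii").rstrip()
        size = int(header[48:58].decode("ascii"))  # decimal, space padded
        pos += 60
        yield name, size, data[pos:pos + size]
        # Member data is padded to an even offset.
        pos += size + (size & 1)
```

For example, `for name, size, _ in list_ar(open("libfoo.a", "rb").read()): print(name, size)` would list a static library's members (`libfoo.a` is a placeholder name).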
bzip2, by Julian Seward, is a block-sorting file compressor. It uses the Burrows-Wheeler transform (BWT) with the Move To Front (MTF) compression algorithm and Huffman coding. It also includes some run length encoding which "is entirely irrelevant" and block randomisation which "doesn't really need to be there". Nevertheless, "compression is generally considerably better than that achieved by more conventional LZ77 and LZ78-based compressors, and approaches the performance of the PPM family of statistical compressors". Fortunately, you don't have to understand the algorithm to use it, because bzip2's method has been packaged into a simple-to-use library interface, libbzip2.
Before the efforts of bzip2, there was bzip. Unlike bzip2, it used arithmetic coding as the entropy encoder, which may or may not have infringed IBM patents. Generally, bzip2 is used to compress UNIX tar files at trendy software sites, instead of using the more conventional gzip. It has better compression than gzip, but also requires vastly more CPU power and memory. libbz2 is very similar to zlib. This is intentional, as libbz2 was intended as a successor to zlib, as Jean-Loup Gailly has said in an interview.
This is Microsoft's most recent installation-media compression format. It supports 3 different compression algorithms - deflate, Quantum and LZX. See the cabextract page for more information.
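To give a feel for the first two stages, here is a toy Python sketch of the Burrows-Wheeler transform followed by Move To Front. It is a naive textbook version for non-empty inputs, not how bzip2 implements them, and the function names are invented for the example.

```python
def bwt(data: bytes) -> tuple[bytes, int]:
    """Naive BWT: sort all rotations and keep the last column.

    Returns the transformed bytes and the row holding the original
    string, which a decoder needs to invert the transform.
    """
    n = len(data)
    order = sorted(range(n), key=lambda i: data[i:] + data[:i])
    last = bytes(data[(i - 1) % n] for i in order)
    return last, order.index(0)

def mtf(data: bytes) -> list[int]:
    """Move To Front: emit each byte's position in a recency list."""
    table = list(range(256))
    out = []
    for b in data:
        idx = table.index(b)
        out.append(idx)
        # Recently seen bytes move to the front, so runs become zeros.
        table.insert(0, table.pop(idx))
    return out
```

The BWT groups identical bytes together, and MTF then turns those groups into runs of small numbers, which is what makes the final Huffman stage effective.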
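The family resemblance between the two libraries carries over to language bindings. As a small sketch using Python's standard `bz2` and `zlib` modules (which wrap the C libraries, so this shows the wrappers rather than the C API itself), the same round trip works with either module:

```python
import bz2
import zlib

# Highly repetitive sample data, chosen only for the demonstration.
payload = b"the same line of text repeated many times\n" * 2000

for module in (zlib, bz2):
    packed = module.compress(payload, 9)           # 9 = best compression
    assert module.decompress(packed) == payload    # lossless round trip
    print(module.__name__, len(payload), "->", len(packed))
```

The exact sizes printed depend on the library versions, so none are quoted here.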
COP! is a disk copier and archiver. It is part of the RAP!TOP!COP! compilation of utilities, all written in 1993 by Armin Sander (aka TIP/TNM). RAP! is a Stakker-like realtime disk compression system. TOP! is a disk defragmenter.
COP! can use XPK libraries for compressing its disk images, and it can also use four special "PACK" libraries that come with RAP!: lhst, which in turn uses the lh.library compression by Holger P. Krekel and Olaf Barthel; runl, which is a simple run length encoder; and scn1 and scf1. These last two methods were written by Armin. They produce compressed data in the same format, but scf1 is a tweaked form of scn1 that is faster at the expense of compression quality -- "the package now also includes the scf1 packer, which performs a few percent worse than the scn1 packer, but is considerably faster in return" (translated from the German). This XAD client does not currently support the lhst method, but it will soon.
The Disk Imploder by Albert-Jan Brouwer, otherwise known as DImp, is the application of the Imploder algorithm to standard Amiga disks. It was bundled with FImp (the File Imploder) in the Imploder distribution. The Imploder itself was an executable compressor. DImp was available before DMS, and also addresses many of the problems that DMS has. But it doesn't compress as well as DMS, so it too was a victim of the Amiga disk archivers war. The author calls the compression method "LZ77 like compression with a per-mode static Huffman coding step on the various parts of the skip, offset and length tuples". This is true, although the Huffman coding is "homebrew", as the paths taken on the tree are actually if-then-else branches in the algorithm, not table lookups or loops.
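To show what "if-then-else branches instead of table lookups" means, here is a small Python sketch of a static prefix code decoded purely by branching. The four-symbol code is hypothetical and has nothing to do with the Imploder's real code tables; the helper names are also invented.

```python
def make_bit_reader(data: bytes):
    """Return a function that yields the next bit, most significant first."""
    bits = ((byte >> shift) & 1 for byte in data for shift in range(7, -1, -1))
    return lambda: next(bits)

def decode_symbol(next_bit) -> str:
    """Decode one symbol of a hypothetical static code:
    0 -> A, 10 -> B, 110 -> C, 111 -> D.

    Each level of the code tree is one hard-coded branch, so there is
    no table and no loop over the tree.
    """
    if next_bit() == 0:
        return "A"
    if next_bit() == 0:
        return "B"
    if next_bit() == 0:
        return "C"
    return "D"
```

Because the code is static, the tree never changes and can be baked into the control flow like this, which is compact and quick on a 68000-class CPU.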
id are a well-known games company, famous for their DOOM and Quake series of games. For DOOM, they invented the WAD file format for storing all the game data. For Quake, they originally intended to use their slightly improved WAD2 file format, but instead went for a new PAK format, which gave them much more space for filenames. WAD2 was then used for a few of the files stored inside the PAK format.
The id XAD client doesn't attempt to do anything intelligent with these files yet. It simply extracts the raw data contained in them. A more intelligent client which converts files into a more useful file format will be released later.
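The difference in filename space is easy to see in the directory layouts. The Python sketch below lists the entries of a DOOM-style WAD (8-byte names) and a Quake PAK (56-byte names). The layouts are the commonly documented ones rather than anything taken from the XAD client, the function names are made up, and WAD2 is not handled.

```python
import struct

def _name(raw: bytes) -> str:
    # Names are NUL padded; keep only the part before the first NUL.
    return raw.split(b"\0", 1)[0].decode("ascii")

def list_wad(data: bytes):
    """Return (name, offset, size) for each lump in an IWAD/PWAD file."""
    magic, count, dir_ofs = struct.unpack_from("<4sii", data, 0)
    if magic not in (b"IWAD", b"PWAD"):
        raise ValueError("not a WAD file")
    entries = []
    for i in range(count):
        # 16-byte entry: offset, size, 8-byte name.
        ofs, size, name = struct.unpack_from("<ii8s", data, dir_ofs + 16 * i)
        entries.append((_name(name), ofs, size))
    return entries

def list_pak(data: bytes):
    """Return (name, offset, size) for each file in a Quake PAK file."""
    magic, dir_ofs, dir_len = struct.unpack_from("<4sii", data, 0)
    if magic != b"PACK":
        raise ValueError("not a PAK file")
    entries = []
    for pos in range(dir_ofs, dir_ofs + dir_len, 64):
        # 64-byte entry: 56-byte path, offset, size.
        name, ofs, size = struct.unpack_from("<56sii", data, pos)
        entries.append((_name(name), ofs, size))
    return entries
```

Extracting the raw data, as the client does, is then just a matter of slicing `data[ofs:ofs + size]` for each entry.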
The Magic Shadow Archiver is a disk archiver for the Atari ST. For many years, it was the ST's standard disk archiver. You can get some Atari games disks in MSA format from the little green desktop.
MakeSFX is an AmigaDOS script for creating self-extracting archives, written by me.
!PackDir is a RISC OS archiver written by John Kortink. It is a fast archiver, but not as popular as the !SparkFS combined file-system and archiver. The compression used is LZW -- the same as compress(1) and the GIF image format. My thanks go to John Kortink for providing me with the file format of his archiver.
One of the first Linux distributors was Red Hat, and one of the things they invented to make installing UNIX software easy on Linux was the Red Hat Package Manager. This basically collected the source code or binaries of a particular software package into a gzip-compressed cpio archive, prepended with a tagged header format storing meta-data. This was such a good format that most other Linux distributors used it too, prompting a name change from Red Hat Package Manager to the RPM Package Manager. More information on the format is available from rpm.org, and RPM packages can be found using rpmfind.net.
This slave, and its source, comes bundled with XAD.
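As an illustration of the start of an RPM file, the following Python sketch reads the fixed 96-byte "lead" that precedes the tagged headers. It does not parse the tagged headers or locate the cpio payload. The layout is the commonly documented one, and the function and key names are chosen for this example only.

```python
import struct

RPM_MAGIC = b"\xed\xab\xee\xdb"
# magic, major, minor, type, archnum, name[66], osnum, signature_type, reserved[16]
LEAD = struct.Struct(">4sBBhh66shh16s")  # big-endian, 96 bytes in total

def parse_rpm_lead(data: bytes) -> dict:
    """Decode the fixed-size lead at the start of an RPM package."""
    magic, major, minor, kind, arch, name, osnum, sigtype, _ = LEAD.unpack_from(data, 0)
    if magic != RPM_MAGIC:
        raise ValueError("not an RPM package")
    return {
        "version": (major, minor),
        "source": kind == 1,  # 0 = binary package, 1 = source package
        "arch": arch,
        "name": name.split(b"\0", 1)[0].decode("ascii", "replace"),
        "os": osnum,
        "signature_type": sigtype,
    }
```

A quick check such as `parse_rpm_lead(open(path, "rb").read(96))` is enough to tell a binary package from a source package before doing any heavier parsing.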
In the early 1990s, Dierk Ohlerich (Chaos/Sanity) was tired of hearing that the demos he coded didn't work on machines beyond an A500, so he started work on a new "demo operating system", where he stored all his disk loading routines and memory management routines. He called this the Sanity Operating System, and developed a number of very good, and widely compatible demos using it. To quote him from the scrolltext of Jesterday:
Perhaps you noticed that this Musicdisk has a similar "caching and prefetching" system like the one I coded for Zine, It uses all autoconfiguring Memory-Expansions. I also wrote a relocator that puts the code in Fastmem (if possible) to speed up the decrunching. In fact I coded a complete Operating system (to be precise: SOS - Sanity Operating System) with some kind of filing system, loadsegment, complex memory managment and build in disk prefetching caching. I hope to improve the compatibility of my Programs with this because I'm really sick of hearing that my productions only run on half of all Amigas. This one runs even on the 500 plus and perhaps on turboboards (I can`t test it but perhaps it does). But I am a bit depressed because of the low compatibility of my previous productions (except Elysium) so I think that it does not run on many Amigas. Lets simply hope it works on your amiga and continue with some other things ...
Microsoft sell mail server software called Exchange, which they recommend companies use for their internal mail network. It uses a format called TNEF, which stands for Transport Neutral Encapsulation Format.
When mails go from a company's internal network to the internet, the Exchange server usually converts this format into the standard email format. Sometimes, the Exchange server is misconfigured and sends you an email with a uuencoded section called WINMAIL.DAT or a MIME application/ms-tnef section. If you've ever received one, firstly you can tell the sender not to do it again, but then you can save off your TNEF attachment within your email program, and look at it with XAD, using this client. You should be able to see any files that were attached to the email. This client is based on tnef by Mark Simpson.
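If your mail program makes it awkward to save the attachment, the MIME case can be handled with a few lines of Python. This is a sketch using only the standard `email` package. The function name is invented, it does not handle the uuencoded variant, and the signature check relies on the documented TNEF magic number 0x223E9F78.

```python
import email
from email import policy

# The 32-bit TNEF signature 0x223E9F78 as stored on disk (little-endian).
TNEF_SIGNATURE = b"\x78\x9f\x3e\x22"

def find_tnef_parts(raw_message: bytes):
    """Return (filename, payload, looks_like_tnef) for each TNEF part."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    found = []
    for part in msg.walk():
        if part.is_multipart():
            continue
        name = (part.get_filename() or "").lower()
        if part.get_content_type() == "application/ms-tnef" or name == "winmail.dat":
            # decode=True undoes the base64 transfer encoding.
            payload = part.get_payload(decode=True) or b""
            found.append((name or "winmail.dat", payload,
                          payload.startswith(TNEF_SIGNATURE)))
    return found
```

Each payload can then be written to disk and examined with XAD using this client.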
Unreal is a popular series of PC games, which use a "hairy" package format to store textures, models, music, AI scripts and so on. Currently, while this XAD client can parse the general structure of these packages, it will only extract music (.UMX) files. This is because music files don't require conversion to a "useful" format, unlike all the other data stored in packages.
Wrapster is a sneaky way to get any type of file onto MP3-only music sharing P2P networks. As MP3-only sharing networks no longer really exist, this software is obsolete.
ZAP archives are normal Amiga disks compressed with a ByteKiller variant. ZAP also includes RLE encoding, which is pointless as the unhashed search method used by the compressor actually encodes repeated bytes as a single LZ match. ZAP is another disk archiver that didn't win against DMS, so virtually nobody will have any of these archives.
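The reason a separate RLE pass adds nothing can be shown with a generic LZ decoder. The Python sketch below is not ZAP's or ByteKiller's bitstream format; the token representation is invented for the example. It shows that a match whose distance is shorter than its length already reproduces a run of repeated bytes.

```python
def lz_decode(tokens):
    """Decode a list of LZ tokens.

    A token is either an int (a literal byte) or a (distance, length)
    pair meaning: copy `length` bytes starting `distance` bytes back.
    """
    out = bytearray()
    for token in tokens:
        if isinstance(token, int):
            out.append(token)
        else:
            distance, length = token
            # Copy byte by byte so the match may overlap its own output.
            for _ in range(length):
                out.append(out[-distance])
    return bytes(out)
```

For instance, `lz_decode([0x41, (1, 7)])` expands one literal and one match into eight identical bytes, which is exactly the job RLE would otherwise do.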