OK, this took me a bit longer than expected, but you get details and tools...
I don't know where to start...
OK, I grabbed some CDs and a DVD from my desktop and created ISOs with 2048-byte sectors.
1. games: Civilization 3 and Freelancer (btw I own these games!)
2. data: a clipart CD (with WMF files)
3. applications: Office XP and the VS.NET 2003 60-day trial DVD
Then I took some commonly used compressors (rar and zip) and packed all images as tightly as possible to get the compressed size.
After that I picked two freely available compression algorithms we could use for this project:
gzip = pretty old, but still good, excellent for small data blocks
bz2 = almost like rar in compression, but much faster
And I compressed the images with these tools too.
Now there are some points I was thinking about. I assume DT needs the data blockwise somehow (as it is in the image or on the CD);
that means one big stream, like the ones rar, zip and so on create, is useless. Every block in the compressed image has to be compressed on its own. Only this way can you seek to any block without reading and expanding data you don't need.
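To make the blockwise idea concrete, here is a minimal sketch (not the gziso code). It assumes Python's zlib module as a stand-in for gzip (same deflate algorithm), and the names compress_blockwise and read_sector are made up for illustration. Every sector is packed on its own, and an offset table lets you jump straight to the one you want:

```python
import zlib

SECTOR_SIZE = 2048  # sector size used for the test images


def compress_blockwise(image: bytes, sector_size: int = SECTOR_SIZE):
    """Compress every sector on its own and remember where each one starts."""
    offsets = []   # offsets[i] = start of compressed sector i inside the payload
    chunks = []
    position = 0
    for start in range(0, len(image), sector_size):
        packed = zlib.compress(image[start:start + sector_size], 9)
        offsets.append(position)
        chunks.append(packed)
        position += len(packed)
    offsets.append(position)  # sentinel: end of the last sector
    return b"".join(chunks), offsets


def read_sector(payload: bytes, offsets, index: int) -> bytes:
    """Expand only the requested sector; nothing in front of it is touched."""
    return zlib.decompress(payload[offsets[index]:offsets[index + 1]])
```

The price is the offset table itself, which has to be stored somewhere in the compressed image.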
Since compression is not that good in single-block mode, I started playing with compressing 16 blocks together (32... 64... and so on). Going from 1 to 16 is a big gain in saved bytes, and there could also be a speed advantage, since in most cases more than one block is requested. On the other hand, whenever you want to access one single block you have to expand 16, 32... blocks, which could be a waste of resources; we have to find the best value for this.
Take a look at the linked result files, or play around with gziso (the analysis tool) to get an impression.
Here are some things I figured out so far:
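If you want to try the merged-block experiment yourself without gziso, something like this rough sketch should do. It assumes Python's zlib and bz2 modules in place of the gzip and bzip2 tools; grouped_size and compare_group_sizes are hypothetical helper names, and the numbers will differ from my result files:

```python
import bz2
import zlib

SECTOR_SIZE = 2048


def grouped_size(image: bytes, blocks_per_group: int, compress) -> int:
    """Total compressed size when blocks_per_group sectors are packed together."""
    group_bytes = blocks_per_group * SECTOR_SIZE
    return sum(
        len(compress(image[start:start + group_bytes]))
        for start in range(0, len(image), group_bytes)
    )


def compare_group_sizes(image: bytes, group_sizes=(1, 16, 32, 64)):
    """Return {group size: (deflate bytes, bzip2 bytes)} for one image."""
    return {
        n: (grouped_size(image, n, zlib.compress),
            grouped_size(image, n, bz2.compress))
        for n in group_sizes
    }
```

Feed it the raw bytes of an ISO, e.g. compare_group_sizes(open("image.iso", "rb").read()), where image.iso is just a placeholder file name.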
- the difference between gz and bz2 becomes bigger with a higher number of merged blocks
- single-block compression is really bad, but at 16-block compression you are already close to zip (with gz) and rar (with bz2)
- speed is no reason to prefer one over the other, except maybe when creating a compressed image; decompression is lightning fast in both cases
things to think about:
- a block (or a group of merged blocks) can be either compressed or uncompressed; maybe you should be able to set a threshold, like: if compression only saves 5%, store it uncompressed
- for the file format I'm thinking about something like png, jpg and so on, like: <tag><size><crc><...data...>, but only for the header, not for the image data. That has a big advantage for extending the format while staying backward compatible, trust me... I know what I'm talking about
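The threshold idea from the list above could look roughly like this. It is only a sketch: the flag values STORED and DEFLATED and the function names are invented here, zlib stands in for gzip, and 5% is just the example value from above:

```python
import zlib

STORED = 0    # hypothetical flag values, not taken from any existing format
DEFLATED = 1


def encode_group(group: bytes, min_saving: float = 0.05):
    """Return (flag, payload); keep the raw bytes if compression saves too little."""
    packed = zlib.compress(group, 9)
    saving = 1.0 - len(packed) / len(group) if group else 0.0
    if saving < min_saving:
        return STORED, group
    return DEFLATED, packed


def decode_group(flag: int, payload: bytes) -> bytes:
    """Undo encode_group."""
    return zlib.decompress(payload) if flag == DEFLATED else payload
```

Stored groups cost no time at all when reading, so the threshold trades a few percent of size for speed on data that is already compressed.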
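And here is a small sketch of the tagged header idea, following the <tag><size><crc><...data...> order from the list above. The 4-byte tags, the big-endian field sizes and the names pack_field and parse_header are all assumptions for illustration, not a finished format:

```python
import struct
import zlib


def pack_field(tag: bytes, data: bytes) -> bytes:
    """One header field: 4-byte tag, 4-byte size, CRC-32 of the data, then the data."""
    if len(tag) != 4:
        raise ValueError("tag must be exactly 4 bytes")
    return tag + struct.pack(">II", len(data), zlib.crc32(data)) + data


def parse_header(header: bytes, known_tags) -> dict:
    """Read all fields; unknown tags are skipped so old readers survive new fields."""
    fields = {}
    pos = 0
    while pos < len(header):
        tag = header[pos:pos + 4]
        size, crc = struct.unpack_from(">II", header, pos + 4)
        data = header[pos + 12:pos + 12 + size]
        if len(data) != size or zlib.crc32(data) != crc:
            raise ValueError("corrupt header field %r" % tag)
        if tag in known_tags:
            fields[tag] = data
        pos += 12 + size
    return fields
```

Because every field carries its own size, a reader that does not know a tag can simply jump over it, which is where the backward compatibility comes from.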
ok, enough for now... I'll answer every question, and thanks for every hint...
links:
http://home.attbi.com/~xrmb/gziso/compare1/compare1.html the results
http://home.attbi.com/~xrmb/gziso/bzip2.zip bzip2 compressor (win32 exe)
http://home.attbi.com/~xrmb/gziso/minigzip.zip gzip compressor (win32 exe)
http://home.attbi.com/~xrmb/gziso/gziso-v0.1.1-bin.zip gziso, the analyzer tool I wrote for details (win32 exe)
http://home.attbi.com/~xrmb/gziso/gziso-v0.1.1-src.zip the source code for curious people
http://home.attbi.com/~xrmb/gziso/gziso-results.zip detailed gziso result file from my test
http://home.attbi.com/~xrmb/gziso/compare1.sxc numbers and charts in a StarOffice spreadsheet
http://home.attbi.com/~xrmb/gziso/compare1.xls numbers and charts in an Excel spreadsheet
upcoming things:
- compare with NTFS compression
- make the source code ANSI C
- enhance gziso to write the images
- add an audio CD to the comparison
I hope you guys like the work I've done. OK... I'm a little bit upset about the weak compression of all the available algorithms :) But that's out of my hands...