Use better compression for Linux tarballs, for example tar.xz
.xz uses LZMA/LZMA2, just like 7-Zip, so it is quite standard these days and is already used for many Linux packages. Please support it.
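For instance, packaging and unpacking a tar.xz is a one-liner each (the file and directory names here are just placeholders):

tar -cJf game.tar.xz game/   # -J selects xz compression
tar -xJf game.tar.xz         # extraction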
Gzip is something from the last century (not to mention "millennium"!) and is good for exactly one thing: it compresses faster than a hard drive can write, so it is suitable for on-the-fly compression. Why GOG uses this outdated algorithm is beyond me...
LZMA (or .xz) is asymmetrical: extraction needs about a tenth of the resources that compression does, making it an ideal fit for our "compress once, extract many times" scenario.
P.S. Multi-volume archives would be nice, but the real problem is the lack of a patching system...
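The asymmetry is easy to see for yourself (the archive name is only an example):

time xz -k -9 game.tar    # compression: slow and memory-hungry
time xz -t game.tar.xz    # decompression-only integrity test: far quicker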
My proposal is to use SquashFS, which is mountable read-only and can be extracted just like a tarball. Not to mention: it is widely supported.
Compression methods available: GZip (default), XZ, LZMA, LZO, and LZ4.
Link: www.tldp.org/HOWTO/html_single/SquashFS-HOWTO/
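Creating such an image could look like this (game directory, image name, and block size are illustrative, and this assumes squashfs-tools was built with xz support):

mksquashfs Populous/ populous.sqsh -comp xz -b 128K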
Then you can run a game directly from the SquashFS file... well, as long as the game doesn't need to write to its own files when launching. And saving is out of the question unless you put a symlink inside the image, e.g. ../game-data/this-games-saves/ in your game directory, plus a bash launcher script that creates that directory for you.
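A minimal launcher along those lines (every path here is hypothetical):

#!/bin/bash
# the image contains a symlink saves -> ../game-data/this-games-saves/,
# i.e. a writable directory next to the mount point; create it first
mkdir -p /mnt/game-data/this-games-saves
# then start the game from the read-only image
exec /mnt/populous/start.sh "$@"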
Using SquashFS, I got the game Populous 3, a 200 MB game, down to 73 MB!
First I compressed the data directories with XZ and 128k blocks, and then compressed the whole Populous directory the same way.
Nothing else compresses that amazingly! And you can then mount and play... well, if you set up the mount points as symlinks first, of course, since you can't create mount points on a read-only file system.
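Mounting is then a single command (mount point and image name again illustrative; requires root or an fstab entry):

mkdir -p /mnt/populous
mount -t squashfs -o loop,ro populous.sqsh /mnt/populous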
Just to add my 2 cents about memory usage: why would anyone even mention memory usage during decompression when the game itself will use a lot more than that? It's a pointless argument if you ask me.
If download times are crazy long, no one will care about memory and decompression times. And for big games it can make a major difference. I won't be playing those huge games on weak machines anyway (i.e. ones that lack RAM and CPU power).
It would also be convenient to split the compressed file, as they do for the Windows version. The Witcher 2 for Windows has 13 parts, most of them 1.5 GB, while the Linux and Mac versions each have a single large file of ~20 GB.
I am against this. You can always find an algorithm with a higher compression ratio, but that usually comes at the price of increased compression/decompression times and higher memory usage. So this actually sounds like a really bad idea to me. It's similar to why people still use zip instead of the alternatives on Windows: a standard compression format with a pretty solid compression ratio and acceptable compression/decompression speeds.
The xz 5.2.0 release now includes parallel (multithreaded) compression:
http://git.tukaani.org/?p=xz.git;a=blob;f=NEWS;hb=HEAD
So GOG should use that.
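Usage would be straightforward (archive name is an example; note that in 5.2 only compression is parallelized):

xz -T0 -9 -k witcher2.tar    # -T0 uses one thread per available core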
@eiii: The RAM required to decompress an archive compressed with "xz -9" is 65 MB; the man page even mentions systems with 16 MB of RAM total.
Even if parallel decompression with 8 threads were used, that would only be around 8 x 65 = 520 MB. Any gaming rig should have more than enough RAM available; if not, fewer threads can be used.
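You can verify the figure for a concrete archive yourself; the verbose list mode should report it (file name is an example):

xz --list -v witcher2.tar.xz    # look for the "Memory needed" line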
When compressing individual small files (e.g. man pages), bzip2 is a bit better: using the man1 directory on my system, with 4422 files, "bzip2 -9" saved 221 KB compared to "xz -6".
When compressing large tar files, xz is better (even if only by a few megabytes).
That said, "xz -6" (for large files) still usually gives better compression than bzip2.
xz also takes less time to decompress than bzip2 (single-threaded comparison).
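If anyone wants to reproduce the small-file comparison, something along these lines works (the directory name assumes a writable copy of man1):

for f in man1-copy/*; do xz -6 -k "$f"; bzip2 -9 -k "$f"; done
du -ch man1-copy/*.xz | tail -n 1    # total size with xz
du -ch man1-copy/*.bz2 | tail -n 1   # total size with bzip2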
It would save a bit of bandwidth...
The main point for me is to reduce the size of the game archive, so +1. But please be careful with increasing the xz compression level, as it also increases the memory REQUIRED to decompress the archive. From the xz man page: "it's not a good idea to blindly use -9 for everything". Also, in some cases bzip2 has better compression ratios than xz, so some tests may be helpful before deciding on compression settings.
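For what it's worth, xz can also be given a hard memory cap; with a limit set, single-threaded decompression refuses to run rather than exceed it (the limit value is only an example):

xz -d --memlimit=100MiB witcher2.tar.xz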
@Ganni1987: Just a small caution: using DVDs for backups can be risky. Their lifetime is limited (the optical layer disintegrates over time even if you don't use the disc). A better idea is to use hard drives for backup (and if you worry about hardware failure, use a RAID). That's a more robust method for long-term storage.
I don't mind a longer decompression time; smaller downloads mean less waiting for me, and I can fit more games on one DVD-R, so I'm in favor of this.
To clarify, 70 Mbit/s is not an absolute value. It's relative to the dataset size in this example.
@bernstein82: Also, if smarter parallel decompression were used, it would save time overall. So far all those tools decompress in a single thread, which is why LZMA is slow.
Just my two cents, but since it saves time for average download speeds below 70 Mbit/s, it's a safe bet that most people will benefit from LZMA compression. Plus those who prefer archiving smaller files. While things might change over time (e.g. download speeds getting faster and CPUs getting faster), I doubt the balance will ever heavily favor less compression.
@boltronics: Thanks for pointing that out. I knew they had plans for it, but so far it hasn't made it into a release or the repos. By the way, does it support multithreaded decompression? The long decompression time is a downside.
@shmeri Check out the 5.1.3alpha build of xz. tukaani.org/xz/
It has multi-threading support (use the -T 0 switch) and is very fast.
Ha, I knew something was off (it's late here). I took the wrong time for the delta. Attempt #2 :)
Actual decompression delta = 10:01 = 601 sec.
Then:
X > 4.96 * 2^33 / (601 * 10^9) ≈ 0.07 Gb/s ≈ 70.89 Mb/s
That's more like it. With more than 70.89 Mb/s, gzip saves time; with less, 7z does.
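For anyone double-checking, the break-even bandwidth can be computed directly with bc (numbers from above; swap in 1063 to redo the first attempt):

echo "4.96 * 2^33 / (601 * 10^9)" | bc -l    # ≈ 0.0709 Gb/s ≈ 70.89 Mb/s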
The size reduction is very significant, but the decompression time is considerably longer as well (16-17 min vs 6 min). So one tradeoff is download time vs decompression time for end users. Those with slower connections will benefit more from xz/7z, and those with faster ones would save time with gzip.
Decompression delta between gzip and 7z for TW2: 17:43 = 1063 seconds.
Size delta between gzip and 7z: 4.96 GB
Let's say the user has a download bandwidth of X Gb/s (that's X * 10^9 bits/sec, not X * 2^30).
Then the difference in download time would be d = 4.96 * 2^33 / (X * 10^9) seconds.
When d < 1063 seconds, gzip saves time (download + decompression). Otherwise 7z saves time.
4.96 * 2^33 / (X * 10^9) < 1063
X > 4.96 * 2^33 / (1063 * 10^9) ≈ 0.04 Gb/s ≈ 40.08 Mb/s
I.e. for download bandwidth roughly more than 40 Mbps, gzip saves time. Otherwise 7z would be overall faster.
Of course this doesn't take the storage trade-off into account; some might prefer to save disk space despite the time difference :) (I hope I didn't make a mistake in the calculations ;)
Some more benchmarks:
Original (Windows version): witcher2.tar (22.8 GB)
== compression command ==
gzip: gzip -v -k -9 witcher2.tar
pxz: pxz -v -k -T 8 -9 witcher2.tar
7z: 7z a -m0=lzma2 -mx=9 -mmt=8 witcher2.tar.7z witcher2.tar
== result ==
gzip: 19.96 GB (witcher2.tar.gz)
pxz: 15.06 GB (witcher2.tar.xz)
7z: 15 GB (witcher2.tar.7z)
== size reduction ==
gzip: 12.3%
pxz: 33.83%
7z: 34.08%
== compression time ==
gzip: 17:57
pxz: 44:27
7z: 35:40
== decompression time ==
gzip: 6:05
pxz: 17:13
xz: 16:56
7z: 16:06
== Notes ==
8 threads were used for compression with pxz and 7z.
xz was decompressing the same file as pxz.
Hardware: Intel Core i7 4770 (3.40 GHz, 4 cores, hyperthreading), 16 GB RAM.
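The decompression commands aren't listed above; they would have been something along these lines (with the extracted tar removed between runs):

time gunzip -k witcher2.tar.gz
time xz -d -k witcher2.tar.xz
time 7z x witcher2.tar.7z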
Correction: there is a pxz tool which implements LZMA compression compatible with xz, but in a parallel fashion! So the time can still be reduced.
A pity xz doesn't support multithreading, so it's more time-consuming to compress/decompress, but the traffic saved is huge.
4+ GB saved is no small feat.
Yeah, and it's better to use -9 with xz, which compresses even more.
I compressed The Witcher 2 in both tar.gz and tar.xz:
Original size: 20 GB
tar.gz: 18.2 GB (took 20 minutes)
tar.xz: 13.9 GB (took 2 hours 27 min)
That’s with the default compression settings, so I think -6 for xz.
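With default settings those would have been plain tar one-liners, presumably (the directory name is a guess):

tar -czf witcher2.tar.gz witcher2/   # gzip, default level 6
tar -cJf witcher2.tar.xz witcher2/   # xz, default preset 6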
That depends. Not all games do a good job of compressing their own resources, so there is no point in using inferior compression for the package.
Not sure how much reduction you can realistically get. Most of the size consists of videos, which are already compressed.
Agreed. Certainly not a big deal, but it saves you guys and us bandwidth, so it's a win-win.
Yep, that would be nice. Voted.