
SuperREP: huge-dictionary LZ77 preprocessor

Description

SuperREP is the first LZ77 compressor that supports dictionaries larger than the available RAM. The default settings (-l512) allow processing files about 20x larger than the RAM size. Memory requirements are proportional to 1/L, so by increasing the -l value it's possible to process even larger files. Compression runs at about 100 mb/s and decompression at about 250 mb/s on an i3-2100.
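
For example (an illustrative sketch using the "srep file" / "srep file.srep" syntax described in the History section; the file names and the -l value are arbitrary):

srep disk.img            compress with the default -l512, producing disk.img.srep
srep -l4096 disk.img     8x larger -l, so roughly 8x less memory, allowing ~8x larger files
srep disk.img.srep       decompress disk.img.srep back to disk.img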


LZ77 algorithm

The -m1..-m3 compression modes implement the LZ77 compression algorithm.

-m1: the input file is split into chunks of L bytes (specified by the -c option, 512 bytes by default). For every chunk, the program stores a SHA-1 hash. When it later encounters an L-byte chunk with the same SHA-1 value, it replaces the new chunk with a reference to the previous one, assuming that the chunks are equal.

-m2: same as -m1, but only a weak hash is stored for every chunk. When the program encounters a chunk with the same hash value, it rereads the old chunk from the input file in order to compare the actual data.

-m3: same as -m2, but the program also compares the bytes before and after equal chunks in order to extend the match as much as possible. The -l option may be used to specify the minimum match length.
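
As an illustration, the three modes might be invoked like this (file names are placeholders; -c512 merely restates the default, and -l256 is an arbitrary value):

srep -m1 -c512 input.dat output.srep     SHA-1 digests only; matches are whole 512-byte chunks
srep -m2 input.dat output.srep           weak hashes, verified by rereading the input file
srep -m3 -l256 input.dat output.srep     byte-exact matches of at least 256 bytes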

On decompression, every squeezed chunk is restored by reading the contents of the previous equal chunk from the output file. Likewise, the compression algorithm in -m2/-m3 modes rereads earlier chunks in order to compare them with the current data. This puts a heavy load on the OS I/O subsystem and disk cache.

The algorithm requires a seekable input file when compressing (except in -m1 mode) and a seekable output file when decompressing. If that's not the case, you may use the -temp option to tell the program to create a temporary file storing a copy of all uncompressed data.
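
For example, to run -m3 between pipes (a sketch; -temp=FILENAME is the documented form, the file names are placeholders):

producer | srep -m3 -temp=copy.tmp - - > archive.srep
srep -d -temp=copy.tmp - - < archive.srep | consumer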

When compressing, the algorithm also needs to know the size of the input file in advance. If the program cannot determine the file size (e.g. when compressing from stdin), it should be supplied via the -s option. Values larger than the actual file size work as well; by default, 25 gb is assumed.
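
For example, when compressing from stdin (a sketch; only the k/kb/m/mb size suffixes are documented, so the upper bound is given in mb here):

producer | srep -m3 -s4000mb - archive.srep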

Future-LZ algorithm

The -m1f..-m3f compression modes implement the Future-LZ compression algorithm.

Future-LZ is a modification of LZ77 that stores matches at the match source rather than at the destination position. For a compressor like SREP that uses only long matches, this decreases the amount of data the decompressor must keep, and therefore the amount of memory required for decompression. In my tests, decompression required RAM equal to about 10% of the file size. Moreover, since we know the order of access to the stored data, it may be swapped from RAM to a disk file without losing efficiency, so decompression may be performed using only about 100 mb of RAM.

Future-LZ compression is performed in two passes: in the first pass, matches are found and saved in memory. The matches are then sorted by their source address, and a second pass stores each match in the compressed file at the position of its source. Thus, Future-LZ compression requires a seekable input file at the compression stage. Decompression accesses both the input and output files sequentially and is therefore very fast.
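
For example (a sketch with placeholder file names):

srep -m3f input.dat output.srep     two-pass compression; input.dat must be seekable
srep -d output.srep restored.dat    decompression reads and writes sequentially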

Interpreting decompression stats for Future-LZ mode


Let's consider the following line:

Matches 1073 3119 9752, I/Os 0, RAM 346/1024, VM 664/984, R/W 824/1488

The first 3 numbers denote the current, maximum-so-far, and total number of matches in the dictionary.

I/Os is the number of long matches that were copied directly from the decompressed file (-m option).

The other numbers mean that there are 1024 MiB of RAM allocated, of which 346 MiB are currently in use (memory is never returned to the OS).
The VM file is 984 MiB long, of which 664 MiB are in use now.
1488 MiB were written to the VM file, of which 824 MiB were already read back.
The equation VM.current = VM.W - VM.R always holds; here, 664 = 1488 - 824.

At the end of decompression, we get something like this:

RAM 0/983, VM 0/1000, R/W 1560/1560

i.e. two zeros and R=W, while the rest of the numbers show how much memory/disk was used and how much data were written to (and read back from) the VM file.

Also, the sum of the RAM and VM sizes required to decompress a given file should be constant. Actually, it grows slightly as -mem is decreased, due to inefficiencies in memory management.

Please also note that the -mem option limits the total amount of RAM used, including 40 mb for I/O buffers, while the RAM value in the stats shows only the memory used for match data.

Memory usage for compression

When compressing, memory usage is:
-m1:   20*filesize/L + 4*filesize/L + roundup(5*filesize/L) + roundup(A*filesize/L) + 48 mb
-m2/3: 4*filesize/L + roundup(5*filesize/L) + roundup(A*filesize/L) + 48 mb
where roundup() rounds its argument up to the next power of 2, L is the value of the -c option, and A is the value of the -a option. So, overall, memory is allocated in 4-5 chunks.
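
As an illustrative calculation, take a 100 gb input with the default -c512 and the -a4 value listed as part of the default mode in the 3.1 notes below (gb/mb treated as binary units):

4*filesize/L          = 4*100gb/512      = 800 mb
roundup(5*filesize/L) = roundup(1000 mb) = 1024 mb
roundup(A*filesize/L) = roundup(800 mb)  = 1024 mb
-m2/3 total           = 800+1024+1024+48 = 2896 mb, about 3% of the filesize

-m1 additionally needs 20*filesize/L = 4000 mb for the SHA-1 digests, for a total of 6896 mb, about 7% of the filesize. Both figures agree with the 2-3% and 6-7% estimates in the version 1.5 notes below.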

Memory usage for decompression

Decompression of files compressed with LZ77 requires only 24 mb for I/O buffers and no hash. Repeated data are copied directly from the output file, though, so you need to have enough RAM available for the disk cache in order to make decompression fast.

Decompression of files compressed with Future-LZ prints the amount of memory required (it depends on many factors). You may limit the amount of RAM used for decompression with the -mem option:
-mem75% means "use no more than 75% of RAM" - that's the default setting
-mem75p means the same
-mem600mb means "use no more than 600 mb"
-mem75%-600mb means "use no more than 75% of RAM minus 600 mb"
Data that doesn't fit into RAM will be placed into the file specified by the -vmfile option.
You may also decrease the amount of RAM required by using the -m option to copy longer matches directly from the output file.
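
For example, a Future-LZ decompression command combining these options might look like this (a sketch; the argument form for -vmfile and the plain-byte -m value are assumptions based on the option descriptions above):

srep -d -mem75%-600mb -vmfile vm.tmp -m1048576 archive.srep output.dat

Here matches longer than 1 mb are copied directly from output.dat, reducing what must be kept in RAM or in the VM file.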

Temporary file

The temporary file is created automatically when necessary. You may disable its usage with the "-temp=" option without a parameter.

How to set up FreeArc to use SREP in filter mode

In order to use SREP as an external compressor in FreeArc, add one of the following sections to your arc.ini:

This section is optimized for srep:f (Future-LZ) compression, so compressed data are sent immediately to stdout:

[External compressor:srep]
packcmd   = srep    {options} $$arcdatafile$$.tmp -         <stdout>
unpackcmd = srep -d {options} -                   - <stdin> <stdout>

This section is optimized for LZ77 compression:

[External compressor:srep]
packcmd   = srep    {options} $$arcdatafile$$.tmp $$arcpackedfile$$.tmp
unpackcmd = srep -d {options} - - <stdin> <stdout>

Some explanations of the arc.ini settings:

Compression:

1. SREP needs to know the input file size, since it allocates memory proportional to the file size at the beginning of compression. If it doesn't know the file size, 25 gb is assumed (-s option), so it allocates a lot of memory (1-2 gb).

That's why I don't use <stdin> here - if freearc first writes the data to $$arcdatafile$$.tmp, then srep will know its size and allocate just the amount of memory required. If you can afford allocating 1-2 gb of memory (and you never compress more than 25 gb with srep), then you can use <stdin> mode, which will make the process a bit faster (20-40 seconds per gb).

2. Except in plain -m1 mode, srep needs to read the input data twice: in -m2/-m3 modes it verifies matches by rereading the input file, and in -f modes it performs 2 passes - the first to find matches, the second to encode them. SREP automatically uses a tempfile to store a copy of the input data when the data are read from stdin and will be needed again (i.e. in all modes except -m1), so it will work anyway, but not much faster than the $$arcdatafile$$.tmp mode.

3. If srep writes data to stdout, the next compression algorithm may run in parallel. This means that both srep and lzma will use memory at the same time, so total memory usage will increase. Some people don't have enough RAM for such usage. That's why my default settings for the LZ77 modes don't use stdin/stdout.

4. With Future-LZ, we perform 2 passes: 1) read the input data and find matches, using a lot of memory; 2) read the data again and produce the compressed file, using a minimum of memory. So the second pass can be performed simultaneously with lzma, and that's why I recommend the <stdout> mode with Future-LZ.


Decompression:

1. If the data was compressed without -f, srep will copy matches from the output file. If the output data are written to stdout, a duplicate tempfile will be created.

2. With -f, srep allocates the amount of memory specified by the -mem option, leaving less memory for lzma decompression.

In both cases, decompression from stdin to stdout will usually work better than using files.



History/Downloads

Each download contains sources and Win32, Win64, Linux-i386, and Linux-x64 executables.

3.2 (Apr 6, 2013) - srep32.zip
  • -m2 -lN is now the same as -m3 -lN -cN: the compression ratio falls between -m1 and -m3, while speed is the same as in old versions
  • -a0: the same compression ratio as -a1, memory usage smaller by 5-10%, but 1.5-2x slower
  • -a32/-a64: sometimes faster than -a16 (only with large pages), but needs even more memory
  • -slp[+/-/]: force/disable/try(default) large pages support
  • -v[0..2]: verbosity level
  • -pcMAX_OFFSET: print performance counters for matches closer than MAX_OFFSET
  • -l64k/-c1mb syntax support (k/m/kb/mb suffixes for kilobytes/megabytes)
  • Both 32-bit and 64-bit default executables are compiled with GCC 4.7
  • 32/64-bit dynamic/static linux builds
3.1 (Feb 23, 2013) - srep31.zip
  • -m1f -a4 is now the default compression mode, for quick and dirty compression. Use -m3f -a1 for maximum compression
  • 32-bit version became 1.5x faster than in SREP 3.0, but still up to 1.5x slower than 64-bit code
  • gcc64 version: srep64g.exe; srep32i/srep64i are still the fastest executables
  • displays the CPU time spent in *ALL* threads; speeds are measured in MiB/s
  • -pc option displays internal performance counters
3.0 (Jan 30, 2012) - srep30.zip
  • -m1f..-m3f: Future-LZ compression; -m3f is now the default mode
  • -mem: limit amount of RAM used for Future-LZ decompression
  • -vmfile and -vmblock options fine-tune VM file used in -mem mode
  • -mBYTES: copy matches longer than BYTES directly from outfile
  • 3x faster compression: I/O and MD5/SHA1 tasks were offloaded into a separate thread
  • unrolled internal loop; the unrolling factor is controlled by the -a option, -a1 being the slowest but requiring the least memory
  • -nomd5: don't store/check MD5 checksum of every block
  • -mmap: use memory-mapped file for match checking by I/O
  • when necessary, temporary file is created automatically
  • made stderr always unbuffered (useful for GUIs around srep.exe providing progress indicator)
  • "srep" and "srep -d" commands now work as a filter if stdin and stdout are redirected
  • 64-bit version now can use >4 GB of RAM
  • fixed bug when compressing data from pipe (i.e. producer | srep)
2.0 (Feb 15, 2011) - srep20.zip
  • -m3: new default compression mode that finds byte-exact matches; so srep:m3 outperforms rep+srep:m2
  • -temp=FILENAME option that allows using stdin-to-stdout mode without any restrictions (all external data required for compression/decompression are stored in this file)
  • -c option to explicitly specify hash chunk size
  • -s option to specify size of input data
  • "srep file" and "srep file.srep" syntax now supported for compression and decompression respectively, simplifying program usage and allowing to just drag-n-drop file to executable's icon in order to compress or decompress it
  • on disk overflow (or other write error), program displays message, deletes outfiles and returns dos error code
  • compression memory usage was reduced by 8 mb
1.5 (May 11, 2010) - srep15.zip
  • -m1: old method (compression memory = 6-7% of filesize, check matches by SHA1 digest)
  • -m2: new, default method (compression memory = 2-3% of filesize, check matches by rereading old data)
  • -index option - keep index of compressed data in separate file in order to improve compression ratio
  • 64-bit executable that's still 100% compatible but faster than 32-bit one
1.0 (Dec 15, 2009) - srep10.zip
  • -delete option that deletes the source file after successful (de)compression
  • checking of -l value
0.8 (Nov 24, 2009) - srep08.zip
  • better compression due to improved hashing and compressed format
  • faster compression on files <1gb
  • MD5 integrity checking on decompressed data
  • the first 8 bytes of a compressed file now contain the SREP signature, helping programs like unix magic
  • exit code == 0 on success
0.7 (Nov 23, 2009) - srep07.zip
  • reduced memory usage to 6-8% of filesize. For example, a 24gb file needs 256+256+960 mb memory chunks
  • the hash now keeps the address of the last chunk with the same contents
  • hashing improved a little
  • fixed WinXP crashing bug
0.6 (Nov 23, 2009) - srep06.zip
  • fixed the 64-bit version; it now properly handles files >2gb
  • fixed decompression with non-default -l
  • -s prints stats after each block
0.5 (Nov 23, 2009) - srep05.zip
  • first version that was able to compress and extract data
