Linux users and system administrators inevitably deal with file management routines. As the Linux system, programs, and user files grow from megabytes to gigabytes, sooner or later you will need to compress some of those files.
The advantages of zipping or compressing these files are as follows:
- You get to save and create extra storage space on your Linux machine.
- With more free storage space, your system is less likely to run into the problems caused by a full disk.
- Compressed files are easier to transfer to other machine environments and systems.
- A compressed file is a single unit that is convenient to encrypt with a separate tool. Note that Gzip itself compresses data but does not encrypt it.
Most Linux users are familiar with Gzip as an effective means of compressing files. It produces files with the .gz extension and comes pre-installed on almost all Linux distributions.
You can check for its availability on your Linux system with the following command:
$ gzip --version
How to Compress File Using Gzip in Linux
To compress a simple file with Gzip, you only need to run a command similar to the following:
$ gzip linuxshelltips_v2.txt
Run the command from the directory that contains the file, or pass the full path to the file instead. By default, gzip replaces the original file with a compressed copy named linuxshelltips_v2.txt.gz.
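As a minimal sketch of the same idea, the commands below compress a throwaway file while keeping the original, then compare the two sizes. The file name sample.txt and its contents are made up for illustration, and the -c option is used so that gzip writes to standard output and leaves the input untouched.

```shell
# Work in a scratch directory so nothing important is touched.
workdir=$(mktemp -d)
cd "$workdir"

# Build a hypothetical, highly repetitive sample file (a little over 1MB).
yes "linuxshelltips sample line" | head -n 50000 > sample.txt

# -c writes the compressed data to stdout, so sample.txt stays in place.
gzip -c sample.txt > sample.txt.gz

# Compare the sizes in bytes and check the integrity of the compressed file.
orig_size=$(wc -c < sample.txt)
gz_size=$(wc -c < sample.txt.gz)
echo "original: $orig_size bytes, compressed: $gz_size bytes"
gzip -t sample.txt.gz && echo "compressed file OK"
```

Repetitive text like this shrinks dramatically. Real-world files, especially ones that are already compressed, usually shrink far less.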
How to Compress Large Files Using Gzip in Linux
Using Gzip for compressing files on Linux is fast and efficient until you start dealing with large files. Compressing 100GB+ files through Gzip can take hours, especially on modest hardware such as a Core (TM) i3-2350M CPU @ 2.30GHz. What if you do not have this much time on your hands?
Gzip is single-threaded, meaning it works through one file compression job at a time. It also compresses individual files only: given a directory, it skips it unless you pass the -r option, which compresses each file inside the directory separately rather than producing a single archive.
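A common workaround for directories is to let tar bundle the directory into one stream and have gzip compress that stream. The sketch below assumes a hypothetical directory named project created just for the demonstration.

```shell
# Scratch directory with a small, made-up directory tree.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p project/docs
echo "notes" > project/docs/notes.txt
echo "readme" > project/README

# tar bundles the tree into a single stream; gzip compresses that stream.
tar -cf - project | gzip > project.tar.gz

# List the archive's contents without unpacking it to disk.
gzip -dc project.tar.gz | tar -tf -
```

Because the output is an ordinary .gz stream, gzip in the pipeline can be swapped for any gzip-compatible compressor.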
Despite such limitations, gzip-compressed files do not need to be extracted to disk before they are read: tools such as zcat and zless decompress them on the fly. However, Gzip’s dependence on a single processor core makes large file compression take hours to complete.
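To illustrate reading a compressed file without extracting it, here is a small sketch that streams the decompressed data to standard output with gzip -dc. The file words.txt is a made-up example.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Create and compress a tiny example file; gzip replaces it with words.txt.gz.
printf 'alpha\nbeta\ngamma\n' > words.txt
gzip words.txt

# -d decompresses and -c sends the result to stdout, so nothing is written to disk.
gzip -dc words.txt.gz | head -n 2

# The stream can feed any other tool, for example to count lines.
line_count=$(gzip -dc words.txt.gz | wc -l)
echo "lines: $line_count"
```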
Gzip Alternatives for High Compression in Linux
Although this article’s title is about speedily zipping large files with high compression through Gzip, we had to cover Gzip’s drawbacks first so that you can appreciate the better alternatives to it.
Pigz – Compress 100GB+ Files with High Compression
Unlike Gzip, which is single-core oriented, Pigz is multi-core oriented and will also compress your targeted files to a “.gz” file extension.
It has the advantage of improved compression time when dealing with large file sizes like 100GB+. Think of Pigz as a multi-threaded version of Gzip.
Install Pigz in Linux
Use the Pigz installation command that matches your Linux operating system distribution.
$ sudo apt-get install pigz [On Debian, Ubuntu and Mint]
$ sudo yum install pigz [On RHEL/CentOS/Fedora and Rocky Linux/AlmaLinux]
$ sudo emerge -a app-arch/pigz [On Gentoo Linux]
$ sudo pacman -S pigz [On Arch Linux]
$ sudo zypper install pigz [On OpenSUSE]
Compressing 100GB+ Files with Pigz
Consider the following 100GB+ file statistics:
$ stat LinuxShellTipsBackup.iso
Since this file (LinuxShellTipsBackup.iso) meets the criterion of being 100GB+, i.e. 164GB, we should try compressing it with Pigz.
The command usage should be similar to Gzip’s.
$ pigz -9 -k -p4 LinuxShellTipsBackup.iso
The command options:
- -9: Provides the best compression (High compression).
- -k: Retains the original file.
- -p4: Tells Pigz to use 4 compression threads. Without this option, Pigz uses all the processor cores it detects.
More processor cores make the compression process faster. The number you choose should not exceed the number of cores your machine actually has, which you can check with the nproc command.
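Rather than hard-coding the thread count, you could let the shell work it out. The following is a rough sketch, assuming a hypothetical stand-in file named backup.img. It falls back to plain gzip when pigz is not installed, so the commands still run on any machine.

```shell
# nproc (GNU coreutils) reports the available cores; getconf is a fallback.
threads=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)
echo "using $threads thread(s)"

workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical stand-in for a large file.
yes "backup data line" | head -n 200000 > backup.img

if command -v pigz >/dev/null 2>&1; then
    # Same options as above, with the thread count taken from the machine.
    pigz -9 -k -p "$threads" backup.img
else
    echo "pigz not found; falling back to single-threaded gzip" >&2
    gzip -9 -c backup.img > backup.img.gz
fi
```

Either branch leaves backup.img in place and produces backup.img.gz next to it.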
The compressed file is 156GB, down from the original 164GB. If we decide not to keep the original file, we will have 8GB (164GB - 156GB) of extra storage.
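The saving can also be expressed as a percentage of the original size. The short calculation below uses the two sizes quoted above. The variable names are only illustrative.

```shell
orig_gb=164   # size of LinuxShellTipsBackup.iso in GB
comp_gb=156   # size of LinuxShellTipsBackup.iso.gz in GB

# Whole-number subtraction in the shell, percentage via awk for the decimals.
saved_gb=$((orig_gb - comp_gb))
saved_pct=$(awk -v o="$orig_gb" -v c="$comp_gb" 'BEGIN { printf "%.1f", (o - c) / o * 100 }')

echo "saved ${saved_gb}GB (${saved_pct}% of the original size)"
# prints: saved 8GB (4.9% of the original size)
```

A saving of under 5% is modest. A likely explanation is that disk images often hold data that is already compressed, which leaves little for any gzip-style compressor to remove.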
Another good thing is that you can still read the contents of the compressed file without extracting it to disk first, for example by streaming it with pigz -dc or zcat.
$ stat LinuxShellTipsBackup.iso.gz
To decompress or extract your files, use either of the following commands:
$ unpigz LinuxShellTipsBackup.iso.gz
$ pigz -d LinuxShellTipsBackup.iso.gz
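Before deleting an original, it can be reassuring to confirm that a compress and decompress round trip returns identical data. Here is a minimal sketch of that check, assuming a made-up file data.bin and the sha256sum tool from GNU coreutils. It uses pigz when available and gzip otherwise, since both read and write the same .gz format.

```shell
workdir=$(mktemp -d)
cd "$workdir"
yes "round trip check" | head -n 100000 > data.bin

# Record a checksum of the original contents.
before=$(sha256sum data.bin | cut -d' ' -f1)

# Prefer pigz when it is installed; gzip handles the same format.
tool=gzip
command -v pigz >/dev/null 2>&1 && tool=pigz

"$tool" data.bin          # produces data.bin.gz and removes data.bin
"$tool" -t data.bin.gz    # tests the integrity of the compressed file
"$tool" -d data.bin.gz    # restores data.bin

after=$(sha256sum data.bin | cut -d' ' -f1)
[ "$before" = "$after" ] && echo "round trip OK"
```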
Pigz vs Gzip Compression Speed Comparison
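Let us compare their compression speed with a slightly smaller file. The -k option keeps file.mp4 in place so that both tools compress the same input, and the first output file is removed before the second run.
$ time pigz -k file.mp4
$ rm file.mp4.gz
$ time gzip -k file.mp4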
Pigz wins with faster compression time even without specifying the number of processor cores to use, because it uses all available cores by default.
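If you want to repeat the comparison on your own machine, something along these lines could work. It is only a sketch: the sample.dat file and the elapsed_ms helper are made up for this example, the timing relies on the %N (nanoseconds) format of GNU date, and the pigz run is skipped when pigz is not installed.

```shell
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical test file of roughly 20MB.
yes "benchmark filler line" | head -n 1000000 > sample.dat

# Print the wall-clock milliseconds a tool needs to compress a file.
# The output goes to /dev/null so the input file is left untouched.
elapsed_ms() {
    start=$(date +%s%N)
    "$1" -c "$2" > /dev/null
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

gzip_ms=$(elapsed_ms gzip sample.dat)
echo "gzip: ${gzip_ms} ms"

if command -v pigz >/dev/null 2>&1; then
    pigz_ms=$(elapsed_ms pigz sample.dat)
    echo "pigz: ${pigz_ms} ms"
else
    echo "pigz is not installed; skipping"
fi
```

On a file this small the gap may be narrow. The difference should grow with the file size and the number of available cores.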
The notable trick to zipping large files (100GB+) is to make sure the compression tool you are using supports multi-core or multi-threaded processing. Such programs (e.g. Pigz) reduce the bottleneck associated with compressing large files.