Linux - General
This Linux forum is for general Linux questions and discussion.
If it is Linux Related and doesn't seem to fit in any other forum then this is the place.
Suppose I want to back up the entire contents of an HDD.
Method 1: Copy everything over to a second HDD. Will take up same space.
Method 2: Make a backup image which will potentially be smaller.
Question: What kind of size savings can be expected from the backup image method (method 2), and what are the cons compared to method 1?
Compression typically yields a ratio anywhere from 1:1 (no savings) to 2:1.
Movies, JPG pictures and music are virtually incompressible. They are already compressed and hold little redundancy that could be removed to compress them any further. OTOH really large plain text files (like log files) can compress up to 10:1.
So it is hard to say, and it depends heavily on the actual data. But today's bulk data is mostly multimedia, which is already in a compressed format.
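To make the difference concrete, here is a minimal Python sketch using the standard zlib module. The sample log line is made up, and random bytes are only a stand-in for already-compressed media, so treat the numbers as illustrative. One identical line repeated thousands of times also compresses far better than a real log would.

```python
import os
import zlib


def compression_ratio(data: bytes, level: int = 6) -> float:
    """Return original_size / compressed_size for the given bytes."""
    compressed = zlib.compress(data, level)
    return len(data) / len(compressed)


# Repetitive text stands in for a log file (hypothetical sample line).
log_like = b"Jan 01 00:00:00 host daemon[123]: connection accepted\n" * 2000

# Random bytes stand in for already-compressed media such as JPG or MP3.
media_like = os.urandom(len(log_like))

print(f"log-like data:   {compression_ratio(log_like):.1f}:1")
print(f"media-like data: {compression_ratio(media_like):.2f}:1")
```

The media-like ratio should come out at roughly 1:1, which is the point: the compressor finds nothing left to remove.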
The big disadvantage of compressed backups is that they are usually monolithic. If there is one error in the compressed file you can lose everything. The second disadvantage is that it is harder to find a particular file and restore it.
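A small Python sketch of the monolithic-file problem, assuming a single zlib stream as a stand-in for a compressed backup image (the payload is made up). It damages one byte in the middle of the stream and checks whether a normal decompress still works.

```python
import zlib


def survives_decompression(blob: bytes) -> bool:
    """Return True if the zlib stream decompresses without error."""
    try:
        zlib.decompress(blob)
        return True
    except zlib.error:
        return False


# Hypothetical payload standing in for a backup image.
payload = b"".join(b"record %d\n" % i for i in range(5000))
archive = zlib.compress(payload)

# Flip every bit of one byte in the middle of the compressed stream.
damaged = bytearray(archive)
damaged[len(damaged) // 2] ^= 0xFF

print("intact archive readable: ", survives_decompression(archive))
print("damaged archive readable:", survives_decompression(bytes(damaged)))
```

With dedicated recovery tools the data before the damaged point can sometimes still be salvaged, but an ordinary restore of the whole stream simply fails.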
I use rsync to back up to external hard disks or remote servers. But that is for data up to 2 TB. If you have really big data that might not suffice. But I doubt you would want to use compressed backups for data larger than 2 TB anyway.
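For reference, a rough Python sketch of how such an rsync call could be assembled. The paths are hypothetical, and the flags (-a for archive mode, --dry-run to preview without copying) are an assumption, not necessarily what I run myself. The sketch only builds and prints the command; it does not execute anything.

```python
import shlex


def build_rsync_command(source: str, destination: str, dry_run: bool = True) -> list[str]:
    """Assemble an rsync command line as a list of arguments."""
    command = ["rsync", "-a"]  # -a: archive mode (recursive, keeps permissions and times)
    if dry_run:
        command.append("--dry-run")  # show what would be copied, change nothing
    # A trailing slash on the source means "the contents of this directory".
    command += [source.rstrip("/") + "/", destination]
    return command


# Hypothetical source and destination paths.
print(shlex.join(build_rsync_command("/home/user", "/mnt/backup/home")))
```

Passing the resulting list to subprocess.run would perform the copy; leaving dry_run on first is a cheap way to check the paths.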
Well, what do you want to copy? Just normal files, or the whole format and structure of that hard drive? That makes a bit of a difference.
Plus, what types of files? Binaries, images, sound files and videos are often already compressed to some degree, whereas simple documents or a bunch of code files are highly compressible. A JPEG image is already compressed, and so is a GIF (losslessly), whereas an uncompressed BMP still has room to shrink. There are a ton of movie formats, but the common ones (WMV, FLV, MP4, MPEG) are all compressed already, so don't expect much there. So, before you go crazy with one big zip, perhaps you ought to examine your data and determine whether or not compressing makes any sense.
Further, compressing all the data means you have to decompress it someday. If you make one big compressed file representing all your data, you'll need space to decompress that file in the future, along with memory and time. I'm assuming that you can open the compressed file and pick what you wish to extract; however, if there are too many files, it will take memory just to show you the listing and time to locate what you wish to extract. That cuts both ways though: any large body of data needs to be organized in some fashion so you can reference things, and it will be organized the same way within the zip file.
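The "space to decompress" and "pick what you wish to extract" points can be sketched in Python with the standard zipfile module. The helper names and the sample file names are made up for illustration, and the archive is built in memory only to keep the example self-contained.

```python
import io
import zipfile


def make_zip(files: dict[str, bytes]) -> bytes:
    """Build an in-memory zip archive from a name -> content mapping."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def space_needed(archive_bytes: bytes) -> int:
    """Sum the uncompressed sizes recorded in the archive's listing."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return sum(info.file_size for info in archive.infolist())


def extract_one(archive_bytes: bytes, name: str) -> bytes:
    """Read a single member without unpacking the others."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)


# Hypothetical contents: a text file and a small binary blob.
sample = make_zip({
    "docs/notes.txt": b"hello\n" * 100,
    "photos/a.bin": bytes(52),
})

print("archive size:        ", len(sample))
print("space to unpack all: ", space_needed(sample))
print("one file, first line:", extract_one(sample, "docs/notes.txt")[:6])
```

Checking space_needed before a full restore tells you whether the target disk is big enough, and reading one member avoids unpacking everything just to get a single file back.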
Doubting this is the case, but if you need to maintain the integrity of the disk format and structure, for instance if it's your root file system, then dd would be used to create a same-sized disk image. Another option is to create a file-based file system, copy all the files into it, then shrink that file system down to the minimum space it actually needs. That works for ext2/3/4 with resize2fs, I believe; not every file system can be shrunk (XFS cannot, for example).
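What dd does for a plain image is essentially a block-by-block copy, which can be sketched in Python as below. This is only an illustration: copy_image is a made-up helper, the demo uses an ordinary file in place of a real device, and reading an actual device such as /dev/sdX would need root and an unmounted (or read-only) file system.

```python
import os
import tempfile


def copy_image(source_path: str, image_path: str, block_size: int = 4 * 1024 * 1024) -> int:
    """Copy source to image in fixed-size blocks and return the bytes copied."""
    copied = 0
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            block = source.read(block_size)
            if not block:  # end of the source reached
                break
            image.write(block)
            copied += len(block)
    return copied


# Demo on an ordinary file standing in for a disk device.
with tempfile.TemporaryDirectory() as workdir:
    fake_disk = os.path.join(workdir, "fake_disk.bin")
    image_file = os.path.join(workdir, "disk.img")
    with open(fake_disk, "wb") as handle:
        handle.write(os.urandom(3 * 1024 * 1024 + 17))
    copied_bytes = copy_image(fake_disk, image_file, block_size=1024 * 1024)
    print("image matches source size:", copied_bytes == os.path.getsize(fake_disk))
```

Because every block is copied, free space included, the image is as large as the source no matter how full the disk is.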
If you want to back up normal data files, then I'd structure them all under a hierarchy and either compress or not, depending on whether compressing buys you much space.
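One way to check "whether it buys you much" before committing is to walk the hierarchy and compare sizes. A rough Python sketch, assuming zlib at its default level is a fair proxy for whatever compressor you end up using; estimate_savings is a made-up helper and it reads every file once, so it is slow on large trees.

```python
import os
import tempfile
import zlib


def compressed_size(path: str, chunk_size: int = 1024 * 1024) -> int:
    """Compress a file in chunks and return the compressed byte count."""
    compressor = zlib.compressobj()
    total = 0
    with open(path, "rb") as handle:
        while chunk := handle.read(chunk_size):
            total += len(compressor.compress(chunk))
    return total + len(compressor.flush())


def estimate_savings(root: str) -> tuple[int, int]:
    """Return (original_bytes, compressed_bytes) for regular files under root."""
    original = compressed = 0
    for directory, _subdirs, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(directory, filename)
            if os.path.islink(path) or not os.path.isfile(path):
                continue  # skip symlinks and special files
            original += os.path.getsize(path)
            compressed += compressed_size(path)
    return original, compressed


# Demo on a throwaway directory with one text-like and one media-like file.
with tempfile.TemporaryDirectory() as root:
    with open(os.path.join(root, "app.log"), "wb") as handle:
        handle.write(b"same line\n" * 10000)
    with open(os.path.join(root, "clip.bin"), "wb") as handle:
        handle.write(os.urandom(100000))
    before, after = estimate_savings(root)
    print(f"{before} bytes -> {after} bytes ({before / after:.2f}:1)")
```

If the overall ratio stays close to 1:1, as it will for a tree of mostly media files, plain copies are the simpler choice.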
Mostly media files, so I guess the cons far outweigh the pros for compression.
rtmistler, excellent point on the space requirement for decompressing data; I had totally missed that.