sparse files

Mike Reinehr cmr
Thu May 26 12:31:15 PDT 2005


On Thursday 26 May 2005 11:21 am, Roger Oberholtzer wrote:
> (I deleted Matt's question about why I don't just gzip things. Here is
> a reply.)
>
> When I make a sparse image for use with qemu, say 6GB, very little is
> actually allocated from my hard disk. It only really takes space when I
> really put something in the file. With qemu, I use this file as a 'disk
> image' into which an operating system can be installed. As such, it must
> have some concept of size (in my case 6GB). Effectively, I am telling
> qemu that there is a partition that is 6GB. In this image, the operating
> system can do whatever it likes, just as it would with a partition. But
> the space taken on my hard disk is only what the installed operating
> system actually puts there. Not the full 6GB.
>
> Of course, I could tell qemu to use a file where storage from the hard
> disk has in fact been taken. Then gzip is an option for transporting
> these images.
>
> If I install w98 into a 6 GB sparse file, it will only really take 300M
> or so. So, I could back up the 6 GB image to a CD. If the file is really
> 6GB, it will take more. OK, I would initialize the space to 0, so if it
> is not used the compression may be quite good.
>
> What I am wondering (another experiment) is if I remove content from a
> sparse file, does the space allocated for it decrease as well?
>
> Disk space is cheap these days. But sparse files seem less wasteful.

Roger,

I've been so fixated on learning why tar wouldn't work that, until today, I
haven't looked for any alternatives. Finally, just now I googled only for
'Linux & sparse' and guess what popped up -- the cp command! Here are the
relevant parts from `info cp`:

`--sparse=WHEN'
      A "sparse file" contains "holes"--a sequence of zero bytes that
      does not occupy any physical disk blocks; the `read' system call
      reads these as zeroes.  This can both save considerable disk space
      and increase speed, since many binary files contain lots of
      consecutive zero bytes.  By default, `cp' detects holes in input
      source files via a crude heuristic and makes the corresponding
      output file sparse as well.  Only regular files may be sparse.


      The WHEN value can be one of the following:
     `auto'
           The default behavior: if the input file is sparse, attempt to
           make the output file sparse, too.  However, if an output file
           exists but refers to a non-regular file, then do not attempt
           to make it sparse.


     `always'
           For each sufficiently long sequence of zero bytes in the
           input file, attempt to create a corresponding hole in the
           output file, even if the input file does not appear to be
           sparse.  This is useful when the input file resides on a
           filesystem that does not support sparse files (for example,
           `efs' filesystems in SGI IRIX 5.3 and earlier), but the
           output file is on a type of filesystem that does support them.
           Holes may be created only in regular files, so if the
           destination file is of some other type, `cp' does not even
           try to make it sparse.


     `never'
           Never make the output file sparse.  This is useful in
           creating a file for use with the `mkswap' command, since such
           a file must not have any holes.
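
To make the `always' idea concrete, here is a minimal Python sketch of a
copy that seeks past runs of zero bytes instead of writing them. It is
only an illustration of the general technique, not how `cp' is actually
implemented; the function name sparse_copy and the 4096-byte block size
are my own assumptions.

```python
import os

def sparse_copy(src_path, dst_path, block_size=4096):
    """Copy src to dst, leaving a hole wherever a whole block is zero.

    Hypothetical helper for illustration; block_size is an assumed value,
    not something taken from cp.
    """
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            if block.count(0) == len(block):
                # All zeros: move the file offset forward without writing.
                # On a filesystem that supports sparse files this leaves a hole.
                dst.seek(len(block), os.SEEK_CUR)
            else:
                dst.write(block)
        # If the source ended in zeros, nothing was written at the tail,
        # so set the final length explicitly at the current offset.
        dst.truncate()
```

Whether the skipped ranges really end up unallocated depends on the
destination filesystem; the copied contents read back the same either way.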

This should work if you can establish an NFS connection. I don't even want to 
think about trying to burn sparse UDF or iso9660 images!
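
One way to check whether a copy actually stayed sparse is to compare the
file's apparent size with the space allocated for it. The sketch below
does that in Python. The helper names make_sparse and sparse_report are
made up for this example, and it assumes a Unix-like system where
st_blocks is available and counted in 512-byte units.

```python
import os

def make_sparse(path, size):
    """Create a file of the given apparent size without writing any data."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; no data blocks are written

def sparse_report(path):
    """Return (apparent_bytes, allocated_bytes) for path.

    Assumes st_blocks is in 512-byte units, as on Linux; this field is
    not available on every platform.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512

if __name__ == "__main__":
    make_sparse("test.img", 6 * 1024 * 1024)
    apparent, allocated = sparse_report("test.img")
    print("apparent:", apparent, "allocated:", allocated)
    os.remove("test.img")
```

If the allocated figure is far smaller than the apparent one, the file has
holes. How small it gets depends on the filesystem.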

Cheers!

cmr

-- 
Debian 'Sarge': Registered Linux User #241964

"More laws, less justice." -- Marcus Tullius Cicero, ca. 42 BC

