Sparse files

Facts - Problems - Conclusions - Sources - Addenda

Facts

File contents need not be represented physically one-to-one on disk. Examples are files in compressed filesystems and sparse files on Unix systems.

Most filesystems organize their files as a (not necessarily contiguous) sequence of disk blocks. On Unix filesystems, the indices of these blocks are kept in so-called inodes (and so-called indirect blocks). If one of these blocks consists entirely of "default" bytes (i.e. null bytes), the block need not be allocated on disk (its index in the inode can be set to null). Files with such unallocated blocks are called sparse files.

Typically, it's up to the applications (not the operating system) to produce and keep sparse files. At the Unix system call API level, this is achieved by skipping over runs of null bytes with lseek(2) instead of writing them with write(2). Application implementations are free to use this feature (GNU implementations of Unix commands are an example).
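The lseek-instead-of-write technique can be sketched as follows. This is a minimal illustration in Python, whose os module wraps the same system calls; the function name write_with_gap and its parameters are made up for this example. Whether the skipped range really stays unallocated depends on the filesystem.

```python
import os
import tempfile

def write_with_gap(path, head, gap, tail):
    """Write head, skip `gap` bytes with lseek, then write tail."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, head)
        # Seek forward instead of writing `gap` null bytes. The skipped
        # range reads back as null bytes; the filesystem may leave it
        # unallocated.
        os.lseek(fd, gap, os.SEEK_CUR)
        os.write(fd, tail)
    finally:
        os.close(fd)

if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "sparse.bin")
    write_with_gap(path, b"head", 1 << 20, b"tail")
    # Logical size is len(head) + gap + len(tail).
    print(os.path.getsize(path))
```

For brevity the sketch assumes that each os.write call writes its whole (small) buffer; production code would loop on short writes.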

Examples of applications that more or less rely on the sparse file feature are those that use the dbm database format library, and database applications that lseek to large offsets obtained from large hash values.

A sparse file can be recognized by comparing its size (ls -l) with the disk space it occupies (du -s or ls -s) or by displaying its disk layout using the filesystem debugger fsdb(1a). fsdb also reveals the locations of the unallocated blocks, which are otherwise not evident if a file contains both allocated and unallocated null blocks.
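The size comparison described above can also be done programmatically. The following is a rough sketch; the helper names allocation and looks_sparse are made up for this example, and it assumes that st_blocks counts 512-byte units, as is usual on Unix systems. The result is only a heuristic: compressed filesystems, for instance, may also report less allocated space than the logical size.

```python
import os

def allocation(path):
    """Return (logical size, bytes allocated on disk) for path."""
    st = os.stat(path)
    # Assumption: st_blocks is counted in 512-byte units, independent of
    # the filesystem's own block size.
    return st.st_size, st.st_blocks * 512

def looks_sparse(path):
    """Heuristic: fewer bytes allocated than the logical size suggests."""
    size, allocated = allocation(path)
    return allocated < size
```

This corresponds to comparing the output of ls -l (logical size) with that of du or ls -s (allocated space).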

Problems

Sparse files may require much more media space and copy time when backed up. Larger tapes must be used and filesystem downtime may grow too long.

Sparse files may be expanded when copied (cp, cpio), moved (mv) or restored (tar, cpio, dd). This wastes disk space, and disks may be too small for the restores.

Executable loaders of some Unix implementations seem to have problems with sparse executable files.

It is not evident how much disk space an application will eventually occupy if disk space is reserved in the form of sparse files.

Network filesystems may not properly implement the lseek feature, or the target filesystem may not be able to create sparse files.

Conclusions

If you are a programmer, don't use (rely on) the sparse file feature.

As a user, apply tools that preserve the sparseness of such files.

Sources

A compact sparse file creation tool can be found here: sparsefile.c. Its core function 'sparse' may be built into your own tools.

[sparsefile.c revision history: 2012/6/26 current version (better support for output files larger than 2GB; thanks to Renne Nissinen for pointing out the lseek value overflow problem in the LP64 memory model); 1996/12/19 initial version.]

[Compilation hints: To avoid the overhead of successive lseek operations for nul gaps of 2 GB or more on LP64 64-bit systems, compile with e.g. -DSPARSEFILE_SIZETYPE=off_t. To support files of 2 GB and more on 32-bit Linux systems, compile with -DSPARSEFILE_LSEEK=lseek64 -D_LARGEFILE64_SOURCE, or with -D_FILE_OFFSET_BITS=64 (otherwise lseek may fail with EOVERFLOW ('Value too large for defined data type') as soon as the 2 GB file size limit is hit).]

Some GNU implementations of Unix commands include sparse file handling: cp, tar at gnu.org (archive names: coreutils-/fileutils-, tar-; older URL).

The Safe/Fast I/O Library sfio (older URL, older URL) at research.att.com also includes handling of sparse files.

[The sparsefile.c implementation is designed solely around the system calls fstat/read/lseek/write (not using features such as [the newer] ftruncate), so the last [possibly partial] data block must be written, even if it consists of all zeros/nuls. It could be argued that the output's sparseness is incomplete in such cases, although only by a constant amount of one disk block.]
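A sparseness-preserving copy built only on read, lseek and write might look roughly like the sketch below. It is not the sparsefile.c code: the names sparse_copy and write_all and the fixed 4096-byte block size are assumptions made for illustration. When the input ends in zeros, the sketch writes one final nul byte to fix the file size, which allocates the last block just as described above.

```python
import os

BLOCK = 4096  # assumed block size for this sketch

def write_all(fd, data):
    """Write all of data, looping on short writes."""
    view = memoryview(data)
    while len(view) > 0:
        view = view[os.write(fd, view):]

def sparse_copy(src, dst, block=BLOCK):
    """Copy src to dst, seeking over all-zero blocks instead of writing them."""
    zeros = bytes(block)
    fin = os.open(src, os.O_RDONLY)
    try:
        fout = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            pending = 0  # zero bytes read but not yet represented in dst
            while True:
                buf = os.read(fin, block)
                if not buf:
                    break
                if buf == zeros[:len(buf)]:
                    pending += len(buf)  # remember the gap, write nothing
                    continue
                if pending:
                    os.lseek(fout, pending, os.SEEK_CUR)  # leave a hole
                    pending = 0
                write_all(fout, buf)
            if pending:
                # The input ends in zeros. Without ftruncate, the size can
                # only be fixed by writing, so write the very last byte.
                os.lseek(fout, pending - 1, os.SEEK_CUR)
                write_all(fout, b"\0")
        finally:
            os.close(fout)
    finally:
        os.close(fin)
```

The copy is byte-for-byte identical to the input; only the on-disk allocation of the output may differ.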

Addenda

Instead of interpreting unallocated blocks to contain "default" (nul, zero) bytes, one could say that these bytes are not intended to be read (but rather sought over, by the code that knows of the file's internal structure, e.g. a dbm library).

For the convenient use of unrelated tools (e.g. wc or md5sum), and following the Unix paradigm of "flat" files, reads of unallocated blocks return nul (default) bytes instead of signalling EIO. This makes it infeasible to distinguish unallocated blocks from blocks of intentional nul bytes by means of the "normal" file API.

The proposed SEEK_DATA lseek extension (Solaris; proposed mainly for performance reasons) would make the distinction feasible. [SEEK_DATA sets the file pointer to the next non-hole file region. However, the proposal states that filesystems are allowed, but not required, to report ranges of zeros as holes.]
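On platforms that provide this extension, Python exposes it as os.SEEK_DATA together with its counterpart os.SEEK_HOLE. The sketch below lists the regions a filesystem reports as data; the function name data_regions is made up for this example. A filesystem without hole reporting may simply present the whole file as one data region.

```python
import errno
import os

def data_regions(path):
    """Return a list of (start, end) offsets the filesystem reports as data."""
    regions = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        pos = 0
        while pos < size:
            try:
                start = os.lseek(fd, pos, os.SEEK_DATA)
            except OSError as e:
                if e.errno == errno.ENXIO:
                    break  # no data after pos: the file ends in a hole
                raise
            # The end of file counts as a hole, so this call always
            # returns an offset greater than start.
            end = os.lseek(fd, start, os.SEEK_HOLE)
            regions.append((start, end))
            pos = end
    finally:
        os.close(fd)
    return regions
```

Everything outside the returned regions is a hole; the regions themselves may still contain nul bytes that happen to be allocated.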

[Another proposition to identify the locations of unallocated blocks was to destructively cut off last data blocks by means of ftruncate while observing the number of still allocated blocks (fstat's st_blocks), which however assumes the ftruncate operation exactly preserves the remaining sparseness.]

The use of the ftruncate system call instead of writing a last [possibly partial] block of all zeros could avoid cases of incomplete sparseness.
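A minimal sketch of that idea: after the real data has been written, the final size is set with ftruncate, so no trailing block of zeros is written at all. The function name write_with_trailing_hole and its parameters are assumptions for this example; whether the extended range stays unallocated is again up to the filesystem.

```python
import os

def write_with_trailing_hole(path, data, trailing):
    """Write data, then extend the file by `trailing` nul bytes via ftruncate."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        written = 0
        while written < len(data):
            written += os.write(fd, data[written:])
        # Set the final size directly instead of writing a last block of
        # zeros; the extended range reads back as nul bytes.
        os.ftruncate(fd, len(data) + trailing)
    finally:
        os.close(fd)
```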

Keywords: sparse file, sparsefile, sparse file tool, unallocated blocks, lseek


Eric Laroche / lr / Mon Jul 7 2014 / Tue Jun 26 2012 / Wed Apr 7 2010 / Tue Jan 7 1997