Introduction to the ZFS File System

ZFS is a revolutionary file system developed by Sun Microsystems and open source community developers. It combines the best features of products currently on the market, such as network-appliance snapshots, object-based storage management, transactions, checksumming, and deduplication, with ideas of its own. The end result is a completely new approach to file system design. ZFS is very young, yet it has made such an impact on Unix vendors and the open source community that many have planned to port it, or have already ported it, to other operating systems.

ZFS addresses many issues of modern file systems. File integrity, scalability, and management difficulties are all things of the past with ZFS.

Storage Pools
ZFS eliminates the need for a volume manager. Instead of creating virtual volumes, devices are grouped into storage pools. File systems no longer deal with individual physical devices, so the entire disk space can be shared by all file systems in the pool. When new devices are added to the storage pool, all file systems can allocate the additional space. This resembles the way virtual memory works: when additional memory banks are added to a system, the operating system does not force the user to configure them, and all processes can use the additional memory automatically.

Data Integrity
ZFS is a transactional file system, which means that its state is always consistent. Older file systems overwrite blocks in place when modifying data. If power fails during such a write, the data in the block can be left corrupted. To fix the problem, the fsck command finds corrupted blocks and pointers and tries to reconnect them. However, fsck needs to scan the entire volume, and that operation is extremely time consuming. Journaling was introduced to overcome this, but the same scenario can occur if a journal entry becomes corrupted. Where reliability was a concern, mirroring software was used to keep a current copy of the working data. But if the two mirrors became inconsistent (a power failure again), they had to be resynchronized, which added overhead to the disk system. In addition, a system cannot always tell which copy is correct, so bad data may overwrite good data. ZFS addresses these issues by making transaction-based copy-on-write modifications and by constantly checksumming every in-use block in the file system.

Snapshots
A snapshot is simply a read-only copy of a volume or an entire file system. Creating a snapshot is straightforward and quick. Initially a snapshot takes no additional space in the storage pool. As the active data changes, the snapshot begins to grow because it keeps references to the older data.

Transactional Semantics
In a transactional file system all data is managed with the copy-on-write method. Data is never overwritten in place, and a transaction is always either committed or ignored entirely. This mechanism means the file system will never be corrupted by an accidental power loss or system failure, so there is no need for an equivalent of the fsck command. While the last bits of written data could possibly be lost, the entire data structure remains unchanged and in a consistent state. Besides, all synchronous data is physically written before the write operation returns, so it is guaranteed never to be lost.

Checksumming and Self-Healing
With ZFS, a checksum is calculated over all data using an algorithm selected by the user. Traditional systems checksummed at the block level, below the volume manager and the file system. With that traditional scheme, some kinds of error, such as writing an entire block to the wrong location, can end up with a valid checksum for bad data. ZFS checksums are therefore not stored in the block itself but next to the pointer to the block. All block checksums are computed in server memory, and recovery is done at the file system level, so it is transparent to applications. In addition, ZFS is able to self-heal corrupted data. ZFS allows pools to be created with different redundancy levels, including mirroring and a RAID-5-like level (RAID-Z). If a bad block is detected, ZFS fetches the data from another copy and replaces the bad block with the good copy.

ZFS Scalability
ZFS was designed from scratch with scalability in mind. All data is allocated dynamically, so there is no need to preallocate it, which would otherwise limit scalability. Each directory can contain 256 trillion entries, and there is no practical limit on file systems or on the number of files they can contain. ZFS also includes features such as deduplication, data pipelining, dynamic block sizing, intelligent prefetch, dynamic striping, and built-in compression to improve performance. Storage limits have also been raised greatly by the use of a 128-bit architecture.
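
The idea from the checksumming and self-healing discussion, that the checksum is kept next to the pointer rather than in the block, and that a bad copy is repaired from a good one, can be illustrated with a small Python sketch. This is a toy model and not ZFS code: the names BlockPointer and MirroredStore are hypothetical, each "disk" is just a dictionary, and SHA-256 merely stands in for whichever algorithm the user selects.

```python
import hashlib
from dataclasses import dataclass


def checksum(data: bytes) -> str:
    """Hash a block; SHA-256 is an assumption standing in for the chosen algorithm."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class BlockPointer:
    """Hypothetical pointer kept by the parent: block address plus expected checksum."""
    address: int
    expected: str


class MirroredStore:
    """Toy mirror: every disk is a dict mapping block address -> bytes."""

    def __init__(self, copies: int = 2) -> None:
        self.disks = [dict() for _ in range(copies)]

    def write(self, address: int, data: bytes) -> BlockPointer:
        # Write the same block to every disk in the mirror.
        for disk in self.disks:
            disk[address] = data
        # The checksum travels with the pointer, not with the block.
        return BlockPointer(address, checksum(data))

    def read(self, ptr: BlockPointer) -> bytes:
        good = None
        bad_disks = []
        for disk in self.disks:
            data = disk.get(ptr.address)
            if data is not None and checksum(data) == ptr.expected:
                if good is None:
                    good = data
            else:
                bad_disks.append(disk)
        if good is None:
            raise IOError("no copy matches the checksum in the pointer")
        # Self-heal: overwrite every bad or missing copy with the verified one.
        for disk in bad_disks:
            disk[ptr.address] = good
        return good


store = MirroredStore()
ptr = store.write(7, b"important data")
store.disks[0][7] = b"bit rot!"                 # simulate silent corruption on one disk
assert store.read(ptr) == b"important data"     # the read is served from the good copy
assert store.disks[0][7] == b"important data"   # and the bad copy has been repaired
```

Because the expected checksum lives in the pointer, a block that was written intact but to the wrong address would also fail verification when read through that pointer, which is the failure case that block-level checksums miss.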
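
The way copy-on-write makes snapshots cheap can be sketched in the same toy style. The class below is purely illustrative and assumes a drastically simplified layout (one block per file, a flat name table); CowFileSystem and its methods are hypothetical names, not part of any ZFS interface. It shows that a write never modifies an existing block, that taking a snapshot copies only references, and that a snapshot starts to hold space of its own only once the live data diverges from it.

```python
class CowFileSystem:
    """Toy copy-on-write store: one block per file, blocks are never modified in place."""

    def __init__(self) -> None:
        self.blocks = {}      # block id -> bytes
        self.live = {}        # file name -> block id (the active file system)
        self.snapshots = {}   # snapshot name -> frozen copy of the name table
        self._next_id = 0

    def write(self, name: str, data: bytes) -> None:
        block_id = self._next_id
        self._next_id += 1
        self.blocks[block_id] = data   # new data goes to a fresh block
        self.live[name] = block_id     # the pointer is switched only afterwards

    def snapshot(self, snap_name: str) -> None:
        # Only the references are copied; no data block is duplicated.
        self.snapshots[snap_name] = dict(self.live)

    def read(self, name: str, snap_name: str = None) -> bytes:
        table = self.live if snap_name is None else self.snapshots[snap_name]
        return self.blocks[table[name]]

    def snapshot_only_blocks(self, snap_name: str) -> set:
        """Blocks kept alive solely by the snapshot, i.e. the space it 'costs'."""
        return set(self.snapshots[snap_name].values()) - set(self.live.values())


fs = CowFileSystem()
fs.write("report.txt", b"version 1")
fs.snapshot("monday")
assert fs.snapshot_only_blocks("monday") == set()    # a fresh snapshot costs nothing

fs.write("report.txt", b"version 2")                 # old block is left untouched
assert fs.read("report.txt") == b"version 2"
assert fs.read("report.txt", "monday") == b"version 1"
assert fs.snapshot_only_blocks("monday") == {0}      # the snapshot now holds the old block
```

In this model an interrupted write leaves the name table pointing at the old block, which mirrors the claim that a transaction is either committed or ignored.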