ZFS is a revolutionary file system developed by Sun Microsystems and open source community developers. It combines the best features of other products currently on the market, such as network appliance snapshots, object-based storage management, transactions, checksumming, and deduplication, with ideas of its own. The end result is a completely new approach to file system design. ZFS is very young, yet it has made such an impact on Unix vendors and the open source community that many have planned to port ZFS to other operating systems, or have already done so.

ZFS addresses many issues of modern file systems. File integrity, scalability, and management difficulties are all things of the past with the use of ZFS.

Storage Pools

ZFS eliminates the need for a volume manager. Instead of creating virtual volumes, devices are grouped into storage pools. The result is a system without individually managed physical devices, which allows the entire disk space to be shared by all file systems in the pool. When new devices are added to the storage pool, all file systems can allocate the additional space. This resembles the way virtual memory works: when additional memory banks are added to a system, the operating system does not force the user to configure the additional memory; all processes in the system can use it automatically.

Data Integrity

ZFS is a transactional file system, which means that its state is always consistent. Older file systems overwrite blocks in place when modifying data, so in case of a power failure the data in a block can be left corrupted. To fix the issue, the fsck command finds corrupted blocks and pointers and tries to reconnect them. However, fsck needs to scan the entire volume, and that operation is extremely time consuming. To overcome this, journaling was introduced, but the same scenario would occur if a journal entry became corrupted.

When reliability was a concern, mirroring software was used to keep a current copy of the working data. But if the two mirrors became inconsistent (power failure again), they needed to be synced up again, which added overhead to the disk system. In addition, a system can't always predict which copy is correct, and one possibility is that the bad data overwrites the good. ZFS addresses these issues by making transaction-based copy-on-write modifications and constantly checksumming every in-use block in the file system.

Snapshots

A snapshot is a read-only copy of a volume or an entire file system. Creating a snapshot is a straightforward and quick process. Initially, a snapshot takes no additional space in the storage pool. As the active data changes, the snapshot begins to consume space, because it continues to reference the older data.

Transactional Semantics

In a transactional file system all data is managed with the copy-on-write method. Data is never overwritten in place, and every transaction is either committed in full or ignored entirely. That mechanism means the file system will never be corrupted by accidental power loss or system failure, so there is no need for an equivalent of the fsck command. While the last bits of written data could possibly be lost, the entire data structure remains in a consistent state. Besides, all synchronous data is physically written before the write operation returns, so it is guaranteed not to be lost.

Checksumming and Self-Healing

With ZFS, a checksum is calculated over all data with an algorithm selected by the user. Traditional systems could only checksum at the block level, a consequence of the separate volume manager and the traditional file system design. This traditional scheme means that some errors, such as writing an entire block to the wrong location, can leave bad data with a valid checksum. In ZFS, checksums are therefore not stored in the block itself but next to the pointer to the block. All block checksums are computed in server memory and recovery is done at the file system level, so the whole process is transparent to applications.

In addition, ZFS has the capability to self-heal corrupted data. ZFS allows storage pools to be created with different redundancy levels, including mirroring and a variation on RAID-5 (RAID-Z). If a bad block of data is detected, ZFS fetches the correct data from another copy and replaces the bad block with the good copy.

ZFS Scalability

ZFS was designed from scratch with scalability in mind. All data is allocated dynamically, so there is no need to preallocate it and thereby limit scalability. Each directory can contain 256 trillion entries, and there is no limit on the number of file systems or on the number of files a file system can contain. ZFS also includes features such as deduplication, data pipelining, dynamic block sizing, intelligent prefetch, dynamic striping, and built-in compression to improve performance. Storage limits have also been raised greatly by the use of a 128-bit architecture.
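
The checksumming and self-healing behavior described in this article can be illustrated with a short Python sketch. It is a toy in-memory model, not ZFS code: the `Mirror` class, its methods, the dictionary-based "disks", and the choice of SHA-256 are all assumptions made for illustration. It shows only the two ideas from the text: the checksum travels with the block pointer rather than inside the block, and a read that finds a bad copy repairs it from a good one.

```python
import hashlib


class Mirror:
    """Hypothetical toy mirror: each 'disk' is a dict of address -> bytes."""

    def __init__(self, copies=2):
        self.disks = [{} for _ in range(copies)]
        self.next_addr = 0

    def write(self, data):
        """Store data on every disk and return a block pointer.

        The pointer carries the checksum, so the checksum is kept next to
        the reference to the block, not inside the block it describes.
        """
        addr = self.next_addr
        self.next_addr += 1
        for disk in self.disks:
            disk[addr] = data
        return {"addr": addr, "checksum": hashlib.sha256(data).hexdigest()}

    def read(self, pointer):
        """Return verified data, repairing any copy that fails its checksum."""
        addr, expected = pointer["addr"], pointer["checksum"]
        good = None
        bad_disks = []
        for disk in self.disks:
            data = disk.get(addr)
            if data is not None and hashlib.sha256(data).hexdigest() == expected:
                if good is None:
                    good = data
            else:
                bad_disks.append(disk)
        if good is None:
            raise IOError("no copy of block %d matches its checksum" % addr)
        for disk in bad_disks:
            disk[addr] = good  # self-heal: overwrite the bad copy
        return good


pool = Mirror()
ptr = pool.write(b"important data")
pool.disks[0][ptr["addr"]] = b"corrupted data"  # simulate silent corruption
recovered = pool.read(ptr)
```

Because the expected checksum comes from the pointer, a block that was silently damaged or replaced on one disk cannot vouch for itself; the read falls through to the other copy and rewrites the damaged one.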