Local snapshots, incremental replication, and remote backups
When it comes to data protection on modern Linux systems, OpenZFS offers an extremely solid technical foundation. However, having an advanced file system doesn't automatically mean having an effective backup strategy. Snapshots must be created regularly, retained according to a sensible policy, replicated elsewhere, monitored, and, above all, tested during recovery. This is where Sanoid and Syncoid come in: two tools that together turn the powerful native capabilities of ZFS into a tidy, automatable, and truly production-ready process.
In many server environments, especially in hosting, systems, and infrastructure, backup is still perceived as a simple file copy. You configure an rsync job, schedule a nightly cron job, send a few archives to a remote server, and assume everything is under control. The problem arises when something actually needs to be recovered: a deleted directory, a compromised website, a previous version of a file, a corrupted database, or an entire lost dataset. At that point, the difference between a generic copy and a strategy based on snapshots, retention, and incremental replication becomes enormous.
OpenZFS changes the way we think about data because it integrates many features into the filesystem itself that would otherwise require separate tools. ZFS is not just a filesystem: it is also a volume manager, an integrity protection system, a snapshot platform, a replication engine, and an advanced storage management system. Thanks to the copy-on-write model, end-to-end checksums, and the ability to send snapshots via zfs send, it becomes possible to build efficient, incremental, and consistent backups.
Why ZFS snapshots are so important
A ZFS snapshot is a read-only, point-in-time image of a dataset. It's not a complete copy of the data, at least not in the traditional sense. When created, a snapshot initially takes up very little space because it simply preserves references to existing blocks. Space only begins to grow when those blocks are modified or deleted in the active dataset.
This behavior stems from ZFS's copy-on-write functionality. When a file is modified, ZFS doesn't directly overwrite the old blocks, but writes the new data to a new location and then updates the metadata. If a snapshot exists that references the previous blocks, they remain available. This way, the snapshot continues to represent the exact state of the dataset at the time it was created.
The practical result is very powerful. If a snapshot of a dataset containing a website is created at 10:00 AM, and at 10:30 AM an update breaks part of the application, the previous files can be recovered from the snapshot. If a user accidentally deletes a directory, that directory can be restored as long as a snapshot containing it exists. If a deployment overwrites important content, you can roll back without having to restore an entire remote backup.
Creating a snapshot manually is extremely simple:
zfs snapshot tank/www@snap-2026-05-28
To view available snapshots:
zfs list -t snapshot
Or, to get more useful information about space occupied, referenced data and creation date:
zfs list -t snapshot -o name,used,referenced,creation
A complete restore of a dataset to a previous snapshot can be done with:
zfs rollback tank/www@snap-2026-05-28
Rollback is a very powerful feature, but it must be used with caution. It restores the entire dataset to the state of the selected snapshot, discarding all subsequent changes. By default, zfs rollback only accepts the most recent snapshot; going back to an older one requires the -r option, which also destroys every snapshot taken after it. In many cases, it's more prudent to recover only the necessary files from the snapshot, so that valid changes made in the meantime are not lost. This is one of the reasons why local snapshots are so useful: they allow for quick, selective, and often non-invasive restores.
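Selective recovery relies on the hidden .zfs/snapshot directory that ZFS exposes at the root of every mounted dataset. The sketch below assumes that tank/www is mounted at /tank/www; the helper name snapshot_path and the file being restored are hypothetical and only serve as illustration.

```shell
#!/bin/sh
# Build the path of a file as it existed in a given snapshot.
# Every mounted ZFS dataset exposes its snapshots read-only under
# <mountpoint>/.zfs/snapshot/<snapshot-name>/.
# snapshot_path is a hypothetical helper, not part of ZFS.
snapshot_path() {
    mountpoint=$1   # e.g. /tank/www (assumed mountpoint of tank/www)
    snapname=$2     # the part after the "@", e.g. snap-2026-05-28
    relpath=$3      # path relative to the dataset root
    # Strip a trailing slash from the mountpoint and a leading one from the path.
    printf '%s/.zfs/snapshot/%s/%s\n' "${mountpoint%/}" "$snapname" "${relpath#/}"
}

# Example (not executed here): copy one file back without touching the rest.
# cp -a "$(snapshot_path /tank/www snap-2026-05-28 site/wp-config.php)" \
#       /tank/www/site/wp-config.php
```

Because the snapshot directory is read-only, copying out of it cannot damage the snapshot itself, and the live dataset is changed only where the copy lands.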
Local snapshots: the first line of defense
Local snapshots should not be confused with a full backup. If they are on the same pool and that pool is lost, the snapshots are lost too. Therefore, they do not protect against all scenarios: catastrophic storage failure, physical server loss, total system compromise, or intentional destruction of datasets. However, in everyday practice, they often represent the first and fastest line of defense.
Many incidents don't require restoring from a remote backup. They simply require going back a few minutes, hours, or days. In a hosting environment, for example, a client might delete files via FTP or SFTP, a WordPress update might render the site unusable, an automated process might overwrite a directory, a plugin might generate unwanted changes, or a configuration might be accidentally altered. In all these cases, having frequent snapshots allows for quick action.
The key point is that snapshots must be frequent, organized, and managed with consistent retention. Creating a manual snapshot every now and then isn't a strategy. Likewise, creating continuous snapshots without ever deleting them risks uncontrolled space consumption. A policy is needed: how many hourly snapshots to keep, how many days to keep daily ones, how long to keep weekly or monthly ones, and which datasets to treat differently.
The problem of manual management
Managing ZFS snapshots manually is easy for a single dataset, but quickly becomes cumbersome on real systems. A server can have dozens or hundreds of datasets: websites, home directories, databases, application backups, staging environments, repositories, log directories, container volumes, or virtual machines. Each dataset may have different needs.
A dataset containing static files can tolerate less frequent snapshots and longer retention periods. A dataset with dynamic application data may require hourly snapshots. A dataset with highly active databases must be treated with greater caution, as the rate of change can rapidly increase the space used by snapshots. Furthermore, application consistency, not just filesystem consistency, must be considered.
Writing custom scripts to handle all this is possible, but it introduces complexity. You need to define a naming convention, avoid collisions, delete old snapshots, manage recursive datasets, differentiate policies, produce logs, intercept errors, and maintain the script over time. The more your infrastructure grows, the greater the risk that a homemade solution will become fragile. This is precisely why Sanoid is so useful.
What is Sanoid?
Sanoid is a tool designed to automate ZFS snapshot management. Its purpose is simple: create snapshots according to a defined schedule and delete old ones according to retention rules. Instead of relying on manual scripts, Sanoid allows you to describe the desired behavior in a readable and easily versionable configuration file.
With Sanoid, you can define different templates for different types of datasets. For example, you can have a policy for production data, one for databases, one for archives, and one for less critical environments. Each template can specify how many frequent, hourly, daily, weekly, monthly, or yearly snapshots to keep.
An example configuration could be the following:
[tank/www]
use_template = production
recursive = yes
[tank/home]
use_template = production
recursive = yes
[tank/mysql]
use_template = database
recursive = no
[template_production]
frequently = 0
hourly = 24
daily = 14
weekly = 8
monthly = 6
yearly = 0
autosnap = yes
autoprune = yes
[template_database]
frequently = 0
hourly = 12
daily = 7
weekly = 4
monthly = 3
yearly = 0
autosnap = yes
autoprune = yes
In this example, the datasets tank/www and tank/home use a production policy with 24 hourly, 14 daily, 8 weekly, and 6 monthly snapshots. The dataset tank/mysql instead uses a shorter policy, suitable for very dynamic data or for integration with other strategies, such as logical dumps or database replication.
The options autosnap and autoprune are key. The first enables automatic snapshot creation; the second allows Sanoid to delete snapshots that exceed the configured retention. Without pruning, snapshots would continue to accumulate, retaining old blocks and consuming space in the pool.
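To make the idea of count-based retention concrete, here is a minimal sketch of the selection step: given a list of snapshots ordered from oldest to newest, it prints the ones that fall outside the newest N. This is an illustration of the principle, not Sanoid's actual implementation, and prune_candidates is a hypothetical helper name.

```shell
# prune_candidates KEEP
# Reads snapshot names on stdin, oldest first, and prints those that fall
# outside the newest KEEP entries, i.e. the ones a "keep N" rule would delete.
# Nothing is destroyed here: the function only selects names.
prune_candidates() {
    awk -v keep="$1" '
        { names[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print names[i] }'
}

# Example (not executed here). The "_hourly" suffix is an assumption about
# how the snapshots were named:
# zfs list -H -t snapshot -o name -s creation -d 1 tank/www \
#     | grep '_hourly$' | prune_candidates 24
```

With hourly = 24, the 25th-newest hourly snapshot and anything older would be selected, which is why the space held by old blocks is released only once pruning actually runs.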
Retention and space occupied by snapshots
One of the most important things to understand is that snapshots don't take up space in an intuitive way. A newly created snapshot may seem almost free, but if a lot of data is changed or deleted after it's created, that snapshot will start to retain older blocks. Therefore, the actual space consumed by snapshots depends on the dataset's change rate.
To analyze the space occupied it is useful to use:
zfs list -o name,used,referenced,usedbysnapshots
This command allows you to see how much space the dataset is using, how much is referenced, and how much is held by snapshots. For more details on snapshots:
zfs list -t snapshot -o name,used,referenced,creation
Overly aggressive retention can consume significant space, especially on frequently changing datasets. Retention that's too short may not provide sufficient historical depth. The balance depends on the data's value, available space, and recovery objectives. In a professional infrastructure, these parameters should be chosen consciously, not left to chance.
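A quick way to spot datasets where retention is becoming expensive is to compare usedbysnapshots with used. The sketch below reads the machine-readable output of zfs list (-H for no header and tab separators, -p for exact byte values) and flags datasets whose snapshots hold more than a given percentage of the space. The helper name flag_snapshot_heavy and the 50% default are arbitrary choices for illustration.

```shell
# flag_snapshot_heavy [PERCENT]
# Reads tab-separated lines "name<TAB>used<TAB>usedbysnapshots" (in bytes)
# on stdin and prints each dataset whose snapshots account for more than
# PERCENT of its used space, together with the rounded percentage.
flag_snapshot_heavy() {
    threshold=${1:-50}
    awk -F '\t' -v limit="$threshold" '
        $2 > 0 {
            pct = 100 * $3 / $2
            if (pct > limit + 0) printf "%s %.0f%%\n", $1, pct
        }'
}

# Example (not executed here):
# zfs list -H -p -o name,used,usedbysnapshots | flag_snapshot_heavy 50
```

A dataset that appears in this list is not necessarily a problem, but it is a good candidate for a shorter retention template or a closer look at its change rate.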
Database and application consistency
When talking about snapshots, we need to distinguish between filesystem consistency and application consistency. ZFS guarantees filesystem-consistent snapshots, but it can't know whether an application has completed all its internal operations at the exact moment of the snapshot. This is particularly important for databases like MySQL, MariaDB, or PostgreSQL.
With transactional engines like InnoDB, a hot snapshot can often be recovered much like a restart after a crash, thanks to the transaction logs. However, for mission-critical environments, it's not prudent to rely solely on this behavior. It is preferable to combine snapshots with specific application procedures: logical dumps, database replication, controlled flushes, temporary locks, snapshots on secondary replicas, or pre- and post-snapshot hooks.
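As an illustration of pre- and post-snapshot hooks, the fragment below extends the tank/mysql section with hook options. The option names are the ones I understand Sanoid's default configuration file to document, so they should be checked against the installed version; the two script paths are hypothetical, and what those scripts do (flush, lock, dump) depends entirely on the database in use.

```ini
[tank/mysql]
use_template = database
recursive = no
# Hypothetical scripts: run before and after each snapshot of this dataset.
pre_snapshot_script = /usr/local/bin/db-pre-snapshot.sh
post_snapshot_script = /usr/local/bin/db-post-snapshot.sh
# Skip the snapshot if the pre-snapshot script fails (assumed semantics).
no_inconsistent_snapshot = yes
# Maximum number of seconds a hook may run before Sanoid continues.
script_timeout = 30
```

One caveat worth keeping in mind: a lock taken by a short-lived client session is normally released when that session ends, so a hook that simply issues a lock statement and exits may not protect the snapshot at all.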
Sanoid and Syncoid are excellent tools, but they're no substitute for application knowledge. A proper strategy must always consider what's being written to the dataset at the time the snapshot is created. Snapshotting static files is very different from snapshotting a live database, a message queue, a search index, or a stateful in-memory application.
ZFS send: The foundation of efficient replication
Local snapshots are essential, but they're not enough on their own. A serious backup strategy must include a copy to a separate destination. This is where zfs send comes in, one of the most powerful features of OpenZFS.
The zfs send command turns a snapshot into a stream. This stream can be saved to a file, sent through a pipe, transferred via SSH, or received directly into another pool via zfs receive. In practice, ZFS can export the state of a dataset and reconstruct it elsewhere in a consistent way.
A full send can be done like this:
zfs send tank/www@snap-2026-05-28 | zfs receive backup/www
In this case the snapshot tank/www@snap-2026-05-28 is sent and received into the dataset backup/www. This is useful for a first sync, but the real strength of ZFS comes with incremental sends.
After sending a first full snapshot, subsequent snapshots can be transferred by sending only the differences between two points in time:
zfs send -i tank/www@snap-2026-05-27 tank/www@snap-2026-05-28 | zfs receive backup/www
This command only sends what has changed between the May 27 and May 28 snapshots. There's no need to scan the entire filesystem, no need to compare millions of files, and no need to rely on application timestamps or checksums. ZFS already knows the difference between snapshots and can generate an efficient incremental stream.
Remote replication via SSH
Replication can also be performed to a remote server. The concept is simple: the source server generates the stream with zfs send, SSH transports it and the backup server receives it with zfs receive.
zfs send -i tank/www@snap-old tank/www@snap-new | ssh backup.example.com zfs receive backup/www
This approach is clean, transparent, and very powerful. It doesn't require proprietary formats, doesn't introduce opaque archives, and doesn't force closed-source software. The destination remains a ZFS dataset, with snapshots that can be browsed, mounted, and used for real-world restores.
The problem is that manually managing zfs send and zfs receive in production can become complex. You need to identify the common snapshot between source and destination, choose the correct incremental, and handle recursive datasets, properties, interruptions, resumes, naming, errors, and differences between environments. Doing it once is simple; doing it regularly, across many datasets, is another matter entirely.
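To give a sense of just one of these steps, here is a minimal sketch of finding the most recent snapshot that exists on both sides, which is the base an incremental send needs. It compares snapshot names only, which is a simplification: comparing the guid property would be more robust, since two unrelated snapshots can share a name. The helper name latest_common_snapshot is hypothetical.

```shell
# latest_common_snapshot SRC_LIST DST_LIST
# Each file holds snapshot names (the part after "@"), one per line,
# oldest first. Prints the newest name from SRC_LIST that also appears
# in DST_LIST, or nothing if there is no common snapshot.
latest_common_snapshot() {
    # -F fixed strings, -x whole-line match, -f read patterns from file
    grep -Fx -f "$2" "$1" | tail -n 1
}

# Example (not executed here): build the two lists, then send incrementally.
# zfs list -H -t snapshot -o name -s creation -d 1 tank/www \
#     | sed 's/.*@//' > /tmp/src.list
# ssh backup.example.com zfs list -H -t snapshot -o name -s creation -d 1 backup/www \
#     | sed 's/.*@//' > /tmp/dst.list
# base=$(latest_common_snapshot /tmp/src.list /tmp/dst.list)
# zfs send -i "tank/www@$base" tank/www@snap-new \
#     | ssh backup.example.com zfs receive backup/www
```

If no common snapshot is found, an incremental send is not possible and a full send is required, which is exactly the kind of decision that becomes tedious to script by hand.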
What is Syncoid?
Syncoid is the tool that simplifies ZFS replication. It's the natural complement to Sanoid: while Sanoid manages snapshot creation and retention, Syncoid takes care of replicating datasets to another local or remote destination. Internally, it uses the native ZFS primitives zfs send and zfs receive, but automates many operational details.
A local replica can be started with:
syncoid tank/www backup/www
Remote replication via SSH can be done with:
syncoid tank/www root@backup.example.com:backup/www
To include child datasets as well:
syncoid -r tank/www root@backup.example.com:backup/www
Syncoid identifies common snapshots, calculates the most suitable incremental path, and transfers only what's needed. This reduces the risk of human error and makes frequent replication much easier to automate. Instead of manually building long command chains, you can delegate the synchronization logic to Syncoid.
Why Sanoid and Syncoid are such a great pair
Sanoid and Syncoid work well together because they solve two parts of the same problem. Sanoid creates and maintains a local history of the data through regular snapshots. Syncoid transfers that history, or part of it, to another pool or a remote server. The result is a comprehensive strategy: rapid local recovery and remote protection against more severe scenarios.
Local snapshots are ideal for everyday incidents: accidental deletions, unwanted changes, failed updates, overwritten files. Remote replication, on the other hand, protects against hardware failures, server loss, primary pool issues, compromises, or more extensive disasters. You don't have to choose between local snapshots and remote backup: you need both.
A good architecture can include frequent snapshots on the production server and periodic replications to a backup server. A different, perhaps longer, retention period can be maintained on the remote server to provide greater historical depth than the source. This approach allows for a combination of recovery speed, efficiency, and resilience.
Example strategy for a production server
Let's imagine a server with a ZFS pool called tank and some main datasets:
tank/www
tank/home
tank/mysql
tank/backups
The dataset tank/www contains the websites, tank/home the user home directories, tank/mysql the database data, and tank/backups any logical dumps or application exports. A possible strategy could include hourly snapshots for web data, daily snapshots for a few weeks, weekly snapshots for a few months, and monthly snapshots for longer retention.
For databases, you can choose a more conservative policy, integrated with logical dumps or database replication. For static data, you can maintain a longer retention period. For highly dynamic datasets, however, space consumption must be carefully monitored.
An example of a cron job for Syncoid might be:
15 * * * * /usr/sbin/syncoid -r tank/www root@backup.example.com:backup/www >/dev/null 2>&1
30 * * * * /usr/sbin/syncoid -r tank/home root@backup.example.com:backup/home >/dev/null 2>&1
45 2 * * * /usr/sbin/syncoid tank/mysql root@backup.example.com:backup/mysql >/dev/null 2>&1
In this example the output is redirected to /dev/null. This is common in cron jobs to suppress unwanted output. In production, however, it's advisable not to lose visibility into errors. A better option may be to send logs to syslog, a dedicated file, or a monitoring system. A backup that fails without notifying anyone is extremely dangerous, as it creates a false sense of security.
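One simple way to keep that visibility is a small wrapper that records the output and the outcome of each run in a dedicated file while preserving the exit status. The sketch below is only one possible approach; run_logged and the log file path are hypothetical names, and the same idea can be adapted to syslog or a monitoring agent.

```shell
# run_logged LOGFILE COMMAND [ARGS...]
# Runs the command, appends its output and a timestamped status line to
# LOGFILE, and returns the command's exit status so that cron or a
# monitoring system can react to failures.
run_logged() {
    logfile=$1
    shift
    status=0
    "$@" >>"$logfile" 2>&1 || status=$?
    if [ "$status" -eq 0 ]; then
        result=OK
    else
        result="FAILED (exit $status)"
    fi
    printf '%s %s: %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$result" "$*" >>"$logfile"
    return "$status"
}

# Example (not executed here):
# run_logged /var/log/syncoid-www.log \
#     /usr/sbin/syncoid -r tank/www root@backup.example.com:backup/www
```

The log file then contains one status line per run, which makes it easy to alert on the word FAILED or on the absence of a recent OK entry.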
Monitoring and recovery testing
Configuring snapshots and replication isn't enough. A backup is only useful if it can be restored. For this reason, it's essential to regularly monitor the pool's status, the presence of snapshots, available space, and the correct updating of remote replicas.
Some useful commands are:
zpool status
zfs list
zfs list -t snapshot
zfs list -o name,used,referenced,usedbysnapshots
You need to verify that the pool is healthy, that there are no checksum errors, that the snapshots are recent, and that the remote destination is receiving incrementals correctly. It's equally important to check that space isn't running out due to overly aggressive retention or datasets with a high change rate.
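Checking that snapshots are recent can be automated. The following sketch reads the machine-readable snapshot list (with -p, the creation property is printed as seconds since the epoch) and reports datasets whose newest snapshot is older than a given age. The helper name stale_datasets is hypothetical, and datasets that have no snapshots at all do not appear in the input, so they would need a separate check.

```shell
# stale_datasets MAX_AGE_SECONDS [NOW_EPOCH]
# Reads "dataset@snapshot<TAB>creation-epoch" lines on stdin and prints,
# sorted, every dataset whose newest snapshot is older than MAX_AGE_SECONDS.
# NOW_EPOCH defaults to the current time and exists mainly for testing.
stale_datasets() {
    max_age=$1
    now=${2:-$(date +%s)}
    awk -F '\t' -v max="$max_age" -v now="$now" '
        {
            split($1, parts, "@")
            ds = parts[1]
            t = $2 + 0
            if (!(ds in newest) || t > newest[ds]) newest[ds] = t
        }
        END {
            for (ds in newest)
                if (now - newest[ds] > max + 0) print ds
        }' | sort
}

# Example (not executed here): warn about datasets without a snapshot
# in the last two hours. Run it on the backup server to check replicas.
# zfs list -H -p -t snapshot -o name,creation | stale_datasets 7200
```

Running the same check on the destination is a cheap way to confirm that replication is still delivering new snapshots, not just that it once worked.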
Restore testing is an often overlooked but crucial practice. Periodically, you should take a replicated dataset, mount it in an isolated environment, and verify that the files are readable and that the application can restart. An untested backup is a promise, not a guarantee. In professional settings, the restore procedure should be documented as carefully as the backup procedure.
Replication security
When replicating via SSH, you also need to think about privileges. Using the root user directly is simple, but it's not always the best choice. In many environments, it's preferable to use dedicated SSH keys, limit authorized IP addresses, configure specific users, and consider using ZFS delegation via zfs allow.
The backup server shouldn't be a simple extension completely controlled by the primary server. If an attacker compromises the source and can also delete remote snapshots, replication loses much of its value. For this reason, in the most sensitive contexts, it's worth considering independent retention on the destination, limited permissions, pull replication (initiated by the backup server) instead of push, or additional offline copies.
Backup security isn't just about transfer encryption, but also protection against accidental or malicious deletion. A good replication should make it easy to write new snapshots to the destination, but difficult to destroy the existing history.
Advantages over traditional file-based backups
Compared to many file copy-based solutions, Sanoid and Syncoid directly leverage the internal knowledge of ZFS. A backup based on rsync, for example, must traverse the filesystem, comparing files, timestamps, sizes, and possibly checksums. On trees with millions of files, this step can be costly, even when little has changed.
With ZFS, however, the difference between two snapshots is already known to the filesystem. zfs send can generate an incremental stream of changed blocks without having to scan the entire file tree. This makes replication very efficient, especially on large datasets with relatively few changes.
Another advantage is that the destination isn't an opaque archive. The replication result is a real ZFS dataset that can be consulted, mounted, and used. Snapshots remain available on the remote server, allowing recovery of previous versions of the data. This makes backup not only efficient, but also operational.
Not a magic solution, but a professional basis
Sanoid and Syncoid aren't a magic wand. They don't eliminate the need for proper dataset design, they don't replace monitoring, they don't address application consistency on their own, and they don't automatically protect against every scenario. However, they are extremely valuable tools because they work consistently with the OpenZFS philosophy: using native primitives, reducing complexity, transferring only what changes, and keeping control in the hands of the administrator.
The quality of a backup strategy always depends on the whole: dataset structure, snapshot frequency, retention, remote replication, security, monitoring, restore testing, and documented procedures. Sanoid and Syncoid excellently cover two fundamental elements of this chain: snapshot automation and efficient replication.
Conclusion
Sanoid and Syncoid represent one of the most compelling pairings for those using OpenZFS in server environments, hosting, Linux infrastructure, self-hosted storage, or production systems. Sanoid allows you to transform snapshots from a manual operation into an automated policy, with clear and predictable retention. Syncoid allows you to replicate datasets to another pool or a remote server by leveraging the power of zfs send and zfs receive.
Together, they offer a modern data protection model: local snapshots for rapid recovery, incremental replication for remote protection, efficient transfer, transparent management, and full technical control. They're not a product to install and forget, but a solid foundation on which to build a professional process.
For those who administer Linux servers and use OpenZFS, learning how to use snapshots, zfs send, Sanoid, and Syncoid properly means taking a major step forward in risk management. It means moving from occasional copies and fragile scripts to a structured, verifiable, and consistent strategy. In a world where data is increasingly critical, being able to go back in time and efficiently replicate what really matters isn't a luxury: it's an operational necessity.