I wrote this post for the Altinity Blog - the original can be found here: https://altinity.com/blog/introduction-to-clickhouse-backups-and-clickhouse-backup
The native replication support built into ClickHouse provides high availability and resilience against individual node failures. However, rare disaster scenarios may require recovering data from backups. These include data corruption and the failure of all replicas in a shard or cluster.
A critical component of any ClickHouse backup scheme is “freezing” tables. As with all databases, consistent backups depend on ClickHouse being in a “quiesced” state. Instead of having to halt the database entirely, ClickHouse has native support for “freezing” tables for backup or migration. This is a no-downtime operation.
Manual Backups in Four Easy Steps
ClickHouse includes native support for instantaneous point-in-time backups, through its ‘ALTER TABLE… FREEZE’ feature.
- Confirm your shadow directory is empty:
ls /var/lib/clickhouse/shadow/
- Ask ClickHouse to freeze your table:
echo -n 'alter table events freeze' | clickhouse-client
- Save your backup in case of disaster:
cd /var/lib/clickhouse/
sudo mkdir backup
sudo cp -r shadow/ backup/my-backup-name
- Finally, clean up the backup source for next time:
sudo rm -rf /var/lib/clickhouse/shadow/*
ClickHouse uses filesystem hard links to achieve instantaneous backups with no downtime (or locking) for ClickHouse services. These hard links can be further leveraged for efficient backup storage. On filesystems that support hard links, such as local filesystems or NFS, use cp with the -l flag (or rsync with the --hard-links and --numeric-ids flags) to avoid copying data.
When hard links are utilized, storage on disk is much more efficient. Because they rely on hard links, each backup is effectively a “full” backup, even though duplicate use of disk space is avoided.
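For example, step 3 above could use hard links instead of copying the data (paths follow the example above; the NFS destination in the rsync command is only an illustration):
cd /var/lib/clickhouse/
sudo cp -rl shadow/ backup/my-backup-name
Or, to mirror the local backup directory to an NFS mount while preserving hard links between backups:
rsync -a --hard-links --numeric-ids backup/ /mnt/nfs/clickhouse-backups/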
Test Your Backup
It is rightly said that a backup is worthless if the restoration process hasn’t been tested. Perform regular test restores to ensure your data will be there when you need it.
Here are the steps for manual recovery:
- Drop your test table, or find another server for testing.
- Create your test table for recovery (see the tip after these steps for capturing the DDL into events.sql):
cat events.sql | clickhouse-client
- Copy your backup to the table’s `detached` directory:
cd /var/lib/clickhouse
sudo cp -rl backup/my-backup-name/* data/default/events/detached/
- Attach the detached parts:
echo 'alter table events attach partition 202006' | clickhouse-client
- Confirm your data has been restored:
echo 'select count() from events' | clickhouse-client
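If you have not saved the table DDL ahead of time, the events.sql file used above can be captured from a healthy replica before the test; a sketch using standard clickhouse-client options:
echo 'show create table events' | clickhouse-client --format=TabSeparatedRaw > events.sql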
Automate the Backup Process with clickhouse-backup
The clickhouse-backup tool, created by Alex Akulov, helps to automate the manual steps above: https://github.com/AlexAkulov/clickhouse-backup. We like clickhouse-backup and have implemented several new features, which are described here for the first time.
To get started you’ll need to install clickhouse-backup. Full instructions are in the ReadMe.md file. Here’s an example of installation from a tarball. RPMs, Debian packages, and Docker images are also available.
wget https://github.com/AlexAkulov/clickhouse-backup/releases/download/v0.5.2/clickhouse-backup.tar.gz
tar -xf clickhouse-backup.tar.gz
cd clickhouse-backup/
sudo cp clickhouse-backup /usr/local/bin
clickhouse-backup -v
The API features and new storage options like remote_storage described in this blog article are not yet available in an official build. You either need to build from source or run the latest Docker image. Here’s an example of the latter.
docker run --rm -it --network host \
-v "/var/lib/clickhouse:/var/lib/clickhouse" \
-e CLICKHOUSE_PASSWORD="password" \
-e S3_BUCKET="clickhouse-backup" \
-e S3_ACCESS_KEY="access_key" \
-e S3_SECRET_KEY="secret" \
alexakulov/clickhouse-backup:master --help
For the rest of the article we assume you have a build that has the new features. When used on the command line, clickhouse-backup requires a configuration file. Here is a minimal example.
$ cat /etc/clickhouse-backup/config.yml
general:
remote_storage: none
You will need to add additional configuration options for non-default ClickHouse installations or authentication. A full config example can be created by running clickhouse-backup default-config. This is a great starting point for your use, showing all available settings.
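For example, a config for a ClickHouse server that requires a password might look like the following; the section and key names come from the default config, and the values are placeholders:
general:
  remote_storage: none
clickhouse:
  host: localhost
  port: 9000
  username: default
  password: "secret"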
Once configured, clickhouse-backup provides a variety of subcommands for managing backups.
$ clickhouse-backup help
NAME:
   clickhouse-backup - Tool for easy backup of ClickHouse with cloud support
...
COMMANDS:
tables Print list of tables
create Create new backup
upload Upload backup to remote storage
list Print list of backups
download Download backup from remote storage
restore Create schema and restore data from backup
delete Delete specific backup
default-config Print default config
freeze Freeze tables
clean Remove data in 'shadow' folder
server Run API server
help, h Shows a list of commands or help for one command
Just like the manual backup example above, you will need to use sudo or run clickhouse-backup as the clickhouse user.
The configuration file allows for certain databases or tables to be ignored. The tables subcommand will show you which tables will be backed up:
$ clickhouse-backup tables
default.events
system.metric_log (ignored)
system.query_log (ignored)
system.query_thread_log (ignored)
system.trace_log (ignored)
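The ignore list is driven by the skip_tables setting in the config file. A sketch of the relevant section (the default skips the system database, matching the output above):
clickhouse:
  skip_tables:
    - system.*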
Creating a backup is as easy as:
$ sudo clickhouse-backup create
2020/07/06 20:13:02 Create backup '2020-07-06T20-13-02'
2020/07/06 20:13:02 Freeze `default`.`events`
2020/07/06 20:13:02 Skip `system`.`metric_log`
2020/07/06 20:13:02 Skip `system`.`query_log`
2020/07/06 20:13:02 Skip `system`.`query_thread_log`
2020/07/06 20:13:02 Skip `system`.`trace_log`
2020/07/06 20:13:02 Copy metadata
2020/07/06 20:13:02 Done.
2020/07/06 20:13:02 Move shadow
2020/07/06 20:13:02 Done.
As you can see in the example above, the backup completed within the same second.
You can review existing local backups:
$ sudo clickhouse-backup list
Local backups:
- '2020-07-06T20-13-02' (created at 06-07-2020 20:13:02)
Note that `Size` is not computed for local backups for performance reasons.
Internally, clickhouse-backup utilizes hard links when possible, as described above. The backup is stored in /var/lib/clickhouse/backup/BACKUPNAME. The backup name defaults to a timestamp, but you can optionally specify the backup name with the --name flag. The backup contains two directories: a `metadata` directory, with the DDL SQL statements necessary to recreate the schema, and a `shadow` directory with the data as a result of the ALTER TABLE ... FREEZE operation.
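You can inspect this layout directly, for example:
$ sudo ls /var/lib/clickhouse/backup/2020-07-06T20-13-02
metadata  shadow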
Restoring from a backup is also easy. For example:
$ echo 'drop table events' | clickhouse-client
$ sudo clickhouse-backup restore 2020-07-06T20-13-02
2020/07/06 20:14:46 Create table `default`.`events`
2020/07/06 20:14:46 Prepare data for restoring `default`.`events`
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_1_1_4'
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_2_2_2'
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_3_3_3'
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_4_4_3'
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_5_5_2'
2020/07/06 20:14:46 ALTER TABLE `default`.`events` ATTACH PART '202006_6_6_1'
The restore subcommand automates both schema and data restoration. In case you only want to restore the schema, use the optional --schema flag. Or if you only want to restore the data (assuming the schema already exists), you can use the --data flag. The latter case is especially useful when restoring to a server that already has existing data.
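For example, to recreate only the schema from the backup above and then load its data in a separate step:
$ sudo clickhouse-backup restore --schema 2020-07-06T20-13-02
$ sudo clickhouse-backup restore --data 2020-07-06T20-13-02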
Another useful feature is support for specifying a table pattern with most commands, such as create and restore. The --table argument allows you to back up (or restore) a specific table. You can also use a regex to, for example, target a specific database: --table=dbname.*.
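For example, to back up just the events table, or to restore only that table from the earlier backup:
$ sudo clickhouse-backup create --table=default.events
$ sudo clickhouse-backup restore --table=default.events 2020-07-06T20-13-02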
Remote Backup Destinations
Of course, you could rsync your backup to a remote destination, save it to an object store like S3, or archive it using an existing backup solution. Local storage is usually insufficient to meet data durability requirements.
The clickhouse-backup tool supports uploading and downloading backups from a remote object store, such as S3, GCS, or IBM’s COS. A minimal AWS S3 configuration looks like:
s3:
access_key: <YOUR AWS ACCESS KEY>
secret_key: <YOUR AWS SECRET KEY>
bucket: <YOUR BUCKET NAME>
region: us-east-1
path: "/some/path/in/bucket"
Once you have configured your credentials and destination bucket, clickhouse-backup can take care of the rest:
$ clickhouse-backup upload 2020-07-06T20-13-02
2020/07/07 15:22:32 Upload backup '2020-07-06T20-13-02'
2020/07/07 15:22:49 Done.
The remote backup can be downloaded to local storage before restoration:
$ sudo clickhouse-backup download 2020-07-06T20-13-02
2020/07/07 15:27:16 Done.
The clickhouse-backup config file supports backups_to_keep_local and backups_to_keep_remote settings; tune them to meet your data retention requirements. For example, set backups_to_keep_local: 7 and backups_to_keep_remote: 31 to retain a week’s worth of nightly backups locally and a month’s worth remotely. Set both to 0 to disable backup pruning.
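In the config file, that retention policy would look something like this (the backups_to_keep_* keys sit in the general section; remote_storage: s3 matches the example above):
general:
  remote_storage: s3
  backups_to_keep_local: 7
  backups_to_keep_remote: 31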
There is also a --diff-from option to the upload subcommand. This feature compares files to a previous local backup, uploading only new or changed files. Note that you must retain the previous backup in order to restore from the new backup.
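A sketch of such an upload, assuming a newer backup exists alongside the one created earlier (the second backup name here is hypothetical):
$ clickhouse-backup upload --diff-from=2020-07-06T20-13-02 2020-07-07T20-13-02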
Data transfer time and cost are critical aspects of remote storage. How long will it take to restore a large table to a new server? This will be largely dependent on network and storage bandwidth. It’s critical to test various recovery scenarios to have a good understanding of what recovery times you can achieve in a failure. Cost management will be important if you are using a public cloud.
Using the clickhouse-backup API
Finally, clickhouse-backup can run as a service that provides a REST API. This is a new feature. The API mirrors the command-line commands and options, and may be a more convenient way to integrate with an existing scheduling or CI/CD system.
$ sudo clickhouse-backup server &
$ curl localhost:7171/backup/list
{"Type":"local","Name":"2020-07-06T20-13-02","Size":0,"Date": "2020-07-06T20:13:02.328066165Z"}
Documentation of the API endpoints can be found here: https://github.com/AlexAkulov/clickhouse-backup#api
Using clickhouse-backup in Production
It is important to take note of the known limitations of clickhouse-backup, which are documented here: https://github.com/AlexAkulov/clickhouse-backup#limitations
In addition, the documentation contains this important warning:
Never change file permissions in /var/lib/clickhouse/backup. This path contains hard links. Permissions on all hard links to the same data on disk are always identical. That means that if you change the permissions/owner/attributes on a hard link in the backup path, permissions on the files that ClickHouse works with will be changed too. That might lead to data corruption.
Recovery Scenarios
Failed Replica
Failure of individual servers or nodes is by far the most common disaster scenario seen in production. In almost all cases, the failed replica should be replaced and the schema recreated. ClickHouse’s native replication will take over and ensure the replacement server is consistent. This failure scenario is worth testing in advance to understand the network and compute impact of rebuilding the new replica.
Failed Shard
In a clustered environment, at least one replica from each shard should be backed up. The clickhouse-backup API is one approach for orchestrating backup naming and execution across a cluster.
If all replicas in a shard fail, or, more commonly, data is corrupted, the entire shard must be restored from a backup as described above. Ideally, restore the backup to one replica, restore the schema to the others, and allow ClickHouse’s native replication to take over.
Failed Cluster
A completely failed cluster, whether due to infrastructure failure or data corruption, can be restored in the same manner as a failed shard. One replica in each individual shard must be restored via the process above.
Alternate Backup Strategies
Offline Replica with Filesystem Snapshots
A common alternative is to use an “offline replica” for backups. A replica is configured (often in another region) that is not used for queries as part of any Distributed tables. The offline replica should not perform any merges, which can be enforced with the always_fetch_merged_part and replica_can_become_leader ClickHouse MergeTree settings. While production replicas are best served by the ext4 filesystem, the backup replica uses ZFS (or another filesystem that supports snapshots). This approach provides a quick restoration process. Note that backups, in this case, are still local to the server/node and do not necessarily provide sufficient data durability. ZFS provides directory-based filesystem access to individual snapshots, so it would be possible to automate the storage of these snapshots on a remote system or object store.
Storage-as-a-Service with Snapshots
It is common for Cloud deployments to use network-based block storage (such as AWS EBS or GCP persistent disks). Some on-prem deployments use Ceph or OpenEBS for this purpose. Each of these “storage-as-a-service” technologies supports transparent volume snapshots. By first freezing tables for backup and then creating a snapshot, you can achieve nearly instantaneous backups.
Internally, snapshots only store blocks on disk that have changed since a previous snapshot. While not a true “incremental” backup, these systems provide highly efficient use of disk space. Beware that network-based block storage is rarely as performant as local disk, and be sure to monitor snapshot retention and cost.
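The flow mirrors the manual backup above, with a volume snapshot replacing the copy step; a sketch using the AWS CLI (the volume ID is a placeholder):
echo -n 'alter table events freeze' | clickhouse-client
aws ec2 create-snapshot --volume-id vol-0123456789abcdef0 --description "clickhouse backup 2020-07-06"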
Using Kafka to Improve Backups
So far we have discussed specific point-in-time backups that are created on a nightly or hourly basis (for example). Some organizations require the ability to restore to any arbitrary point in time. Because ClickHouse doesn’t have a native binary log (such as the Postgres WAL), some other mechanism is needed to “replay” the data since the last specific point-in-time backup.
Many organizations use Kafka to meet this requirement. Streaming data through Kafka into ClickHouse has many advantages for availability and fault tolerance. Another advantage is the ability to reset ingestion to any offset in the Kafka partition. When performing point-in-time backups, the Kafka offset must be stored. During recovery, the ClickHouse Kafka Engine configuration is set to the Kafka offset at the time the backup was created, and data after that point in time will be ingested.
A simple way to store Kafka offsets is to INSERT them into a table in ClickHouse that is included in the backup. This can be done in a wrapper script that pauses ingest from Kafka, writes the current topic partition offsets, starts backup, and enables ingestion again. When you restore data, you can reset offsets for the consumer group, then re-enable ingest. See the ClickHouse Kafka engine tutorial on this blog for an example of resetting offsets.
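A minimal sketch of such an offsets table (the table and column names here are hypothetical):
CREATE TABLE backup_kafka_offsets (
    backup_name String,
    topic String,
    kafka_partition UInt32,
    kafka_offset UInt64,
    created_at DateTime DEFAULT now()
) ENGINE = MergeTree
ORDER BY (backup_name, topic, kafka_partition);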
Next Steps
Before diving in and implementing a backup solution, take a moment to reflect on the requirements of your business and end users. Get a good understanding of Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) for your environment. Then, consider the approaches outlined above to determine which is the best fit.
One of ClickHouse’s greatest strengths is the support and contributions of a diverse and invested community. The automation provided by clickhouse-backup is no exception: great thanks to Alex Akulov!
At Altinity we care as much about your data as you do. Contact us for help with backups, or any other ClickHouse challenge!
Photo by Simon Migaj on Unsplash.