Backup Solutions for Seafile

We have a relatively large Seafile installation, with some 400GB of data and some 200 users. We are now hitting a wall when backing up Seafile: the sheer number of small files makes it difficult to complete a backup in a timely fashion.

For instance,
seafile-data# find . -type f |wc -l
seafile-data# find . -size +10M -type f |wc -l
seafile-data# find . -size +1M -type f |wc -l

So we have 1.7 million files smaller than 1M. We have been using duplicity for backups to object storage, but with millions of files the manifest and signature files are getting large and unwieldy to manipulate. Is anyone else hitting these constraints or difficulties?
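The three find passes above each walk the whole tree. As a rough sketch (the function name `size_histogram` and the bucket names are made up for illustration), the same counts can be gathered in a single walk, which matters when the tree holds millions of files:

```python
import os
import stat

MIB = 1024 * 1024  # find's "M" unit is mebibytes


def size_histogram(root):
    """Count regular files under root in a single walk, by size bucket."""
    counts = {"total": 0, "over_1M": 0, "over_10M": 0}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file disappeared while we were walking
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, sockets, etc., like -type f
            counts["total"] += 1
            if st.st_size > MIB:
                counts["over_1M"] += 1
            if st.st_size > 10 * MIB:
                counts["over_10M"] += 1
    return counts
```

Calling `size_histogram("seafile-data")` returns a dict; the number of files at or under 1M is `total - over_1M`.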

We run seaf-gc weekly to tidy up and release storage as quickly as possible. Library history is kept for 90 days.

I haven’t had any issues so far with 2057559 files (655 GiB) and a few users (>1M: 174388, >10M: 658). I’d recommend backing up the data using rsync. First back up the databases, then copy all files over to another server using rsync.
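A minimal sketch of that ordering, databases first and files second. It assumes a MySQL/MariaDB backend dumped with mysqldump; the database names, paths, and host in the usage note are placeholders, and `plan_rsync_backup` / `run_plan` are hypothetical helper names, not part of Seafile:

```python
import subprocess


def plan_rsync_backup(data_dir, remote, dump_dir, databases):
    """Return the backup commands in the order they should run."""
    plan = []
    # 1. Dump every database first (assumes MySQL/MariaDB).
    for db in databases:
        plan.append(["mysqldump", "--single-transaction",
                     f"--result-file={dump_dir}/{db}.sql", db])
    # 2. Copy the dumps, then the data directory, to the other server.
    plan.append(["rsync", "-a", f"{dump_dir.rstrip('/')}/", f"{remote}/db/"])
    plan.append(["rsync", "-a", f"{data_dir.rstrip('/')}/",
                 f"{remote}/seafile-data/"])
    return plan


def run_plan(plan, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for argv in plan:
        if dry_run:
            print(" ".join(argv))
        else:
            subprocess.run(argv, check=True)
```

For example, `run_plan(plan_rsync_backup("/opt/seafile/seafile-data", "backup-host:/srv/backup", "/var/backups/seafile-db", ["ccnet_db", "seafile_db", "seahub_db"]))` prints the five commands without running anything. The remote base directory has to exist already, since rsync only creates the last path component.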

The problem I see is that duplicity is not really suitable for backups with many files.

You have an interesting problem. Can you give some background on the system you’re using?

What are the hardware specs of the Seafile server?

What OS and version of Seafile are you running?

Is the data store co-located with the Server OS or saved to an external volume?

Is the current backup being saved to a local volume or a network device?

When you say it’s difficult to complete a backup in a timely fashion, how long is it taking?

Do you also need an off-site backup of the server (or the files)?

I’m not familiar with Duplicity but this might be interesting/useful:

I’ve recently installed duplicity v0.7.07. I have not used duplicity before. I have noticed very low performance doing backups. For medium-sized files (several MB) throughput was about 8MB/s. I have done a test backup (89.3 MB, 785 small files) which ran for 167 seconds. This is about 0.5MB/s. In both cases both disks and CPU were mostly idle (my HDD throughput is about 130MB/s). Strace revealed an excessive fsync() for files under ~/.cache/duplicity (*.sigtar.part, *.manifest.part), done after each write(). Running with fsync() disabled (via LD_PRELOAD) cut the second test time from 167 to 3 seconds, a 55x improvement.

I have located the following patch as a fix for bug #1538333 (released in v0.7.07):

Several of the comments on the post may be important as well.
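To get a feel for the effect described in the quoted report, here is a small sketch that writes the same chunks to a file either with an fsync() after every write() or with a single fsync() at the end. It is not duplicity's code; `write_chunks` and `timed` are names invented for this illustration, and the timings depend entirely on the disk and filesystem:

```python
import os
import time


def write_chunks(path, chunks, fsync_each):
    """Write chunks to path; optionally fsync after every write()."""
    written = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            written += f.write(chunk)
            if fsync_each:
                f.flush()             # push Python's buffer to the OS
                os.fsync(f.fileno())  # force the OS to hit the disk
        f.flush()
        os.fsync(f.fileno())          # one final sync in both modes
    return written


def timed(path, chunks, fsync_each):
    """Return (bytes written, elapsed seconds) for one run."""
    start = time.perf_counter()
    n = write_chunks(path, chunks, fsync_each)
    return n, time.perf_counter() - start
```

Comparing `timed(p, chunks, True)` with `timed(p, chunks, False)` for a few thousand small chunks on a spinning disk should show the same kind of gap, though not necessarily the same factor.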

For large installations (ours has about 9 TB now, growing by about 1 TB/month), rsync and derivatives are much too slow to be usable. We use an NFS mount of a ZFS filesystem and use snapshotting and zfs send/receive for backup. Works like a charm. Downtime is just a few seconds: the time it takes to shut down Seafile, dump the database, create the snapshot and restart Seafile.
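The sequence described here (stop, dump, snapshot, restart, then send) could be sketched as below. Everything concrete is an assumption for illustration: a MySQL/MariaDB backend, the standard `seafile.sh` / `seahub.sh` control scripts, and all commands running on one host, whereas with an NFS-mounted dataset the zfs commands would run on the storage server. `plan_zfs_backup` is a hypothetical helper that only builds the command strings:

```python
import shlex


def plan_zfs_backup(seafile_dir, databases, dump_dir, dataset, snapshot,
                    previous_snapshot, target_host, target_dataset):
    """Return (phase, shell command) pairs in execution order."""
    q = shlex.quote
    steps = [
        ("stop", f"{q(seafile_dir + '/seahub.sh')} stop"),
        ("stop", f"{q(seafile_dir + '/seafile.sh')} stop"),
    ]
    for db in databases:  # dump while nothing is writing
        steps.append(("dump", f"mysqldump --single-transaction "
                              f"--result-file={q(dump_dir + '/' + db + '.sql')} {q(db)}"))
    new = f"{dataset}@{snapshot}"
    steps.append(("snapshot", f"zfs snapshot {q(new)}"))
    steps.append(("start", f"{q(seafile_dir + '/seafile.sh')} start"))
    steps.append(("start", f"{q(seafile_dir + '/seahub.sh')} start"))
    # The transfer happens after the restart, so it adds no downtime.
    if previous_snapshot:
        send = f"zfs send -i {q(dataset + '@' + previous_snapshot)} {q(new)}"
    else:
        send = f"zfs send {q(new)}"  # first run: full stream
    steps.append(("send", f"{send} | ssh {q(target_host)} "
                          f"zfs receive {q(target_dataset)}"))
    return steps
```

Only the steps between the first "stop" and the last "start" count as downtime; the slow part, the send/receive, runs against the immutable snapshot while Seafile is already back up.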

As long as the garbage collector is not running at the same time, you can simply back up the database online and create a snapshot afterwards (which is not even required, as long as you can live with some orphaned objects on restore). This works because the database stores pointers to the most recent commit of each library. Those pointers remain valid even if new data is written to a library after the database backup, because older commits are only touched while the garbage collector is running.
On the other hand, the downtime required is really low, so it is not really an issue to stop Seafile for a moment before backing up the databases and creating the snapshot.
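The ordering argument can be checked with a toy model. This is a deliberate simplification, not Seafile's actual data model: the "database" is a dict of head pointers, the object store is an append-only dict of commits, and the toy `gc()` keeps only the current heads (a real setup keeps history, e.g. 90 days). All names are invented for this sketch:

```python
class ToySeafile:
    def __init__(self):
        self.objects = {}  # commit id -> parent id (object store)
        self.heads = {}    # library -> head commit id ("database")
        self._n = 0

    def commit(self, library):
        """Append a new commit and move the library's head pointer."""
        self._n += 1
        cid = f"c{self._n}"
        self.objects[cid] = self.heads.get(library)
        self.heads[library] = cid
        return cid

    def gc(self):
        """Toy garbage collection: drop everything except current heads."""
        live = set(self.heads.values())
        self.objects = {c: p for c, p in self.objects.items() if c in live}


def restorable(db_backup, object_snapshot):
    """True if every head in the DB backup exists in the object snapshot."""
    return all(head in object_snapshot for head in db_backup.values())


def orphans(db_backup, object_snapshot):
    """Objects in the snapshot that no backed-up head points to."""
    return set(object_snapshot) - set(db_backup.values())
```

Backing up the dict of heads first and copying the objects later stays restorable even if a commit lands in between; the extra commit simply shows up in `orphans()`. Running `gc()` between the two steps, or snapshotting the objects before the database, can leave a head pointing at a missing commit.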

Interesting that there is already about 1 TB of new data per month. Wouldn’t have expected that.

The reason for duplicity was that it supports a multitude of backends to store the data. We currently take an LVM snapshot of the filesystem, so the downtime for Seafile is short, as C_Shilittchen says. As most of our support is for OpenStack clouds, we have been using the Swift object storage. Duplicity was nice because of the encryption option, the manifests, and the signing of the dump, so you know it has not been tampered with.

We are curious whether the single weekly run of the garbage collector is actually freeing space correctly, but we can’t tell for sure.

I agree with shoeper that the tool, and how it goes about creating the backup, is the issue. The actual backup to object storage does not take overly long; it is the creation of the manifest and the file signatures. Currently a backup runs for close to 24 hours, and nearly all of that is spent creating the signatures. The length of time it takes to create the signature file, and its size, are the issue.

I have come across rclone and freezer since starting this conversation and will be investigating them as possible replacements.