Selective (offsite) backup

Hey, I have a personal installation running the community Docker version, using a BTRFS snapshot-based strategy for fallback and revisioning. Many of my libraries make extensive use of the file revisioning feature.

I’d like to do an offsite backup to cover a worst-case scenario for relevant data (both unencrypted and encrypted), which is

  1. only selected libraries, or libraries below a size limit; these could be defined by name or by some kind of tag in the library name
  2. within a library, only the last file revision or a limited number of file revisions

Currently, I don’t know a good way to automate this. For (1) I would probably have to get the library identifiers to do a selective sync of the Seafile data folder, while still using a full database snapshot. For (2), I only see the SeaDrive/FUSE interface, which works only for unencrypted libraries and would be inferior and slower than using the raw stored backend data.
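For illustration, a minimal sketch of approach (1), assuming the usual Seafile storage layout `storage/{commits,fs,blocks}/<library-id>` under the seafile-data directory; the paths, the destination and the library ID below are placeholders, and the script only prints the rsync commands it would run:

```shell
#!/bin/sh
# Hypothetical sketch: print the rsync commands for a selective sync of the
# Seafile data folder. Assumes the layout
#   <seafile-data>/storage/{commits,fs,blocks}/<library-id>
# SEAFILE_DATA, DEST and the library ID are placeholders for your setup.
SEAFILE_DATA=${SEAFILE_DATA:-/opt/seafile/seafile-data}
DEST=${DEST:-backup-host:/srv/seafile-offsite}

sync_library() {
  id=$1
  for part in commits fs blocks; do
    # --relative keeps the storage/<part>/<id> structure on the remote side
    echo rsync -a --relative "$SEAFILE_DATA/./storage/$part/$id/" "$DEST/"
  done
}

# One call per selected library ID (this ID is made up):
sync_library 0fdf3c4e-5c1a-4fdd-b2f7-000000000001
```

Dropping the `echo` would actually execute the transfers; a full database snapshot would still have to be taken alongside, as noted above.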

Does anybody have a recommendation or idea?


Hi,
We were asking ourselves the same question… We thought about using the WebDAV share to get at the latest version of each library. But I don’t think that works with encrypted libraries…

OK, what I finally did, which came closest to my vision, was:

  • restrict file revisioning in Seafile to 30 days for all libraries
  • manually blacklist all libraries exceeding a certain size limit in the data folders, using the library ID with wildcard path matching

That seems to work for me, but I have to maintain the blacklist by hand.
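The by-hand part could be scripted; here is a sketch of generating the blacklist from a size limit instead, assuming library blocks live under `storage/blocks/<library-id>`. The threshold, paths and filenames are made up:

```shell
#!/bin/sh
# Hypothetical sketch: emit an rsync exclude pattern for every library whose
# block store exceeds a size limit. Paths and the limit are placeholders.
build_blacklist() {
  blocks_dir=$1   # e.g. <seafile-data>/storage/blocks
  limit_kib=$2    # e.g. 1048576 KiB = 1 GiB
  du -sk "$blocks_dir"/*/ 2>/dev/null | while read -r size path; do
    if [ "$size" -gt "$limit_kib" ]; then
      basename "$path"   # the directory name is the library ID
    fi
  done
}

# Turn each oversized library ID into a wildcard pattern that excludes it
# from commits/, fs/ and blocks/ alike:
build_blacklist /opt/seafile/seafile-data/storage/blocks 1048576 \
  | sed 's#.*#storage/*/&/**#' > blacklist.txt
# rsync -a --exclude-from=blacklist.txt <seafile-data>/ backup-host:/srv/...
```

Re-running this before each backup would keep the blacklist current as libraries grow past the limit.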

Just a question: why are you versioning with BTRFS? Seafile already does its own versioning, and if your BTRFS snapshots are not aligned with both the Seafile database and the actual library blocks, restoring will be a nightmare, no? The database contains all the history/versioning block information, so every restore implicitly contains the file versioning, etc.

I usually configure my versioning in Seafile, then dump the database and back that up along with a current snapshot of the actual data blocks. That way it’s one unit to restore, and I can then use Seafile itself to roll back to previous versions.
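As a sketch, that dump-and-sync routine could look like the following. The database names `ccnet_db`/`seafile_db`/`seahub_db` are Seafile’s defaults, but the host, credentials and paths are placeholders, and the script only prints its plan rather than running it:

```shell
#!/bin/sh
# Hypothetical sketch: print the backup plan (DB dumps first, then the data
# blocks) so that database and block store are captured as one unit.
# All hosts and paths below are placeholders for your own setup.
BACKUP_DIR=${BACKUP_DIR:-/srv/backup/seafile}
SEAFILE_DATA=${SEAFILE_DATA:-/opt/seafile/seafile-data}

plan_backup() {
  # ccnet_db/seafile_db/seahub_db are Seafile's default database names
  for db in ccnet_db seafile_db seahub_db; do
    echo "mysqldump -h db-host -u seafile --databases $db > $BACKUP_DIR/$db.sql"
  done
  echo "rsync -a $SEAFILE_DATA/ $BACKUP_DIR/seafile-data/"
}

plan_backup
```

Dumping the databases immediately before syncing the blocks keeps the two reasonably consistent, though only a filesystem snapshot makes them truly atomic.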

Sorry if I missed the point, this post just caught my interest. However, for your requirement to only include certain libraries, that is interesting… I hope others maybe have an automated solution, I might find that useful also.

I have several reasons to use BTRFS snapshots in addition to the revisioning in Seafile itself:

  1. I don’t trust the application level: I have seen complaints about missing files, and the Drive client has had issues that led to unattended file deletions, which would be lost if revisioning is not enabled for a library or the revisions have run past the revision age setting. I also haven’t yet done a disaster recovery of Seafile with a broken DB and/or file storage, so I can’t be sure about it myself. BTRFS snapshots just add a global back-in-time-travel option as a fallback, on local storage only, not as a backup. I use the same strategy for different installations and rely on the automated snapshot utility btrbk, which implements triggers and retention-time management.
  2. A BTRFS snapshot takes less than a second and only uses space that would otherwise sit idle, because the Seafile storage partition (logical volume) has quite some empty space reserved for growing the installation. I can adjust the strategy at any time. The snapshots are fully incremental at the block level (copy-on-write filesystem), so they only consume what is added or deleted.
  3. I’m running a Docker Compose based setup and put the Dockerfiles with app versions, the backend data files, and the database files all together on the same BTRFS (sub)volume. This way, I don’t need any DB-specific backup scripts and can “boot” into any Seafile data version, discard it, and boot the current one again if needed, just by pointing the mountpoint at a previous snapshot.
  4. Since snapshots are read-only by default and don’t change after they are taken, I can use the latest snapshot to perform a regular offsite backup at any time without any downtime on the Seafile server. So even if you don’t keep snapshots long-term like I do, they are still useful for running backups.
  5. I don’t want to do an offsite backup of all file revisions because that is simply too expensive. Since there seems to be no way to selectively back up particular revisions, and since users can by default keep essentially all revisions of all files in a library, I have to restrict the revision age globally; but I can still recover older files from the snapshots if really needed.
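To illustrate points 2 and 4, here is a minimal manual equivalent of what btrbk automates: a timestamped read-only snapshot, then a backup taken from it while the server keeps running. The subvolume paths and backup target are made up, and the snapshot/transfer commands are only printed:

```shell
#!/bin/sh
# Hypothetical sketch: take a read-only BTRFS snapshot and back it up
# offsite without Seafile downtime. Paths are placeholders; btrbk
# automates the snapshot creation and retention handling.

snap_name() {
  # Timestamped snapshot name, e.g. seafile.20240101T120000Z
  echo "seafile.$(date -u +%Y%m%dT%H%M%SZ)"
}

NAME=$(snap_name)
echo "btrfs subvolume snapshot -r /mnt/pool/seafile /mnt/pool/snapshots/$NAME"
echo "rsync -a /mnt/pool/snapshots/$NAME/ backup-host:/srv/seafile-offsite/"
# Alternatively, btrfs send -p <previous-snapshot> gives incremental
# transfers between two read-only snapshots.
```

Because the snapshot is read-only, the rsync sees a frozen, self-consistent view of the database files and block store, no matter how long the transfer takes.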

If you can read some German, I have explained my setup a little here: Seafile-Docker hinter eigenem reverse proxy betreiben (running Seafile Docker behind your own reverse proxy)

Cool, that definitely gave me something to think about. Maybe I’ll run that document through Google Translate and see how well that works out, haha. Cheers, thanks for taking the time to explain what you’re doing.