Tutorial: recover your data even if your database is corrupted

Hi everyone,

I corrupted my Seafile database and could no longer access my data.
To get my data back, I wrote a tool named seafile_recovery.
It is a low-level tool that can read Seafile’s on-disk data format without needing the database.

Below, I explain how I proceeded, in the form of a short tutorial.

Step-by-step instructions to recover your data

(This tutorial was written for Linux; some adjustments may be required for other OSes.)

First, install Go by following the instructions on its website (golang dot org).

Then install the tool (the last command should display the tool’s help message):

$ go get git.deuxfleurs.fr/quentin/seafile_recovery
$ export PATH="$HOME/go/bin:$PATH"
$ seafile_recovery --help

Then find your storage directory. If your Seafile installation is in /srv/seafile-server-latest, your storage folder will by default be /srv/seafile-data/storage. To avoid specifying the storage folder path, change your working directory to /srv/seafile-data (otherwise, pass --storage=xxxx on the command line):

$ cd /srv/seafile-data
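
Before going further, it can help to confirm that the directory really looks like a Seafile storage folder. The following is a minimal sketch, not part of seafile_recovery: check_storage_dir is a hypothetical helper name, and it assumes the usual layout in which the storage folder contains blocks, commits and fs subdirectories (only fs is used explicitly in this tutorial).

```shell
# Hypothetical helper: verify that a directory looks like a Seafile storage folder.
# Assumption: the storage folder holds "blocks", "commits" and "fs" subdirectories.
check_storage_dir() {
    storage_dir="${1:-storage}"
    status=0
    for sub in blocks commits fs; do
        if [ ! -d "$storage_dir/$sub" ]; then
            # Report every missing subdirectory instead of stopping at the first one.
            echo "missing: $storage_dir/$sub" >&2
            status=1
        fi
    done
    return "$status"
}
```

Called without arguments from /srv/seafile-data, it checks ./storage; it returns a non-zero status if any of the three subdirectories is missing.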

Now we can list the non-virtual repositories simply by running the ls command on the fs folder:

$ ls -l storage/fs/
total 8
drwxrwxr-x. 13 quentin quentin 4096 26 avril 12:22 0011d396-4890-463a-8266-bcbd978d8d1c
drwxrwxr-x.  4 quentin quentin 4096 21 avril 10:06 80701bb3-997c-4a48-b771-a39830dcaf71

In our example, we have 2 repositories:

  • 0011d396-4890-463a-8266-bcbd978d8d1c
  • 80701bb3-997c-4a48-b771-a39830dcaf71
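
With more than a handful of repositories, the same listing can be scripted. This is a minimal sketch under the assumption that every directory directly under storage/fs is named after a repository ID, as in the listing above; list_repos is a hypothetical helper name, not a seafile_recovery command.

```shell
# Hypothetical helper: print one repository ID per line.
# Assumption: each directory directly under <storage>/fs is named after a repository ID.
list_repos() {
    storage_dir="${1:-storage}"
    for path in "$storage_dir"/fs/*/; do
        # Skip the unexpanded pattern when fs/ contains no directory.
        [ -d "$path" ] || continue
        basename "$path"
    done
}
```

For example, `for repo in $(list_repos); do seafile_recovery head "$repo"; done` would run the head command described below on every repository in turn.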

Let’s pick the first one and try to get some information about it:

$ seafile_recovery head 0011d396-4890-463a-8266-bcbd978d8d1c
2021/04/28 15:10:34 Repo contains 6 commits
2021/04/28 15:10:34 Repo has 1 sources
2021/04/28 15:10:34 Repo has 1 sinks
2021/04/28 15:10:34 Proposing following HEAD:
RootId: 5911dd2d363f591e43df4e80591d0a54975f2aaf
CreatorName: quentin@example.com
Creator: 0000000000000000000000000000000000000000
Description: Added "telecom-reclaimed-web-single-page.pdf".
Ctime: 2021-04-26 12:22:59 +0200 CEST
RepoName: Ma bibliothèque
RepoDesc: Ma bibliothèque

We have learned a few things about the repository, in particular its name (“Ma bibliothèque”), who made the last change (“quentin@example.com”), and the RootId (“5911dd2d363f591e43df4e80591d0a54975f2aaf”).
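
If you want to reuse the RootId in a script rather than copy it by hand, it can be pulled out of the head output. This is only a sketch: extract_root_id is a hypothetical helper, and it assumes the output keeps the "RootId: <id>" line format shown above.

```shell
# Hypothetical helper: read the output of "seafile_recovery head" on stdin
# and print the value of the first "RootId:" line.
# Assumption: the line format is exactly "RootId: <id>", as in the example output.
extract_root_id() {
    awk '$1 == "RootId:" && !found { print $2; found = 1 }'
}
```

A possible use is `seafile_recovery head 0011d396-4890-463a-8266-bcbd978d8d1c 2>&1 | extract_root_id`. The `2>&1` is there because I have not checked which lines the tool writes to standard output and which to standard error.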

We can now explore the repository’s latest file hierarchy using the RootId value (a prefix of the ID is enough, which is what I use to keep the command readable):

$ seafile_recovery ls 0011d396-4890-463a-8266-bcbd978d8d1c --dir=5911dd2
2021/04/28 15:15:40 5911dd /
2021/04/28 15:15:40 b88ab9 /seafile-tutorial.doc
2021/04/28 15:15:40 d24616 /Capture d’écran de 2021-04-11 23-07-31.png
2021/04/28 15:15:40 f123de /My Folder/
2021/04/28 15:15:40 15be4d /My Folder/telecom-reclaimed-web-single-page.pdf
2021/04/28 15:15:40 380a0e /My Folder/Capture d’écran vidéo de 19-12-2020 10:30:15.webm
2021/04/28 15:15:40 Total size: 25.6M
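
In a large listing, finding the ID of a given path by eye gets tedious. The sketch below looks it up from the ls output; find_entry_id is a hypothetical helper, and it assumes each line has the form "date time id path" as shown above, with the path possibly containing spaces.

```shell
# Hypothetical helper: read the output of "seafile_recovery ls" on stdin
# and print the ID of the entry whose path equals the first argument.
# Assumption: each line is "<date> <time> <id> <path>"; the path may contain spaces.
# Limitation: awk -v interprets backslash escapes in the given path.
find_entry_id() {
    awk -v target="$1" '{
        id = $3
        path = $0
        # Drop the first three space-separated fields to keep only the path.
        sub(/^[^ ]+ [^ ]+ [^ ]+ /, "", path)
        if (path == target && !found) { print id; found = 1 }
    }'
}
```

For example, `seafile_recovery ls 0011d396-4890-463a-8266-bcbd978d8d1c --dir=5911dd2 2>&1 | find_entry_id "/My Folder/"` should print the ID listed next to /My Folder/, provided the output format matches the one above.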

Now I want to extract the Seafile folder “My Folder” and its content to my disk, into a folder named out. Note that the ID of “My Folder” is f123de, as seen in the output of the previous command. This ID is required for the next command:

$ seafile_recovery cp 0011d396-4890-463a-8266-bcbd978d8d1c --dir=f123de ./out
2021/04/28 15:17:28 f123de /
2021/04/28 15:17:28 15be4d /telecom-reclaimed-web-single-page.pdf
2021/04/28 15:17:28 380a0e /Capture d’écran vidéo de 19-12-2020 10:30:15.webm
$ ls out/
'Capture d’écran vidéo de 19-12-2020 10:30:15.webm'   telecom-reclaimed-web-single-page.pdf

Finally, if you prefer to upload this content directly to an S3 bucket, you can run:

$ seafile_recovery cp 0011d396-4890-463a-8266-bcbd978d8d1c --dir=f123de s3://ACCESS_KEY:SECRET_KEY@ENDPOINT/REGION/BUCKET[/PREFIX]
2021/04/28 15:17:28 f123de /
2021/04/28 15:17:28 15be4d /telecom-reclaimed-web-single-page.pdf
2021/04/28 15:17:28 380a0e /Capture d’écran vidéo de 19-12-2020 10:30:15.webm
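
To avoid typing the credentials directly on the command line, the destination URL can be assembled from environment variables. This is a sketch only: s3_target and the variable names S3_ACCESS_KEY and S3_SECRET_KEY are hypothetical, and it simply reproduces the s3://ACCESS_KEY:SECRET_KEY@ENDPOINT/REGION/BUCKET[/PREFIX] layout shown above. I have not checked how the tool handles keys containing special characters such as "/" or "@".

```shell
# Hypothetical helper: build the s3:// destination from environment variables.
# Usage: s3_target ENDPOINT REGION BUCKET [PREFIX]
# Assumption: S3_ACCESS_KEY and S3_SECRET_KEY are illustrative names, not read by the tool.
s3_target() {
    if [ -z "${S3_ACCESS_KEY:-}" ] || [ -z "${S3_SECRET_KEY:-}" ]; then
        echo "S3_ACCESS_KEY and S3_SECRET_KEY must be set" >&2
        return 1
    fi
    if [ "$#" -lt 3 ]; then
        echo "usage: s3_target ENDPOINT REGION BUCKET [PREFIX]" >&2
        return 1
    fi
    url="s3://${S3_ACCESS_KEY}:${S3_SECRET_KEY}@$1/$2/$3"
    # Append the optional prefix only when one was given.
    if [ -n "${4:-}" ]; then
        url="$url/$4"
    fi
    printf '%s\n' "$url"
}
```

It could then be used as `seafile_recovery cp 0011d396-4890-463a-8266-bcbd978d8d1c --dir=f123de "$(s3_target ENDPOINT REGION BUCKET)"`. Note that the expanded URL is still visible in the process list while the command runs.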

Be careful! This tool is not intended to migrate your Seafile backend from your local filesystem to an S3 backend. Migrating to the S3 backend requires preserving Seafile’s objects, which is a completely different job. Appropriate scripts are available from Seafile’s maintainers.

Thanks for reading this far. I will try to answer any questions or remarks you may have :slight_smile:

This is amazing!
I hope I will never need to use your tool. But let me thank you in advance under the assumption that one day I will have to.

This is a really great tool! I just registered to give you a heart :slight_smile: