Real-time backup


#1

Hello,

I’m experiencing some problems with the real-time backup solution.

seafile.log on the backup server:
[12/27/2016 05:43:31 PM] http-tx-mgr.c(1842): Repo 91d67643 sync status: [Get head commit id] success, transition to [Get diff commit ids].
[12/27/2016 05:43:31 PM] http-tx-mgr.c(1842): Repo 32ff8913 sync status: [Get head commit id] success, transition to [Get diff commit ids].
[12/27/2016 05:43:31 PM] http-tx-mgr.c(1829): Failed to sync repo 91d67643, error in [Get diff commit ids].
[12/27/2016 05:43:31 PM] http-tx-mgr.c(1842): Repo 32ff8913 sync status: [Get diff commit ids] success, transition to [Get diff commits].
[12/27/2016 05:43:31 PM] http-tx-mgr.c(1842): Repo 32ff8913 sync status: [Get diff commits] success, transition to [Get fs].
[12/27/2016 05:43:33 PM] http-tx-mgr.c(1306): Bad response code for POST http://*//seafhttp/server-sync/repo/32ff8913-a362-45ec-9bfa-818d3227eaa7/multi-fs-id-list/?force=0: 500.
[12/27/2016 05:43:33 PM] http-tx-mgr.c(1829): Failed to sync repo 32ff8913, error in [Get fs].
[12/27/2016 05:49:11 PM] http-tx-mgr.c(1608): Failed to get block 6417cdbaa75164a26a17175f2c550f44e6572699 from primary: Internal server error.
[12/27/2016 05:49:12 PM] http-tx-mgr.c(1608): Failed to get block 846999a75e9bda4ac3270947e5e8dbb709e21ebb from primary: Internal server error.
[12/27/2016 05:49:12 PM] http-tx-mgr.c(1608): Failed to get block 0e269ae15c86e37c04d18298389f32cbd0adc7e9 from primary: Internal server error.
[12/27/2016 05:49:12 PM] http-tx-mgr.c(1608): Failed to get block 6d743c0a12b5193ab872fff4fe7514da8811c008 from primary: Internal server error.
[12/27/2016 05:49:13 PM] http-tx-mgr.c(1608): Failed to get block 7a8a4529035213e0b311e9617feca1db839b9848 from primary: Internal server error.
[12/27/2016 05:49:13 PM] http-tx-mgr.c(1829): Failed to sync repo e5bf0d66, error in [Get blocks].

seafile.log on the primary server:
[12/27/2016 05:49:41 PM] filelock-mgr.c(917): Cleaning expired file locks.
[12/27/2016 06:12:41 PM] …/common/fs-mgr.c(1889): [fs mgr] Failed to read dir 5fc08e6fc3c298af7b7b6f90cc8236f1fee7edb4.
[12/27/2016 06:12:41 PM] …/common/diff-simple.c(141): Failed to find dir 32ff8913-a362-45ec-9bfa-818d3227eaa7:5fc08e6fc3c298af7b7b6f90cc8236f1fee7edb4.
[12/27/2016 06:12:41 PM] http-server.c(3527): Failed to diff remote and commit 833720a6c909fa645e3d8dc1435bbac16e321019 for repo 32ff8913.
[12/27/2016 06:43:33 PM] …/common/fs-mgr.c(1889): [fs mgr] Failed to read dir 5fc08e6fc3c298af7b7b6f90cc8236f1fee7edb4.
[12/27/2016 06:43:33 PM] …/common/diff-simple.c(141): Failed to find dir 32ff8913-a362-45ec-9bfa-818d3227eaa7:5fc08e6fc3c298af7b7b6f90cc8236f1fee7edb4.
[12/27/2016 06:43:33 PM] http-server.c(3527): Failed to diff remote and commit 833720a6c909fa645e3d8dc1435bbac16e321019 for repo 32ff8913.
[12/27/2016 06:49:41 PM] filelock-mgr.c(917): Cleaning expired file locks.
[12/27/2016 05:49:10 PM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:6417cdbaa75164a26a17175f2c550f44e6572699 at //seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/64/17cdbaa75164a26a17175f2c550f44e6572699: No such file or directory.
[12/27/2016 05:49:10 PM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:846999a75e9bda4ac3270947e5e8dbb709e21ebb at //seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/84/6999a75e9bda4ac3270947e5e8dbb709e21ebb: No such file or directory.
[12/27/2016 05:49:10 PM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:0e269ae15c86e37c04d18298389f32cbd0adc7e9 at //seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/0e/269ae15c86e37c04d18298389f32cbd0adc7e9: No such file or directory.
[12/27/2016 05:49:10 PM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:7a8a4529035213e0b311e9617feca1db839b9848 at //seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/7a/8a4529035213e0b311e9617feca1db839b9848: No such file or directory.
[12/27/2016 05:49:10 PM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:6d743c0a12b5193ab872fff4fe7514da8811c008 at /*/seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/6d/743c0a12b5193ab872fff4fe7514da8811c008: No such file or directory.

seaf-fsck.sh on the primary server:
[12/27/16 15:47:04] fsck.c(595): Running fsck for repo e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d.
[12/27/16 15:47:04] fsck.c(422): Checking file system integrity of repo *(e5bf0d66)…
[12/27/16 15:57:31] fsck.c(659): Fsck finished for repo e5bf0d66.
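
For context, the check above was presumably started with something like this on the primary server (repo ID taken from the log; the exact script path may differ on your installation):

./seaf-fsck.sh e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d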

It seems that these kinds of errors block the whole backup process: the primary holds about 670 GB of data, but the backup server has only 249 GB. At the beginning the transfer speeds were very high, but at the moment the backup has really slowed down.

Any thoughts on how to resolve this?


#2

Hi,

As mentioned in the manual (https://manual.seafile.com/deploy_pro/real_time_backup.html), you can do the following:

  1. Run “seaf-backup-cmd.sh status” on the backup server to check the current backup status. Libraries with errors will be printed.
  2. Run seaf-fsck on the primary server for the libraries with backup errors. Fsck only checks the latest state of a library, so it won’t detect corruption in history data. Even if fsck reports a library as okay, there is still a possibility that some history data is corrupted, which would stop the backup process since it tries to sync the entire history. If fsck reports corruption in the current state, you have to fix that first; otherwise you can go to step 3.
  3. Use "./seaf-backup-cmd.sh sync --force <repo-id>" on the backup server to ask it to skip the historical corruption and finish the backup (see the example sequence after this list). Note that the backup will contain the corrupted historical data; a backup can only be as good as the original copy.
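
For illustration, a typical sequence might look like the following; the repo ID is taken from the logs above, and the commands are assumed to be run from the seafile-server directory on each server:

# on the backup server: list libraries with backup errors
./seaf-backup-cmd.sh status

# on the primary server: check the latest state of an affected library
./seaf-fsck.sh e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d

# back on the backup server: skip corrupted history and finish the backup
./seaf-backup-cmd.sh sync --force e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d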

Hope this helps.


#3

Thank you.

Indeed, seaf-fsck did not report anything.

I am running --force at the moment but I find it quite strange that it keeps reporting “[12/29/2016 07:27:38 AM] …/common/block-backend-fs.c(486): [block bend] Failed to stat block e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d:bd008ebdcc18fd5193a8b9e82dcba115c7cc3d9e at /*/seafile-data/storage/blocks/e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d/bd/008ebdcc18fd5193a8b9e82dcba115c7cc3d9e: No such file or directory.” for at least 250 files.

Is this all historical data? If it is, it shouldn’t be problematic, I presume, since the “real” data is not corrupt. Correct?

Would this also indicate that I have to run the garbage collection tool to remove these kinds of errors?


#4

What history limit have you set for this library? You can find that by looking at the Seafile tables in MySQL:

select * from RepoHistoryLimit where repo_id='xxxx';
select * from RepoValidSince where repo_id='xxxx';

If nothing is returned by the first query, the library uses the global history limit setting, which is set in seafile.conf. If you have ever run GC on this library before, there should be an entry in the RepoValidSince table. The backup process won’t try to back up history data older than the timestamp stored in the RepoValidSince table.
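
For reference, the global history limit is configured in seafile.conf on the primary server; the 60-day value below is only an example:

[history]
keep_days = 60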

From your logs: have the problems with library ‘32ff8913-a362-45ec-9bfa-818d3227eaa7’ been fixed by using the --force command?


#5

mysql> select * from RepoHistoryLimit;
+--------------------------------------+------+
| repo_id                              | days |
+--------------------------------------+------+
| ad910e35-5f73-4073-9e08-c3b7aec87bde |   30 |
| db0bf7c3-6119-424a-a078-f9a394c8b325 |   30 |
| e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d |   30 |
+--------------------------------------+------+
3 rows in set (0.00 sec)

mysql> select * from RepoValidSince;
+--------------------------------------+------------+
| repo_id                              | timestamp  |
+--------------------------------------+------------+
| 32ff8913-a362-45ec-9bfa-818d3227eaa7 | 1476484774 |
| 91d67643-4b72-45fb-bcdb-80652391763b | 1476484772 |
| ad910e35-5f73-4073-9e08-c3b7aec87bde | 1476473865 |
| b2619685-57fc-4edf-8816-3180e89cf325 | 1479066713 |
| c0d2189d-f258-44d7-825b-537c575657a8 | 1476471311 |
| ce7651f9-504c-4afe-9e1c-d5c20c56ac0f | 1476471249 |
| db0bf7c3-6119-424a-a078-f9a394c8b325 | 1476471106 |
| e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d | 1476466668 |
+--------------------------------------+------------+
8 rows in set (0.00 sec)

Yes, the issues have been fixed.

One last strange thing.

91d67643-4b72-45fb-bcdb-80652391763b = My Library Template

And this is the error:

./seaf-backup-cmd.sh sync --force 91d67643-4b72-45fb-bcdb-80652391763b
Failed to sync repo 91d67643-4b72-45fb-bcdb-80652391763b: Failed to get commit ids from primary.

[12/30/2016 07:43:31 AM] http-tx-mgr.c(1842): Repo 91d67643 sync status: [Get head commit id] success, transition to [Get diff commit ids].
[12/30/2016 07:43:31 AM] http-tx-mgr.c(1829): Failed to sync repo 91d67643, error in [Get diff commit ids].
[12/30/2016 08:16:56 AM] http-tx-mgr.c(1842): Repo 91d67643 sync status: [Sync init] success, transition to [Get head commit id].
[12/30/2016 08:16:57 AM] http-tx-mgr.c(1842): Repo 91d67643 sync status: [Get head commit id] success, transition to [Get diff commit ids].
[12/30/2016 08:16:57 AM] http-tx-mgr.c(1829): Failed to sync repo 91d67643, error in [Get diff commit ids].


#6

In relation to this: if I understand it correctly, we should run GC on the backup server too. Will this lead to a conflict in the database?


#7

It seems the 3 libraries experienced different kinds of corruption or data loss in their history.

  • 91d67643-4b72-45fb-bcdb-80652391763b: Since it’s the template library, perhaps you set up Seafile with one seafile-data folder first and then switched to a new seafile-data folder, and the objects of this library were not transferred to the new folder.
  • 32ff8913-a362-45ec-9bfa-818d3227eaa7: fs objects are lost. I don’t know why; they may have been lost during some server failure or migration.
  • e5bf0d66-7b1e-43f0-a9e9-e6b95fba2a6d: blocks are lost. Same guess as above.

#8

Right now we haven’t made the necessary modifications to allow running GC on the backup server, so you need to keep the history on the backup server for now. We’ll add that ability in the future.


#9

No, the Seafile data has always been in seafile-data, and the primary and the backup server use the same directory layout. Is it possible to delete the template library and create it again, or delete it permanently? The error would then be gone.

Yes, perhaps some data was lost during the migration of the library from the primary to a new user. As long as the library syncs again and seaf-fsck doesn’t show any errors, this shouldn’t bother the user much. I noticed that when antivirus software runs on the client’s computer and deletes a file while the file is already “taken” by Seafile, a conflict is created. Perhaps that is the cause of the fs loss.

GC on the backup: as you know, history takes up quite a bit of space. How do we manage the deletion of history in the meantime?


#10

Right now you have to keep all the history for a while. I don’t want to recommend any “hacky” ways of running GC on the backup server, to avoid conflicts or data loss. We’ll add this ability in an upcoming release.


#11

Thank you very much! Keep up the (very) good work! :)


#12

Is there any update on the GC for the backup?


#13

Not yet. We plan to add it in version 6.1.


#14

Was it added in version 6.1?


#15

Any update? Thanks


#16

It’s a Pro feature. So please wait for the Pro release.


#17

Superb, thx!


#18

Any update? Thanks


#19

Hi,

Is there any update on the GC for the backup server?


#20

We don’t have time to implement this feature yet.