Seaf-gc suddenly fails to collect existing blocks for all repos

Hi there,

Probably since the upgrade to Seafile Server Pro 7.0.8 (from 6.x), seaf-gc has stopped working. I see errors like this:

[10/28/19 13:06:48] gc-core.c(773): GC version 1 repo Datenhalde(fcd15cb1-65ab-4544-a253-4dbc413edd6c)
[10/28/19 13:06:48] …/…/common/block-backend-ceph.c(630): Failed to open objects list ctx: Success.
[10/28/19 13:06:48] gc-core.c(524): Failed to collect existing blocks for repo fcd15cb1, stop GC.

This happens for all libraries of all users, on both my Seafile test cluster and my production cluster.
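For anyone who wants to confirm from a long GC log that every repo fails the same way, here is a minimal sketch in Python. It assumes the log lines look exactly like the three quoted above; the regular expressions and the function name `summarize_gc_log` are illustrative and not part of Seafile, and other seaf-gc versions may word these messages differently.

```python
import re

# Patterns modelled on the log lines quoted in this thread (assumption:
# the message wording is the same across the whole log).
START = re.compile(r"GC version \d+ repo (?P<name>.+)\((?P<id>[0-9a-f-]{36})\)")
FAILED = re.compile(r"Failed to collect existing blocks for repo (?P<short>[0-9a-f]{8})")


def summarize_gc_log(lines):
    """Return (started, failed).

    started maps the 8-character repo ID prefix to the repo name,
    failed lists the ID prefixes for which block collection failed.
    """
    started = {}
    failed = []
    for line in lines:
        match = START.search(line)
        if match:
            # The failure message only prints the first 8 characters of the ID.
            started[match.group("id")[:8]] = match.group("name")
            continue
        match = FAILED.search(line)
        if match:
            failed.append(match.group("short"))
    return started, failed
```

Feeding it the output of a GC run (for example `summarize_gc_log(open("gc.log"))`, where `gc.log` is a hypothetical file holding the captured output) gives the repos that were started and the ones that failed, so the two sets can be compared.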

At the same time, seaf-fsck reports no problems for any of the libraries I checked.

Any hints welcome.

Best,
Hp

Hey Jonathan, Daniel,

I really need some help on this.

I get the same error for newly created libraries, so this is clearly not an issue of libraries having corrupt histories.

Best,
Hp

We will look into the problem. For urgent problems, you can write to support@seafile.com.

OK, thank you.

It is not particularly urgent, as everything else works fine, and we have enough storage to let the garbage accumulate for a while :wink:

Hi @hkunz

We found a possible cause. Have you upgraded your Ceph cluster to Luminous? According to their release notes (https://ceph.io/releases/v12-2-0-luminous-released/), the API we use to list objects was removed.

No, we changed nothing on our Ceph cluster. It is running Jewel.

Could it be that our Ceph version (10.2.11) is too old for Seafile 7?

What happens if you execute this command: seaf-gc.sh -r? Do you get the same error?

I get no error running seaf-gc.sh -r, but there is no deleted library, so nothing really happens. And it does not solve our problem, of course :slight_smile:

Of course. I just wanted to see your result, as we use Ceph as well, and maybe I can help. Since this option calls a different part of the script, the result may help the devs. Can you delete a library (and clear the library trash) and try it again?

I see. Just tried what you suggested (deleted a library, cleared the trash). When I run seaf-gc the first time I get:

[11/11/19 23:46:49] gc-core.c(673): === Repos deleted by users ===
[11/11/19 23:46:49] gc-core.c(697): Start to GC deleted repo 50aac6a2-c77e-48bd-9f33-e6ab11355e02.
[11/11/19 23:46:49] gc-core.c(643): Deleting commits for repo 50aac6a2-c77e-48bd-9f33-e6ab11355e02.
[11/11/19 23:46:49] …/…/common/obj-backend-ceph.c(467): Failed to open objects list ctx: Success.

Nevertheless, the library (or at least the “reference” to it) was removed. If I run seaf-gc again, it no longer tries to remove the library.

Thanks,
Hp

What OS do you use? We can provide you with a debug package.

Debian Buster

Hi @hkunz

Please try this package and send the log message: https://download.seafile.com/f/78d0471ffd774a62bca7/?dl=1

Hi @Jonathan,

I untarred the package and just ran seaf-gc.sh from it (without upgrading the server itself, i.e. the running seafile and seahub services were still 7.0.8).

But seaf-gc.sh worked without errors! Many thanks!

Did you fix something in the package you provided? Or can I just upgrade to the “normal” seafile 7.0.10?

Best,
Hp