Ceph: failed to stat block

Hi all,

I suddenly get errors like:

[01/03/2022 11:00:37 AM] …/common/block-backend-ceph.c(559): [Block bend] Failed to stat block 501079e4: No such file or directory.

both in seafile.log and also when I run seaf-fsck.

I see this for quite a few blocks/libraries as far as I can tell, but most libraries seem to be ok.

I am using seafile-pro 7.1.7, ceph 14.2

any ideas what could be the cause of this?
and how I could solve this?

@Jonathan unfortunately this is a bit urgent.

Best and a Happy New Year,
Hp

it seems that during a period of time (several days) many uploads and files synchronized by the client have been stored incorrectly. I.e. those files appear in the folder/library view in the web frontend, but they cannot be downloaded. similarly, for libraries containing such files, the sync client throws a “server error”.

reverting the library to an earlier (consistent) state and re-uploading the files solves the problem.

I have a few questions, however:

  1. what could be the cause of such incorrectly stored files?

  2. is there an easier way to remedy the situation?

  3. it is still an open question if seaf-gc and seaf-fsck work on such “restored” libraries. for the libraries with incorrectly stored files, these tools abort.

any help appreciated,
Hp

Hi @hkunz

What’s the version of the client causing this issue? Are there any error messages in the server’s seafile.log when the client uploads the corrupted files?

If the files are corrupted, fsck should be able to reset them to empty files without affecting other files. I remember there are a few bug fixes for fsck related to missing blocks. Since you’re on version 7.1.7, they may not be included in that version. You can upgrade to the latest 7.1 version or to 8.0 (which is stable too).

hm, I guess it did not matter which client it was. It happened both with the sync client and with uploads done over the web interface.

I will check the logs.

yes, I should definitely update to 8.0. If I understand correctly, even a newer fsck could not really repair the files (because the data is not available), but it will repair the library so that the clients are able to sync again.

one last urgent question @Jonathan: is there a fast way to find all libraries with missing blocks? I mean faster than using fsck.

Many thanks,
Hp

I see the following things in the logs that I do not understand:

  1. various messages of the form:
    Dec 28 09:18:05 filipe 2021-12-28 09:18:05,448 [WARNING] django.request:152 get_response Not Found: /api2/repos/0be1dd52-530f-48d9-b259-2151d3efbbed/

  2. or similar:
    Dec 28 09:50:41 filipe 2021-12-28 09:50:41,053 [WARNING] django.request:152 get_response Not Found: /f/ca2e045464be4f7e9091/

  3. several messages like:
    Dec 28 15:23:03 filipe seaf-server[9522]: zip-download-mgr.c(808): Zip progress info not found for token 8aa75b17-76ae-450b-963c-bd33918c53f1: invalid token or related zip task failed.
    Dec 28 15:23:03 filipe 2021-12-28 15:23:03,446 [ERROR] seahub.api2.endpoints.query_zip_progress:34 get Zip progress info not found.

  4. this is the first missing block error I see:
    Dec 30 07:46:31 filipe seaf-server[9522]: …/common/block-backend-ceph.c(559): [Block bend] Failed to stat block e430a8b6: No such file or directory.

without any other error message from seafile preceding it.

  5. sometimes I see:
    Dec 30 10:59:23 filipe seaf-server[9522]: pack-dir.c(165): Failed to stat block c88b2351-7dae-463f-9eea-2244fe833083:63799a37ed73b99393237fdbea5ecd467d21d7d3
    Dec 30 10:59:23 filipe seaf-server[9522]: pack-dir.c(478): Failed to archive dir AFD69E97-0D25-4A8F-A8D8-9E68C3E8D688.snapshots in repo c88b2351.
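
To get an overview of which blocks are reported missing, the two message shapes quoted above can be collected from the log with a small script. This is only a sketch: the regular expressions are modelled on the lines in this post, the function name `missing_blocks` is made up, and reading the pack-dir.c message as `<library uuid>:<block id>` is an assumption based on the "in repo c88b2351" line that follows it.

```python
import re
from collections import defaultdict

# Patterns modelled on the two log lines quoted in this thread; other
# Seafile versions may format these messages differently.
CEPH_RE = re.compile(r"Failed to stat block ([0-9a-f]{8}): No such file or directory")
# Assumption: the pack-dir.c message has the form <library uuid>:<block id>.
PACK_RE = re.compile(r"Failed to stat block ([0-9a-f-]{36}):([0-9a-f]{40})")


def missing_blocks(lines):
    """Collect missing-block reports from an iterable of log lines.

    Returns (short_ids, by_repo): the 8-character IDs exactly as printed by
    block-backend-ceph.c, and a dict mapping library uuid to the set of full
    block IDs printed by pack-dir.c.
    """
    short_ids = set()
    by_repo = defaultdict(set)
    for line in lines:
        m = PACK_RE.search(line)
        if m:
            by_repo[m.group(1)].add(m.group(2))
            continue
        m = CEPH_RE.search(line)
        if m:
            short_ids.add(m.group(1))
    return short_ids, dict(by_repo)
```

Since a file object is an iterable of lines, it can be passed in directly, e.g. `missing_blocks(open("seafile.log"))`. The result only lists what the server happened to log, so it is not a replacement for fsck.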

Is ceph also logging something? Do the missing blocks exist in the cluster? Is the cluster scrubbed?

in ceph I see no error messages, everything seems to be fine.

how could I check if the blocks exist in the (ceph) cluster?

we are using nautilus with pretty much the default settings. as far as I know this means that the pgs are scrubbed. these are the settings related to scrub:

mds_max_scrub_ops_in_progress = 5
mon_scrub_inject_crc_mismatch = 0.000000
mon_scrub_inject_missing_keys = 0.000000
mon_scrub_interval = 86400
mon_scrub_max_keys = 100
mon_scrub_timeout = 300
mon_warn_pg_not_deep_scrubbed_ratio = 0.750000
mon_warn_pg_not_scrubbed_ratio = 0.500000
osd_debug_deep_scrub_sleep = 0.000000
osd_deep_scrub_interval = 604800.000000
osd_deep_scrub_keys = 1024
osd_deep_scrub_large_omap_object_key_threshold = 200000
osd_deep_scrub_large_omap_object_value_sum_threshold = 1073741824
osd_deep_scrub_randomize_ratio = 0.150000
osd_deep_scrub_stride = 524288
osd_deep_scrub_update_digest_min_age = 7200
osd_max_scrubs = 1
osd_op_queue_mclock_scrub_lim = 0.001000
osd_op_queue_mclock_scrub_res = 0.000000
osd_op_queue_mclock_scrub_wgt = 1.000000
osd_requested_scrub_priority = 120
osd_scrub_auto_repair = false
osd_scrub_auto_repair_num_errors = 5
osd_scrub_backoff_ratio = 0.660000
osd_scrub_begin_hour = 0
osd_scrub_begin_week_day = 0
osd_scrub_chunk_max = 25
osd_scrub_chunk_min = 5
osd_scrub_cost = 52428800
osd_scrub_during_recovery = false
osd_scrub_end_hour = 24
osd_scrub_end_week_day = 7
osd_scrub_interval_randomize_ratio = 0.500000
osd_scrub_invalid_stats = true
osd_scrub_load_threshold = 0.500000
osd_scrub_max_interval = 604800.000000
osd_scrub_max_preemptions = 5
osd_scrub_min_interval = 86400.000000
osd_scrub_priority = 5
osd_scrub_sleep = 0.000000

I don’t know the exact structure anymore, but there should be an object named by its checksum. Maybe the library is used as a namespace, or there is one namespace each for blocks, commits and fs, with the objects stored under their checksum, or all of that sits below the library (uuid). It could also be that the first two characters of the checksum form a separate level above the rest (ab/cdef… instead of abcdef…).

Can a request timeout with ceph? Is there a log for timeouts?

Is the garbage collector working on the production database?

hi @shoeper , I am quite confident that the blocks are actually missing in ceph. why?

  • most of the blocks are found (so seafile can access ceph without problems)
  • it worked a long time in this configuration
  • there are no timeouts and no instability in the network
  • load on ceph is low
  • it is consistent: a block that is not found is never found (it is not the case that it works sometimes and sometimes not)

I believe that there is an inconsistency between what seafile expects should be on the (block) storage and what actually is on the storage.

furthermore it seems to be the case that for some period of time (roughly 29.12. up to 2.1.) all additions (and probably also modifications) of files led to the missing blocks error. in other words, during that time it seems that seafile was adding files to its metadata (mysql?) but did not actually write the data to the storage.

I cannot be sure of that, it is very difficult/time consuming to confirm that from the logs.

anyway, after restarting all seafile nodes (I am using a 3-node seafile cluster) all the new additions/modifications work fine. I have no explanation for this. and from the seafile logs, there are no indications of such a problem.

It looks like seafile was “thinking” that everything was fine.

We had one drive in the ceph cluster with smartd warnings. just to be on the safe side, I took the whole node out of the ceph cluster, but this had no effect. I am not sure how ceph would deal with faulty data from a problematic drive, but since we have a redundancy of 3, I would expect that we would have at least two valid copies of each seafile block.

Unfortunately not. However, you can run fsck with the --shallow option, which skips checking block contents and speeds up the check. See the latest documentation.

If it happened during a certain period and for all file uploads, it’s likely due to some configuration or environmental change. It’s unlikely that Seafile reports upload success without actually saving data for all files.

what happens exactly when fsck resets the files with missing blocks? will then the sync client attempt to sync the local (non-empty) file to the servers, or will it sync the empty file from the server and overwrite the local (non-empty) version of the file?

you may be right, could you indicate what changes in the environment/configuration could have such an effect? I am not aware of any such changes, but you as developer of seafile have a better understanding than me, under which conditions this could happen.

many thanks,
hp

hm, it seems I found the problem. as mentioned, we are using a seafile cluster as described in the docs. this worked very well for many years. I just realized that the mysql-galera-cluster running on the seafile servers was broken. two nodes (the background node and one of the frontend nodes) thought they were not in a galera cluster, and the second frontend node was in a galera cluster of size one. in other words, there was no mysql replication working between the seafile servers.

this is a very bad situation and so far I have no clue how this could have happened.

people/clients access seafile over only one of the frontend nodes, except when this primary node is rebooted, but of course we have background jobs (gc, spam, elasticsearch, …) running on the background node.

@Jonathan do you think that could explain the missing blocks we see?

I think it explains the issue. The garbage collector is then deleting new data because it is not referenced from the database.
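
As a toy illustration of that reasoning (not Seafile's actual GC code; the block names, the two reference sets and the `sweep` helper are all hypothetical): a collector that computes the set of live blocks from a database missing the latest writes will treat the newest blocks as garbage.

```python
def sweep(stored_blocks, referenced_blocks):
    """Return the blocks a mark-and-sweep style collector would delete:
    everything that is in storage but not referenced by the database it reads."""
    return set(stored_blocks) - set(referenced_blocks)


# Hypothetical state: the frontend node recorded a new upload (b3, b4) in its
# own database, but without replication the background node never saw it.
frontend_db_refs = {"b1", "b2", "b3", "b4"}
background_db_refs = {"b1", "b2"}
storage = {"b1", "b2", "b3", "b4"}  # shared block storage

# GC runs on the background node and therefore uses the stale reference set.
deleted = sweep(storage, background_db_refs)
storage -= deleted

# From the frontend's point of view these blocks should exist but are gone.
missing_for_frontend = frontend_db_refs - storage
print(sorted(missing_for_frontend))
```

In this model the frontend keeps reporting uploads as successful, and the data only disappears later when GC runs, which would match the absence of errors at upload time.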

yes that would indeed explain the situation. many thanks

@shoeper I switched off all nodes except the primary frontend node. restarted mysql and seafile on this node. this is the wsrep status I have:

MariaDB [(none)]> SHOW GLOBAL STATUS LIKE 'wsrep_%';
+-------------------------------+----------------------+
| Variable_name                 | Value                |
+-------------------------------+----------------------+
| wsrep_applier_thread_count    | 0                    |
| wsrep_cluster_conf_id         | 18446744073709551615 |
| wsrep_cluster_size            | 0                    |
| wsrep_cluster_state_uuid      |                      |
| wsrep_cluster_status          | Disconnected         |
| wsrep_connected               | OFF                  |
| wsrep_local_bf_aborts         | 0                    |
| wsrep_local_index             | 18446744073709551615 |
| wsrep_provider_name           |                      |
| wsrep_provider_vendor         |                      |
| wsrep_provider_version        |                      |
| wsrep_ready                   | OFF                  |
| wsrep_rollbacker_thread_count | 0                    |
| wsrep_thread_count            | 0                    |
+-------------------------------+----------------------+
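
On a cluster that is supposed to be replicating, output like the table above could be checked automatically so that a broken state is noticed early. A minimal sketch, assuming the table text has been captured from the mysql client; the function names and the `expected_size` parameter are made up for illustration, and only the three variables `wsrep_ready`, `wsrep_cluster_status` and `wsrep_cluster_size` are inspected.

```python
def parse_status_table(text):
    """Parse the ASCII table printed by the mysql client into a dict."""
    status = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines and anything else
        cells = [c.strip() for c in line.strip("|").split("|")]
        if len(cells) == 2 and cells[0] != "Variable_name":
            status[cells[0]] = cells[1]
    return status


def galera_problems(status, expected_size):
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    if status.get("wsrep_ready") != "ON":
        problems.append("wsrep_ready is not ON")
    if status.get("wsrep_cluster_status") != "Primary":
        problems.append("wsrep_cluster_status is not Primary")
    size = int(status.get("wsrep_cluster_size") or 0)
    if size != expected_size:
        problems.append(f"wsrep_cluster_size is {size}, expected {expected_size}")
    return problems
```

For a three-node setup, `galera_problems(parse_status_table(output), expected_size=3)` would report all three problems for the status shown above. On a node that is intentionally run without Galera, such a result is expected and not an error.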

my goal at the moment is to get a clean single (non-clustered) seafile server.

seafile seems to be running fine, but in the mysql error.log I see:

2022-01-05 15:13:37 44 [Warning] Aborted connection 44 to db: 'ccnet-db' user: 'seafile' host: 'localhost' (Got an error reading communication packets)
2022-01-05 15:13:37 42 [Warning] Aborted connection 42 to db: 'seahub-db' user: 'seafile' host: 'localhost' (Got an error reading communication packets)
[…]

do you have any idea what could cause this? seafile seems to be running fine, but I would like to understand these errors.

many thanks for your support!

These errors could be caused by restarting Seafile. As the error message indicates, the client (Seafile) closed the connections.