Corruption: Recovering 0-byte / empty files (silent data loss)

I have discovered several files that the Seafile UI lists as XXX.X MB, but when I expand the details, the size shows as 0 B. When I download one of these files, I get an empty 0 B file.

When I try using seaf-fsck --export, I also just get an empty 0B file.

Some screenshots attached of a single example.

How do I recover these files?

Hello @j1a2o ,

The file size is not zero, but its content is actually empty. This is usually caused by uploads from older versions of the mobile app. You can check the file’s modification history through the library’s history and try restoring it from a historical snapshot.

If it cannot be restored from the snapshot, the file cannot be recovered because it is missing blocks.

This is very bad. As shown in the screenshot, there are no prior snapshots on these files. They were valuable pictures that I expected Seafile to save. It silently failed on me and went undetected for 2 years.

Why doesn’t seaf-fsck detect this condition and warn about it? Or why doesn’t the UI warn about it? It would have been caught much sooner, perhaps when I still had the originals available.

Hello @j1a2o ,

The old version of the mobile app first created the file and then uploaded the blocks, which might have caused this issue. The current version of the mobile app no longer uses the previous API for file uploads.

Also, when I mentioned historical snapshots, I was referring to the entire library snapshot accessed through the library’s history, not the history of a single file. You can check the library history to see if there is any record of the file being uploaded or modified, and then examine the snapshot corresponding to that record to see if the file is intact.

That doesn’t explain why seaf-fsck and/or the UI doesn’t detect and warn about this issue. I had discovered it myself by chance after a couple years, using rclone. There might be a LOT of people out there with silent data loss from Seafile without even knowing it.
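
For anyone who wants to run an independent check in the meantime, here is a minimal sketch. It assumes you already have a local copy of the library on disk (for example a download, or the output of seaf-fsck --export) and simply lists every file that is 0 bytes there. The function name find_empty_files is made up for illustration, and this is not how rclone or seaf-fsck work internally. Files that are legitimately empty will be listed too, so the output needs a manual review.

```python
import os
from typing import Iterator


def find_empty_files(root: str) -> Iterator[str]:
    """Yield paths of regular files under `root` that are 0 bytes on disk.

    Illustrative helper (hypothetical name), not part of Seafile or rclone.
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) == 0:
                yield path


# Example use (the directory is whatever local copy you want to scan):
#   for path in find_empty_files("."):
#       print(path)
```

Comparing this list against the sizes shown in the Seafile UI should show which entries are affected by the problem and which are genuinely empty files.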

As of right now, I have zero confidence that this issue won’t silently happen again.

This is a really serious problem. We want to investigate it further.

Can you check the library history to see which client uploaded this corrupted file?

We will look into a way to check how many files are affected by this problem.

This was with seadrive_2.0.24 on Mac.

One of my users has lost 553 photos and videos to this issue, out of 17,424 files total. That’s roughly a 3% failure rate!

Thanks for taking this issue seriously.

Hello @j1a2o ,

Could you please send a screenshot of the record of a corrupted file in the library history? This will help me understand what actions the client performed based on the history. Thanks.

Many of the files in the “2023 Video” directory are lost.

After analyzing the code of SeaDrive 2.0, this issue is possibly caused by special timing of the file system events sent from the macFuse extension. SeaDrive 2.0 uses a third-party kernel extension called macFuse to implement the virtual file system. When copying a lot of files into the virtual drive in a short time, such event timing can happen more easily. The code in 2.0 doesn’t handle such event timing correctly and may cause the issue you report.

We’re currently not able to reproduce the issue, though. We think it shouldn’t be a very common issue. In SeaDrive 3.0, we don’t use macFuse, so this issue should not happen.

We’ll also add a command to the fsck tool to check such corruption.

Thanks for looking into this. Please let me know when the fsck tool checks for this condition. Until then, it’s hard to trust that this won’t happen again undetected.

Hi,

Starting from version 13.0.9, we have added an option to check whether the recorded file size matches the actual file content. With this option, you can detect files with size mismatches, along with the method and time of their upload.

To check whether the file size matches, add the --check-file-size or -S option to seaf-fsck.
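
To illustrate the idea behind such a check, here is a rough sketch: compare the size recorded in a file's metadata with the total size of the content that is actually stored. This is only a conceptual model under stated assumptions. FileRecord, its fields, and find_size_mismatches are hypothetical names, and this is not the code or data format that seaf-fsck uses.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileRecord:
    """Hypothetical stand-in for a file's metadata; not Seafile's real schema."""
    path: str
    declared_size: int       # size recorded in metadata (what a listing would show)
    block_sizes: List[int]   # sizes of the content blocks actually present


def find_size_mismatches(records: List[FileRecord]) -> List[FileRecord]:
    """Return records whose declared size differs from the stored content size."""
    return [r for r in records if r.declared_size != sum(r.block_sizes)]


# Example with made-up data: "b.mov" claims 2048 bytes but has no content.
records = [
    FileRecord("a.jpg", 1024, [1024]),
    FileRecord("b.mov", 2048, []),
    FileRecord("c.txt", 0, []),
]
for record in find_size_mismatches(records):
    print(record.path, record.declared_size, sum(record.block_sizes))
```

A file that is legitimately empty (declared size 0, no content) passes this comparison, which is why a size comparison can tell the corruption apart from ordinary empty files.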

Confirming that it works for detection. I have 3,259 files lost forever in my Seafile repository, which really sucks.

Also, how come fsck --help doesn’t document the --check-file-size option?

Hi,

Thank you for the reminder. We will add this option to the --help output in the upcoming version.