Expired history entries will not be deleted with Seaf-gc

Hey all,

I tested the history settings on 7.0.10 PE with one library by setting it to 1 day and changing the data each day.

I noticed that expired file versions are deleted when seaf-gc runs, but the expired history entries for the file are not.
So when I click on an expired history entry (e.g. 3 days old), it opens an empty file, and when I try to download this file I get an error (the website could not be displayed).

Can someone replicate this? Many thanks in advance!

Regards,
Wonderwhy

Ok, it got weirder:

Now Seafile does not create new history entries for my test file. It says the current version is one day old, but I changed it 2 minutes ago. The data is saved and the timestamp is correct, but no history entry is created.
Running seaf-gc shows that no new block is created when I change the data, which would mean that Seafile really does not create any history.
But when I create a new file and change its data, new history entries are created for the new file.

I will keep you updated.

Regards,
Wonderwhy

Hey all,

I installed Pro version 7.0.11 on a different, freshly installed Ubuntu machine, and I still have the same problem: expired history entries are not deleted.

Can someone replicate this? Many thanks in advance!

Regards,
Wonderwhy

Hey all,

With Pro version 7.0.18, nothing changed.

Is it intended that the expired history entries for a file are not deleted when the file version is long gone?

What is also weird is that, in order to delete old versions of the files via garbage collection, I first have to go to the trash and clean everything (not e.g. 3 days, but all, although the versions are one month old), and only then can the old versions be deleted.
Shouldn't garbage collection delete the old versions without my manually deleting them first via the trash?

Many thanks in advance!

Regards,
Wonderwhy

I never emptied the trash but my garbage collector usually deletes some blocks. I use very long history retention, though. My garbage collector runs once a month.

Looking at the manual the trash retention is configurable. Default is 60 days. See https://download.seafile.com/published/seafile-manual/config/seafile-conf.md (Default trash expiration time)

For technical reasons, it can also happen that some blocks are only removed in a later run.

Hmm, that makes sense, so the trash has its own “timer” for when it should delete something.
Thanks for making it clear!
Yes, as far as I know, Seafile never deletes the newest and second newest version, even if the second newest version is older than the history length.

But why are the history entries not deleted, even if the versions behind the history entries are long gone? Do you see the same behaviour?

For me it seems to work. I have a temporary library where I have added and deleted quite a lot of data over the years, and the size on disk is around the real size. It keeps 10 days of history. I don’t know (and don’t care) when exactly the data is deleted. The server has been running for years now. There are 100 libraries and almost a million files + history. No storage issues so far.

To benchmark it you have to find out the details yourself.

I meant that, e.g., in my library which only has a history length of 1 day, the history entries are all still there, even from 6 months ago. If you want to view them, they cannot be found, as that version of the file was deleted long ago.
Shouldn’t the entries be deleted together with the old version of the file?

Ok then this is due to the “new” database storage of versions.

I cannot find documentation on how to clean it up. Would have expected it at https://download.seafile.com/published/seafile-manual/maintain/clean_database.md

@daniel.pan how can invalid history references from the database to commits be cleaned up?

Ok, when I disable the File History in seafevents.conf, all the invalid history references disappear:

File History Enabled:

File History Disabled:

I always tested with txt files, so I never saw it working properly.
I tried it with log files (which are not covered by the File History) and there it works properly even when File History is enabled.

So when File History is enabled, invalid history references are not deleted for file types which are covered by the File History.
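
To make the distinction between txt files and log files concrete, here is a minimal Python sketch that checks whether a given file name would fall under the File History feature. It assumes that seafevents.conf has a `[FILE HISTORY]` section with an `enabled` flag and a comma-separated `suffix` list; the section name, the key names, and the sample values are assumptions for illustration, so check them against your own config file. The helper `covered_by_file_history` is hypothetical and not part of Seafile.

```python
import configparser

# Hypothetical seafevents.conf fragment. Section and key names are
# assumptions for illustration; compare with your real config file.
SAMPLE_CONF = """
[FILE HISTORY]
enabled = true
suffix = md,txt,doc,docx
"""


def covered_by_file_history(conf_text, filename):
    """Return True if File History is enabled and lists the file's suffix."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    section = "FILE HISTORY"
    if not parser.has_section(section):
        return False
    if not parser.getboolean(section, "enabled", fallback=False):
        return False
    raw = parser.get(section, "suffix", fallback="")
    suffixes = {s.strip().lower() for s in raw.split(",") if s.strip()}
    # Take the text after the last dot as the file's suffix.
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return ext in suffixes


if __name__ == "__main__":
    for name in ("notes.txt", "server.log"):
        print(name, covered_by_file_history(SAMPLE_CONF, name))
```

With the sample values above, a txt file is covered and a log file is not, which matches the behaviour described in this post.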

The reference in the database is not cleaned up. We will check the problem.

Is the problem solved? We also run Pro 7.0.10 and our FileHistory database table has a size of 800 MB. The oldest entry is 1.5 years old, although we’ve set keep_days=14 and we run seaf-gc every week.
Will the entries be cleared when we upgrade to 7.1 or 8? Or is it safe to run a MySQL command to delete the old entries?

Regards,
Dirk

I think so far it won’t ever be cleaned up. A good solution would be for the garbage collector to remove orphaned entries.

You can clean the records in FileHistory manually. We will add it to the clean database script later.
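
For anyone who wants to script the manual cleanup, here is a rough Python sketch of deleting FileHistory rows older than a retention window. It is not an official script. The table name `FileHistory` comes from this thread, but the `timestamp` column name and its `YYYY-MM-DD HH:MM:SS` format are assumptions, and `purge_old_file_history` is a hypothetical helper. The sketch uses the standard-library `sqlite3` module so it can be tried safely on a throwaway database; against a real MySQL server you would use a MySQL driver (with its own placeholder syntax) and take a backup first.

```python
import sqlite3
from datetime import datetime, timedelta


def purge_old_file_history(conn, keep_days, now=None):
    """Delete FileHistory rows older than keep_days; return the row count.

    Assumes a 'timestamp' column stored as 'YYYY-MM-DD HH:MM:SS' text,
    so that string comparison orders rows chronologically.
    """
    now = now or datetime.now()
    cutoff = (now - timedelta(days=keep_days)).strftime("%Y-%m-%d %H:%M:%S")
    cur = conn.execute(
        "DELETE FROM FileHistory WHERE timestamp < ?", (cutoff,)
    )
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    # Throwaway in-memory database with a simplified, hypothetical schema.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE FileHistory "
        "(id INTEGER PRIMARY KEY, path TEXT, timestamp TEXT)"
    )
    conn.executemany(
        "INSERT INTO FileHistory (path, timestamp) VALUES (?, ?)",
        [
            ("/test.txt", "2020-01-01 00:00:00"),
            ("/test.txt", "2021-06-14 09:00:00"),
        ],
    )
    removed = purge_old_file_history(conn, keep_days=14,
                                     now=datetime(2021, 6, 15))
    print("rows removed:", removed)
```

Note that this only filters by age. It does not check whether the commit behind a history entry still exists, so pick `keep_days` to match the history length of your libraries.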


OK, thank you.
