Significant storage overhead issue

Hello,

I have encountered a problem with increased disk space consumption. I do not use synchronization; I upload files via the REST API. I found that file and directory metadata in Seafile takes up several times more space than the stored files themselves. This is especially noticeable when files are uploaded only to the root directory and the files themselves are not very large. It seems that every time a new file is uploaded, a new commit is created along with a new copy of the description of the directory the file was uploaded to, and the old copy of the directory description is not deleted because it is linked to the previous commit, and so on. As a result, we have thousands of copies of directory description files, each of which can take up hundreds of kilobytes, despite compression. Is there any way to get rid of this obsolete metadata, which is useless to us?
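To illustrate why this adds up so quickly, here is a rough back-of-the-envelope sketch in Python. It is only a simplified model of the behavior described above, not Seafile's actual on-disk format: it assumes every upload writes a full new copy of the directory listing, that no old copy is ever removed, and that each directory entry costs a fixed number of bytes. The function names and the `bytes_per_entry` value of 100 are hypothetical, and compression is ignored.

```python
def dir_metadata_bytes(num_uploads: int, bytes_per_entry: int = 100) -> int:
    """Total size of all retained directory listings after uploading
    files one at a time into a single directory.

    Assumption: the k-th upload writes a new listing with k entries,
    and all earlier listings are kept, so the total number of stored
    entries is 1 + 2 + ... + n = n * (n + 1) / 2.
    """
    total_entries = num_uploads * (num_uploads + 1) // 2
    return total_entries * bytes_per_entry


def overhead_ratio(num_uploads: int, avg_file_bytes: int,
                   bytes_per_entry: int = 100) -> float:
    """Ratio of retained directory metadata to actual file data."""
    data_bytes = num_uploads * avg_file_bytes
    return dir_metadata_bytes(num_uploads, bytes_per_entry) / data_bytes


if __name__ == "__main__":
    # 1000 files of about 10 kB each, all uploaded to the same directory
    print(dir_metadata_bytes(1000))
    print(overhead_ratio(1000, 10_000))
```

Under these assumptions the metadata grows quadratically with the number of uploads while the file data grows only linearly, which would match metadata ending up several times larger than the files when many small files land in one directory.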

This is from the Seafile FAQ:

Seafile keeps file history as well as folder history. The history may take up a lot of space. There are two steps involved in controlling the storage occupied by file/folder history.

Correct history configuration according to your scenario

Change the default history settings for all libraries in seafile.conf

https://manual.seafile.com/config/seafile-conf/#default-history-length-limit

It is recommended to keep only 90 days of history.

Note that the above configuration changes the default history settings for all libraries, but users can still set the history for their own libraries via the Web UI, overriding the default settings. If you don’t want that, you also need to turn off the history setting feature by adding this option to seahub_settings.py:

ENABLE_REPO_HISTORY_SETTING = False

If you want to know the exact history settings for a specific library, you can also check it in “System admin → libraries → Context menu for that library”

Run GC

After setting libraries’ history to a proper value, you can then run seaf-gc to remove storage used by history:

./seaf-gc
./seaf-gc --rm-fs

Note that the first command removes the history of files and the second command removes the history of folders.


Hi Daniel,

Thanks for the instructions. We already had file history disabled and users could not change this setting. That being said, I don’t know why so much garbage accumulated in the fs descriptions. Unfortunately, our version of Seafile is too old and seaf-gc does not support the --rm-fs option. Is an upgrade the only way to get rid of this garbage?

If you upload files via the web interface, each new file creates a new version of the folder, and these versions accumulate and take up a lot of space.