Another seafile-data is too big: 50GiB for 109GiB of real data

Hi!

I use the Seafile command-line client, and when I looked at the sizes I saw this:

  • my data size is: 109GiB
  • size of seafile internal storage: 50GiB

I tried restarting the client, but it changed nothing.
Is that normal? If it is, can I do something to avoid it?

P.S. :

  • seafile client: 5.1.4
  • my repository contains: 183364 files
  • OS: linux

Hi,
it may be necessary to delete unused blocks.
This purges the blocks belonging to files that were deleted.
Remember to shut down your server first if you do not use the Pro version.

https://manual.seafile.com/maintain/seafile_gc.html

Wrong guess :)
I’m on the client side, not the server side.

What do you mean by this?

seafile-data
|------- branch.db
|------- certs.db
|------- clone.db
|------- commits
|------- config.db
|------- deleted_store
|------- filelocks.db
|------- fs
|------- index
|------- repo.db
|------- storage ====> 50GiB
|------- tmpfiles
|------- transfer.db

Has the synchronization already finished? Mine isn’t small either, but for a few hundred gigabytes and about half a million files it is 6.5 GiB.

How large are the individual folders within it (blocks, commits, fs)?
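A per-folder breakdown like the one asked for here can be obtained with a short script. The following is a minimal sketch, not a Seafile tool: `subdir_sizes` is a hypothetical helper name, and it simply sums the apparent size of every file under each immediate child of the directory it is given.

```python
import os

def subdir_sizes(root):
    """Return {entry_name: total_bytes} for each immediate child of root.

    Sizes are apparent sizes (st_size); symlinks are not followed.
    """
    sizes = {}
    for entry in os.scandir(root):
        if entry.is_dir(follow_symlinks=False):
            total = 0
            for dirpath, _dirnames, filenames in os.walk(entry.path):
                for name in filenames:
                    try:
                        total += os.lstat(os.path.join(dirpath, name)).st_size
                    except OSError:
                        # File vanished or is unreadable; skip it.
                        pass
            sizes[entry.name] = total
        else:
            sizes[entry.name] = entry.stat(follow_symlinks=False).st_size
    return sizes
```

Calling it on the client's data directory (for example the path passed to `seaf-daemon -d`) and on its `storage` subdirectory shows which folder accounts for the space.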

Below is more detailed information.

xxxx@yyyy:~# seaf-cli status
# Name  Status  Progress

# Name  Status
lib1  synchronized
lib2  synchronized

A ncdu view:

99,3GiB [##########] /var/backups/seafile-data
52,7GiB [#####     ] /var/backups/seafile-client

My data is in seafile-data, and Seafile’s internal data is in seafile-client; I know I didn’t make good use of my brain when I chose those names.

A view of

 /var/backups/seafile-client
          |------------seafile-data
                                |---------------seafile  (an empty directory)
                                |---------------seafile-data

And inside the last seafile-data (i.e. inside /var/backups/seafile-client/seafile-data/seafile-data):
[image: seafile-data]
and inside the storage directory (i.e. inside /var/backups/seafile-client/seafile-data/seafile-data/storage):
[image: inside_storage]

A ps -ax view:

XXXX ?        Ss     0:01 ccnet --daemon -c /root/.ccnet
YYYY ?        Ssl    6:37 seaf-daemon --daemon -c /root/.ccnet -d /var/backups/seafile-client/seafile-data/seafile-data -w /var/backups/seafile-client/seafile-data/seafile

I just updated the client to 6.2.4:

ii  libseafile0                           6.2.4                           amd64        Shared libraries for Seafile
ii  libsearpc1                            3.1.0                           amd64        SeaRPC library for Seafile client
ii  python-searpc                         3.1.0                           all          simple and easy-to-use C language RPC framework
ii  seafile-cli                           6.2.4                           amd64        Seafile command line interface.
ii  seafile-daemon                        6.2.4                           amd64        Seafile daemon

Same results.

Which file system is it? Could it be that it does not have many inodes and the storage is “wasted” because of that?

How many files are in fs?
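One way to check the "wasted storage" idea is to count the files and compare their apparent size with the space the file system actually allocates for them. This is a rough sketch under assumptions: `apparent_vs_allocated` is a hypothetical helper, and it relies on `st_blocks` being reported in 512-byte units, which is the case on Linux.

```python
import os

def apparent_vs_allocated(root):
    """Walk root and return (file_count, apparent_bytes, allocated_bytes)."""
    count = apparent = allocated = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                st = os.lstat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable
            count += 1
            apparent += st.st_size           # bytes of content
            allocated += st.st_blocks * 512  # bytes reserved on disk (Linux units)
    return count, apparent, allocated
```

If the allocated total is far above the apparent total, many small files are being rounded up to the file system's block size; if the two are close, the space is taken by real content.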

Shouldn’t be a problem:

XXXX@YYYY# df -i
Filesystem         Inodes   IUsed   IFree IUse% Mounted on
/dev/vda1        13107200 3195159 9912041   25% /

I’ve used a quarter of them. But as I’m in a virtual environment (my server is hosted on OVH’s cloud), I can’t be sure the reported information is accurate.

Is a lot of your data duplicated? Seafile does auto de-dup, so it won’t waste space on the server side if blocks of data are identical across files.

Sorry, but I don’t understand the question: I’m on the client side and I’m talking about the internal data space storage used by the seafile client that is too big.

With that clarified: I have no duplicated data.

@lemmel I have exactly the same problem. My library is 2.8 TB in size and I am trying to sync it now with the seafile-cli client. The seafile-data storage is already 300 GB in size and growing. I have no idea what’s wrong or why it’s doing this. I would really appreciate a suggestion from the community.

For me it is either a bug or a feature, so I’m expecting no real answer.
I tried to open a ticket (seafile-client GitHub), but it was closed because it was “unlikely a bug”.
So I’ll wait a while before reopening the ticket.

It must be a bug. I’ve synced 800 GB of 2.8 TB so far and my seafile-data directory is 450 GB in size. That’s ridiculous.

Overall my experience is that Seafile is a real mess when you try to manage more than 1 TB of data. It’s awfully implemented. E.g. the verbosity is so poor that you never know what the application is currently doing.

Before the Seafile client, I was using (and still use) SeaDrive. I got stuck there too; see the thread “Seadrive stops sync with libcurl error”.

I’ve been on Seafile for years, but with smaller amounts of data. Now my Seafile data directory on the server is 10 TB. Seafile can’t handle such amounts efficiently.

My observation is this: Seafile downloads the data to the local storage in seafile-data, which can grow very big (more than half the size of the actual library in my case); then, after a while, Seafile stops the download and starts extracting the data to its actual destination. So make sure the drive that holds seafile-data is big enough to handle huge libraries.
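The advice above can be turned into a quick check before starting a large sync. This is only a sketch: `has_room_for_sync` is a hypothetical helper, and the default `fraction=0.5` merely mirrors the "more than half the library size" figure observed in this one case, not a documented Seafile requirement.

```python
import shutil

def has_room_for_sync(seafile_data_dir, library_bytes, fraction=0.5):
    """Return True if the drive holding seafile_data_dir has at least
    fraction * library_bytes free (assumed temporary-space estimate)."""
    needed = int(library_bytes * fraction)
    free = shutil.disk_usage(seafile_data_dir).free
    return free >= needed
```

For a 2.8 TB library one would call `has_room_for_sync(path, int(2.8e12))`, where `path` is the directory given to `seaf-daemon -d`; raising `fraction` gives a more conservative answer.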


My data is already synchronized, so the need for temporary space should not be the explanation.

I have enough space; I was monitoring my server when I noticed the problem.

Is there a solution to this issue yet? In my case the seafile-data/blocks size was larger than the library’s. I’m using seaf-cli 8 on Ubuntu 20.04. I’m garbage collecting my server every week, so the blocks should not be that large anyway.

Found no solution: I had to synchronize from the start.

The problem did not reappear, but there have been several updates since (Python, libraries, and the client itself).

So I can provide no help; sorry.