Seafile server should use a lower number of files

Seafile server uses an excessive number of files: about 600,000 for 430 GB in my case. I used to back up the whole vhdx image and have now switched to an rsync backup of the actual files, and I must say the process takes far longer. It is not just backups; this many files simply puts too much strain on the filesystem's file table for many operations.

It would really help to lower the file count by at least 10x, if not more. You would not need to touch your architecture for this: you could add a file access abstraction that packs, for example, 50 blocks (or even more) into one "container" file and then reads only the needed 1 MB part via seek. Various SQL servers do this with an internal structure so that only bigger files are exposed to the filesystem, so I do think this is the future-proof way to go.
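Just to make the idea concrete, here is a rough sketch in Python of the kind of container abstraction I mean (purely illustrative, not how Seafile actually stores blocks; the BlockContainer name and its methods are made up):

import os

class BlockContainer:
    """Packs many small blocks into one big file on disk.

    Illustrative sketch only - not Seafile's actual storage layer.
    The filesystem only ever sees one file per container; individual
    blocks are addressed by (offset, length) and read back via seek().
    """

    def __init__(self, path):
        self.path = path
        self.index = {}  # block_id -> (offset, length)

    def append(self, block_id, data):
        # new blocks are simply appended at the current end of the container
        offset = os.path.getsize(self.path) if os.path.exists(self.path) else 0
        with open(self.path, "ab") as f:
            f.write(data)
        self.index[block_id] = (offset, len(data))

    def read(self, block_id):
        # read back only the needed part of the container via seek()
        offset, length = self.index[block_id]
        with open(self.path, "rb") as f:
            f.seek(offset)
            return f.read(length)

In a real implementation the index would have to be persisted as well, but the point stands: 50 blocks become one file and one entry in the filesystem's file table.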

The block size can be configured. I set mine to 8 MB and it works.

Oh, I didn't know that. I found this setting in the manual; is this it?
[fileserver]
#Set block size to 2MB
fixed_block_size=2

This will probably also mean the server deduplicates at the new block size instead of per 1 MB block? Still, I will definitely increase this setting; it should reduce my file count by at least 5x, since most of my files are around 5 MB.
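A quick back-of-the-envelope check (plain arithmetic, assuming my average file size of roughly 5 MB):

import math

def blocks_per_file(file_size_mb, block_size_mb):
    # a file is split into ceil(file size / block size) blocks
    return math.ceil(file_size_mb / block_size_mb)

print(blocks_per_file(5, 1))  # 5 blocks per file with 1 MB blocks
print(blocks_per_file(5, 8))  # 1 block per file with 8 MB blocks, so roughly 5x fewer files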

Correct.

You will have to transfer files to new libraries to have the data migrated to larger blocks. This can be done via the web interface and does not require a resync with local folders if done properly.

I do not see any option in the web interface to transfer data from one library to another; even the regular copy command only works inside one library. The transfer command is only for transferring an existing library to another user.

I guess I will need to create new libraries if I want to do this, though I do not like the "losing all history" part…

Though I am still confused whether a new library is absolutely necessary: the manual states the block size setting is taken into account for new files, so would that also apply to new files in an existing library?

Yes, that should be correct. You can test it with a new library and 1-2 files inside it: change the block size, restart Seafile and upload a new file which is bigger than the block size, and new blocks with a bigger size should appear (see /seafile-data/storage/blocks/<library-id>).
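If it helps, a small Python snippet like this can confirm it (the path follows the default layout mentioned above; <library-id> is a placeholder for the test library's id):

import os

# replace <library-id> with the id of the test library
blocks_dir = "/seafile-data/storage/blocks/<library-id>"

largest = 0
for root, _, files in os.walk(blocks_dir):
    for name in files:
        size = os.path.getsize(os.path.join(root, name))
        largest = max(largest, size)

# after raising fixed_block_size, restarting and uploading a file larger
# than the new block size, the largest block should be close to that size
print("largest block: %.2f MB" % (largest / (1024 * 1024)))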

There is only the copy command, which only works for non-encrypted libraries.

The block size only seems to be honored if files are uploaded via the web interface.
What good is this setting if the clients don't honor it?
This is a problem when you have to scale Seafile, e.g. via S3/Swift or any other network-defined storage: a single file ends up requiring many calls when it could be handled with just a few, or even one.

Any information on this is appreciated. :slight_smile:

3 years later and the setting still has no effect on the clients. Maybe because blocks are made server-side when using the web interface but client-side when using a sync client? I don't see why the clients can't fetch the setting from the server and make blocks of the correct size accordingly, though.

Yes, it is really frustrating making backups (I am using rclone sync from an HDD to an external USB 3 HDD) with so many small files. I am making a backup right now; 21 hours have passed and it is still not finished.
This "[fileserver] fixed_block_size=X" setting should be applied to all clients as well (probably as part of the post-authentication data sent to the client).

start backup on [/opt/seafile]
scan finished in 955.103s: 3804060 files, 2.388 TiB
[16:11:02] 99.85% 3007255 files 2.385 TiB, total 3804060 files 2.388 TiB, 0 errors ETA 1:29

In my case, about 25% of the block files are only 98 bytes.