Find / delete duplicates

Good morning from Norway!
I have a Seafile server on CentOS 8, and it works like a charm!
I just want to figure out how I can run a duplicate finder, to identify and delete duplicate files.

Until now, I have just exported the library and then run fdupes on it in the terminal.
The thing is that I have 1.8 TB of data, so it takes a LOT of time to sync the files back to the Seafile server…

I also want to automate this process.

Any tips?

As long as the files are in one Library they are deduplicated (at least not written twice to the file system).

Looking through your files and deleting duplicates is not something Seafile can do, so you need to use some software on one of the clients for that.

Thank you!

Does this mean that deleting duplicate files on the clients will have no impact on the server side?

If you are using the Seafile client, the files are on the server, so everything you do on the client is also applied on the server.

What I am doing:

  • Exporting the library to a folder on my server with seaf-fsck.sh --export.
  • Running fdupes in the exported folder.
  • Deleting the library from the Seafile server.
  • Using the Seafile client to upload the exported folder back to the server.

This works for duplicate images and other files. Using duplicate-finder software on a client itself will probably work too, but then you have to sync everything to the client first.
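
Since automation was asked about: the duplicate-finding step on the exported folder could also be scripted. Below is a minimal Python sketch, not fdupes itself and not anything Seafile provides. It groups files by size, then by SHA-256 of their content, and keeps the first path of each group. The function names (`file_digest`, `find_duplicates`, `delete_duplicates`) are made up for this example, and it assumes the export is a plain directory tree on local disk.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root):
    """Return sorted lists of paths under root with identical content."""
    # First pass: group by size, which is cheap and avoids most hashing.
    by_size = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                by_size[os.path.getsize(path)].append(path)

    # Second pass: hash only files that share their size with another file.
    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        groups.extend(sorted(g) for g in by_hash.values() if len(g) > 1)
    return groups


def delete_duplicates(groups, dry_run=True):
    """Keep the first path in each group; remove (or only list) the rest."""
    removed = []
    for group in groups:
        for path in group[1:]:
            if not dry_run:
                os.remove(path)
            removed.append(path)
    return removed
```

Calling `delete_duplicates(find_duplicates(export_dir))`, where `export_dir` is wherever you exported the library, only returns the paths that would be removed. Pass `dry_run=False` once you have checked the list. Which copy is kept is simply the first path in sorted order, so adjust that if you care about which folder the surviving file lives in.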