Syncing new drive or empty directory deletes files on server

Hello,
Maybe I don’t understand some principles of how the Seafile sync algorithm works, but here is my issue:

  • I have a library on the server
  • I would like to sync it with my empty portable USB drive
  • During synchronization, the drive disconnected
  • I reconnected the drive, and it turned out that the previous session had only managed to create an empty directory tree
  • The Seafile client recognized that the drive was back and started a new sync session automatically
  • Since all the directories were empty, the second sync session removed all the files on the server and I had to restore them from a snapshot

Why does this happen? Am I missing something? How do I sync new drives correctly? Is it dangerous to keep libraries on portable drives?

Thank you

Seafile is for static sync. Whatever you do with files in synced libraries happens across the whole system. So if you sync files to your flash drive, unplug it and plug it back in, then it’s correct behavior that all of your files are deleted, because Seafile thinks you deleted these files. It just keeps a 1:1 copy of your folder on the computer.

I don’t think that a drive disconnection justifies such behavior.
Even a local HDD can be disconnected accidentally. Or I can shut down Windows when the sync is not complete.
And what if a library is shared among multiple users? How does Seafile decide which copy to sync to the others?

It’s recommended to let Seafile fully sync the library. If you shut down the PC, then you shut down Seafile too, so there’s no problem. Unplugging an internal HDD while the computer is running is not recommended, and you can damage the HDD, the PC and your OS.

I compared Seafile with Google Drive and MS OneDrive. It looks like they do the same: they delete files from the cloud if they do not see them in a directory on startup.
I also compared Seafile with Synology Cloud Drive, and it works differently. It only removes files when it is running and notices that a user is explicitly removing something. If Synology Cloud Drive doesn’t find some files in its folders on startup, it restores them from the cloud.
In my opinion, Seafile’s policy regarding empty directories (as well as that of Google Drive and MS OneDrive) is dangerous and may cause data loss.
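
The two startup behaviors compared above can be sketched as follows. This is only an illustration of the difference described in this post, not code from Seafile, Synology or any other client; the names `StartupPolicy` and `reconcile_on_startup` are made up for the example.

```python
from enum import Enum


class StartupPolicy(Enum):
    # Hypothetical labels for the two behaviors described in the post.
    PROPAGATE_DELETIONS = "propagate"  # missing locally -> delete on server
    RESTORE_MISSING = "restore"        # missing locally -> download again


def reconcile_on_startup(last_synced, local_now, policy):
    """Return (action, path) pairs for files that were synced before
    but are absent from the local folder when the client starts."""
    missing = sorted(set(last_synced) - set(local_now))
    if policy is StartupPolicy.PROPAGATE_DELETIONS:
        return [("delete_on_server", path) for path in missing]
    return [("download_from_server", path) for path in missing]


last_synced = {"docs/a.txt", "docs/b.txt"}
local_now = {"docs/a.txt"}  # b.txt vanished while the client was not running

print(reconcile_on_startup(last_synced, local_now, StartupPolicy.PROPAGATE_DELETIONS))
print(reconcile_on_startup(last_synced, local_now, StartupPolicy.RESTORE_MISSING))
```

In both cases the client sees exactly the same input. The only difference is how it interprets a file that is missing at startup.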

I just made a test with my USB stick. I synced a folder from the stick using the desktop client (Windows 6.1.1) to Pro server 6.2.2. Then I stopped the client, disconnected the stick, and restarted the client. The client dropped the synchronization of the folder. So the data is still on the seafile server in my library. Seafile didn’t delete them. That’s what I expected.

Yes, and when you enable “Do not automatically unsync library” in the client, the sync will continue when you plug your stick back in.

That’s right. If you remove the entire library folder so the library no longer exists on the drive, Seafile drops synchronization or (depending on the settings) puts an exclamation mark on the library and waits for it to come back.
But if you shut down the Seafile client, then remove some files INSIDE the library so the library is still on the drive but in an inconsistent state, and then start the Seafile client again, the client removes those files from the server. It has no idea whether those files were removed by the user or, for example, by chkdsk because the file system had errors and had to be checked.
I believe that such blind removal is wrong and may cause data loss.

Yes, it does what you tell Seafile to do.

You guys just don’t get what I’m talking about.
There is no need to disconnect drives or remove local copies of libraries.
You shut down the Seafile client, remove files from the library (not the library itself), start the Seafile client again, and it removes from the server the files you removed while the client was not running.
I believe that is weird. The client should restore the files from the server to the local copy of the library instead.

Yes, this is exactly the designed behavior. And it would be very bad if it didn’t work this way, because that would completely break synchronization.

Keep in mind that there is a file history. If you delete something by accident, you can still recover it using Seahub in most cases (history would have to be disabled and the garbage collector would have to be run for a file to become unrecoverable).

OK. This is by design and sounds reasonable.
But how do I deal with the situation when a sync session is interrupted, for example by a power outage?
As I noticed, when syncing a library to a new drive, the client creates the folders first. After the empty directory tree is created, the client starts synchronizing files. So it is possible for the client to be interrupted when we only have an empty directory tree. And after the system reboots, the client decides that all the files were removed, and removes them from the server.
That’s exactly the situation I came across. Thousands of files were removed. I had to revert to the snapshot.
And even the snapshot is not a silver bullet. It depends on how much time has passed since the moment something went wrong and how many files have changed since then.
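
To make the scenario above concrete, here is a minimal sketch of how a client could tell "never downloaded" apart from "deleted locally" after an interrupted first sync. It assumes a hypothetical journal of files that have finished downloading; nothing in this thread says whether the Seafile client keeps such a record, and `plan_after_restart` is an invented name.

```python
def plan_after_restart(remote_files, local_files, downloaded_journal):
    """Decide what to do with each remote file after an interrupted sync.

    remote_files:       paths in the library on the server
    local_files:        paths currently present in the local folder
    downloaded_journal: paths this client has finished downloading at least
                        once (a hypothetical record, assumed for this sketch)
    """
    plan = {}
    for path in sorted(remote_files):
        if path in local_files:
            plan[path] = "keep"
        elif path in downloaded_journal:
            # The file was here before and is gone now: looks like a local delete.
            plan[path] = "delete_on_server"
        else:
            # The file never arrived locally: the download is simply unfinished.
            plan[path] = "download"
    return plan


remote = {"photos/1.jpg", "photos/2.jpg"}
local = set()  # only the empty directory tree exists after the interruption

# With an accurate journal, nothing was downloaded yet, so nothing is deleted.
print(plan_after_restart(remote, local, downloaded_journal=set()))

# If the client wrongly believes everything was already synced, every
# missing file is treated as a deletion, which matches the reported data loss.
print(plan_after_restart(remote, local, downloaded_journal=remote))
```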

I think in this case (interruption before the sync is finished), the safest way is to unsync the lib, and then use the option ‘sync with an existing folder’ to resync, which will merge the local files and remote ones instead of deleting the missing ones.
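
As a rough illustration of what "merge instead of delete" means here, the sketch below plans a non-destructive merge of a local folder with a library. It is an assumption-based model, not the client's real logic: `merge_plan` is a made-up name, files are represented as a mapping from path to content hash, and the "conflict" label is a placeholder because this thread does not say how differing copies are resolved.

```python
def merge_plan(local_files, remote_files):
    """Plan a non-destructive merge of a local folder with a library.

    Both arguments map a relative path to a content hash.
    No path is ever scheduled for deletion.
    """
    plan = {}
    for path in sorted(set(local_files) | set(remote_files)):
        if path not in local_files:
            plan[path] = "download"   # missing locally: fetch it, do not delete
        elif path not in remote_files:
            plan[path] = "upload"     # exists only locally: send it to the server
        elif local_files[path] == remote_files[path]:
            plan[path] = "in_sync"
        else:
            plan[path] = "conflict"   # placeholder; resolution is not modeled
    return plan


local = {"notes.txt": "h1"}
remote = {"notes.txt": "h1", "big/video.mp4": "h2"}
print(merge_plan(local, remote))
```

The point of the design is that a file absent on one side is copied to that side rather than being removed from the other.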

I conducted a simple test of syncing a lib. After all the empty directories were created, I used the task manager to kill the seafile process and the seaf-daemon process. Then I started the Seafile program again. I found that Seafile continued to download files from the cloud to my local machine rather than trying to remove the remote files. So I guess your case (e.g., a disconnected drive or a power outage) might be an extreme one. Again, even if it occurs, it can be solved via ‘sync with an existing folder’.

It would be helpful if developers published a document with the sync algorithm explained.
For example, how does synchronization work if a library is shared among multiple users and some of them have been offline for a long time, so they do not have some fresh files in their local copies? In such a case, the Seafile client can either delete the files on the server or download them to the outdated local libraries.

Just read the whole manual. Everything you need is in there.

Seafile sync works like Git, so a user who has been offline for a long time will get the newest version of the files (new, edited or deleted) :slight_smile:
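
The Git analogy can be illustrated with a three-way comparison against the last state both sides agreed on. This is a generic sketch under that assumption, not Seafile's actual algorithm; `three_way` is an invented name, and content changes are ignored so that only the presence or absence of files is modeled.

```python
def three_way(base, local, remote):
    """Classify each path by comparing the last commonly synced state (base)
    with the current local and remote sets of paths."""
    actions = {}
    for path in sorted(base | local | remote):
        in_base, in_local, in_remote = path in base, path in local, path in remote
        if in_local and in_remote:
            actions[path] = "keep"
        elif in_remote:
            # On the server but not here: new on the server unless we had it before.
            actions[path] = "delete_on_server" if in_base else "download"
        elif in_local:
            # Here but not on the server: new locally unless the server had it before.
            actions[path] = "delete_locally" if in_base else "upload"
        else:
            # Only in base: removed on both sides, nothing left to do.
            actions[path] = "already_gone"
    return actions


# A user who was offline for a long time and changed nothing locally.
base = {"old.txt", "report.txt"}
local = {"old.txt", "report.txt"}
remote = {"report.txt", "fresh.txt"}  # others added fresh.txt and removed old.txt

print(three_way(base, local, remote))
```

In this model, a fresh file on the server is not in the offline user's base state, so it is downloaded rather than deleted. A deletion is inferred only for a path that was in the base state and is now missing on one side.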

Found the chapter about the sync algorithm. Unfortunately, it doesn’t disclose all the subtleties.
In Git, deletion is an explicit action. But in Seafile, deletion is assumed just from the fact that a file is absent from a directory.
Thanks for the hint about docs anyways.
This topic can probably be closed.