Mysql replication problems on real-time backup

I am almost sure, that I am missing something.

I see for example:
[ERROR] Slave SQL: Could not execute Write_rows_v1 event on
table seahub-db.profile_profile; Duplicate entry ‘[username omitted]’ for key ‘user’, Error_code: 1062; h
andler error HA_ERR_FOUND_DUPP_KEY; the event’s master log mysql-bin.000002, end_log_pos 42069, Gtid 0-1-
164, Internal MariaDB error code: 1062

and after resolving the above conflict, I am having others.

I suspect, this is happening because the mysql databases are changed on the real-time backup, e.g. when users log in, and do stuff.

So, I have a very general questions:

  1. is it forbidden to log in to the real-time backup servers?
  2. I guess also the background tasks should not run on the real-time backup, right?
  3. is it possible to set the real-time backup to “readonly” somehow? or should I forbid logins entirely?
  4. it is for example possible to add and/or remove libraries on the real-time backup. this is clearly not the behavior I would expect from a backup :slight_smile:

so, am I missing something important here? Or is this the ways it is I have to ensure with other means that no changes are done to the backup?

Best,
Hp

Hi,

Have you done any change on the backup server except for logging in? It should be okay to login to the backup server for checking the libraries. But you should not do any change on the backup server.

As you can see in our doc, many tables are purposely ignored from MySQL replication. These are the tables that would be changed if you login to the backup server. We ignore them so that they won’t conflict with the entries from primary server. Perhaps some recent changes make the profile table changed by login… I’ll check.

This is what probably happened:

  1. I logged in to the backup server
  2. I tested, if I could add a library directly on the backup (and I could)
  3. I suspect this broke the mysql replication (modifications on the slave)
  4. Thus mysql replication stopped wotking
  5. Another user (LDAP) logged in for the first time on the “normal” node
  6. His profile was thus created on the normal node, but that could never propagate to the backup node, because replication was stopped
  7. This new user logged in to the backup server and therefore his profile was also created on the backup node

But what exactly happened is not so important to me. What is important is:

If the backup node has full write access to its databases, then such things will occur.

My question is, how to avoid this. So this is what I see (in short):

  1. Do not run background tasks (virus scann, etc.) on the backup node

  2. Either forbid user logins or make the replicated databases read only.

  3. is easy. For 2. I am wondering if there is some mechanism in seafile which would implement one or the other.

Best,
Hp

I am using a galera cluster (mariadb 10.1.22 on debian) with replication to a real-time backup server, as described in the docs.

galera works fine, I am using the the seafile background tasks node as master for the replication to the backup node.

however, today I realized, that only changes done on the background tasks node were replicated to the backup node.
no changes done on the foreground nodes were ever replicated to the backup node.

the reason is described here:

after using the setting

[mysqld]
log_slave_updates=ON

on the master for the replication to the backup node, everything works as expected.

I think this should be included in the docs
https://manual.seafile.com/deploy_pro/real_time_backup.html (in the section concerning cluster setups)

Best,
Hp

Hi Hanspeter,

Sorry for late reply. We were in vacation the last few days.

Do you think it’s sufficient to only allow sys admin to log in to backup node? I think there should at least be some visual way for the admin the verify the backup works.

Thanks for the information. We’ll add it to the documentation.

I think it would great if logins could be restricted to admins. This way we could avoid users tampering with the backup (by accident) and breaking things. (I am emulating this currently, by disabling LDAP users on the real-time backup)

In the long run, I would prefer if I could set the real-time backup read-only (write access only to the tables that are ignored for the replication), such that nobody (not even admins) could break the replication. Personally I do not care if users can login to the backup-server, as long as they cannot write to it.

I my opinion, a read-only backup (no user can change anything, neither in the web interface nor in the client) would be technically the cleanest solution. (Actually, as I understand, you should never write to a replication slave). Maybe this could be solved by defining the roles on the backup node accordingly. But I think right now I cannot define roles with read-only access, right?

Alternatively, I could also configure mysql access such, that the seafile user cannot change the replicated tables on the slave/backup. But this will lead to ugly error messages when using the web interface I guess.

Ideally, we would have a combination of a seafile instance configured as read only combined with a seafile database user that has no write access to the replicated tables.

BTW: It is great if you guys go to vacations. Really, you do great work so you definitely deserved it
:wink:

It would be a lot of work to change every function in the interface to support read-only mode. A simpler solution (if you don’t care about whether users can login) is to disable seahub access in Nginx.

yes I understand that this would be a lot of work.

could you elaborate on what you mean by disable seahub access in the webserver (nginx/apache)?

I agree with you that it is a nice thing if an administrator could login to the seafile web interface of the backup server. as far as I understand, this would not be possible when disabling seahub access in the webserver, right?

how could I restrict access only to admins? in other words, how can I avoid that normal users (or guests) log in to the web interface or use the client to access the seafile backup server?

would this be possible by redefining the roles?

many thanks for your support,
hp

I mean by commenting out the “seahub” configuration in Nginx conf file. That way there will be no way to access the server via web interface or api.

We could add an option in seahub_settings.py to only allow admin login.

That would definitively help and I would definitively use such a feature.

Best regards and thanks,
Hp