PostgreSQL as standard database for Seafile instead of MySQL?

Hello together

recently I was visiting a workshop about PostgreSQL. As far as I understood the arguments PostgreSQL offers advantages related to database integrity and clear documentation in comparison to MySQL (and maybe MariaDB).

So I just wanted to initiatiate to collect some ideas about this topic (because I recently found a thread to use Postgres for Seafile):

  1. Is MySQL a good choice for Seafile on long term?
  2. What would be a better alternative?

Here are some information for a comparison:

In my opinion it more or less does not matter which database is used. Remember you can run seafile with sqlite!

The reason for my assumption is, that seafile uses only few tables with small and simple datasets. As long as the used database does not have any major bugs I don’t see any scaling or stability problems using MySQL (or any other DB) in the foreseeable future.

My seafile-data directory is around 600GB. This is the size of my seafile databases in the MySQL server:

# du -hs *ddb
124K    ccnet@002ddb
416K    seafile@002ddb
33M     seahub@002ddb

Here are the latest sql dumps from these databases:

->du -hs *2016-07-31-12*
12K     ccnet-db.sql.2016-07-31-12-00-11
64K     seafile-db.sql.2016-07-31-12-00-12
14M     seahub-db.sql.2016-07-31-12-00-13

Actually, that is a good point. Does anybody have any numbers on how well sqlite does compared to mysql/postgres for Seafile?

Using sqlite would cut down quite a bit in regards to install complexity. I used mysql in my install without thinking about it too much. Anybody out there who uses sqlite in medium-sized installs? Any comments on its performance?

SQLite locks the entire database when it carries out updates. So using SQLite in a high concurrency environment would cause some database operations to fail (and retried). For production, we always recommend to use MySQL. Postgres basically works. But we don’t have enough resources to maintain the upgrade scripts for Postgres.

1 Like

Yeah, makes sense. What exactly is he database used for, opposed to the filesystem-store? When do write operations occur?

Thanks for your opinion. Maybe I should have focused the topic more on Pro environments (although I assume that Seafile CE users would benefit from PostgreSQL as well):

Maybe it makes no difference when we speak about seafile-data from 1-3TB but even in this size (I have 1,5TB of data) a seaf-fsck for solving database corruption needs around 48h on my system (Cubietruck) so I really don’t want to know how long it would take to check for database corruption with a really large data directory. But I have to admit that I don’t know if PostgreSQL would avoid database corruption than MySQL/MariaDB. I just know that a database expert from the workshop I’ve visited said that he never had database corruption with PostgreSQL but with MySQL (because MySQL originally was not intended to be a database system as far as I understood).

Beside this I think it matters if you use MySQL/MariaDB/PostgreSQL or SQLite for Seafile:

I have no numbers/statistics but I’ve seen an an Owncloud installation on a real server system running with SQLite and compared to MySQL it was like, well… it feels like surfing with 56K. I don’t if the behaviour would be the same using wiht Seafile but:

Thanks for confirming.
Unfortunately I’m far away to be a database expert and have no experience comparing databases in productive environments.
But beside this I think it would be worth to think about using a database which has a stable open source development since decades and is not developed by a company which has obviously less interest in further developing an open source database because it would be a rival for their proprietary one (so: Oracle).
Considering this and the common experience (?) that PostgreSQL is more stable would be an argument to switch the standard database to PostgreSQL - especially for Pro users because they maybe would feel much more comfortable with a more reliable database.

Yes, that’s a well known issue with Owncloud. I’m just curious what kind of information is actually stored in the db for Seafile, and how many write operations there are. Since Seafile is based on git, I reckon that a good chunk of the work is also done on the filesystem. Anyway, sqlite is most likely not a good option (plus, I’ve got a mysql and postgres server running anyway), I’m just interested for curiosities sake.

I wouldn’t worry about that too much. Just use MariaDB (or maybe even Percona). MariaDB is developed by the community, and it’s a drop in replacement for MySQL.

While I prefer postgreSQL as a database to MySQL personally, I don’t think we have to be worried about stability/corruption/etc on MySQL/MariaDB for our Seafile installs. It’s a proven technology, and lots of companies use it in production without any issues for ages. Having the option of postgreSQL would be nice, but, in my opinion, it should be waaaaaay down the list in terms of priorities.Plus, we have to reckognize the resource constrains on a project like Seafile, so I’d prefer to have one good and stable integration than a few half-assed ones.
Again, I think Mysql is as good a choice as postgres for Seafile. Both have their advantages and disadvantages. Mysql is (arguably) easier to install and administer. Plus, there are more resources about it on the internet, and people are more likely to be familiar with it and/or have an instance running already somewhere. This, and the option of using MariaDB instead makes me think that keeping it as the default Seafile db is the sensible option (for now, anyway).

Just my 2 cents.

There are 3 dbs in Seafile: ccnet db, seafile db and seahub db.
The ccnet db contains user and group information; the seafile db contains the head commit information for each library and also sharing information; the seahub db is relatively larger, it contains session cookies for the web interface, audit log, and other application tables.

Updates to these dbs are infrequent. Seafile only update the head commit information once for large bunch of files (unlike ownCloud updates for each file).

4 Likes

Fsck doesn’t check corruption in the database. We always assume the db itself is not corrupted (it’s the responsibility of the db software). Fsck only checks the consistency between the db and the file store, and also the file store itself for corruption. It runs long since it has to calculate hash for each file.

3 Likes

Thanks Jonathan, interesting!

@Jonathan: Any reason why Seafile needs 3 databases? Couldn’t all this be done in 1 database with as many tables as needed? One of the most cumbersome things when setting up seafile, is setting up these three databases.

Exactly my point!
And MariaDB/MySQL is a stable solution and is in my opinion sufficient even for big solutions, e.g. Wikipedia.

[quote=“makkus, post:7, topic:139, full:true”]

I wouldn’t worry about that too much. Just use MariaDB (or maybe even Percona). MariaDB is developed by the community, and it’s a drop in replacement for MySQL.[/quote]

Well, it’s a “drop in replacement” for now. As far as I understood the people speaking in the database workshop I’ve visited it is not sure that this will last.

[quote=“makkus, post:7, topic:139, full:true”]
While I prefer postgreSQL as a database to MySQL personally, I don’t think we have to be worried about stability/corruption/etc on MySQL/MariaDB for our Seafile installs. It’s a proven technology, and lots of companies use it in production without any issues for ages. Having the option of postgreSQL would be nice, but, in my opinion, it should be waaaaaay down the list in terms of priorities.Plus, we have to reckognize the resource constrains on a project like Seafile, so I’d prefer to have one good and stable integration than a few half-assed ones.[/quote]

I completely understand that there are much more important things to do on Seafile. I just wanted to share my impression of the recently visited workshop which was telling me: there are Pro users who would feel better not to run Seafile with MySQL (but have to use MySQL for support reasons) and the arguments for using PostgreSQL (stability, fantastic documentation, stable development) convinced me that there should be at least a discussion about using PostgreSQ for SeafileL.
For now everything is working fine but as an essential part of Seafile I just thought there should be some ideas how to improve the database part of Seafile especially of the following argument:

So it would be nice to have some statistics about average database corruption using MySQL, MariaDB and PostgreSQL.

Anyway: this information made it already worth to start the discussion. Thank you all!

As a long time advocate and user of Seafile, I am going to pipe in here as I personally have offered to help with PostgreSQL support and even submitted a script for performing database upgrades. The offer of support was met with silence. The upgrade script was rejected IMO without a valid reason.

I now have an issue open about Seafile 6.0.x change in behaviour that makes PostgreSQL consume all available memory when used in a large active environment. I have repeatedly posted updates and asked what is required as part of diagnositic capture. It is not getting much response as well.

What I have been informed is that the Seafile team removed libzdb and wrote their own database layer.

2 Likes

I suggest you be patient. We haven’t given up support for PostgreSQL. But giving so many urgent issues to resolve, like SQLite database lock issue, Firefox 50 issue, we don’t have resource to look into PostgreSQL yet.

1 Like

[quote=“caniwi, post:14, topic:139, full:true”]
As a long time advocate and user of Seafile, I am going to pipe in here as I personally have offered to help with PostgreSQL support and even submitted a script for performing database upgrades. The offer of support was met with silence. The upgrade script was rejected IMO without a valid reason.[/quote]

Thanks for your statement @daniel.pan - I personally completely understand that you don’t have time for another “feature” like PostreSQL support. From my point of view It is incredible that Seafile devs are able to keep such a great community support with personal communication and publishing new Seafile versions somehow every two weeks (looks like you have one eye on the code and one eye on the forum 24/7 :wink: ).

But for example: why you don’t try it with PostgreSQL as you handle it with the ARM builds which are (in my perspective) professionally provided by @jobenvil as a community member? Maybe @caniwi could do the same for PostgreSQL support as @jobenvil does for the RaspberryPi builds?
Maybe you could setup an official haiwen git repository for PostgreSQL support and point out that PostgreSQL support (whatever is needed for this) is maintained only by community. As far as I can see this would satisfy people like @caniwi and can work as great as with @jobenvil with almost no risk for you (if you point out that PostgreSQL is supported only by the community at the moment).

1 Like