Seafile server down...but not sure where to start looking

Hi guys,

Hopefully someone can give me some pointers on this. My server went down this morning, and I seem unable to bring it back up. It’s not the first time this has happened (it has been running on the same database since 2015), but this time I can’t find anything useful in the logs. Any help pointing me in the right direction would be greatly appreciated!

What I’m getting: Nginx returns a 502 Bad Gateway page, and the clients fail to connect to the server. The seafile and seahub processes seem to remain active on the server, but to no obvious effect.

The logs:
Seafile.log

[09/30/18 19:16:28] http-server.c(173): fileserver: worker_threads = 10
[09/30/18 19:16:28] http-server.c(188): fileserver: fixed_block_size = 8388608
[09/30/18 19:16:28] http-server.c(203): fileserver: web_token_expire_time = 3600
[09/30/18 19:16:28] http-server.c(218): fileserver: max_indexing_threads = 1
[09/30/18 19:16:28] http-server.c(233): fileserver: max_index_processing_threads= 3
[09/30/2018 07:16:28 PM] ../common/mq-mgr.c(54): [mq client] mq cilent is started
[09/30/2018 07:16:29 PM] size-sched.c(96): Repo size compute queue size is 0
[09/30/2018 07:21:29 PM] size-sched.c(96): Repo size compute queue size is 0

Seahub.log

2018-09-26 18:06:25,461 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:11:25,475 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:16:25,486 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:21:25,476 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:26:25,479 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:31:25,504 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:36:25,496 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:41:25,494 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:46:25,507 [WARNING] django.request:152 get_response Not Found: /api2/events/
2018-09-26 18:51:25,499 [WARNING] django.request:152 get_response Not Found: /api2/events/

(not sure this one is relevant to the issue at hand)

ccnet.log:

[09/30/18 19:16:26] ../common/session.c(132): using config file /media/seafile/minipc/conf/ccnet.conf
[09/30/18 19:16:26] ../common/session.c(455): socket file exists, delete it anyway
[09/30/18 19:16:26] ../common/session.c(484): Listen on /media/seafile/minipc/ccnet/ccnet.sock for local clients
[09/30/18 19:16:26] ../common/session.c(290): Update pubinfo file
[09/30/18 19:16:27] ../common/session.c(398): Accepted a local client
[09/30/18 19:16:27] ../common/session.c(398): Accepted a local client
[09/30/18 19:16:28] ../common/session.c(398): Accepted a local client
[09/30/18 19:16:28] ../common/session.c(398): Accepted a local client
[09/30/18 19:16:31] ../common/session.c(398): Accepted a local client
[09/30/18 19:16:31] ../common/peer.c(943): Local peer down

gunicorn_error.log:

[2018-09-30 11:17:06 +0000] [1780] [INFO] Booting worker with pid: 1780
[2018-09-30 11:17:10 +0000] [1794] [INFO] Booting worker with pid: 1794
[2018-09-30 11:17:23 +0000] [1919] [INFO] Booting worker with pid: 1919
[2018-09-30 11:17:54 +0000] [2095] [INFO] Booting worker with pid: 2095
[2018-09-30 11:18:36 +0000] [2147] [INFO] Booting worker with pid: 2147
[2018-09-30 11:21:34 +0000] [2604] [INFO] Booting worker with pid: 2604
[2018-09-30 11:21:34 +0000] [2605] [INFO] Booting worker with pid: 2605
[2018-09-30 11:21:44 +0000] [2627] [INFO] Booting worker with pid: 2627
[2018-09-30 11:22:06 +0000] [2668] [INFO] Booting worker with pid: 2668
[2018-09-30 11:24:30 +0000] [2950] [INFO] Booting worker with pid: 2950

controller.log

[09/30/18 19:16:26] seafile-controller.c(169): starting ccnet-server ...
[09/30/18 19:16:26] seafile-controller.c(73): spawn_process: ccnet-server -F /media/seafile/minipc/conf -c /media/seafile/minipc/ccnet -f /media/seafile/minipc/logs/ccnet.log -d -P /media/seafile/minipc/pids/ccnet.pid
[09/30/18 19:16:26] seafile-controller.c(88): spawned ccnet-server, pid 971
[09/30/18 19:16:27] seafile-controller.c(571): ccnet daemon connected.
[09/30/18 19:16:27] seafile-controller.c(201): starting seaf-server ...
[09/30/18 19:16:27] seafile-controller.c(73): spawn_process: seaf-server -F /media/seafile/minipc/conf -c /media/seafile/minipc/ccnet -d /media/seafile/minipc/seafile-data -l /media/seafile/minipc/logs/seafile.log -P /media/seafile/minipc/pids/seaf-server.pid
[09/30/18 19:16:27] seafile-controller.c(88): spawned seaf-server, pid 995
[09/30/18 19:16:27] seafile-controller.c(544): seafdav not enabled.

WTF… So here’s an update. The clients show the error “failed to get libraries information”, but they still seem to be syncing. I dropped a file I wanted backed up into a synced folder, expecting it to sync once I had fixed the issue, and it synced immediately (pop-up notification saying “file XXXX synced successfully”). But I can’t see it anywhere, as I still can’t access the web UI, and the client still shows the error.

Any help would be very welcome, or I can provide any additional information as needed.

Not sure if this has anything to do with it, but the nginx error.log shows the following:

2018/10/01 15:37:45 [error] 795#795: *8156 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:37:52 [error] 795#795: *8090 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:37:54 [error] 795#795: *8103 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:37:54 [error] 795#795: *8090 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:37:58 [error] 795#795: *8090 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:38:03 [error] 795#795: *8103 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:38:04 [error] 795#795: *8090 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$
2018/10/01 15:38:07 [error] 795#795: *8103 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "POST /$
2018/10/01 15:38:15 [error] 795#795: *8156 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "POST /$
2018/10/01 15:38:16 [error] 795#795: *8156 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$

Now, 192.168.0.1 is the router, but the server is located at a different IP address. Is this supposed to look like that?
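To get a quick overview of lines like the ones above, here is a minimal Python sketch that tallies them by client address and error code. The regular expression only covers the fields visible in this excerpt, and `summarize_nginx_errors` is a name made up for illustration, not part of nginx or Seafile.

```python
import re
from collections import Counter

# Matches only the fields visible in the nginx error lines quoted above.
NGINX_ERR = re.compile(
    r"connect\(\) failed \((?P<errno>\d+): (?P<reason>[^)]+)\)"
    r".*?client: (?P<client>[^,]+),"
)

def summarize_nginx_errors(lines):
    """Count (client, errno, reason) triples found in nginx error.log lines."""
    counts = Counter()
    for line in lines:
        m = NGINX_ERR.search(line)
        if m:
            counts[(m["client"], int(m["errno"]), m["reason"])] += 1
    return counts

# Two lines copied from the log excerpt above.
sample = [
    '2018/10/01 15:37:45 [error] 795#795: *8156 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "GET /s$',
    '2018/10/01 15:38:07 [error] 795#795: *8103 connect() failed (111: Connection refused) while connecting to upstream, client: 192.168.0.1, server: , request: "POST /$',
]
summary = summarize_nginx_errors(sample)
for (client, errno, reason), n in summary.items():
    print(f"{client}: errno {errno} ({reason}) x{n}")
```

In this excerpt every entry is the same failure: errno 111 (connection refused) while nginx was connecting to its upstream, i.e. nothing accepted the connection on the address nginx proxies to. The `client:` field is the source address of the connection that nginx itself accepted.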

Thanks!

Could anyone tell me where to start looking for more details? Does it look like an nginx issue? I haven’t touched the config in years, and it only started failing now. Have there been any recent changes to nginx deprecating any settings? What else could it be? Thanks a lot guys!
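One way to narrow down whether nginx or the backend is at fault is to check, from the server itself, whether anything is listening on the ports nginx proxies to. The sketch below is a rough example under assumptions: the ports 8000 (Seahub) and 8082 (fileserver) are the usual Seafile defaults and may not match this setup, so compare them with the `proxy_pass` lines in the nginx config. The helper names are made up for illustration.

```python
import socket

def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers "connection refused", timeouts and unreachable hosts.
        return False

# Assumed defaults, not taken from this thread: adjust to your proxy_pass targets.
ASSUMED_UPSTREAMS = {"seahub": 8000, "fileserver": 8082}

def check_upstreams(host="127.0.0.1", upstreams=ASSUMED_UPSTREAMS):
    """Map each upstream name to whether its port accepts connections."""
    return {name: port_is_open(host, port) for name, port in upstreams.items()}

if __name__ == "__main__":
    for name, ok in check_upstreams().items():
        print(f"{name}: {'listening' if ok else 'refused or unreachable'}")
```

If a port reports as refused, nginx is probably only relaying the failure (hence the 502), and the process that should be listening on that port is the place to look.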

“Now, 192.168.0.1 is the router, but the server is located at a different IP address”

This means you have an IP address configuration issue somewhere. There’s no reason for your router IP to be showing up as talking to the server at any point.

Check all client and server IP address configuration. Verify you don’t have that IP (192.168.0.1) in use on a client accidentally.

Thanks… It seems that field shows wherever the requests come from, but I’m not very sure. I can see the gunicorn error log keeps showing new workers being booted, but there are no entries of any other kind around the time the server started to fail. A couple of days before the trouble began, it started outputting only “Booting worker with pid…” lines. What is this about? What is gunicorn used for, and what is the worker supposed to do?
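For context, gunicorn is a Python WSGI server; judging by the gunicorn_error.log file in the Seafile logs directory, it is what runs Seahub (the web UI) here. Its master process writes a “Booting worker” line each time it starts a worker process, so a long run of such lines with new pids suggests workers keep exiting and being replaced, although the log itself does not say why. A minimal sketch, assuming the line format shown in the excerpt above, to measure how often workers are being booted (`worker_boots` and `boot_gaps_seconds` are names made up for illustration):

```python
import re
from datetime import datetime

# Matches the gunicorn lines quoted earlier in this thread.
BOOT = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[\d+\] \[INFO\] Booting worker with pid: (?P<pid>\d+)"
)

def worker_boots(lines):
    """Return (timestamp, pid) for every 'Booting worker' line."""
    boots = []
    for line in lines:
        m = BOOT.match(line)
        if m:
            ts = datetime.strptime(m["ts"], "%Y-%m-%d %H:%M:%S %z")
            boots.append((ts, int(m["pid"])))
    return boots

def boot_gaps_seconds(boots):
    """Seconds elapsed between consecutive worker boots."""
    return [(b[0] - a[0]).total_seconds() for a, b in zip(boots, boots[1:])]

# Three lines copied from the gunicorn_error.log excerpt.
sample = [
    "[2018-09-30 11:17:06 +0000] [1780] [INFO] Booting worker with pid: 1780",
    "[2018-09-30 11:17:10 +0000] [1794] [INFO] Booting worker with pid: 1794",
    "[2018-09-30 11:17:23 +0000] [1919] [INFO] Booting worker with pid: 1919",
]
boots = worker_boots(sample)
gaps = boot_gaps_seconds(boots)
print(len(boots), "boots, gaps in seconds:", gaps)
```

Short gaps between boots over a long period would point at workers dying shortly after start, which fits nginx getting refused connections in between.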

Thanks!

Ok, so hopefully the final update on this one:
It seems Seafile wasn’t (necessarily) the culprit. The Seafile install and DBs live on a separate partition. I restored the server VM to its state from a full week earlier, and lo and behold, Seafile was running again. I forced a full backup and updated all the outdated packages, and it keeps running after a reboot. I will keep a close eye on it, but it seems to be working again.

Thanks!