Seafile Server 7.1.x deteriorating

Recently, one of my Seafile Servers started to deteriorate with messages like this in /var/log/kern.log:

Feb 26 08:20:10 transfer kernel: [227297.467719] pool[14757]: segfault at 8 ip 0000000000442654 sp 00007f8464fd7750 error 4 in seaf-server[400000+105000]
Feb 26 08:20:10 transfer kernel: [227297.467734] Code: c7 44 24 68 00 00 00 00 48 85 db 75 2a e9 9c 01 00 00 0f 1f 40 00 e8 cb 57 fc ff 48 8b 7c 24 68 48 85 ff 74 05 e8 bc 57 fc ff <48> 8b 5b 08 48 85 db 0f 84 77 01 00 00 48 8b 2b 48 8d 54 24 60 31
Feb 26 08:52:41 transfer kernel: [229249.383857] pool[17967]: segfault at 8 ip 0000000000442654 sp 00007fccb4ff8750 error 4 in seaf-server[400000+105000]
Feb 26 08:52:41 transfer kernel: [229249.383872] Code: c7 44 24 68 00 00 00 00 48 85 db 75 2a e9 9c 01 00 00 0f 1f 40 00 e8 cb 57 fc ff 48 8b 7c 24 68 48 85 ff 74 05 e8 bc 57 fc ff <48> 8b 5b 08 48 85 db 0f 84 77 01 00 00 48 8b 2b 48 8d 54 24 60 31
Feb 26 08:57:39 transfer kernel: [229547.207478] pool[18048]: segfault at 8 ip 0000000000442654 sp 00007f5e197f9750 error 4 in seaf-server[400000+105000]
Feb 26 08:57:39 transfer kernel: [229547.207490] Code: c7 44 24 68 00 00 00 00 48 85 db 75 2a e9 9c 01 00 00 0f 1f 40 00 e8 cb 57 fc ff 48 8b 7c 24 68 48 85 ff 74 05 e8 bc 57 fc ff <48> 8b 5b 08 48 85 db 0f 84 77 01 00 00 48 8b 2b 48 8d 54 24 60 31
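
For anyone reading these kernel lines for the first time, here is a minimal Python sketch that pulls the interesting fields out of a "segfault at" message. It assumes the usual x86 format, in which the error value is printed in hex and its low bits mean: bit 0 = page was present, bit 1 = write access, bit 2 = user mode. The function name decode_segfault is just an illustration, not part of any Seafile or kernel tool.

```python
import re

# Matches the kernel's x86 "segfault at ..." message as it appears in kern.log.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return the decoded fields of a segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    err = int(m.group("err"), 16)   # page-fault error code bits
    ip = int(m.group("ip"), 16)     # instruction pointer at the time of the fault
    base = int(m.group("base"), 16) # start address of the mapping
    return {
        "fault_address": int(m.group("addr"), 16),
        "object": m.group("obj"),
        "offset_in_mapping": ip - base,
        "page_present": bool(err & 1),
        "access": "write" if err & 2 else "read",
        "mode": "user" if err & 4 else "kernel",
    }


line = ("pool[14757]: segfault at 8 ip 0000000000442654 sp 00007f8464fd7750 "
        "error 4 in seaf-server[400000+105000]")
info = decode_segfault(line)
print(info)
```

For the lines above this reports a user-mode read of a non-present page at address 8, which is what one would typically expect from reading a field at offset 8 through a NULL pointer. The instruction pointer is identical in all three crashes, so it is the same code path each time.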

It’s a virtual machine running on Hyper-V. The operating system is Debian GNU/Linux 10 (buster).
Users just get a nondescript error. After refreshing the page a few times, it usually works.
I risked an upgrade from 7.1.4 to 7.1.5, but it didn’t fix anything.

I’m assuming there’s a connection between these faults and the damaged repos:

[02/26/21 00:18:49] gc-core.c(456): === GC is finished ===
[02/26/21 00:18:49] gc-core.c(460): The following repos are damaged. You can run seaf-fsck to fix them.
[02/26/21 00:18:49] gc-core.c(463): 1e13c35a-a82b-4cac-91e1-d7893371ee31
[02/26/21 00:18:49] gc-core.c(463): 28785a8d-e1d4-4749-a088-d433b367f4d1
[02/26/21 00:18:49] gc-core.c(463): 2d4f54a5-4e20-473a-b1ae-590bd620f3a7
[02/26/21 00:18:49] gc-core.c(463): 31e7cc1f-3c03-49b4-9104-8fdb678c8eef
[02/26/21 00:18:49] gc-core.c(463): 3e0f3a80-e142-4f90-80e1-f3d9a3bf4aba
[02/26/21 00:18:49] gc-core.c(463): 55740408-9a2e-457a-9bb0-c41cbd03c50a
[02/26/21 00:18:49] gc-core.c(463): 56b714ef-8828-4a3d-9121-8685b2b6ab9b
[02/26/21 00:18:49] gc-core.c(463): 71e03931-033a-4617-b4b9-036ec7b1bf76
[02/26/21 00:18:49] gc-core.c(463): 7feff8fd-fddc-4ec6-8f9d-b6a7e68c7a02
[02/26/21 00:18:49] gc-core.c(463): 83954dce-7ad9-4b56-9d6f-08b743e33680
[02/26/21 00:18:49] gc-core.c(463): 8b2375bd-c267-4d6a-ad5c-716f8275f55d
[02/26/21 00:18:49] gc-core.c(463): 8d37713c-e86d-4814-953e-cff826374d63
[02/26/21 00:18:49] gc-core.c(463): 92a719dd-0cb2-4af4-8c69-80bdff0f057a
[02/26/21 00:18:49] gc-core.c(463): 93d032b3-cfdd-424a-b5a4-23982dc7aa98
[02/26/21 00:18:49] gc-core.c(463): 985b9145-86b7-472f-8d79-ed764a2413b5
[02/26/21 00:18:49] gc-core.c(463): 9cacc932-36f7-4345-9e56-19bce57b19a0
[02/26/21 00:18:49] gc-core.c(463): 9e21eb93-cd8c-4084-8c9d-7fd72ef3596e
[02/26/21 00:18:49] gc-core.c(463): bc711457-2f8b-4e4a-8fe4-b130cfe08263
[02/26/21 00:18:49] gc-core.c(463): d9d8601c-f57a-48cd-891c-d7b64c9aad6f
[02/26/21 00:18:49] gc-core.c(463): e1e3f9fb-5e4e-4b93-a0a0-3f5f1eac5d24
[02/26/21 00:18:49] gc-core.c(463): e283790b-cd2d-4ef9-b192-db210593fb6d
[02/26/21 00:18:49] gc-core.c(463): fe9b4056-7b41-43af-838e-542963e71aad
seafserv-gc run done

seaf-fsck is run periodically but it doesn’t fix any of these repos.

This is from seaf-fuse.log:

2021-02-26 00:19:03.104 - <139635882320256> wsgidav.dc.seahub_db INFO : Init seahub database…
2021-02-26 00:19:03.138 - <139635882320256> wsgidav.wsgidav_app INFO : WsgiDAV/3.0.4 Python/3.7.3 Linux-4.19.0-14-amd64-x86_64-with-debian-10.8
2021-02-26 00:19:03.138 - <139635882320256> wsgidav.wsgidav_app INFO : Lock manager: LockManager(LockStorageDict)
2021-02-26 00:19:03.138 - <139635882320256> wsgidav.wsgidav_app INFO : Property manager: None
2021-02-26 00:19:03.138 - <139635882320256> wsgidav.wsgidav_app INFO : Domain controller: SeafileDomainController()
2021-02-26 00:19:03.139 - <139635882320256> wsgidav.wsgidav_app INFO : Registered DAV providers by route:
2021-02-26 00:19:03.139 - <139635882320256> wsgidav.wsgidav_app INFO : - ‘/:dir_browser’: FilesystemProvider for path ‘/home/seafile/seafile-server-7.1.5/seahub/thirdpart/wsgidav/dir_browser/htdocs’ (Read-Only)
2021-02-26 00:19:03.139 - <139635882320256> wsgidav.wsgidav_app INFO : - ‘/seafdav’: SeafileProvider for Seafile (Read-Write)
2021-02-26 00:19:03.139 - <139635882320256> wsgidav.wsgidav_app WARNING : Basic authentication is enabled: It is highly recommended to enable SSL.
2021-02-26 00:19:03.139 - <139635882320256> wsgidav WARNING : Could not import lxml: using xml instead (up to 10% slower). Consider pip install lxml(see https://pypi.python.org/pypi/lxml).

This, I don’t get either:

root@transfer:~# pip install lxml
-bash: pip: command not found
root@transfer:~# pip3 install lxml
Requirement already satisfied: lxml in /usr/local/lib/python3.7/dist-packages (4.5.1)

And seafile.log contains a lot of this:

[02/26/21 05:00:59] repo-mgr.c(256): Commit 43fdecaf3285ec10a42a7b097c8982ad727bcec3 is missing
[02/26/21 05:00:59] repo-mgr.c(322): Repo 31e7cc1f is corrupted.
[02/26/21 05:00:59] repo-mgr.c(256): Commit f4053115119e997642ba917cd3e325c45d8dc6dc is missing
[02/26/21 05:00:59] repo-mgr.c(322): Repo 83954dce is corrupted.
[02/26/21 05:01:21] repo-mgr.c(256): Commit e6d1bb360984253ba8f81f15af04eb02d1e5299d is missing
[02/26/21 05:01:21] repo-mgr.c(322): Repo d9d8601c is corrupted.
[02/26/21 05:01:21] repo-mgr.c(256): Commit 6f02963119789f7e1fb895f2835500779102ac48 is missing
[02/26/21 05:01:21] repo-mgr.c(322): Repo e283790b is corrupted.
[02/26/21 05:01:21] repo-mgr.c(256): Commit c82f16070d9f398763362dc68bada644c0d493a9 is missing
[02/26/21 05:01:21] repo-mgr.c(322): Repo 55740408 is corrupted.
[02/26/21 05:01:21] repo-mgr.c(256): Commit 26fc87e4bc6afae63e49ef6811cf0ffa180cce67 is missing
[02/26/21 05:01:21] repo-mgr.c(322): Repo 71e03931 is corrupted.
[02/26/21 05:01:29] repo-mgr.c(256): Commit ea6031215395bafdcee56e0b556ed0d5adb07560 is missing
[02/26/21 05:01:29] repo-mgr.c(322): Repo 28785a8d is corrupted.
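
seafile.log only prints the first 8 characters of each repo ID, while the GC output lists the full IDs. A small sketch like the following can match the two up. It assumes the log formats shown in this thread; the function name match_corrupted and the sample excerpts are only for illustration.

```python
import re

UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
# Full repo IDs as listed by the GC run ("gc-core.c(463): <uuid>").
GC_ID_RE = re.compile(r"gc-core\.c\(\d+\): (" + UUID + r")\s*$", re.MULTILINE)
# Shortened repo IDs as logged by seaf-server ("Repo 31e7cc1f is corrupted.").
CORRUPT_RE = re.compile(r"Repo ([0-9a-f]{8}) is corrupted")


def match_corrupted(gc_log_text, seafile_log_text):
    """Map each short ID from seafile.log to the full IDs reported by GC."""
    full_ids = GC_ID_RE.findall(gc_log_text)
    short_ids = sorted(set(CORRUPT_RE.findall(seafile_log_text)))
    return {s: [f for f in full_ids if f.startswith(s)] for s in short_ids}


gc_log = """\
[02/26/21 00:18:49] gc-core.c(460): The following repos are damaged. You can run seaf-fsck to fix them.
[02/26/21 00:18:49] gc-core.c(463): 28785a8d-e1d4-4749-a088-d433b367f4d1
[02/26/21 00:18:49] gc-core.c(463): 31e7cc1f-3c03-49b4-9104-8fdb678c8eef
"""
seafile_log = """\
[02/26/21 05:00:59] repo-mgr.c(322): Repo 31e7cc1f is corrupted.
[02/26/21 05:01:29] repo-mgr.c(322): Repo 28785a8d is corrupted.
"""
matches = match_corrupted(gc_log, seafile_log)
for short_id, full in matches.items():
    print(short_id, "->", full)
```

In the excerpts posted here, every repo that seafile.log calls corrupted also shows up in the GC list of damaged repos.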

And seahub.log shows a lot of these:

2021-02-26 11:13:50,232 [ERROR] django.request:135 handle_uncaught_exception Internal Server Error: /api2/account/info/
Traceback (most recent call last):
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/pysearpc/utils.py”, line 30, in sendall
n = fd.send(data[offset:])
BrokenPipeError: [Errno 32] Broken pipe
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/core/handlers/exception.py”, line 41, in inner
response = get_response(request)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/core/handlers/base.py”, line 249, in _legacy_get_response
response = self._get_response(request)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/core/handlers/base.py”, line 187, in _get_response
response = self.process_exception_by_middleware(e, request)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/core/handlers/base.py”, line 185, in _get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/views/decorators/csrf.py”, line 58, in wrapped_view
return view_func(*args, **kwargs)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/django/views/generic/base.py”, line 68, in view
return self.dispatch(request, *args, **kwargs)
File “/home/seafile/seafile-server-7.1.5/seahub/seahub/api2/base.py”, line 23, in dispatch
response = super(APIView, self).dispatch(*a, **kw)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/rest_framework/views.py”, line 505, in dispatch
response = self.handle_exception(exc)
File “/home/seafile/seafile-server-7.1.5/seahub/seahub/api2/base.py”, line 20, in handle_exception
return super(APIView, self).handle_exception(exc)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/rest_framework/views.py”, line 465, in handle_exception
self.raise_uncaught_exception(exc)
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/rest_framework/views.py”, line 476, in raise_uncaught_exception
raise exc
File “/home/seafile/seafile-server-7.1.5/seahub/thirdpart/rest_framework/views.py”, line 502, in dispatch
response = handler(request, *args, **kwargs)
File “/home/seafile/seafile-server-7.1.5/seahub/seahub/api2/views.py”, line 343, in get
return Response(self._get_account_info(request))
File “/home/seafile/seafile-server-7.1.5/seahub/seahub/api2/views.py”, line 309, in _get_account_info
quota_total = seafile_api.get_user_quota(email)
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/seaserv/api.py”, line 724, in get_user_quota
return seafserv_threaded_rpc.get_user_quota(username)
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/pysearpc/client.py”, line 126, in newfunc
ret_str = self.call_remote_func_sync(fcall_str)
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/pysearpc/named_pipe.py”, line 101, in call_remote_func_sync
ret_str = transport.send(self.service_name, fcall_str)
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/pysearpc/named_pipe.py”, line 62, in send
sendall(self.pipe, header)
File “/home/seafile/seafile-server-7.1.5/seafile/lib64/python3.6/site-packages/pysearpc/utils.py”, line 32, in sendall
raise NetworkError(‘Failed to write to socket: %s’ % e)
pysearpc.errors.NetworkError: Failed to write to socket: [Errno 32] Broken pipe

I don’t know what to tell you about lxml: it is already installed, but Seafile doesn’t seem to know that. I wouldn’t worry about it until you get the other problems taken care of.

Maybe it’s just been bad luck, but of the 4 Hyper-V servers I have ever had to deal with, 2 of them at some point screwed something up in the NTFS and corrupted the VHDs for at least 1 VM. So, I would first check the host’s logs for warning signs, and run a chkdsk if you can.

Assuming that doesn’t show any problems, and given that you are getting these same errors for several repos, I would suggest that you start by checking /var/log/syslog and dmesg for more serious errors. Since it is virtual, it isn’t likely that you will see hardware-related problems in those logs, but you might see something that gives a better idea of how bad it is, or whether something else is involved.

After that, shut down the Seafile services, the database, etc., unmount the filesystem(s) you store the Seafile data and database on, and check them with either fsck or xfs_repair (or whatever is the right tool for the filesystem). If it says it fixed anything, run it again, just to make sure everything is good.

Then mount the filesystems. If there were errors in syslog from your database (mysql, or postgres), check the database to see if fixing the filesystem resolved the database issues.

Then run seaf-fsck. See https://manual.seafile.com/maintain/seafile_fsck/
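
To make the order of those steps explicit, here is a rough sketch that only builds the list of commands and does not run anything. Everything passed in (install directory, device, mount point, database service name) is a placeholder you would have to replace, and build_check_plan is a made-up helper name. seahub.sh, seafile.sh and seaf-fsck.sh are the scripts shipped in the Seafile server directory.

```python
def build_check_plan(install_dir, device, mount_point, db_service, fs_tool="fsck"):
    """Return the maintenance commands in the order described above (dry run only)."""
    if fs_tool not in ("fsck", "xfs_repair"):
        raise ValueError("fs_tool must be 'fsck' or 'xfs_repair'")
    return [
        f"{install_dir}/seahub.sh stop",      # stop the web frontend
        f"{install_dir}/seafile.sh stop",     # stop seaf-server
        f"systemctl stop {db_service}",       # stop the database
        f"umount {mount_point}",              # filesystem must not be mounted
        f"{fs_tool} {device}",                # repeat this step if it fixed anything
        f"mount {device} {mount_point}",
        f"systemctl start {db_service}",      # seaf-fsck needs the database
        f"{install_dir}/seaf-fsck.sh",        # check the libraries
    ]


# All four values below are hypothetical examples.
plan = build_check_plan(
    install_dir="/home/seafile/seafile-server-latest",
    device="/dev/sdb1",
    mount_point="/srv/seafile-data",
    db_service="mariadb",
)
for step in plan:
    print(step)
```

Printing the plan first and then running the steps by hand keeps you in control if the filesystem check reports something unexpected.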

Good luck, I hope it turns out to be only a minor issue.

Thanks for your input. Actually, the server is completely rebooted every day. Apparently, some time ago I disabled the run of seaf-fsck in the startup script because it’s taking pretty long (about 2 hours). After enabling it again, the issue seems to be gone. At least, I don’t see any crashing seaf-server in the logs anymore.

Actually, it’s not quite solved yet. I removed all the libraries that seaf-fsck couldn’t fix. Apparently, they weren’t accessible anymore anyway. With seaf-fuse the affected libraries weren’t shown at all and in the web interface transferring or sharing resulted in “Library blabla not found”. All I could do was delete them.
However, there are still some repo IDs for which I couldn’t find any corresponding library. seaf-server is still crashing occasionally. This time I used gdb to get a stacktrace:

(gdb) attach 5244
Attaching to process 5244
[New LWP 5245]
[New LWP 5246]
[New LWP 5247]
[New LWP 5248]
[New LWP 5249]
[New LWP 5251]
[New LWP 5252]
[New LWP 5253]
[New LWP 5254]
[New LWP 5255]
[New LWP 5256]
[New LWP 5257]
[New LWP 5258]
[New LWP 5259]
[New LWP 5260]
[Thread debugging using libthread_db enabled]
Using host libthread_db library “/lib/x86_64-linux-gnu/libthread_db.so.1”.
0x00007fd059b237ef in epoll_wait (epfd=5, events=0x26d8290, maxevents=32, timeout=1000) at …/sysdeps/unix/sysv/linux/epoll_wait.c:30
30 …/sysdeps/unix/sysv/linux/epoll_wait.c: No such file or directory.
(gdb) c
Continuing.
[New Thread 0x7fd04ffff700 (LWP 5398)]
[New Thread 0x7fd045ffb700 (LWP 5399)]
[New Thread 0x7fd0457fa700 (LWP 5400)]
[New Thread 0x7fd044ff9700 (LWP 5401)]
[New Thread 0x7fd03ffff700 (LWP 5403)]
[New Thread 0x7fd03f7fe700 (LWP 5404)]
[Thread 0x7fd03f7fe700 (LWP 5404) exited]

Thread 18 “pool” received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fd045ffb700 (LWP 5399)]
seaf_fill_repo_obj_from_commit (repos=repos@entry=0x7fd045ffa800) at repo-mgr.c:3223
3223 repo-mgr.c: No such file or directory.
(gdb) bt
#0 seaf_fill_repo_obj_from_commit (repos=repos@entry=0x7fd045ffa800) at repo-mgr.c:3223
#1 0x000000000044476a in seaf_get_group_repos_by_user (mgr=0x2709910, user=, org_id=, error=) at repo-mgr.c:4471
#2 0x000000000040b45b in marshal_objlist__string (func=0x44e370 <seafile_get_group_repos_by_user>, param_array=, ret_len=0x7fd045ffac30) at …/lib/searpc-marshal.h:1195
#3 0x00007fd05b9670e5 in searpc_server_call_function (svc_name=svc_name@entry=0x7fd008028720 “seafserv-threaded-rpcserver”,
func=func@entry=0x7fd008028010 “[“get_group_repos_by_user”, “USER@DOMAIN.TLD”]”, len=56, ret_len=ret_len@entry=0x7fd045ffac30) at searpc-server.c:380
#4 0x00007fd05b968046 in named_pipe_client_handler (data=0x7fd04000b5e0) at searpc-named-pipe-transport.c:295
#5 0x00007fd05b0ccedc in g_thread_pool_thread_proxy () from /home/seafile/seafile-server-7.1.5/seafile/lib/libglib-2.0.so.0
#6 0x00007fd05b0cc540 in g_thread_proxy () from /home/seafile/seafile-server-7.1.5/seafile/lib/libglib-2.0.so.0
#7 0x00007fd05a57efa3 in start_thread (arg=) at pthread_create.c:486
#8 0x00007fd059b234cf in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb)

(gdb) bt full
#0 seaf_fill_repo_obj_from_commit (repos=repos@entry=0x7fd045ffa800) at repo-mgr.c:3223
repo =
commit =
repo_id = 0x7fd0080023a0 “\200\034\002\b\320\177”
commit_id = 0x7fd00800ab00 “\260\211\001\b\320\177”
repo_name = 0x0
last_modifier = 0x0
p = 0x0
next =
#1 0x000000000044476a in seaf_get_group_repos_by_user (mgr=0x2709910, user=, org_id=, error=) at repo-mgr.c:4471
group =
groups = 0x7fd008008740
p = 0x7fd008008680
q = 0x0
repos = 0x7fd008018d40
repo =
rpc_client =
sql = 0x7fd008008660
group_id = 7
repo_group_id = 7
group_name = 0x7fd00800bbc0 “\200”, <incomplete sequence \327>
#2 0x000000000040b45b in marshal_objlist__string (func=0x44e370 <seafile_get_group_repos_by_user>, param_array=, ret_len=0x7fd045ffac30) at …/lib/searpc-marshal.h:1195
error = 0x0
param1 =
ret =
object =
#3 0x00007fd05b9670e5 in searpc_server_call_function (svc_name=svc_name@entry=0x7fd008028720 “seafserv-threaded-rpcserver”,
func=func@entry=0x7fd008028010 “[“get_group_repos_by_user”, " USER@DOMAIN.TLD”]", len=56, ret_len=ret_len@entry=0x7fd045ffac30) at searpc-server.c:380
service = 0x26dc540
array = 0x7fd008002a30
ret =
jerror = {line = -1, column = -1, position = 56,
source = “\000\000\000\000\060#\000\b\320\177\000\000\001\000\000\000\000\000\000\000\227A\263Y\320\177\000\000\200\000\000\000\000\000\000\000\a\000\000\000\000\000\000\000\a\000\000\000\000\000\000\000\220\000\000\000\000\000\000\000\t\000\000\000\320\177\000\000\b\t\000\b”,
text = “\000\177\000\000\177\000\000\000\000\000\000\000\260\000\000\000\000\000\000\000\377\377\377\377\377\377\377\002\000\000\000\000\000\000\000\t\000\000\000\062", '\000' <repeats 19 times>, "[\000\000\000n\000\000\000\241\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\a\000\000\000\000\000\000\000\260\000\000\000\000\000\000\000\v\000\000\000\000\000\000\000\030\t\000\b\320\177\000\000\237\000\000\000\000\000\000\000\000\221chδMz\377\377\377\377\377\377\377\210\253\377E\320\177\000\000\001\000\000\000\000\000\000\000\020\200\002\b”}
error = 0x0
start = {tv_sec = -160, tv_usec = 8812898844441547008}
end = {tv_sec = 1, tv_usec = 30}
intv =
fname = 0x7fd00800a1e0 “get_group_repos_by_user”
fitem =
#4 0x00007fd05b968046 in named_pipe_client_handler (data=0x7fd04000b5e0) at searpc-named-pipe-transport.c:295
service = 0x7fd008028720 “seafserv-threaded-rpcserver”
body = 0x7fd008028010 “[“get_group_repos_by_user”, “USER@DOMAIN.TLD”]”
ret_len = 18446744073709551615
ret_str =
handler_data = 0x7fd04000b5e0
connfd = 65
len = 117
bufsize = 4096
buf = 0x7fd008001320 “{“service”: “seafserv-threaded-rpcserver”, “request”: “[\“get_group_repos_by_user\”, \“USER@DOMAIN.TLD\”]”}47cd\”, \“USER@DOMAIN.TLD\”]"}"}ilen.pdf\"]"}r.mp4\"]"}"}"}"]"}.zip\"]"}s"…
#5 0x00007fd05b0ccedc in g_thread_pool_thread_proxy () from /home/seafile/seafile-server-7.1.5/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#6 0x00007fd05b0cc540 in g_thread_proxy () from /home/seafile/seafile-server-7.1.5/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#7 0x00007fd05a57efa3 in start_thread (arg=) at pthread_create.c:486
ret =
pd =
now =
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140532504311552, -5543852356342406112, 140532788558766, 140532788558767, 140532504311552, 40717616, 5526141424728253472,
5526183959851803680}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call =
#8 0x00007fd059b234cf in clone () at …/sysdeps/unix/sysv/linux/x86_64/clone.S:95

It looks like I was able to fix this by removing references to these orphaned repositories, that seaf-fsck couldn’t repair:
fe9b4056-7b41-43af-838e-542963e71XXX
985b9145-86b7-472f-8d79-ed764a24XXX
8d37713c-e86d-4814-953e-cff826374XXX
71e03931-033a-4617-b4b9-036ec7b1XXX
31e7cc1f-3c03-49b4-9104-8fdb678cXXXX

For each of these IDs, I deleted the corresponding rows from seafile-db:
delete from Branch where repo_id = 'XXX';
delete from Repo where repo_id = 'XXX';
delete from RepoFileCount where repo_id = 'XXX';
delete from RepoGroup where repo_id = 'XXX';
delete from RepoHead where repo_id = 'XXX';
delete from RepoInfo where repo_id = 'XXX';
delete from RepoOwner where repo_id = 'XXX';
delete from RepoSize where repo_id = 'XXX';
delete from RepoValidSince where repo_id = 'XXX';
delete from SharedRepo where repo_id = 'XXX';
delete from VirtualRepo where repo_id = 'XXX';
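
If there are several IDs to clean up, typing these statements by hand gets error-prone. A minimal sketch that generates them for one repo ID could look like this. The table list is the one given above; the helper name delete_statements and the all-zero example ID are placeholders. It only prints SQL and does not touch the database, and taking a database backup before running the output would be prudent.

```python
import re

# Tables in seafile-db that reference a repo_id, as listed above.
TABLES = [
    "Branch", "Repo", "RepoFileCount", "RepoGroup", "RepoHead", "RepoInfo",
    "RepoOwner", "RepoSize", "RepoValidSince", "SharedRepo", "VirtualRepo",
]
# Only accept a complete repo ID, so a typo cannot produce a broken statement.
UUID_RE = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")


def delete_statements(repo_id):
    """Return one DELETE statement per table for the given full repo ID."""
    if not UUID_RE.match(repo_id):
        raise ValueError("not a full repo ID: %r" % repo_id)
    return ["DELETE FROM %s WHERE repo_id = '%s';" % (table, repo_id) for table in TABLES]


# Placeholder ID for illustration; substitute the orphaned repo IDs.
statements = delete_statements("00000000-0000-0000-0000-000000000000")
print("\n".join(statements))
```

Validating the ID first is a deliberate choice: since the value is pasted straight into SQL text, anything that is not a plain repo ID is rejected instead of being quoted.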

After restarting the server, I no longer see any crashes or spurious errors in the web interface. I believe that if seaf-fsck checked a bit more thoroughly, it could have done the same cleanup. seaf-server could also handle these inconsistencies more gracefully, because they apparently cause NULL pointer dereferences in some C functions.

The actual cause of these orphaned repositories was apparently an event about two years ago, when the storage ran out of space due to a lot of uploads, causing inconsistencies between the on-disk state and the database.