On my server I noticed that the total number of open file descriptors grew linearly after a recent Seafile restart. Upon investigating, it turned out that for some reason ccnet was not running, causing seafile-controller to repeatedly attempt to connect to ccnet.sock without success. Pointing strace at the seafile-controller process showed that it allocates a new socket for each attempt, without closing the old one:
# strace -f -p 11827
strace: Process 11827 attached
restart_syscall(<... resuming interrupted poll ...>) = 0
socket(AF_UNIX, SOCK_STREAM, 0) = 28
connect(28, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000) = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0) = 29
connect(29, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000) = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0) = 30
connect(30, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000) = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0) = 31
connect(31, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
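The growth can also be confirmed without strace by counting the entries in /proc/<pid>/fd at intervals. A minimal sketch, assuming Linux with /proc mounted; count_open_fds is a hypothetical helper written for this post, not part of Seafile or ccnet:

```c
#include <dirent.h>
#include <stdio.h>
#include <sys/types.h>

/* Hypothetical helper: count the open file descriptors of process `pid`
 * by listing /proc/<pid>/fd (Linux only). Returns -1 on error.
 * When called on the current process, the count includes the descriptor
 * that opendir() itself holds while the directory is being read. */
int count_open_fds(pid_t pid)
{
    char path[64];
    struct dirent *entry;
    DIR *dir;
    int count = 0;

    snprintf(path, sizeof(path), "/proc/%ld/fd", (long)pid);
    dir = opendir(path);
    if (dir == NULL)
        return -1;

    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] != '.')   /* skip "." and ".." */
            count++;
    }
    closedir(dir);
    return count;
}
```

Calling this once per second on the seafile-controller PID (11827 in the trace above) should show the count rising by one per failed attempt if the sockets are indeed leaked.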
Setting a breakpoint at connect and fetching a backtrace with gdb points to ccnet_client_connect_daemon as the culprit:
(gdb) bt full
#0 connect () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1 0x00007f2126bd1222 in ccnet_client_connect_daemon () from target:*redacted*/seafile/lib/libccnet.so.0
No symbol table info available.
#2 0x000000000040518a in ?? ()
No symbol table info available.
#3 0x00007f2125120eed in g_timeout_dispatch () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#4 0x00007f21251204c9 in g_main_context_dispatch () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#5 0x00007f2125120818 in g_main_context_iterate.isra () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#6 0x00007f2125120aea in g_main_loop_run () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#7 0x0000000000403dd1 in ?? ()
No symbol table info available.
#8 0x00007f21249342e1 in __libc_start_main (main=0x403880, argc=8, argv=0x7ffde4ba5a08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffde4ba59f8)
at ../csu/libc-start.c:291
result = <optimized out>
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -3737018442206360469, 4212017, 140728440871424, 0, 0, 3735938693221108843, 3630244487668092011}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0},
data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
not_first_call = <optimized out>
#9 0x000000000040455a in ?? ()
No symbol table info available.
And indeed, ccnet_client_connect_daemon returns when connect fails without closing the socket or freeing anything: https:// github. com/haiwen/ccnet/blob/5b9f64c2438517e1c95b28678097419542d1d084/lib/ccnet-client.c#L267 (remove spaces from the URL, I may not post links yet).
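For illustration, here is roughly what I would expect the failure path to look like. This is only a sketch of the general pattern, not the actual ccnet code: connect_unix_socket is a hypothetical stand-in for the relevant part of ccnet_client_connect_daemon, and the real function does more than this.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper, not the actual ccnet code: try to connect to the
 * Unix domain socket at `path`. Returns the connected descriptor, or -1
 * with errno set. The descriptor is closed on every failure path. */
int connect_unix_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, saved_errno;

    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path);   /* length was checked above */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        saved_errno = errno;   /* close() may overwrite errno */
        close(fd);             /* without this, each retry leaks one fd */
        errno = saved_errno;
        return -1;
    }
    return fd;
}
```

With the close in place, a retry loop like the one in the strace output would get the same descriptor number back from socket on every attempt instead of 28, 29, 30, 31 and so on.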