Seafile leaks sockets when unable to connect to ccnet.sock

On my server I noticed that the total number of open file descriptors grew linearly after a recent Seafile restart. Investigation showed that ccnet was not running for some reason, so seafile-controller repeatedly tried, and failed, to connect to ccnet.sock. Attaching strace to the seafile-controller process showed that it allocates a new socket on each attempt without closing the old one:

# strace -f -p 11827
strace: Process 11827 attached
restart_syscall(<... resuming interrupted poll ...>) = 0
socket(AF_UNIX, SOCK_STREAM, 0)         = 28
connect(28, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000)  = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0)         = 29
connect(29, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000)  = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0)         = 30
connect(30, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)
poll([{fd=4, events=POLLIN}], 1, 1000)  = 0 (Timeout)
socket(AF_UNIX, SOCK_STREAM, 0)         = 31
connect(31, {sa_family=AF_UNIX, sun_path="*redacted*ccnet.sock"}, 110) = -1 ENOENT (No such file or directory)

Setting a breakpoint on connect and fetching a backtrace with gdb points to ccnet_client_connect_daemon as the culprit:

(gdb) bt full
#0  connect () at ../sysdeps/unix/syscall-template.S:84
No locals.
#1  0x00007f2126bd1222 in ccnet_client_connect_daemon () from target:*redacted*/seafile/lib/libccnet.so.0
No symbol table info available.
#2  0x000000000040518a in ?? ()
No symbol table info available.
#3  0x00007f2125120eed in g_timeout_dispatch () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#4  0x00007f21251204c9 in g_main_context_dispatch () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#5  0x00007f2125120818 in g_main_context_iterate.isra () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#6  0x00007f2125120aea in g_main_loop_run () from target:*redacted*/seafile/lib/libglib-2.0.so.0
No symbol table info available.
#7  0x0000000000403dd1 in ?? ()
No symbol table info available.
#8  0x00007f21249342e1 in __libc_start_main (main=0x403880, argc=8, argv=0x7ffde4ba5a08, init=<optimized out>, fini=<optimized out>, rtld_fini=<optimized out>, stack_end=0x7ffde4ba59f8)
    at ../csu/libc-start.c:291
        result = <optimized out>
        unwind_buf = {cancel_jmp_buf = {{jmp_buf = {0, -3737018442206360469, 4212017, 140728440871424, 0, 0, 3735938693221108843, 3630244487668092011}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, 
            data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
        not_first_call = <optimized out>
#9  0x000000000040455a in ?? ()
No symbol table info available.

And indeed, ccnet_client_connect_daemon returns when connect fails without closing the socket or freeing anything: https:// github. com/haiwen/ccnet/blob/5b9f64c2438517e1c95b28678097419542d1d084/lib/ccnet-client.c#L267 (remove spaces from the URL, I may not post links yet).
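
For illustration, here is a minimal sketch of the usual pattern that avoids this kind of leak: close the descriptor on the connect failure path before returning. This is not the actual ccnet code or an upstream patch; the function name connect_unix_or_close is hypothetical, and it only assumes a plain AF_UNIX stream socket like the one in the strace output above.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of ccnet): connect to a Unix domain
 * socket at `path`. Returns the connected fd on success, or -1 on
 * failure with errno set. On failure the socket is closed, so repeated
 * retries do not accumulate file descriptors.
 */
int connect_unix_or_close(const char *path)
{
    struct sockaddr_un addr;
    int fd;
    int saved_errno;

    /* Reject paths that would not fit into sun_path (including NUL). */
    if (strlen(path) >= sizeof(addr.sun_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_UNIX;
    strcpy(addr.sun_path, path); /* safe: length checked above */

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        /* This is the step missing in the leaking code path. */
        saved_errno = errno;
        close(fd);
        errno = saved_errno; /* preserve the connect() error for the caller */
        return -1;
    }

    return fd;
}
```

With a helper like this, a retry timer can call it once per second against a missing ccnet.sock and the process's descriptor count stays flat, because each failed attempt releases its socket before returning.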

Can someone look into this, please?

Can you at least acknowledge this issue, even if a fix cannot be provided promptly?

@daniel.pan

We will check the problem.

This issue no longer exists in version 7.1, as the controller no longer connects to ccnet-server.