Since we want to upgrade from Seafile 6.2 to 6.3, we need to finally switch from fastcgi to wsgi, because fastcgi is not supported anymore by the underlying Django framework.
As a university user, we are currently depending on Shibboleth for authentication and I’m not happy about the current solution of transferring the trusted information of Shibboleth SP to seahub using an untrusted path (HTTP headers).
All Shibboleth resources (even the one cited in the other thread) state that you should use environment variables as a secure channel where possible. And with mod_wsgi that would be possible.
Please consider supporting mod_wsgi as a secure alternative to gunicorn and mod_proxy.
At first glance it doesn’t seem too complicated to use that instead. And it could even be faster.
The biggest problem here is that we need to reproduce, for mod_wsgi, the same environment that seahub.sh builds.
For me, using HTTP headers and relying on anti-spoofing measures is like taking a walk on the German Autobahn with a safety car behind you. Instead you could simply use a pedestrian path next to the Autobahn (environment variables).
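To make that a bit more concrete, here is a rough sketch of a WSGI entry script that sets up the environment before loading seahub. Everything specific in it is an assumption on my side and needs to be checked against the actual seahub.sh of your version: the variable names (`CCNET_CONF_DIR`, `SEAFILE_CONF_DIR`, `SEAFILE_CENTRAL_CONF_DIR`), the directory layout below `topdir`, and the settings module name `seahub.settings`. The helper names are made up for illustration.

```python
import os


def build_seahub_env(topdir):
    """Return the variables seahub.sh is assumed to export (verify against the script)."""
    return {
        # Assumed layout: <topdir>/ccnet, <topdir>/seafile-data, <topdir>/conf
        "CCNET_CONF_DIR": os.path.join(topdir, "ccnet"),
        "SEAFILE_CONF_DIR": os.path.join(topdir, "seafile-data"),
        "SEAFILE_CENTRAL_CONF_DIR": os.path.join(topdir, "conf"),
        # Assumed Django settings module of seahub
        "DJANGO_SETTINGS_MODULE": "seahub.settings",
    }


def apply_env(env, environ=os.environ):
    """Copy the variables into the process environment without overriding existing ones."""
    for key, value in env.items():
        environ.setdefault(key, value)


def make_application(topdir):
    """Build the WSGI callable; mod_wsgi looks for a module-level name 'application'."""
    apply_env(build_seahub_env(topdir))
    # Imported late so that the environment is in place before Django loads settings.
    from django.core.wsgi import get_wsgi_application
    return get_wsgi_application()


# In the real WSGI script this would be something like (path is hypothetical):
# application = make_application("/opt/seafile")
```

Note that the `PYTHONPATH` entries seahub.sh sets cannot simply be exported from inside the running interpreter. They would have to go into `sys.path`, or into the `python-path` option of `WSGIDaemonProcess`.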
I really thought about it and I can’t share your concerns about the “untrusted path” - from reading your Ansible playbooks, you also run Apache on the same node as Seafile, so there isn’t even a network in between. Where do you expect tampering to happen there?
The argument raised by the Shibboleth documentation is about header spoofing by end user clients with malicious intent, and the anti-spoofing feature in the SP seems to have been enabled by default since an ancient version, so that threat is mitigated.
Or have I misunderstood your argument?
From a general view, I’m quite happy that the industry standard has moved to language-specific application servers and (proxied) HTTP from a general purpose HTTP server. This simplifies deployments across languages, since you don’t need language-specific server modules with all their caveats.
But of course, from a technical point of view, everything is possible and everyone should do as they please.
HTTP headers are an untrusted path / channel, because end user clients generally have access to them.
There are mitigations in place, but we don’t know whether there is a way to circumvent them.
That’s why the Shibboleth documentation states in several places [1, 2, 3] that you should stick to the secure channel “environment variables” if possible. And with mod_wsgi (which has a daemon mode, too) it is possible.
In general, and this is noted below, you should always favor environment variables to request headers if the server platform supports that option. Environment variables cannot be influenced by the client and are much safer.
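To illustrate the difference on the application side, here is a minimal sketch in plain WSGI terms. In WSGI, request headers sent by the client show up in `environ` under keys prefixed with `HTTP_`, while variables set by the web server itself (such as `REMOTE_USER`) have no such prefix. The function names are made up for illustration, and the mapping of the Shibboleth user to `REMOTE_USER` is an assumption about the SP configuration.

```python
from wsgiref.util import setup_testing_defaults


def server_user(environ):
    """Read the user from the server-set variable; request headers never land here."""
    return environ.get("REMOTE_USER") or None


def header_user(environ):
    """Read the user from a request header; a 'Remote-User' header ends up in this key."""
    return environ.get("HTTP_REMOTE_USER") or None


def demo():
    """Simulate a request where a client sends a forged 'Remote-User' header."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a minimal, valid WSGI environ
    environ["HTTP_REMOTE_USER"] = "attacker@example.org"
    return server_user(environ), header_user(environ)
```

With the header-based lookup, the forged value only stays out if the anti-spoofing check in front of the application catches it. With the environment-based lookup, it is never consulted in the first place.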
Please take the time to read the whole background section. Then the citation in the other thread sounds more like “It is better than it was in the past, so you no longer have a reason to disable the feature if you need to use HTTP headers”.
As to the caveats and “industry standard” (is there any?):
Using Shibboleth is clearly a corner case here. AFAIK most (all?) other authentication systems don’t rely on webserver-specific modules and because of that don’t rely that much on having a secure communication channel between the webserver and the application server (secure means cannot be tampered with from the outside). So in other cases proxying at least doesn’t have security drawbacks.
So far I haven’t run into any caveats of language-specific server modules. All solutions have advantages and disadvantages. For me, using mod_wsgi currently seems to have more advantages.
I feel very concerned about this issue. Thank you @egroeper for reporting it.
I’m not skilled enough to argue in this thread, but I notified some colleagues about it. @daniel.pan, @xiez and @Jonathan, please consider it a major issue for universities using Shibboleth as a common authentication protocol.
“Under no circumstances should you rely on the request header option other than as a temporary measure while adjusting applications to use the environment option. There are no known scenarios in which environment variables can’t be used, including with Java containers, though sometimes extra effort or Apache settings may be needed. Do NOT take shortcuts with this. Do the work and use them.”
@daniel.pan, can you consider implementing a kind of start-with_mod_wsgi option for the seahub.sh script?
When I tried to debug the HTTP exchanges on the browser side, I did not see anything significant.
Does the risk reside only between apache2 and gunicorn (running on the same host), as @schlarbm suggested?
Well, who is able to understand?