On Apr 25, 2021, at 7:06 PM, tlhackque via Mailman-users <mailman-users@mailman3.org> wrote:
On 25-Apr-21 20:34, Mark Sapiro wrote:
On 4/25/21 4:37 PM, tlhackque via Mailman-users wrote:
The described timeouts are something that hyperkitty ought to be able to avoid. For apache, the timeout is idle time between blocks of output. Hyperkitty can avoid this by generating the archive in segments (based on size or elapsed time), flushing its output buffer, generating a multi-file archive, and/or using Transfer-Encoding: chunked (chunked doesn't work for HTTP/2). It ought to be able to break the work into blocks of "n" messages and emit some output for each. Besides avoiding timeouts, working in segments allows the GUI to display meaningful progress (e.g. if you're loading with XMLHttpRequest, "onprogress"). It really oughtn't be up to the user to break up the request.

It is not the web server that times out. I'm not sure about uwsgi because I don't use it, but the timeouts I see are on servers that use gunicorn as the WSGI interface to Django, and the timeout is in a gunicorn worker. This is controlled by the timeout setting in the gunicorn config: <https://docs.gunicorn.org/en/stable/settings.html#timeout>
Note that even 300 seconds is not enough to download the entire <https://mail.python.org/archives/list/python-dev@python.org/> archive.
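For reference, that setting lives in the (Python) config file gunicorn loads, e.g. gunicorn.conf.py; a minimal sketch, with placeholder values rather than recommendations:

    # gunicorn.conf.py -- sketch only; values are placeholders
    bind = "127.0.0.1:8000"
    workers = 4
    # Seconds a worker may be silent before it is killed and restarted;
    # 0 disables the timeout entirely.
    timeout = 300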
It may be possible to get HyperKitty to chunk the output to avoid this, but it doesn't currently do that. Care to submit an MR?
I'm afraid (u)WSGI, Django, and gunicorn are not technologies that I work with.
It sounds as if hyperkitty is compiling the entire archive before sending the first byte.
The gunicorn doc that you pointed to says
Workers silent for more than this many seconds are killed and restarted. Setting it to 0 has the effect of infinite timeouts by disabling timeouts for all workers entirely.
"Silent" sounds like the standard webserver "you have to push some bits, or we assume you're stuck".
My understanding is that gunicorn is a persistent Python application server that runs behind a webserver proxy. So the (proxy) webserver (apache, nginx, ...) timeouts also apply and would need to be increased.
Might be interesting to try 0 (gunicorn) / 1200 (webserver) with your python-dev archive, time it, and see how much (encoded) data is transferred... (I would expect most mailing list archives to compress nicely, though those with binary attachments won't.)
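For example, a rough client-side way to time the download and count the bytes as sent on the wire (this assumes the Python requests library; the export URL below is only a placeholder):

    # time_export.py -- sketch only; the URL is a placeholder for the real export link
    import time
    import requests

    url = "https://mail.python.org/archives/list/python-dev@python.org/..."  # placeholder
    start = time.monotonic()
    total = 0
    with requests.get(url, stream=True) as resp:
        resp.raise_for_status()
        # decode_content=False counts the bytes as actually transferred (still compressed)
        for chunk in resp.raw.stream(64 * 1024, decode_content=False):
            total += len(chunk)
    print(f"{total} bytes transferred in {time.monotonic() - start:.1f}s")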
For uwsgi, I think the parameter is called harakiri (I don't know why such a name, though).
if request takes longer than specified harakiri time (in seconds), the request will be dropped and the corresponding worker recycled.
This should be set to a long enough value that allows downloading the archive.
If you are using http socket, then you want http-timeout.
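For example, something like this in the uWSGI ini config (sketch only; the values are placeholders, not recommendations):

    [uwsgi]
    ; kill and recycle a worker whose request runs longer than this many seconds
    harakiri = 1200
    ; applies when uWSGI itself speaks HTTP (http / http-socket)
    http-timeout = 1200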
Also, to set the timeout in the webserver (nginx):
    location / {
        uwsgi_read_timeout 120s;
        uwsgi_send_timeout 120s;
        uwsgi_pass 0.0.0.0:8000;
        include uwsgi_params;
    }
Or some other value that you want.
But fiddling with timeouts is treating the symptom, not the root cause. The right solution is to stream, segment, or chunk the output, because in the general case no timeout is long enough. It'll always be possible to find an archive that's just one byte (or second) longer than any chosen timeout. (See the halting problem.) You want the timeout to catch a lack of progress, not a total time that's a function of transaction size. (Webservers may also have limits on transaction size - e.g. mod_security - but those are only useful when the upper bound on a response is knowable.) Thus, the timeout(s) should be roughly independent of the archive size: on the order of time-to-first-byte (which ordinarily is longer than the time between segments/chunks).
Also note that streaming requires fewer server resources than compiling a complete archive before sending, since you don't need to create the entire archive in memory (or in a tempfile). You only need enough memory to buffer the file I/O efficiently and to hold the compression tables/output buffer. Except for trivial cases, this is independent of the archive size. The only downside is that if the comm link is slow, you may hold a reader lock on the source data for longer than you would with the current scheme.
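For illustration, a rough sketch (not HyperKitty's actual code; the message iterator is hypothetical) of a Django view that streams gzip-compressed output as it is produced:

    # Sketch only -- not HyperKitty's implementation; names are illustrative.
    import gzip
    import io

    from django.http import StreamingHttpResponse

    def gzip_chunks(byte_blocks, flush_every=64 * 1024):
        """Compress an iterable of byte blocks on the fly, yielding gzip data
        in pieces so the response can start before the whole archive exists."""
        buf = io.BytesIO()
        with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
            for block in byte_blocks:
                gz.write(block)
                if buf.tell() >= flush_every:
                    yield buf.getvalue()
                    buf.seek(0)
                    buf.truncate()
        # Closing the GzipFile writes the trailer into buf.
        yield buf.getvalue()

    def export_view(request):
        # iter_messages_as_mbox_bytes() is hypothetical: whatever yields each
        # message's mbox representation straight from the database.
        body = gzip_chunks(iter_messages_as_mbox_bytes())
        response = StreamingHttpResponse(body, content_type="application/gzip")
        response["Content-Disposition"] = 'attachment; filename="archive.mbox.gz"'
        return response

Each yielded chunk goes out as soon as it is produced, so memory use stays bounded by the flush size regardless of archive size.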
Seems as though this deserves an issue. I guess I could open one - but you have the experience/test cases.
Hyperkitty doesn't actually create an archive in memory or in a temp file. It uses a streaming response with on-the-fly compression to read from the database and relay the data to the client for download.
https://gitlab.com/mailman/hyperkitty/-/blob/master/hyperkitty/views/mlist.p...
The problem could be that uwsgi seems to kill an ongoing download, not an idle worker, for some reason. And it seems that this is known and intentional behavior. I don't see a good way to disable it completely, but perhaps it can be set to a value long enough that it essentially never kills a running worker that is still moving bits.
--
thanks,
Abhilash Raj (maxking)