[MM3-users] Re: Postorius fails to show list because of a large message being held

July 14, 2020


On 7/14/20 3:54 AM, Brian Carpenter wrote:
...
On 7/14/20 6:49 AM, Gilles Filippini wrote:
...
Mailman Core logs:
[2020-07-14 10:41:21 +0000] [36] [CRITICAL] WORKER TIMEOUT (pid:49)
[2020-07-14 10:41:21 +0000] [49] [INFO] Worker exiting (pid: 49)
[2020-07-14 10:41:22 +0000] [50] [INFO] Booting worker with pid: 50
I am pretty sure that timeout is related to gunicorn. If it is, then
increasing the worker timeout setting for gunicorn should work. You
tend to get such errors when doing an export of a large membership
roster. This is the first time I have seen it related to a held message.
It seems weird that a gunicorn worker would run for more than the
default 30 seconds because of a 23MB message, but I think Brian is correct.
...
I have this in my setup:
mailman.cfg:
[webservice]
configuration: /opt/mailman/mm/gunicorn.cfg
gunicorn.cfg:
[gunicorn]
workers = 4
timeout = 900
You will need to restart gunicorn after adjusting the timeout setting.
It's confusing, but after adjusting that setting, you need to restart
Mailman core, not gunicorn.
If you are running a gunicorn service, that is what is providing WSGI
support for Django (Postorius and/or HyperKitty). You may be using
gunicorn for this or uWSGI or Apache mod_wsgi, but in any case, that is
not the gunicorn we are considering here.
Regardless of what you use to provide WSGI support for Django, Mailman
core uses gunicorn to support the REST API, and that is the gunicorn
affected by the settings in the gunicorn.cfg pointed to by
[webservice]
configuration: /opt/mailman/mm/gunicorn.cfg
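As a rough illustration of how those two files relate, here is a minimal
Python sketch that follows the [webservice] configuration: setting in
mailman.cfg to the gunicorn.cfg it names and reports the timeout found
there. The function name gunicorn_timeout is made up for this example, and
it uses the standard library's configparser rather than Mailman's own
configuration loader, so treat it as an approximation of what core does,
not as its actual code.

```python
import configparser


def gunicorn_timeout(mailman_cfg, default=30):
    """Return the worker timeout from the gunicorn.cfg named in mailman.cfg.

    Hypothetical helper: falls back to `default` (gunicorn's default of
    30 seconds) when no file or no timeout entry is found.
    """
    core = configparser.ConfigParser(interpolation=None, strict=False)
    core.read(mailman_cfg)  # missing files are silently skipped
    # [webservice] configuration: names the file holding the [gunicorn] section.
    gunicorn_cfg = core.get("webservice", "configuration", fallback=None)
    if gunicorn_cfg is None:
        return default
    extra = configparser.ConfigParser(interpolation=None, strict=False)
    extra.read(gunicorn_cfg)
    return extra.getint("gunicorn", "timeout", fallback=default)
```

With a mailman.cfg and gunicorn.cfg like the ones quoted above, this
should return 900. If there is no timeout entry, it returns 30.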
If, with the above configuration, you do ps -fwwA|grep runner=rest you
will see something like:
mailman  20582 20561  0 01:21 ?        00:00:16
/opt/mailman/mm/venv/bin/python /opt/mailman/mm/venv/bin/runner -C
/opt/mailman/mm/deployment/mailman.cfg --runner=rest:0:1
mailman  20722 20582  0 01:21 ?        00:01:19
/opt/mailman/mm/venv/bin/python /opt/mailman/mm/venv/bin/runner -C
/opt/mailman/mm/deployment/mailman.cfg --runner=rest:0:1
mailman  20725 20582  0 01:21 ?        00:01:18
/opt/mailman/mm/venv/bin/python /opt/mailman/mm/venv/bin/runner -C
/opt/mailman/mm/deployment/mailman.cfg --runner=rest:0:1
mailman  20726 20582  0 01:21 ?        00:01:18
/opt/mailman/mm/venv/bin/python /opt/mailman/mm/venv/bin/runner -C
/opt/mailman/mm/deployment/mailman.cfg --runner=rest:0:1
mailman  20727 20582  0 01:21 ?        00:01:50
/opt/mailman/mm/venv/bin/python /opt/mailman/mm/venv/bin/runner -C
/opt/mailman/mm/deployment/mailman.cfg --runner=rest:0:1
The first of these, pid 20582, is Mailman's actual REST runner. The
other 4 with parent pid 20582 are the 4 gunicorn worker processes forked
from the REST runner.
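To make the parent/child relationship concrete, here is a small Python
sketch that takes ps -f style lines like the ones above (one process per
line, unwrapped) and separates the REST runner from the gunicorn workers
it forked. The function name split_rest_processes is hypothetical, and the
sketch assumes the usual ps -f column order of UID, PID, PPID.

```python
def split_rest_processes(ps_lines):
    """Split `ps -f` lines for --runner=rest into (runner_pids, worker_pids).

    A process whose parent is itself a rest process is a gunicorn worker;
    a rest process whose parent is not in the list is the REST runner.
    """
    procs = {}
    for line in ps_lines:
        fields = line.split()
        # Skip headers, the grep process itself, and unrelated lines.
        if len(fields) < 8 or "--runner=rest" not in line:
            continue
        procs[int(fields[1])] = int(fields[2])  # pid -> parent pid
    runners = sorted(pid for pid, ppid in procs.items() if ppid not in procs)
    workers = sorted(pid for pid, ppid in procs.items() if ppid in procs)
    return runners, workers
```

Fed the five processes listed above, it should report 20582 as the runner
and the other four pids as workers.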
It's the REST runner or Mailman core that needs to be restarted to pick
up that change.
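For anyone scripting this, a minimal Python sketch of the restart might
look like the following. It assumes the mailman command is on PATH, that
it is run as the Mailman user, and that your mailman.cfg is at the path
you pass in. The helper names are invented for the example. If your
installation manages core through an init system, restarting that service
is the equivalent step.

```python
import subprocess


def restart_argv(mailman_cfg):
    """Build the command line that restarts Mailman core's runners."""
    # `mailman -C <cfg> restart` asks the master to stop and restart its
    # runner subprocesses, which includes the REST runner.
    return ["mailman", "-C", mailman_cfg, "restart"]


def restart_core(mailman_cfg):
    """Run the restart; raises CalledProcessError if the command fails."""
    subprocess.run(restart_argv(mailman_cfg), check=True)
```

For the setup shown above, that would be
restart_core("/opt/mailman/mm/deployment/mailman.cfg").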
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan