On 6/6/23 07:58, Nelson Strother wrote:
No errors are being recorded in the mailman log files. This is GNU Mailman 3.3.8 via pip install mailman on Debian 5.10.179-1 (2023-05-12) running on a shared system where VMware gives this server enough cycles that mailman start and mailman stop each consume from 20 minutes to an hour of wall clock time, so I do not issue those commands recreationally, attempting to keep the system available for users. What should I do to help understand the cause for these failures?
If a runner has died, its death and a reason should be logged in Mailman's var/logs/mailman.log with a message similar to
Master detected subprocess exit (pid: 8617, why: SIGNAL 15, class: in, slice: 1/1)
This may not be the case if the runner is killed by the OS for an out of memory or similar reason. For this, look in syslog.
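If it helps, here is a rough Python sketch for pulling those exit messages out of the log. It assumes the log lines contain text in the same form as the example above; the function name find_runner_exits and the dictionary keys are made up for illustration and are not part of Mailman.

```python
import re

# Matches the "Master detected subprocess exit" message in the form shown
# above. The exact log format is assumed from that single example.
EXIT_RE = re.compile(
    r"Master detected subprocess exit\s*"
    r"\(pid: (?P<pid>\d+), why: (?P<why>[^,]+), "
    r"class: (?P<runner>[^,]+), slice: (?P<slice>\d+)/(?P<count>\d+)\)"
)


def find_runner_exits(lines):
    """Return one dict per runner exit message found in the given lines."""
    exits = []
    for line in lines:
        match = EXIT_RE.search(line)
        if match:
            exits.append({
                "pid": int(match["pid"]),
                "why": match["why"].strip(),
                "runner": match["runner"].strip(),
                "slice": f'{match["slice"]}/{match["count"]}',
            })
    return exits


if __name__ == "__main__":
    import sys

    # Usage: python find_exits.py /path/to/var/logs/mailman.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for info in find_runner_exits(log):
            print(info)
```

If this prints nothing for a runner that is nevertheless missing, that would point toward the runner having been killed without the master logging it, which is the case where syslog is the place to look.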
mailman stop can take a long time because it is waiting for a runner to stop. See https://gitlab.com/mailman/mailman/-/issues/255 but that issue was fixed long ago. I don't understand why mailman start would take more time than mailman restart. In fact, mailman restart effectively does stop and start, but only for those runners which are running.
Since you seem to frequently have missing runners, I suspect something like an OOM condition is causing the OS to kill them. Although, I wonder if you are correctly interpreting the logs. While the absence of the retry, task, nntp and archive runners might not be noticed except for messages not being archived, if either the in or pipeline runner is not running, no list posts will be processed.
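As a rough way to check this, the Python sketch below sorts missing runners into the two groups just described. It covers only the six runners named above, not every runner Mailman starts. It also assumes that each runner process shows an argument of the form --runner=NAME:SLICE:COUNT in ps output; treat that pattern as an assumption and adjust it to what your system actually shows. The function names are made up for illustration.

```python
import re

# Runner names taken from the paragraph above; this is not the full list
# of runners that Mailman starts.
BLOCKS_POSTS = ("in", "pipeline")
LESS_VISIBLE = ("retry", "task", "nntp", "archive")

# Assumed form of the runner argument on the process command line.
RUNNER_ARG_RE = re.compile(r"--runner=([a-z]+):\d+:\d+")


def running_runners(ps_lines):
    """Collect runner names found in lines of ps output."""
    return {
        match.group(1)
        for line in ps_lines
        for match in RUNNER_ARG_RE.finditer(line)
    }


def classify_missing(running):
    """Split the missing runners by how noticeable their absence is."""
    return {
        "blocks_posts": [name for name in BLOCKS_POSTS if name not in running],
        "less_visible": [name for name in LESS_VISIBLE if name not in running],
    }


if __name__ == "__main__":
    import subprocess

    # "ps -eo args" prints the full command line of every process.
    output = subprocess.run(
        ["ps", "-eo", "args"], capture_output=True, text=True, check=True
    ).stdout
    print(classify_missing(running_runners(output.splitlines())))
```

Anything listed under blocks_posts would mean that list posts are not being processed at all.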
Would not it be helpful for this limitation of restart to be included in: mailman restart --help with a suggestion to use mailman stop and mailman start instead?
I have just filed https://gitlab.com/mailman/mailman/-/issues/1082 for this.
--
Mark Sapiro <mark@msapiro.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan