Mailman stops processing: where to start looking for root cause?
Today was the second time Mailman stopped sending out mail: it receives mail but sends nothing out. To resolve the issue I restarted the Mailman services, and then all the incoming/queued mail was sent out. The last time this happened was 21 days ago. Before that, it had been rock solid for years.
We are running in a docker environment with no recent changes.
On the host in /var/log/maillog, I see the mail coming in and going to mailman in docker.
for docker, the mailman logs are in /opt/mailman/core/var/logs
/opt/mailman/core/var/logs/smtp.log: I see the messages coming in
/opt/mailman/core/var/logs/debug.log: is empty
/opt/mailman/core/var/logs/mailman.log: I do not see anything that looks like it had an issue
So where can I look to try to determine a cause? If it happens again, what and where can I look to troubleshoot before restarting the service?
Thanks
On 3/29/24 12:34, bob B via Mailman-users wrote:
Today was the 2nd time mailman stopped sending out mail, it receives mail but sends nothing out. To resolve the issue I restarted the mailman services and then all the incoming/queued mail was sent out... last time this happened was 21 days ago.... Before that, it has been rock solid for years. ... So where can I look to try to determine a cause? If it happens again, what and where can I look to troubleshoot before restarting the service?
One of the runners has died, most likely the out runner, but possibly another. The missing messages are stuck in a queue. Do
ls -laR /opt/mailman/core/var/queue
to see in which queue the messages are. The runner that processes that queue is gone or possibly stuck.
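To make a growing queue easier to spot than in a long recursive listing, a small helper along these lines could print a file count per queue directory. This is only a sketch: `queue_counts` is a made-up name, and it assumes the queue root from this thread with one subdirectory per queue.

```shell
# Print "<count> <queue-name>" for each subdirectory of a Mailman queue root.
# Defaults to the path used in this thread; pass a different root as $1.
queue_counts() {
    root="${1:-/opt/mailman/core/var/queue}"
    for dir in "$root"/*/; do
        [ -d "$dir" ] || continue
        # Count regular files only (the queued entries), not directories.
        n=$(find "$dir" -type f | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$(basename "$dir")"
    done
}

# Example: largest queues first
#   queue_counts /opt/mailman/core/var/queue | sort -rn
```

A queue whose count keeps climbing between two runs is the one whose runner deserves a closer look.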
Do
ps -fwwu mailman|grep runner=
when things are working to get a list of what runners should be there. Do it again when things aren't working to see what's missing.
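The before/after comparison can be scripted so that nothing is missed by eye. The sketch below assumes the runner processes carry a `runner=<name>` argument on their command line (which is what the grep above relies on) and that the two `ps` listings were saved to files; the function names and file paths are placeholders.

```shell
# Extract the runner=... arguments from a saved `ps` listing, sorted and unique.
runner_names() {
    grep -o 'runner=[A-Za-z0-9_:]*' "$1" | sort -u
}

# Print runners found in the baseline listing ($1) but not in the current one ($2).
missing_runners() {
    base=$(mktemp)
    cur=$(mktemp)
    runner_names "$1" > "$base"
    runner_names "$2" > "$cur"
    # comm -23 keeps lines that appear only in the first (baseline) file.
    comm -23 "$base" "$cur"
    rm -f "$base" "$cur"
}

# Example (run inside the container, or wherever the runners are visible):
#   ps -fwwu mailman > /tmp/runners.good   # while mail is flowing
#   ps -fwwu mailman > /tmp/runners.now    # when delivery has stopped
#   missing_runners /tmp/runners.good /tmp/runners.now
```

An empty result with mail still stuck would point toward a runner that is alive but hung rather than dead.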
There also should be something in mailman.log about why the runner died, but if it was killed for some external reason, there may not be anything. Look in syslog for something like an out of memory (OOM) or other issue that caused the runner to die.
--
Mark Sapiro <mark@msapiro.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
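For the OOM check, a filter like the following could be pointed at whichever kernel log source the host has. It is a sketch, not an exhaustive matcher: `oom_lines` is a made-up name, the patterns cover the usual Linux OOM-killer wording, and which of the example sources exists depends on the distribution.

```shell
# Print lines from stdin that look like Linux OOM-killer activity.
oom_lines() {
    grep -iE 'out of memory|oom-kill|oom_reaper'
}

# Examples, run on the Docker host (use whichever source is available):
#   dmesg -T | oom_lines
#   journalctl -k --since "2 days ago" | oom_lines
#   oom_lines < /var/log/messages
```

The kernel log lives on the host, not inside the container, so an empty /var/log in the container does not rule out an OOM kill.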
Mark Sapiro writes:
One of the runners has died, most likely the out runner, but possibly another. The missing messages are stuck in a queue. Do
ls -laR /opt/mailman/core/var/queue
to see in which queue the messages are. The runner that processes that queue is gone or possibly stuck.
I have seen "stuck", in a multisite installation where Mailman's smarthost ended up being about 1000 miles from Mailman. I'm not sure what the issue was (we fixed the DNS, not Mailman :-), but every once in a while the connection would hang. In that case nothing showed up in the log. The only symptoms were the buildup in the out queue and a stale connection to port 25 on the smarthost. If your smarthost is not "network far" from the Mailman host, this is quite unlikely in my experience, although Mark may know better.
In any case, I think the best way to start diagnosis is to check the Mailman and MTA queues for stuck messages. Of course in normal operation you will often see messages in the queue, but long queues or messages older than a minute are 99% a dead or stuck queue runner.
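The "older than a minute" test can be approximated with find. This is a rough sketch: `stale_queue_files` is a made-up name, `-mmin` is an extension found in GNU, BSD and BusyBox find rather than POSIX, and its whole-minute rounding makes the threshold approximate. Old files in a holding queue such as shunt are expected and do not by themselves indicate a dead runner.

```shell
# List queue files last modified more than $2 minutes ago (default: 1).
# $1 is the queue root; defaults to the path used in this thread.
stale_queue_files() {
    root="${1:-/opt/mailman/core/var/queue}"
    minutes="${2:-1}"
    find "$root" -type f -mmin +"$minutes"
}

# Examples:
#   stale_queue_files                                  # Mailman queues, older than about a minute
#   stale_queue_files /opt/mailman/core/var/queue 5    # older than about five minutes
#   mailq                                              # MTA queue on the host, for comparison
```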
Steve
Thanks, any idea of where the syslog is in the Maxking Docker version?
docker exec -it -u mailman mailman-core /bin/bash
bash-5.0$ cd /var/log
bash-5.0$ ls
bash-5.0$
The /opt/mailman/core/var/logs/mailman.log does not have anything that jumps out.
On 2024-04-02 16:26:59 -0000 (-0000), bob B via Mailman-users wrote:
Thanks, any idea of where the syslog is in the Maxking Docker version?
docker exec -it -u mailman mailman-core /bin/bash
bash-5.0$ cd /var/log
bash-5.0$ ls
bash-5.0$
The opt/mailman/core/var/logs/mailman.log does not have anything that jumps out.
It looks like logging is going to container stdout, though if you're running it through the compose files you could add a logging section like:
    logging:
      driver: syslog
      options:
        tag: "mailman-core"
...for each container (adjusting the tag for each container so you can separate the log entries). You should also be able to view the log buffer for each container by using the docker-compose logs subcommand.
Jeremy Stanley
participants (4)
- bob B
- Jeremy Stanley
- Mark Sapiro
- Stephen J. Turnbull