[MM3-users] Re: Held messages not delivered after approval

Aug. 4, 2025


      Krinetzki, Stephan writes:
...
/opt/mailman/var/queue/bad:
-rw-rw----  1 mailman mailman  221723 Aug  1 17:26 1754061973.4885209+46b1ae3716439bf3ef98090296dfce0320fc3017.psv
This one might be spam, but it's weird that it managed to get pickled
but can't be read.
...
-rw-rw----  1 mailman mailman   32912 Aug  2 00:00 1754085602.191851+3576cf33232db110fa7761233f67245564553652.psv
-rw-rw----  1 mailman mailman     416 Aug  2 00:00 1754085604.0204346+ad485da0c45cb0ad17a5dc42613c3eb3f313c20e.psv
-rw-rw----  1 mailman mailman 1407649 Aug  2 00:00 1754085623.275817+f23139c8127c454b4fe65453af3db18e558b0e87.psv
-rw-rw----  1 mailman mailman 1407634 Aug  2 00:02 1754085729.3529432+1643f907bac39a22a7d71e50b031c4f8a574082c.psv
I have no clue about these four (see below for comments on cron).
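
One way to cross-check these entries: the part of each file name before the "+" looks like a Unix timestamp, so it can be decoded independently of the file's mtime. A minimal Python sketch, assuming the name has the form <epoch seconds>+<hex digest>.psv (inferred from the listing above, not taken from documentation). queued_at is a hypothetical helper name, and the +0200 offset is taken from the mailman.log lines quoted in this thread.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the site's local offset is +0200, as in the quoted mailman.log.
LOCAL = timezone(timedelta(hours=2))

def queued_at(filename, tz=LOCAL):
    """Decode the leading epoch-seconds field of a queue file name."""
    stamp = filename.split("+", 1)[0]
    return datetime.fromtimestamp(float(stamp), tz=tz)

names = [
    "1754061973.4885209+46b1ae3716439bf3ef98090296dfce0320fc3017.psv",
    "1754085602.191851+3576cf33232db110fa7761233f67245564553652.psv",
]
for name in names:
    print(queued_at(name).strftime("%Y-%m-%d %H:%M:%S %z"))
# 2025-08-01 17:26:13 +0200
# 2025-08-02 00:00:02 +0200
```

Both decoded times agree with the ls timestamps for the same files, which suggests the entries were created at those moments rather than merely touched later.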
...
/opt/mailman/var/queue/out:
Looks normal for your configuration.
...
/opt/mailman/var/queue/shunt:
I don't understand why on August 1st you see shunts at intervals
throughout the working day, then suddenly on the 2nd they all happen
at midnight.
Have you tried "mailman unshunt"?  If not what happens when you do?
If the shunts are happening because of the restart, then they should
go through on unshunt.  If they don't, there's some other problem.
You can also try renaming the .psvs to .pck, and check the metadata in
the pickle for which queue to move it to.  That's more risky, and you
shouldn't try it if the output of "mailman qfile" isn't as expected.
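
A rough Python sketch of the metadata check, under two assumptions that should be verified against the "mailman qfile" output first: that a queue file is two pickles back to back (the message, then a metadata dict), and that a shunted entry records its original queue under a "whichq" key. read_metadata and target_queue are hypothetical helper names. Run it inside Mailman's venv so the message class can be unpickled, and only on files from your own queue directories, because unpickling can execute code.

```python
import pickle

def read_metadata(path):
    """Return the metadata dict stored after the message in a queue file."""
    with open(path, "rb") as fp:
        pickle.load(fp)          # first pickle: the message object (discarded)
        return pickle.load(fp)   # second pickle: the metadata dict (assumed layout)

def target_queue(path, default="in"):
    """Queue the entry came from, per the assumed 'whichq' key."""
    return read_metadata(path).get("whichq", default)
```

If either load raises an exception, the file is damaged or the layout assumption is wrong, and the entry is better left where it is.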
...
I don't see a problem here. But the timestamp seems to be related
to the restart of mailman. Can I skip this in the logrotate?
As I mentioned before, there was (and may still be) a bug in Mailman's
logging such that Mailman fails to reopen the logs, and typically
after a couple of days you end up with a nameless open file collecting
the logs and uselessly consuming more and more disk space.  The
restart is intended to work around this problem.
...
Btw: The crontab is the following:
@daily mailman cd /opt/mailman; source /opt/mailman/mailman-venv/bin/activate; /opt/mailman/mailman-venv/bin/mailman digests --send > /dev/null 2>&1
The django-admin commands aren't directly related, I'm going to ignore
them for now.  The only thing I know for *sure* runs at midnight daily
is "mailman digests --send".  On my Debian Linode, the default (which
I left alone) is for logrotate's cron job to live in /etc/cron.daily,
which is run at 06:25 daily using "run-parts".  (This is quite a
common setup on Linux.)  So we need to know where the logrotate job is
specified (crontab, cron.d, or cron.daily) and at what time (@daily =
midnight) to be sure that the mailman restart is related to the bad
and shunt queue files.
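
To find where the logrotate job is specified, the three locations mentioned above can be searched mechanically. A minimal Python sketch; find_cron_mentions is a hypothetical helper, and the paths are the usual Debian ones, which may differ on other systems. Per-user crontabs and systemd timers are not covered, so an empty result does not prove that logrotate is unscheduled.

```python
from pathlib import Path

# Locations named above; adjust for your distribution.
CRON_LOCATIONS = ["/etc/crontab", "/etc/cron.d", "/etc/cron.daily"]

def _files_in(loc):
    """Files under a cron location (a single file or a directory)."""
    try:
        if loc.is_dir():
            return sorted(p for p in loc.iterdir() if p.is_file())
        if loc.is_file():
            return [loc]
    except OSError:
        pass  # unreadable without root; treat as empty
    return []

def find_cron_mentions(word, locations=CRON_LOCATIONS):
    """Return (file, line number, line) for uncommented lines containing word."""
    hits = []
    for loc in map(Path, locations):
        for path in _files_in(loc):
            try:
                text = path.read_text(errors="replace")
            except OSError:
                continue
            for n, line in enumerate(text.splitlines(), 1):
                if word in line and not line.lstrip().startswith("#"):
                    hits.append((str(path), n, line.strip()))
    return hits
```

Calling find_cron_mentions("logrotate") and find_cron_mentions("mailman") as root lists the candidate jobs. The schedule then comes from the crontab fields, or from the run-parts line in /etc/crontab for anything under cron.daily.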
...
So I checked the mailman.log:
[2025-08-01 00:00:02 +0200] [324558] [INFO] Shutting down: Master
[2025-08-01 00:00:23 +0200] [567059] [INFO] Shutting down: Master
[2025-08-01 00:00:42 +0200] [567206] [INFO] Shutting down: Master
[2025-08-01 00:01:01 +0200] [567278] [INFO] Shutting down: Master
[2025-08-01 00:01:34 +0200] [567379] [INFO] Shutting down: Master
[2025-08-01 00:01:52 +0200] [567516] [INFO] Shutting down: Master
[2025-08-01 00:02:11 +0200] [567646] [INFO] Shutting down: Master
That is not normal.  Your control process is crashing roughly every
20 seconds (seven times in just over two minutes).  I think it
probably is a problem with the digests, not with
the restart.  What appears to be happening is that the digest process
gets triggered, it creates a message and queues it, then fails to send
it so nastily that Mailman restarts (or stops and something like
systemd restarts it).  On restart, Mailman finds the digest message
(probably in the out queue), tries to send it again, crashes again,
and eventually decides that isn't going to work, sends it to bad, and
stops crashing.
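
The spacing of those shutdowns can be measured directly from the log. A small Python sketch, using only the seven lines quoted above; the regular expression assumes the bracketed timestamp format shown there, and shutdown_times and gaps_in_seconds are hypothetical helper names.

```python
import re
from datetime import datetime

LOG = """\
[2025-08-01 00:00:02 +0200] [324558] [INFO] Shutting down: Master
[2025-08-01 00:00:23 +0200] [567059] [INFO] Shutting down: Master
[2025-08-01 00:00:42 +0200] [567206] [INFO] Shutting down: Master
[2025-08-01 00:01:01 +0200] [567278] [INFO] Shutting down: Master
[2025-08-01 00:01:34 +0200] [567379] [INFO] Shutting down: Master
[2025-08-01 00:01:52 +0200] [567516] [INFO] Shutting down: Master
[2025-08-01 00:02:11 +0200] [567646] [INFO] Shutting down: Master
"""

PATTERN = re.compile(
    r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} [+-]\d{4})\] .*Shutting down: Master"
)

def shutdown_times(text):
    """Timestamps of every 'Shutting down: Master' line, in log order."""
    times = []
    for line in text.splitlines():
        m = PATTERN.match(line)
        if m:
            times.append(datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S %z"))
    return times

def gaps_in_seconds(times):
    """Seconds between consecutive shutdowns."""
    return [int((b - a).total_seconds()) for a, b in zip(times, times[1:])]

print(gaps_in_seconds(shutdown_times(LOG)))   # [21, 19, 19, 33, 18, 19]
```

Running the same functions over the full mailman.log for several days would show whether the burst happens every midnight or only on some nights, which helps separate the digest job from the logrotate restart.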
There's normally a lot more chatter at startup and shutdown, for example
about runners being started.  That's probably because you have that
redirected to a separate log file, or maybe that information doesn't
get output with a log level of "warn".  Maybe the crash information is
in the runner.log.
According to the config you posted earlier, you're sending most
channels to separate log files.  Have you checked any of them other
than mailman.log and smtp.log?  Also, note that httpd.log and
error.log are normally used by Mailman core's gunicorn (i.e., the REST
API).  I'm not sure what effect directing Mailman's error channel to
error.log will have, but I suspect you could end up losing logs or
having text from different sources mixed.
I haven't thought about it carefully, but I would have separate logs
for bounces, subscriptions, smtp, and nntp because they are quite
separate.  Everything else would go into mailman.log, because that
makes it easier to trace a single message through the whole process.
Until you know that you don't need it, I would have most channels at
the info level.  The debug level is almost never useful unless you're
a developer trying to fix something (vs a troubleshooter trying to
diagnose the problem).  The logs compress very well (often 70%
reduction), so it's generally a good idea to include the extra
information at info level.  Remember, the real explosion in logging is
that outgoing mail gets logged up to 43k times per incoming post.  Of
course you can do quite a bit better if you can sacrifice the
personalized footers, but most sites don't anymore because there are
strict rules about convenience of unsubscription.
...
Well... I will stop the restart after the log rotate today.
You can do that if you want, but it's likely that you'll end up losing
logs.
...
...
And for every one of those shunted messages there should be an
exception with traceback logged in mailman.log. Those tracebacks
should be helpful.
If there were any. Maybe the "debug" level should be "info". But
for which logs?
Setting the channel to "debug" gives maximum verbosity, and unhandled
exceptions are logged at "warn" or "error" level (maximum severity), so
they are recorded whether the channel is set to "debug" or "info".
...
Maybe the restart at night after the logrotate is the issue.
Not with Mailman bouncing up and down pretty much as fast as it can.
The logrotate restart can only account for one of the shutdowns; the
other 6 were caused by something else.
--
GNU Mailman consultant (installation, migration, customization)
Sirius Open Source    https://www.siriusopensource.com/
Software systems consulting in Europe, North America, and Japan

Stephen J. Turnbull