[MM3-users] parallel processing of out queue messages

Oct. 3, 2023


      Dan Caballero writes:
...
> I have tried increasing the number of runners to alleviate a
> backlog of messages in the "out" queue. I increased the out runners
> from 32 to 64.

How many messages per day are you processing?!  How many CPUs do you
have?  How much memory?
I know a site with 2 in runners and 8 out runners on something
basically the same as an 8-CPU, 16 GB premium Linode.  It processes over
100,000 messages/day.  The out queue occasionally gets into double
digits, but 99% of the time it's back at 0 or 1 within 3 seconds.  Load
average is usually around 5, CPU utilization 50-80%.  I'm not sure it
needs 2 in runners or 8 out runners, but it definitely needs at least
4 out runners.  (We haven't tested a 4-runner configuration since
resolving the "runner stall" issue described below.  Virtual CPUs are
not at a premium at that client.)
...
> Is there another configuration needed beyond the number of runner
> instances to get more messages processed at once?

No.  The slicing algorithm is trivial; the only upper bound on the
number of slices (if you have enough memory ;-) is 2^160.
One thing we ran into on the system above was some (still
unidentified) issue between Mailman and its SMTP out gateway that
caused an out runner to stall for extended periods (sometimes it would
restart, often not, nothing interesting in the log).  Unfortunately I
don't have the monitoring tool we used, but it's easy to create one.
The slicing algorithm is based on the file name, and divides the space
of SHA1 hash values into N contiguous regions of equal length.[1] Get a
directory listing of the out queue and count the files in each slice.
If you see one slice (rarely more) whose count just keeps increasing,
that's the stalled runner.
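
A minimal sketch of such a monitor follows.  It is not the tool we
used.  It assumes that queue files are named
<timestamp>+<40-hex-digit SHA1 digest>.pck (check switchboard.py to
confirm for your version), and the names slice_of and slice_counts are
made up for this example.  Slice boundaries may differ from Mailman's
own arithmetic by one hash value at the edges when N does not divide
2^160 evenly, which doesn't matter for spotting a stalled slice.

```python
import os

SHA1_SPACE = 1 << 160  # number of possible SHA1 values


def slice_of(filename, numslices):
    """Return the slice index (0..numslices-1) for a queue file name.

    Returns None for names that don't look like queue entries
    (assumed format: <timestamp>+<hex digest>.pck).
    """
    base, ext = os.path.splitext(filename)
    if ext != ".pck" or "+" not in base:
        return None
    _when, digest = base.split("+", 1)
    try:
        value = int(digest, 16)
    except ValueError:
        return None
    if not 0 <= value < SHA1_SPACE:
        return None
    # N contiguous regions of equal length over the hash space.
    return value * numslices // SHA1_SPACE


def slice_counts(queue_dir, numslices):
    """Count the queue files in each slice of queue_dir."""
    counts = [0] * numslices
    for name in os.listdir(queue_dir):
        index = slice_of(name, numslices)
        if index is not None:
            counts[index] += 1
    return counts
```

Call slice_counts() every few seconds on the out queue directory (its
location depends on your installation), with numslices set to the
number of out runner instances, and compare successive results.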
You can also use utilities like lsof to keep tabs on the connections
to the outgoing SMTP gateway (connections are identified by the source
port).  If one lasts more than 5 seconds, that's it.
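
Here is a rough sketch of that check, again not the tool we used.  It
assumes the gateway listens on port 25 and that lsof prints
connections in its usual "src:port->dst:port (ESTABLISHED)" form; the
function names are made up for this example, and the 5-second
threshold is the one mentioned above.

```python
import re
import subprocess
import time

# NAME column of lsof, e.g. "10.0.0.5:49152->10.0.0.9:25 (ESTABLISHED)"
CONN_RE = re.compile(r"(\S+):(\d+)->(\S+):(\d+)\s+\(ESTABLISHED\)")


def source_ports(lsof_output, smtp_port=25):
    """Return the source ports of established connections to smtp_port."""
    ports = set()
    for line in lsof_output.splitlines():
        match = CONN_RE.search(line)
        if match and int(match.group(4)) == smtp_port:
            ports.add(int(match.group(2)))
    return ports


def update_ages(first_seen, ports, now):
    """Forget closed connections, record new ones, return {port: age}."""
    for port in list(first_seen):
        if port not in ports:
            del first_seen[port]
    for port in ports:
        first_seen.setdefault(port, now)
    return {port: now - start for port, start in first_seen.items()}


def watch(smtp_port=25, threshold=5.0, interval=1.0):
    """Poll lsof forever and report connections older than threshold."""
    first_seen = {}
    while True:
        # lsof exits non-zero when nothing matches, so don't check it.
        result = subprocess.run(
            ["lsof", "-nP", f"-iTCP:{smtp_port}", "-sTCP:ESTABLISHED"],
            capture_output=True, text=True)
        ports = source_ports(result.stdout, smtp_port)
        ages = update_ages(first_seen, ports, time.monotonic())
        for port, age in sorted(ages.items()):
            if age > threshold:
                print(f"source port {port} open for {age:.0f}s")
        time.sleep(interval)
```

Run watch() as a user allowed to see Mailman's sockets.  Ages are
measured from when the monitor first saw the connection, so they
undercount by up to one polling interval.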
Footnotes:
[1]  The slicing algorithm is in the __init__ function in
mailman/src/mailman/core/switchboard.py.

Stephen J. Turnbull