Debian's Mailman 3.2.1 package too unreliable in production: held mail occasionally causes server errors, plus unexplained queue/bad messages
We have an issue where a mailing list has had a message sent to it from a non-member, so the message has been held, but when the list owner or I attempt to list the held messages for the list we get a Server Error: "An error occurred while processing your request.".
I see the mail being held in the mailman log with the non-subscriber reason, and followed the suggestions in https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/C... which worked to look at and discard that held message. The problem is that there are more, and we have over 900 mailing lists.
FYI, we are running:
Mailman Core Version: GNU Mailman 3.2.1 (La Villa Strangiato)
Mailman Core API Version: 3.0
Mailman Core Python Version: 3.7.3 (default, Jul 25 2020, 13:03:44) [GCC 8.3.0]
So we are looking at setting up a virtualenv to run the latest mailman3 3.3.1, as the 3.2.1 package from Debian is looking unstable, barely usable. For example, while poking around I found 641 messages in queue/bad, but when I inspect them they look fine. If, as mentioned elsewhere, I rename them to .pck, put them in queue/shunt, and run "mailman unshunt", the mail finally gets delivered. Why does it work now? Some kind of transient bug?
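The manual recovery described above (rename to .pck, move into queue/shunt, then run "mailman unshunt") could be scripted roughly as follows. This is only a sketch: the function name and the `dry_run` flag are made up for illustration, and the `.psv` suffix for preserved files is an assumption that should be checked against what is actually sitting in queue/bad.

```python
import shutil
from pathlib import Path


def requeue_bad_messages(bad_dir, shunt_dir, preserved_suffix=".psv", dry_run=True):
    """Move preserved messages from bad_dir into shunt_dir as .pck files.

    Returns the (source, destination) pairs that were, or would be, moved.
    The ".psv" default is an assumption; adjust it to the real file names.
    """
    bad_dir, shunt_dir = Path(bad_dir), Path(shunt_dir)
    moves = []
    for src in sorted(bad_dir.glob("*" + preserved_suffix)):
        dst = shunt_dir / (src.stem + ".pck")
        if dst.exists():
            # Never overwrite something already waiting in the shunt queue.
            continue
        moves.append((src, dst))
        if not dry_run:
            shutil.move(str(src), str(dst))
    return moves


if __name__ == "__main__":
    # Dry run first: print what would be moved, change nothing.
    for src, dst in requeue_bad_messages("/var/lib/mailman3/queue/bad",
                                         "/var/lib/mailman3/queue/shunt"):
        print(src, "->", dst)
```

Running it as the Mailman user rather than root avoids introducing new ownership problems; "mailman unshunt" is still run separately afterwards.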
Is 3.3.1 the way to go?
On that last error, I've now enabled debug and hope to catch why messages are being dropped into queue/bad instead of being delivered. Already I have found errors in the mailman log such as:
PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.919073+d907eb9512ce11fda515c19d723537fc4cb15da2.pck'
Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.919073+d907eb9512ce11fda515c19d723537fc4cb15da2
Oct 20 23:41:35 2020 (9787) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/archive/1603233694.9209244+a0c1f3aab51d3d202aa37911ab453f094b50b450.pck'
Oct 20 23:41:35 2020 (9793) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198.pck'
Oct 20 23:41:35 2020 (9793) Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 158, in _one_iteration
    msg, msgdata = self.switchboard.dequeue(filebase)
  File "/usr/lib/python3/dist-packages/mailman/core/switchboard.py", line 150, in dequeue
    with open(filename, 'rb') as fp:
PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198.pck'
Oct 20 23:41:35 2020 (9790) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/in/1603233694.9129462+8be9ab12e5dfeab086a1ab8b3917550aea313505.pck'
Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198
Oct 20 23:41:35 2020 (9793) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba.pck'
Oct 20 23:41:35 2020 (9793) Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 158, in _one_iteration
    msg, msgdata = self.switchboard.dequeue(filebase)
  File "/usr/lib/python3/dist-packages/mailman/core/switchboard.py", line 150, in dequeue
    with open(filename, 'rb') as fp:
PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba.pck'
Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba
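With hundreds of such entries, it may help to tally the "Uncaught runner exception" lines by process ID and queue directory to see whether one runner or all of them are affected. The sketch below assumes the log line shape shown in the excerpt above; the function name is hypothetical, and the log text has to be read from wherever the mailman log actually lives.

```python
import re
from collections import Counter
from pathlib import PurePosixPath

# Matches the "Uncaught runner exception" entries as they appear in the excerpt.
LINE_RE = re.compile(
    r"\((?P<pid>\d+)\) Uncaught runner exception: "
    r"\[Errno 13\] Permission denied: '(?P<path>[^']+)'"
)


def summarize_permission_errors(log_text):
    """Count Errno 13 runner exceptions per (pid, queue directory name)."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        # The queue is the directory holding the .pck file, e.g. "out" or "in".
        queue = PurePosixPath(match.group("path")).parent.name
        counts[(int(match.group("pid")), queue)] += 1
    return counts
```

Since each runner is a separate process, the PID column indicates which runner hit the error; in the excerpt, 9787, 9790 and 9793 fail on the archive, in and out queues respectively.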
The mailman3 user, "list", owns and has full rights all the way down:

# ls -ld /var/lib/mailman3/queue/*
drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/archive
drwxrwx--- 2 list list 77824 Oct 21 00:15 /var/lib/mailman3/queue/bad
drwxrwx--- 2 list list  4096 Oct 19 10:44 /var/lib/mailman3/queue/bounces
drwxrwx--- 2 list list  4096 Oct  5 17:18 /var/lib/mailman3/queue/command
drwxrwx--- 2 list list  4096 Oct 21 00:00 /var/lib/mailman3/queue/digest
drwxrwx--- 2 list list  4096 Oct 21 00:20 /var/lib/mailman3/queue/in
drwxrwx--- 2 list list  4096 Aug 11 18:44 /var/lib/mailman3/queue/nntp
drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/out
drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/pipeline
drwxrwx--- 2 list list  4096 Aug 11 18:44 /var/lib/mailman3/queue/retry
drwxrwx--- 2 list list 69632 Oct 21 01:45 /var/lib/mailman3/queue/shunt
drwxrwx--- 2 list list  4096 Oct 21 00:20 /var/lib/mailman3/queue/virgin

so I'm kinda running out of options why this particular problem is happening. Any pointers? Or just toss the mailman 3.2.1 debian package out and go for a virtualenv 3.3.1 mailman?
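The listing above covers only the queue directories, while the PermissionError is raised when opening individual .pck files. A rough way to check the files themselves is sketched below; the function name is hypothetical, and the check looks only at owner and mode bits, so it will not detect denials coming from ACLs or a security module.

```python
import os
import stat
from pathlib import Path


def find_unreadable_queue_files(queue_root, expected_uid):
    """Return (path, reason) pairs for queue files the runner user may not read."""
    problems = []
    for path in sorted(Path(queue_root).rglob("*")):
        if not path.is_file():
            continue
        info = path.stat()
        if info.st_uid != expected_uid:
            problems.append((path, f"owned by uid {info.st_uid}"))
        elif not info.st_mode & stat.S_IRUSR:
            mode = stat.filemode(info.st_mode)
            problems.append((path, f"owner cannot read (mode {mode})"))
    return problems


if __name__ == "__main__":
    import pwd
    # "list" is the Mailman user named in this thread.
    uid = pwd.getpwnam("list").pw_uid
    for path, reason in find_unreadable_queue_files("/var/lib/mailman3/queue", uid):
        print(path, reason)
```

Queue files are short-lived, so a clean result on an idle system does not rule out a file that was briefly created with the wrong owner, for example by a command run as root.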
Thanks -- Alex
On 10/20/20 6:30 PM, Alex Schuilenburg via Mailman-users wrote:
> We have an issue where a mailing list has had a message sent to it from a non-member, so the message has been held, but when the list owner or I attempt to list the held messages for the list we get a Server Error: "An error occurred while processing your request.".
Details of the server error should be in Django's log which I think in the Debian package is named mailmansuite.log.
> I see the mail being held in the mailman log with the non-subscriber reason, and followed the suggestions in https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/C... which worked to look at and discard that held message. The problem is that there are more, and we have over 900 mailing lists.
>
> FYI, we are running:
> Mailman Core Version: GNU Mailman 3.2.1 (La Villa Strangiato)
> Mailman Core API Version: 3.0
> Mailman Core Python Version: 3.7.3 (default, Jul 25 2020, 13:03:44) [GCC 8.3.0]
>
> So we are looking at setting up a virtualenv to run the latest mailman3 3.3.1, as the 3.2.1 package from Debian is looking unstable, barely usable.
Actually, the latest Mailman core is 3.3.2rc2 with the final due by the end of October.
While I would never discourage people from installing Mailman from our source distributions, without more understanding of your actual errors, it is difficult to know if that will help.
> For example, while poking around I found 641 messages in queue/bad, but when I inspect them they look fine. If, as mentioned elsewhere, I rename them to .pck, put them in queue/shunt, and run "mailman unshunt", the mail finally gets delivered. Why does it work now? Some kind of transient bug?
If they are in queue/bad because of permissions errors, it is not surprising that moving them to .pck files in queue/shunt is successful.
> Is 3.3.1 the way to go?
>
> On that last error, I've now enabled debug and hope to catch why messages are being dropped into queue/bad instead of being delivered. Already I have found errors in the mailman log such as:
>
> PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.919073+d907eb9512ce11fda515c19d723537fc4cb15da2.pck'
> Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.919073+d907eb9512ce11fda515c19d723537fc4cb15da2
> Oct 20 23:41:35 2020 (9787) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/archive/1603233694.9209244+a0c1f3aab51d3d202aa37911ab453f094b50b450.pck'
> Oct 20 23:41:35 2020 (9793) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198.pck'
> Oct 20 23:41:35 2020 (9793) Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 158, in _one_iteration
>     msg, msgdata = self.switchboard.dequeue(filebase)
>   File "/usr/lib/python3/dist-packages/mailman/core/switchboard.py", line 150, in dequeue
>     with open(filename, 'rb') as fp:
> PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198.pck'
> Oct 20 23:41:35 2020 (9790) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/in/1603233694.9129462+8be9ab12e5dfeab086a1ab8b3917550aea313505.pck'
> Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.9260252+138142eeb15b16b5da16934954d1d3aeb1331198
> Oct 20 23:41:35 2020 (9793) Uncaught runner exception: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba.pck'
> Oct 20 23:41:35 2020 (9793) Traceback (most recent call last):
>   File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 158, in _one_iteration
>     msg, msgdata = self.switchboard.dequeue(filebase)
>   File "/usr/lib/python3/dist-packages/mailman/core/switchboard.py", line 150, in dequeue
>     with open(filename, 'rb') as fp:
> PermissionError: [Errno 13] Permission denied: '/var/lib/mailman3/queue/out/1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba.pck'
> Oct 20 23:41:35 2020 (9793) Skipping and preserving unparseable message: 1603233694.9269004+5f8572edfb07e0e28b9f0b7928a1184b8c8048ba
Were the runners all running as user list when the above errors occurred?
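One way to see which user a runner process currently belongs to is to look at the owner of its /proc entry. This is a Linux-only sketch with a hypothetical helper name; the PIDs in the log (for example 9793) will have changed after a restart, so it only describes the current state, not the moment the errors were logged.

```python
import os
import pwd


def process_owner(pid):
    """Return the user name owning /proc/<pid>, or None if there is no such entry.

    Linux-only: relies on the /proc filesystem being mounted.
    """
    try:
        uid = os.stat(f"/proc/{pid}").st_uid
    except FileNotFoundError:
        return None
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        # The uid has no passwd entry; report the number instead.
        return str(uid)
```

For the setup in this thread, every runner PID would be expected to report "list"; anything else, root in particular, would point to Mailman having been started by hand as a different user at some point.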
> The mailman3 user, "list", owns and has full rights all the way down:
>
> # ls -ld /var/lib/mailman3/queue/*
> drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/archive
> drwxrwx--- 2 list list 77824 Oct 21 00:15 /var/lib/mailman3/queue/bad
> drwxrwx--- 2 list list  4096 Oct 19 10:44 /var/lib/mailman3/queue/bounces
> drwxrwx--- 2 list list  4096 Oct  5 17:18 /var/lib/mailman3/queue/command
> drwxrwx--- 2 list list  4096 Oct 21 00:00 /var/lib/mailman3/queue/digest
> drwxrwx--- 2 list list  4096 Oct 21 00:20 /var/lib/mailman3/queue/in
> drwxrwx--- 2 list list  4096 Aug 11 18:44 /var/lib/mailman3/queue/nntp
> drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/out
> drwxrwx--- 2 list list  4096 Oct 21 01:45 /var/lib/mailman3/queue/pipeline
> drwxrwx--- 2 list list  4096 Aug 11 18:44 /var/lib/mailman3/queue/retry
> drwxrwx--- 2 list list 69632 Oct 21 01:45 /var/lib/mailman3/queue/shunt
> drwxrwx--- 2 list list  4096 Oct 21 00:20 /var/lib/mailman3/queue/virgin
>
> so I'm kinda running out of options why this particular problem is happening. Any pointers? Or just toss the mailman 3.2.1 debian package out and go for a virtualenv 3.3.1 mailman?
Unexplained permissions errors are often SELinux issues. Is SELinux enabled? If so, have you tried disabling it?
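Whether SELinux is active can be read from the selinuxfs pseudo-filesystem, which on current kernels is normally mounted at /sys/fs/selinux. The helper below is a sketch with a hypothetical name; it takes the mount point as a parameter and treats a missing `enforce` file as "disabled".

```python
from pathlib import Path


def selinux_mode(selinuxfs="/sys/fs/selinux"):
    """Return "disabled", "permissive" or "enforcing" based on selinuxfs."""
    enforce = Path(selinuxfs) / "enforce"
    try:
        value = enforce.read_text().strip()
    except (FileNotFoundError, NotADirectoryError):
        # selinuxfs is not mounted here, so SELinux is not active.
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


if __name__ == "__main__":
    print(selinux_mode())
```

On systems with the SELinux userland tools installed, the `getenforce` command reports the same information.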
-- 
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Participants (2): Alex Schuilenburg, Mark Sapiro