"Cannot connect to SMTP server localhost on port 25"
Hello,
I've been getting "Cannot connect to SMTP server localhost on port 25" from runner "out" in mailman.log over some mails stuck in shunt queue, postfix logs show nothing when this happens, tcpdump on lo interface port 25 shows nothing. The log shows up when I start mailman, and then it seems not to show up again.
This is my mailman.cfg's relevant parts:
[mta] verp_confirmations: yes verp_personalized_deliveries: yes verp_delivery_interval: 1
incoming: mailman.mta.postfix.LMTP outgoing: mailman.mta.deliver.deliver lmtp_host: localhost lmtp_port: 8024 smtp_host: localhost smtp_port: 25 configuration: python:mailman.config.postfix
I can do 'telnet localhost 25' just fine, including as mailman user. I can send and receive mail on my mail server. I tried making a test list and sent/received messages fine.
smtp debug logs show "Using agent: <mailman.mta.deliver.Deliver object at 0x7fdd9793c940>" and "IndividualDelivery to: <email@address.tld>" on repeat, where the email address is one of the addresses of the shunted mails.
version 3.2.2
On 9/18/19 11:29 AM, Manos Pitsidianakis wrote:
Hello,
I've been getting "Cannot connect to SMTP server localhost on port 25" from runner "out" in mailman.log over some mails stuck in shunt queue, postfix logs show nothing when this happens, tcpdump on lo interface port 25 shows nothing. The log shows up when I start mailman, and then it seems not to show up again.
That log message is only written once. It won't be written again unless the out runner is restarted or there is a successful delivery followed by another socket.error.
I can do 'telnet localhost 25' just fine, including as mailman user. I can send and receive mail on my mail server. I tried making a test list and sent/received messages fine.
Is only one list affected, and is it all messages from this list or only some?
smtp debug logs show "Using agent: <mailman.mta.deliver.Deliver object at 0x7fdd9793c940>" and "IndividualDelivery to: <email@address.tld>" on repeat, where the email address is one of the addresses of the shunted mails.
Do some deliveries of the problem message succeed?
I.e., in the normal case without failure, there will be one "IndividualDelivery to: <email@address.tld>" message for each recipient in the recipients list.
In the shunted message case, the msgdata object in the shunted pickle contains the recipients. Are there "IndividualDelivery to: <email@address.tld>" entries for some but not all recipients. Is the last of these an invalid local address? (Postfix can return a 421 status and drop the connection if there are too many invalid local recipients, although I don't know if this would apply in the VERP case.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thank you for your prompt response Mark,
Στις 2019-09-18 19:12, Mark Sapiro έγραψε:
Is only one list affected, and is it all messages from this list or only some?
Just two messages from this certain list. Subsequent messages and other activity seems to work properly
Do some deliveries of the problem message succeed?
I think not, the IndividualDelivery lines list all all recipients (list is small so I was able to check)
Postfix can return a 421 status and drop the connection if there are too many invalid local recipients, although I don't know if this would apply in the VERP case.
Well there is no connection to postfix at all when the runner starts and the "Cannot connect to SMTP server" log shows up. Otherwise I'd have seen something in the postfix logs or tcpdumps in localhost.
On 9/18/19 2:29 PM, Manos Pitsidianakis wrote:
Στις 2019-09-18 19:12, Mark Sapiro έγραψε:
Is only one list affected, and is it all messages from this list or only some?
Just two messages from this certain list. Subsequent messages and other activity seems to work properly
There should be entries in the mailman.log for each shunted message listing the exception with a traceback and the fact that the message was shunted.
What are these entries?
Well there is no connection to postfix at all when the runner starts and the "Cannot connect to SMTP server" log shows up. Otherwise I'd have seen something in the postfix logs or tcpdumps in localhost.
The "Cannot connect to SMTP server" log message should not occur immediately on startup unless there are queued messages. That log message should only occur on an attempt to actually send a message.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thank you again for your reply, Mark.
Στις 2019-09-18 21:53, Mark Sapiro έγραψε:
What are these entries?
I have these log entries, are these what you were referring to?
https://paste.debian.net/hidden/37d863f6/
The "Cannot connect to SMTP server" log message should not occur immediately on startup unless there are queued messages. That log message should only occur on an attempt to actually send a message.
I think the problematic e-mails that cause this issue are stuck on the outgoing queue. Right now there are 2 pck files and one .bak in var/queue/out changing filenames repeatedly.
On 9/19/19 2:22 AM, Manos Pitsidianakis wrote:
Thank you again for your reply, Mark.
Στις 2019-09-18 21:53, Mark Sapiro έγραψε:
What are these entries?
I have these log entries, are these what you were referring to?
No. You referred to "shunted" mails. I assumed that mail was winding up in the shunt queue, but this is apparently not the case.
The "Cannot connect to SMTP server" log message should not occur immediately on startup unless there are queued messages. That log message should only occur on an attempt to actually send a message.
I think the problematic e-mails that cause this issue are stuck on the outgoing queue. Right now there are 2 pck files and one .bak in var/queue/out changing filenames repeatedly.
What is happening is you have three messages in the out queue which are problematic. When Mailman or the out runner is restarted, for some reason the attempt to send these triggers the "Cannot connect to SMTP server" issue which is logged only once.
Here's what happens:
The oldest .pck file in the out queue gets picked up by the out runner and renamed to a .bak.
The runner attempts to send the message and gets a socket.error exception so it requeues the message to be tried again. This removes the .bak and creates a new .pck.
This process continues so the files keep getting renamed.
I suggest you stop Mailman, and move the three files out of the out/ queue, and then restart Mailman.
This should stop the out runner from "spinning" on these messages. Then you can examine the files with Mailman's 'bin/mailman qfile' command and see if you can identify something in the recipients list or other metadata that might be causing this.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Στις 2019-09-19 18:52, Mark Sapiro έγραψε:
No. You referred to "shunted" mails. I assumed that mail was winding up in the shunt queue, but this is apparently not the case.
But they are. Not exactly when or why but they show up in queue/shunt and queue/out at times. (maybe because I used mailman unshunt?)
This should stop the out runner from "spinning" on these messages. Then you can examine the files with Mailman's 'bin/mailman qfile' command and see if you can identify something in the recipients list or other metadata that might be causing this.
I cannot discern anything in the qfiles. It'd take some effort to deanonymise them for posting here.
I've had new posts in the list and they get stuck as well.
On 9/21/19 1:28 AM, Manos Pitsidianakis wrote:
I cannot discern anything in the qfiles. It'd take some effort to deanonymise them for posting here.
I've had new posts in the list and they get stuck as well.
It's a stretch, but I suggest you examine the list membership and remove any invalid local addresses.
Also, look in the mail server's logs to see if you can see anything about incoming connections being established and then dropped.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thanks for the reply Mark,
Στις 2019-09-21 16:19, Mark Sapiro έγραψε:
It's a stretch, but I suggest you examine the list membership and remove any invalid local addresses.
There aren't any local addresses.
Also, look in the mail server's logs to see if you can see anything about incoming connections being established and then dropped. Unfortunately no. Listening on all interfaces with tcpdump and there are no packets when mailman starts and the "Cannot connect to SMTP" log appears. It looks like there is no connection at all.
Can I use a debugger to see exactly what happens in the runner?
On 9/21/19 1:00 PM, Manos Pitsidianakis wrote:
Can I use a debugger to see exactly what happens in the runner?
You might be able to run the runner under pdb <https://docs.python.org/3.2/library/pdb.html>.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Στις 2019-09-21 22:18, Mark Sapiro έγραψε:
On 9/21/19 1:00 PM, Manos Pitsidianakis wrote:
Can I use a debugger to see exactly what happens in the runner?
You might be able to run the runner under pdb <https://docs.python.org/3.2/library/pdb.html>.
I think I found what's wrong:
/usr/local/lib/python3.5/dist-packages/mailman/core/runner.py(134)run() -> filecnt = self._one_iteration()
/usr/local/lib/python3.5/dist-packages/mailman/core/runner.py(173)_one_iteration() -> self._process_one_file(msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/core/runner.py(266)_process_one_file() -> keepqueued = self._dispose(mlist, msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/runners/outgoing.py(95)_dispose() -> self._func(mlist, msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/mta/deliver.py(86)deliver() -> refused = agent.deliver(mlist, msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/mta/base.py(163)deliver() -> callback(mlist, message_copy, msgdata_copy)
/usr/local/lib/python3.5/dist-packages/mailman/mta/decorating.py(32)decorate() -> decorator.process(mlist, msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/handlers/decorate.py(264)process() -> process(mlist, msg, msgdata)
/usr/local/lib/python3.5/dist-packages/mailman/handlers/decorate.py(89)process() -> footer = decorate('list:member:regular:footer', mlist, d)
/usr/local/lib/python3.5/dist-packages/mailman/handlers/decorate.py(226)decorate() -> template = getUtility(ITemplateLoader).get(name, mlist, **extradict)
/usr/local/lib/python3.5/dist-packages/mailman/model/template.py(188)get() -> name, lookup_context, **substitutions)
/usr/local/lib/python3.5/dist-packages/mailman/database/transaction.py(85)wrapper() -> return function(args[0], config.db.store, *args[1:], **kws)
/usr/local/lib/python3.5/dist-packages/mailman/model/template.py(109)get() -> contents = protocols.get(actual_uri, **auth)
/usr/local/lib/python3.5/dist-packages/mailman/utilities/protocols.py(38)get() -> response = requests.get(url, timeout=REQUEST_TIMEOUT, **kws) /usr/lib/python3/dist-packages/requests/api.py(70)get() -> return request('get', url, params=params, **kwargs) /usr/lib/python3/dist-packages/requests/api.py(56)request() -> return session.request(method=method, url=url, **kwargs) /usr/lib/python3/dist-packages/requests/sessions.py(488)request() -> resp = self.send(prep, **send_kwargs) /usr/lib/python3/dist-packages/requests/sessions.py(609)send() -> r = adapter.send(request, **kwargs) /usr/lib/python3/dist-packages/requests/adapters.py(487)send() -> raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='localhost', port=8000): Max retries exceeded with url: /mailman3/api/templates/list/meli-announce.meli.delivery/list:member:regular:footer (Caused by NewConnectionError('<requests.packages.urllib3.connection.HTTPConnection object at 0x7f4f0495a358>: Failed to establish a new connection: [Errno 111] Connection refused',))
mailman3 tries to get the template I've set via the mailman suite interface but it assumes it is hosted locally: 'http://localhost:8000/mailman3/api/templates/list/meli-announce.meli.deliver...' In my setup, they are separate machines. I can't find any way to change this in the configuration. Also, can't the same template be loaded from the master's localhost:8001 api?
Haven't fiddled with python for some time, why is this exception catched by "except socket.error"? Is it a subclass of requests.exceptions.ConnectionError?
Στις 2019-09-22 09:11, Manos Pitsidianakis έγραψε:
mailman3 tries to get the template I've set via the mailman suite interface but it assumes it is hosted locally: 'http://localhost:8000/mailman3/api/templates/list/meli-announce.meli.deliver...' In my setup, they are separate machines. I can't find any way to change this in the configuration. Also, can't the same template be loaded from the master's localhost:8001 api?
I created an ssh-tunnel that forwards postorious 8000 port to localhost as a systemd service to workaround this temporarily.
On 9/22/19 2:11 AM, Manos Pitsidianakis wrote:
mailman3 tries to get the template I've set via the mailman suite interface but it assumes it is hosted locally: 'http://localhost:8000/mailman3/api/templates/list/meli-announce.meli.deliver...' In my setup, they are separate machines. I can't find any way to change this in the configuration. Also, can't the same template be loaded from the master's localhost:8001 api?
See <https://mailman.readthedocs.io/en/latest/build/lib/mailman/rest/docs/templates.html>. There are other ways to set the URI for a template which will work in your case.
When you set a template via Postorius, it sets the URI for that template to point to itself to return what it set. In your case, this URI points to localhost which is Postorius' localhost, but it is set in core and core's localhost is a different machine so it doesn't work.
You could file an issue at <https://gitlab.com/mailman/postorius/issues> about this.
Haven't fiddled with python for some time, why is this exception catched by "except socket.error"? Is it a subclass of requests.exceptions.ConnectionError?
The other way around.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (2)
-
Manos Pitsidianakis
-
Mark Sapiro