MM3 gateway errors and frequent reboot needs
We have had to reboot the list server almost every week and wonder if there is something leaking and eating up resources. It does not appear to be memory, though. The server is slow to respond (could be because it's overseas) and then after some time we get intermittent gateway errors and finally we cannot access the lists online because of the gateway errors. It appears to me that the issue is with Postorius, but I could be wrong. Do files get closed after opening? Are we alone having such issues?
Please advise.
PS: My old MM2 installation could (and did) run without any help. Not so with MM3.
On 4/19/20 1:21 PM, hansen@rc.org wrote:
We have had to reboot the list server almost every week and wonder if there is something leaking and eating up resources. It does not appear to be memory, though. The server is slow to respond (could be because it's overseas) and then after some time we get intermittent gateway errors and finally we cannot access the lists online because of the gateway errors. It appears to me that the issue is with Postorius, but I could be wrong. Do files get closed after opening? Are we alone having such issues?
We are not seeing an issue like this on either lists.mailman3.org (this list) or mail.python.org (at the moment 118 @python.org Mailman 3 lists)
What are you seeing in web server logs?
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On 4/19/20 4:38 PM, Mark Sapiro wrote:
On 4/19/20 1:21 PM, hansen@rc.org wrote:
We have had to reboot the list server almost every week and wonder if there is something leaking and eating up resources. It does not appear to be memory, though. The server is slow to respond (could be because it's overseas) and then after some time we get intermittent gateway errors and finally we cannot access the lists online because of the gateway errors. It appears to me that the issue is with Postorius, but I could be wrong. Do files get closed after opening? Are we alone having such issues?
We are not seeing an issue like this on either lists.mailman3.org (this list) or mail.python.org (at the moment 118 @python.org Mailman 3 lists)
What are you seeing in web server logs?
I concur with Mark. I am not seeing any such behavior on any of my MM3 servers. On one server we were having similar problems, and that was due to a bunch of badly behaved indexing bots (semrush, ahrefbot) hammering the server. So definitely check your web server log.
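One way to check for crawler traffic is to tally the User-Agent field of the access log. The following is only a minimal sketch: it assumes the web server writes the common "combined" log format (where the user agent is the last quoted field), and the function name and sample log lines are made up for illustration.

```python
import re
from collections import Counter

# In the "combined" log format a line ends with: "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" "([^"]*)"\s*$')


def top_user_agents(lines, n=10):
    """Return the n most frequent user agents as (agent, hits) pairs."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # lines in another format are skipped
            counts[match.group(1)] += 1
    return counts.most_common(n)


# Hypothetical sample lines, for illustration only.
sample = [
    '10.0.0.1 - - [19/Apr/2020:13:21:00 +0000] "GET /hyperkitty/ HTTP/1.1" 200 512 "-" "SemrushBot"',
    '10.0.0.2 - - [19/Apr/2020:13:21:01 +0000] "GET /postorius/ HTTP/1.1" 200 734 "-" "Mozilla/5.0"',
    '10.0.0.1 - - [19/Apr/2020:13:21:02 +0000] "GET /hyperkitty/list/ HTTP/1.1" 200 901 "-" "SemrushBot"',
]
for agent, hits in top_user_agents(sample):
    print(f"{hits:6d}  {agent}")
```

To run it against a real log, pass an open file object instead of the sample list. If one or two crawlers dominate the counts, they are a plausible cause of the load.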
-- Please let me know if you need further assistance.
Thank you for your business. We appreciate our clients. Brian Carpenter EMWD.com
-- EMWD's Knowledgebase: https://clientarea.emwd.com/index.php/knowledgebase
EMWD's Community Forums http://discourse.emwd.com/
hansen@rc.org writes:
We have had to reboot the list server almost every week and wonder if there is something leaking and eating up resources.
That's something you would have to check yourself, using Unix tools like ps, df, and lsof or Activity Monitor on Mac. I don't recall any cumulative resource problems like this. Some old reports indicate certain configurations and activity patterns put a lot of instantaneous pressure on the system, but this doesn't sound like that. Also those problems were specific to the software stack (databases, caches, search engines) that was used. The current recommended stacks do not have those problems as far as I know.
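As a rough complement to lsof, the number of open file descriptors per process can be sampled over time to see whether any process keeps accumulating them. This sketch assumes a Linux system with /proc mounted; the function name is made up for illustration, and without root it only sees processes owned by the current user.

```python
import os


def open_fd_counts(proc_root="/proc"):
    """Return {pid: number of open file descriptors} for readable processes."""
    counts = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric directory names are process IDs
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            counts[int(entry)] = len(os.listdir(fd_dir))
        except OSError:
            # The process exited in the meantime, or we lack permission.
            continue
    return counts


def busiest(counts, n=5):
    """Return the n (pid, fd_count) pairs with the most open descriptors."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]
```

Calling `busiest(open_fd_counts())` from cron every hour or so and logging the result would show whether the web server workers or the Mailman runners grow steadily between reboots. A flat count would point away from a file descriptor leak.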
The only current performance problem I'm aware of is in using the REST API directly to get large collections (e.g., 10,000 member subscriber lists). (My current thinking is that those are best addressed by using the corresponding paged endpoints, but I'm still researching.) However, that is not a cumulative problem, it's a size-of-list problem, so it doesn't sound related.
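For illustration, fetching a large roster page by page might look like the sketch below. It is not a recommendation from the Mailman developers. It assumes the core REST API's `count` and `page` query parameters and a JSON response with `entries` and `total_size` keys; the base URL, list ID, and credentials passed to `make_fetcher` are placeholders that depend on the local configuration.

```python
import base64
import json
import urllib.request


def iter_paged(fetch_page, count=100):
    """Yield entries one page at a time; fetch_page(page, count) returns a dict."""
    page = 1
    while True:
        data = fetch_page(page, count)
        entries = data.get("entries", [])  # an empty page may omit the key
        if not entries:
            return
        yield from entries
        total = data.get("total_size")
        if total is not None and page * count >= total:
            return
        page += 1


def make_fetcher(base_url, list_id, user, password):
    """Build a fetch_page function for a list's member roster (assumed URL layout)."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()

    def fetch_page(page, count):
        url = f"{base_url}/lists/{list_id}/roster/member?count={count}&page={page}"
        request = urllib.request.Request(
            url, headers={"Authorization": f"Basic {token}"}
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return fetch_page
```

With placeholder values, usage would be something like `for member in iter_paged(make_fetcher("http://localhost:8001/3.1", "mylist.example.com", "restadmin", "restpass")): ...`. Each request then handles at most `count` members instead of the whole subscriber list.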
Are you running all of Mailman core, Postorius, and HyperKitty on the same server? For Postorius and HyperKitty, what webserver are you using? What database backend? What other applications are running on the server? What operating system and version? Recent hardware?
It does not appear to be memory, though. The server is slow to respond (could be because it's overseas) and then after some time we get intermittent gateway errors and finally we cannot access the lists online because of the gateway errors.
What does "gateway errors" mean? If it means problems with Internet routing or firewalls, I don't understand how network errors could arise from a problem in Postorius. Why do you think Postorius could be responsible for those, aside from them appearing after switching to Mailman 3? Are you sure you weren't having them before?
It appears to me that the issue is with Postorius, but I could be wrong. Do files get closed after opening?
Yes, as far as we know. No reason to think otherwise since this is not something that has been reported that I can recall, and Python 3 helps us to code in a style that is not so vulnerable to such leaks.
Are we alone having such issues?
So far, yes.
PS: My old MM2 installation could (and did) run without any help. Not so with MM3.
I don't think any of *our* feelings would be hurt if people decided to keep their Mailman 2 installations. :-) It's still great software for what it does. But it took more than 10 years to get there (the first production installations were in about 1994! and it's been feature-stable since about 2005, I'd guess).
Unfortunately it wasn't possible to bring in the kind of features that many people need nowadays without a complete rewrite (partly a matter of manpower -- Python 3 brings a lot of power and fun to programming, partly a matter of technical debt making Mailman 2 code hard to change in big ways, and partly a matter of the need to port to Python 3 for the very long run), and the new features themselves changed the way that Mailman (and Mailman's users and administrators) interact with the network.
participants (4)
- Brian Carpenter
- hansen@rc.org
- Mark Sapiro
- Stephen J. Turnbull