Hyperkitty performance problem
Hey all,
last week we caught some AI bots causing high load on our Mailman 3 server's archive, which we then blocked with nginx rules.
I'm not sure whether it is related to this incident, but we noticed the following while analyzing the log files at the time. In mailman.log, HyperKitty is logged as archiving a single mail 10k times and more over a period of 2 hours, seemingly all with the same process ID:
(1186) HyperKitty archived message [...]
The mailman3-web.log has matching entries saying that the mail was archived
hyperkitty.views.mailman Archived message
every 0.5 seconds.
In the web interface to the archive, the mail is only listed once though.
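A minimal sketch of how such repeated entries could be counted, in case it helps others check their own logs. It assumes only the line shape shown in the excerpt above, a process ID in parentheses followed by "HyperKitty archived message", and treats whatever follows on the line as an opaque key. The function names and the log path in the usage note are hypothetical.

```python
import re
from collections import Counter

# Assumed line shape, taken only from the excerpt above:
#   "... (1186) HyperKitty archived message <rest of line>"
ARCHIVED = re.compile(r"\((\d+)\) HyperKitty archived message(.*)")


def count_archive_entries(lines):
    """Count 'HyperKitty archived message' entries per (pid, rest of line)."""
    counts = Counter()
    for line in lines:
        match = ARCHIVED.search(line)
        if match:
            pid = match.group(1)
            rest = match.group(2).strip()
            counts[(pid, rest)] += 1
    return counts


def report(path, threshold=2):
    """Print every entry that was logged at least `threshold` times."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        counts = count_archive_entries(handle)
    for (pid, rest), n in counts.most_common():
        if n >= threshold:
            print(f"{n:6d}  pid={pid}  {rest}")
```

Calling `report("/var/log/mailman3/mailman.log")` (path is an assumption, adjust to your installation) would list the messages that were archived more than once, most frequent first.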
My suspicion is that these archiving processes are now overloading the system and preventing the queued messages from being processed in a timely manner. I could not find anything in the Mailman 3 or HyperKitty documentation on how to better distribute system resources to the Mailman runners, or how to prevent HyperKitty from archiving the same mail over and over again.
Any hint about where to look for the cause of this is highly appreciated.
Best, Tobias
-- Tobias Diekershoff >>> System Hacker Free Software Foundation Europe Schönhauser Allee 6/7, 10119 Berlin, Germany | t +49-30-27595290 Registered at Amtsgericht Hamburg, VR 17030 | fsfe.org/support OpenPGP-Key ID ... 0x25FE376FF17694A1 Fingerprint ...... 23EE F484 FDF8 291C BA09 A406 25FE 376F F176 94A1
Hey all,
so far we could not find the reason behind the problem. However, while reviewing the installation process, one of our team mentioned that he had read the installation instructions for HyperKitty differently than the rest of the team.
Are Mailman3/Postorius and HyperKitty supposed to run in separate databases, or are they supposed to use the same database?
Thanks in advance for clarification! Tobias
On 06.11.24 08:18, Tobias Diekershoff wrote:
[...]
On Fri, Nov 22, 2024 at 9:59 AM Tobias Diekershoff <tobiasd@fsfe.org> wrote:
Are Mailman3/Postorius and Hyperkitty supposed to run in separate databases, or are they supposed to use the same database?
The official documentation specifies two DBs - one for Mailman Core and the other for Mailmanweb. As far as I know, there are no conflicts in the tables so you could use the same DB for both.
-- Best regards, Odhiambo WASHINGTON, Nairobi,KE +254 7 3200 0004/+254 7 2274 3223 In an Internet failure case, the #1 suspect is a constant: DNS. "Oh, the cruft.", egrep -v '^$|^.*#' ¯\_(ツ)_/¯ :-) [How to ask smart questions: http://www.catb.org/~esr/faqs/smart-questions.html]
On 11/22/24 00:37, Odhiambo Washington via Mailman-users wrote:
On Fri, Nov 22, 2024 at 9:59 AM Tobias Diekershoff <tobiasd@fsfe.org> wrote:
Are Mailman3/Postorius and Hyperkitty supposed to run in separate databases, or are they supposed to use the same database?
...
The official documentation specifies two DBs - one for Mailman Core and the other for Mailmanweb. As far as I know, there are no conflicts in the tables so you could use the same DB for both.
And the mailmanweb DB is for HyperKitty and Postorius. I know of no doc suggesting separate databases for HyperKitty and Postorius and in fact accomplishing that would require separate Django instances for HyperKitty and Postorius.
I.e., Postorius, HyperKitty and django-mailman3 are all Django applications and use the database configured in Django's settings. Mailman core is separate and uses the database configured in mailman.cfg.
As Odhiambo says, there are no table conflicts between these databases so they can be either the same database or separate databases as you wish.
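To illustrate the split described above, here is a hedged sketch of what the Django side might look like. The database name, user, password, host, and port are all placeholders, not values from this thread. Postorius and HyperKitty would both use this single `DATABASES` setting, while Mailman core ignores it and reads its own connection from the `url` setting in the `[database]` section of mailman.cfg.

```python
# Hypothetical excerpt from the Django settings file shared by
# Postorius, HyperKitty and django-mailman3. All values are placeholders.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.postgresql",
        "NAME": "mailmanweb",      # assumed database name
        "USER": "mailmanweb",      # assumed database user
        "PASSWORD": "change-me",
        "HOST": "localhost",
        "PORT": "5432",
    }
}

# Mailman core does not read the Django settings. Its connection is set
# separately in mailman.cfg, in the [database] section, as a URL like the
# one below (placeholder). Pointing it at the same database as above or
# at a different one are both possible, as there are no table conflicts.
MAILMAN_CORE_DB_URL = "postgresql://mailman:change-me@localhost/mailman"
```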
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 22.11.24 17:54, Mark Sapiro wrote:
As Odhiambo says, there are no table conflicts between these databases so they can be either the same database or separate databases as you wish.
Thank you both for the clarification, Odhiambo and Mark!
-Tobias
participants (3)
- Mark Sapiro
- Odhiambo Washington
- Tobias Diekershoff