Hyperkitty performance problem

Hey all,
last week we caught some AI bots causing high load on our Mailman3 server's archive, which we then blocked with nginx rules.
I'm not sure whether it is related to this incident, but we noticed the following when analyzing the log files at the time. In mailman.log, HyperKitty is listed 10,000 times and more archiving a single mail over the course of 2 hours, seemingly all with the same process ID:
(1186) HyperKitty archived message [...]
The mailman3-web.log has matching log entries showing that the mail was archived:
hyperkitty.views.mailman Archived message
every 0.5 seconds.
In the web interface to the archive, the mail is only listed once though.
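(A rough way to tally such repeats per message-id from the log, assuming the Debian default log location and the log line format quoted above; the script is only an illustration:)

#!/usr/bin/env python3
import re
from collections import Counter

# Assumption: Debian default log location; adjust if your logs live elsewhere.
LOG = '/var/log/mailman3/mailman.log'
# Matches the "HyperKitty archived message <msgid> ..." lines quoted above.
pattern = re.compile(r'HyperKitty archived message (\S+)')

counts = Counter()
with open(LOG, errors='replace') as fh:
    for line in fh:
        match = pattern.search(line)
        if match:
            counts[match.group(1)] += 1

# Print the ten most frequently archived message-ids.
for msgid, n in counts.most_common(10):
    print(f'{n:8d}  {msgid}')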
My suspicion is that these archiving processes are now overloading the system, preventing the queued messages from being caught up on in a timely manner. I could not find anything in the Mailman3 or HyperKitty documentation on how to better distribute system resources to the Mailman runners, or how to prevent HyperKitty from re-archiving a mail over and over again.
Any hint on where to look for the cause of this is highly appreciated.
Best, Tobias
-- Tobias Diekershoff >>> System Hacker Free Software Foundation Europe Schönhauser Allee 6/7, 10119 Berlin, Germany | t +49-30-27595290 Registered at Amtsgericht Hamburg, VR 17030 | fsfe.org/support OpenPGP-Key ID ... 0x25FE376FF17694A1 Fingerprint ...... 23EE F484 FDF8 291C BA09 A406 25FE 376F F176 94A1

Hey all,
so far we could not find the reason behind the problem. However, while reviewing the installation process, one of our team mentioned that he had read the installation instructions for HyperKitty differently than the rest of the team.
Are Mailman3/Postorius and Hyperkitty supposed to run in separate databases, or are they supposed to use the same database?
Thanks in advance for clarification! Tobias

On Fri, Nov 22, 2024 at 9:59 AM Tobias Diekershoff <tobiasd@fsfe.org> wrote:
Are Mailman3/Postorius and Hyperkitty supposed to run in separate databases, or are they supposed to use the same database?
The official documentation specifies two DBs - one for Mailman Core and the other for Mailmanweb. As far as I know, there are no conflicts in the tables so you could use the same DB for both.
-- Best regards, Odhiambo WASHINGTON, Nairobi,KE +254 7 3200 0004/+254 7 2274 3223 In an Internet failure case, the #1 suspect is a constant: DNS. "Oh, the cruft.", egrep -v '^$|^.*#' ¯\_(ツ)_/¯ :-) [How to ask smart questions: http://www.catb.org/~esr/faqs/smart-questions.html]

On 11/22/24 00:37, Odhiambo Washington via Mailman-users wrote:
On Fri, Nov 22, 2024 at 9:59 AM Tobias Diekershoff <tobiasd@fsfe.org> wrote:
Are Mailman3/Postorius and Hyperkitty supposed to run in separate databases, or are they supposed to use the same database?
...
The official documentation specifies two DBs - one for Mailman Core and the other for Mailmanweb. As far as I know, there are no conflicts in the tables so you could use the same DB for both.
And the mailmanweb DB is for HyperKitty and Postorius. I know of no doc suggesting separate databases for HyperKitty and Postorius and in fact accomplishing that would require separate Django instances for HyperKitty and Postorius.
I.e., Postorius, HyperKitty and django-mailman3 are all Django applications and use the database configured in Django's settings. Mailman core is separate and uses the database configured in mailman.cfg.
As Odhiambo says, there are no table conflicts between these databases so they can be either the same database or separate databases as you wish.
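For illustration only, here is a minimal sketch of that split; the database names, users and passwords below are placeholders, not recommendations:

In mailman.cfg (Mailman core):
[database]
class: mailman.database.postgresql.PostgreSQLDatabase
url: postgresql://mailman:PASSWORD@localhost/mailman

In the Django settings used by Postorius/HyperKitty/django-mailman3:
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mailmanweb',   # may be the same database as core, or a separate one
        'USER': 'mailman',
        'PASSWORD': 'PASSWORD',
        'HOST': 'localhost',
    }
}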
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

On 22.11.24 17:54, Mark Sapiro wrote:
As Odhiambo says, there are no table conflicts between these databases so they can be either the same database or separate databases as you wish.
Thank you both for the clarification, Odhiambo and Mark!
-Tobias
-- Tobias Diekershoff >>> System Hacker Free Software Foundation Europe Schönhauser Allee 6/7, 10119 Berlin, Germany | t +49-30-27595290 Registered at Amtsgericht Hamburg, VR 17030 | fsfe.org/support OpenPGP-Key ID ... 0x25FE376FF17694A1 Fingerprint ...... 23EE F484 FDF8 291C BA09 A406 25FE 376F F176 94A1

Following the root cause analysis which Tobias started all those weeks ago, we (the admin team behind the installation) are still struggling with the following phenomenon: messages on average take more than 24 hours to be processed; more precisely, the average lifetime of a pickled message object in /var/lib/mailman3/queue/pipeline clocks in at around 26 hours.
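(For reference, a rough sketch of how such an average can be computed; the path is the Debian default and the script is only an illustration, not our exact tooling:)

#!/usr/bin/env python3
import os
import time

# Assumption: Debian default queue location for the pipeline runner.
QUEUE_DIR = '/var/lib/mailman3/queue/pipeline'

now = time.time()
ages = [now - entry.stat().st_mtime
        for entry in os.scandir(QUEUE_DIR)
        if entry.name.endswith('.pck')]
if ages:
    print(f'{len(ages)} queued messages, '
          f'average age {sum(ages) / len(ages) / 3600:.1f} hours')
else:
    print('pipeline queue is empty')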
A couple of stats about the installation: standard installation from the Bookworm OS repos, HyperKitty as archiver and Postorius as web frontend as explained above, running Python 3.11.2 from the standard systemd service as packaged with the Bookworm deb file. All backends (Core + HyperKitty) are backed by Postgres version 15+248. The MTA is a standard Postfix installation, again from OS packages.
The underlying VM has 7 cores with just under 24 GB of main memory. This production instance is handling fewer than 130 mailing lists (MLs) with an average of less than 10 postings per day per ML. CPU core utilisation hovers around 50%, with the lion's share allocated to the four pipeline runners as part of the MM configuration.
OS resource utilisation is well within bounds (approx. 8 GB of main memory allocated to running processes), with plenty of available socket space. (I noticed some transient "Connection lost during _handle_client()" warnings in the logs, so I checked that the SMTP runner can connect to Postfix for delivering the messages after processing by checking the socket allocation of the running processes.)
A cursory review of the corresponding Core classes (runner + pipeline implementation, in addition to posting_pipeline) didn't reveal any further pointers. What I did notice, though, is that increasing the logging levels of the components (namely smtp, pipeline and lmtp to debug) in /etc/mailman3/mailman.cfg didn't add any useful information to the logs after restarting the Core.
As outlined above, HyperKitty doesn't seem to do a check based on ML and message ID before archiving a message in the database. But this should only add a REST roundtrip and Postgres communication (through HyperKitty's underlying Django framework) to the overall system load, and the driving uWSGI instance is well within CPU cycle bounds.
Any pointers are appreciated - more than happy to provide more info if required.

Hi monochromec. I have a similar installation and was also having various problems and delays. I'm on a similarly sized system (Ubuntu 24.04 with venv installations of mailman3; 32GB memory).
What I found was that tuning the resources made a very significant impact. Out-of-the-box tuning parameters for the various components were not sufficient to keep mail flowing.
The "system info" tab in the web interface tells you where things are held up, if they are. I was having trouble with the 'out' queue getting slow, and the postfix 'mailq' was showing thousands of messages waiting to be delivered. I was running out of smtpd processes. Archiving was delayed (though not as much as yours).
Some of the things I did:
- Increase PostgreSQL resource limits. Some of the parameters in my configuration files:
Tuning, via https://pgtune.leopard.in.ua/ for 'data warehouse' app.
max_connections = 100 # default
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
# wal_buffers = 16MB
default_statistics_target = 500
effective_io_concurrency = 2
work_mem = 26214kB
min_wal_size = 4GB
max_wal_size = 16GB
max_parallel_workers_per_gather = 4
max_parallel_maintenance_workers = 4
- Increase the number of mailman processes in mailman.cfg
Note: values need to be a power of 2. Make sure PostgreSQL allows more connections than the number of instances.
[runner.in]
class: mailman.runners.incoming.IncomingRunner
instances: 16
# instances: 8
[runner.out]
class: mailman.runners.outgoing.OutgoingRunner
instances: 64
The "out" setting, notably, decreased mail delivery times.
- Increase postfix resources. Notably in main.cf:
https://www.postfix.org/TUNING_README.html
# default_process_limit = 100  # default
default_process_limit = 250
Maybe these will point you in the right direction. I realize you are focused on HyperKitty, and my experience was that all the major components (database, MTA and mailman3) needed more resources. ~ Greg

On 12/28/24 05:19, monochromec via Mailman-users wrote:
Following the root cause analysis which Tobias started all those weeks ago, we (the admin team behind the installation) are still struggling with the following phenomenon: messages on average take more than 24 hours to be processed; more precisely, the average lifetime of a pickled message object in /var/lib/mailman3/queue/pipeline clocks in at around 26 hours.
This issue is with Mailman core's pipeline runner. See https://docs.mailman3.org/projects/mailman/en/latest/src/mailman/docs/8-mile... for an overview of message handling and queues in Mailman core.
It seems that the pipeline runner(s) are taking an inordinately long time to process a message through the pipeline of handlers causing the pipeline queue to become backlogged.
Couple of stats of the installation: standard installation from Bookworm OS repos, Hyperkitty as archiver and Postorius as web frontend as explained above, running Python 3.11.2 from the standard systemd service as packaged with the Bookworm deb file. All backends (Core + Hyperkitty) are supported by Postgres version 15+248. The MTA is a standard Postfix installation, again from OS packages.
The primary support for Debian packages is Debian. See https://wiki.list.org/x/12812344
The underlying VM has 7 cores with just under 24 GB of main memory. This production instance is handling less than 130 mailing list (MLs) with an average of less than 10 postings per day per ML. CPU core utilisation hovers around 50% with the lion share allocated to the four pipeline runners as part of the MM configuration.
This is an average of less than 1 message per minute. Processing a message through the pipeline should take nowhere near a minute. Possibly there is an issue in the rfc_2369.py handler. This handler communicates with HyperKitty via mailman_hyperkitty to obtain values for the List-Archive: and Archived-At: headers added to the message. Possibly there is some configuration issue with mailman_hyperkitty that causes these queries to take a long time and possibly eventually time out.
Are there any "Connection to HyperKitty failed:", "HyperKitty failure on ..." or "Invalid response from HyperKitty" type messages in mailman.log?
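One crude way to see whether those requests are slow at all is to time a plain HTTP request against the base_url configured in mailman-hyperkitty.cfg, for example something like the following (the URL below is a placeholder):

#!/usr/bin/env python3
import time
import urllib.request

# Placeholder: use the base_url from /etc/mailman3/mailman-hyperkitty.cfg.
BASE_URL = 'http://127.0.0.1/hyperkitty/'

start = time.monotonic()
with urllib.request.urlopen(BASE_URL, timeout=30) as resp:
    status = resp.status
    resp.read()
print(f'HTTP {status} in {time.monotonic() - start:.2f} seconds')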
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thanks again for the speedy support.
The only (occasional) error message is that archiving failed ("archiving failed, re-queuing (mailing-list ..."), but none of the ones Mark mentioned above. What is suspicious, though, is that uwsgi is at the top of the list of CPU slice consumers, which may point at a problem with the REST-based communication between HK and MM.
Postgres only consumes roughly 5% of the available CPU bandwidth, but I'll review Greg's suggestions to see if they make any difference.

In addition I have the following questions:
- How do I turn on verbose logging for the pipeline runner (as the changes in mailman.cfg didn't have any effect, as outlined above)?
- Is there any alternative means to monitor the pipeline runner activities other than logs and monitoring the

On Mon, Dec 30, 2024 at 4:18 PM monochromec via Mailman-users < mailman-users@mailman3.org> wrote:
In addition I have the following questions:
- How to turn on verbose logging for the pipeline runner (as the changes in mailman3.cfg didn't have any effects as outlined above)?
From https://gitlab.com/mailman/mailman/-/blob/master/src/mailman/config/schema.c... . So I guess you can do:
[logging.runner]
level: debug
-- Best regards, Odhiambo WASHINGTON, Nairobi,KE +254 7 3200 0004/+254 7 2274 3223 In an Internet failure case, the #1 suspect is a constant: DNS. "Oh, the cruft.", egrep -v '^$|^.*#' ¯\_(ツ)_/¯ :-) [How to ask smart questions: http://www.catb.org/~esr/faqs/smart-questions.html]

Further RCA reveals that the archive runner tries to archive the same message over and over again - hence the overhead during pipeline processing. The total number of attempts reaches well into five digits for a single message.
Before I dive into the Core's code base: any idea what's causing this?
@Greg: Although not applicable in this special case, the pointers you provided were a well-needed refresher on Postgres optimisation. :-)

On 12/31/24 06:07, monochromec via Mailman-users wrote:
Further RCA reveals the fact that the archiver runner tries to archive the same message over and over again - hence the overhead during pipeline processing. The total # of attempts reaches well into five digits for a single message.
This is not directly related to the pipeline runner issues but they may have the same underlying cause. All the pipeline runner does is put the message in the archive queue. It's the archive runner that is doing all these retries.
However, the pipeline runner in processing the rfc_2369.py handler is also communicating with hyperkitty to get the values for the List-Archive: and Archived-At: headers.
You need to focus on the communication between mailman_hyperkitty and HyperKitty. If there are significant delays there, it could explain your pipeline runner issue.
Btw, I see https://gitlab.com/mailman/mailman/-/issues/1189 is also from you. I wondered at the time why you were trying to increase the number of pipeline runners, which is not something that would normally be required.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan

Thanks for your input and happy new year to you!
The logs at debug level prove to be inconclusive: the pipeline runner is handing the message over to the archive runner, which in turn forwards the message to HyperKitty. HyperKitty, not checking whether a message with the same ML name and msgid is already present in the DB, forwards this to Postgres for storage.
The following questions remain to further the RCA:
- How to obtain additional logging information beyond the settings outlined below [1]?
- Why does the archive runner forward a message to HyperKitty if it has already done so in the past? Or is this another try because the message hasn't been archived successfully? And how do I check this from the archive runner's perspective if the archiver only logs the entry "HyperKitty archived message <msgid> to the HyperKitty REST URL"? (See the sketch below this list for one way to check from the database side.)
- Any additional insights you might be able to share (additional logging configuration, etc.) would be greatly appreciated!
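(The database-side check mentioned above, as a sketch: run it in a Django shell for the mailman-web project, however manage.py is invoked on your installation. Model and field names are HyperKitty's; the message-id is a placeholder.)

# Quick duplicate check for a single message in HyperKitty's database.
from hyperkitty.models import Email

# Placeholder message-id; check how it is stored (with or without the
# surrounding angle brackets) in your database.
msgid = 'some-message-id@example.org'
print(Email.objects.filter(message_id=msgid).count())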
[1] mailman.cfg excerpt:
[runner.pipeline] instances: 4
[runner.out] instances: 1
[logging.smtp]
#level: debug
[logging.database]
#level: debug
[logging.runner]
#level: debug
[logging.pipeline]
level: debug
qualname: pipeline
propagate: 0
[logging.lmtp]
level: debug
path: lmtp.log
#
[logging.archiver]
level: debug
[logging.mischief]
#level: debug
[logging.plugins]
#level: debug
mailman-web.py excerpt:
LOGGING = {
    'version': 1,
    'loggers': {
        'hyperkitty': {
            'level': 'DEBUG',
            'propagate': True
        }
    }
}
The GitLab issue you mentioned I raised because of my incorrect configuration of the number of runner instances: I suspected that a single pipeline runner might not be sufficient for the load, hence my attempt at increasing it.

Hey all,
thank you again for your earlier input - I wanted to update you all on the situation.
On 22.11.24 07:59, Tobias Diekershoff wrote:
so far we could not find the reason behind the problem.
In the meantime we have identified a symptom and could remove it, but not the root of the issue.
To rule out virtualization problems in our setup, we tried to migrate MM3 to a bare metal server in the data center. After importing the database for MM3 and Hyperkitty to the new postgresql server, MM3 complained about:
MultipleResultsFound: Multiple rows were found when one or none was required
With that error message (which does not occur on the running server) we found an old thread here on the mailing list [1] (issue 657), and indeed we have multiple duplicates per mailing list on our server. Not just one as in the linked thread, but some thousands (up to 150k), varying per mailing list. Removing them returned the delivery of mails to the lists back to normal.
However, the duplicates are added back to the listarchiver table for all archivers (mhonarc, hyperkitty, mail-archive and prototype) even though only hyperkitty is activated. We tried to add a constraint on the table to only allow one entry per list and archiver, but that caused the runner to fail and the message to be shunted:
Uncaught runner exception: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "workaround" DETAIL: Key (mailing_list_id, name)=(98, mhonarc) already exists.
Hence we removed the constraint and now have a cron job running instead that cleans out the duplicate database entries for the archivers.
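(For reference, the cleanup boils down to keeping one row per (mailing_list_id, name) pair; the following is only a sketch of that kind of job with placeholder connection details, not our exact cron script:)

#!/usr/bin/env python3
import psycopg2

# Placeholder DSN; point it at the Mailman core database.
DSN = 'dbname=mailman user=mailman'

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        # Keep the row with the lowest id per (mailing_list_id, name) pair.
        cur.execute("""
            DELETE FROM listarchiver a
            USING listarchiver b
            WHERE a.mailing_list_id = b.mailing_list_id
              AND a.name = b.name
              AND a.id > b.id
        """)
        print(f'removed {cur.rowcount} duplicate rows')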
I've checked that the workaround patch mentioned in issue 657 at gitlab [2] is in place. But it does not seem to affect the issue we face.
Should I update issue 657 with our problem/findings, or open a new issue for them?
What puzzles us is that MM3 on the new server complained about these duplicates, whereas the running installation does not. They are "the same" (same underlying Debian OS, same installed packages, config copied over).
Best, Tobias
1: https://lists.mailman3.org/archives/list/mailman-users%40mailman3.org/thread...
2: https://gitlab.com/mailman/mailman/-/issues/657#note_1846487131
-- Tobias Diekershoff >>> System Hacker Free Software Foundation Europe Schönhauser Allee 6/7, 10119 Berlin, Germany | t +49-30-27595290 Registered at Amtsgericht Hamburg, VR 17030 | fsfe.org/support OpenPGP-Key ID ... 0x25FE376FF17694A1 Fingerprint ...... 23EE F484 FDF8 291C BA09 A406 25FE 376F F176 94A1

Tobias Diekershoff writes:
On 22.11.24 07:59, Tobias Diekershoff wrote:
so far we could not find the reason behind the performance problem.
Since you're already running two allegedly identical systems, have you tried hitting the experimental one with the "do what the developers recommend" hammer? That is, blow away the Debian packages in favor of a vanilla installation from source in a virtual environment (and without --system-site-packages!)? It's impossible to be sure what's going on in a Debian package environment because not only Mailman, but any of the scores of packages that Mailman 3 depends on may have been patched or reconfigured by Debian.
To rule out virtualization problems
Do you mean a VM, which should not cause a problem, or a container, which is known to be tricky to configure? (Note, I've not heard of it causing the kinds of problems you describe.)
in our setup, we tried to migrate MM3 to a bare metal server in the data center. After importing the database for MM3 and Hyperkitty to the new postgresql server MM3 complained about
MultipleResultsFound: Multiple rows were found when one or none was required [in the listarchiver table]
I've seen that in a migration from a Mailman 2 site to Mailman 3. Once I removed the duplicate rows in the archiver table, it stopped. I don't know why it happened, but it was a small number of lists (4 or 5 out of 20k), and at most one extra per (list, archiver) pair. I suspect it carried over from the Mailman 2 instance. I no longer have access to that system to confirm, sadly.
However the duplicates are added back to the listarchiver table for all archivers (mhonarc, hyperkitty, mail-archive and prototype) even though only hyperkitty is activated.
By "duplicates are added back" do you mean something like this:
mailman=> select * from listarchiver;
 id | mailing_list_id | name         | _is_enabled
----+-----------------+--------------+-------------
  1 |               1 | mhonarc      | f
  2 |               1 | mail-archive | f
  3 |               1 | prototype    | t
  4 |               1 | hyperkitty   | t
  5 |               2 | mhonarc      | f
  6 |               2 | mail-archive | f
  7 |               2 | prototype    | t
  8 |               2 | hyperkitty   | t
where each list has one of each archiver whether or not that archiver is active? This is correct behavior, as long as the archiving/*.py and mailman_hyperkitty.py files are importable by Mailman core. Or by "duplicates are added back" do you mean this?
mailman=> select * from listarchiver;
 id | mailing_list_id | name         | _is_enabled
----+-----------------+--------------+-------------
  1 |               1 | mhonarc      | f
  2 |               1 | mhonarc      | f
  3 |               1 | mail-archive | f
  4 |               1 | mail-archive | f
  5 |               1 | prototype    | t
  6 |               1 | prototype    | t
  7 |               1 | hyperkitty   | t
  8 |               1 | hyperkitty   | t
  9 |               2 | mhonarc      | f
 10 |               2 | mhonarc      | f
 11 |               2 | mail-archive | f
 12 |               2 | mail-archive | f
 13 |               2 | prototype    | t
 14 |               2 | prototype    | t
 15 |               2 | hyperkitty   | t
 16 |               2 | hyperkitty   | t
(don't take the id -> (list_id, archiver) mapping too seriously). In the multiple instances of an archiver per list case, which are enabled?
We tried to add a constraint on the table to only allow one entry per list and archiver, but that caused the runner to fail and the message to be shunted.
Uncaught runner exception: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "workaround" DETAIL: Key (mailing_list_id, name)=(98, mhonarc) already exists.
This doesn't make any sense. That appears to be the listarchiver table. As far as I can tell in a quick grep, the only place that is mutated is in mailman/models/mailinglist.py:643 (line number is current HEAD), which happens whenever the list checks for its active archivers. It's initialized lazily for each list in that place.
@public
@implementer(IListArchiverSet)
class ListArchiverSet:
    @dbconnection
    def __init__(self, store, mailing_list):
        self._mailing_list = mailing_list
        system_archivers = {}
        for archiver in config.archivers:
            system_archivers[archiver.name] = archiver
        # Add any system enabled archivers which aren't already associated
        # with the mailing list.
        for archiver_name in system_archivers:
            exists = store.query(ListArchiver).filter(
                ListArchiver.mailing_list == mailing_list,
                ListArchiver.name == archiver_name).one_or_none()
            if exists is None:
                store.add(ListArchiver(mailing_list, archiver_name,
                                       system_archivers[archiver_name]))
As you can see, the .add is guarded by the "if exists is None" check. That code has not changed since v3.3.8, which is what is in Bookworm according to "apt list mailman3".
If that is so in your installation, I'm pretty sure the problem is somewhere between the SQLAlchemy API and the RDBMS, not in Mailman itself. And if that's true, that might explain the weirdness in archiving as well.
What version of PostgreSQL are you using? Of SQLAlchemy? Are both the original system with performance problems and the new one accessing the same PostgreSQL server, or is there a separate PostgreSQL instance for each?
Steve

Stephen J. Turnbull writes:
This doesn't make any sense. That appears to be the listarchiver table. As far as I can tell in a quick grep, the only place that is mutated is in mailman/models/mailinglist.py:643 (line number is current HEAD), which happens whenever the list checks for its active archivers. It's initialized lazily for each list in that place.
HOWEVER, if you DO manage to populate that table with thousands of duplicate entries, we do not check for unique entries when we fetch the list of archivers:
@property
@dbconnection
def archivers(self, store):
    entries = store.query(ListArchiver).filter(
        ListArchiver.mailing_list == self._mailing_list)
    yield from entries
which could explain archiving one message thousands of times.
(You shouldn't be able to get there without passing through the constructor __init__ though, which should be checking. It apparently does check according to your report ... my head hurts.)
Steve
participants (6)
- Greg Newby
- Mark Sapiro
- monochromec
- Odhiambo Washington
- Stephen J. Turnbull
- Tobias Diekershoff