On 12/14/22 03:31, Mark Sapiro wrote:
> On 12/13/22 10:38, Dan Caballero wrote:
>> Hi Mark,
>> I ran the commands and it took about 35 seconds for the loop to run.
>> Here's the result.
>> count 4168
> So that's at least part of the issue. How does 35 seconds compare to the length of time to process one moderated message through Postorius?
Probably should've read more of the thread before my previous reply ;-)
So, the issue _could_ be due to latency to the remote database, versus the local database that we have on m.p.o. Mailman 3 is not very efficient in the total number of database queries it makes per operation, which is something I have been (very slowly) tracking and fixing on a case-by-case basis.
For example, handling one held message on m.p.o made about 1.1k PostgreSQL calls in total, and even though each took only a few hundred microseconds, they added up to roughly 2 seconds average response time on the handle-message API endpoint. If you handle more than one, it scales linearly.
My suspicion is that in your case, Dan, because the database is a remote instance, the latency per call is higher (my guess is on the order of a few milliseconds to tens of milliseconds), and then, depending on the number of entries you have in the pendings and pendedkeyvalue tables (with some filters, not _all_ entries), it adds up to a high value.
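A rough back-of-the-envelope sketch of that arithmetic, in Python. The 1.1k query count is the m.p.o figure quoted above; the per-query latencies are assumed values for illustration only, and the real query count on a different installation may differ. The helper name is hypothetical.

```python
def estimated_wait_seconds(num_queries: int, latency_seconds: float) -> float:
    """Time spent waiting on the database if queries run one after another."""
    return num_queries * latency_seconds


# "1.1k" calls per held message, as measured on m.p.o (from this thread).
QUERIES_PER_HELD_MESSAGE = 1100

# Assumed per-query latencies; these are guesses, not measurements.
ASSUMED_LATENCIES = [
    ("local, 0.3 ms", 0.0003),
    ("remote, 5 ms", 0.005),
    ("remote, 30 ms", 0.030),
]

for label, latency in ASSUMED_LATENCIES:
    total = estimated_wait_seconds(QUERIES_PER_HELD_MESSAGE, latency)
    print(f"{label}: about {total:.1f} s of database wait per held message")
```

The point is only that the same number of round trips goes from a fraction of a second to tens of seconds once each round trip costs milliseconds instead of microseconds.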
The number of database calls here is of order n^2, where n is the number of entries in pendings of type "held message" and "data", together with their linked rows in the pendedkeyvalue table.
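To illustrate how a query count can grow like n^2, here is a toy model using an in-memory SQLite database. The schema is a simplified, hypothetical stand-in for the pendings and pendedkeyvalue tables, and neither lookup function is Mailman's actual code or the change in my MR; the join is just one common way to collapse a per-row loop into a single query.

```python
import sqlite3


class CountingDB:
    """In-memory database that counts the queries issued through query()."""

    def __init__(self, n: int):
        self.queries = 0
        self.conn = sqlite3.connect(":memory:")
        # Hypothetical, simplified schema for illustration only.
        self.conn.execute(
            "CREATE TABLE pendings (id INTEGER PRIMARY KEY, type TEXT)")
        self.conn.execute(
            "CREATE TABLE pendedkeyvalue (pended_id INTEGER, key TEXT, value TEXT)")
        for i in range(n):
            self.conn.execute(
                "INSERT INTO pendings VALUES (?, 'held message')", (i,))
            self.conn.execute(
                "INSERT INTO pendedkeyvalue VALUES (?, 'id', ?)", (i, str(i)))

    def query(self, sql, params=()):
        self.queries += 1
        return self.conn.execute(sql, params).fetchall()


def find_pending_scan(db, request_id):
    """One query for the list of pendings, then one more query per pending."""
    match = None
    pendings = db.query(
        "SELECT id FROM pendings WHERE type = 'held message' ORDER BY id")
    for (pended_id,) in pendings:
        rows = db.query(
            "SELECT value FROM pendedkeyvalue WHERE pended_id = ? AND key = 'id'",
            (pended_id,))
        if rows and rows[0][0] == str(request_id):
            match = pended_id
    return match


def find_pending_join(db, request_id):
    """The same lookup expressed as a single joined query."""
    rows = db.query(
        "SELECT p.id FROM pendings p "
        "JOIN pendedkeyvalue kv ON kv.pended_id = p.id "
        "WHERE p.type = 'held message' AND kv.key = 'id' AND kv.value = ?",
        (str(request_id),))
    return rows[0][0] if rows else None


n = 20
scan_db, join_db = CountingDB(n), CountingDB(n)
for request_id in range(n):
    find_pending_scan(scan_db, request_id)
    find_pending_join(join_db, request_id)
print("scan:", scan_db.queries, "queries; join:", join_db.queries, "queries")
```

In this toy model, each scan lookup costs n + 1 queries, so looking up all n requests costs n * (n + 1), while the joined version costs one query per lookup.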
My MR makes it so that we don't need to scan all entries of "type"="data", so that part becomes constant time. The pendings of type "held message" will be limited to the held messages in a single MailingList (compared to all mailing lists today), so how much it helps depends on how held messages are distributed across lists. I am also exploring ways to make that second query constant time as well.
-- thanks, Abhilash Raj (maxking)