Hi Darren,
On 3/8/18 9:43 AM, Darren Smith wrote:
Hello all,
We had an old installation of Mailman 2.1 across three servers with over 32,000 mailing lists on them. Due to circumstances beyond my control, we had to shut them down and quickly set up Mailman 3 on a new server.
We have Mailman3 set up on a single server this time, as we were having problems with a 3-server setup and Hyperkitty and archives. Long story short, it is up and running, and our users are starting to come back.
However, we do not have very many users hitting Postorius and Hyperkitty yet, and we are seeing a distressing number of 504 gateway timeouts.
We noticed that there is a significant number of queries to the API whenever hitting Postorius's front page to load the mailing lists, and then on every load. It looks as if it makes a single call to the API to get information about a single list, and does this 10 times - once for each list - to render the page. This gets much worse when users set the view to 200 per page, as they are inclined to do with 32,000 mailing lists...
What we've done, actually, is to override the front page and point the users to a different set of static pages where we have all of our lists categorized and searchable, and when they find the list they are looking for, we have a link back to postorius. It has taken a lot of the stress off of the server.
However, I still see where the database at times can run VERY hot.
We have set up this server to run with PostgreSQL. When I do a ps aux | grep postgres, I find a lot of processes that have been spawned to take care of mailman - a dozen or so. However, one of the threads is taking up about 85% of the load, another thread about 14%, and little dribbles here and there from the rest of the threads.
Here are the questions:
- Has anyone run into problems like these? I would think that maybe this is an issue having so many mailing lists, but honestly there aren't that many people using the system yet. It is going to ramp up as the users realize that we are live again, but we aren't getting that much traffic.
I can say that, in general, Postorius doesn't really optimize its REST API queries, so loading a single page can trigger many requests to Core and end up slow.
Currently, the largest installation of Mailman 3 that we know of runs at lists.fedoraproject.org, and its administrators would know best how well our web UI scales.
However, I don't think we have anyone actively working on trying to fix this particular issue. Pull Requests and fixes are welcome :)
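As a rough illustration of the request pattern you describe (one API call per list on the page), here is a small, self-contained sketch. It does not use Postorius or mailmanclient code: FakeListAPI, get_list_ids, get_list and get_lists_page are hypothetical names made up for this example, and the class only counts how many requests each approach would issue.

```python
from typing import Dict, List


class FakeListAPI:
    """Hypothetical stand-in for a REST API. It only counts requests."""

    def __init__(self, total_lists: int) -> None:
        self.lists: List[Dict] = [
            {"list_id": f"list{i}.example.com", "member_count": i}
            for i in range(total_lists)
        ]
        self.requests = 0

    def _page(self, page: int, count: int) -> List[Dict]:
        start = (page - 1) * count
        return self.lists[start:start + count]

    def get_list_ids(self, page: int, count: int) -> List[str]:
        # One request that returns only the identifiers for a page.
        self.requests += 1
        return [entry["list_id"] for entry in self._page(page, count)]

    def get_list(self, list_id: str) -> Dict:
        # One request per list: this is the expensive pattern.
        self.requests += 1
        return next(e for e in self.lists if e["list_id"] == list_id)

    def get_lists_page(self, page: int, count: int) -> List[Dict]:
        # One request that returns full details for the whole page.
        self.requests += 1
        return self._page(page, count)


def render_per_list(api: FakeListAPI, page: int, count: int) -> List[Dict]:
    """Fetch the page, then fetch every list individually (1 + N requests)."""
    return [api.get_list(list_id) for list_id in api.get_list_ids(page, count)]


def render_batched(api: FakeListAPI, page: int, count: int) -> List[Dict]:
    """Fetch all details for the page in a single request."""
    return api.get_lists_page(page, count)
```

With 200 lists per page, the per-list version issues 201 requests for one page view while the batched version issues one, which is why the page size setting makes such a difference.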
- Is there any configuration that I can set, especially with regards to the database, so that it can be used more efficiently and not throttle through one or two threads?
There isn't any configuration that would change how databases are used in Mailman Core. Django *may* have some, but I don't know of any off the top of my head.
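One thing that may help with diagnosis: PostgreSQL serves each client connection from its own backend process, so a single expensive query shows up as one hot process rather than being spread across all of them. Below is a minimal sketch for ranking those processes from the output of ps aux. It assumes the usual Linux procps column layout (USER, PID, %CPU, ... COMMAND), and busiest_postgres is a hypothetical helper name, not part of Mailman or PostgreSQL.

```python
from typing import List, Tuple


def busiest_postgres(ps_output: str, top: int = 5) -> List[Tuple[float, int, str]]:
    """Return (cpu_percent, pid, command) for the busiest postgres processes.

    Assumes `ps aux` style lines with 11 columns, the last being COMMAND.
    """
    rows: List[Tuple[float, int, str]] = []
    for line in ps_output.splitlines():
        # Split into at most 11 fields so the command keeps its spaces.
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # skips the header line and anything malformed
        command = fields[10]
        if "postgres" not in command:
            continue
        rows.append((float(fields[2]), int(fields[1]), command))
    # Highest CPU first.
    rows.sort(reverse=True)
    return rows[:top]
```

The command column of a backend normally includes the database user, database name, client address and current activity, so the top entry should point at which connection is doing the heavy work.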
thanks, Abhilash