Dear Steve:
I really appreciate that you also took the time to provide more insight into how Mailman works and how its different components interact with each other.
There are two "minor" tasks and one major task that we need to accomplish within PostgreSQL. The minor ones are rebuilding the index behind the hyperkitty_email table's primary key and rebuilding another index on the very same table, which supports a uniqueness constraint. The major one is running vacuum full on the whole database to effectively reduce the size of precisely the hyperkitty_email table, which is the largest table in this database, and, if possible, to give back to the SAN the extra space we had to add to cope with that table's growth.
The lock that vacuum full takes on the tables is, among all the locking mechanisms PostgreSQL provides to guarantee transactional integrity, one of the strongest, if not the strongest (as far as I know, the tables are more or less recreated when running vacuum full).
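To make the three tasks concrete, here is a minimal sketch of the statements we would expect to run, in order. It only builds and prints the SQL; it does not connect to the database. The index names are assumptions for illustration (hyperkitty_email_pkey follows PostgreSQL's default naming for a primary key index, and hyperkitty_email_unique_idx is a purely hypothetical placeholder for the index behind the uniqueness constraint), so the real names would have to be taken from the catalogue first.

```python
from typing import List


def maintenance_statements(
    table: str, indexes: List[str], whole_database: bool = True
) -> List[str]:
    """Return the SQL statements for the offline maintenance, in order."""
    # The two "minor" tasks: rebuild each index in place.
    statements = [f'REINDEX INDEX "{name}";' for name in indexes]
    # The major task: VACUUM FULL without a table name processes every
    # table in the current database that the user is allowed to vacuum.
    if whole_database:
        statements.append("VACUUM FULL;")
    else:
        statements.append(f'VACUUM FULL "{table}";')
    return statements


if __name__ == "__main__":
    # Both index names below are assumptions, not verified against our schema.
    plan = maintenance_statements(
        "hyperkitty_email",
        ["hyperkitty_email_pkey", "hyperkitty_email_unique_idx"],
    )
    for statement in plan:
        print(statement)
```

Passing whole_database=False would restrict the vacuum to the single table, which might be enough if only hyperkitty_email needs to shrink.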
The idea we were considering was, more or less, to stop all of the components, then start up only PostgreSQL, and then perform each of the tasks mentioned above. Actually, our PG release allows "online" index rebuilds (drop + rebuild concurrently), but we did not want to take the risk that, for whatever reason, new records were added to the table that break the uniqueness of either constraint (fixing such cases turns out to be a real pain most of the time). Alternatively, we could lock the table ourselves to get more or less the same effect, true, although in the end the impact on any connections trying to run DML would be the same. In other words, rather than having lots of transactions from either the core component processes or the web component processes fail, we prefer to stop everything. But it has, of course, the worst downside: we would receive
Let me now go back to your kind explanation to check whether I understood things correctly. You mention that, regardless of the connectivity that Mailman's core processes might have with the database, the emails will be received and queued until the connection is back and they can be delivered to the distribution lists' members. Could you please confirm whether this task would actually be delegated to Postfix (or whichever MTA is integrated with Mailman)? If so, this is exactly what we need: we could afford to delay reception by a couple of days, as long as we are sure that no mail will be lost.
Once more, thanks a lot for your kind help. Best regards.
-----"Stephen J. Turnbull" <[1]stephenjturnbull@gmail.com> wrote: -----
To: [2]eugenio.jordan@esa.int
From: "Stephen J. Turnbull" <[3]stephenjturnbull@gmail.com>
Date: 08/24/2021 05:33PM
Cc: [4]mailman-users@mailman3.org
Subject: [MM3-users] Mailman backend maintenance task

Hi, Eugenio,

I wrote this earlier but am in the middle of moving my office so my mail has been intermittent. I have been following your discussion with Abhilash, and he's definitely the expert, especially if you are using the containers he creates and distributes. But there's some information here he hasn't mentioned yet.

Please consider the following discussion to be background that allows you to understand some of the considerations. I've only run the Mailman 3 suite with all three running constantly on the same host, so I have no experience with this kind of issue.

[5]eugenio.jordan@esa.int writes:

> Our customer is currently using PostGRESQL as backend, and we would
> like to perform some maintenance tasks, namely running vacuum full,
> or at least trying to rebuild hyperkitty_email primary key related
> index. We have been asked on the real impact of putting in place
> such initiative. Though the latter is related to archiving, I
> haven't found a way to stop just Hyperkitty or Django related
> processes other than stopping Mailman's core, hence preventing
> mails addressed to distribution lists from being delivered, could
> you please confirm if I am correct?

Mailman core can certainly run without either Postorius or HyperKitty. Controlling Mailman core (moderation, helping users) without Postorius is annoying, but it can be done. If you stop the HyperKitty process, what should happen, I believe, is that posts for archiving will accumulate in the 'archive' queue until HyperKitty is available again.

It's been a while since I studied this so I could be completely wrong, but as I understand it HyperKitty and Postorius are not daemon processes. Rather, they are WSGI applications, which means they are subprocesses spawned by your webserver, and controlled by it. In order to keep them from running, you would reconfigure the webserver to not call those WSGI applications. How that is done is specific to the webserver and the WSGI module. If you are running from Docker containers, most likely, you can just stop their containers.

The larger problem is that Mailman core uses a RDBMS. Normally both Django and Mailman are referring to the same RDBMS, PostgreSQL in your case. I'm not familiar with the vacuum operation; if it requires taking the whole RDBMS down, and Mailman is running on that RDBMS, you're out of luck. Mailman can accept posts and queue them, but it can't deliver them to subscribers without access to the RDBMS tables.

If it's just a matter of locking some tables while other are available, then it should work because the tables used by core and HyperKitty are disjoint as far as I know. (I think Postorius and HyperKitty both use Django's user tables for authorization, so at least for those tables both Postorius and HyperKitty will have to be down, but core can continue running because its database is completely separate.)

> Regarding the former, as far as I have read, the "mappings" lists
> -> addresses are stored just in the database, so if we run some
> kind of procedure or task like vacuum which will lock exclusively
> tables, or want anyway to have the database stopped for a cold
> backup or whatever, Mailman will not work, that is, again the
> mails addressed to the distribution lists will not be
> delivered. Will you please confirm this point, too?

That depends on how "vacuum" works. If you can tell the RDBMS not to lock Mailman core's tables, it can continue to run. Definitely if the database is not running, Mailman will continue to receive and store the posts, but it won't be able to distribute mail to subscribers until its tables are available again. At that point the queued posts will be processed normally, except that if there's a large backlog, they won't go out in a quick burst, it may take a while.

One possible worry, depending on how you are connected to the Internet, is that your provider may interpret the sudden burst of outgoing mail when Mailman comes back on line as spam or some other sort of mischief. Mailman has no way to throttle this: it feeds messages to the MTA until the MTA says "stop". Then it waits until the MTA is ready again.

Regards,
Steve
References
- mailto:stephenjturnbull@gmail.com
- mailto:eugenio.jordan@esa.int
- mailto:stephenjturnbull@gmail.com
- mailto:mailman-users@mailman3.org
- mailto:eugenio.jordan@esa.int