On Aug 24, 2021, at 11:22 AM, Stephen J. Turnbull <stephenjturnbull@gmail.com> wrote:
Eugenio Jordan (external) writes:
The idea we were considering was more or less stopping all of the components, then starting up only PostgreSQL, and then performing each of the mentioned tasks.
In that case, your Mailman installation will basically be down (no web presence, no deliveries) for the duration of the database maintenance.
You mention that, regardless of whether Mailman's core processes can connect to the database, emails will be received and queued until the connection is back and they can be delivered to the distribution lists' members. Could you please confirm whether this task would actually be delegated to Postfix (or whichever MTA is integrated with Mailman)?
If Mailman core is running, it will accept the mail and store it in its own queue. If not, the MTA will store it in its deferred queue until Mailman comes back up.
The big problem with delegating this to the MTA is that after two days the MTA probably won't retry for whole days at a time, and if you flush the MTA queue to force delivery, it will probably stop other mail delivery for quite a while, since it will be occupied with processing the Mailman backlog.
The way Mailman works is that there is a master process, which manages a suite of runners. Each runner is responsible for a particular stage. The LMTP runner does nothing except accept messages from the MTA, and store each one in a file for the next runner in the chain.
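The handoff described above, where one runner stores each message in a file for the next runner, can be sketched roughly as follows. This is a minimal illustration and not Mailman's actual code: the function names enqueue and dequeue_all, the ".eml" and ".tmp" suffixes, and the directory layout are all assumptions made for the example.

```python
import os
import uuid
from email import message_from_bytes


def enqueue(queue_dir, msg):
    """Store one message as a file in queue_dir and return its path."""
    os.makedirs(queue_dir, exist_ok=True)
    final_path = os.path.join(queue_dir, uuid.uuid4().hex + ".eml")
    tmp_path = final_path + ".tmp"
    with open(tmp_path, "wb") as fp:
        fp.write(msg.as_bytes())
    # Rename only after the write finishes, so the next runner never
    # picks up a half-written file.
    os.replace(tmp_path, final_path)
    return final_path


def dequeue_all(queue_dir):
    """Yield and remove every complete message file in queue_dir."""
    for name in sorted(os.listdir(queue_dir)):
        if not name.endswith(".eml"):
            continue  # skip in-progress .tmp files
        path = os.path.join(queue_dir, name)
        with open(path, "rb") as fp:
            msg = message_from_bytes(fp.read())
        os.remove(path)
        yield msg
```

The point of the sketch is that a stage which only writes files needs no database connection; whether the real LMTP runner is that simple is a separate question.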
Although, looking at the code [1], it seems like the LMTP runner will actually try to verify that the incoming email is destined for a valid email address before it accepts the email.
So, if the database isn’t reachable, I think the LMTP runner will start rejecting emails from the MTA. This check is perhaps there to handle the situation where the transport maps are out of date and Mailman receives an email from the MTA for a list that has been deleted.
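Whether a rejection is temporary or permanent matters a lot here, since an MTA retries after a 4xx reply but bounces after a 5xx reply. The following is a hypothetical sketch of a recipient check that makes that distinction. It is not taken from Mailman, and which reply the real LMTP runner gives when the database is down is untested; rcpt_response, list_exists and DatabaseUnavailable are names invented for the example.

```python
class DatabaseUnavailable(Exception):
    """Raised by the (hypothetical) lookup when the database is down."""


def rcpt_response(address, list_exists):
    """Return an SMTP/LMTP-style reply line for one recipient.

    list_exists is a hypothetical callable: it returns True or False,
    or raises DatabaseUnavailable when it cannot reach the database.
    """
    try:
        known = list_exists(address)
    except DatabaseUnavailable:
        # 4xx = temporary failure: the MTA keeps the message and retries.
        return "451 4.3.0 Temporary failure, try again later"
    if not known:
        # 5xx = permanent failure: the MTA bounces the message.
        return "550 5.1.1 No such mailing list: " + address
    return "250 2.1.5 OK"
```

If the runner behaves like the 451 branch during an outage, mail simply waits in the MTA's deferred queue. If it behaves like the 550 branch, senders would get bounces for valid lists.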
While I haven’t tested this situation, I don’t think relying on Mailman core to queue the mail would work. As far as I can see, holding emails at the MTA level and letting it retry after the database maintenance has been completed might be the only thing that works if the entire PostgreSQL instance has to be brought down.
If the MTA retries continuously for 2 days, would that not be enough time to complete the migration and bring the database back up?
-- thanks, Abhilash Raj (maxking)