The key difference between Stephen Daniel's configuration and the one I am running is the amount of memory: 512 MB in my case versus 2 GB in his. Everything is fine (e.g. less than 60% of memory in use) all day long except when Mailman3 is starting and stopping. I see no evidence of the Out Of Memory reaper getting involved, but there is some swapping. For example, the kswapd0 process consumes anywhere from 30 seconds to a minute and a half of CPU time during the Mailman3 startup phase. Obviously, if a process holding the lock is swapped out, delays and unhappiness will ensue.
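For anyone who wants to put a number on that swapping, here is a minimal sketch, assuming a Linux host where /proc/vmstat exposes the pswpin and pswpout counters (pages swapped in and out since boot). The function names are my own and are not part of Mailman3.

```python
def parse_vmstat(text):
    """Parse /proc/vmstat-style text ("name value" per line) into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def read_vmstat(path="/proc/vmstat"):
    """Read the current counters from the kernel (Linux only)."""
    with open(path) as f:
        return parse_vmstat(f.read())


def swap_delta(before, after):
    """Pages swapped in and out between two snapshots."""
    return {key: after.get(key, 0) - before.get(key, 0)
            for key in ("pswpin", "pswpout")}


# Intended use (not executed here):
#   before = read_vmstat()
#   ... run 'mailman start' and wait for it to finish ...
#   after = read_vmstat()
#   print(swap_delta(before, after))
```

The counters are system-wide, so the result only approximates what Mailman3 itself caused, but on an otherwise quiet machine it should make the startup spike visible.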
I did invest the time for another experiment using a computer I own and
to which I have full physical access. In its BIOS, I disabled
simultaneous multi-threading mode and reduced the number of compute
units to one, so it temporarily behaved as a single-core system. In that
mode, 'mailman start' completes promptly with about 10 seconds of 100%
CPU utilization. Attempting to reduce that system's memory to 1 gigabyte
with 'chmem --disable ...' made no appreciable progress in an hour, so I
gave up on exploring whether 1 gigabyte or 768 megabytes or ... would be
the magic minimum amount of memory to prevent these failures.
I have not noticed any documentation stating minimum resource requirements for running Mailman3. Even if we found accurate answers under each operating system today, that knowledge will expire at some future date as other parts of the system may become bloated, even if Mailman3's appetite for resources does not grow.
Since the lockfile contains the process ID of the runner holding the lock, how painful would it be, when an attempt to claim a held lock fails, to detect whether the owning process is runnable or swapped out, across the supported operating system platforms? The question would then be whether Mailman3 merely informs the user "cannot run here, give me more memory", or whether sending a signal (e.g. SIGCONT?) to the owning process might accelerate how soon that process is swapped in so that constructive progress resumes.
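As a rough illustration of the Linux side of that check only, here is a sketch that assumes the owner's PID has already been extracted from the lockfile (I am not assuming anything about the lockfile's layout here). It reads /proc/<pid>/status. Linux swaps pages rather than whole processes, so "swapped out" is approximated by a non-zero VmSwap figure together with the one-letter State field. The function names are hypothetical, and other platforms would need their own equivalent.

```python
import os


def parse_status(text):
    """Parse /proc/<pid>/status-style text ("Key:\\tvalue" per line) into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def describe_lock_owner(pid, proc_root="/proc"):
    """Return (state_letter, swapped_kb) for pid, or None if no such process.

    state_letter is the first character of the State field, e.g. 'R' (running),
    'S' (sleeping), 'D' (uninterruptible sleep, often waiting on I/O),
    'T' (stopped). swapped_kb is the VmSwap value, or 0 if the field is absent.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as f:
            fields = parse_status(f.read())
    except (FileNotFoundError, ProcessLookupError):
        # The process exited, or this is not a Linux-style /proc.
        return None
    state = fields.get("State", "?")[:1]
    swapped_kb = int(fields.get("VmSwap", "0 kB").split()[0])
    return state, swapped_kb
```

A caller that fails to claim the lock could log the result, for instance reporting that the owner is in state D with many kB in swap, which would at least point the user at memory pressure rather than at a stale lock.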
On 6/19/23 13:51, Stephen Daniel wrote:
I'm currently running the entire mailman3 stack, including postgres and postfix, plus my production web server, all on a GCP e2-small instance. e2-small is two virtual CPUs running on a single physical core, restricted to consuming at most 50% of the CPU cycles on that core. The system has 2 GB of memory and 10 GB of disk space.
The entire stack runs well, albeit slowly, on this configuration, for which I pay about $13/month. The entire stack is shut down every night so I can take a snapshot while the applications are down. Once the snapshot is complete, the stack is restarted. I have no startup issues in this configuration.
It is hard to imagine a more limited pool of resources, yet everything works.