Toward improving correctness when running on a single core system ...
I recognize that most real computer systems now are equipped with multiple CPUs, usually with each CPU having multiple cores or hardware threads. But when one obtains a computer system from a service provider for a small expected load, it is likely that the virtual computer system provided will have only a single core. Yes, one can obtain a multi-core virtual system, maybe for only twice the cost; but it would seem wasteful to make that choice solely to support starting and stopping the Mailman3 program, when at all other times the system load is satisfied by a single core.
From my observations, mailman 3.3.8 is unreliable when running on a single core system. When 'mailman start' is issued, there is a moderately high probability that one or more of the thirteen runner processes will not survive this startup phase. Depending upon the current usage of the mailman installation, one may or may not notice this fact. In either case, 'mailman status' provides the same information, with no hint of missing or damaged capabilities when one or more of the runner processes are stillborn. [1]
One potential improvement would be to increase the lifetime of the locks used by each of the runner processes during 'mailman start'. [2] Based on a small sample of runs of 'mailman start', this change may provide correctness (i.e. I have yet to see a runner process go MIA). But it still seems unsatisfactory for availability / performance, as the elapsed wall clock time for the system to settle down and be functional and responsive is approximately 13 minutes. While this is an improvement over the 20 minute or greater delays observed before making this change, it still differs greatly from the nearly instantaneous startup time experienced on multicore / multiprocessor systems.
The best "performance" I have yet obtained on this single core system is WITHOUT the change in [2] below. Instead, it comes from crossing abstraction boundaries and reducing the amount of time this single CPU spends futilely busy waiting for itself, when it has not provided an opportunity for another process (maybe even the process currently holding the desired lock) to make any progress. [3] With this change, "only" 9 minutes of elapsed wall clock time is wasted (with maybe only the middle 4 minutes at 100% CPU utilization) before the system settles down and is functional and responsive. However, ample room for improvement remains. [4]
Please suggest better ideas or corrections.
Nelson
[1] For a few examples of these failures, see: https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/F...
[2] Here is a diff from the original mailman/database/factory.py (or .../mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py if you wish to apply this to an installed system):

    25a26
    > from datetime import timedelta
    53c54,55
    <     with Lock(os.path.join(config.LOCK_DIR, 'dbcreate.lck')):
    ---
    >     with Lock(os.path.join(config.LOCK_DIR, 'dbcreate.lck'),
    >               timedelta(seconds=30)):

This increases the delay from the default of 15 seconds to 30 seconds before another process will be empowered to break the lock being held by one of mailman's runner processes; breaking the lock leads to the death of that process.

[3] Here is a diff from the original .../mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py:

    7a8
    > import subprocess
    338a340,344
    >         if sys.platform.startswith('linux') and \
    >                 (len(subprocess.check_output(["lscpu", "-p"]).splitlines()) == 5):
    >             # there is only one processor
    >             # [four lines of comments, and only one line for one CPU]
    >             # let another process run before joining the traffic jam for this lock
    >             os.system('sleep 20s')

For possible similar benefits on other platforms, one should also include variations on this theme using something like e.g.:

    if sys.platform.startswith('win32') and \
            subprocess.check_output('wmic path win32_Processor get NumberOfLogicalProcessors').strip().endswith(b'1'):

I seriously doubt all (or even any?) other users of the flufl.lock package would benefit from this exact change ... but many may benefit from sleeping for e.g. a half-second when on a single core? In the case of Mailman3, sleeping for a fixed duration shorter than 20 seconds retains the original problem of broken locks and runner process death, while sleeping for a fixed duration longer than 20 seconds increases both the wall clock delay and the CPU time consumed during 'mailman start'. I have not experimented with random or staggered delays, which may further reduce the number of simultaneously blocked waiters for this lock.
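For what it's worth, the single-core test and the untried staggered delay could be sketched portably with the standard library alone. Here os.cpu_count() is my substitution for parsing lscpu / wmic output, and the function name and delay values are made up for illustration:

```python
import os
import random
import time

def stagger_if_single_cpu(base=5.0, jitter=5.0):
    """On a single-CPU host, sleep a randomized interval before contending
    for a lock, so simultaneous waiters spread out their attempts.
    A sketch only; base/jitter values are hypothetical."""
    # os.cpu_count() may return None when the count cannot be determined.
    if os.cpu_count() == 1:
        time.sleep(base + random.uniform(0, jitter))
```

Whether a randomized 5-10 second stagger would actually avoid the broken-lock problem is untested; fixed delays under 20 seconds were found insufficient above.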
[4] I have seen 'mailman start' complete in less than 1 second wall clock time on a multicore / multiprocessor system. All of these observations were made with mailman 3.3.8 as installed via pip.
Nelson Strother writes:
From my observations, mailman 3.3.8 is unreliable when running on a single core system.
"A single core system" == "your single core system", right? Have you confirmed this on other single core systems? I can't confirm at the moment; I don't think I have access to Mailman running on a single core system. (I guess Linux provides ways to lock a process group to a single core, but the RDBMS is probably an independent process group, and that probably matters. It will be a while before I can confirm.)
How about for other versions of Mailman 3?
In either case, 'mailman status' provides the same information, with no hint of missing or damaged capabilities when one or more of the runner processes are stillborn.
If so, we should fix 'mailman status'. Presumably there's also a chance for the runner to detect and log the expired lock before it dies.
One potential improvement would be to increase the lifetime of the locks used by each of the runner processes during 'mailman start'.
The resource being locked is the core database. One guess about the difference with multicore systems is that on them the RDBMS runs on a different core from Mailman, but that doesn't explain why runner initialization takes so long that locks expire. If a #cores > 1 system initializes in second(s), a 1 core system "should" initialize in #cores * seconds at most.
So I guess something else is accessing the database and interfering with the runners (or vice versa). It's possible that changing to a different RDBMS (or a different transaction granularity) will alleviate the problem. (This is not a fix because somebody isn't locking when they should, or perhaps using the wrong lock. But it might help.)
Another workaround would be to have the runners start synchronously (ie, have the master process wait on each one to signal initialization complete before starting the next one), assuming the current approach is "fork and forget".
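Sketching that suggestion, under the stated assumption that the current approach is "fork and forget": the master starts each worker, then waits for it to signal readiness before forking the next. The function and names here are hypothetical illustrations, not Mailman's actual master code.

```python
import multiprocessing as mp

def runner(ready, name):
    # ... runner initialization (e.g. taking the database lock) goes here ...
    ready.set()   # tell the master that initialization is complete
    # ... the runner's main loop would follow ...

def start_runners(names, timeout=30):
    """Start one process per name, serializing initialization."""
    procs = []
    for name in names:
        ready = mp.Event()
        proc = mp.Process(target=runner, args=(ready, name), name=name)
        proc.start()
        # Do not fork the next runner until this one reports it is up.
        if not ready.wait(timeout):
            raise RuntimeError('runner %s failed to initialize' % name)
        procs.append(proc)
    return procs
```

On a single core this trades a little parallel startup for far less lock contention, since at most one runner at a time is racing for the initialization lock.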
But it still seems unsatisfactory for availability / performance, as the elapsed wall clock time for the system to settle down and be functional and responsive is approximately 13 minutes. [...] The best "performance" I have yet obtained on this single core system is [by] [3] [which waits 20s before trying to obtain the lock].
Is 20s necessary? There are a lot of runners, you may be able to cut the startup time by 2-3 minutes if you can cut that to 5s or less. That would still be 100x longer than the naive #cores * seconds estimate, but perhaps significant.
Steve
[My apologies for the mangled formatting of the diffs in [2] and [3] above, at least as shown on the web view of this list. I will happily repost or resend by direct email if anyone expresses interest.]
Presumably there's also a chance for the runner to detect and log the expired lock before it dies.
This may require some cleverness, as the delays introduced by recording a log message, e.g. "I'm the nntp runner holding the lock, and it has not expired yet.", are likely to change the results of the lock competition, maybe even allowing all runner processes to continue as intended. It is unclear to me whether a process would ever have an opportunity to record a log message, e.g. "I'm the nntp runner holding the lock which has expired, but I am not dead yet.", before it dies. If such an opportunity exists, it should extend the lock's lifetime via e.g. Lock.refresh(timedelta(seconds=10)).
As I attempted to summarize at the end of [3] above, each time I tried shorter sleep intervals (1, 10, 15 seconds) within .../flufl/lock/_lockfile.py the original problem remains, namely one or more locks are broken and runner processes die.
I'm currently running the entire mailman3 stack, including postgres and postfix, plus my production web server, all on a GCP e2-small instance. e2-small is two virtual CPUs running on a single physical core, restricted to consuming at most 50% of the CPU cycles on that core. The system has 2 GB of memory, 10GB of disk space.
The entire stack runs well, albeit slowly, on this configuration, for which I pay about $13/month. The entire stack is shut down every night so I can take a snapshot while the applications are down. Once the snapshot is complete, the stack is restarted. I have no startup issues in this configuration.
It is hard to imagine a more limited pool of resources, yet everything works.
The key difference between Stephen Daniel's configuration and the one I am running is the amount of memory, 512 MB in my case. Everything is fine (e.g. less than 60% of memory in use) all day long except when Mailman3 is starting and stopping. I see no evidence of the Out Of Memory reaper getting involved, but there is some swapping. E.g. the kswapd0 process will consume 30 seconds to a minute and a half of CPU time during the mailman startup phase. Obviously if a process holding the lock is swapped out, delays and unhappiness will ensue.
I did invest the time for another experiment using a computer I own and to which I have full physical access. In its BIOS, I disabled simultaneous multi-threading mode and reduced the number of compute units to one, so it temporarily behaved as a single-core system. In that mode, 'mailman start' completes promptly with about 10 seconds of 100% CPU utilization. Attempting to reduce the gigabytes of memory on that system with 'chmem --disable ...' down to 1 gigabyte made no appreciable progress in an hour, so I gave up on exploring whether 1 gigabyte or 768 megabytes or ... would be the magic minimum amount of memory to prevent these failures.
I have not noticed any documentation stating minimum resource requirements for running Mailman3. Even if we found accurate answers under each operating system today, that knowledge will expire at some future date as other parts of the system may become bloated, even if Mailman3's appetite for resources does not grow.
Since the lockfile contains the process ID of the runner holding the lock, how painful might it be, when an attempt to claim a held lock fails, to detect whether the owning process is runnable vs. swapped out, across the supported operating system platforms? Then the question would be whether Mailman3 merely informs the user "cannot run here, give me more memory", or whether sending a signal (e.g. SIGCONT ?) to the owning process might accelerate how soon the lock-owning process is swapped in so that constructive progress resumes.
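On Linux, one cheap approximation is to read the state field from /proc/<pid>/stat for the PID found in the lockfile. This is a hypothetical helper, not an existing flufl.lock API, and other platforms would need their own variants:

```python
import os

def lock_owner_state(pid):
    """Return 'gone' if the process no longer exists, else its Linux state
    code ('R' running, 'S' sleeping, 'D' uninterruptible/disk wait, ...),
    or 'unknown' where /proc is unavailable. Sketch only."""
    try:
        os.kill(pid, 0)      # signal 0: existence check, no signal delivered
    except ProcessLookupError:
        return 'gone'
    except PermissionError:
        pass                 # exists, but owned by another user
    try:
        with open('/proc/%d/stat' % pid) as f:
            # The comm field may contain spaces; split after the last ')'.
            return f.read().rsplit(')', 1)[1].split()[0]
    except (FileNotFoundError, IndexError):
        return 'unknown'
```

Note that /proc state codes do not directly say "swapped out"; a swapped process usually just shows as sleeping, so this only narrows the diagnosis.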
One of the simpler ways of experimenting with mailman on a restricted system is using an SBC - something like a Raspberry Pi 2 or 3, or a Beaglebone, or similar. Both have limited memory and CPU, and are relatively slow systems. I would avoid a Pi 4 because it's too powerful in this context!
Logging errors always changes the behaviour, but you can do much by noting things in internal variables and then printing those after the event (e.g. set a variable to count the number of times a try-lock operation found the lock held), and perhaps then doing the logging to a memory filesystem (Linux: tmpfs), which will have much lower latency. Normally a Linux /tmp uses tmpfs.
A multicore system will have different behaviour to a single core one, but the kernel will almost always be preemptive (forced timeslices). So one option might be to experiment with a fast-tick (e.g. 1000Hz or higher) kernel such that the timeslices are shorter. (It used to be the case that Linux ran at 100Hz. I know most kernels are now "tick-less", meaning there is no persistent timer interrupt marking time, but there will still be a unit of timeslice.)
A final option: if the problem is (as it appears to be) one of starvation, then perhaps the system is better run (on that hardware) with fewer worker processes. That is, cut the 13 workers down to e.g. 8. That said, starvation is something that can normally be designed out of systems, though I know far too little about the architecture here to say whether that is the case here.
HTH,
Ruth
-- Software Manager & Engineer Tel: 01223 414180 Blog: http://www.ivimey.org/blog LinkedIn: http://uk.linkedin.com/in/ruthivimeycook/
I have been following this thread with curiosity. Is there any current Linux OS that recommends single core hardware with 500MB RAM for a production server? I could whip up a VM with such specs on my laptop using VirtualBox or VMware, but what's the OS? And is it for a production environment? I don't understand what this thread is trying to prove in the 21st century. I'm sorry to say it's just a waste of time. Perhaps you should try another system for a mailing list.
The just-released Debian 12 is free enough of bloatware that, per https://www.debian.org/releases/bookworm/amd64/ch03s04.en.html, a server without a GUI desktop has a minimum RAM requirement of 256 megabytes and a recommended 512 megabytes. These memory requirements are unchanged from what was documented when Debian 11 was released, and per my observations Debian 11 and Mailman3 play nicely together on such a system, except sometimes during startup.
Why increase the carbon footprint of the hosting provider as they use VMware to slice up their resources (by asking for more than you need)?
Nelson Strother writes:
Presumably there's also a chance for the runner to detect and log the expired lock before it dies.
This may require some cleverness, as the delays introduced by recording a log message e.g. "I'm the nntp runner holding the lock, and it has not expired yet." are likely to change the results of the lock competition, maybe even allowing all runner processes to continue as intended.
Unless the runner is killed with an uncatchable signal, there likely is a point before exit to do this, and no, *all* runner processes won't continue, because this one is already dying in my suggestion. I understand there may be cases where you can't find a place to put the log, but I don't think it's worth treating this as a recoverable error if it happens that after the message is logged the initialization somehow completes successfully -- once you detect an expired lock, give up.
It is unclear to me whether a process would ever have an opportunity to record a log message e.g. "I'm the nntp runner holding the lock which has expired, but I am not dead yet." before it dies.
Agreed. Until it's clear that the opportunity doesn't exist, this idea may provide a faster, simpler way to diagnose this kind of problem.
If such an opportunity exists, it should extend the lock's lifetime via e.g. Lock.refresh(timedelta(seconds=10)).
I disagree. The point of setting an expiration is that you know the process is misbehaving if it holds the lock that long. Setting a longer expiration is a reasonable workaround until we understand and fix the misbehavior. But allowing the misbehaving process to extend possession of the lock is a bad idea.
As I attempted to summarize at the end of [3] above, each time I tried shorter sleep intervals (1, 10, 15 seconds) within .../flufl/lock/_lockfile.py
It wasn't clear to me which experiment that referred to. Thank you for clarifying.
participants (5)
- Nelson Strother
- Odhiambo Washington
- Ruth Ivimey-Cook
- Stephen Daniel
- Stephen J. Turnbull