Here are the (four identical, but for timestamps and process numbers, at a glance) tracebacks, and some related info:
--- from /var/log/syslog Jun 8 03:35:47 localhost mailman[13107]: Traceback (most recent call last): Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/bin/runner", line 8, in <module> Jun 8 03:35:48 localhost mailman[13107]: sys.exit(main()) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__ Jun 8 03:35:48 localhost mailman[13107]: return self.main(*args, **kwargs) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main Jun 8 03:35:48 localhost mailman[13107]: rv = self.invoke(ctx) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke Jun 8 03:35:48 localhost mailman[13107]: return ctx.invoke(self.callback, **ctx.params) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke Jun 8 03:35:48 localhost mailman[13107]: return __callback(*args, **kwargs) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func Jun 8 03:35:48 localhost mailman[13107]: return f(get_current_context(), *args, **kwargs) Jun 8 03:35:48 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/bin/runner.py", line 160, in main Jun 8 03:35:49 localhost mailman[13107]: initialize(config_file, verbose) Jun 8 03:35:49 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 231, in initialize Jun 8 03:35:49 localhost mailman[13107]: initialize_2(propagate_logs=propagate_logs) Jun 8 03:35:49 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 189, in initialize_2 Jun 8 03:35:49 localhost mailman[13107]: config.db = getUtility(IDatabaseFactory, utility_name).create() Jun 8 03:35:49 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py", line 60, in create Jun 8 03:35:49 localhost mailman[13107]: return database Jun 8 03:35:49 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 470, in __exit__ Jun 8 03:35:49 localhost mailman[13107]: self.unlock() Jun 8 03:35:49 localhost mailman[13107]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 420, in unlock Jun 8 03:35:49 localhost mailman[13107]: raise NotLockedError('Already unlocked') Jun 8 03:35:49 localhost mailman[13107]: flufl.lock._lockfile.NotLockedError: Already unlocked Jun 8 03:35:49 localhost dhclient[633]: RCV: Advertise message on ens192 from fe80::250:56ff:fe95:b457. Jun 8 03:36:46 localhost mailman[13100]: Traceback (most recent call last): Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/bin/runner", line 8, in <module> Jun 8 03:36:46 localhost mailman[13100]: sys.exit(main()) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__ Jun 8 03:36:46 localhost mailman[13100]: return self.main(*args, **kwargs) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main Jun 8 03:36:46 localhost mailman[13100]: rv = self.invoke(ctx) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke Jun 8 03:36:46 localhost mailman[13100]: return ctx.invoke(self.callback, **ctx.params) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke Jun 8 03:36:46 localhost mailman[13100]: return __callback(*args, **kwargs) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func Jun 8 03:36:46 localhost mailman[13100]: return f(get_current_context(), *args, **kwargs) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/bin/runner.py", line 160, in main Jun 8 03:36:46 localhost mailman[13100]: initialize(config_file, verbose) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 231, in initialize Jun 8 03:36:46 localhost mailman[13100]: initialize_2(propagate_logs=propagate_logs) Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 189, in initialize_2 Jun 8 03:36:46 localhost mailman[13100]: config.db = getUtility(IDatabaseFactory, utility_name).create() Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py", line 60, in create Jun 8 03:36:46 localhost mailman[13100]: return database Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 470, in __exit__ Jun 8 03:36:46 localhost mailman[13100]: self.unlock() Jun 8 03:36:46 localhost mailman[13100]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 420, in unlock Jun 8 03:36:46 localhost mailman[13100]: raise NotLockedError('Already unlocked') Jun 8 03:36:46 localhost mailman[13100]: flufl.lock._lockfile.NotLockedError: Already unlocked Jun 8 03:36:56 localhost mailman[13109]: Traceback (most recent call last): Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/bin/runner", line 8, in <module> Jun 8 03:36:56 localhost mailman[13109]: sys.exit(main()) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__ Jun 8 03:36:56 localhost mailman[13109]: return self.main(*args, **kwargs) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main Jun 8 03:36:56 localhost mailman[13109]: rv = self.invoke(ctx) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke Jun 8 03:36:56 localhost mailman[13109]: return ctx.invoke(self.callback, **ctx.params) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke Jun 8 03:36:56 localhost mailman[13109]: return __callback(*args, **kwargs) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func Jun 8 03:36:56 localhost mailman[13109]: return f(get_current_context(), *args, **kwargs) Jun 8 03:36:56 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/bin/runner.py", line 160, in main Jun 8 03:36:56 localhost postfix/anvil[13147]: statistics: max connection rate 1/60s for (smtp:163.123.143.10) at Jun 8 03:33:27 Jun 8 03:37:00 localhost postfix/anvil[13147]: statistics: max connection count 1 for (smtp:163.123.143.10) at Jun 8 03:33:27 Jun 8 03:37:00 localhost mailman[13109]: initialize(config_file, verbose) Jun 8 03:37:01 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 231, in initialize Jun 8 03:37:01 localhost mailman[13109]: initialize_2(propagate_logs=propagate_logs) Jun 8 03:37:01 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 189, in initialize_2 Jun 8 03:37:01 localhost mailman[13109]: config.db = getUtility(IDatabaseFactory, utility_name).create() Jun 8 03:37:01 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py", line 60, in create Jun 8 03:37:01 localhost mailman[13109]: return database Jun 8 03:37:01 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 470, in __exit__ Jun 8 03:37:01 localhost mailman[13109]: self.unlock() Jun 8 03:37:01 localhost mailman[13109]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 420, in unlock Jun 8 03:37:01 localhost mailman[13109]: raise NotLockedError('Already unlocked') Jun 8 03:37:01 localhost mailman[13109]: flufl.lock._lockfile.NotLockedError: Already unlocked Jun 8 03:37:01 localhost postfix/anvil[13147]: statistics: max cache size 1 at Jun 8 03:33:27 Jun 8 03:37:01 localhost mailman[13110]: Traceback (most recent call last): Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/bin/runner", line 8, in <module> Jun 8 03:37:01 localhost mailman[13110]: sys.exit(main()) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1130, in __call__ Jun 8 03:37:01 localhost mailman[13110]: return self.main(*args, **kwargs) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1055, in main Jun 8 03:37:01 localhost mailman[13110]: rv = self.invoke(ctx) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 1404, in invoke Jun 8 03:37:01 localhost mailman[13110]: return ctx.invoke(self.callback, **ctx.params) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/core.py", line 760, in invoke Jun 8 03:37:01 localhost mailman[13110]: return __callback(*args, **kwargs) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/click/decorators.py", line 26, in new_func Jun 8 03:37:01 localhost mailman[13110]: return f(get_current_context(), *args, **kwargs) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/bin/runner.py", line 160, in main Jun 8 03:37:01 localhost mailman[13110]: initialize(config_file, verbose) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 231, in initialize Jun 8 03:37:01 localhost mailman[13110]: initialize_2(propagate_logs=propagate_logs) Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 189, in initialize_2 Jun 8 03:37:01 localhost mailman[13110]: config.db = getUtility(IDatabaseFactory, utility_name).create() Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py", line 60, in create Jun 8 03:37:01 localhost mailman[13110]: return database Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 470, in __exit__ Jun 8 03:37:01 localhost mailman[13110]: self.unlock() Jun 8 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 420, in unlock Jun 8 03:37:01 localhost mailman[13110]: raise NotLockedError('Already unlocked') Jun 8 03:37:01 localhost mailman[13110]: flufl.lock._lockfile.NotLockedError: Already unlocked --- note that processes 13100, 13107, 13109, and 13110 are the ones missing from the sequential process numbers shown by systemctl status mailman below, were presumably momentarily the missing command, digests, retry, and virgin runners ...
-- all of the recent lines from /var/tmp/mailman/logs/mailman.log : Jun 08 03:24:17 2023 (4268) Master stopped Jun 08 03:30:18 2023 (13093) Master started Jun 08 03:43:31 2023 (13108) task runner started. Jun 08 03:43:52 2023 (13108) Task runner evicted 0 expired pendings Jun 08 03:43:52 2023 (13108) Task runner deleted 0 orphaned workflows Jun 08 03:43:53 2023 (13108) Task runner deleted 0 orphaned requests Jun 08 03:43:54 2023 (13108) Task runner deleted 0 orphaned messages Jun 08 03:43:54 2023 (13108) Task runner evicted expired cache entries Jun 08 03:44:17 2023 (13101) in runner started. Jun 08 03:44:23 2023 (13105) pipeline runner started. Jun 08 03:44:54 2023 (13103) nntp runner started. Jun 08 03:45:59 2023 (13098) archive runner started. Jun 08 03:46:06 2023 (13104) out runner started. Jun 08 03:46:23 2023 (13099) bounces runner started. Jun 08 03:46:43 2023 (13106) rest runner started. [2023-06-08 03:46:43 +0000] [13106] [INFO] Starting gunicorn 20.1.0 [2023-06-08 03:46:43 +0000] [13106] [INFO] Listening at: http://127.0.0.1:8001 (13106) [2023-06-08 03:46:43 +0000] [13106] [INFO] Using worker: sync [2023-06-08 03:46:44 +0000] [13345] [INFO] Booting worker with pid: 13345 [2023-06-08 03:46:44 +0000] [13346] [INFO] Booting worker with pid: 13346 Jun 08 03:46:46 2023 (13102) lmtp runner started. Jun 08 04:44:04 2023 (13108) Task runner evicted 0 expired pendings Jun 08 04:44:04 2023 (13108) Task runner deleted 0 orphaned workflows Jun 08 04:44:05 2023 (13108) Task runner deleted 0 orphaned requests Jun 08 04:44:05 2023 (13108) Task runner deleted 0 orphaned messages Jun 08 04:44:05 2023 (13108) Task runner evicted expired cache entries {with these last five lines being repeated hourly since ...} --
root@localhost:~# systemctl status mailman ● mailman.service - GNU Mailing List Manager Loaded: loaded (/etc/systemd/system/mailman.service; enabled; vendor preset: enabled) Active: active (running) since Thu 2023-06-08 03:30:18 UTC; 13h ago Process: 13092 ExecStart=/var/tmp/mailman/.local/bin/mailman --config /var/tmp/mailman/var/etc/mailman.cfg start (code=exited, status=0/SUCCESS) Main PID: 13093 (python3) Tasks: 17 (limit: 465) Memory: 73.5M CPU: 15min 23.511s CGroup: /system.slice/mailman.service ├─13093 /usr/bin/python3 /var/tmp/mailman/.local/bin/master -C /var/tmp/mailman/var/etc/mailman.cfg ├─13098 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=archive:0:1 ├─13099 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=bounces:0:1 ├─13101 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=in:0:1 ├─13102 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=lmtp:0:1 ├─13103 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=nntp:0:1 ├─13104 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=out:0:1 ├─13105 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=pipeline:0:1 ├─13106 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=rest:0:1 ├─13108 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=task:0:1 ├─13345 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=rest:0:1 └─13346 /usr/bin/python3 /var/tmp/mailman/.local/bin/runner -C /var/tmp/mailman/var/etc/mailman.cfg --runner=rest:0:1
Jun 08 03:37:01 localhost mailman[13110]: initialize_2(propagate_logs=propagate_logs) Jun 08 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/core/initialize.py", line 189, in initialize_2 Jun 08 03:37:01 localhost mailman[13110]: config.db = getUtility(IDatabaseFactory, utility_name).create() Jun 08 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/mailman/database/factory.py", line 60, in create Jun 08 03:37:01 localhost mailman[13110]: return database Jun 08 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 470, in __exit__ Jun 08 03:37:01 localhost mailman[13110]: self.unlock() Jun 08 03:37:01 localhost mailman[13110]: File "/var/tmp/mailman/.local/lib/python3.9/site-packages/flufl/lock/_lockfile.py", line 420, in unlock Jun 08 03:37:01 localhost mailman[13110]: raise NotLockedError('Already unlocked') Jun 08 03:37:01 localhost mailman[13110]: flufl.lock._lockfile.NotLockedError: Already unlocked root@localhost:~#
On 6/8/23 13:02, Mark Sapiro wrote:
On 6/7/23 21:30, Nelson Strother wrote:
Interestingly if I use
systemctl start mailman
thus far the results are: process starting from .../mailman/.local/bin/runner and ending with
- the wall clock duration is shorter than when
mailman start
is issued from the command line, and -- either that all of the runners remain present once they are started, -- or I can see in syslog a traceback from each missing runner"flufl.lock._lockfile.NotLockedError: Already unlocked".
I do not yet understand how to make use of these clues, but at least one can see an epitaph from each deceased process.
The NotLockedError is most likely because a lock has been set with a lifetime and its lifetime has passed and some other process has broken the lock before the original process tries to unlock it. This probably has something to do with the long times to do start/restart.
However, the only lock that should be involved is the master lock set by the master watcher, so I don't really understand what might be happening.
What is the full traceback from these errors?