slow restart time for mailmanweb
Debian 11.2, Mailman 3.3.5, Python 3.9.2, MariaDB 10.5.12-0+deb11u1
On a Debian system running MariaDB, mailman3 restarts quickly but mailmanweb takes about 90 seconds:
root@myhost # time systemctl restart mailmanweb
real 1m30.156s user 0m0.013s sys 0m0.000s
The daemon.log has a timeout error:
Dec 28 12:16:01 myhost systemd[1]: Stopping GNU Mailman Web UI... Dec 28 12:16:03 myhost uwsgi[409333]: [uWSGI] getting INI configuration from /etc/mailman3/uwsgi.ini Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: State 'stop-sigterm' timed out. Killing. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 409333 (uwsgi) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418249 (uwsgi) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418250 (uwsgi) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418252 (sh) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418254 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418258 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418260 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418261 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418262 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418263 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418264 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418265 (mailman-web) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Killing process 418253 (uwsgi) with signal SIGKILL. Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Main process exited, code=killed, status=9/KILL Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: Failed with result 'timeout'. Dec 28 12:17:31 myhost systemd[1]: Stopped GNU Mailman Web UI.
Here is /etc/systemd/system/mailmanweb.service file.
[Unit] Description=GNU Mailman Web UI After=syslog.target network.target mariadb.service mailman3.service
[Service] Environment="PYTHONPATH=/etc/mailman3/" User=mailman Group=mailman ExecStart=/opt/mailman/venv/bin/uwsgi --ini /etc/mailman3/uwsgi.ini
[Install] WantedBy=multi-user.target
Starting with Debian 11, the MariaDB service is called mariadb and not mysqld or mysql. I tried substituting mysqld and mysql here and rerunning 'systemctl daemon-reload' each time, but restart times were similar, and also produced the same timeout error.
And here is /etc/mailman3/uwsgi.ini:
#####
[uwsgi] # Port on which uwsgi will be listening. http-socket = 0.0.0.0:8000 # If running uwsgi from the virtual environment ... virtualenv = /opt/mailman/venv/
module=mailman_web.wsgi:application # Add to python import path. pythonpath = /etc/mailman3/ # The default settings module. env = DJANGO_SETTINGS_MODULE=settings
# Setup default number of processes and threads per process. master = true processes = 2 threads = 2
# Setup the django_q related worker processes. attach-daemon = /opt/mailman/venv/bin/mailman-web qcluster
# Setup the request log. req-logger = file:/opt/mailman/web/logs/uwsgi.log
# Log qcluster commands separately. logger = qcluster file:/opt/mailman/web/logs/uwsgi-qcluster.log log-route = qcluster uwsgi-daemons
# Last log and it logs the rest of the stuff. logger = file:/opt/mailman/web/logs/uwsgi-error.log
#####
What's causing the timeout, and what can I do to speed up restarts? Thanks.
dn
On 12/28/21 1:07 PM, David Newman wrote:
Debian 11.2, Mailman 3.3.5, Python 3.9.2, MariaDB 10.5.12-0+deb11u1
On a Debian system running MariaDB, mailman3 restarts quickly but mailmanweb takes about 90 seconds:
root@myhost # time systemctl restart mailmanweb
real 1m30.156s user 0m0.013s sys 0m0.000s
The daemon.log has a timeout error:
Dec 28 12:16:01 myhost systemd[1]: Stopping GNU Mailman Web UI... Dec 28 12:16:03 myhost uwsgi[409333]: [uWSGI] getting INI configuration from /etc/mailman3/uwsgi.ini Dec 28 12:17:31 myhost systemd[1]: mailmanweb.service: State 'stop-sigterm' timed out. Killing. ... Dec 28 12:17:31 myhost systemd[1]: Stopped GNU Mailman Web UI.
It looks to me like the timeout is in stopping, not starting. What happens if you separately
systemctl stop mailmanweb
and
systemctl start mailmanweb
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 12/28/21 1:20 PM, Mark Sapiro wrote:
It looks to me like the timeout is in stopping, not starting. What happens if you separately
systemctl stop mailmanweb
and
systemctl start mailmanweb
Confirmed:
root@myhost:/etc/mailman3# time systemctl stop mailmanweb
real 1m30.239s user 0m0.010s sys 0m0.004s root@myhost:/etc/mailman3# time systemctl start mailmanweb
real 0m0.012s user 0m0.011s sys 0m0.001s
On 12/28/21 3:01 PM, Mark Sapiro wrote:
Is your /etc/systemd/system/mailmanweb.service the recommended one at
https://docs.mailman3.org/en/latest/install/virtualenv.html#automatically-st...
Yes, modulo the use of MariaDB instead of PostgreSQL.
If so, try adding
KillSignal=SIGQUIT
to the Service section and see if that fixes it.
root@myhost:/etc/mailman3# time systemctl restart mailmanweb
real 0m1.026s user 0m0.008s sys 0m0.004s
My limited understanding of SIGQUIT is that it dumps core, but 'find / -type f -name '*.core' did not return any core dumps from this event.
Still unsure why the service hung on stop, but thanks!
dn
David Newman writes:
My limited understanding of SIGQUIT is that it dumps core,
*chuckle* The differences among user signals like SIGTERM and SIGQUIT are a bit gibbous especially since you can change the handler or ignore them entirely. (SIGKILL of course is one of a very few signals that is well-defined since it cannot be caught or ignored and is handled entirely by the OS.)
but 'find / -type f -name '*.core' did not return any core dumps from this event.
Dumping core is the default action, but that requires that core dumps be enabled for that process. You can check the general environment from the shell with "ulimit -c" (if logged in as the Mailman user) but Mailman or Python itself might have different ideas, and either might be installing a handler that does something different. Mark might know, but I'm not familiar with that really low-level stuff (I should fix this .... ;-)
On 12/28/21 3:51 PM, David Newman wrote:
On 12/28/21 1:20 PM, Mark Sapiro wrote:
If so, try adding
KillSignal=SIGQUIT
to the Service section and see if that fixes it.
root@myhost:/etc/mailman3# time systemctl restart mailmanweb
real 0m1.026s user 0m0.008s sys 0m0.004s
My limited understanding of SIGQUIT is that it dumps core, but 'find / -type f -name '*.core' did not return any core dumps from this event.
Actually, SIGINT is probably better, but they should be equivalent. See https://uwsgi-docs.readthedocs.io/en/latest/Management.html#signals-for-cont... and https://uwsgi-docs.readthedocs.io/en/latest/Management.html#stopping-the-ser...
I'll update the doc.
Still unsure why the service hung on stop, but thanks!
It hung on stop because systemd wasn't told how to stop uwsgi so it sent a SIGTERM which doesn't stop it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 12/28/21 7:52 PM, Mark Sapiro wrote:
On 12/28/21 3:51 PM, David Newman wrote:
On 12/28/21 1:20 PM, Mark Sapiro wrote:
> If so, try adding > > KillSignal=SIGQUIT > > to the Service section and see if that fixes it.
root@myhost:/etc/mailman3# time systemctl restart mailmanweb
real 0m1.026s user 0m0.008s sys 0m0.004s
My limited understanding of SIGQUIT is that it dumps core, but 'find / -type f -name '*.core' did not return any core dumps from this event.
Actually, SIGINT is probably better,
SIGINT does fine:
root@myhost:/etc/systemd/system# time systemctl restart mailmanweb
real 0m1.024s user 0m0.009s sys 0m0.006s
but they should be equivalent. See https://uwsgi-docs.readthedocs.io/en/latest/Management.html#signals-for-cont... and https://uwsgi-docs.readthedocs.io/en/latest/Management.html#stopping-the-ser...
I'll update the doc.
Thank you!
Still unsure why the service hung on stop, but thanks!
It hung on stop because systemd wasn't told how to stop uwsgi so it sent a SIGTERM which doesn't stop it.
Got it. Again, many thanks.
dn
On Tue, 2021-12-28 at 13:07 -0800, David Newman wrote:
Debian 11.2, Mailman 3.3.5, Python 3.9.2, MariaDB 10.5.12-0+deb11u1
On a Debian system running MariaDB, mailman3 restarts quickly but mailmanweb takes about 90 seconds:
I get the same thing. Here's my log snippet:
Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: State 'stop- sigterm' timed out. Killing. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 3046 (uwsgi) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20439 (uwsgi) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20440 (uwsgi) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20442 (sh) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20444 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20446 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20447 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20448 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20449 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Killing process 20450 (mailman-web) with signal SIGKILL. Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Main process exited, code=killed, status=9/KILL Dec 28 16:18:41 emergenus systemd[1]: mailmanweb.service: Failed with result 'timeout'.
After it times out, it starts fine.
If I just stop the thing and then start it, it goes fast. Here's that log:
Dec 28 16:27:22 emergenus systemd[1]: Stopping GNU Mailing List Manager... Dec 28 16:27:24 emergenus mailman[23408]: Shutting down Mailman's master runner Dec 28 16:27:27 emergenus systemd[1]: mailman3.service: Succeeded. Dec 28 16:27:27 emergenus systemd[1]: Stopped GNU Mailing List Manager. Dec 28 16:27:27 emergenus systemd[1]: mailman3.service: Consumed 42.834s CPU time. Dec 28 16:27:27 emergenus systemd[1]: Starting GNU Mailing List Manager... Dec 28 16:27:29 emergenus kernel: [ 8076.868797] [UFW BLOCK] IN=eth0 OUT= MAC=00:16:3e:ef:72:96:00:23:9c:9e:9f:c0:08:00 SRC=92.63.197.108 DST=203.159.80.234 LEN=40 TOS=0x00 PREC=0x00 TTL=253 ID=58971 PROTO=TCP SPT=44610 DPT=21987 WINDOW=1024 RES=0x00 SYN URGP=0 Dec 28 16:27:29 emergenus mailman[23433]: Starting Mailman's master runner Dec 28 16:27:29 emergenus mailman[23433]: Generating MTA alias maps Dec 28 16:27:30 emergenus systemd[1]: mailman3.service: Can't open PID file /opt/mailman/mm/var/master.pid (yet?) after start: Operation not permitted
I don't know what's up with the pid thing -- it moves right on in about 1 second.
billo
On 12/28/21 1:28 PM, William Oliver wrote:
If I just stop the thing and then start it, it goes fast. Here's that log:
Dec 28 16:27:22 emergenus systemd[1]: Stopping GNU Mailing List Manager... Dec 28 16:27:24 emergenus mailman[23408]: Shutting down Mailman's master runner Dec 28 16:27:27 emergenus systemd[1]: mailman3.service: Succeeded. Dec 28 16:27:27 emergenus systemd[1]: Stopped GNU Mailing List Manager. Dec 28 16:27:27 emergenus systemd[1]: mailman3.service: Consumed 42.834s CPU time. Dec 28 16:27:27 emergenus systemd[1]: Starting GNU Mailing List Manager... Dec 28 16:27:29 emergenus kernel: [ 8076.868797] [UFW BLOCK] IN=eth0 OUT= MAC=00:16:3e:ef:72:96:00:23:9c:9e:9f:c0:08:00 SRC=92.63.197.108 DST=203.159.80.234 LEN=40 TOS=0x00 PREC=0x00 TTL=253 ID=58971 PROTO=TCP SPT=44610 DPT=21987 WINDOW=1024 RES=0x00 SYN URGP=0 Dec 28 16:27:29 emergenus mailman[23433]: Starting Mailman's master runner Dec 28 16:27:29 emergenus mailman[23433]: Generating MTA alias maps
These log entries are from restarting mailman, not mailmanweb.
Dec 28 16:27:30 emergenus systemd[1]: mailman3.service: Can't open PID file /opt/mailman/mm/var/master.pid (yet?) after start: Operation not permitted
I don't know what's up with the pid thing -- it moves right on in about 1 second.
It's a race condition. systemd (systemctl start) is trying to read /opt/mailman/mm/var/master.pid before the master has created it. That's why it says (yet?).
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On Tue, 2021-12-28 at 13:52 -0800, Mark Sapiro wrote:
On 12/28/21 1:28 PM, William Oliver wrote:
These log entries are from restarting mailman, not mailmanweb
Damn. Sorry about that. Brain fart.
Yep, it's still slow with mailmanweb:
Dec 28 17:41:00 emergenus systemd[1]: Stopping GNU Mailman Web UI... Dec 28 17:41:01 emergenus uwsgi[20860]: [uWSGI] getting INI configuration from /etc/mailman3/uwsgi.ini Dec 28 17:41:01 emergenus CRON[44474]: (mailman) CMD (/opt/mailman/venv/bin/mailman-web runjobs minutely) Dec 28 17:41:06 emergenus kernel: [12494.670812] [UFW BLOCK] IN=eth0 Dec 28 17:42:01 emergenus CRON[44755]: (mailman) CMD (/opt/mailman/venv/bin/mailman-web runjobs minutely) Dec 28 17:42:07 emergenus kernel: [12555.550730] [UFW BLOCK] IN=eth0 OUT= MAC=00:16:3e:ef:72:96:00:23:9c:9e:9f:c0:08:00 SRC=92.63.197.112 DST=203.159.80.234 LEN=40 TOS=0x00 PREC=0x00 TTL=253 ID=9540 PROTO=TCP SPT=57024 DPT=25238 WINDOW=1024 RES=0x00 SYN URGP=0 Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: State 'stop- sigterm' timed out. Killing. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 20860 (uwsgi) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44476 (uwsgi) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44477 (uwsgi) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44478 (sh) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44480 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44483 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44484 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44485 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44486 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44487 (mailman-web) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44479 (uwsgi) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Killing process 44481 (uwsgi) with signal SIGKILL. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Main process exited, code=killed, status=9/KILL Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Failed with result 'timeout'. Dec 28 17:42:30 emergenus systemd[1]: Stopped GNU Mailman Web UI. Dec 28 17:42:30 emergenus systemd[1]: mailmanweb.service: Consumed 2min 27.819s CPU time.
On 12/28/21 2:43 PM, William Oliver wrote:
Yep, it's still slow with mailmanweb:
Is your /etc/systemd/system/mailmanweb.service the recommended one at https://docs.mailman3.org/en/latest/install/virtualenv.html#automatically-st...
If so, try adding
KillSignal=SIGQUIT
to the Service section and see if that fixes it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (4)
-
David Newman
-
Mark Sapiro
-
Stephen J. Turnbull
-
William Oliver