I just upgraded our Mailman3 system to use Python 3.9. About 6 hours after the deployment there was a PyMySQL error and delivery of messages stopped until I restarted the services.
I'm pasting the full stack trace error from mailman.log below. It mentions PyMySQL. We have PyMySQL-1.0.2 installed.
Thanks for your help.
Dan Caballero Systems Administrator Academic Computing Solutions IMSS - Caltech https://imss.caltech.edu
Feb 23 23:27:10 2022 (300) Uncaught runner exception: (pymysql.err.OperationalError) (1053, 'Server shutdown in progress')
[SQL: SELECT anon_1.member_id AS anon_1_member_id, anon_1.member__member_id AS anon_1_member__member_id, anon_1.member_role AS anon_1_member_role, anon_1.member_list_id AS anon_1_member_list_id, anon_1.member_moderation_action AS anon_1_member_moderation_action, anon_1.member_address_id AS anon_1_member_address_id, anon_1.member_preferences_id AS anon_1_member_preferences_id, anon_1.member_user_id AS anon_1_member_user_id, anon_1.member_bounce_score AS anon_1_member_bounce_score, anon_1.member_last_bounce_received AS anon_1_member_last_bounce_received, anon_1.member_last_warning_sent AS anon_1_member_last_warning_sent, anon_1.member_total_warnings_sent AS anon_1_member_total_warnings_sent
FROM (SELECT member
.id AS member_id, member
._member_id AS member__member_id, member
.role
AS member_role, member
.list_id AS member_list_id, member
.moderation_action AS member_moderation_action, member
.address_id AS member_address_id, member
.preferences_id AS member_preferences_id, member
.user_id AS member_user_id, member
.bounce_score AS member_bounce_score, member
.last_bounce_received AS member_last_bounce_received, member
.last_warning_sent AS member_last_warning_sent, member
.total_warnings_sent AS member_total_warnings_sent
FROM member
, address
WHERE member
.list_id = %(list_id_1)s AND member
.role
= %(role_1)s AND address.email = %(email_1)s AND member
.address_id = address.id UNION SELECT member
.id AS member_id, member
._member_id AS member__member_id, member
.role
AS member_role, member
.list_id AS member_list_id, member
.moderation_action AS member_moderation_action, member
.address_id AS member_address_id, member
.preferences_id AS member_preferences_id, member
.user_id AS member_user_id, member
.bounce_score AS member_bounce_score, member
.last_bounce_received AS member_last_bounce_received, member
.last_warning_sent AS member_last_warning_sent, member
.total_warnings_sent AS member_total_warnings_sent
FROM member
, address, user
WHERE member
.list_id = %(list_id_2)s AND member
.role
= %(role_2)s AND address.email = %(email_2)s AND member
.user_id = user.id AND user._preferred_address_id = address.id) AS anon_1]
[parameters: {'list_id_1': 'avery-disc.caltech.edu', 'role_1': 1, 'email_1': 'nandrade@caltech.edu', 'list_id_2': 'avery-disc.caltech.edu', 'role_2': 1, 'email_2': 'nandrade@caltech.edu'}]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
Feb 23 23:27:10 2022 (300) Traceback (most recent call last):
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
result = self._query(query)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
conn.query(q)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
result.read()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
first_packet = self.connection._read_packet()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
packet.raise_for_error()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
err.raise_mysql_exception(self._data)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
raise errorclass(errno, errval)
pymysql.err.OperationalError: (1053, 'Server shutdown in progress')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/core/runner.py", line 176, in _one_iteration
self._process_one_file(msg, msgdata)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/core/runner.py", line 269, in _process_one_file
keepqueued = self._dispose(mlist, msg, msgdata)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/runners/outgoing.py", line 111, in _dispose
self._func(mlist, msg, msgdata)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/mta/deliver.py", line 86, in deliver
refused = agent.deliver(mlist, msg, msgdata)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/mta/base.py", line 164, in deliver
member = mlist.members.get_member(recipient)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/model/roster.py", line 120, in get_member
memberships = self._get_all_memberships(email)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/database/transaction.py", line 85, in wrapper
return function(args[0], config.db.store, *args[1:], **kws)
File "/opt/mailmanve/lib/python3.9/site-packages/mailman/model/roster.py", line 116, in _get_all_memberships
return members_a.union(members_u).all()
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3373, in all
return list(self)
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
return self._execute_and_instances(context)
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/opt/mailmanve/lib/python3.9/site-packages/sqlalchemy/engine/default.py", line 608, in do_execute
cursor.execute(statement, parameters)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/cursors.py", line 148, in execute
result = self._query(query)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/cursors.py", line 310, in _query
conn.query(q)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 548, in query
self._affected_rows = self._read_query_result(unbuffered=unbuffered)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 775, in _read_query_result
result.read()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 1156, in read
first_packet = self.connection._read_packet()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/connections.py", line 725, in _read_packet
packet.raise_for_error()
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/protocol.py", line 221, in raise_for_error
err.raise_mysql_exception(self._data)
File "/opt/mailmanve/lib/python3.9/site-packages/pymysql/err.py", line 143, in raise_mysql_exception
raise errorclass(errno, errval)
sqlalchemy.exc.OperationalError: (pymysql.err.OperationalError) (1053, 'Server shutdown in progress')
[SQL: SELECT anon_1.member_id AS anon_1_member_id, anon_1.member__member_id AS anon_1_member__member_id, anon_1.member_role AS anon_1_member_role, anon_1.member_list_id AS anon_1_member_list_id, anon_1.member_moderation_action AS anon_1_member_moderation_action, anon_1.member_address_id AS anon_1_member_address_id, anon_1.member_preferences_id AS anon_1_member_preferences_id, anon_1.member_user_id AS anon_1_member_user_id, anon_1.member_bounce_score AS anon_1_member_bounce_score, anon_1.member_last_bounce_received AS anon_1_member_last_bounce_received, anon_1.member_last_warning_sent AS anon_1_member_last_warning_sent, anon_1.member_total_warnings_sent AS anon_1_member_total_warnings_sent
FROM (SELECT member
.id AS member_id, member
._member_id AS member__member_id, member
.role
AS member_role, member
.list_id AS member_list_id, member
.moderation_action AS member_moderation_action, member
.address_id AS member_address_id, member
.preferences_id AS member_preferences_id, member
.user_id AS member_user_id, member
.bounce_score AS member_bounce_score, member
.last_bounce_received AS member_last_bounce_received, member
.last_warning_sent AS member_last_warning_sent, member
.total_warnings_sent AS member_total_warnings_sent
FROM member
, address
WHERE member
.list_id = %(list_id_1)s AND member
.role
= %(role_1)s AND address.email = %(email_1)s AND member
.address_id = address.id UNION SELECT member
.id AS member_id, member
._member_id AS member__member_id, member
.role
AS member_role, member
.list_id AS member_list_id, member
.moderation_action AS member_moderation_action, member
.address_id AS member_address_id, member
.preferences_id AS member_preferences_id, member
.user_id AS member_user_id, member
.bounce_score AS member_bounce_score, member
.last_bounce_received AS member_last_bounce_received, member
.last_warning_sent AS member_last_warning_sent, member
.total_warnings_sent AS member_total_warnings_sent
FROM member
, address, user
WHERE member
.list_id = %(list_id_2)s AND member
.role
= %(role_2)s AND address.email = %(email_2)s AND member
.user_id = user.id AND user._preferred_address_id = address.id) AS anon_1]
[parameters: {'list_id_1': 'avery-disc.caltech.edu', 'role_1': 1, 'email_1': 'nandrade@caltech.edu', 'list_id_2': 'avery-disc.caltech.edu', 'role_2': 1, 'email_2': 'nandrade@caltech.edu'}]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
On 2/24/22 10:33, dancab@caltech.edu wrote:
I just upgraded our Mailman3 system to use Python 3.9. About 6 hours after the deployment there was a PyMySQL error and delivery of messages stopped until I restarted the services.
I'm pasting the full stack trace error from mailman.log below. It mentions PyMySQL. We have PyMySQL-1.0.2 installed. ... Feb 23 23:27:10 2022 (300) Uncaught runner exception: (pymysql.err.OperationalError) (1053, 'Server shutdown in progress')
What is the status of the mysql server. The above says PyMySQL gets a 'Server shutdown in progress' response from mysqld.
What was going on with mysqld at the time of the error?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thanks for your help Mark. We use AWS RDS for MySQL. I found there was an automated update applied by AWS around the time show in the logs for the error.
Feb 23 23:27:10 2022 (300) Uncaught runner exception: (pymysql.err.OperationalError) (1053, 'Server shutdown in progress')
The outage lasted for about 1 minute. We have Mulit-AZ failover configured for RDS and a new DB instance started within that minute.
What is interesting is that Hyperkitty archiving continued to work and Postorius was still responsive.
Only mail delivery was hung overnight. I had to restart the Gunicorn process followed by a Mailman core restart for mail to flow again.
Any insight into how we could monitor this in the future? For example which specific "runner" reported the error above?
Thanks again.
-- Dan
On 2/24/22 14:01, dancab@caltech.edu wrote:
Only mail delivery was hung overnight. I had to restart the Gunicorn process followed by a Mailman core restart for mail to flow again.
Because Mailman's "out" runner (/opt/mailmanve/lib/python3.9/site-packages/mailman/runners/outgoing.py from the above traceback) is what hit the exception and died. The master watcher should restart it, but see https://gitlab.com/mailman/mailman/-/issues/898
Any insight into how we could monitor this in the future? For example which specific "runner" reported the error above?
Something like
ps --ppid=`cat /opt/mailman/mm/var/master.pid|tr -d '\n'`|wc -l
will report the number of child processes of the master. You could test that against the expected number and do something.
Note that mailman restart
probably won't start a runner that died -
you need to do mailman stop;mailman start
to be sure.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thanks again Mark. This is very helpful. In my urgency to fix the issue I think I tried "mailman restart" first and then the Gunicorn restart followed by a full mailman stop; mailman start. So I'll set something up to monitor the runners and trigger a mailman stop; mailman start
participants (2)
-
dancab@caltech.edu
-
Mark Sapiro