All list messages shunted due to mailman core database lock
Hi there,
I have been trying to debug a git commit development list for a project I maintain: http://lists.mlpack.org/mailman3/hyperkitty/list/mlpack-git@lists.mlpack.org...
The list is run on mailman3 as provided by Debian.
Messages are successfully received---you can see all the test messages in the archives I just linked to if you want to click on it.
However, no messages are actually relayed to subscribers. I spent a while digging into logs and found that every message sent to the list results in an error like this in /var/log/mailman3/mailman.log:
Oct 02 17:01:26 2023 (2957556) Uncaught runner exception: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.OperationalError) database is locked
[SQL: UPDATE mailinglist SET last_post_at=?, post_id=? WHERE mailinglist.id = ?]
[parameters: ('2023-10-02 21:01:21.702224', 99565, 2)]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
Oct 02 17:01:26 2023 (2957556) Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 609, in do_execute
cursor.execute(statement, parameters)
sqlite3.OperationalError: database is locked
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 173, in _one_iteration
self._process_one_file(msg, msgdata)
File "/usr/lib/python3/dist-packages/mailman/core/runner.py", line 266, in _process_one_file
keepqueued = self._dispose(mlist, msg, msgdata)
File "/usr/lib/python3/dist-packages/mailman/runners/pipeline.py", line 37, in _dispose
process(mlist, msg, msgdata, pipeline)
File "/usr/lib/python3/dist-packages/mailman/core/pipelines.py", line 50, in process
handler.process(mlist, msg, msgdata)
File "/usr/lib/python3/dist-packages/mailman/handlers/acknowledge.py", line 46, in process
member = mlist.members.get_member(sender)
File "/usr/lib/python3/dist-packages/mailman/model/roster.py", line 120, in get_member
memberships = self._get_all_memberships(email)
File "/usr/lib/python3/dist-packages/mailman/database/transaction.py", line 85, in wrapper
return function(args[0], config.db.store, *args[1:], **kws)
File "/usr/lib/python3/dist-packages/mailman/model/roster.py", line 116, in _get_all_memberships
return members_a.union(members_u).all()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 3373, in all
return list(self)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/query.py", line 3534, in __iter__
self.session._autoflush()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 1633, in _autoflush
util.raise_(e, with_traceback=sys.exc_info()[2])
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 1622, in _autoflush
self.flush()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2540, in flush
self._flush(objects)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2682, in _flush
transaction.rollback(_capture_exception=True)
File "/usr/lib/python3/dist-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
compat.raise_(
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/session.py", line 2642, in _flush
flush_context.execute()
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/unitofwork.py", line 422, in execute
rec.execute(self)
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/unitofwork.py", line 586, in execute
persistence.save_obj(
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/persistence.py", line 230, in save_obj
_emit_update_statements(
File "/usr/lib/python3/dist-packages/sqlalchemy/orm/persistence.py", line 994, in _emit_update_statements
c = cached_connections[connection].execute(
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1011, in execute
return meth(self, multiparams, params)
File "/usr/lib/python3/dist-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1124, in _execute_clauseelement
ret = self._execute_context(
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1316, in _execute_context
self._handle_dbapi_exception(
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1510, in _handle_dbapi_exception
util.raise_(
File "/usr/lib/python3/dist-packages/sqlalchemy/util/compat.py", line 182, in raise_
raise exception
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/base.py", line 1276, in _execute_context
self.dialect.do_execute(
File "/usr/lib/python3/dist-packages/sqlalchemy/engine/default.py", line 609, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.OperationalError: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely)
(sqlite3.OperationalError) database is locked
[SQL: UPDATE mailinglist SET last_post_at=?, post_id=? WHERE mailinglist.id = ?]
[parameters: ('2023-10-02 21:01:21.702224', 99565, 2)]
(Background on this error at: http://sqlalche.me/e/13/e3q8)
Oct 02 17:01:26 2023 (2957556) SHUNTING: 1696280486.719109+114cb2cc738a61874a5eed7e3f66a3ac111f5b90
I have tried a handful of things:
- Restart mailman3 and mailman3-web systemd services.
- Ensure that there are no stale locks in /var/lib/mailman3/locks/ when mailman3 core is stopped.
- Move the database in /var/lib/mailman3/data/mailman.db out of the way and replace it with a copy, in case some process is holding it open.
- Unshunt all the messages with
sudo /usr/bin/mailman3 unshunt
(didn't realize this would drop all the shunted messages, but not a big deal).
Most threads I have read appear to be related to the Django interface encountering a locked database, but this looks different---it seems mailman core is having the problem, not anything having to do with mailman3-web or Django.
Does anyone have any thoughts on things I can do to fix this error? I am not a mailman3 expert (and to be perfectly honest I don't want to be one, I just want to post things to the list :)).
One extra caveat: rebooting the machine I am using is not a realistic option.
Thanks in advance for any help.
Ryan
-- Ryan Curtin | "Not like this... not like this." ryan@ratml.org | - Switch
Ryan Curtin writes:
However, no messages are actually relayed to subscribers. I spent a while digging into logs and found that every message sent to the list results in an error like this in /var/log/mailman3/mailman.log:
Oct 02 17:01:26 2023 (2957556) Uncaught runner exception: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (sqlite3.OperationalError) database is locked
This is the whole problem. Here's a list of possibilities: https://bobbyhadz.com/blog/operational-error-database-is-locked
My best guess is that you have multiple Mailman instances running, but it could also be that you've got the database open in a console tool or something like that. I don't think it's timeouts nor related to the autoflush -- that should work, it does for lots of people.
If nothing else works you can reboot the whole system, and that should take care of it.
We don't recommend using sqlite in production, but you know your needs. However, this might be a good moment to switch to a database server like PostgreSQL or MySQL. In my experience, PostgreSQL just works. MySQL has one nit, namely it assumes that UTF-8 is at most 3 bytes, but emoji and a few other special characters in common use, as well as characters required by some less common languages, need 4 bytes. So make sure that the Mailman database are configured for utf8mbcs4 (or something like that, you'll know it when you see it in the docs).
On Tue, Oct 03, 2023 at 05:22:44PM +0900, Stephen J. Turnbull wrote:
Ryan Curtin writes:
However, no messages are actually relayed to subscribers. I spent a while digging into logs and found that every message sent to the list results in an error like this in /var/log/mailman3/mailman.log:
Oct 02 17:01:26 2023 (2957556) Uncaught runner exception: (raised as a result of Query-invoked autoflush; consider using a session.no_autoflush block if this flush is occurring prematurely) (sqlite3.OperationalError) database is locked
We don't recommend using sqlite in production, but you know your needs. However, this might be a good moment to switch to a database server like PostgreSQL or MySQL. In my experience, PostgreSQL just works. MySQL has one nit, namely it assumes that UTF-8 is at most 3 bytes, but emoji and a few other special characters in common use, as well as characters required by some less common languages, need 4 bytes. So make sure that the Mailman database are configured for utf8mbcs4 (or something like that, you'll know it when you see it in the docs).
Thanks for the quick response.
I tried your suggestion to switch to MySQL, and this seemed to work. For the sake of posterity I will briefly document what I did, since the migration is not trivial (and not totally documented as far as I could find). This is specifically for the migration of mailman3 core, not mailman3 web, which does occasionally give database locked errors and perhaps should be switched to MySQL too, but, I would rather write my software than debug its mailing list. In any case:
Modify mailman.cfg using the commented-out examples to use the MySQL backend instead of the sqlite3 backend.
Dump old sqlite3 databases with the help of the Python script found at https://stackoverflow.com/questions/18671/quick-easy-way-to-migrate-sqlite3-... a few answers down:
sudo sqlite3 /var/lib/mailman3/data/mailman.db .dump |
python3 /tmp/sqlite3-to-mysql.py > /tmp/mailman.db.dumpNote that the final line of that script needs to be changed from "print line," to the Python3 "print(line)".
Manually modify some things that the Python script didn't get in /tmp/mailman.db.dump. Specifically, any time "VARCHAR" shows up as a type in a "CREATE TABLE", it needs to be changed to "VARCHAR(n)" for some size "n". I just guessed sizes based on the name of the variable, usually 128, 512, or 4096. That's very arbitrary and probably unsafe and I should have done a better job. Whatever.
Add "SET FOREIGN_KEY_CHECKS = 0;" to the top of /tmp/mailman.db.dump, and "SET FOREIGN_KEY_CHECKS = 1;" to the bottom, to relax the constraints while importing.
Create the "mailman3" database in MySQL, then create the user/pass specified in /etc/mailman3/mailman.cfg and grant all privileges on the "mailman3" database.
Import the dump: mysql -u mailman -p mailman3 < /tmp/mailman.db.dump. Hopefully that succeeds. Probably minor changes need to be made here and there if not.
Restart mailman, cross your fingers, then send an email to the list and if you're lucky you'll receive it.
Hopefully that helps someone. (With the big caveat that I don't know about mailman and probably almost everything suggested there is unsafe. But it seems to have worked for me... for now.)
-- Ryan Curtin | "What? Facts?" ryan@ratml.org | - Joe Cairo
participants (2)
-
Ryan Curtin
-
Stephen J. Turnbull