I (finally) set up the standard cronjobs (http://docs.mailman3.org/en/latest/config-web.html#scheduled-tasks-required) last week, but this doesn't seem to have had any obvious effect on this problem (i.e., it's not clearly either better or worse). Still, I wonder whether having overlooked this, so that our installation ran without these cronjobs for months, could have left things in some hard-to-fix corrupt state?
We also looked into the other logs a little and found that uwsgi-error.log is being written to constantly. Even with the cronjobs commented out, even in the middle of the night, and even though we only have a few mailing lists, all with limited activity, this log seems to grow by more than a line per second. Here's the last screenful, which seems as representative as any:
14:45:15 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:15 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:16 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:16 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:16 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:16 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:17 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:17 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:17 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:18 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:18 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:19 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_for_month]
14:45:19 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:20 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:20 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:20 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:21 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:21 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:21 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:22 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:22 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:23 [Q] INFO Process-1:7688 processing [check_orphans]
14:45:23 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:23 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:24 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:24 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:24 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:25 [Q] INFO Process-1:7687 processing [update_from_mailman]
14:45:25 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:25 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:26 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:26 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:26 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:26 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:27 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:27 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:27 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:27 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:28 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:28 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:29 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:29 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:29 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:29 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:30 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:30 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_recent]
14:45:30 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_recent]
14:45:30 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:31 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:31 [Q] INFO Process-1:7688 processing [check_orphans]
14:45:31 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:31 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_for_month]
14:45:31 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:32 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:32 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
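In case it helps anyone compare against their own installation, here is a rough sketch of how the failures in such a log could be tallied per task and per missing model. The function name summarize_failures and the regular expression are my own and assume the "[Q] ERROR Failed [task] - Model matching query does not exist" wording shown in the excerpt; nothing here is part of Mailman or HyperKitty.

```python
import re
from collections import Counter

# Assumed failure format, taken from the excerpt above:
#   14:45:15 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
FAILURE = re.compile(
    r"\[Q\] ERROR Failed \[(\w+)\] - (\w+) matching query does not exist"
)


def summarize_failures(log_text):
    """Count failures keyed by (task name, missing model).

    Scans the whole text rather than line by line, so it also works
    if several log entries ended up on one physical line.
    """
    return Counter(FAILURE.findall(log_text))


# Three entries copied from the log excerpt, used only as a demo input.
SAMPLE = (
    "14:45:15 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.\n"
    "14:45:16 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.\n"
    "14:45:17 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.\n"
)

for (task, model), count in summarize_failures(SAMPLE).most_common():
    print(f"{count:4d}  {task}  (missing {model})")
```

To run it on the real file, one would pass the contents of uwsgi-error.log to summarize_failures instead of SAMPLE. If the same handful of (task, model) pairs dominates, that would at least be consistent with the same queued tasks failing over and over, though the tally alone cannot prove it.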
We also noticed that there's almost constant writing going on. A typical line from iotop:
7430 be/4 www-data 0.00 B/s 31.34 K/s 0.00 % 44.14 % python ./manage.py qcluster
We will try an MM3 upgrade on Monday (to realign all the suite components with the 3.1 versions listed here: https://wiki.list.org/Mailman3), and maybe migrate to PostgreSQL, but I have a feeling that there's a more "basic" problem. Help or even speculation would be appreciated. :)