On 01/19/2018 03:53 PM, Cyprian Laskowski wrote:
I (finally) set up the standard cronjobs (http://docs.mailman3.org/en/latest/config-web.html#scheduled-tasks-required) last week, but this doesn't seem to have any obvious effect on this problem (i.e., it's not clearly either better or worse). But I wonder whether the fact that I had overlooked this (so that our installation didn't have these cronjobs installed for months) could have resulted in some hard-to-fix corrupt state?

That shouldn't be the case. The cronjobs fix issues for all existing entries. They are executed as often as deemed necessary to keep the state reasonably stable without impacting performance too much...
We also looked into the other logs a little and found that uwsgi-error.log is getting filled constantly. Even if the cronjobs are commented out, and even in the middle of the night, and even though we only have a few mailing lists, all with limited activity, there's probably more than a line per second in this log. Here's the last screenful, which seems as representative as any:
14:45:15 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:15 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:16 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:16 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:16 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:16 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:17 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:17 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:17 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:18 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:18 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:19 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_for_month]
14:45:19 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:20 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:20 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:20 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:21 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:21 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:21 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:22 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:22 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:23 [Q] INFO Process-1:7688 processing [check_orphans]
14:45:23 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:23 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:24 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:24 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:24 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:25 [Q] INFO Process-1:7687 processing [update_from_mailman]
14:45:25 [Q] INFO Process-1:7688 processing [compute_thread_positions]
14:45:25 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_recent]
14:45:26 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:26 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:26 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:26 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:27 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:27 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:27 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:27 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_for_month]
14:45:28 [Q] INFO Process-1:7687 processing [check_orphans]
14:45:28 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:29 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:29 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:29 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
14:45:29 [Q] INFO Process-1:7687 processing [compute_thread_positions]
14:45:30 [Q] ERROR Failed [compute_thread_positions] - Thread matching query does not exist.
14:45:30 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_recent]
14:45:30 [Q] INFO Process-1:7688 processing [rebuild_mailinglist_cache_recent]
14:45:30 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:31 [Q] ERROR Failed [rebuild_mailinglist_cache_recent] - MailingList matching query does not exist.
14:45:31 [Q] INFO Process-1:7688 processing [check_orphans]
14:45:31 [Q] ERROR Failed [check_orphans] - Email matching query does not exist.
14:45:31 [Q] INFO Process-1:7687 processing [rebuild_mailinglist_cache_for_month]
14:45:31 [Q] ERROR Failed [rebuild_mailinglist_cache_for_month] - MailingList matching query does not exist.
14:45:32 [Q] INFO Process-1:7688 processing [update_from_mailman]
14:45:32 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
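To see which tasks fail most often and which model lookup they fail on, a log like the one above can be tallied with a few lines of Python. This is only a rough sketch: the function name tally_failures is made up for illustration, and the pattern assumes every failure entry has exactly the "[Q] ERROR Failed [task] - Model matching query does not exist." shape seen in this excerpt. It scans the whole text rather than splitting on newlines, so it does not matter how the entries are wrapped.

```python
import re
from collections import Counter

# Assumed entry shape, taken from the excerpt above:
#   14:45:15 [Q] ERROR Failed [update_from_mailman] - MailingList matching query does not exist.
FAILED = re.compile(
    r"\[Q\] ERROR Failed \[(\w+)\] - (\w+) matching query does not exist\."
)


def tally_failures(log_text):
    """Count failures per (task name, missing model) pair in the log text."""
    # findall returns one (task, model) tuple per matching entry.
    return Counter(FAILED.findall(log_text))


def print_report(log_text):
    """Print the failure counts, most frequent first."""
    for (task, model), count in tally_failures(log_text).most_common():
        print(f"{count:6d}  {task}  (missing {model})")
```

To use it, read the log file into a string (for example the contents of uwsgi-error.log) and pass that string to print_report.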
We also noticed that there's almost constant writing going on. A typical line from iotop:
7430 be/4 www-data 0.00 B/s 31.34 K/s 0.00 % 44.14 % python ./manage.py qcluster
We will try an MM3 upgrade on Monday (to realign all the suite components with the 3.1 versions here: https://wiki.list.org/Mailman3), and maybe migrate to postgresql, but I have a feeling that there's a more "basic" problem. Help or even speculation would be appreciated. :)

Migrating to PostgreSQL should at least take care of the database locking problem.
Make sure you are actually using either the master branch (with the latest commit) of hyperkitty or the latest release of it.
Now comes the "speculation": to me it looks like all this output is generated by the django-q tasks. I'm not that familiar with django-q, but I guess it's either a configuration error or a bug in HyperKitty. The errors seem to correspond to unhandled exceptions; however, a quick glance at the current master shows that they are handled, so I'm not sure what's going on here...

Compare your settings to the settings in mailman-suite (or, better, the example-project of HyperKitty). Update your settings if you spot differences that might have something to do with this. If that doesn't help, open an issue here: https://gitlab.com/mailman/hyperkitty/issues
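If comparing the two settings files by eye gets tedious, something along these lines could list the differences. It is a minimal sketch, not part of HyperKitty or mailman-suite: diff_settings and MISSING are hypothetical names, and it assumes both settings modules have already been loaded into plain dicts (for instance with vars() on the imported module).

```python
# Marker used when a setting exists on only one side (hypothetical helper).
MISSING = "<missing>"


def diff_settings(mine, reference, keys=None):
    """Return {name: (my_value, reference_value)} for settings that differ.

    `mine` and `reference` are dicts mapping setting names to values.
    If `keys` is None, every upper-case name found on either side is checked,
    following the Django convention that settings are upper-case.
    """
    if keys is None:
        keys = sorted(k for k in set(mine) | set(reference) if k.isupper())
    differences = {}
    for key in keys:
        left = mine.get(key, MISSING)
        right = reference.get(key, MISSING)
        if left != right:
            differences[key] = (left, right)
    return differences
```

Since the errors come from the django-q tasks, a narrow first pass such as diff_settings(vars(my_settings), vars(reference_settings), keys=["Q_CLUSTER", "DATABASES", "INSTALLED_APPS"]) may be the most useful; Q_CLUSTER is the setting django-q reads its configuration from, and my_settings and reference_settings stand for whatever modules you import.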