Hi,
On Fri, Jul 01, 2022 at 12:57:41PM -0700, Mark Sapiro wrote:
> I second Mark Dadgar's comment to replace whoosh with xapian. It will help, although it requires rebuilding the entire search index.
Thanks both. Yes, I am currently using whoosh, which is the Debian packages' default.
I decided to dramatically reduce the number of messages in our archives, as the vast majority of them are for lists whose archives were never actually looked at under Mailman2, and external archives of them already exist.
Even with only ~6k messages, that monthly job still took approx 16 hours to run, and during that time it spewed out thousands of these errors:
Failed to remove document 'hyperkitty.email.29351' from Whoosh:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/haystack/backends/whoosh_backend.py", line 309, in remove
    self.index.delete_by_query(q=self.parser.parse('%s:"%s"' % (ID, whoosh_id)))
  File "/usr/lib/python3/dist-packages/whoosh/index.py", line 365, in delete_by_query
    w = self.writer()
  File "/usr/lib/python3/dist-packages/whoosh/index.py", line 464, in writer
    return SegmentWriter(self, **kwargs)
  File "/usr/lib/python3/dist-packages/whoosh/writing.py", line 515, in __init__
    raise LockError
whoosh.index.LockError
…so I'm not even sure if it worked.
I will definitely have to look into switching to Xapian.
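For my own notes, my understanding is that the switch is mostly a change to the Haystack settings in the Django configuration for the web UI, roughly as sketched below. This assumes the xapian-haystack backend is installed; the index path is only a placeholder I made up, not what the Debian package actually uses.

```python
# Sketch of a Django settings fragment for the Mailman web UI.
# Assumption: xapian-haystack is installed, providing xapian_backend.XapianEngine.
# Assumption: the PATH below is a hypothetical location; use whatever
# directory the existing whoosh index setting points at on your system.
HAYSTACK_CONNECTIONS = {
    "default": {
        "ENGINE": "xapian_backend.XapianEngine",
        "PATH": "/var/lib/mailman3/web/fulltext_index",
    },
}
```

As Mark says, the index would then need rebuilding from scratch, which I believe is done with Haystack's rebuild_index management command.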
Thanks, Andy