On 7/1/22 1:40 AM, Andy Smith wrote:
In the job list I see:
appname - jobname - when - help
hyperkitty - update_and_clean_index - monthly - Update the full-text index and clean old entries
so I'm guessing it's that, but there is also:
hyperkitty - update_index - hourly - Update the full-text index
which hasn't been noticeably a problem, so I wonder why the "and clean old entries" bit is so much more heavyweight, what exactly it does, and whether there's any way to make it lighter on resources?
The difference is that the update_and_clean_index job calls hyperkitty.search_indexes.update_index() with remove=True. See https://gitlab.com/mailman/hyperkitty/-/blob/master/hyperkitty/search_indexe.... The comments there say "Setting remove to True is extremely slow, it needs to scan the entire index and database."
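To illustrate why the remove step costs so much more, here is a toy sketch. It is not HyperKitty or Haystack code: plain Python sets stand in for the database and the search index, and the function names are made up for the example. The point is only that an incremental update touches the new messages, while finding stale entries means examining every entry in the index.

```python
# Toy model only: sets of message ids stand in for the real database and index.
# incremental_update and update_with_remove are hypothetical names, not HyperKitty APIs.

def incremental_update(index, new_ids):
    """Add newly arrived messages; the work grows with len(new_ids) only."""
    index.update(new_ids)
    return len(new_ids)  # number of entries touched


def update_with_remove(index, database_ids, new_ids):
    """Add new messages, then drop index entries that no longer exist in the database.

    Finding the stale entries requires looking at every entry in the index
    and checking it against the database, so the work grows with the index size.
    """
    index.update(new_ids)
    examined = 0
    stale = set()
    for doc_id in index:
        examined += 1
        if doc_id not in database_ids:
            stale.add(doc_id)
    index.difference_update(stale)
    return examined, len(stale)


# Three messages were deleted from the database but are still in the index.
database_ids = set(range(10_000)) - {5, 6, 7}
index = set(range(10_000))

# Two new messages arrive.
new_ids = {10_000, 10_001}
database_ids |= new_ids

touched = incremental_update(index, new_ids)                         # hourly-style job
examined, removed = update_with_remove(index, database_ids, set())   # monthly-style job
print(touched, examined, removed)
```

In this toy run the incremental pass touches only the two new ids, while the clean pass has to examine every entry in the index to find the three stale ones.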
I second Mark Dadgar's comment to replace whoosh with xapian. It will help, although it requires rebuilding the entire search index.
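As a rough sketch of what that switch looks like in the Django settings, assuming the xapian-haystack package and the Xapian Python bindings are installed. The PATH value below is a placeholder, not your actual index location; keep whatever directory your installation uses.

```python
# Django settings fragment (sketch). Replaces the Whoosh engine with Xapian.
HAYSTACK_CONNECTIONS = {
    'default': {
        # Engine class provided by the xapian-haystack package.
        'ENGINE': 'xapian_backend.XapianEngine',
        # Placeholder path (assumption): point this at your own index directory.
        'PATH': '/path/to/fulltext_index',
    },
}
```

After changing the engine, the existing Whoosh index is not usable by Xapian, so the index has to be rebuilt from scratch, for example with Haystack's rebuild_index management command.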
--
Mark Sapiro <mark@msapiro.net>
San Francisco Bay Area, California

The highway is for gamblers, better use your sense - B. Dylan