
Hi,
After running hyperkitty_import to import thousands of emails from mailman2 (the years 2013-2025) I then ran another import for years 2004-2013 from another mbox file. It was an entirely earlier timeframe, which may be relevant.
After the first import, the "Threads by month" dropdown showed 2013-2025.
After the second import, the "Threads by month" dropdown showed 2013-2025.
That is, 2004-2013 are missing. I have tried various things, including the mailman3-web cron maintenance tasks, waiting a day, and rebooting the server to clear memory. All messages were imported, and can be seen in "All threads". The messages are there. The only problem is the "Threads by month" dropdown. Beyond the dropdown, everything seems fine. Could it be a bug? Should I open an issue? Or, what should I try next? Is there a "refresh all threads by month" specific task?
Deleting all archives and importing the older messages first... that worked correctly. It is a large, destructive action, though. Should that count as the official solution?
Detour:
While debugging, an interesting question came up: had the mailman3-web periodic cron tasks been running, and did those tasks affect the hyperkitty import? Referring to "Cron Jobs for Mailman Web" https://docs.mailman3.org/en/latest/install/virtualenv.html#cron-jobs-for-ma... Those instructions do not send stdout/stderr to a log file, nor do they rotate the log files. What would you think of a pull request that adds "Method 2 - Advanced" and includes logging for the crons? Or maybe I am missing something. Is the output of the cron tasks getting logged in your environment?

On 6/5/25 19:22, Sam Darwin via Mailman-users wrote:
> After running hyperkitty_import to import thousands of emails from mailman2 (the years 2013-2025) I then ran another import for years 2004-2013 from another mbox file. It was an entirely earlier timeframe, which may be relevant.
It probably is. Hyperkitty doesn't import earlier messages by default if the archive includes newer messages. For that, you'd need to call hyperkitty_import with the --since parameter, for instance, hyperkitty_import '--since=2004-01-01 00:00Z' in your case.
Otherwise, messages older than the latest one in the archive will be skipped, which essentially means everything was skipped.
Mihai
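
For reference, a minimal sketch of the kind of invocation described above. The list address and mbox path are placeholders (assumptions, not values from this thread), and the wrapper name is the one used elsewhere in this thread; `-l` and `--since` are hyperkitty_import options. The script only prints the command so it can be reviewed before running.

```shell
# Placeholders (assumptions, not from the thread): list address and mbox path.
LIST_ADDRESS="mylist@example.com"
MBOX_FILE="/path/to/archive-2004-2013.mbox"
# Cut-off date in the format suggested above.
SINCE="2004-01-01 00:00Z"

# Build the import command line as a single string.
build_import_cmd() {
    printf "mailman-web-wrapper hyperkitty_import -l %s --since='%s' %s\n" \
        "$LIST_ADDRESS" "$SINCE" "$MBOX_FILE"
}

# Print the command for review; pipe the output to sh to actually run it.
build_import_cmd
```

Printing first is a deliberate choice here: an import into a live archive is not easy to undo, so it seems safer to inspect the exact command before executing it.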

Thanks for the reply. There is double evidence they were imported:
- I did include --since 1970. The command was mailman-web-wrapper hyperkitty_import --since 1970
- The messages are present. If I navigate to "All threads", and the last page, it's full of emails from 2004, as is the second-to-last page, and so on.

On 6/5/25 19:51, Sam Darwin via Mailman-users wrote:
> Thanks for the reply. There is double evidence they were imported:
> - I did include --since 1970. The command was mailman-web-wrapper hyperkitty_import --since 1970
> - The messages are present. If I navigate to "All threads", and the last page, it's full of emails from 2004, as is the second-to-last page, and so on.
Oh, sorry, I only read later that you can access the messages in the archive if moving through it directly.
Have you also run update_index_one_list since then? I'm not sure whether that only updates the list's index for full-text search or also generates things like the dropdown by processing the messages.
And, yes, there are also cron jobs that run periodically, but if you have waited a day, these should have executed already. Curious.
Mihai
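
A small sketch of running the reindex mentioned above for a single list. The list address is a placeholder, `reindex_list` is a hypothetical helper name, and the wrapper name is the one used elsewhere in this thread. Whether this rebuilds the "Threads by month" dropdown is not confirmed anywhere in the thread; it is only known to update the search index.

```shell
# Placeholder list address (assumption, not from the thread).
LIST_ADDRESS="mylist@example.com"

# Hypothetical helper: run update_index_one_list for one list,
# or return 127 if the wrapper is not installed on this machine.
reindex_list() {
    if command -v mailman-web-wrapper >/dev/null 2>&1; then
        mailman-web-wrapper update_index_one_list "$1"
    else
        echo "mailman-web-wrapper not found; skipping reindex of $1" >&2
        return 127
    fi
}

reindex_list "$LIST_ADDRESS" || echo "reindex did not complete for $LIST_ADDRESS"
```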

On 6/5/25 10:22, Sam Darwin via Mailman-users wrote:
> Deleting all archives and importing the older messages first... that worked correctly. It is a large, destructive action, though. Should that count as the official solution?
No, that shouldn't be the "official" solution.
It may be too late now, but did you restart mailman-web? See the thread (started by you) at https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/thread/J...
> Detour:
> While debugging, an interesting question came up: had the mailman3-web periodic cron tasks been running, and did those tasks affect the hyperkitty import? Referring to "Cron Jobs for Mailman Web" https://docs.mailman3.org/en/latest/install/virtualenv.html#cron-jobs-for-ma... Those instructions do not send stdout/stderr to a log file, nor do they rotate the log files. What would you think of a pull request that adds "Method 2 - Advanced" and includes logging for the crons? Or maybe I am missing something. Is the output of the cron tasks getting logged in your environment?
Any output to stdout or stderr from any of the crons will be emailed to the owner of the crontab or the address in a MAILTO= in the crontab. You should ensure that mail to mailman is deliverable.
--
Mark Sapiro <mark@msapiro.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
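
To see where cron output would be mailed, one option is to check whether the crontab contains a MAILTO line. The sketch below assumes it is run as the user that owns the mailman-web cron jobs; `has_mailto` is a hypothetical helper name.

```shell
# Succeeds if the crontab text on stdin contains a MAILTO assignment.
has_mailto() {
    grep -Eq '^[[:space:]]*MAILTO[[:space:]]*='
}

# Inspect the current user's crontab; an unreadable or missing crontab
# is treated the same as one without a MAILTO line.
if crontab -l 2>/dev/null | has_mailto; then
    echo "MAILTO line found in crontab"
else
    echo "no MAILTO line found in crontab"
fi
```

Without a MAILTO line, output is mailed to the crontab owner, so that local address needs to be deliverable for any job output to be seen.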

> did you restart mailman-web?
After encountering the problem, one thing I tried is rebooting, which would restart mailman-web. Didn't help.
> See the thread (started by you)
I had updated other threads, but not that one. The problem, discovered a few weeks later, was that qcluster was not running.
> You should ensure that mail to mailman is deliverable
I believe it is not yet. Will make a note to investigate that. However, I think logging to /var/log/ is better than email, for system processes.
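
As a rough sketch of the logging idea, a small wrapper can append a job's stdout and stderr to a file instead of leaving them for cron to mail. The helper name `run_logged` and the log path in the usage comment are hypothetical; `mailman-web runjobs hourly` is one of the jobs from the documented crontab. Log rotation is not handled here and would need something like a separate logrotate rule.

```shell
# Hypothetical helper: run a command and append its stdout and stderr,
# preceded by a timestamped header line, to the given log file.
# The exit status of the wrapped command is passed through.
run_logged() {
    log_file="$1"
    shift
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
    } >>"$log_file" 2>&1
}

# Example use from a script called by cron (log path is an assumption):
#   run_logged /var/log/mailman-web/cron-hourly.log mailman-web runjobs hourly
```

Because the exit status is preserved, a calling script can still react to failures even though nothing is printed for cron to mail.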

On 6/5/25 12:33, Sam Darwin via Mailman-users wrote:
> I believe it is not yet. Will make a note to investigate that. However, I think logging to /var/log/ is better than email, for system processes.
Arguably it's not a system process, but rather a user cron.
--
Mark Sapiro <mark@msapiro.net>
San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
participants (3)
- Mark Sapiro
- Mihai Moldovan
- Sam Darwin