On 7/26/24 11:19 PM, Mark Sapiro wrote:
On 7/26/24 18:53, Bryan Fields wrote:
On 7/26/24 4:01 PM, Mark Sapiro wrote:
If this returns just the one message_id, I would then
DELETE FROM hyperkitty_thread WHERE id = 35670;
DELETE FROM hyperkitty_email WHERE id = 150555;
It's giving an error due to a fk constraint on another table. I'm not familiar with the db structure enough to be certain how to clean/fix it.
Yes, you need to do these first.
DELETE FROM hyperkitty_favorite WHERE thread_id = 35670;
DELETE FROM hyperkitty_tagging WHERE thread_id = 35670;
DELETE FROM hyperkitty_lastview WHERE thread_id = 35670;
DELETE FROM hyperkitty_vote WHERE email_id = 150555;
Not all those will exist, but then you should be able to
DELETE FROM hyperkitty_thread WHERE id = 35670;
DELETE FROM hyperkitty_email WHERE id = 150555;
Still no go on it.
mailmanweb=> DELETE FROM hyperkitty_favorite WHERE thread_id = 35670;
DELETE 0
mailmanweb=> DELETE FROM hyperkitty_tagging WHERE thread_id = 35670;
DELETE 0
mailmanweb=> DELETE FROM hyperkitty_lastview WHERE thread_id = 35670;
DELETE 0
mailmanweb=> DELETE FROM hyperkitty_vote WHERE email_id = 150555;
DELETE 0
mailmanweb=> DELETE FROM hyperkitty_thread WHERE id = 35670;
ERROR: update or delete on table "hyperkitty_thread" violates foreign key constraint "hyperkitty_email_thread_id_038c9aee_fk_hyperkitty_thread_id" on table "hyperkitty_email"
DETAIL: Key (id)=(35670) is still referenced from table "hyperkitty_email".
mailmanweb=> DELETE FROM hyperkitty_email WHERE id = 150555;
ERROR: update or delete on table "hyperkitty_email" violates foreign key constraint "hyperkitty_thread_starting_email_id_fa7c55f5_fk_hyperkitt" on table "hyperkitty_thread"
DETAIL: Key (id)=(150555) is still referenced from table "hyperkitty_thread".
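The two errors above point at each other: the thread row references its starting email, and the email row references its thread, so neither row can be deleted on its own. The sketch below reproduces that circular reference in an in-memory SQLite database (not PostgreSQL) with two simplified, hypothetical tables named thread and email. It assumes the foreign keys are declared DEFERRABLE INITIALLY DEFERRED, which is how Django usually creates them on PostgreSQL; whether that holds for this particular HyperKitty schema is an assumption and should be checked before trying the same thing on a live database. Under that assumption, deleting both rows inside one explicit transaction succeeds, because the constraint check is postponed until COMMIT.

```python
import sqlite3


def demo():
    # isolation_level=None gives autocommit behaviour, similar to typing
    # statements one at a time at the psql prompt.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("PRAGMA foreign_keys = ON")

    # Simplified stand-ins for hyperkitty_thread and hyperkitty_email.
    # The DEFERRABLE clause is an assumption about the real schema.
    conn.execute(
        "CREATE TABLE thread ("
        " id INTEGER PRIMARY KEY,"
        " starting_email_id INTEGER REFERENCES email(id)"
        " DEFERRABLE INITIALLY DEFERRED)"
    )
    conn.execute(
        "CREATE TABLE email ("
        " id INTEGER PRIMARY KEY,"
        " thread_id INTEGER REFERENCES thread(id)"
        " DEFERRABLE INITIALLY DEFERRED)"
    )

    # Insert the mutually referencing pair inside one transaction.
    conn.execute("BEGIN")
    conn.execute("INSERT INTO thread VALUES (35670, 150555)")
    conn.execute("INSERT INTO email VALUES (150555, 35670)")
    conn.execute("COMMIT")

    # A stand-alone DELETE commits immediately, so the check fires at once.
    try:
        conn.execute("DELETE FROM thread WHERE id = 35670")
        alone_fails = False
    except sqlite3.IntegrityError:
        alone_fails = True

    # Both deletes in one transaction: the check runs only at COMMIT,
    # by which time no dangling reference is left.
    conn.execute("BEGIN")
    conn.execute("DELETE FROM thread WHERE id = 35670")
    conn.execute("DELETE FROM email WHERE id = 150555")
    conn.execute("COMMIT")

    threads = conn.execute("SELECT COUNT(*) FROM thread").fetchone()[0]
    emails = conn.execute("SELECT COUNT(*) FROM email").fetchone()[0]
    conn.close()
    return {"alone_fails": alone_fails, "remaining": (threads, emails)}


if __name__ == "__main__":
    print(demo())
```

If the real constraints turn out not to be deferrable, this approach will not work as written and the mbox clean-up and re-import discussed below remains the safer route.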
I'm actually thinking I need to fix the archive mbox, then delete them and re-import them once they are cleaned.
Actually, that would be safest. Fix the mbox, delete the entire archive and re-run the import.
I'll attempt to clean this up tomorrow and re-import.
Is there a quick/easy way to drop the entire archive from the db or must I delete the entire list?
Also, FYI, assuming those messages with very old Date: headers had reasonable unix from dates, the hyperkitty/contrib/cleanarch3 script would fix them, except you have to get the script from https://gitlab.com/mailman/hyperkitty/-/blob/master/hyperkitty/contrib/clean... or https://www.msapiro.net/scripts/cleanarch3 because the script in the latest release has a bug.
I tried the dry run option on your script and it gave a bunch of output for bad dates. This was absent on the one shipping with the source. What's interesting is that pipermail seems to have no issue with this, detecting the date correctly.
If I recall correctly, pipermail checks date skew and 'fixes' the dates in the archive, but they remain off in the mbox.
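To find candidate messages before running any fix, one can compare each message's Date: header with the date on its mbox "From " separator line. The following is a minimal sketch of that comparison using only the Python standard library. It is not the cleanarch3 script and does not reproduce pipermail's logic; the function names and the 30-day threshold are illustrative assumptions, and the separator date is treated as UTC for simplicity.

```python
import mailbox
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def from_line_date(from_line):
    """Parse the asctime-style date at the end of an mbox From_ line,
    e.g. 'user@example.com Fri Jul 26 23:19:00 2024'."""
    stamp = " ".join(from_line.split()[-5:])
    parsed = datetime.strptime(stamp, "%a %b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)  # assumption: treat as UTC


def date_is_suspect(from_line, date_header, max_skew_days=30):
    """Return True if the Date: header is missing, unparseable, or more
    than max_skew_days away from the From_ line date."""
    if not date_header:
        return True
    try:
        header_date = parsedate_to_datetime(date_header)
    except (TypeError, ValueError):
        return True
    if header_date.tzinfo is None:
        header_date = header_date.replace(tzinfo=timezone.utc)
    try:
        envelope_date = from_line_date(from_line)
    except ValueError:
        return True  # the From_ line itself could not be parsed
    return abs(header_date - envelope_date) > timedelta(days=max_skew_days)


def scan_mbox(path, max_skew_days=30):
    """Yield (key, From_ line, Date: header) for suspect messages."""
    for key, msg in mailbox.mbox(path).iteritems():
        raw = msg["Date"]
        header = None if raw is None else str(raw)
        if date_is_suspect(msg.get_from(), header, max_skew_days):
            yield key, msg.get_from(), header
```

Calling scan_mbox() on a copy of the list's mbox only reports suspects; it changes nothing, so it is safe to run before deciding how to repair the file.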
I'll have to check that, but I've got at least some good leads now and tools to find suspected issues. It does seem like a bad design that importing a mailman2 mbox can cause the server to error 500.
Is the archive tool in pipermail's import more robust in this manner? I'd argue it's common to have missing In-Reply-To: headers where the subject and time would need to be used to infer the likely thread. I'll agree it's a major violation of the relevant RFCs to be missing this, but many MUAs (M$) are famous for doing just this.
Yes, it's arguably a defect in HyperKitty to not implement Jamie Zawinski's threading algorithm <https://www.jwz.org/doc/threading.html>, but it doesn't.
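For illustration, here is a much simplified sketch of the subject fallback described above: follow In-Reply-To when it points at a known message, otherwise attach the message to an earlier thread with the same normalized subject. This is not Jamie Zawinski's full algorithm and not what HyperKitty does; the message dictionaries, the field names, and the list of reply prefixes are assumptions made for the example, and the time-window check mentioned above is left out.

```python
import re

# Leading reply/forward markers such as "Re:", "RE[2]:", "Fwd:", "AW:".
_PREFIX = re.compile(r"^\s*((re|fwd?|aw)\s*(\[\d+\])?\s*:\s*)+", re.IGNORECASE)


def normalize_subject(subject):
    """Strip reply prefixes, collapse whitespace, and lowercase."""
    stripped = _PREFIX.sub("", subject or "")
    return re.sub(r"\s+", " ", stripped).strip().lower()


def assign_threads(messages):
    """Map each message id to the id of its thread's first message.

    messages: iterable of dicts with 'id', 'in_reply_to' (or None) and
    'subject', in arrival order (a hypothetical input shape).
    """
    thread_of = {}   # message id -> thread root id
    by_subject = {}  # normalized subject -> thread root id
    for msg in messages:
        key = normalize_subject(msg.get("subject"))
        parent = msg.get("in_reply_to")
        if parent in thread_of:
            root = thread_of[parent]      # header-based threading
        elif key and key in by_subject:
            root = by_subject[key]        # subject-based fallback
        else:
            root = msg["id"]              # start a new thread
        thread_of[msg["id"]] = root
        if key:
            by_subject.setdefault(key, root)
    return thread_of
```

A reply sent without In-Reply-To but with the subject "RE: archive import" would land in the same thread as an earlier "Archive import" message, at the cost of occasionally merging unrelated threads that happen to share a subject.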
Well that kinda blows :-(
I'm not going to take down the old archives, but I'd like at least one to be searchable, which is why we wanted to use mailman3/hyperkitty.
Thank you,
Bryan Fields
727-409-1194 - Voice http://bryanfields.net