Mark Sapiro wrote:
On 4/25/21 2:20 PM, Andrew Hodgson wrote:
Hi,
Hyperkitty 1.3.4.
I am trying to download a complete list mbox by going to all threads view and using the download option. I have tried a couple of tools (Gunzip and Winrar) and both are giving me an unexpected end of file when trying to decompress the gz file.
[...]
Depending on the web server configuration, timeouts can occur when downloading large archive mboxes. Instead of downloading the entire mbox with <https://lists.hodgsonfamily.org/hyperkitty/list/plextalk@lists.hodgsonfamily.org/export/plextalk@lists.hodgsonfamily.org-2021-04.mbox.gz?start=2008-02-19&end=2021-04-25>, do it in pieces by adjusting start and end, e.g.
[...]
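One way to produce piecewise export URLs like those described above is to generate one URL per calendar year. This is only a sketch: the function name `yearly_export_urls` is made up for illustration, the one-year piece size is arbitrary, and whether HyperKitty treats `end` as inclusive is an assumption that should be checked against the downloaded pieces.

```python
from datetime import date
from urllib.parse import urlencode

# Export URL from the message above, without the query string.
BASE = (
    "https://lists.hodgsonfamily.org/hyperkitty/list/"
    "plextalk@lists.hodgsonfamily.org/export/"
    "plextalk@lists.hodgsonfamily.org-2021-04.mbox.gz"
)


def yearly_export_urls(base, start, end):
    """Yield one export URL per calendar year from start to end.

    Hypothetical helper; assumes `start` and `end` take ISO dates,
    as in the URL quoted above.
    """
    piece_start = start
    while piece_start <= end:
        # Each piece stops at Dec 31 of its year, or at the overall end date.
        piece_end = min(date(piece_start.year, 12, 31), end)
        query = urlencode(
            {"start": piece_start.isoformat(), "end": piece_end.isoformat()}
        )
        yield f"{base}?{query}"
        piece_start = date(piece_start.year + 1, 1, 1)


urls = list(yearly_export_urls(BASE, date(2008, 2, 19), date(2021, 4, 25)))
for url in urls:
    print(url)
```

If a piece fails to decompress, narrowing its date range further should help locate the messages that break the export.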
Although, I don't think timing out is the issue, and I'm not sure what is, but I think it has something to do with messages in the archive. If I try to get the 3 pieces as above, the first piece with start=2008-02-19&end=2012-12-31 works but the others don't and even the smaller
Yep, there is a problem with an imported mbox, and it seems to be present somewhere in the archive on all the lists which I imported using that method. Here is a recent trace from the download option:
[2021-04-26 08:53:12 +0000] [28181] [ERROR] Error handling request
Traceback (most recent call last):
  File "/opt/mailman/venv/lib/python3.7/site-packages/gunicorn/workers/sync.py", line 180, in handle_request
    for item in respiter:
  File "/opt/mailman/venv/lib/python3.7/site-packages/hyperkitty/views/mlist.py", line 335, in stream_mbox
    msg = email.as_message()
  File "/opt/mailman/venv/lib/python3.7/site-packages/hyperkitty/models/email.py", line 178, in as_message
    msg["Message-ID"] = "<%s>" % self.message_id
  File "/usr/lib/python3.7/email/message.py", line 409, in __setitem__
    self._headers.append(self.policy.header_store_parse(name, val))
  File "/usr/lib/python3.7/email/policy.py", line 145, in header_store_parse
    raise ValueError("Header values may not contain linefeed "
ValueError: Header values may not contain linefeed or carriage return characters
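The trace suggests that the stored message_id of at least one archived message contains a line break. The same ValueError can be reproduced with the standard library alone, as in the sketch below. The sample Message-ID value is invented, and `clean_message_id` is a hypothetical helper shown only to illustrate the idea; it is not a HyperKitty function or a recommended fix.

```python
from email.message import EmailMessage


def set_message_id(raw_id):
    """Mimic the failing assignment from the traceback."""
    msg = EmailMessage()
    msg["Message-ID"] = "<%s>" % raw_id
    return msg


def clean_message_id(raw_id):
    """Hypothetical helper: drop all whitespace, including line breaks."""
    return "".join(raw_id.split())


# Invented value with an embedded line break, as a folded header might leave.
bad_id = "abc\n def@example.com"

try:
    set_message_id(bad_id)
    error = None
except ValueError as exc:
    error = str(exc)

print(error)

# With the line break removed, the header can be set normally.
msg = set_message_id(clean_message_id(bad_id))
print(msg["Message-ID"])
```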
I do still have the original mbox files for the imported lists which were on Mailman 2.1. The other thing is that when downloading these mbox files, I noticed they aren't the raw messages we had in Mailman 2.1; headers have been stripped and email addresses modified. Is there any way to get the raw mbox back, as with the option we had in Mailman 2.1?
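Since the original Mailman 2.1 mbox files are still around, one way to look for the offending messages would be to scan them for Message-ID headers that span more than one line. This is a sketch using Python's standard mailbox module. It assumes, without confirmation, that the line breaks come from folded Message-ID headers in the imported mbox, and the function name is made up for illustration.

```python
import mailbox


def find_multiline_message_ids(mbox_path):
    """Return (index, raw value) for messages whose Message-ID has a line break.

    Hypothetical helper; the index is the message's position in the mbox.
    """
    hits = []
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for index, msg in enumerate(box):
            # str() guards against Header objects returned for odd bytes.
            value = str(msg.get("Message-ID", ""))
            if "\n" in value or "\r" in value:
                hits.append((index, value))
    finally:
        box.close()
    return hits
```

Calling `find_multiline_message_ids` with the path to one of the original mbox files returns a list that can be compared with the date ranges that fail to export.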
Thanks. Andrew.