On 4/25/21 2:20 PM, Andrew Hodgson wrote:
Hi,
Hyperkitty 1.3.4.
I am trying to download a complete list mbox by going to all threads view and using the download option. I have tried a couple of tools (Gunzip and Winrar) and both are giving me an unexpected end of file when trying to decompress the gz file.
Here is the list URL I am using: https://lists.hodgsonfamily.org/hyperkitty/list/plextalk@lists.hodgsonfamily...
On 25-Apr-21 18:08, Mark Sapiro wrote:
Depending on the web server configuration, timeouts can occur when downloading large archive mboxes. Instead of downloading the entire mbox with <https://lists.hodgsonfamily.org/hyperkitty/list/plextalk@lists.hodgsonfamily.org/export/plextalk@lists.hodgsonfamily.org-2021-04.mbox.gz?start=2008-02-19&end=2021-04-25>, do it in pieces by adjusting start and end. e.g.
Although, I don't think timing out is the issue, and I'm not sure what is, but I think it has something to do with messages in the archive. If I try to get the 3 pieces as above, the first piece with start=2008-02-19&end=2012-12-31 works but the others don't and even the smaller fails the same way.
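For anyone who wants to script the "do it in pieces" approach, here is a minimal Python sketch that only builds the export URLs, one per calendar year. The helper names (yearly_ranges, piece_urls) are made up for illustration, the one-year piece size is an arbitrary choice, and the thread does not say whether the end date is inclusive, so check the boundary days against your own server before relying on it.

```python
from datetime import date
from typing import List, Tuple
from urllib.parse import urlencode


def yearly_ranges(start: date, end: date) -> List[Tuple[date, date]]:
    """Split start..end into pieces that stop on 31 December or on `end`."""
    pieces = []
    piece_start = start
    while piece_start <= end:
        # Each piece stops at the end of its calendar year, or at `end`.
        piece_end = min(date(piece_start.year, 12, 31), end)
        pieces.append((piece_start, piece_end))
        piece_start = date(piece_end.year + 1, 1, 1)
    return pieces


def piece_urls(export_url: str, start: date, end: date) -> List[str]:
    """Append start/end query parameters to an export URL, one per piece.

    `export_url` is the full .mbox.gz export URL without its query string.
    """
    return [
        f"{export_url}?{urlencode({'start': s.isoformat(), 'end': e.isoformat()})}"
        for s, e in yearly_ranges(start, end)
    ]


if __name__ == "__main__":
    # Hypothetical host and list name, for illustration only.
    demo = "https://lists.example.org/hyperkitty/list/demo@lists.example.org/export/demo.mbox.gz"
    for url in piece_urls(demo, date(2008, 2, 19), date(2021, 4, 25)):
        print(url)
```

Fetching the URLs one at a time also shows quickly which date range triggers a failure, which is how the December 2013 problem was narrowed down.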
If I try to get 2013 month by month, all work except December, which gives me an internal server error. What's the traceback from that error?
The described timeouts are something that hyperkitty ought to be able to avoid. For apache, the timeout is idle time between blocks of output. Hyperkitty can avoid this by generating the archive in segments (based on size, or elapsed time), flushing its output buffer, generating a multi-file archive, and/or using Transfer-Encoding: chunked (chunked doesn't work for HTTP/2). It ought to be able to break the work into blocks of "n" messages & do something to generate output. Besides avoiding timeouts, working in segments allows the GUI to display meaningful progress (e.g. if you're loading with XMLHttpRequest, "onprogress"). It really oughtn't be up to the user to break up the request.
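To make the "blocks of n messages" idea concrete, here is a minimal Python sketch of a generator that gzip-compresses messages and forces output after every block, so the connection never sits idle while a large archive is being built. This is not HyperKitty's export code. The function name and the block_size parameter are invented for illustration, and it assumes the web framework can stream an iterator of byte chunks as the response body.

```python
import zlib
from typing import Iterable, Iterator


def stream_gzip_in_blocks(messages: Iterable[bytes],
                          block_size: int = 100) -> Iterator[bytes]:
    """Yield gzip-compressed bytes, flushing after every `block_size` messages."""
    # wbits=31 selects the gzip container (header and trailer), not raw zlib.
    compressor = zlib.compressobj(wbits=31)
    pending = 0
    for message in messages:
        data = compressor.compress(message)
        if data:
            yield data
        pending += 1
        if pending >= block_size:
            # Z_SYNC_FLUSH pushes out everything compressed so far, at a
            # small cost in compression ratio, so the client receives bytes
            # at regular intervals.
            flushed = compressor.flush(zlib.Z_SYNC_FLUSH)
            if flushed:
                yield flushed
            pending = 0
    # The final flush writes the gzip trailer so the file decompresses cleanly.
    yield compressor.flush()


if __name__ == "__main__":
    # Fake mbox-style messages, for illustration only.
    fake = (b"From demo@example.org\n\nmessage %d\n\n" % i for i in range(1000))
    total = sum(len(chunk) for chunk in stream_gzip_in_blocks(fake, block_size=50))
    print(total, "compressed bytes streamed")
```

Each yielded chunk resets an idle timer of the kind described above, and the number of blocks sent so far gives the client something to report as progress.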
Until then: the apache directive is "TimeOut" (or "ProxyTimeout"), with a default value of 60 (seconds). It's a server config/virtual host parameter, so if you're running in an environment where you only have .htaccess, you need admin help or you're out of luck.
Other webservers (especially those with accelerators) may have more granular timeouts.