It's funny that this should come up: just yesterday I was asking questions on the list about importing mbox format files into hyperkitty.
The archives I needed to import went back 30 years. I thought I had found a bug in Python when I discovered that posts whose body contains a line matching '^From' cause the Python mbox library code to choke and cut messages at bogus points. It turns out this is a well-known issue with the very loose mbox format, of which there are dozens of species known to science. I solved the problem by mangling the files with a pre-processing script: I changed every body line matching '^From' to '^ From', and now hyperkitty and Python behave correctly. Not ideal, but better than dropping messages from the old archive altogether.
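A pre-processing pass of that kind could look roughly like the Python sketch below. It is an illustration only, not the actual script: the names `escape_body_from_lines` and `SEPARATOR` are hypothetical, and the pattern assumes that a genuine separator line looks like 'From sender Mon Jan  1 00:00:00 2001' and follows a blank line (or starts the file). Archives that deviate from that shape would need a different pattern. Many tools prefix such lines with '>' instead of a space; the space is used here only to match the description above.

```python
import re

# Assumed shape of a genuine mbox separator line:
#   "From <sender> <weekday> <month> <day> <hh:mm:ss> <year>"
# Adjust this pattern if the archive uses a different variant.
SEPARATOR = re.compile(
    rb"^From \S+ +(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    rb"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) +\d{1,2} "
    rb"\d\d:\d\d:\d\d \d{4}\s*$"
)


def escape_body_from_lines(lines):
    """Yield lines, prefixing a space to 'From ' lines that are not separators."""
    prev_blank = True  # the very first line may be a separator
    for line in lines:
        is_separator = prev_blank and SEPARATOR.match(line)
        if line.startswith(b"From ") and not is_separator:
            line = b" " + line  # '^From' becomes '^ From'
        prev_blank = line.strip() == b""
        yield line
```

To use it, open the old archive in binary mode, pass the file object to `escape_body_from_lines`, and write the yielded lines to a new file before importing. Working on a copy keeps the original archive intact.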
This issue could very well go in an FAQ for Mailman. Is there a way for me to add a topic? Many others will hit the same problem.
Andrew