hyperkitty_import performance boosting potential
Our organization is moving from MM2 to MM3, and I'm now testing the import of our archives. Many of the lists we want to transfer have fairly large mbox files, so hyperkitty_import takes a significant amount of time to process them.
I can throw all of the resources I need at it in order to make it faster, but I'm not sure if the import application is single-threaded only. Is there a way to enable a multi-threaded option? Can I safely run multiple commands at the same time using something like /usr/bin/parallel?
Thanks
On 7/8/24 11:59 AM, dlb-ml--- via Mailman-users wrote:
> I can throw all of the resources I need at it in order to make it faster, but I'm not sure if the import application is single-threaded only. Is there a way to enable a multi-threaded option? Can I safely run multiple commands at the same time using something like /usr/bin/parallel?
If you're concerned about downtime during the switchover, you could run the hyperkitty_import processes while the MM 2.1 lists are still up, and then import only the latest messages while the MM 2.1 lists are down.
That said, to get parallelism, you could split the mbox into pieces and run multiple hyperkitty_import processes concurrently. If you do this, be sure to specify --since on each command so that all of the older messages are imported.
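A minimal sketch of the splitting step, using only Python's standard mailbox module. The function name split_mbox, the output file naming, and the choice of contiguous chunks are illustrative assumptions, not something hyperkitty_import requires.

```python
import mailbox
from pathlib import Path


def split_mbox(src, out_dir, parts):
    """Split the mbox at `src` into at most `parts` contiguous chunks.

    Hypothetical helper: names and layout are illustrative only.
    Returns the list of part files that were written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    source = mailbox.mbox(str(src), create=False)
    try:
        keys = source.keys()
        total = len(keys)
        if total == 0:
            return []
        chunk = -(-total // parts)  # ceiling division: messages per part
        paths = []
        for start in range(0, total, chunk):
            path = out_dir / f"{Path(src).stem}.part{len(paths):03d}.mbox"
            # Note: if the part file already exists, messages are appended.
            dest = mailbox.mbox(str(path), create=True)
            try:
                for key in keys[start:start + chunk]:
                    dest.add(source.get_message(key))
                dest.flush()
            finally:
                dest.close()
            paths.append(path)
        return paths
    finally:
        source.close()
```

For example, split_mbox("list.mbox", "parts", 8) would write up to eight files named list.part000.mbox, list.part001.mbox, and so on under parts/.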
--
Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
Thank you, that worked! I was able to create a script that splits the mbox files into parts based on the number of cores on the system, then imported them accordingly using the parallel binary.
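The script itself was not posted. As a rough sketch of the import step only, the following runs one hyperkitty_import process per mbox part, with one worker per core by default. The function names, the django-admin base command (some installs use mailman-web or manage.py instead), and the 1990-01-01 --since date are assumptions to adjust for your setup.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_import_command(list_address, mbox_path, since="1990-01-01",
                         base=("django-admin", "hyperkitty_import")):
    """Build one import command line.

    `base` and the default `since` date are assumptions; pick a date
    older than the oldest message in the archive being imported.
    """
    return [*base, "-l", list_address, "--since", since, str(mbox_path)]


def import_parts(list_address, parts, workers=None, **kwargs):
    """Run one import process per part, `workers` at a time.

    Returns a dict mapping each part path to its process exit code.
    """
    workers = workers or os.cpu_count() or 1
    commands = [build_import_command(list_address, p, **kwargs) for p in parts]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = list(pool.map(lambda cmd: subprocess.run(cmd).returncode,
                              commands))
    return dict(zip((str(p) for p in parts), codes))
```

A non-zero exit code for any part means that part should be checked and re-imported before the list is considered migrated.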
Participants (2): dlb-ml@anl.gov, Mark Sapiro