On Tue, Apr 10, 2018, at 3:04 PM, Darren Smith wrote:
Hey everyone,
So I have a question about log rotation...
Our mailman.log has grown to over 5 gigs. It's time to add this to log rotation. We added what we thought might work to the /etc/logrotate.d/mailman script, but when we ran it with a -f, we learned that the -f flag doesn't keep old logs.
That in itself is fine - a normal scheduled log rotation won't have that problem.
My problem is that doing the rotation left mailman unable to write to the log anymore. It appears that it no longer has a handle to the file.
You can send a SIGHUP to mailman's master process and it will reload all the log files. The PID of the master process is stored in "var/data/master.pid".
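As a rough sketch of that step (not something Mailman ships), the following Python reads the PID file and sends the signal. The function names are made up for illustration, and the relative path assumes you run it from Mailman's working directory; adjust it to wherever your installation keeps "var/data/master.pid".

```python
import os
import signal
from pathlib import Path


def read_master_pid(pid_file):
    """Return the PID stored in the master PID file (hypothetical helper)."""
    return int(Path(pid_file).read_text().strip())


def reopen_mailman_logs(pid_file="var/data/master.pid"):
    """Send SIGHUP to the master process so it reopens its log files.

    The default path is an assumption: it is relative to Mailman's
    working directory and may differ on your system.
    """
    pid = read_master_pid(pid_file)
    os.kill(pid, signal.SIGHUP)
    return pid
```

Something like this could be called from a postrotate script in the logrotate stanza instead of a full service restart, provided it runs as a user allowed to signal the master process.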
To overcome this with, say, nginx, there is a command in logrotate that you can add that restarts the service.
I have found, however, that restarting the mailman service can be VERY problematic. There is so much going on that it takes a very long time to stop, and I have seen several instances where the service never shuts down cleanly - even after 3-4 minutes. The service thinks it is shut down, but there are several threads that haven't yet completed. If they are killed, then you have to go clean up the master.lck files etc. It seems like a bad way of doing things and a good way to somehow corrupt data.
Yes, this is a known issue. The reason is that the SIGTERM isn't handled until the process next wakes up from its sleep. We could have the process sleep for a shorter time to fix this, but then it would be waking up and running much more often.
The process that holds up the shutdown is the Retry runner, which has sleep_time set to 15m. I don't know if you want to retry as often as every 2 or 5 minutes, but you can change it by adding the following snippet to your mailman.cfg:
    [runner.retry]
    class: mailman.runners.retry.RetryRunner
    sleep_time: 5m
Anyway - after rambling...
- Has anyone come up with a good log rotation strategy that doesn't necessitate a restart of the service?
Manually sending SIGHUP to the master process is what I can think of off the top of my head. A better solution would be to implement a "mailman logrotate" (or similar) command to do it; please feel free to open an issue for it.
- Has anyone seen problems stopping and restarting the service? Is there a way to help with this problem?
Hope this is answered above.
-- Abhilash Raj maxking@asynchronous.in