Hey everyone,
So I have a question about log rotation...
Our mailman.log has grown to over 5 gigs. It's time to add this to log rotation. We added what we thought might work to the /etc/logrotate.d/mailman script, but when we ran it with a -f, we learned that the -f flag doesn't keep old logs.
That in itself is fine - a normal scheduled log rotation won't have that problem.
My problem is that doing the rotation caused the mailman not to be able to write to the log anymore. It appears that it no longer has a handle to the file.
To overcome this with, say, nginx, there is a command in logrotate that you can add that restarts the service.
I have found, however, that restarting the mailman service can be VERY problematic. There is so much going on that it takes a very long time to stop, and I have seen several instances where the service never shuts down cleanly - even after 3-4 minutes. The service thinks it is shut down, but there are several threads that haven't yet completed. If they are killed, then you have to go clean up the master.lck files etc. It seems like a bad way of doing things and a good way to somehow corrupt data.
Anyway - after rambling...
- Has anyone come up with a good log rotation strategy that doesn't necessitate a restart of the service?
- Has anyone seen problems stopping and restarting the service? Is there a way to help with this problem?
Thanks,
Darren
On Tue, Apr 10, 2018, at 3:04 PM, Darren Smith wrote:
Hey everyone,
So I have a question about log rotation...
Our mailman.log has grown to over 5 gigs. It's time to add this to log rotation. We added what we thought might work to the /etc/logrotate.d/mailman script, but when we ran it with a -f, we learned that the -f flag doesn't keep old logs.
That in itself is fine - a normal scheduled log rotation won't have that problem.
My problem is that doing the rotation caused the mailman not to be able to write to the log anymore. It appears that it no longer has a handle to the file.
You can send a SIGHUP to mailman's master process and it will reload all the log files. The PID of the master process is stored in "var/data/master.pid".
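As a rough illustration of the SIGHUP approach, here is a minimal Python sketch. It assumes the PID file contains nothing but the PID as text and that the script runs as a user allowed to signal the master process. The function names `read_pid` and `sighup_master` are hypothetical helpers, not part of Mailman.

```python
import os
import signal
from pathlib import Path


def read_pid(pid_file):
    """Return the integer PID stored in pid_file (assumed to hold only the PID)."""
    return int(Path(pid_file).read_text().strip())


def sighup_master(pid_file, kill=os.kill):
    """Send SIGHUP to the process named in pid_file and return its PID.

    `kill` is injectable so the helper can be exercised without
    signalling a real process.
    """
    pid = read_pid(pid_file)
    kill(pid, signal.SIGHUP)
    return pid


# Hypothetical usage, run from the Mailman installation directory:
#     sighup_master("var/data/master.pid")
```

Something like this could be called from a logrotate postrotate script, although a plain `kill -HUP` on the PID read from the file would do the same job.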
To overcome this with, say, nginx, there is a command in logrotate that you can add that restarts the service.
I have found, however, that restarting the mailman service can be VERY problematic. There is so much going on that it takes a very long time to stop, and I have seen several instances where the service never shuts down cleanly - even after 3-4 minutes. The service thinks it is shut down, but there are several threads that haven't yet completed. If they are killed, then you have to go clean up the master.lck files etc. It seems like a bad way of doing things and a good way to somehow corrupt data.
Yes, this is a known thing and the reason for it is that the SIGTERM isn't delivered till the process wakes up next time. We can have the process sleep for a smaller amount of time to fix this, but then it would mean that it will be running much more often.
It is the Retry runner, which has sleep_time set to 15m. I don't know if you want to retry very often, like 2min or 5min, but you can change it by adding the following snippet to your mailman.cfg
[runner.retry]
class: mailman.runners.retry.RetryRunner
sleep_time: 5m
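The shutdown delay described above can be pictured with a toy loop. This is only a sketch of the general pattern (a stop flag that is checked between sleeps), not Mailman's actual runner code; `SketchRunner` and its methods are made-up names, and the sleep function is injectable purely so the behaviour can be shown without waiting.

```python
import time


class SketchRunner:
    """Toy runner loop; not Mailman's actual runner implementation."""

    def __init__(self, sleep_time, sleep=time.sleep):
        self.sleep_time = sleep_time   # seconds between passes over the queue
        self._sleep = sleep            # injectable for demonstration
        self._stop = False
        self.passes = 0

    def request_stop(self):
        # Stand-in for what a SIGTERM handler would do: only set a flag.
        self._stop = True

    def run(self):
        while not self._stop:
            self.passes += 1               # one pass over the queue
            self._sleep(self.sleep_time)   # the flag is not checked in here
```

If `request_stop()` is called while the loop is inside the sleep, the loop still finishes the full `sleep_time` before it sees the flag. With a 15 minute (900 second) sleep, that is the worst-case wait for this runner to exit; a shorter `sleep_time` shrinks the wait at the cost of more frequent passes.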
Anyway - after rambling...
- Has anyone come up with a good log rotation strategy that doesn't necessitate a restart of the service?
Manually sending SIGHUP to the master runner is what I can think of off the top of my head. A better solution would be to implement a "mailman logrotate" (or similar) command to do it; please feel free to open an issue for it.
- Has anyone seen problems stopping and restarting the service? Is there a way to help with this problem?
Hope this is answered above.
-- Abhilash Raj maxking@asynchronous.in
On 04/10/2018 03:39 PM, Abhilash Raj wrote:
Yes, this is a known thing and the reason for it is that the SIGTERM isn't delivered till the process wakes up next time. We can have the process sleep for a smaller amount of time to fix this, but then it would mean that it will be running much more often.
It is the Retry runner, which has sleep_time set to 15m. I don't know if you want to retry very often, like 2min or 5min, but you can change it by adding the following snippet to your mailman.cfg
A long time ago (in a galaxy far away) Mailman 2.1 had a global DELIVERY_RETRY_WAIT setting. This setting had been ineffective for years, relying instead on the retry runner's sleep time.
In 2.1.26, for what turned out to be a probably spurious reason, I made DELIVERY_RETRY_WAIT effective again, although the retry runner still sleeps for 15 minutes.
We could implement something similar in MM 3 to allow the retry runner to have a much shorter sleep time without actually retrying the delivery that often. The down side of that is even though we aren't retrying delivery each time the runner wakes up, we are dequeueing and requeueing the message each time and this would have a file system impact.
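To make the trade-off concrete, here is a hedged Python sketch of the idea: the runner wakes often, but only attempts delivery for messages whose wait has elapsed. It is not the Mailman 2.1 DELIVERY_RETRY_WAIT code or any Mailman 3 API; `QueuedMessage`, `process_retry_queue` and the `deliver` callback are hypothetical names used for illustration.

```python
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    msg_id: str
    last_attempt: float  # time of the previous delivery attempt, in seconds


def process_retry_queue(queue, now, retry_wait, deliver):
    """One wake-up of a hypothetical retry runner.

    Every message is dequeued, but delivery is attempted only if
    retry_wait seconds have passed since its last attempt.
    Returns (requeued_messages, delivery_attempts).
    """
    requeued = []
    attempts = 0
    for msg in queue:
        if now - msg.last_attempt >= retry_wait:
            attempts += 1
            if deliver(msg):
                continue              # delivered, so it leaves the queue
            msg.last_attempt = now    # failed again, restart its wait
        requeued.append(msg)          # written back to the queue either way
    return requeued, attempts
```

With a short sleep time, most wake-ups would return zero delivery attempts yet still requeue every waiting message, which is the file system cost mentioned above.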
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Well, that is certainly a wealth of information from you two, thanks a ton! I've got a few things to try now.
I'm going to have to look up the retry runner to see if that might help when we have to shut things down.
I'll let you know what I find...
On Tue, Apr 10, 2018 at 5:41 PM, Mark Sapiro <mark@msapiro.net> wrote:
On 04/10/2018 03:39 PM, Abhilash Raj wrote:
Yes, this is a known thing and the reason for it is that the SIGTERM isn't delivered till the process wakes up next time. We can have the process sleep for a smaller amount of time to fix this, but then it would mean that it will be running much more often.
It is the Retry runner, which has sleep_time set to 15m. I don't know if you want to retry very often, like 2min or 5min, but you can change it by adding the following snippet to your mailman.cfg
A long time ago (in a galaxy far away) Mailman 2.1 had a global DELIVERY_RETRY_WAIT setting. This setting had been ineffective for years, relying instead on the retry runner's sleep time.
In 2.1.26, for what turned out to be a probably spurious reason, I made DELIVERY_RETRY_WAIT effective again, although the retry runner still sleeps for 15 minutes.
We could implement something similar in MM 3 to allow the retry runner to have a much shorter sleep time without actually retrying the delivery that often. The down side of that is even though we aren't retrying delivery each time the runner wakes up, we are dequeueing and requeueing the message each time and this would have a file system impact.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
Mailman-users mailing list mailman-users@mailman3.org https://lists.mailman3.org/mailman3/lists/mailman-users.mailman3.org/
On 04/10/2018 03:04 PM, Darren Smith wrote:
- Has anyone come up with a good log rotation strategy that doesn't necessitate a restart of the service?
I think Abhilash has answered this, but for completeness, here's what I use on lists.mailman3.org which uses nginx and gunicorn
# Mailman's logs
/opt/mailman/mailman-bundler/var/logs/*.log {
    missingok
    sharedscripts
    su mailman mailman
    postrotate
        /opt/mailman/mailman-bundler/bin/mailman reopen &>/dev/null || true
        if [ -r /opt/mailman/mailman-bundler/var/gunicorn.pid ]; then
            PID=`cat /opt/mailman/mailman-bundler/var/gunicorn.pid`
            kill -s USR1 $PID
        fi
    endscript
}
# mailman-web log
/var/log/mailman-web/*.log {
    missingok
    notifempty
    delaycompress
    su mailman mailman
}
And here's mail.python.org with apache and mod_wsgi
# Mailman's logs
/opt/mailman/mailman-bundler/var/logs/*.log {
    missingok
    sharedscripts
    su mailman mailman
    postrotate
        /opt/mailman/mailman-bundler/bin/mailman reopen &>/dev/null || true
    endscript
}
# mailman-web log
/var/log/mailman-web/*.log {
    missingok
    notifempty
    delaycompress
    su mailman mailman
}
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On Tue, Apr 10, 2018, at 4:12 PM, Mark Sapiro wrote:
On 04/10/2018 03:04 PM, Darren Smith wrote:
- Has anyone come up with a good log rotation strategy that doesn't necessitate a restart of the service?
I think Abhilash has answered this, but for completeness, here's what I use on lists.mailman3.org which uses nginx and gunicorn
# Mailman's logs
/opt/mailman/mailman-bundler/var/logs/*.log {
    missingok
    sharedscripts
    su mailman mailman
    postrotate
        /opt/mailman/mailman-bundler/bin/mailman reopen &>/dev/null || true
        if [ -r /opt/mailman/mailman-bundler/var/gunicorn.pid ]; then
            PID=`cat /opt/mailman/mailman-bundler/var/gunicorn.pid`
            kill -s USR1 $PID
        fi
    endscript
}
# mailman-web log
/var/log/mailman-web/*.log {
    missingok
    notifempty
    delaycompress
    su mailman mailman
}
And here's mail.python.org with apache and mod_wsgi
# Mailman's logs
/opt/mailman/mailman-bundler/var/logs/*.log {
    missingok
    sharedscripts
    su mailman mailman
    postrotate
        /opt/mailman/mailman-bundler/bin/mailman reopen &>/dev/null || true
Oh, I didn't know about the "mailman reopen" command, thanks Mark!
-- Abhilash Raj maxking@asynchronous.in
participants (3)
- Abhilash Raj
- Darren Smith
- Mark Sapiro