Exception in the HyperKitty archiver: 'utf-8' codec can't encode characters
Hi,
I'm seeing the above error in mailman-core logs each time an email is processed. Is there something I missed in the configuration? I'm using dockerized mailman 0.3 from maxking.
Thanks in advance,
_g.
On 6/15/20 9:43 AM, Gilles Filippini wrote:
Hi,
I'm seeing the above error in mailman-core logs each time an email is processed. Is there something I missed in the configuration? I'm using dockerized mailman 0.3 from maxking.
I'm guessing you have one message in var/archives/hyperkitty/spool/ that is causing this issue. The message itself will probably have invalid utf-8 encodings in a message part declared as charset=utf-8.
Every time a new message is archived, HyperKitty reprocesses this message and hits the error again.
You can examine the message with bin/mailman qfile. It is a Python pickle of a message object.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
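For readers hitting the same log line, here is a minimal sketch of how this kind of error can arise. It is an illustration only and not HyperKitty's code: the sample bytes and the `try_encode` helper are made up for the example. It assumes the invalid bytes were decoded with the `surrogateescape` error handler, which turns each undecodable byte into a lone surrogate that the strict UTF-8 encoder then refuses.

```python
# Illustration only; the sample bytes and helper name are hypothetical.
raw = b"caf\xe9 au lait"  # 0xE9 is Latin-1 for "e acute" and is not valid UTF-8 here

# surrogateescape maps each undecodable byte to a lone surrogate (U+DC80..U+DCFF).
text = raw.decode("utf-8", errors="surrogateescape")


def try_encode(s):
    """Return the UTF-8 bytes, or the error message if strict encoding fails."""
    try:
        return s.encode("utf-8")
    except UnicodeEncodeError as exc:
        return str(exc)


result = try_encode(text)
print(result)
```

The printed message has the same shape as the one in the subject line: the codec name, "can't encode", and the reason "surrogates not allowed".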
Mark Sapiro wrote on 15/06/2020 at 20:21:
I'm guessing you have one message in var/archives/hyperkitty/spool/ that is causing this issue. The message itself will probably have invalid utf-8 encodings in a message part declared as charset=utf-8.
Every time a new message is archived, HyperKitty reprocesses this message and hits the error again.
You can examine the message with bin/mailman qfile. It is a Python pickle of a message object.
There are 4 emails stored in this folder. 3 of them are spam, and the last one is a legitimate email. It is annoying that such emails can accumulate and stay in var/archives/hyperkitty/spool/ forever, causing more and more HyperKitty errors.
Any idea how to make their processing more error proof?
Thanks,
_g.
On 6/15/20 12:37 PM, Gilles Filippini wrote:
There are 4 emails stored in this folder. 3 of them are spam, and the last one is a legitimate email. It is annoying that such emails can accumulate and stay in var/archives/hyperkitty/spool/ forever, causing more and more HyperKitty errors.
Any idea how to make their processing more error proof?
Change the offending .encode() method to add errors='replace'.
The traceback from the error should tell you where it is.
You could also open an issue at <https://gitlab.com/mailman/hyperkitty/-/issues> for us to look at. If you do that, please include an error traceback if possible. If there is not a traceback in mailman.log associated with the error, there might be one in HyperKitty's log (configured in Django's settings). Also, including enough of a message to duplicate the issue is very helpful.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
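The errors='replace' suggestion above can be sketched as follows. This is a generic Python illustration, not the actual HyperKitty call site (the traceback is what identifies that); the sample string is made up. With `errors='replace'`, the encoder substitutes `?` for any character it cannot encode instead of raising.

```python
# Illustration only; the sample string is hypothetical.
text = "caf\udce9 au lait"  # contains a lone surrogate, as left by surrogateescape

# Strict encoding would raise UnicodeEncodeError here.
# With errors="replace", each unencodable character becomes "?".
safe = text.encode("utf-8", errors="replace")
print(safe)
```

The trade-off is that the offending character is lost in the archived copy, but the message no longer blocks processing.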
Mark Sapiro wrote on 15/06/2020 at 21:58:
Change the offending .encode() method to add errors='replace'.
The traceback from the error should tell you where it is.
You could also open an issue at <https://gitlab.com/mailman/hyperkitty/-/issues> for us to look at. If you do that, please include an error traceback if possible. If there is not a traceback in mailman.log associated with the error, there might be one in HyperKitty's log (configured in Django's settings). Also, including enough of a message to duplicate the issue is very helpful.
I've now deleted the offending messages. Will open an issue with related details the next time this error occurs.
Thanks a lot,
_g.
Gilles Filippini wrote on 15/06/2020 at 22:05:
I've now deleted the offending messages. Will open an issue with related details the next time this error occurs.
I've experienced this error again today. Filed issue #301.
Thanks,
_g.
On 6/16/20 12:44 PM, Gilles Filippini wrote:
I've experienced this error again today. Filed issue #301.
Thank you, but see my comment at <https://gitlab.com/mailman/hyperkitty/-/issues/301#note_362369905>
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
It turns out this error is caused by a non-ASCII character in a prologue section of the email. When the message gets parsed into a mailman.email.message.Message object and then flattened to a string, the non-ASCII in the prologue gets converted to Unicode surrogates, which causes problems in subsequent processing.

The bug report has moved twice in the process of diagnosing and fixing this, from https://gitlab.com/mailman/hyperkitty/-/issues/301 to https://gitlab.com/mailman/mailman-hyperkitty/-/issues/25 to https://gitlab.com/mailman/mailman/-/issues/732

The fix is in core and is quite simple.

```
--- a/src/mailman/email/message.py
+++ b/src/mailman/email/message.py
@@ -55,7 +55,8 @@ class Message(email.message.Message):
         except (KeyError, LookupError, UnicodeEncodeError):
             value = email.message.Message.as_bytes(self).decode(
                 'ascii', 'replace')
-        return value
+        # Also ensure no unicode surrogates in the returned string.
+        return email.utils._sanitize(value)
 
     @property
     def sender(self):
```

See https://gitlab.com/mailman/mailman/-/merge_requests/665

--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
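The diagnosis above can be reproduced with the standard library alone. The following is a rough sketch under stated assumptions: it uses the stock `email` package rather than mailman.email.message.Message, the sample message is made up, and `strip_surrogates` is a hypothetical helper written with public codec calls. It is meant to approximate the effect of the sanitizing step in the fix, not to reproduce the private `email.utils._sanitize` function exactly.

```python
import email

# Hypothetical multipart message with a raw non-ASCII byte (0xE9) in the
# preamble, i.e. the text before the first MIME boundary.
raw = (
    b"MIME-Version: 1.0\n"
    b'Content-Type: multipart/mixed; boundary="XX"\n'
    b"\n"
    b"Pr\xe9ambule\n"
    b"--XX\n"
    b"Content-Type: text/plain; charset=us-ascii\n"
    b"\n"
    b"body\n"
    b"--XX--\n"
)

msg = email.message_from_bytes(raw)
flat = msg.as_string()  # the preamble byte comes back as a lone surrogate


def strip_surrogates(text):
    """Replace surrogate-escaped bytes with U+FFFD (hypothetical helper)."""
    # Round-trip: surrogates back to their original bytes, then decode
    # leniently so anything invalid becomes the replacement character.
    return text.encode("utf-8", "surrogateescape").decode("utf-8", "replace")


cleaned = strip_surrogates(flat)
encoded = cleaned.encode("utf-8")  # strict encoding now succeeds
```

Note that this helper only handles surrogates produced by `surrogateescape` (U+DC80 through U+DCFF); other lone surrogates would still raise in the first step.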
Participants (2): Gilles Filippini, Mark Sapiro