messages with attachments, without text discarded silently
Using the current mailman version, I can reproduce the following:
- send a text-formatted email with attachment but without text body to a list.
- the msg will be received by mailman
- the msg shows up in shunt, the log shows:
    File "/usr/lib/python3.12/email/message.py", line 343, in set_payload
      payload = payload.encode(charset.output_charset, 'surrogateescape')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-17: ordinal not in range(128)
An attempt to unpickle the file leads to the same error. I know it's uncommon to send emails with attachments but without a text body, but it may happen; at the very least, we would welcome an error message.
On 1/27/25 12:35, Schulz via Mailman-users wrote:
Using the current mailman version, I can reproduce the following:
- send a text-formatted email with attachment but without text body to a list.
- the msg will be received by mailman
- the msg shows up in shunt, the log shows:
    File "/usr/lib/python3.12/email/message.py", line 343, in set_payload
      payload = payload.encode(charset.output_charset, 'surrogateescape')
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-17: ordinal not in range(128)
Please provide the full traceback and the MIME structure of the message.
An attempt to unpickle the file leads to the same error. I know it's uncommon to send emails with attachments but without a text body, but it may happen; at the very least, we would welcome an error message.
Without more of the traceback I can't be sure which handler is doing this and without the MIME structure I can't be sure what's going on.
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
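For readers who need to supply the MIME structure requested above, here is a minimal sketch using only Python's standard `email` package. The helper name `mime_structure` and the sample message are made up for illustration; they are not part of Mailman and are not the reporter's actual message.

```python
from email import message_from_string


def mime_structure(msg, level=0):
    """Return one indented line per MIME part: content type and transfer encoding."""
    cte = msg.get('Content-Transfer-Encoding', '7bit')
    lines = ['    ' * level + f'{msg.get_content_type()} ({cte})']
    if msg.is_multipart():
        # For multipart messages, get_payload() returns the list of sub-parts.
        for part in msg.get_payload():
            lines.extend(mime_structure(part, level + 1))
    return lines


# Hypothetical sample: an empty text part followed by a base64 attachment.
SAMPLE = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUNDARY"
Subject: sample

--BOUNDARY
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable


--BOUNDARY
Content-Type: application/pdf; name="sample.pdf"
Content-Transfer-Encoding: base64

JVBERi0xLjcK

--BOUNDARY--
"""

structure = mime_structure(message_from_string(SAMPLE))
print('\n'.join(structure))
```

On a real report, the same helper could be fed the saved .eml file via `email.message_from_binary_file`.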
Thank you for providing the information below.
I received both posts by direct mail, but the posts were discarded by
the list because the attached .pck file was Content-Type:
application/octet-stream and the list has a header filter on
content-type with the pattern ^ *application/(?!(pgp|[^ ;]*signature))
and a discard action.
In other words, the list will not accept any posts with MIME parts with Content-Type: application/* unless the subtype appears to be a signature. This is an anti-spam measure. Even if the message were accepted, that part would be removed by content filtering.
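To see which Content-Type values the quoted pattern catches, here is a small sketch with Python's `re` module. Matching with `re.match` and `re.IGNORECASE` is an assumption about how the filter is applied, and the sample header values are invented for illustration.

```python
import re

# The header-filter pattern quoted above. The negative lookahead lets through
# subtypes that start with "pgp" or that contain "signature" before the first
# space or semicolon.
PATTERN = re.compile(r'^ *application/(?!(pgp|[^ ;]*signature))', re.IGNORECASE)


def would_discard(content_type):
    """Return True if this Content-Type value matches the discard pattern."""
    return PATTERN.match(content_type) is not None


# Hypothetical header values to try against the pattern.
samples = [
    'application/octet-stream',
    'application/pdf; name="someExample.pdf"',
    'application/pgp-signature',
    'application/pkcs7-signature; name="smime.p7s"',
    'text/plain; charset="utf-8"',
]
results = {value: would_discard(value) for value in samples}
for value, discarded in results.items():
    print(f'{"discard" if discarded else "accept ":8} {value}')
```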
On 1/31/25 12:14, Schulz wrote:
Hello, here is the traceback:
Jan 28 20:31:16 2025 (25) it: DMARC lookup for js@jslz.de (_dmarc.jslz.de) found p=reject in _dmarc.jslz.de. = v=DMARC1; p=reject; pct=100; rua=mailto:dmarc@faudin.de
Jan 28 20:31:16 2025 (25) ACCEPT: <17def72e63ed443458a88ec892a17fe01b8a287c@jslz.de>
Jan 28 20:31:16 2025 (28) Uncaught runner exception: 'ascii' codec can't encode characters in position 10-17: ordinal not in range(128)
Jan 28 20:31:16 2025 (28) Traceback (most recent call last):
  File "/usr/lib/python3.12/site-packages/mailman/core/runner.py", line 179, in _one_iteration
    self._process_one_file(msg, msgdata)
  File "/usr/lib/python3.12/site-packages/mailman/core/runner.py", line 272, in _process_one_file
    keepqueued = self._dispose(mlist, msg, msgdata)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/site-packages/mailman/runners/pipeline.py", line 37, in _dispose
    process(mlist, msg, msgdata, pipeline)
  File "/usr/lib/python3.12/site-packages/mailman/core/pipelines.py", line 53, in process
    handler.process(mlist, msg, msgdata)
  File "/usr/lib/python3.12/site-packages/mailman/handlers/mime_delete.py", line 408, in process
    process(mlist, msg, msgdata)
  File "/usr/lib/python3.12/site-packages/mailman/handlers/mime_delete.py", line 180, in process
    reset_payload(msg, useful)
  File "/usr/lib/python3.12/site-packages/mailman/handlers/mime_delete.py", line 228, in reset_payload
    msg.set_payload(subpart.get_payload(decode=True).decode(
  File "/usr/lib/python3.12/email/message.py", line 343, in set_payload
    payload = payload.encode(charset.output_charset, 'surrogateescape')
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'ascii' codec can't encode characters in position 10-17: ordinal not in range(128)
Jan 28 20:31:16 2025 (28) SHUNTING: 1738092676.7424345+fcb0d63d7b50bc797c3dd0a44de64bbd6b27eb92
the .pck and the original email can't be attached I think (this msg didn't make it to the list)
these are the first lines of the .eml:
MIME-Version: 1.0
Date: Tue, 28 Jan 2025 19:31:15 +0000
Content-Type: multipart/mixed; boundary="6efe513a-b2f8-47e5-ace0-b3b6aa5a056c-1"
From: "J. Schulz" <js@jslz.de>
Message-ID: <17def72e63ed443458a88ec892a17fe01b8a287c@jslz.de>
TLS-Required: No
Subject: empty, but another att
To: it@freie-dorfschule.de

--6efe513a-b2f8-47e5-ace0-b3b6aa5a056c-1
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

--6efe513a-b2f8-47e5-ace0-b3b6aa5a056c-1
Content-Type: application/pdf; name="someExample.pdf"
Content-Disposition: attachment; filename="someExample.pdf"
Content-Transfer-Encoding: base64

JVBERi0xLjcKJcOkw7zDtsOfCjIgMCBvYmoKPDwvTGVuZ3RoIDMgMCBSL0ZpbHRlci9GbGF0 ZURlY29kZT4+CnN0cmVhbQp4nC2NTQvCMBiD7++vCHjz0L7t1q2DMXAfgoeBg4Ln4TpRULEU 9OdbRB4SCISEhcKb5C7E6zqfI9qxoxcYLFhbmMoIXRrYXAlbKARPpy0eNKSWPAa/PsN9jtEv G83Of2Jdy7E79OCmafv/UCJcqHVkCmFRlsmzCm6B3CsohltrVqw545wNF0llSjZRNe5Gg6Pp 9zfhC14SKM4KZW5kc3RyZWFtCmVuZG9iagoKMyAwIG9iagoxNTMKZW5kb2JqCgo3IDAgb2Jq Cjw8L0xlbmd0aCA4IDAgUi9GaWx0ZXIvRmxhdGVEZWNvZGUvTGVuZ3RoMSA5MDc2Pj4Kc3Ry ZWFtCnic5Vh9dFPHlb/zniTLH1iSP4RtGd4zwh9gW7ItTIAa/LAtYbDBso1BAgJ6lmQkYkta SbZj0haTJoGYzyaEtEnOhuajTQgJz5AcTEqBLbs9yTZp04/ddje00G6a3Z4mkO02Pd3TYO+d 0bP5aJKes2f/2yfPmzv3zr33N/feGWmcjA8GIQtGgQfJPyDH5s3K1APAWwAkxz+UFPsiGgfS VwG4V/pi2weeOLPlDwCawwBpr27vH+n7k+ZkC0AWNm0sFJQD8rtQCWB6FW0sDiHjyRvHdTj+ AMfzQwPJe/+Fe00DkGPEsbk/6pdbtKMoz6mg4wH53lhCU8/jWMKxGJEHgu+9/
--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
On 1/31/25 17:38, Mark Sapiro wrote:
I see the issue. Mailman's content filtering sees this multipart message as having two parts, the first of which is empty, so it tries to recast the message as a single part message containing only the second part.

The problem is this code only works if the second part is a text/* part. Here it decodes the base64 encoded pdf and tries to recast it as unicode text which works for text/* parts with a declared charset, but not for non-text parts.

This patch is untested, but I think it will fix the issue.

```
--- a/src/mailman/handlers/mime_delete.py
+++ b/src/mailman/handlers/mime_delete.py
@@ -221,7 +221,7 @@ Replaced multipart/alternative part with first alternative.
 def reset_payload(msg, subpart):
     # Reset payload of msg to contents of subpart, and fix up content headers
-    if subpart.is_multipart():
+    if subpart.is_multipart() or subpart.get_content_maintype() != 'text':
         msg.set_payload(subpart.get_payload())
     else:
         cset = subpart.get_content_charset() or 'us-ascii'
```

--
Mark Sapiro <mark@msapiro.net>        The highway is for gamblers,
San Francisco Bay Area, California    better use your sense - B. Dylan
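The failure and the effect of the proposed condition can be reproduced outside Mailman with the standard `email` package. This is only a sketch under stated assumptions. The functions `reset_payload_old` and `reset_payload_patched` are simplified stand-ins for the branch logic shown in the diff and omit the content-header fix-up. The `errors='replace'` argument is a guess at the part of the `decode(` call that the traceback truncates. `PDF_HEAD` holds just the first bytes of the attached PDF, taken from the start of the base64 data in the report.

```python
import base64
from email.message import Message

# "%PDF-1.7\n%" followed by the UTF-8 bytes of "äüöß" and a newline.
PDF_HEAD = b'%PDF-1.7\n%\xc3\xa4\xc3\xbc\xc3\xb6\xc3\x9f\n'


def make_pdf_part():
    """Build a base64 application/pdf part with no charset parameter."""
    part = Message()
    part['Content-Type'] = 'application/pdf; name="someExample.pdf"'
    part['Content-Transfer-Encoding'] = 'base64'
    part.set_payload(base64.b64encode(PDF_HEAD).decode('ascii'))
    return part


def reset_payload_old(msg, subpart):
    """Stand-in for the pre-patch logic: every non-multipart part is treated as text."""
    if subpart.is_multipart():
        msg.set_payload(subpart.get_payload())
    else:
        cset = subpart.get_content_charset() or 'us-ascii'
        # Assumption: a lenient error handler, so the decode itself succeeds
        # and the failure only surfaces when set_payload re-encodes the text.
        text = subpart.get_payload(decode=True).decode(cset, errors='replace')
        msg.set_payload(text, charset=cset)


def reset_payload_patched(msg, subpart):
    """Stand-in for the patched logic: non-text parts keep their encoded payload."""
    if subpart.is_multipart() or subpart.get_content_maintype() != 'text':
        msg.set_payload(subpart.get_payload())
    else:
        cset = subpart.get_content_charset() or 'us-ascii'
        text = subpart.get_payload(decode=True).decode(cset, errors='replace')
        msg.set_payload(text, charset=cset)


# Old logic: the binary bytes become replacement characters that ASCII cannot encode.
old_error = None
try:
    reset_payload_old(Message(), make_pdf_part())
except UnicodeEncodeError as exc:
    old_error = exc
print('old logic raised:', old_error)

# Patched logic: the base64 text is carried over untouched.
patched = Message()
reset_payload_patched(patched, make_pdf_part())
print('patched payload:', patched.get_payload())
```

With the old logic the eight non-ASCII bytes after `%PDF-1.7\n%` are what the encoder trips over, which lines up with the "position 10-17" in the reported error.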