Missing Base64 encoding from control message reply
Dear all, I have a setup with GNU Mailman 3.3.8 where a user tried to subscribe. When replying to the "confirm" message, the user received a completely garbled response, like so:
N�z˥��ʋ�zf���&���j�����u�[zZ0:���v���,j�z֢��k�e�0 4Dҹ��r�^b��r�ߊ���*'���yם���{ax�L�X����0:���v���,j�z֢��k�d�m�-���zz�rn��l]y+���-E�( �^N��'��M���tۯ��Mz�����^i����}�o�\ӷ���6sN�zf���ݮ��r��z˥��������b�{h��݉��f�r�ne==
(The line above contains numerous unprintable characters, etc.) The user then forwarded the .eml file to me, which clearly shows the contents of the mail as plain text while being announced as base64 (Content-Transfer-Encoding: base64):
START OF EML
Return-Path: <LISTNAME-bounces@lists.DOMAIN.COM>
Received: from DOMAIN.COM ([144.76.173.241]) by mx-ha.USERDOMAIN.NET (mx06 [IP]) with ESMTPS (Nemesis) id 1M27WV-1qhRFk39Sr-00AQcR for <USER@USERDOMAIN.NET>; Thu, 21 Sep 2023 19:10:37 +0200
Received: from localhost (localhost.localdomain [127.0.0.1]) by DOMAIN.COM (Postfix) with ESMTP id 4058B3BD5EEF for <USER@USERDOMAIN.NET>; Thu, 21 Sep 2023 19:10:37 +0200 (CEST)
Received: from DOMAIN.COM ([127.0.0.1]) by localhost (DOMAIN.COM [127.0.0.1]) (amavis, port 10024) with ESMTP id x0ngnlRnGraH for <USER@USERDOMAIN.NET>; Thu, 21 Sep 2023 19:10:37 +0200 (CEST)
Received: from [IP] (unknown [IP]) by DOMAIN.COM (Postfix) with ESMTP id DC8603BD5EEE for <USER@USERDOMAIN.NET>; Thu, 21 Sep 2023 19:10:36 +0200 (CEST)
Subject: =?utf-8?q?The_results_of_your_email_commands?=
From: LISTNAME-bounces@lists.DOMAIN.COM
To: USER@USERDOMAIN.NET
Content-Transfer-Encoding: base64
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Message-ID: <169531623416.29.5762261018677554285@mailman-core>
Date: Thu, 21 Sep 2023 17:10:34 +0000
Precedence: bulk
X-Mailman-Version: 3.3.8
List-Id: <LISTNAME.lists.DOMAIN.COM>
List-Help: <mailto:LISTNAME-request@lists.DOMAIN.COM?subject=help>
List-Owner: <mailto:LISTNAME-owner@lists.DOMAIN.COM>
List-Subscribe: <mailto:LISTNAME-join@lists.DOMAIN.COM>
List-Unsubscribe: <mailto:LISTNAME-leave@lists.DOMAIN.COM>
Envelope-To: <USER@USERDOMAIN.NET>
X-Spam-Flag: NO
The results of your email command are provided below.
Original message details:
From: USER <USER@USERDOMAIN.NET>
Subject: Re: Your confirmation is needed to join the LISTNAME@lists.DOMAIN.COM mailing list.
Date: Thu, 21 Sep 2023 19:10:26 +0200
Message-ID: <3c1eaaf8-1677-432f-b9dc-07f18902c06c@email.android.com>
Results: Confirmation token did not match
Done.
END OF EML
Is this possibly a bug in Mailman? Any ideas on what else could be causing this?
lists-mailman3@janaberger.de writes:
I have a setup with GNU Mailman 3.3.8 where a user tried to subscribe. When replying to the "confirm" message, the user received a completely garbled response, like so:
N?z?????zf???&???j?????u?[zZ0:???v???,j?z???k?e?0 4D???r?^b??r?????*'???y????{ax?L?X????0:???v???,j?z???k?d?m?-???zz?rn??l]y+???-E?( ?^N??'??M???t???Mz?????^i?????}?o?\????6sN?zf??????r??z?????????b?{h?????f?r?ne==
To diagnose, we really need to see that in context of the raw message as received by the user.
The user then forwarded the .eml file to me, which clearly shows the contents of the mail as plain text while being announced as base64 (Content-Transfer-Encoding: base64):
It also clearly doesn't show anything like the mojibake above. What is the relation between the two?
While the problems could be related to Mailman, it could also be any MTA along the way or the user's mail client.
I suspect that the .eml file has been significantly altered by intervening software. As far as I know, the Python email package does not MIME-encode plain ASCII text as you see in the email's subject.
The only other thing I can think of is it might have something to do with your internationalization settings. What is the default language of the list? (This is available in Postorius.) Have you changed the default encoding for English on your site? (I think you need to look in mailman.cfg for this.) What is the default language of the site?
Steve
On 9/22/23 09:32, Stephen J. Turnbull wrote:
To diagnose, we really need to see that in context of the raw message as received by the user.
It was posted. The issue is the message declares
Content-Transfer-Encoding: base64
but the body is not base64 encoded so the MUA tries to base64 decode it and even though that's invalid, produces the garbled output.
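The effect described here can be reproduced with a minimal sketch. It assumes a forgiving decoder that ignores every character outside the base64 alphabet and drops an incomplete trailing group; real MUAs may handle invalid input differently, and `lenient_b64decode` is a hypothetical helper, not part of any mail client or of Mailman.

```python
import base64
import re


def lenient_b64decode(text: str) -> bytes:
    """Decode plain text as if it were base64, the way a forgiving client might.

    Assumption: characters outside the base64 alphabet are skipped and an
    incomplete trailing group of symbols is dropped.
    """
    symbols = re.sub(r"[^A-Za-z0-9+/]", "", text)
    symbols = symbols[: len(symbols) - len(symbols) % 4]
    return base64.b64decode(symbols)


# First line of the control message body, which was never base64 encoded.
body = "The results of your email command are provided below.\n"
garbled = lenient_b64decode(body)

# Interpreting the decoded bytes as UTF-8 yields mostly replacement characters.
print(garbled.decode("utf-8", errors="replace"))
```

The decoded bytes begin with `N`, as the garbled output quoted in the original report does.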
The only other thing I can think of is it might have something to do with your internationalization settings. What is the default language of the list? (This is available in Postorius.) Have you changed the default encoding for English on your site? (I think you need to look in mailman.cfg for this.) What is the default language of the site?
I note that the message also declares
Content-Type: text/plain; charset="utf-8"
and Python's email package will base64 encode utf-8 message bodies. I also note that the message's Subject: is RFC 2047 encoded as utf-8. These tell me that the user's preferred_language (if set) or the list's preferred_language is not English, or that the charset for English has been changed to utf-8.
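The charset-dependent behaviour of Python's email package can be seen with the standard library alone. This is only a sketch using the legacy `MIMEText` and `Header` classes; Mailman builds its notifications through its own message classes, so the code path is not identical, and `transfer_encoding` is a helper name made up for this illustration.

```python
from email.header import Header
from email.mime.text import MIMEText

TEXT = "The results of your email command are provided below.\n"


def transfer_encoding(charset: str) -> str:
    """Return the Content-Transfer-Encoding MIMEText picks for a charset."""
    part = MIMEText(TEXT, "plain", charset)
    return part["Content-Transfer-Encoding"]


for charset in ("us-ascii", "utf-8"):
    print(charset, "->", transfer_encoding(charset))

# A header declared as utf-8 is RFC 2047 encoded even when the text is ASCII.
subject = Header("The results of your email commands", "utf-8").encode()
print(subject)
```

With the utf-8 charset the body is declared and encoded as base64, while us-ascii yields 7bit.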
I tried to duplicate this in a test installation with the charset for English set to utf-8. I get a similar response, but with
Subject: =?utf-8?q?The_results_of_your_email_commands?=
and
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Mark Sapiro writes:
The issue is the message declares
Content-Transfer-Encoding: base64
but the body is not base64 encoded so the MUA tries to base64 decode it and even though that's invalid, produces the garbled output.
:-(
I tried to duplicate this in a test installation with the charset for English set to utf-8, but I get a similar response but with
Subject: =?utf-8?q?The_results_of_your_email_commands?=
I think that's unfortunate. Not really a bug, but the RFCs recommend using the defaults (ie, US-ASCII) where that's possible. I wonder if it's possible to use email.policy to change that.[1]
With respect to the actual problem, I don't have anything to say until we have more information about the relevant settings.
Oh, I guess I also wonder if maybe the language doesn't have a translation for that message yet?
Steve
Footnotes: [1] Speaking of bugs, my MUA decoded that MIME word when it quoted the message even though it's not in the header! Aaargh.
On 22.09.2023 18:32, Stephen J. Turnbull wrote:
lists-mailman3@janaberger.de writes:
I have a setup with GNU Mailman 3.3.8 where a user tried to subscribe. When replying to the "confirm" message, the user received a completely garbled response, like so:
N?z?????zf???&???j?????u?[zZ0:???v???,j?z???k?e?0
4D???r?^b??r?????*'???y????{ax?L?X????0:???v???,j?z???k?d?m?-???zz?rn??l]y+???-E?(
?^N??'??M???t???Mz?????^i?????}?o?\????6sN?zf??????r??z?????????b?{h?????f?r?ne==
To diagnose, we really need to see that in context of the raw message as received by the user.
You can see the same or similar result by opening the EML I quoted. Does the list support attachments? Then I will attach the file directly.
The user then forwarded the .eml file to me, which clearly shows the contents of the mail as plain text while being announced as base64 (Content-Transfer-Encoding: base64):
It also clearly doesn't show anything like the mojibake above. What is the relation between the two?
Sorry, I should have made that clearer - when opening the EML file with e.g. Thunderbird, the contents (message text) of the mail as displayed is the garbled string. I assume it is the result of Thunderbird trying to process the contents as base64-encoded, which it is not.
While the problems could be related to Mailman, it could also be any MTA along the way or the user's mail client.
I suspect that the .eml file has been significantly altered by intervening software. As far as I know, the Python email package does not MIME-encode plain ASCII text as you see in the email's subject.
I have multiple control messages, all received from my instance which all contain the "Content-Transfer-Encoding: base64" marker, their contents is plain text (Content-Type: text/plain; charset="utf-8"), but they are still encoded using base64.
Of course that is not proof, but an indication that it is either done by the mail server itself or widely across MUAs.
The only other thing I can think of is it might have something to do with your internationalization settings. What is the default language of the list? (This is available in Postorius.) Have you changed the default encoding for English on your site? (I think you need to look in mailman.cfg for this.) What is the default language of the site?
Global "Preferred language" is "English (USA)", which is not overridden by the list in question. I have not changed the default encoding. Is there a surefire way to detect this?
My mailman.cfg has no section pertaining to encoding, and all related files also do not mention encoding.
Jana Berger writes:
You can see the same or similar result by opening the EML I quoted. Does the list support attachments? Then I will attach the file directly.
No, I just misjudged the length of the email -- without the header it's way too short to generate that much mojibake. I'll take your word for it now.
I have multiple control messages, all received
By whom? Through what other systems have they passed?
Is this happening with any other domain than USERDOMAIN.NET? The remote MTA in the message you included earlier seems to be something called Nemesis, and Google suggests it may be a German homebrew MTA. That's another candidate for causing this issue.
from my instance which all contain the "Content-Transfer-Encoding: base64" marker, their contents is plain text (Content-Type: text/plain; charset="utf-8"), but they are still encoded using base64.
Yes. The question remains, where is it happening and why.
As mentioned earlier, Mark tried to reproduce on one of his systems, but did not get the body in UTF-8/base64. It came out us-ascii/7bit, which is what we expect. Given Mark's experiments and the "all English" configuration you report, it seems unlikely to be from our distribution of Mailman.
One possibility is that Mailman or Python is getting confused by something in your whole system configuration (not limited to Mailman). Where did you get Mailman? Did you install an OS distribution package, from PyPI, or from source? How about Python 3?
I wonder if some MTA is using SMTPUTF8 and, to make life "easier" for itself, converts everything to UTF-8 but gets confused about whether base64 encoding has been done or not.
Global "Preferred language" is "English (USA)", which is not overridden by the list in question. I have not changed the default encoding. Is there a surefire way to detect this?
My mailman.cfg has no section pertaining to encoding, and all related files also do not mention encoding.
Then everything is configured for UTF-8 except for English, which is US-ASCII.
The only way to change the site default language or any charset is in mailman.cfg (or in the code). If you have a distro package of Mailman, you could check in site-packages/mailman/config/schema.cfg to see the settings for default_language (should be en) and immediately after [language.master] (charset should be us-ascii). I think that changing language.master.charset to utf-8 is probably very tempting for distros so that people can safely put emojis and smartquotes in their template messages in English.
Steve
On 23.09.2023 16:26, Stephen J. Turnbull wrote:
Jana Berger writes:
You can see the same or similar result by opening the EML I quoted. Does the list support attachments? Then I will attach the file directly.
No, I just misjudged the length of the email -- without the header it's way too short to generate that much mojibake. I'll take your word for it now.
I have multiple control messages, all received
By whom? Through what other systems have they passed?
Is this happening with any other domain than USERDOMAIN.NET? The remote MTA in the message you included earlier seems to be something called Nemesis, and Google suggests it may be a German homebrew MTA. That's another candidate for causing this issue.
It was (until now) only reported from this domain, but Nemesis is used by GMX, one of the largest mail providers in Germany. I am reluctant to pin this issue on their end for now.
from my instance which all contain the "Content-Transfer-Encoding: base64" marker, their contents is plain text (Content-Type: text/plain; charset="utf-8"), but they are still encoded using base64.
Yes. The question remains, where is it happening and why.
As mentioned earlier, Mark tried to reproduce on one of his systems, but did not get the body in UTF-8/base64. It came out us-ascii/7bit, which is what we expect. Given Mark's experiments and the "all English" configuration you report, it seems unlikely to be from our distribution of Mailman.
One possibility is that Mailman or Python is getting confused by something in your whole system configuration (not limited to Mailman). Where did you get Mailman? Did you install a OS distribution package, from PyPI, or from source? How about Python 3?
I am using the "docker-mailman" image from https://github.com/maxking/docker-mailman
It is using Python 3.11 with Alpine 3.18 as its base.
I wonder if some MTA is using SMTPUTF8 and to make life "easier" for themselves they convert everything to UTF-8, but gets confused about whether base64 encoding is done or not.
I am using Postfix as the main MTA which is set to "smtputf8_enable = yes" together with "smtputf8_autodetect_classes = sendmail, verify". Is this an issue?
Global "Preferred language" is "English (USA)", which is not overridden by the list in question. I have not changed the default encoding. Is there a surefire way to detect this?
My mailman.cfg has no section pertaining to encoding, and all related files also do not mention encoding.
Then everything is configured for UTF-8 except for English, which is US-ASCII.
The only way to change the site default language or any charset is in mailman.cfg (or in the code). If you have a distro package of Mailman, you could check in site-packages/mailman/config/schema.cfg to see the settings for default_language (should be en) and immediately after [language.master] (charset should be us-ascii). I think that changing language.master.charset to utf-8 is probably very tempting for distros so that people can safely put emojis and smartquotes in their template messages in English.
I checked the schema.cfg and it has "charset: us-ascii" in its "[language.master]" section, as well as "default_language: en".
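A check like this one can also be scripted. The sketch below is an assumption-laden illustration: it parses an inline sample with Python's `configparser` rather than Mailman's own configuration loader, the placement of `default_language` in a `[mailman]` section is assumed, and a real schema.cfg may contain constructs that `configparser` does not accept. `language_settings` is a hypothetical helper.

```python
import configparser

# Inline sample standing in for the relevant parts of schema.cfg / mailman.cfg.
SAMPLE = """\
[mailman]
default_language: en

[language.master]
charset: us-ascii
"""


def language_settings(cfg_text: str):
    """Return (default_language, {language code: charset}) from cfg text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    default_language = parser.get("mailman", "default_language", fallback=None)
    charsets = {
        section.split(".", 1)[1]: parser.get(section, "charset", fallback=None)
        for section in parser.sections()
        if section.startswith("language.")
    }
    return default_language, charsets


print(language_settings(SAMPLE))

# An override of the kind discussed in this thread would show up like this.
OVERRIDE = SAMPLE + "\n[language.en]\ncharset: utf-8\n"
print(language_settings(OVERRIDE))
```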
Jana Berger writes:
It was (until now) only reported from this domain, but Nemesis is used by GMX, one of the largest mail providers in Germany. I am reluctant to pin this issue on their end for now.
If it's only reported from one domain, and has never been reported for Mailman 3 before, I'm reluctant to pin this issue on Mailman.
I am using the "docker-mailman" image from https://github.com/maxking/docker-mailman
It is using Python 3.11 with Alpine 3.18 as its base.
OK, that's helpful, we have a good baseline.
I am using Postfix as the main MTA which is set to "smtputf8_enable = yes" together with "smtputf8_autodetect_classes = sendmail, verify". Is this an issue?
Given that the problem is pure ASCII text that as far as we know leaves Mailman labeled as US-ASCII/7bit, but later is being labeled as UTF-8 and base64-encoded -- yes, that's evidence.
Whether it's related to the issue or not, I don't know; I have no experience with Postfix's implementation of SMTPUTF8. If I understand the Postfix documentation[1] correctly, it *should* be irrelevant to Mailman since your configuration doesn't include smtpd, Mailman sends outgoing mail by smtpd, and Mailman doesn't use SMTPUTF8. It appears that Postfix is conservative about SMTPUTF8 and does not use it for outgoing messages unless required by the message at receipt.
The only thing I can think of at the moment that might move the discussion forward is if Mark knows how to run his test in your Docker environment to see if it also produces Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit
Footnotes: [1] https://www.postfix.org/postconf.5.html
On 9/24/23 07:20, Stephen J. Turnbull wrote:
The only thing I can think of at the moment that might move the discussion forward is if Mark knows how to run his test in your Docker environment to see if it also produces Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit
Assuming there is not something like
[language.en]
charset: utf-8
in your mailman.cfg overriding the default us-ascii charset for English, if you send a message From: yourself to list-confirm-xxxxxxx@example.com with Subject: Re: Your confirmation is needed to join the list@example.com mailing list (replacing list and example.com with the actual list and domain), you should receive a response like (some headers removed)
Subject: The results of your email commands
From: list-bounces@example.com
To: mark@example.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-ID: <169556563227.2592.7908020161516725770@example.com>
Date: Sun, 24 Sep 2023 07:27:12 -0700
Precedence: bulk
X-Mailman-Version: 3.3.9b1
List-Id: A test list <list.example.com>
List-Help: <mailto:list-request@example.com?subject=help>
List-Owner: <mailto:list-owner@example.com>
List-Subscribe: <mailto:list-join@example.com>
List-Unsubscribe: <mailto:list-leave@example.com>
The results of your email command are provided below.
- Original message details:
From: Mark Sapiro <mark@example.com>
Subject: Re: Your confirmation is needed to join the
list.example.com mailing list.
Date: Sun, 24 Sep 2023 07:27:10 -0700
Message-ID: <20230924142710.GA23819@example.com>
- Results:
Confirmation token did not match
- Done.
What do you get?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 24.09.2023 16:52, Mark Sapiro wrote:
On 9/24/23 07:20, Stephen J. Turnbull wrote:
The only thing I can think of at the moment that might move the discussion forward is if Mark knows how to run his test in your Docker environment to see if it also produces Content-Type: text/plain; charset="US-ASCII" Content-Transfer-Encoding: 7bit
Assuming there is not something like
[language.en]
charset: utf-8
in your mailman.cfg overriding the default us-ascii charset for English, if you send a message From: yourself to list-confirm-xxxxxxx@example.com with Subject: Re: Your confirmation is needed to join the list@example.com mailing list (replacing list and example.com with the actual list and domain), you should receive a response like (some headers removed)
Subject: The results of your email commands
From: list-bounces@example.com
To: mark@example.com
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Message-ID: <169556563227.2592.7908020161516725770@example.com>
Date: Sun, 24 Sep 2023 07:27:12 -0700
Precedence: bulk
X-Mailman-Version: 3.3.9b1
List-Id: A test list <list.example.com>
List-Help: <mailto:list-request@example.com?subject=help>
List-Owner: <mailto:list-owner@example.com>
List-Subscribe: <mailto:list-join@example.com>
List-Unsubscribe: <mailto:list-leave@example.com>
The results of your email command are provided below.
- Original message details:
From: Mark Sapiro <mark@example.com>
Subject: Re: Your confirmation is needed to join the
list.example.com mailing list.
Date: Sun, 24 Sep 2023 07:27:10 -0700
Message-ID: <20230924142710.GA23819@example.com>
- Results:
Confirmation token did not match
- Done.
What do you get?
Well, that confirms it.
I sent the mail from an account at GMX and received
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
with a properly (!) base64-encoded message body.
I then sent the mail from my own mail system (which is also the one being used by Mailman) and got
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
with no encoding on the message. Lastly, I sent the same mail from a GoogleMail account, which also received
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
with no encoding.
... which means this is almost surely a bug in the MTA of GMX, which will be much, much more difficult and unlikely to be fixed - that is why I had hoped for this to be an issue in/with Mailman ;)
Thank you for the pointers and help!
Jana Berger writes:
I then sent the mail from my own mail system (which is also the one being used by Mailman) and got
Content-Transfer-Encoding: 7bit
Content-Type: text/plain; charset="utf-8"
with no encoding on the message.
I wonder why the charset is utf-8, when the message is pure ASCII, and your site and list are set for en/US-ASCII. Internationalization is a mystery. ;-) I will look into the code at some point, there are a number of minor edge cases in Python's email package that I don't understand.
that is why I had hoped for this to be an issue in/with Mailman ;)
Yeah, I wish we could help -- we know your pain -- but I don't think we can.
If you get more information, feel free to come back.
Steve
On 9/23/23 02:54, Jana Berger wrote:
I have multiple control messages, all received from my instance which all contain the "Content-Transfer-Encoding: base64" marker, their contents is plain text (Content-Type: text/plain; charset="utf-8"), but they are still encoded using base64.
It seems you are saying these messages are valid. I.e. they specify "Content-Transfer-Encoding: base64" and the body is in fact base64 encoded. This is as they should be.
Of course that is not proof, but an indication that it is either done by the mail server itself or widely across MUAs.
As you say, this suggests that some MTA in the path to your user or the user's MUA is responsible for decoding the base64 encoded body but leaving the "Content-Transfer-Encoding: base64" header as is.
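A mismatch of this kind (a base64 declaration on a body that is not base64) can be detected from a saved raw message. The following is a rough sketch using only the standard library; `cte_matches_body` is a hypothetical helper, it only handles non-multipart messages, and a short plain-text body consisting solely of base64 alphabet characters would pass the check by accident.

```python
import base64
from email import message_from_string


def cte_matches_body(raw: str) -> bool:
    """Return False if a message declares base64 but the body is not valid base64."""
    msg = message_from_string(raw)
    cte = (msg.get("Content-Transfer-Encoding") or "7bit").strip().lower()
    if cte != "base64":
        return True  # nothing to verify in this sketch
    # Line breaks inside a base64 body are legitimate, so strip whitespace first.
    body = "".join(msg.get_payload().split())
    try:
        base64.b64decode(body, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return True


HEADERS = (
    'Content-Type: text/plain; charset="utf-8"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
)
TEXT = "The results of your email command are provided below.\n"

broken = HEADERS + TEXT
valid = HEADERS + base64.b64encode(TEXT.encode("utf-8")).decode("ascii") + "\n"

print(cte_matches_body(broken))
print(cte_matches_body(valid))
```

Running this against copies of the same message captured at different hops would narrow down where the body and the header stop agreeing.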
Global "Preferred language" is "English (USA)", which is not overridden by the list in question. I have not changed the default encoding. Is there a surefire way to detect this?
My mailman.cfg has no section pertaining to encoding, and all related files also do not mention encoding.
The Subject: and body of these messages should be ascii, not utf-8 unless you have something like
[language.en]
charset: utf-8
in your mailman.cfg
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
I have now seen a similar instance on one of my servers. I was able to get the raw garbled response from the user. I have asked to get the raw message that was sent, but unfortunately, I haven't gotten it and I suspect I won't. I think it was an attempted post sent by mistake to the LIST-confirm address. I have tried, so far without success, to duplicate this, but I will keep at it.
The raw message is attached. This one is worse than the one in the OP in that the Content-Transfer-Encoding is base64, but the content is unreadable in the raw message. It is clearly not a valid base64 encoding. I'm guessing that the original message looked similar to the one in the OP, but Microsoft Exchange garbled it by attempting to base64 decode the unencoded plain text and placing the result in the message body.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
participants (4)
- Jana Berger
- lists-mailman3@janaberger.de
- Mark Sapiro
- Stephen J. Turnbull