Philip Colmer writes:
I'm not sure that the message ID identifies the author, does it?
It does not identify any person. According to RFC 822 and all successors, it is an "author field", so third parties such as mailing lists should keep their hands off. The agent that causes a message to be sent is the "author", and it's up to them. While I am not the authoritative spokesperson for Mailman, I'm pretty good at channeling the core group, and I'm confident that the Mailman developers do not consider Mailman's role to be "authorship", even though Mailman may do things like delete attachments or HTML alternative parts, convert HTML-only messages to text/plain, and decorate messages with tags in Subject and headers and footers on the body.
There are good reasons for Mailman adopting this posture. In particular, while Google does an execrable job of handling the poster's own duplicates, eliminating duplicates via mailing lists is an important use case for "Message-ID" for everybody except us list admins, who desperately want access to both copies.
I think that what Mailman 3 is doing is similar to forwarding the email. If I forward an email, again the headers change:
There are two kinds of forwards. In one, you simply *resend* the original. (This function is sometimes called "bounce", not to be confused with a refused and returned message.) In that case, the Message-ID should stay the same so that tools can identify the message. In the other, you *attach* the message to a new message. MUAs allow you to add your own text, and most folks do (they're citing the message as support for something they want to say). Since the software can't read your mind, the MUA or the MTA will treat it as a new message and assign a new Message-ID. On the other hand, Mailman has no intention of adding (or usually subtracting) content any more than the mailman does when they put a postmark on the envelope, so we do not consider Mailman to be an author.
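To make the distinction concrete, here is a rough sketch of the two kinds of forward using Python's standard email package. It is only an illustration, not how Mailman or any particular MUA does it; the function names, addresses, and the example.net domain are all made up for the example.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage
from email.utils import make_msgid

# A hypothetical original post (addresses and ID are invented).
original = EmailMessage()
original["From"] = "poster@example.org"
original["To"] = "list@example.org"
original["Subject"] = "Hello"
original["Message-ID"] = "<orig-123@example.org>"
original.set_content("Original body\n")


def resend(msg, resender):
    """Kind 1: resend ("bounce") the original; Message-ID is untouched."""
    # Round-trip through bytes to get an independent copy.
    out = message_from_bytes(msg.as_bytes(), policy=policy.default)
    # The resender identifies itself in Resent-* fields instead.
    # (RFC 5322 wants these prepended; this sketch just appends them.)
    out["Resent-From"] = resender
    out["Resent-Message-ID"] = make_msgid(domain="example.net")
    return out


def forward_as_attachment(msg, forwarder, comment):
    """Kind 2: a new message, with a new Message-ID, carrying the original."""
    new = EmailMessage()
    new["From"] = forwarder
    new["Subject"] = "Fwd: " + msg["Subject"]
    new["Message-ID"] = make_msgid(domain="example.net")
    new.set_content(comment)
    new.add_attachment(msg)  # attached as a message/rfc822 part
    return new


resent = resend(original, "relay@example.net")
forwarded = forward_as_attachment(
    original, "someone@example.net", "See the attached post.\n"
)
```

In the first case a deduplicating tool still sees the original Message-ID; in the second, the forwarder is the author of a new message and the original ID survives only inside the attachment.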
The idea that mailing lists should "take authorship" of posts comes up often, in many different contexts. It's almost always the case that the issue raised is caused by other human or software agents' misbehavior.
The IETF has addressed the issue of non-identical but equivalent messages with the Resent-Message-ID field (and in general the whole suite of Resent-* fields).
Mailman doesn't appear to be adding that header. Would that be an acceptable solution instead of the one I suggested above?
I doubt it solves anything. I'm sure Google uses Message-ID to deduplicate. That's one of the things Message-ID is *for*; Google just does it in a way that creates maximum breakage, including for their own users.
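For anyone who hasn't looked at how deduplication on Message-ID works, a toy sketch follows. It is emphatically not Google's implementation (which I haven't seen); the `deliver` function and the IDs are hypothetical, and a real mail store has far more to it.

```python
def deliver(mailbox, seen_ids, message_id, source):
    """Store the message unless a copy with this Message-ID already arrived."""
    if message_id in seen_ids:
        return False  # treated as a duplicate and dropped
    seen_ids.add(message_id)
    mailbox.append((message_id, source))
    return True


# A CC'd subscriber gets one direct copy and one copy via the list.
# List keeps the poster's Message-ID: the second copy is recognized.
mailbox_kept, seen_kept = [], set()
deliver(mailbox_kept, seen_kept, "<orig-123@example.org>", "direct")
deliver(mailbox_kept, seen_kept, "<orig-123@example.org>", "list")

# List rewrites the Message-ID: both copies are stored.
mailbox_rewritten, seen_rewritten = [], set()
deliver(mailbox_rewritten, seen_rewritten, "<orig-123@example.org>", "direct")
deliver(mailbox_rewritten, seen_rewritten, "<list-456@lists.example.org>", "list")
```

With the ID preserved the subscriber's mailbox holds one copy; with the ID rewritten it holds two, and the tool has no way to tell that they are the same post.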
I appreciate your view that Google is broken. I'm just trying to find an acceptable solution to the end-user perception of "I've sent an email to the list but I haven't received a copy back". That is going to be a big issue/headache for us.
I feel for you, but do you think that's a bigger headache than if Mailman is hacked to change the Message-ID and every CC'd subscriber starts receiving duplicates? Do you really want to punish people who use competent MUAs because others use a horrible one? Google is not just broken "in my view," it's logically broken. *Somebody* among your subscribers is going to be annoyed by whatever you do to deal with this.
Note that changing Message-ID and setting all subscribers to "no dupes" is not a 100% solution to this; it means that filtering on List-* headers breaks because Mailman doesn't even send them the message, and the subscribers don't get serial numbers, tags, and footers in posts they're cc'd on.
"It's broken *all the way down*." If I ever get invited on "The New Abnormal", GMail will get my FTG for this.
Steve