HyperKitty and duplicate emails across mailing lists
How does HyperKitty handle duplicate messages/Message-IDs across mailing lists?
We have quite a few lists that are members of other lists.
I'm in the process of importing our old Mailman 2.1 archives and seeing a lot of duplicate message errors, and wondering if this is why.
Would new messages coming into Mailman be handled the same way?
Thanks, Derek
On Mon, Sep 9, 2019, at 9:26 AM, Derek Lambert wrote:
> How does HyperKitty handle duplicate messages/Message-IDs across mailing lists?
> We have quite a few lists that are members of other lists.
> I'm in the process of importing our old Mailman 2.1 archives and seeing a lot of duplicate message errors, and wondering if this is why.
> Would new messages coming into Mailman be handled the same way?
Yes, Message-IDs are supposed to be unique, and whether a message counts as a duplicate is determined by its Message-ID.
HyperKitty will refuse to archive a duplicate Message-ID since the field is a unique field.
I don't know if there will be an exception raised or it will be handled more gracefully :)
Mailman-users mailing list -- mailman-users@mailman3.org To unsubscribe send an email to mailman-users-leave@mailman3.org https://lists.mailman3.org/mailman3/lists/mailman-users.mailman3.org/
-- thanks, Abhilash Raj (maxking)
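One way to check whether repeated Message-IDs are really behind the import errors is to count them before importing. The following is a minimal sketch using only the Python standard library; the function name `duplicate_message_ids` and the sample messages are hypothetical and are not part of Mailman or HyperKitty.

```python
from collections import Counter
from email import message_from_string


def duplicate_message_ids(messages):
    """Return {message_id: count} for every Message-ID seen more than once."""
    counts = Counter((msg["Message-ID"] or "").strip() for msg in messages)
    # Messages with no Message-ID at all are a separate problem; ignore them here.
    counts.pop("", None)
    return {mid: n for mid, n in counts.items() if n > 1}


# Hypothetical sample data standing in for one list's old archive.
raw = [
    "Message-ID: <a@example.com>\nSubject: first\n\nbody\n",
    "Message-ID: <b@example.com>\nSubject: second\n\nbody\n",
    "Message-ID: <a@example.com>\nSubject: first again\n\nbody\n",
]
sample = [message_from_string(text) for text in raw]
print(duplicate_message_ids(sample))
```

For a real archive, the messages yielded by iterating over `mailbox.mbox(path)` from the standard library can be passed to the same function, one mbox file per list.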
For new messages, does Mailman generate unique Message-IDs for each recipient?
E.g., if multiple lists (children) are members of another list (parent), will the child lists all be able to archive the same message sent by the parent?
Thanks, Derek
Derek Lambert writes:
> For new messages, does Mailman generate unique Message-IDs for each recipient?
Mailman in principle doesn't generate message IDs at all. That's the job of originating clients or their MTAs.
Mailman is what is called an "intermediary": it may do useful transformations on a message, such as adding RFC 2369 header fields and the List-Id field, adding headers, footers, and subject decorations, removing banned media types, and even changing the format of text parts (for example, removing HTML parts from multipart/alternative messages, or translating HTML to plain text by formatting with a browser such as Lynx). In some sense, though, the content of the message remains faithful to the author's intent. In such cases, the message is considered to be the same for the purposes of assigning Message-Id, so Mailman doesn't touch it.
In the case that Mailman receives a message that has no Message-Id, it may be a good idea to generate one. (I don't think this is implemented. It has its problems as well as its benefits.) Otherwise, no, Mailman leaves the existing Message-Id alone.
> E.g., if multiple lists (children) are members of another list (parent), will the child lists all be able to archive the same message sent by the parent?
Yes. I'm not sure whether the message body is stored multiple times or only once[1], but the primary key for an archived message is effectively the list id plus the Message-ID.
Steve
Footnotes: [1] Probably multiple times, once for each list that has archiving turned on. Ensuring that any relevant decorations (headers, footers, Subject tags) are consistent would require pretty fragile analysis of the messages and complicated storage and indices.
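The "intermediary" behaviour described above can be illustrated with a small sketch: each list decorates its own copy of the message but leaves Message-ID alone, so every copy carries the same ID. This uses Python's standard `email` package; the `decorate` function, the list names, and the subject tags are hypothetical and do not reflect Mailman's actual pipeline.

```python
from email import message_from_string


def decorate(msg, list_id, subject_tag):
    """Add a List-Id header and a subject tag; never touch Message-ID."""
    del msg["List-Id"]  # no error if absent; avoids a second List-Id header
    msg["List-Id"] = f"<{list_id}>"
    subject = msg["Subject"] or ""
    if subject_tag not in subject:
        del msg["Subject"]
        msg["Subject"] = f"{subject_tag} {subject}".strip()
    return msg


original_text = (
    "Message-ID: <abc123@example.com>\n"
    "Subject: hello\n"
    "\n"
    "body text\n"
)

# Two child lists each receive and decorate their own copy.
child_a = decorate(message_from_string(original_text), "child-a.example.com", "[child-a]")
child_b = decorate(message_from_string(original_text), "child-b.example.com", "[child-b]")

print(child_a["Subject"], child_a["Message-ID"])
print(child_b["Subject"], child_b["Message-ID"])
```

Both copies end up with different Subject and List-Id values but an identical Message-ID, which is why an archive needs the list id as well as the Message-ID to tell them apart.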
On Mon, Sep 16, 2019, at 9:31 AM, Stephen J. Turnbull wrote:
> Yes. I'm not sure whether the message body is stored multiple times or only once[1], but the primary key for an archived message is effectively the list id plus the Message-ID.
It is stored separately for each list, and HyperKitty enforces a unique combination of list_id and message_id (like Steve mentioned :-). So you can't have two messages with the same ID in the same list, but in separate lists they should be fine.
-- thanks, Abhilash Raj (maxking)
On 16 September 2019 at 11:08 -0700, Abhilash Raj wrote:
> It is stored separately for each list, and HyperKitty enforces a unique combination of list_id and message_id (like Steve mentioned :-). So you can't have two messages with the same ID in the same list, but in separate lists they should be fine.
Just so I understand this correctly: Mailman only stores a message by list_id plus Message-ID? If so, what happens if a malicious user sends a message with an intentionally duplicate Message-ID to a list? Can a user thereby manipulate the message archive? Mailman should probably reject the message with the duplicate ID.
-- Blog: https://mg.guelker.eu
On 9/16/19 12:45 PM, Marvin Gülker wrote:
> Just so I understand this correctly: Mailman only stores a message by list_id plus Message-ID? If so, what happens if a malicious user sends a message with an intentionally duplicate Message-ID to a list? Can a user thereby manipulate the message archive? Mailman should probably reject the message with the duplicate ID.
Mailman will accept the message (assuming it passes other checks), but HyperKitty won't archive it because of the duplicate Message-ID. Thus, a malicious user can't manipulate the archive in this way.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
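The point that a later duplicate cannot alter what is already archived can be shown with a toy model. This is only a sketch of "first copy wins" behaviour as described in this thread; the `Archive` class and its methods are hypothetical and are not HyperKitty's code.

```python
class Archive:
    """Toy archive keyed by (list_id, message_id); the first copy wins."""

    def __init__(self):
        self._messages = {}

    def add(self, list_id, message_id, body):
        """Store the message; return False and change nothing on a duplicate."""
        key = (list_id, message_id)
        if key in self._messages:
            return False  # duplicate: keep what is already stored
        self._messages[key] = body
        return True

    def get(self, list_id, message_id):
        return self._messages[(list_id, message_id)]


archive = Archive()
first = archive.add("dev.example.com", "<1@example.com>", "genuine post")
forged = archive.add("dev.example.com", "<1@example.com>", "forged replacement")

print(first, forged)
print(archive.get("dev.example.com", "<1@example.com>"))
```

The forged message may still be delivered to subscribers, but the archived copy stays the genuine one because the second `add` is refused rather than treated as an update.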
On Mon, Sep 16, 2019, at 12:45 PM, Marvin Gülker wrote:
> Just so I understand this correctly: Mailman only stores a message by list_id plus Message-ID?
Mailman (technically, HyperKitty) stores the message with separate list_id and message_id fields, but the unique constraint is applied to the combination of the two.
-- thanks, Abhilash Raj (maxking)
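A composite unique constraint of the kind described above can be sketched with SQLite from the Python standard library. The table name `email`, its columns, and the `archive` helper are assumptions made for illustration; this is not HyperKitty's actual schema or API.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE email (
        id INTEGER PRIMARY KEY,
        list_id TEXT NOT NULL,
        message_id TEXT NOT NULL,
        UNIQUE (list_id, message_id)  -- uniqueness applies to the pair
    )
    """
)


def archive(list_id, message_id):
    """Insert a row; return False if this list already has that Message-ID."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO email (list_id, message_id) VALUES (?, ?)",
                (list_id, message_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(archive("parent.example.com", "<abc@example.com>"))  # True
print(archive("child.example.com", "<abc@example.com>"))   # True: different list
print(archive("child.example.com", "<abc@example.com>"))   # False: same list, same ID
```

Each field on its own may repeat; only a repeated pair is refused, which matches the behaviour described for parent and child lists archiving the same message.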
On 9/16/19 9:31 AM, Stephen J. Turnbull wrote:
> In the case that Mailman receives a message that has no Message-Id, it may be a good idea to generate one. (I don't think this is implemented. It has its problems as well as its benefits.)
Mailman rejects messages without a Message-ID: header. See <https://gitlab.com/mailman/mailman/issues/490>.
-- Mark Sapiro <mark@msapiro.net>, San Francisco Bay Area, California
The highway is for gamblers, better use your sense - B. Dylan
participants (5)
- Abhilash Raj
- Derek Lambert
- Mark Sapiro
- Marvin Gülker
- Stephen J. Turnbull