Andreas Barth writes:
However, mailman doesn't have an internal id I can use to follow the message (or at least, it doesn't transfer one to exim upon delivery).
Mailman does have an internal ID for each LMTP transaction, which is the queue file name. That is generated by hashing the message, the current time and something else, so it should be unique enough for your purpose. I guess we could log that, but I'm not sure how that would benefit you. Returning it in the LMTP status may be possible, but it's not currently exposed to the LMTP runner which manages that transaction.
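To make the idea concrete, here is a minimal sketch of how such a queue ID could be built. It is not Mailman's actual code: the function name make_queue_id, the choice of the list name as the "something else", and the use of SHA-1 are all assumptions for illustration.

```python
import hashlib
import time


def make_queue_id(message_text, list_name, now=None):
    """Return an identifier built from the message, the list name and the time.

    Hypothetical helper: the inputs and the SHA-1 choice are assumptions,
    not a description of Mailman's real queue file naming.
    """
    if now is None:
        now = time.time()
    # Mix the message, the list name and the timestamp into the hash input,
    # so the same message queued at a different time gets a different ID.
    hashfood = (
        message_text.encode('utf-8')
        + list_name.encode('utf-8')
        + repr(now).encode('ascii')
    )
    digest = hashlib.sha1(hashfood).hexdigest()
    # Prefix with the timestamp so the IDs sort roughly by arrival time.
    return '{}+{}'.format(repr(now), digest)
```

Because the timestamp is part of the hash input, two copies of an identical message arriving at different times still get distinct IDs.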
I can indeed use the user set message id for following mail.
Then I don't see the problem. Identifying a particular message is what that is for. Unless you're getting a spew of repeats of a single message (which is certainly possible, but then I doubt that the problem you're solving is tracing a single instance of the spew!), that plus the timestamp should be enough to disambiguate most possible repeats.
However, I would prefer some internal id, and also to have logged the IDs exim (or probably any other MTA) transmits upon
I don't understand. MTAs don't explicitly transmit their internal IDs to the next hop; there's no provision for this in the SMTP (LMTP) protocol AFAICR. They can add it to their Received field, and they can log it themselves. In theory they can return it in the status message they pass back to the previous MTA upon acceptance, but I rarely recall seeing it done (Exchange does it, I think). And in any case this only works on your site; you're still going to have to fall back on some combination of message-id, author, and timestamp as the message goes from host to host.
Please be more specific about how IDs are transmitted at your site, and be careful that different mechanisms have to be used on receipt and forwarding.
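For the Received-field route mentioned above, a rough sketch of pulling out whatever "id" clause each hop chose to record might look like this. The helper name received_ids and the regular expression are assumptions. The clause is optional and its content is up to each MTA, so None is a normal result for a hop.

```python
import re
from email import message_from_string

# Matches the optional "id <token>" clause of a Received field.
# This pattern is an assumption; MTAs are free to omit the clause.
ID_CLAUSE = re.compile(r'\bid\s+([^\s;]+)', re.IGNORECASE)


def received_ids(raw_message):
    """Return one entry per Received field, newest hop first.

    Each entry is the token after "id", or None if that hop logged no ID.
    """
    msg = message_from_string(raw_message)
    ids = []
    for value in msg.get_all('Received', []):
        # Unfold continuation lines into a single space-separated string.
        unfolded = ' '.join(value.split())
        match = ID_CLAUSE.search(unfolded)
        ids.append(match.group(1) if match else None)
    return ids
```

Even when this works, it only recovers IDs for hops that wrote one, which is why message-id plus timestamp remains the fallback.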
delivering the mails when handing the mails over to the MTA again. (Probably in some format such as "mailman-ID recipient MTA-id" so that it's easy for me to grep for a message if a user complains.)
I guess we could do that for the outgoing MTA if it's present in the status message, but that's likely to be a lot of output unless all your subscribers are in the same domain. But there's no way to get it for the incoming MTA as far as I know, except maybe parsing it out of the Received chain.
Besides the dubious existence of the relevant data, the lack of support in the SMTP and LMTP protocols means that every MTA, and some exceptional instances of each MTA (I'm pretty sure both Postfix and Exim4 support customizable status messages), would have to be supported individually by parsing the ID out of the status return or the Received chain, neither of which is standardized. This sounds like a lot of work for us for each MTA in use on the Internet.
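To illustrate the per-MTA parsing this would need, here is a sketch that handles only the stock acceptance replies of Postfix ("queued as ...") and Exim ("id=..."). The function names, the pattern table, and the "-" placeholder are assumptions, and a site that customizes its status text would silently fall through to "no ID".

```python
import re

# One pattern per MTA, matching its default acceptance reply.
# Assumed formats: "250 2.0.0 Ok: queued as <id>" (Postfix) and
# "250 OK id=<id>" (Exim). Customized replies will not match.
STATUS_PATTERNS = [
    ('postfix', re.compile(r'queued as (\S+)')),
    ('exim', re.compile(r'\bid=(\S+)')),
]


def mta_id_from_status(status_text):
    """Return (mta_name, mta_id), or (None, None) if nothing matches."""
    for name, pattern in STATUS_PATTERNS:
        match = pattern.search(status_text)
        if match:
            return name, match.group(1)
    return None, None


def format_log_line(mailman_id, recipient, status_text):
    """Build a grep-friendly "mailman-ID recipient MTA-id" line."""
    _, mta_id = mta_id_from_status(status_text)
    # Use "-" when the MTA did not report an ID we know how to parse.
    return '{} {} {}'.format(mailman_id, recipient, mta_id or '-')
```

Every additional MTA, or locally customized reply, means another entry in that table, which is the maintenance cost described above.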
Steve