Guillermo Hernandez (Oldno7) via Mailman-users writes:
I find it very convenient too, but it concerns me that spammers, really "professional" spammers and senders of phishing mail, change the forged address and sender domain too rapidly to track them. As a result, the non-member database is growing more and more. It could be a "monster" in two or three years...
Well, let's think that out. I think we allocate a 255-byte text field for email addresses, another 255 bytes for display names, and I can't imagine that including the rest of the data fills as much as 1 kB per record, even including indexes and other database optimization overhead. So if you get a million spam messages to your lists, each from a unique address, that's only one gigabyte. To get a million spam messages in three years, you'd need to get almost a thousand per day that get through your MTA's filters to Mailman's moderation queue, which is the only place this automatic construction of nonmembers takes place. I think your moderators would let you know about that long before your disk fills up.
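For anyone who wants to check that arithmetic or plug in their own numbers, here is a rough sketch. The 1 kB per record is the generous upper bound guessed above, not a measured figure, and the function names are made up for illustration.

```python
# Back-of-the-envelope estimate of nonmember table growth.
# BYTES_PER_RECORD is an assumed upper bound, not a measured value.
BYTES_PER_RECORD = 1000
UNIQUE_SENDERS = 1_000_000
YEARS = 3


def storage_bytes(records: int, bytes_per_record: int = BYTES_PER_RECORD) -> int:
    """Total storage if every record takes the assumed upper bound."""
    return records * bytes_per_record


def held_per_day(records: int, years: int = YEARS) -> float:
    """Unique-sender messages per day needed to reach `records` in `years`."""
    return records / (years * 365)


total = storage_bytes(UNIQUE_SENDERS)
rate = held_per_day(UNIQUE_SENDERS)
print(f"{total / 1e9:.1f} GB after {YEARS} years")
print(f"{rate:.0f} held messages per day")
```

With these assumptions it prints 1.0 GB and roughly 913 messages per day, which is the "almost a thousand" figure.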
I'm not making fun of you, I'm explaining why I don't think this is a high priority for us, and why you don't need to worry about monster databases, either: the human resources just for looking at the queue, let alone taking action, will be a constraint long before the disk or CPU consumption for the address database is. If there's something wrong with my analysis, let me know, but as far as I can see, if the spam flood to your lists gets to that scale, the first worries are denial of service and admin exhaustion. As usual, the best answer is to stop the barbarians at the gate (i.e., the MTA).
I guess if the list is configured to automatically discard all posts from non-members, we could optimize away the non-member creation. But this seems like a relatively rare case, and if there were non-local bans on addresses, it might not be a good idea anyway.
In theory, we could even arrange to do the lookup at moderation time, although we currently do it at mail receipt time. That would allow a moderator to ban the address on list A and then not even see the post when they go to check on list B. I suspect even professional spammers don't bother to create a unique sender address for each recipient! Obviously, this kind of service would require putting the nonmember addresses in the database, at least temporarily.
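To make the moderation-time idea concrete, here is a toy sketch. It is not Mailman code and does not use Mailman's API: `HeldMessage`, `visible_queue`, and the site-wide `banned` set are all hypothetical names, and the sketch assumes a ban made on one list is visible to every list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HeldMessage:
    """Hypothetical stand-in for a post held for moderation."""
    list_name: str
    sender: str
    subject: str


def visible_queue(held, list_name, banned):
    """Check the ban set when the queue is viewed, not when mail arrives."""
    return [
        m for m in held
        if m.list_name == list_name and m.sender.lower() not in banned
    ]


# The same forged sender hits two lists; both posts are held on receipt.
held = [
    HeldMessage("list-a", "spammer@example.com", "Buy now"),
    HeldMessage("list-b", "spammer@example.com", "Buy now"),
    HeldMessage("list-b", "newcomer@example.org", "Question about the list"),
]

banned = set()
banned.add("spammer@example.com")  # moderator bans the address on list A

queue_b = visible_queue(held, "list-b", banned)
for m in queue_b:
    print(m.sender, "-", m.subject)
```

Because the check happens when the queue is displayed, the list B moderator only sees the newcomer's post, even though the spam was already held there before the ban.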
Steve