On 8/8/21 11:51, Stephen J. Turnbull wrote:
Guillermo Hernandez (Oldno7) via Mailman-users writes:
I find it very convenient too, but it concerns me that spammers, really "professional" spammers and senders of phishing mail, change the forged address and sender domain too rapidly to track them. As a result, the non-member database keeps growing. It could be a "monster" in two or three years...
Well, let's think that out. I think we allocate a 255-byte text field for email addresses, another 255 bytes for display names, and I can't imagine that including the rest of the data fills as much as 1k, even including indices and other database optimization overhead. So if you get a million spam messages to your lists, each from a unique address, that's only one gigabyte. To get a million spam messages in three years, you'd need to get almost a thousand per day that get through your MTA's filters to Mailman's moderation queue, which is the only place this automatic construction of nonmembers takes place. I think your moderators would let you know about that long before your disk fills up.
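The arithmetic behind that estimate can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: the 1000-byte record size is the generous upper bound assumed above, not a measured figure, and the variable names are made up for illustration.

```python
# Back-of-the-envelope check of the storage estimate.
BYTES_PER_RECORD = 1_000      # assumed upper bound per nonmember record ("1k")
SPAM_ADDRESSES = 1_000_000    # one unique forged address per spam message
DAYS = 3 * 365                # three years

total_bytes = BYTES_PER_RECORD * SPAM_ADDRESSES
per_day = SPAM_ADDRESSES / DAYS

print(f"storage: {total_bytes / 1e9:.1f} GB")
print(f"required rate: {per_day:.0f} messages/day")
```

Under these assumptions the script reports about 1 GB of storage and roughly 913 held messages per day.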
You are absolutely right. I am the problem... or rather my background: I started developing in the eighties of the last century, when 1k (no mistake here) made a difference.
I'm not making fun of you,
Never thought that.
I'm explaining why I don't think this is a high priority for us, and why you don't need to worry about monster databases, either: the human resources just for looking at the queue, let alone taking action, will be a constraint long before the disk or CPU consumption for the address database is. If there's something wrong with my analysis, let me know, but as far as I can see, if the spam flood to your lists gets to that scale, the first worries are denial of service and admin exhaustion. As usual, the best answer is to stop the barbarians at the gate (i.e., the MTA).
Yes... that's the point where I purge most of it (completely discarding email from domains that will never send a legitimate email to our lists, and in a few cases entire .XXX areas) for "normal" addresses. But I have to confess that I relied on Mailman's control of subscriptions for that on the lists. And it all went well until I opened a security hole with one non-member address.
I guess if the list is configured to automatically discard all posts from non-members, we could optimize away the non-member creation. But this seems like a relatively rare case, and if there were non-local bans on addresses, it might not be a good idea anyway.
And that's exactly the case of the list that a malicious mail got into: it's a private list set to "discard all non-subscriber addresses", but there was a legacy address imported from Mailman 2 that had permission to send mail (and I was not aware of it)... and a bastard spammer forged it (well, it was one of many attempts with different origin addresses). My fault completely.
But this is the only case in more than 8 months of using Mailman 3 (and more than 12 years of using Mailman).
Since that list is private and not publicly available, I suspect that one subscriber's address book has been hijacked, so I can expect some crap from a forged subscriber address. But that is another story...
Thanks a lot, again, for all your explanations and good work.
I'm always grateful to all of you.
In theory, we could even arrange to do the lookup at moderation time, although we currently do it at mail receipt time. That would allow a moderator to ban the address on list A and then not even see the post when they go to check on list B. I suspect even professional spammers don't bother to create a unique sender address for each recipient! Obviously, this kind of service would require putting the nonmember addresses in the database, at least temporarily.
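To make the moderation-time idea concrete, here is a minimal sketch in plain Python. It is not Mailman's data model or API: `HeldMessage`, `ModerationStore`, and the site-wide `banned` set are hypothetical names invented for this illustration. The only point is that the ban check runs when the queue is read, not when the mail arrives.

```python
from dataclasses import dataclass, field


@dataclass
class HeldMessage:
    list_id: str
    sender: str
    subject: str


@dataclass
class ModerationStore:
    # Hypothetical store: held posts for all lists plus one shared ban set.
    held: list = field(default_factory=list)
    banned: set = field(default_factory=set)

    def hold(self, msg):
        # Mail receipt time: just record the post, no ban lookup yet.
        self.held.append(msg)

    def ban(self, address):
        # A moderator on any list adds the sender to the shared ban set.
        self.banned.add(address.lower())

    def queue_for(self, list_id):
        # Moderation time: the lookup happens here, so bans made after
        # receipt still hide the post from other lists' queues.
        return [m for m in self.held
                if m.list_id == list_id and m.sender.lower() not in self.banned]


store = ModerationStore()
store.hold(HeldMessage("list-a", "spam@example.com", "Buy now"))
store.hold(HeldMessage("list-b", "spam@example.com", "Buy now"))
store.hold(HeldMessage("list-b", "alice@example.org", "Question"))

store.ban("spam@example.com")          # moderator acts on list A
visible = [m.sender for m in store.queue_for("list-b")]
print(visible)
```

After the ban, the list B queue contains only the post from alice@example.org, even though the spam was held for list B before the ban existed.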
Steve
--