Alan So writes:
- Create two lists with default setting
- Prepare a list of 1000 fresh email addresses not added to the system before
- Mass subscribe both lists of these 1000 email addresses around the same time with all three Pre (confirm, approved, verified) checked [...] (psycopg2.errors.UniqueViolation) duplicate key value violates unique constraint "ix_address_email" DETAIL: Key (email)=(testemail999@example.com) already exists.
I think there's some kind of race condition. I would bet it's in Mailman core, not in the RDBMS code or the ORM. The process is:
1. Check if the address is known.
2. If yes:
   a. Get the address's user.
   b. Add the subscription pair (address, list) to the user.
   c. Add the address to the list.
3. If no:
   a. Create the address object.
   b. Create a user for the address, and link them together.
   c. Add the subscription pair (address, list) to the user.
   d. Add the address to the list.
Each line is a separate database query, I suspect, so the race is between steps 1 and 3a. If two requests for the same new address arrive at the "same" time, both will find the address unknown in step 1 and both will try to create it in step 3a, but only one can succeed.
I guess we should catch the error and retry. Raising and handling exceptions in Python is relatively slow, but even in your well-constructed worst case this shouldn't happen on every address, so I don't think having a separate queue or putting the whole thing in a transaction would be better. If you still have the log, I'd be curious to know how many unique-constraint errors you got.
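To make the catch-and-retry idea concrete, here is a minimal sketch. It is not Mailman's code (I don't know where that lives); it uses the standard library's sqlite3 instead of PostgreSQL and the ORM, and the table layout, the function name get_or_create_address, and the before_insert hook are all hypothetical. The hook exists only to simulate a second request creating the address between the lookup and the insert.

```python
import sqlite3


def make_db():
    """Create an in-memory database with a unique email column (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE address (id INTEGER PRIMARY KEY, email TEXT UNIQUE)")
    return conn


def get_or_create_address(conn, email, retries=3, before_insert=None):
    """Return the id of the address row for `email`, creating it if needed.

    If the INSERT hits the unique constraint, assume another request
    created the address first, then go back and look it up again.
    """
    for _ in range(retries):
        # Step 1: check if the address is known.
        row = conn.execute(
            "SELECT id FROM address WHERE email = ?", (email,)
        ).fetchone()
        if row is not None:
            return row[0]

        # Demo-only hook: lets a caller simulate a concurrent request
        # that slips in between the lookup and the insert.
        if before_insert is not None:
            before_insert()

        # Step 3a: create the address object.
        try:
            with conn:  # commits on success, rolls back on an exception
                cur = conn.execute(
                    "INSERT INTO address (email) VALUES (?)", (email,)
                )
            return cur.lastrowid
        except sqlite3.IntegrityError:
            # Lost the race: the address exists now, so retry the lookup.
            continue
    raise RuntimeError(f"could not get or create address {email!r}")
```

In the real code the exception to catch would presumably be whatever the ORM raises around psycopg2's UniqueViolation, and the failed transaction would need to be rolled back before retrying. The retry costs one extra lookup only for the requests that actually lose the race.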
I need to do other work for the next 48 hours, but a patch would be very welcome. I don't know offhand where this code lives, sorry.
Steve