Extracting mailing list users
Hi,
I would like to export users of my mailing lists including their full name (if present). The export button in Postorius creates a CSV with addresses only, and the tool bin/list_members mentioned in the Mailman 2 FAQ does not seem to be available in Mailman 3.
I have full access to the server/database, so scraping the web interface with Mark's script seems unneccessary. Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Thanks, Jan
On Fri, Dec 30, 2022 at 1:57 PM Jan Eden via Mailman-users < mailman-users@mailman3.org> wrote:
Hi,
I would like to export users of my mailing lists including their full name (if present). The export button in Postorius creates a CSV with addresses only, and the tool bin/list_members mentioned in the [Mailman 2 FAQ] does not seem to be available in Mailman 3.
I have full access to the server/database, so scraping the web interface with [Mark's script] seems unneccessary. Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Thanks, Jan
For MM3:
(venv) [mailman@gw ~/mm]$ mailman members listname@domain --output some_file.txt
-- Best regards, Odhiambo WASHINGTON, Nairobi,KE +254 7 3200 0004/+254 7 2274 3223 "Oh, the cruft.", egrep -v '^$|^.*#' ¯\_(ツ)_/¯ :-)
On 2022-12-30 14:10, Odhiambo Washington wrote:
On Fri, Dec 30, 2022 at 1:57 PM Jan Eden via Mailman-users < mailman-users@mailman3.org> wrote:
Hi,
I would like to export users of my mailing lists including their full name (if present). The export button in Postorius creates a CSV with addresses only, and the tool bin/list_members mentioned in the [Mailman 2 FAQ] does not seem to be available in Mailman 3.
I have full access to the server/database, so scraping the web interface with [Mark's script] seems unneccessary. Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Thanks, Jan
For MM3:
(venv) [mailman@gw ~/mm]$ mailman members listname@domain --output some_file.txt
Thank you! This should have been obvious to me, but I never actually used the mailman command so far. On minor issue: All user names with non-ASCII characters appear with quoted-printable encoding (e.g. =?utf-8?q?J=C3=BCrgen?=). Is there a way to output UTF-8 instead of QP?
- Jan
On 12/30/22 03:25, Jan Eden via Mailman-users wrote:
On minor issue: All user names with non-ASCII characters appear with quoted-printable encoding (e.g. =?utf-8?q?J=C3=BCrgen?=). Is there a way to output UTF-8 instead of QP?
I just created https://gitlab.com/mailman/mailman/-/issues/1048 for this. I think this patch ``` --- a/src/mailman/commands/cli_members.py +++ b/src/mailman/commands/cli_members.py @@ -104,10 +104,10 @@ def display_members(ctx, mlist, role, regular, digest, if nomail is not None: if member.delivery_status not in status_types: continue - if email_only: + if email_only or not address.display_name: print(address.original_email, file=outfp) else: - print(formataddr((address.display_name, address.original_email)), + print(f'{address.display_name} <{address.original_email}>', file=outfp) ``` (line may be wrapped) will fix it. -- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
On 2022-12-30 10:29, Mark Sapiro wrote:
On 12/30/22 03:25, Jan Eden via Mailman-users wrote:
On minor issue: All user names with non-ASCII characters appear with quoted-printable encoding (e.g. =?utf-8?q?J=C3=BCrgen?=). Is there a way to output UTF-8 instead of QP?
I just created https://gitlab.com/mailman/mailman/-/issues/1048 for this. I think this patch ``` --- a/src/mailman/commands/cli_members.py +++ b/src/mailman/commands/cli_members.py @@ -104,10 +104,10 @@ def display_members(ctx, mlist, role, regular, digest, if nomail is not None: if member.delivery_status not in status_types: continue - if email_only: + if email_only or not address.display_name: print(address.original_email, file=outfp) else: - print(formataddr((address.display_name, address.original_email)), + print(f'{address.display_name} <{address.original_email}>', file=outfp)
``` (line may be wrapped) will fix it.
Excellent – thank you! - Jan
I just applied that patch, but still my members with german Umlauts are printed like =?utf-8?q?Stefan_H=C3=BClzer?=
Happy new year!
On 1/1/23 07:47, Jan Eden via Mailman-users wrote:
On 2023-01-01 12:51, Eggert Ehmke via Mailman-users wrote:
I just applied that patch, but still my members with german Umlauts are printed like =?utf-8?q?Stefan_H=C3=BClzer?=
It worked for me.
It works for me too. Are you sure you applied it to the correct instance of cli_members.py, e.g. the one in the site-packages/mailman/ directory in your venv if you have a virtualenv install?
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
I edited this file: venv/lib/python3.9/site-packages/mailman/commands/cli_members.py
The content at line 107ff is: #patch of Mark Sapiro # if email_only: if email_only or not address.display_name: print(address.original_email, file=outfp) else: # print(formataddr((address.display_name, address.original_email)), file=outfp) print(f'{address.display_name} <{address.original_email}>', file=outfp)
This file is actually executed when I call mailman members. I verified this by uncommenting the first (old) print line. Now each address is printed twice (as expected), with the utf8 code in both lines in case of umlauts.
On 1/1/23 08:14, Eggert Ehmke via Mailman-users wrote:
I edited this file: venv/lib/python3.9/site-packages/mailman/commands/cli_members.py
The content at line 107ff is: #patch of Mark Sapiro # if email_only: if email_only or not address.display_name: print(address.original_email, file=outfp) else: # print(formataddr((address.display_name, address.original_email)), file=outfp) print(f'{address.display_name} <{address.original_email}>', file=outfp)
This file is actually executed when I call mailman members. I verified this by uncommenting the first (old) print line. Now each address is printed twice (as expected), with the utf8 code in both lines in case of umlauts.
In that case the address.display_name value is the RFC 2047 encoded name. One way this can happen is if the subscribe was via email prior to Mailman 3.3.3. See https://gitlab.com/mailman/mailman/-/issues/802.
You could try something like this to fix these.
mailman shell
Welcome to the GNU Mailman shell
Use commit() to commit changes.
Use abort() to discard changes since the last commit.
Exit with ctrl+D does an implicit commit() but exit() does not.
>>> um = getUtility(IUserManager)
>>> from email.header import decode_header, make_header
>>> for address in um.addresses:
... decoded = str(make_header(decode_header(address.display_name)))
... if decoded != address.display_name:
... address.display_name = decoded
... print(f'Fixed {decoded}')
...
>>> # while you're at it do users too.
>>> for user in um.users:
... decoded = str(make_header(decode_header(user.display_name)))
... if decoded != user.display_name:
... user.display_name = decoded
... print(f'Fixed {decoded}')
...
>>> commit()
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Thank you, that worked! Now all my addresses are clean.
Still, don't you think this should be possible via Postorius, or by some mailman cli command, like mailman rename_member <old> <new>
It happened I had still the patched cli_members.py where I had both the old and new line active: print(formataddr((address.display_name, address.original_email)), file=outfp) print(f'{address.display_name} <{address.original_email}>', file=outfp) The first (old) line still produces the output with utf-8 codes. Is this to be expected? Anyway, I comment that line out and everything works fine.
Happ new year again!
On 2023-01-02 14:36, Eggert Ehmke via Mailman-users wrote:
Thank you, that worked! Now all my addresses are clean.
Still, don't you think this should be possible via Postorius, or by some mailman cli command, like mailman rename_member <old> <new>
It happened I had still the patched cli_members.py where I had both the old and new line active: print(formataddr((address.display_name, address.original_email)), file=outfp) print(f'{address.display_name} <{address.original_email}>', file=outfp) The first (old) line still produces the output with utf-8 codes. Is this to be expected?
I would assume the original line outputs the QP-escaped names, to be used in an email header.
- Jan
On 1/2/23 07:06, Jan Eden via Mailman-users wrote:
On 2023-01-02 14:36, Eggert Ehmke via Mailman-users wrote:
Thank you, that worked! Now all my addresses are clean.
Still, don't you think this should be possible via Postorius, or by some mailman cli command, like mailman rename_member <old> <new>
Yes, it should be possible to change the display name for an address via Postorius. See https://gitlab.com/mailman/postorius/-/issues/451.
It happened I had still the patched cli_members.py where I had both the old and new line active: print(formataddr((address.display_name, address.original_email)), file=outfp) print(f'{address.display_name} <{address.original_email}>', file=outfp) The first (old) line still produces the output with utf-8 codes. Is this to be expected?
I would assume the original line outputs the QP-escaped names, to be used in an email header.
Yes. The original formataddr((address.display_name, address.original_email))
creates the format display name <email>
suitable for use in an email header. Thus, if the display name contains
non-ascii, it is RFC 2047 encoded. The encoding may be quoted-printable
or base64 depending on which is shorter.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Eggert Ehmke via Mailman-users writes:
I just applied that patch, but still my members with german Umlauts are printed like Stefan Hülzer
I'm going to leave it as above because my stupid MUA translated the MIME word into UTF-8 even though it should not (MIME words should not be recognized in content, only in headers), and it made me laugh. You're not crazy -- it was a MIME word in your mail, of course. It's all the mail programs that are nuts!
I suspect what's happening is that the MIME words are in the backend database in the display name field, when they should just be UTF-8 strings. If so, you should be able to fix the existing ones with a withlist script. I haven't written one in a while and it's past 2am here so please forgive me for getting some sleep. I'll check mail tomorrow if you need help.
Happy New Year!
Steve
Hello Steve, I really would appreciate to get some help on the withlist command. I really like to program python, but this interface is not at all intuitive. I also find it strange to do administrative tasks in a programming language. In this special case, I believe it should be possible to work on the display names in Postorius in the member settings.
On 1/1/23 10:37, Eggert Ehmke via Mailman-users wrote:
Hello Steve, I really would appreciate to get some help on the withlist command. I really like to program python, but this interface is not at all intuitive. I also find it strange to do administrative tasks in a programming language. In this special case, I believe it should be possible to work on the display names in Postorius in the member settings.
You can update user display names in Postorius via the Users tab at the
top of the page, but it would be a tedious, one at a time process, and I
don't think you can modify address display names in Postorius and those
are the ones mailman members
displays.
I just posted a mailman shell (aka withlist) interaction at <https://lists.mailman3.org/archives/list/mailman-users@mailman3.org/message/3WI775RJKT3G2BEAP622CAOFMIJSWS4Y/>. Try that.
I also find it strange to do administrative tasks in a programming language.
Understood, but there are a couple of things at work here. One is that
this particular task is a one-off the correct the result of a bug, but
perhaps more significantly, mailman developers (at least me) are
comfortable with withlist
(it's been around since MM 2.1) and if I can
do it that way, it reduces the motivation to make something more
friendly for the normal admin. Unfortunately doing things easily that
way requires a lot of specialized knowledge of Mailman internals, so
most admins need help with it.
-- Mark Sapiro <mark@msapiro.net> The highway is for gamblers, San Francisco Bay Area, California better use your sense - B. Dylan
Am Montag, 2. Januar 2023, 02:40:58 CET schrieb Mark Sapiro:
Understood, but there are a couple of things at work here. One is that this particular task is a one-off the correct the result of a bug, but perhaps more significantly, mailman developers (at least me) are comfortable with
withlist
(it's been around since MM 2.1) and if I can do it that way, it reduces the motivation to make something more friendly for the normal admin. Unfortunately doing things easily that way requires a lot of specialized knowledge of Mailman internals, so most admins need help with it.
Mark, I completely understand your situation as a core developer with limited time and lots of requests from all directions. Being a Python programmer myself, I would be happy with a consistent documentation of the relevant Python classes to be able to develop own applications. The isolated code snippets in the mm3 documentation are not too helpful.
Such a documentation would possibly remove some of the workload you feel by constantly answering admin questions.´
When you look at the documentation of the Qt library you know what I mean.
Thank you again for this great project!
Cheers, Eggert
On 2022-12-30 10:57, Jan Eden via Mailman-users wrote:
... Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Hi Jan,
Is this what you're after?
psql -U postgres -d mailman3 -t -A -F"," -c "SELECT public.address.email public.address.display_name FROM member JOIN address on public.address.id = public.member.address_id WHERE list_id = 'mylist.mydomain.com' AND role = 1" > $(date +%Y-%m-%d).members.txt
(replace mylist.mydomain.com with yours)
Cheers, Mark
On 2023-01-01 21:58, Mark Dale wrote:
On 2022-12-30 10:57, Jan Eden via Mailman-users wrote:
... Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Hi Jan,
Is this what you're after?
psql -U postgres -d mailman3 -t -A -F"," -c "SELECT public.address.email public.address.display_name FROM member JOIN address on public.address.id = public.member.address_id WHERE list_id = 'mylist.mydomain.com' AND role = 1" > $(date +%Y-%m-%d).members.txt
(replace mylist.mydomain.com with yours)
OOPs, missed a comma in the above. It should be:
psql -U postgres -d mailman3 -t -A -F"," -c "SELECT public.address.email, public.address.display_name FROM member JOIN address on public.address.id = public.member.address_id WHERE list_id = 'mylist.mydomain.com' AND role = 1" > $(date +%Y-%m-%d).members.txt
On 2023-01-01 22:00, Mark Dale via Mailman-users wrote:
On 2023-01-01 21:58, Mark Dale wrote:
On 2022-12-30 10:57, Jan Eden via Mailman-users wrote:
... Is there an SQL statement (or an equivalent of bin/list_members) to extract the data?
Hi Jan,
Is this what you're after?
psql -U postgres -d mailman3 -t -A -F"," -c "SELECT public.address.email public.address.display_name FROM member JOIN address on public.address.id = public.member.address_id WHERE list_id = 'mylist.mydomain.com' AND role = 1" > $(date +%Y-%m-%d).members.txt
(replace mylist.mydomain.com with yours)
OOPs, missed a comma in the above. It should be:
psql -U postgres -d mailman3 -t -A -F"," -c "SELECT public.address.email, public.address.display_name FROM member JOIN address on public.address.id = public.member.address_id WHERE list_id = 'mylist.mydomain.com' AND role = 1" > $(date +%Y-%m-%d).members.txt
Thank you! Odhiambo pointed to the mailman cli, which did exactly what I wanted (but I will keep your SQL statement for future reference).
- Jan
participants (6)
-
Eggert Ehmke
-
Jan Eden
-
Mark Dale
-
Mark Sapiro
-
Odhiambo Washington
-
Stephen J. Turnbull