Thanks so much guys, let me try to answer some of your questions.
@AndrewH.
Yes, I saw your notes, but the impression I got during this effort is that package-based/non-docker implementations change too much between versions, so anything written based on Ubuntu 16.04 or earlier just didn't feel worth spending significant time on... that impression came from the "this is new to version 18.04" warnings in the Ubuntu 18.04 walkthrough ( https://docs.google.com/document/d/1xIcSsoNFp2nHi7r4eQys00s9a0k2sHhu1V5Plant...). See my comment below for more.
Nonetheless, I'm going to crawl through your email and config files and see if I find anything to suggest where I've been going wrong, so thanks a ton for that.
@JonathanM
My attempt at doing a DigitalOcean package/non-docker based implementation was frustrating. I found a really detailed walkthrough ( https://docs.google.com/document/d/1xIcSsoNFp2nHi7r4eQys00s9a0k2sHhu1V5Plant...), and after the usual hyperkitty authorization problem solving, it got me to a point where everything looked perfect, based on a glance at the web interface. However, a bizarre collection of errors showed up during QA testing, such as the following:
- Gravatars showing broken links about half of the time
- Emails from non-admin subscribers almost always not getting processed and resulting in "connection dropped" errors in the SMTP log
- Very random orange-banner errors (e.g. I think one was vaguely something like "[[(*)]] failed")
- Most confusingly... the "Last post made at" metric would always show 'none', no matter how many posts were made to the test list.
As I attempted to debug, I saw some connectivity issues in the logs (500/bad gateway errors, IIRC). I wanted to rule out an SSL-negotiation issue, partly because I've very recently had problems installing other complex software where some Ubuntu packages have apparently changed default package behaviours to insist on SSL, which forced those developers to redo installation scripts that hadn't been touched in a while. Because only 'default' behaviours were changed, I would expect that this wouldn't show up in installations that are upgraded from previous versions. So I turned off the cert-based forcing of SSL for all HTTP traffic and, sure enough, started getting errors about "incorrect SSL version" occurring during SSL handshaking attempts. That's when I threw up my hands and decided to make docker work.
@MatthewA
Thanks for the inspiration to keep at this. I spent quite a few post-NYE hours trying to make the cleanest possible docker-based mailman3 installation on AWS, which I'll outline below. It ended up failing, so I might try Linode next. As you mention, I know the docker containers should be platform independent; I'm just worried that docker starts grabbing private IP addresses and using email in ways that require support from the underlying platform (e.g. AWS's Simple Email Service and VPCs/IP address allocation).
So here's my latest (failed) attempt, to see if anyone can spot where I went wrong. Again, I'm just trying to get a dedicated mailman3 server up from scratch. Oh, I should add that I tried the mailman4 AWS Mailman3-as-SaaS offering <https://aws.amazon.com/marketplace/pp/New-IT-Mailman-4/B089M2PCVV>, and that worked just fine. However, we can't host our data on a SaaS platform, so that's unfortunately not an option for us. I just want to mention it for anyone looking for a "give me mailman services in five minutes for $30/month" solution.
Phase 1 - Prepare the AWS instance and MTA
- Created an instance using pre-allocated support for 172.19.199.[1-4] IP address endpoints, a VPC with the 172.19.198.0/23 subnet, and firewall security rules to pass ports 80, 8000, 8024, 22, 25, 587, 443, and all ICMP
- Prepared docker using AWS methodology <https://docs.aws.amazon.com/AmazonECS/latest/developerguide/docker-basics.html> and successfully tested a docker container using hello world
- Prepared and successfully tested the AWS SES service (e.g. domain and account validation) and integrated it with postfix <https://docs.aws.amazon.com/ses/latest/DeveloperGuide/postfix.html>
Phase 2 - Prepare mailman3 docker files
- Tried to follow the standard directions. I put most of the relevant files below.
- Curl test of 172.19.199.3 endpoint failed. Put some debug information below.
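A failed curl does not say whether the port is closed or whether the service behind it is misbehaving. A minimal sketch of a plain TCP reachability check, assuming that mailman-web serves HTTP on port 8000 (a guess based on the firewall rules above) and that mailman-core listens for LMTP on 8024; the function name `port_is_open` is made up for illustration:

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out)
        # when the connection cannot be established.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example use on the docker host (addresses from this setup, ports assumed):
#   port_is_open("172.19.199.3", 8000)   # mailman-web
#   port_is_open("172.19.199.2", 8024)   # mailman-core LMTP
```

If the port is reachable but curl still fails, the problem is more likely inside the container than in the network path.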
I'm going to work with the previous suggestions to see what I can pull together, but any ideas appreciated.
docker-compose.yaml
version: '2'

services:
  mailman-core:
    image: maxking/mailman-core:0.3
    container_name: mailman-core
    hostname: mailman-core
    volumes:
    - /opt/mailman/core:/opt/mailman/
    stop_grace_period: 30s
    links:
    - database:database
    depends_on:
    - database
    environment:
    - DATABASE_URL=postgres://mailman:mailmanpass@database/mailmandb
    - DATABASE_TYPE=postgres
    - DATABASE_CLASS=mailman.database.postgresql.PostgreSQLDatabase
    - HYPERKITTY_API_KEY=FXCX6wk4e-stuff-m3ozNNpsFcj7vq
    networks:
      mailman:
        ipv4_address: 172.19.199.2

  mailman-web:
    image: maxking/mailman-web:0.3
    container_name: mailman-web
    hostname: mailman-web
    depends_on:
    - database
    links:
    - mailman-core:mailman-core
    - database:database
    volumes:
    - /opt/mailman/web:/opt/mailman-web-data
    environment:
    - DATABASE_TYPE=postgres
    - DATABASE_URL=postgres://mailman:mailmanpass@database/mailmandb
    - HYPERKITTY_API_KEY= FXCX6wk4e-stuff-m3ozNNpsFcj7vq
    - SECRET_KEY= FXCX6wk4e-stuff-m3ozNNpsFcj7vq
    - SERVE_FROM_DOMAIN=johnlabdomain.website
    - MAILMAN_ADMIN_USER=mailman
    - MAILMAN_ADMIN_EMAIL=postmaster@johnlabdomain.website
    networks:
      mailman:
        ipv4_address: 172.19.199.3

  database:
    environment:
      POSTGRES_DB: mailmandb
      POSTGRES_USER: mailman
      POSTGRES_PASSWORD: mailmanpass
    image: postgres:9.6-alpine
    volumes:
    - /opt/mailman/database:/var/lib/postgresql/data
    networks:
      mailman:
        ipv4_address: 172.19.199.4

networks:
  mailman:
    driver: bridge
    ipam:
      driver: default
      config:
      - subnet: 172.19.199.0/24
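One thing that may be worth ruling out is an address clash between the docker bridge network and the VPC. This is only a suspicion, not a confirmed cause. A minimal sketch using Python's standard `ipaddress` module, with the VPC range from Phase 1, the compose subnet above, and the host address inferred from the instance hostname (ip-172-19-199-1):

```python
import ipaddress


def overlaps(cidr_a: str, cidr_b: str) -> bool:
    """Return True if the two CIDR ranges share at least one address."""
    return ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


VPC_RANGE = "172.19.198.0/23"      # from the Phase 1 notes
DOCKER_SUBNET = "172.19.199.0/24"  # from docker-compose.yaml
HOST_IP = "172.19.199.1"           # inferred from the instance hostname

vpc_clash = overlaps(VPC_RANGE, DOCKER_SUBNET)
host_in_docker_net = (
    ipaddress.ip_address(HOST_IP) in ipaddress.ip_network(DOCKER_SUBNET)
)

print("VPC and docker subnet overlap:", vpc_clash)           # True
print("Host IP inside docker subnet:", host_in_docker_net)   # True
```

Both checks come out true for these values, so the bridge network and the instance's own interface are claiming the same address space. Whether that explains the failed curl test is untested.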
mailman-extra.cfg
[mta]
incoming: mailman.mta.postfix.LMTP
outgoing: mailman.mta.deliver.deliver
lmtp_host: 172.19.199.2
lmtp_port: 8024
smtp_host: 172.19.199.1
smtp_port: 25
configuration: /etc/postfix-mailman.cfg

[mailman]
# This address is the "site owner" address. Certain messages which must be
# delivered to a human, but which can't be delivered to a list owner (e.g. a
# bounce from a list owner), will be sent to this address. It should point to
# a human.
site_owner: postmaster@johnlabdomain.website
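To cross-check these endpoints against the compose file, the `[mta]` section can be read programmatically. A small sketch, assuming the standard-library `configparser` is an acceptable stand-in: Mailman itself reads this file with its own configuration machinery, but `configparser` accepts the same `key: value` syntax. The helper name `mta_endpoints` is hypothetical.

```python
import configparser

# The [mta] section exactly as posted above.
CFG_TEXT = """\
[mta]
incoming: mailman.mta.postfix.LMTP
outgoing: mailman.mta.deliver.deliver
lmtp_host: 172.19.199.2
lmtp_port: 8024
smtp_host: 172.19.199.1
smtp_port: 25
configuration: /etc/postfix-mailman.cfg
"""


def mta_endpoints(cfg_text: str) -> dict:
    """Return the LMTP and SMTP (host, port) pairs from an [mta] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cfg_text)
    mta = parser["mta"]
    return {
        "lmtp": (mta["lmtp_host"], mta.getint("lmtp_port")),
        "smtp": (mta["smtp_host"], mta.getint("smtp_port")),
    }


for name, (host, port) in mta_endpoints(CFG_TEXT).items():
    print(f"{name}: {host}:{port}")
```

Here the LMTP host should match the `ipv4_address` of mailman-core, and the SMTP host should be an address where postfix is actually listening as seen from inside the containers.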
settings_local.py
USE_SSL = False
EMAIL_BACKEND = 'django.core.mail.backends.smtp.EmailBackend'
EMAIL_HOST = '172.19.199.1'
EMAIL_PORT = 25
DEFAULT_FROM_EMAIL = "postmaster@johnlabdomain.website"
SERVER_EMAIL = "postmaster@johnlabdomain.website"
docker info
Client:
 Debug Mode: false

Server:
 Containers: 3
  Running: 3
  Paused: 0
  Stopped: 0
 Images: 9
 Server Version: 19.03.13-ce
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: c623d1b36f09f8ef6536a057bd658b3aa8632828
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad0 (expected: fec3683)
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 4.14.209-160.339.amzn2.x86_64
 Operating System: Amazon Linux 2
 OSType: linux
 Architecture: x86_64
 CPUs: 1
 Total Memory: 1.945GiB
 Name: ip-172-19-199-1.ca-central-1.compute.internal
 ID: 5HIU:HKNY:O5AK:GPDM:I7JD:PGX7:2YZN:RYLK:K2OQ:ICO2:F3RA:O2QZ
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false
maillog has some interesting stuff I'm seeing for the first time. Let me know if this suggests any obvious suspects.
[ec2-user@ip-172-19-199-1 log]$ sudo cat maillog
Jan 1 10:10:25 ip-172-19-199-1 postfix/postfix-script[2984]: starting the Postfix mail system
Jan 1 10:10:25 ip-172-19-199-1 postfix/master[2988]: daemon started -- version 2.10.1, configuration /etc/postfix
Jan 1 10:20:39 ip-172-19-199-1 postfix/postfix-script[4076]: stopping the Postfix mail system
Jan 1 10:20:39 ip-172-19-199-1 postfix/master[2988]: terminating on signal 15
Jan 1 10:20:39 ip-172-19-199-1 postfix/postfix-script[4115]: fatal: the Postfix mail system is not running
Jan 1 10:20:39 ip-172-19-199-1 postfix/postfix-script[4207]: starting the Postfix mail system
Jan 1 10:20:39 ip-172-19-199-1 postfix/master[4209]: daemon started -- version 2.10.1, configuration /etc/postfix
Jan 1 10:21:27 ip-172-19-199-1 postfix/pickup[4210]: 4EF7073: uid=1000 from=<postmaster@johnlabdomain.website>
Jan 1 10:21:27 ip-172-19-199-1 postfix/cleanup[4218]: 4EF7073: message-id=<20210101102127.4EF7073@ip-172-19-199-1.ca-central-1.compute.internal
Jan 1 10:21:27 ip-172-19-199-1 postfix/qmgr[4211]: 4EF7073: from=<postmaster@johnlabdomain.website>, size=387, nrcpt=1 (queue active)
Jan 1 10:21:27 ip-172-19-199-1 postfix/smtp[4220]: 4EF7073: to=<postmaster@johnlabdomain.website>, relay= email-smtp.ca-central-1.amazonaws.com[3.97.59.162]:587, delay=12, delays=12/0.04/0.13/0.1, dsn=2.0.0, status=sent (250 Ok 010d0176bd776654-8f53783a-07fd-4b47-92b0-b7f9a14bbbf7-000000)
Jan 1 10:21:27 ip-172-19-199-1 postfix/qmgr[4211]: 4EF7073: removed
Jan 1 10:24:39 ip-172-19-199-1 postfix/postfix-script[4346]: refreshing the Postfix mail system
Jan 1 10:24:39 ip-172-19-199-1 postfix/master[4209]: reload -- version 2.10.1, configuration /etc/postfix
Jan 1 10:24:39 ip-172-19-199-1 postfix/qmgr[4351]: error: open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: error: open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: error: open /opt/mailman/core/var/data/postfix_lmtp: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: connect from mail-yb1-f184.google.com[209.85.219.184]
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: error: open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: error: open /opt/mailman/core/var/data/postfix_lmtp: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_lmtp is unavailable. open /opt/mailman/core/var/data/postfix_lmtp: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_lmtp lookup error for "*"
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_lmtp is unavailable. open /opt/mailman/core/var/data/postfix_lmtp: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_lmtp lookup error for "*"
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_domains is unavailable. open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_domains: table lookup problem
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: relay_domains lookup failure
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_domains is unavailable. open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: regexp:/opt/mailman/core/var/data/postfix_domains: table lookup problem
Jan 1 10:27:58 ip-172-19-199-1 postfix/trivial-rewrite[4397]: warning: relay_domains lookup failure
Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: NOQUEUE: reject: RCPT from mail-yb1-f184.google.com[209.85.219.184]: 451 4.3.0 < person@obfuscatedbyme.com>: Temporary lookup failure; from=< ggroupname+bncBCUONKFK6AKRB3XIXD7QKGQEAITT5DI@googlegroups.com> to=< person@obfuscatedbyme.com > proto=ESMTP helo=<mail-yb1-f184.google.com>
Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: disconnect from mail-yb1-f184.google.com[209.85.219.184]
Jan 1 10:31:18 ip-172-19-199-1 postfix/anvil[4396]: statistics: max connection rate 1/60s for (smtp:209.85.219.184) at Jan 1 10:27:58
Jan 1 10:31:18 ip-172-19-199-1 postfix/anvil[4396]: statistics: max connection count 1 for (smtp:209.85.219.184) at Jan 1 10:27:58
Jan 1 10:31:18 ip-172-19-199-1 postfix/anvil[4396]: statistics: max cache size 1 at Jan 1 10:27:58
Jan 1 10:36:30 ip-172-19-199-1 postfix/master[4209]: terminating on signal 15
Jan 1 10:37:18 ip-172-19-199-1 postfix/postfix-script[2960]: starting the Postfix mail system
Jan 1 10:37:19 ip-172-19-199-1 postfix/master[2962]: daemon started -- version 2.10.1, configuration /etc/postfix
Jan 1 10:37:19 ip-172-19-199-1 postfix/qmgr[2964]: error: open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: error: open /opt/mailman/core/var/data/postfix_domains: No such file or directory
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: error: open /opt/mailman/core/var/data/postfix_lmtp: No such file or directory
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: warning: hostname security.criminalip.com does not resolve to address 89.248.168.112
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: connect from unknown[89.248.168.112]
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: lost connection after STARTTLS from unknown[89.248.168.112]
Jan 1 10:39:28 ip-172-19-199-1 postfix/smtpd[3302]: disconnect from unknown[89.248.168.112]
Jan 1 10:42:48 ip-172-19-199-1 postfix/anvil[3303]: statistics: max connection rate 1/60s for (smtp:89.248.168.112) at Jan 1 10:39:28
Jan 1 10:42:48 ip-172-19-199-1 postfix/anvil[3303]: statistics: max connection count 1 for (smtp:89.248.168.112) at Jan 1 10:39:28
Jan 1 10:42:48 ip-172-19-199-1 postfix/anvil[3303]: statistics: max cache size 1 at Jan 1 10:39:28
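The recurring complaint in this log is that postfix cannot open two lookup tables under /opt/mailman/core/var/data. A short sketch for summarising which files a maillog reports as missing, so the list can be compared against what is actually on disk. The regular expression is written against the message format seen above, and the helper name `missing_files` is made up for illustration.

```python
import re
from collections import Counter

# Matches the "open <path>: No such file or directory" fragment seen above.
MISSING = re.compile(r"open (\S+): No such file or directory")


def missing_files(log_lines) -> Counter:
    """Count how often each path is reported as missing in the given lines."""
    counts = Counter()
    for line in log_lines:
        for path in MISSING.findall(line):
            counts[path] += 1
    return counts


# Three lines copied from the log above; in practice, pass an open file:
#   with open("/var/log/maillog") as fh: missing_files(fh)
SAMPLE = [
    "Jan 1 10:24:39 ip-172-19-199-1 postfix/qmgr[4351]: error: open "
    "/opt/mailman/core/var/data/postfix_domains: No such file or directory",
    "Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: error: open "
    "/opt/mailman/core/var/data/postfix_lmtp: No such file or directory",
    "Jan 1 10:27:58 ip-172-19-199-1 postfix/smtpd[4395]: connect from "
    "mail-yb1-f184.google.com[209.85.219.184]",
]

for path, count in missing_files(SAMPLE).items():
    print(count, path)
```

For this log the summary comes down to postfix_domains and postfix_lmtp. Those look like the maps Mailman core is expected to write into its var/data directory, so a sensible next check is whether that directory exists on the host and whether the core container has populated it.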