[MM3-users] Re: REST API notes

June 29, 2017

      On Jun 28, 2017, at 01:47 AM, tlhackque--- via Mailman-users wrote:
...
As a beginner with the REST API, I found it necessary to read code and
experiment to figure out how to get things done.
Thanks for the feedback.  Other than the work supporting Postorius and
HyperKitty, I think you've done some of the more extensive exploration of the
REST API, so your comments here (and greatly appreciated bug reports) are
quite valuable, and I think will help us make the REST API both more usable
and more discoverable.
It's important to point out some history of the REST API.  Originally, we used
the restish library (which was Python 2 and eventually abandoned upstream).
That choice informed the essentially dynamic nature of the REST API.  Which
means that, unlike many other systems, we don't have a static description of
the REST API, nor do I think we could easily and automatically generate one
even if we wanted to.
When I switched to Falcon (which is both Python 3 compatible and well
maintained upstream), I had to reimplement what I called "object based
traversal".  Falcon by default supports a routes based traversal mechanism,
similar to Flask and other systems.  But by plugging in (originally, by
monkeypatching, but now through an official Falcon API) support for object
based traversal, I was able to port the extensive existing REST implementation
to the new underlying library fairly easily.
All this to say that some of the problems you describe are due to historical
accident.  And that's served us well(-ish) while we only had to support
mailmanclient and P/HK.  But as the REST API is used in more places, those
deficiencies are becoming more glaring.  I think if we ever make progress on
Lemme (our authenticating REST proxy), we'll likely encounter the same
problems.  But at this point, it's infeasible to completely reimplement the
REST machinery.
...
There are a number of implementation choices that are unusual.
No doubt. :)
...
There is some documentation in src/mailman/rest/docs.  It's oriented toward
Python and intermixes the mailman client & REST API, apparently trying to
show equivalence.
Much of that is because these are doctests, i.e. testable documentation.  It's
a tradeoff between making them useful as docs but also testable without too
much clutter.  It's very much oriented toward Python because that makes
testability easier.  It would be good to have documentation for pure HTTP/JSON
consumers, but it would be imperative for that also to be testable so we're
sure it remains accurate, aside from any requirements to keep the docs in
sync.  Suggestions and contributions would be very much welcome here.
...
The REST API has a hybrid interface: Requests are made with
application/x-www-form-urlencoded POST, PUT, PATCH and DELETE HTTP requests.
Requests are also accepted as parameter strings.
The responses are JSON.  (This is rather surprising - one would expect JSON
requests - and I hope someday they'll be accepted, as the split complicates
clients.  I suspect this evolved to simplify the initial Web GUI client
(Postorius), but it precludes using standard JSON-in/JSON-out client
libraries.)
Absolutely.  I want to support JSON-in/JSON-out.  Again, the current mismatch
is due to historical implementation decisions, but there's nothing in
principle preventing us from accepting JSON in, afaict.  We'd have to continue
to accept both, for backward compatibility reasons.
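To make the current split concrete, here is a minimal sketch of a client that
sends form-encoded fields and decodes a JSON reply.  The base URL and the
helper names are assumptions for illustration, not part of Mailman; the
request is only built, never sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical base URL; point this at wherever your REST server listens.
BASE = "http://localhost:8001/3.0"


def form_request(method, path, fields):
    """Build (but do not send) a form-encoded request."""
    body = urlencode(fields).encode("ascii")
    return Request(
        BASE + path,
        data=body,
        method=method,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def parse_response(raw_bytes):
    """Responses are JSON, so they are decoded separately from requests."""
    return json.loads(raw_bytes.decode("utf-8"))


req = form_request("POST", "/domains", {"mail_host": "example.com"})
```

Passing `req` to `urllib.request.urlopen` would send it; authentication is
left out of the sketch.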
...
Error responses are often misidentified as content-type: application/json,
but actually contain a text/plain error message.  This isn't universally
true; for example the 401 response actually IS JSON.  A client has to guess
and handle decoding exceptions.
That's clearly a bug.  The Content-Type should always be accurate.  It would
be a new feature to also support JSON error responses.
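Until that's fixed, a client can defend itself along these lines.  This is
only a sketch of the guess-and-fall-back approach described above, and
`decode_body` is a made-up helper name.

```python
import json


def decode_body(content_type, raw_bytes):
    """Return ('json', obj) or ('text', str) without trusting the header."""
    text = raw_bytes.decode("utf-8", errors="replace")
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        try:
            return "json", json.loads(text)
        except ValueError:
            # The header claimed JSON but the body is really plain text.
            pass
    return "text", text
```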
...
The API presents resources hierarchically, rooted at '/'.  The top level
resources include /users, /lists, /addresses, /domains, /system.
The next level is a resource id.
Kind of.  It all depends on what top-level resource is returned.  This
makes sense when you think about the object based traversal machinery.
Each path component points internally to an object, and the subsequent path
component is handled by that object.  If that returns another object, then the
next remaining path component is consumed by *that* object.  And so on until
we reach the end of the path.  Then the object at the leaf responds to the
HTTP command, and each object knows how to represent itself as JSON, and it
knows its canonical location within the resource tree (there can be multiple
paths to any particular object, but there's always one canonical location).
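A toy model of that traversal might look like the following.  The class and
function names are invented for the sketch and are not Mailman's or Falcon's
actual API; it only shows each object consuming one path component and the
leaf reporting its canonical location.

```python
class Resource:
    """A node that consumes one path component and returns the next node."""

    def __init__(self, canonical_path, children=None):
        self.canonical_path = canonical_path
        self.children = children or {}

    def child(self, component):
        # Each object decides for itself how to interpret the next component.
        return self.children.get(component)

    def on_get(self):
        # The leaf represents itself, including its one canonical location.
        return {"self_link": self.canonical_path}


def traverse(root, path):
    """Walk the path one component at a time, handing off to each object."""
    node = root
    for component in filter(None, path.split("/")):
        node = node.child(component)
        if node is None:
            return None  # a real server would answer 404 here
    return node


a_list = Resource("/lists/mylist.example.com")
root = Resource("/", {
    "lists": Resource("/lists", {
        "mylist.example.com": a_list,
        "mylist@example.com": a_list,  # a second path to the same object
    }),
})
```

Both paths under /lists reach the same object, and its `on_get()` reports the
same canonical location either way.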
...
There is an id which is stable for the lifetime of an object, and a current
name (which can be changed).  For example, a list has a name like
mylist@example.com, and a list_id of mylist.example.com.  But if you change
the list name, it becomes mynewname@example.net, while the list_id remains
mylist.example.com.
Lists are weird in that they have two identifiers.  One is the posting
address, what we internally (and slightly incorrectly) call the "fully
qualified list name".  I don't particularly like that nomenclature anymore,
but we live with it.
A posting address can change, e.g. if you rename or rehome a list.
The (RFC 2919) List-ID is assigned when the mailing list is created and it's
immutable.  Section 4 discourages changing the List-ID and Mailman takes that
as a requirement.  Rename a list or rehome it to a different domain on the
same server and the List-ID will never change.
Again for historical reasons, many APIs both internal and external used the
posting address to identify the list, but this is wrong exactly because that
can change.  I've slowly been converting APIs to accept both the posting
address and the List-ID when identifying the mailing list.  New APIs generally
accept only the List-ID.  Bottom line, it's best to use the List-ID.
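As a small illustration of why the List-ID is the safer key, here is a toy
record (not Mailman's model class) using the example names from above.  The
way the initial List-ID is derived from the posting address is an assumption
based on that example.

```python
class MailingListRecord:
    """Toy record: the List-ID is fixed at creation, the address is not."""

    def __init__(self, posting_address):
        local_part, _, domain = posting_address.partition("@")
        # Assumed derivation, matching mylist@example.com -> mylist.example.com.
        self._list_id = f"{local_part}.{domain}"
        self.posting_address = posting_address

    @property
    def list_id(self):
        # Read-only: there is deliberately no setter.
        return self._list_id

    def rename(self, new_posting_address):
        # Renaming or rehoming changes only the posting address.
        self.posting_address = new_posting_address


mlist = MailingListRecord("mylist@example.com")
mlist.rename("mynewname@example.net")
```

After the rename, anything keyed on `mlist.list_id` still finds the list,
while anything keyed on the old posting address does not.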
...
Lists are associated with "domains", which are the "domain part" of the
list's address.  That is, the part after the @.  This is sometimes referred
to as the "mail host", but there need not be a real host.
Yep, again historical nomenclature.
...
A domain has to be created before you can create a list with an address in
that domain.
Some APIs (e.g. the mailman create CLI) will create the domain automatically
if it doesn't already exist, unless you explicitly disable that.
...
Resources are created with a POST to their top-level resource.  To create a
domain, post to /domains with mail_host => the domain, and (optionally)
description => a description for the GUI.  The response isn't JSON as one
would expect.  In fact, it's an empty application/json response with a 201
status.
To create a list, one POSTs to /lists.  This post takes a restricted set of
parameters; in the case of a list, just its fqdn_listname, (and an optional
style - which isn't well defined).  The response isn't JSON as one would
expect.  The Location header of the response contains a URL of the new list.
What else would you expect?  From my reading of books like RESTful Web
Services (admittedly a long while ago), that's exactly the proper response to
an appending POST.  Return a 201 with a Location header to the new resource
and empty content.
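On the client side, handling that response can be as small as the sketch
below.  `location_of_created` is a hypothetical helper; it takes the status
code and a plain dict of headers rather than any particular library's
response object.

```python
def location_of_created(status, headers):
    """Return the new resource's URL from the reply to an appending POST."""
    if status != 201:
        raise ValueError(f"expected 201 Created, got {status}")
    # Header names are case-insensitive, so normalize before looking up.
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered["location"]
```

The returned URL is then the target for any follow-up GET, PUT or PATCH.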
...
To configure the list, you have to follow up with a PUT or PATCH to that URL
plus /config.  This is where you can set the description, posting policy, etc.
It's unrealistic to do a PUT; even if you're cloning another list, there's no
programmatic way to determine which attributes are writable.  PATCH what you
know...
Mailing list resources are somewhat unique in that they probably have the most
properties of any resource/object in the system.  That's not surprising if you
think about it, but it does make PUTing to a mailing list more or less
impractical.  That's certainly not true of other, smaller resources though,
and of course, you still want symmetry there (plus, implementation-wise, it's
almost a no brainer to support both PUT and PATCH).
...
Members are associated with e-mail addresses - which belong to Users.
Users can have multiple addresses, addresses can be linked to only one user,
but may be unlinked, and members associate "subscribers" to mailing lists,
where a subscriber is either a user-with-preferred-address, or an address.
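Roughly, in code, those relationships look like this.  It's a sketch with
illustrative dataclasses, not the real model classes, and `link` is a made-up
helper.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(eq=False)
class User:
    display_name: str
    addresses: List["Address"] = field(default_factory=list)
    preferred_address: Optional["Address"] = None


@dataclass(eq=False)
class Address:
    email: str
    user: Optional[User] = None  # linked to at most one user, or unlinked


@dataclass(eq=False)
class Member:
    list_id: str
    # Either a user (via the preferred address) or a bare address.
    subscriber: Union[User, Address]


def link(user, address):
    """Link an address to a user, refusing a second owner."""
    if address.user is not None and address.user is not user:
        raise ValueError("address is already linked to another user")
    address.user = user
    if address not in user.addresses:
        user.addresses.append(address)


anne = User("Anne")
work = Address("anne@example.com")
link(anne, work)
member = Member("mylist.example.com", work)
```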
...
You create a user by posting to /users with email => the email address, and
optionally display_name => the name string.  (A user can also be implicitly
created by subscribing an e-mail address, but that gets confusing.)  The
e-mail address is the primary email address for the user.  More later.
Again, you get a Location header in the response, which you can use to PATCH
/preferences to set delivery_status, etc.  These preferences are part of a
hierarchy - many have a system default, a list default, the user default, and
a subscription to that list value.  You can find the User associated with a
list by a GET of /addresses/address@example.net.
Technically, the returned user isn't necessarily associated with a list.  It
*may* be subscribed to the list, but that relationship is represented by a
member.
...
This GET returns two URLs: user => the user owning this address, and
self_link => the address object.
*All* resources in the REST API have a self_link, and while that may seem
redundant, it's not.  As mentioned above, you can take various paths through
the resource tree to get to a particular resource, but regardless of that,
every resource has exactly one canonical location.  That location is
represented by the self_link.
...
Note also that there are both an email and original_email attribute.  The
latter preserves case.  The former is used internally by Mailman as a
resource key, etc.  (Though exactly what happens if John@example.net and
john@example.net both subscribe is unclear.)
They can't.  Mailman is case-preserving case-insensitive for email addresses.
Technically speaking, john@example.com and JOHN@example.com can be different
mailboxes, but that never happens in practice anymore, and Mailman has always
explicitly treated them as the same address.  This goes back to the earliest
days of Mailman.
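Here is a minimal sketch of what case-preserving, case-insensitive means in
practice.  `AddressBook` is a toy, not Mailman code; only the `email` and
`original_email` attribute names come from the REST API as described above.

```python
class AddressBook:
    """Toy registry keyed on the lowercased address."""

    def __init__(self):
        self._by_key = {}

    def register(self, original_email):
        key = original_email.lower()
        if key in self._by_key:
            # John@example.net and john@example.net are the same address.
            raise ValueError(f"{original_email} is already registered")
        record = {"email": key, "original_email": original_email}
        self._by_key[key] = record
        return record

    def lookup(self, any_case_email):
        return self._by_key.get(any_case_email.lower())


book = AddressBook()
book.register("John@example.net")
```

Looking up `JOHN@example.net` finds the same record, and trying to register
`john@example.net` afterwards is rejected.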
...
Once a list is configured, you can add members.  This requires the list_id -
which you don't (officially) have.  So you do a GET on the list resource, to
get the list_id.  Then you can subscribe a member with list_id => (the list
id), subscriber => email address.  Optionally you can
pre_verify/confirm/approve the member and/or add a display_name.  POST to
/members.  Again, you get a Location header back.  You can't specify
everything you might like as a creation attribute.  So you may have to PATCH
the member to, for example, set the moderation_status.
This is true, but it also kind of makes sense if you grok the way preferences
work, i.e. hierarchically as you describe above.  TBH, I don't particularly
like the way preferences are modeled either internally or through the REST
API, but I haven't been able to come up with anything better.
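For anyone who hasn't run into the hierarchy yet, the lookup behaves roughly
like a chain of dictionaries searched from most to least specific.  This is a
sketch only: the level names follow the description above, the helper is
hypothetical, and the preference values are placeholders rather than a
statement of Mailman's defaults.

```python
from collections import ChainMap


def effective_preferences(*levels):
    """Combine preference levels, most specific first."""
    return ChainMap(*levels)


# Placeholder values, for illustration only.
system_defaults = {"delivery_status": "enabled", "acknowledge_posts": False}
user_prefs = {}
member_prefs = {"delivery_status": "by_user"}

effective = effective_preferences(member_prefs, user_prefs, system_defaults)
```

A value set on the member wins; anything not set there falls through to the
user and then to the system default.  Each level corresponds to a different
preferences resource, which is why setting them takes separate requests.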
...
You may also need to PATCH the member /preferences if you want to set
list-level delivery status, etc.
All true.  In each of those cases, you may be talking to a different
preference resource.
...
Consider adding at least an owner when creating a list.
I think that would be a useful addition to the REST API for list creation.
...
One challenge is that almost everything requires multiple REST operations to
set up.  But REST (by definition) is stateless.  So the best you can do is
order operations & hope.
I don't understand this.  REST is stateless, but not all HTTP operations are
idempotent.  POST certainly isn't, by definition, so if you use it to create a
new resource under a collection, you clearly cannot modify that resource
through PUT or PATCH until the resource is created, which only happens when
POST succeeds.
...
The mailman client examples refer to transactions (e.g. in users.rst, there is
'transaction.commit()') - but REST can't hold state.  It does appear that the
server uses DB transactions to ensure that any given REST operation is ACID,
but the composite operations (e.g. create a list and set its config) cannot
be Atomic.  This is an architectural flaw in the API.
I think what you're getting at is that some POSTs to create new resources do
not accept the same parameters as the subsequent PUT or PATCH on the newly
created resource.  I think that may be true, but it's not universally so.
E.g. POST on /domains creates a new domain, returning the location of the new
resource.  You can provide both a description and an owner, and you can also
provide a mail_host.  But mail_host is immutable, and PATCH or PUT support
changing all mutable attributes on the domain.
That should be the general rule; if an attribute on a resource is mutable, you
should be able to PATCH and PUT it, *and* you should be able to specify all
mutable attributes, plus some immutable ones on the POST that creates the
resource.  GET should return all attributes, mutable or immutable.  This rule
may not be strictly adhered to for all resources and collections, but I would
consider that a bug, not an architectural flaw.
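Spelled out as code, using the domain example, the rule is roughly as follows.
This is a sketch of the intent, not the server's validation code; requiring
mail_host on POST and requiring every mutable attribute on PUT are my
assumptions here.

```python
# Attribute sets for a domain, taken from the example above.
IMMUTABLE = {"mail_host"}
MUTABLE = {"description", "owner"}


def request_is_valid(method, fields):
    """Check a request's field names against the general rule."""
    names = set(fields)
    if method == "POST":
        # Creation sets the immutable attributes plus any mutable ones.
        return IMMUTABLE <= names <= (IMMUTABLE | MUTABLE)
    if method == "PATCH":
        # Partial update: any non-empty subset of the mutable attributes.
        return bool(names) and names <= MUTABLE
    if method == "PUT":
        # Full replacement: every mutable attribute, and nothing else.
        return names == MUTABLE
    return False
```

A resource where a mutable attribute fails this check is the kind of bug I
mean.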
AFAIK, this is all working as any well-defined web service should work.
...
Symbolic names (which are required) for attributes are in the
src/mailman/interfaces/(class).py.
Please note that the interfaces under this package are for the *internal* API,
which often, but not always, is exposed in the REST API.  These two APIs serve
different purposes so they cannot be a one-to-one mapping, and there are many
resources in the REST API that don't correspond directly to internal objects,
and there are many internal objects that are not exposed to the REST API.
...
What you can get/set comes from src/mailman/rest/(class).py.
Yes, so clients of the REST API are served by objects in this package.
Plugins and other internal operations are served by the objects that implement
interfaces in the src/mailman/interfaces package.  This is a strict separation
of concerns, and it mirrors other 'external facing' interfaces, of which the
command line is another example.
Hope that helps.
-Barry

Barry Warsaw