add documentation about architecture

This commit is contained in:
Max Erenberg 2021-08-20 01:41:50 +00:00
parent 583fcded9b
commit dc09210d23
2 changed files with 114 additions and 8 deletions

@ -1,12 +1,39 @@
# pyceo
work in progress
CEO (**C**SC **E**lectronic **O**ffice) is the tool used by CSC to manage
club accounts and memberships. See []( for an
overview of its architecture.
## Development
First, make sure that you have installed the
[syscom dev environment](
This will set up all of the services needed for ceo to work. You should clone
this repo in one of the dev environment containers.
this repo in the phosphoric-acid container under ctdalek's home directory; you
will then be able to access it from any container thanks to NFS.
### Environment setup
Once you have the dev environment set up, there are a few more steps you'll
need to do for ceo.
#### Kerberos principals
First, you'll need `ceod/<hostname>` principals for each of phosphoric-acid,
coffee and mail. (coffee is taking over the role of caffeine for the DB
endpoints). For example, in the phosphoric-acid container:
kadmin -p sysadmin/admin
<password is krb5>
addprinc -randkey ceod/phosphoric-acid.csclub.internal
ktadd ceod/phosphoric-acid.csclub.internal
Do this for coffee and mail as well. You need to actually be in the
appropriate container when running these commands, since the credentials
are being added to the local keytab.
On phosphoric-acid, you will additionally need to create a principal
called `ceod/admin` (remember to addprinc **and** ktadd).
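
Since the same two kadmin commands are repeated for every host, the sequence can be generated rather than typed out. The following is only a sketch in Python: `kadmin_commands` is a hypothetical helper (not part of ceo), and it assumes that the coffee and mail principals follow the same `ceod/<hostname>.csclub.internal` pattern as the phosphoric-acid example and that `ceod/admin` is created with the same `-randkey` flag.

```python
# Hypothetical helper, not part of ceo: prints the kadmin commands to type
# inside each container after running `kadmin -p sysadmin/admin`.
DOMAIN = "csclub.internal"  # taken from the phosphoric-acid example
HOSTS = ["phosphoric-acid", "coffee", "mail"]


def kadmin_commands(host, domain=DOMAIN):
    """Return the kadmin commands to run inside the container for `host`."""
    principals = [f"ceod/{host}.{domain}"]
    if host == "phosphoric-acid":
        # Only phosphoric-acid gets the extra ceod/admin principal.
        # Assumption: it is created with -randkey like the host principals.
        principals.append("ceod/admin")
    commands = []
    for principal in principals:
        commands.append(f"addprinc -randkey {principal}")
        commands.append(f"ktadd {principal}")
    return commands


if __name__ == "__main__":
    for host in HOSTS:
        print(f"# in the {host} container:")
        for command in kadmin_commands(host):
            print(command)
```

The commands still have to be run in the matching container, because `ktadd` writes to the local keytab.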
#### Database
TODO - Andrew
#### Dependencies
Next, install and activate a virtualenv:
sudo apt install libkrb5-dev libsasl2-dev python3-dev
@ -16,7 +43,7 @@ pip install -r requirements.txt
pip install -r dev-requirements.txt
## C bindings
#### C bindings
Due to the lack of a decent Python library for Kerberos, we ended up
writing our own C bindings using [cffi](
Make sure you compile the bindings:
@ -28,15 +55,13 @@ This should create a file named ''.
This will be imported by other modules in ceo.
## Running the application
ceod is essentially a distributed application, with instances on different
hosts offering different services. For example, the ceod instance on mail
offers a service to subscribe people to mailing lists, and
the ceod instance on phosphoric-acid offers a service to create new members.
ceod is a distributed application, with instances on different hosts offering
different services.
Therefore, you will need to run ceod on multiple hosts. Currently, those are
phosphoric-acid, mail and caffeine (in the dev environment, caffeine is
replaced by coffee).
To run ceod on a single host:
To run ceod on a single host (as root, since the app needs to read the keytab):
export FLASK_APP=ceod.api
export FLASK_ENV=development
@ -81,3 +106,14 @@ curl --negotiate -u : --service-name ceod \
-d '{"uid":"test_1","cn":"Test One","program":"Math","terms":["s2021"]}' \
-X POST http://phosphoric-acid:9987/api/members
## Miscellaneous
### Mailman
You may wish to add more mailing lists to Mailman; by default, only the
csc-general list exists (from the dev environment playbooks). Just
attach to the mail container and run the following:
/opt/mailman3/bin/mailman create new_list_name@csclub.internal
for instructions on how to access the Mailman UI from your browser.

70 Normal file
@ -0,0 +1,70 @@
# Architecture
ceo is a distributed HTTP application running on three hosts. As of this
writing, those are phosphoric-acid, mail and caffeine (coffee in the dev
environment).
* The `mail` host provides the `/api/mailman` endpoints. This is because
the REST API for Mailman3 is currently configured to run on localhost.
* The `caffeine` host provides the `/api/db` endpoints. This is because
the root account of MySQL and PostgreSQL on caffeine can only be accessed
* All other endpoints are provided by `phosphoric-acid`. phosphoric-acid is the
only host with the `ceod/admin` Kerberos key, which means it is the only host
that can create new principals and reset passwords.
Some endpoints can be accessed from multiple hosts. This is explained more in
ceod instances can also make API calls to each other. For
example, when the instance on phosphoric-acid creates a new user, it will
make a call to the instance on mail to subscribe the user to the csc-general
mailing list.
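
The split of endpoints across hosts can be pictured as a prefix lookup. The sketch below is illustrative only and is not how ceod itself is implemented: the two prefixes and the host names come from the list above, while `ENDPOINT_HOSTS`, `DEFAULT_HOST` and `host_for` are hypothetical names.

```python
# Prefix-to-host table taken from the list above; the names are hypothetical.
ENDPOINT_HOSTS = {
    "/api/mailman": "mail",
    "/api/db": "caffeine",  # coffee in the dev environment
}
DEFAULT_HOST = "phosphoric-acid"


def host_for(path):
    """Return the host whose ceod instance is expected to serve `path`."""
    for prefix, host in ENDPOINT_HOSTS.items():
        # Match the prefix itself or anything below it, but not e.g. /api/dbx.
        if path == prefix or path.startswith(prefix + "/"):
            return host
    return DEFAULT_HOST
```

For example, `host_for("/api/members")` falls through to phosphoric-acid, which is why the instance there has to call the instance on mail when a new member needs a mailing list subscription.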
## Security
In the old ceo, most LDAP modifications were performed on the client side,
using the client's Kerberos credentials to authenticate to LDAP via GSSAPI.
Using the client's credentials is desirable since we currently have custom
authz rules in our slapd.conf on auth1 and auth2. If we were to use the
server's credentials instead, this would result in two different sets of
authz rules - one at the API layer and one at the OpenLDAP layer - and
syscom members would very likely forget to update both at the same time.
So, we want a way for the server to use the client's credentials when
interacting with LDAP. The most secure way to do this is via a Kerberos
extension called "constrained delegation", or [S4U](
While the MIT KDC, which we are currently using, does provide support for S4U,
this [requires using LDAP as a database backend](,
which we are *not* using. While it is theoretically possible to migrate our
KDC databases to LDAP, this would be a very risky operation, and probably
not worth it if ceo is the only app which will use it.
Therefore, we will use unconstrained delegation. The client essentially
forwards their TGT to ceod, which uses it to access other services over GSSAPI
on the client's behalf. The TGT is formatted as a KRB-CRED message,
base64-encoded, and placed in an HTTP header named 'X-KRB5-CRED'.
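
As a rough illustration of the header format only, the sketch below wraps and unwraps an opaque KRB-CRED byte string. Obtaining the real KRB-CRED message is left to the Kerberos bindings and is not shown; `encode_cred_header` and `decode_cred_header` are hypothetical names, not ceo functions.

```python
import base64

HEADER_NAME = "X-KRB5-CRED"  # header name from the paragraph above


def encode_cred_header(krb_cred: bytes) -> dict:
    """Client side: wrap raw KRB-CRED bytes in the HTTP header."""
    return {HEADER_NAME: base64.b64encode(krb_cred).decode("ascii")}


def decode_cred_header(headers: dict) -> bytes:
    """Server side: recover the raw KRB-CRED bytes from the header.

    Raises KeyError if the header is absent and binascii.Error if the
    value is not valid base64.
    """
    return base64.b64decode(headers[HEADER_NAME], validate=True)
```

Base64 is an encoding, not encryption, so the header value is only as confidential as the transport that carries it.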
Since the client's credentials are used when interacting with LDAP, most
LDAP-related endpoints can be accessed from any host.
Only the Kerberos-specific endpoints (e.g. resetting a password) truly need
to be on phosphoric-acid.
### Authentication
The REST API uses SPNEGO for authentication via the HTTP Negotiate
Authentication scheme ( The API
does not verify that the user actually knows the key for the service ticket;
therefore, TLS is necessary to prevent MITM attacks. (TLS is also necessary
to protect the KRB-CRED message, which is unencrypted.)
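
On the wire, the Negotiate scheme carries a base64-encoded SPNEGO token in the `Authorization` header. The sketch below only shows how a server might extract that token; it does not validate it (that step belongs to a GSSAPI library), and `parse_negotiate` is a hypothetical helper rather than ceo code.

```python
import base64


def parse_negotiate(authorization):
    """Return the raw SPNEGO token from an Authorization header value.

    Returns None when the header is missing, uses another scheme, or has
    no token; the caller would then typically reply with 401 and
    `WWW-Authenticate: Negotiate`.
    """
    if not authorization:
        return None
    scheme, _, token = authorization.partition(" ")
    token = token.strip()
    if scheme.lower() != "negotiate" or not token:
        return None
    # The token still has to be handed to GSSAPI for actual verification.
    return base64.b64decode(token, validate=True)
```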
SPNEGO is pretty awkward, to be honest, as it completely breaks the stateless
nature of HTTP. If we decide that SPNEGO is too much trouble, we should switch
to plain HTTP cookies instead, and cache them somewhere in the client's home
directory.
## Web UI
For future contributors: if you wish to make ceo accessible from the browser,
you will need to add some kind of "Kerberos gateway" logic to the API such
that the user's password can be used to obtain Kerberos tickets. One possible
implementation would be to prompt the user for a password, obtain a TGT,
then encrypt the TGT and store it as a JWT in the user's browser. The API
can decrypt the JWT later and use it as long as the ticket has not expired;
otherwise, the user will be re-prompted for their password.