diff --git a/README.md b/README.md
index 402ca68..4321e47 100644
--- a/README.md
+++ b/README.md
@@ -1,12 +1,39 @@
 # pyceo
-work in progress
+CEO (**C**SC **E**lectronic **O**ffice) is the tool used by CSC to manage
+club accounts and memberships. See [architecture.md](architecture.md) for an
+overview of its architecture.
 
 ## Development
 First, make sure that you have installed the
 [syscom dev environment](https://git.uwaterloo.ca/csc/syscom-dev-environment).
 This will setup all of the services needed for ceo to work. You should clone
-this repo in one of the dev environment containers.
+this repo in the phosphoric-acid container under ctdalek's home directory; you
+will then be able to access it from any container thanks to NFS.
 
+### Environment setup
+Once you have the dev environment set up, there are a few more steps you'll
+need to do for ceo.
+
+#### Kerberos principals
+First, you'll need `ceod/` principals for each of phosphoric-acid,
+coffee and mail. (coffee is taking over the role of caffeine for the DB
+endpoints). For example, in the phosphoric-acid container:
+```sh
+kadmin -p sysadmin/admin
+
+addprinc -randkey ceod/phosphoric-acid.csclub.internal
+ktadd ceod/phosphoric-acid.csclub.internal
+```
+Do this for coffee and mail as well. You need to actually be in the
+appropriate container when running these commands, since the credentials
+are being added to the local keytab.
+On phosphoric-acid, you will additionally need to create a principal
+called `ceod/admin` (remember to addprinc **and** ktadd).
+
+#### Database
+TODO - Andrew
+
+#### Dependencies
 Next, install and activate a virtualenv:
 ```sh
 sudo apt install libkrb5-dev libsasl2-dev python3-dev
@@ -16,7 +43,7 @@ pip install -r requirements.txt
 pip install -r dev-requirements.txt
 ```
 
-## C bindings
+#### C bindings
 Due to the lack of a decent Python library for Kerberos we ended up writing our
 own C bindings using [cffi](https://cffi.readthedocs.io).
 Make sure you compile the bindings:
@@ -28,15 +55,13 @@ This should create a file named '_krb5.cpython-37m-x86_64-linux-gnu.so'.
 This will be imported by other modules in ceo.
 
 ## Running the application
-ceod is essentially a distributed application, with instances on different
-hosts offering different services. For example, the ceod instance on mail
-offers a service to subscribe people to mailing lists, and
-the ceod instance on phosphoric-acid offers a service to create new members.
+ceod is a distributed application, with instances on different hosts offering
+different services.
 Therefore, you will need to run ceod on multiple hosts. Currently, those are
 phosphoric-acid, mail and caffeine (in the dev environment, caffeine is
 replaced by coffee).
 
-To run ceod on a single host:
+To run ceod on a single host (as root, since the app needs to read the keytab):
 ```sh
 export FLASK_APP=ceod.api
 export FLASK_ENV=development
@@ -81,3 +106,14 @@ curl --negotiate -u : --service-name ceod \
     -d '{"uid":"test_1","cn":"Test One","program":"Math","terms":["s2021"]}' \
     -X POST http://phosphoric-acid:9987/api/members
 ```
+
+## Miscellaneous
+### Mailman
+You may wish to add more mailing lists to Mailman; by default, only the
+csc-general list exists (from the dev environment playbooks). Just
+attach to the mail container and run the following:
+```sh
+/opt/mailman3/bin/mailman create new_list_name@csclub.internal
+```
+See https://git.uwaterloo.ca/csc/syscom-dev-environment/-/tree/master/mail
+for instructions on how to access the Mailman UI from your browser.
diff --git a/architecture.md b/architecture.md
new file mode 100644
index 0000000..7fb2cb4
--- /dev/null
+++ b/architecture.md
@@ -0,0 +1,70 @@
+# Architecture
+ceo is a distributed HTTP application running on three hosts. As of this
+writing, those are phosphoric-acid, mail and caffeine (coffee in the dev
+environment).
+
+* The `mail` host provides the `/api/mailman` endpoints.
+  This is because the REST API for Mailman3 is currently configured to run on localhost.
+* The `caffeine` host provides the `/api/db` endpoints. This is because
+  the root account of MySQL and PostgreSQL on caffeine can only be accessed
+  locally.
+* All other endpoints are provided by `phosphoric-acid`. phosphoric-acid is the
+  only host with the `ceod/admin` Kerberos key which means it is the only host
+  which can create new principals and reset passwords.
+
+Some endpoints can be accessed from multiple hosts. This is explained more in
+[Security](#security).
+
+Interestingly, ceod instances can actually make API calls to each other. For
+example, when the instance on phosphoric-acid creates a new user, it will
+make a call to the instance on mail to subscribe the user to the csc-general
+mailing list.
+
+## Security
+In the old ceo, most LDAP modifications were performed on the client side,
+using the client's Kerberos credentials to authenticate to LDAP via GSSAPI.
+Using the client's credentials is desirable since we currently have custom
+authz rules in our slapd.conf on auth1 and auth2. If we were to use the
+server's credentials instead, this would result in two different sets of
+authz rules - one at the API layer and one at the OpenLDAP layer - and
+syscom members would very likely forget to update both at the same time.
+
+So, we want a way for the server to use the client's credentials when
+interacting with LDAP. The most secure way to do this is via a Kerberos
+extension called "constrained delegation", or [S4U](https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-sfu/1fb9caca-449f-4183-8f7a-1a5fc7e7290a).
+While the MIT KDC, which we are currently using, does provide support for S4U,
+this [requires using LDAP as a database backend](https://k5wiki.kerberos.org/wiki/Projects/ConstrainedDelegation#CHECK_ALLOWED_TO_DELEGATE),
+which we are *not* using.
+While it is theoretically possible to migrate our KDC databases to LDAP, this would be
+a very risky operation, and probably not worth it if ceo is the only app which will use it.
+
+Therefore, we will use unconstrained delegation. The client essentially
+forwards their TGT to ceod, which uses it to access other services over GSSAPI
+on the client's behalf. The TGT is formatted as a KRB-CRED message,
+base64-encoded, and placed in an HTTP header named 'X-KRB5-CRED'.
+
+Since the client's credentials are used when interacting with LDAP, this means
+that most LDAP-related endpoints can actually be accessed from any host.
+Only the Kerberos-specific endpoints (e.g. resetting a password) truly need
+to be on phosphoric-acid.
+
+### Authentication
+The REST API uses SPNEGO for authentication via the HTTP Negotiate
+Authentication scheme (https://www.ietf.org/rfc/rfc4559.txt). The API
+does not verify that the user actually knows the key for the service ticket;
+therefore, TLS is necessary to prevent MITM attacks. (TLS is also necessary
+to protect the KRB-CRED message, which is unencrypted.)
+
+SPNEGO is pretty awkward, to be honest, as it completely breaks the stateless
+nature of HTTP. If we decide that SPNEGO is too much trouble, we should switch
+to plain HTTP cookies instead, and cache them somewhere in the client's home
+directory.
+
+## Web UI
+For future contributors: if you wish to make ceo accessible from the browser,
+you will need to add some kind of "Kerberos gateway" logic to the API such
+that the user's password can be used to obtain Kerberos tickets. One possible
+implementation would be to prompt the user for a password, obtain a TGT,
+then encrypt the TGT and store it as a JWT in the user's browser. The API
+can decrypt the JWT later and use it as long as the ticket has not expired;
+otherwise, the user will be re-prompted for their password.
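
The `X-KRB5-CRED` header described in the Security section of architecture.md can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not ceo's actual code: the helper names `build_cred_header` and `parse_cred_header` are hypothetical, and the credential bytes are a placeholder rather than a real KRB-CRED message obtained from a Kerberos library.

```python
import base64

# Header name taken from the architecture notes.
CRED_HEADER = "X-KRB5-CRED"


def build_cred_header(krb_cred: bytes) -> dict:
    """Client side (hypothetical helper): base64-encode a KRB-CRED message
    and place it in the HTTP header."""
    return {CRED_HEADER: base64.b64encode(krb_cred).decode("ascii")}


def parse_cred_header(headers: dict) -> bytes:
    """Server side (hypothetical helper): recover the raw KRB-CRED bytes.
    validate=True rejects characters outside the base64 alphabet."""
    return base64.b64decode(headers[CRED_HEADER], validate=True)


# Placeholder bytes standing in for a real KRB-CRED message (assumption).
placeholder_cred = b"placeholder-krb-cred"
headers = build_cred_header(placeholder_cred)
assert parse_cred_header(headers) == placeholder_cred
```

Base64 is an encoding, not encryption, so anyone who can read the header can recover the forwarded credential. That is why the Authentication section requires TLS.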