Easier Web Application Deployments #103

Open
opened 2023-09-24 15:11:36 -04:00 by e226li · 8 comments
Member

This proposal aims to replace existing multiplexer-based application deployments with systemd files.

Deliverables

  • Interactive script, available to users, that creates systemd unit files in ~/.config/systemd/user/
    • Unit files start with hardening defaults
      • PrivateTmp=yes, PrivateDevices depending on application, ProtectSystem=full, etc.
      • Maybe NoNewPrivileges but I don’t see how it would do anything when systemd is run at the user level
    • Templates for different commonly used programs
    • Aliases for systemctl --user start/stop/restart etc.
      • e.g., ./script restart aliases to systemctl --user daemon-reload && systemctl --user start
    • Support for multiple services (e.g., if one club webapp has many moving parts)
    • Run with only user permissions so no possibility of privilege escalation
    • Needs to be robust only for ease of use, don’t need to worry about injections etc.
  • Web interface option via Swagger???
    Probably not a good idea, but very feasible technically via FastAPI or similar
    • e.g., can GET an endpoint using a webpage to get systemctl status output
    • UI example can be found here: https://petstore.swagger.io/
    • This might be too much scope creep and will significantly increase attack surface, as implementing auth well is difficult
  • Cronjob to run loginctl enable-linger myuser if myuser has systemd userspace services enabled, and loginctl disable-linger if myuser is set to nologin
  • Documentation on wiki
    • Usage examples
    • Templates used for each program
  • Tests:
    • Pytest
    • Drone integration?
    • Fuzzing if making web interface?
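
A minimal sketch of what the unit-file generator could look like, assuming Python (Pytest is listed under Tests). The function names `render_unit` and `write_unit` are hypothetical, the hardening set only covers the directives named above, and `WantedBy=default.target` is an assumption about how user services would be enabled. Whether each hardening directive actually takes effect under a user-level systemd manager would still need checking.

```python
from pathlib import Path

# Hardening defaults named in the proposal; the final set is still undecided.
HARDENING_DEFAULTS = {"PrivateTmp": "yes", "ProtectSystem": "full"}


def render_unit(description, exec_start, working_dir, extra_service=None):
    """Return the text of a user service unit with the hardening defaults applied."""
    service = {"ExecStart": exec_start, "WorkingDirectory": working_dir}
    service.update(HARDENING_DEFAULTS)
    # Per-application overrides (e.g. PrivateDevices) are applied last.
    service.update(extra_service or {})
    lines = ["[Unit]", f"Description={description}", "", "[Service]"]
    lines += [f"{key}={value}" for key, value in service.items()]
    lines += ["", "[Install]", "WantedBy=default.target"]
    return "\n".join(lines) + "\n"


def write_unit(name, text, unit_dir=None):
    """Write <name>.service into the user unit directory and return its path."""
    unit_dir = Path(unit_dir) if unit_dir else Path.home() / ".config/systemd/user"
    unit_dir.mkdir(parents=True, exist_ok=True)
    path = unit_dir / f"{name}.service"
    path.write_text(text)
    return path
```

Keeping rendering separate from writing means the templates can be tested with Pytest without touching a real home directory.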

Notes

  • systemctl --user fails when you su - myuser; use machinectl shell myuser@ instead
  • Thanks to Max for the mention of loginctl enable-linger myuser
    • Confirmed working on a fresh Ubuntu 22.04 install
Author
Member

Existing comment chain on the web interface (ported over from the doc):

This might be too much scope creep and will significantly increase attack surface, as implementing auth well is difficult

Nathan

what's your advice? suggest against it? would each club have this or are we just hosting one for ourselves (for something)?

Eric

I'm not sure - this can be a killer feature that helps clubs a lot, but it's also a lot more work compared to just throwing together a wrapper that only creates/modifies systemd files.

As for implementation, it's probably best if each club spins up its own api server - the flow would just be "user calls web interface/API with API key -> API server calls interactive script -> stdout/stderr is returned to the user"
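
A rough sketch of that flow, assuming Python and leaving out the HTTP layer entirely. The function name `run_for_user` and the way the expected key is passed in are hypothetical; a real server would also need key storage, rate limiting, and an allow-list of script arguments.

```python
import hmac
import subprocess


def run_for_user(presented_key, expected_key, script_argv, timeout=30):
    """Check the API key, run the script, and hand back its stdout/stderr."""
    # Constant-time comparison avoids leaking key contents through timing.
    if not hmac.compare_digest(presented_key.encode(), expected_key.encode()):
        return {"ok": False, "returncode": None, "stdout": "", "stderr": "invalid API key"}
    # argv is passed as a list, so no shell is involved.
    proc = subprocess.run(script_argv, capture_output=True, text=True, timeout=timeout)
    return {
        "ok": proc.returncode == 0,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```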

Owner

We can think about the web interface later. Let's focus on the main use case for now.

I'm going to discourage the use of loginctl enable-linger for the following reasons:

  1. It's too easy to forget that it's there. It's out of sight, out of mind.
  2. systemd doesn't offer a public API to get a list of all users for whom lingering is enabled.
  3. This allows members to create and run their own systemd services, which we don't want. More details below.

Here's why we don't want members to directly create their own systemd services:

  1. We want to perform some kind of validation. For example, members' applications generally shouldn't be run on caffeine because that's only meant to run the main web server (Apache).
  2. We want to limit how many services a member/club can run.
  3. We want to make sure the unit file has certain security-related directives, e.g. ProtectSystem.
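
To make the three checks above concrete, here is a small sketch, assuming Python. The limit of 3 services, the host set, and the function name `validate_request` are placeholders for illustration, not decided policy.

```python
# Placeholder policy values; the real limits would be set by syscom.
DISALLOWED_HOSTS = {"caffeine"}
MAX_SERVICES_PER_USER = 3
REQUIRED_DIRECTIVES = {"ProtectSystem"}


def validate_request(host, existing_service_count, unit_text):
    """Return a list of human-readable problems; an empty list means the request passes."""
    problems = []
    if host in DISALLOWED_HOSTS:
        problems.append(f"apps may not run on {host}")
    if existing_service_count >= MAX_SERVICES_PER_USER:
        problems.append(f"limit of {MAX_SERVICES_PER_USER} services reached")
    # Collect the directive names that appear as KEY=VALUE lines in the unit text.
    present = {
        line.split("=", 1)[0].strip()
        for line in unit_text.splitlines()
        if "=" in line and not line.lstrip().startswith("#")
    }
    for directive in sorted(REQUIRED_DIRECTIVES - present):
        problems.append(f"missing required directive {directive}")
    return problems
```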

Here's what I suggest instead:

  1. Select a specific machine to run all of the web apps. This'll make our life easier later if we need to modify the unit files manually. Create a DNS alias for that machine. Don't let members choose which machine their app runs on; we (syscom) want to be able to move these around if necessary.
  2. Create a new ceod API which creates a systemd unit on the machine above for the app (maybe just use SSH for now, we can use something more complex later if necessary). Members can specify the working directory, command to execute, environment variables, and URL path (i.e. csclub.uwaterloo.ca/~username/path/to/your/app). No URL path is needed if the app doesn't need to be publicly exposed (e.g. a Discord bot). The systemd unit file should be a template unit file so that we can keep common directives in one place, e.g. csc-member-app@.service.
  3. ceo should automatically modify the member's ~/www/.htaccess to redirect the path above to the host + port on which the app is running. (TODO: figure out how to resolve port conflicts. If this is going to run on a general-use machine, we can't prevent other members from re-using that port.)
  4. ceod should stop and delete the systemd unit when the user's membership expires.
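
For step 3, one possible shape of the .htaccess change, sketched in Python. This assumes the "redirect" is done by proxying with mod_rewrite's `[P]` flag (which needs mod_rewrite and mod_proxy enabled on the web server); the function names, the host name `apps.example`, and the port are hypothetical, and the port-conflict TODO is not addressed here.

```python
import re


def htaccess_rule(url_path, host, port):
    """Build a RewriteRule that proxies <url_path>/... to the app's host and port."""
    path = url_path.strip("/")
    # In .htaccess context the pattern is matched relative to the directory.
    return f"RewriteRule ^{re.escape(path)}(/.*)?$ http://{host}:{port}$1 [P,L]"


def add_rule(htaccess_text, rule):
    """Return the .htaccess text with the rule appended, without duplicating it."""
    lines = htaccess_text.splitlines()
    if rule in lines:
        return htaccess_text
    if "RewriteEngine On" not in lines:
        lines.append("RewriteEngine On")
    lines.append(rule)
    return "\n".join(lines) + "\n"
```

Making `add_rule` idempotent means ceo could re-run it safely, for example after moving an app to another machine.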

Bonus: figure out how to run the apps on a machine with Kerberized NFS. Then we could run it from one of the cloud machines, which would actually make our life a lot easier.

Owner

Some ideas for the ceo CLI:

  • Create a new app: ceo app add --name myapp -e MY_ENV_VAR=1 -c "npm start" --path myapp
  • Delete an app: ceo app delete myapp
  • Restart an app: ceo app restart myapp (this should SSH into the machine where the app is running and run systemctl restart)
  • Get app logs: ceo logs myapp (this should SSH into the machine where the app is running and run journalctl)
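
A sketch of how these subcommands could be parsed, using the standard-library argparse purely for illustration; this is not a claim about how ceo's CLI is actually implemented. The attribute names `group`, `action`, and `exec_command` are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `ceo app ...` and `ceo logs` commands."""
    parser = argparse.ArgumentParser(prog="ceo")
    groups = parser.add_subparsers(dest="group", required=True)

    app = groups.add_parser("app", help="manage member apps")
    actions = app.add_subparsers(dest="action", required=True)

    add = actions.add_parser("add", help="create a new app")
    add.add_argument("--name", required=True)
    # -e may be repeated; each value is expected to look like KEY=VALUE.
    add.add_argument("-e", "--env", action="append", default=[], metavar="KEY=VALUE")
    add.add_argument("-c", "--command", dest="exec_command", required=True)
    # Optional: apps that are not publicly exposed need no URL path.
    add.add_argument("--path")

    for action in ("delete", "restart"):
        actions.add_parser(action).add_argument("name")

    logs = groups.add_parser("logs", help="show app logs")
    logs.add_argument("name")
    return parser
```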
Author
Member

Hey Max, thanks for the feedback!

Couple things off the top of my head:

re 1.2: enabled/disabled lingers are just empty files that live in /var/lib/systemd/linger - no official api, but it should be trivial to make one ourselves
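
Something like the following is what I have in mind, as a sketch in Python. It relies on the directory layout described above rather than any official interface, so it could break if systemd changes how lingering is stored; the function name `lingering_users` is made up.

```python
from pathlib import Path

# Location of the linger marker files, as described above.
LINGER_DIR = Path("/var/lib/systemd/linger")


def lingering_users(linger_dir=LINGER_DIR):
    """Return the sorted usernames that have a linger marker file."""
    linger_dir = Path(linger_dir)
    if not linger_dir.is_dir():
        return []
    return sorted(entry.name for entry in linger_dir.iterdir() if entry.is_file())
```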

re 2.2: template files seem like a really good idea

And for the rest, I mostly steered clear of those because I wanted to keep the scope of this as small as possible - this project will probably be best as a drop in replacement for tmux, and we can fix the other validation issues later down the line as separate projects so things don't end up bogging down. I'm also a bit leery of making a ceo api, since that turns this project from essentially zero additional attack surface (only superuser related addition would be loginctl) to one with many attack surfaces.

(I know the web interface does run counter to the low scope philosophy, but that's just because it's a bit of a toy project for me.)

Author
Member

Actually, clarifying my earlier comment, I think integrating this into the ceo api superficially (ceo app aliases to pythonscript) would be a good idea from a UX perspective. I just don't know the ceo backend well enough, so I'm worried about breaking things if I make any changes that are more complex.

Owner

Not sure if I recall correctly, but was it mentioned on IRC that we should avoid using LXC containers? So is the plan now to go with Systemd units?

These systemd units could be created from club accounts using ceo app ...?

Author
Member

My understanding was that the current plan is to go with systemd because it's the easiest. LXC containers will take too much space and have other feasibility concerns (@merenber), and will take a lot more work compared to manipulating config units, so they're a long-term solution at best (my thoughts). Might have misunderstood the consensus though.

Author
Member

Thought some more about this, and I think there are two ideas for how this project should go forward. One is for this to be a singular improvement (<10 hours + tests), and the other is for this to address other outstanding issues so that we won't have to junk this in the future because of side effects that become apparent after a couple of years.

I'm personally in favour of the former: I can do it, while I definitely don't have the experience or expertise to do the latter without a lot of help. Curious to hear everyone's opinions on this - I'd be happy to start hacking on the latter as well if that's the coding philosophy CSC wants to take.

merenber added the priority medium label 2023-10-14 22:17:09 -04:00
Reference: public/pyceo#103