I was performing regularly scheduled maintenance today on one of our client’s servers. Because we’ve been building and managing servers for clients for a long time, our build-outs differ in slight ways as our processes have evolved.
I had decided to start reining in those differences and get each server we manage to a stable and predictable place.
In this instance our maintenance user was NOT a member of the adm group. I wanted it to be, so that I didn’t have to log in as the root user (or use sudo) to read system logs.
The command is supposed to be this:
usermod -a -G newgroup username
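-a stands for “append” and -G sets the supplementary groups for the user. Without -a, the list given to -G replaces the user’s existing supplementary groups instead of adding to them.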
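Here is a minimal sketch of how the append could be wrapped so the membership is visible before and after the change. The function name add_to_group and the DRY_RUN switch are my own inventions for illustration, not part of usermod; by default the function only prints the command it would run.

```shell
# Hypothetical helper: add USER to GROUP without touching other memberships.
# DRY_RUN is an assumed convention; it defaults to 1, so nothing is changed
# unless you explicitly set DRY_RUN=0 and run as root.
add_to_group() {
    user="$1"
    group="$2"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Only show the command, with -a (append) always present.
        printf 'usermod -a -G %s %s\n' "$group" "$user"
        return 0
    fi
    # Show the group list before and after so a mistake is obvious right away.
    echo "before: $(id -nG "$user")"
    usermod -a -G "$group" "$user" || return 1
    echo "after:  $(id -nG "$user")"
}

# Example (prints the command only): add_to_group username adm
```

Comparing the "before" and "after" lines while the current session is still open is the point: if the list got shorter, you know before you log out.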
What I ended up entering was
usermod -G newgroup username
which removed the user’s access to all the other groups it had been a part of (root, sudo, staff, admin, etc.). I could now read the logs I wanted, but I couldn’t do anything else on the system, most importantly administer it.
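To see ahead of time what a -G without -a would cost, a small check along these lines could help. It is only a sketch: lost_groups is a hypothetical helper name, and it works purely on strings you pass in (for example the output of id -nG username), so it changes nothing on the system.

```shell
# Hypothetical helper: list the groups a user would lose if
# `usermod -G <requested>` were run WITHOUT -a.
#   $1: current groups, space-separated (e.g. from `id -nG username`)
#   $2: the comma-separated list that would be passed to -G
# Note: `id -nG` also lists the primary group, which -G does not change.
lost_groups() {
    current="$1"
    requested="$2"
    lost=""
    for g in $current; do
        case ",$requested," in
            *",$g,"*) ;;                        # still in the new list, kept
            *) lost="${lost:+$lost }$g" ;;      # missing from the list, dropped
        esac
    done
    printf '%s\n' "$lost"
}

# Example: lost_groups "$(id -nG username)" "newgroup"
```

If the output contains sudo or admin, that is the moment to stop and add the -a.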
This mistake shone a bright light on a glaring misstep that we’ve been making for years when setting up these small servers for clients. There was NO way to recover from this error. The root user wasn’t allowed to log in to the system remotely, leaving sudo as the only way to get root privileges, and I couldn’t use sudo anymore.
Backups stored on the server were inaccessible due to permissions (we also save them nightly to AWS, which saved me here).
I had to recreate the server from scratch using backups from our off-server storage (AWS S3).
This was the only fix for this situation.
This was a glaring business continuity issue for our client. Luckily, in this instance, the site is fairly low traffic currently and hadn’t had any transactions that day since the nightly backup was stored off-site.
I’ve now updated every server whose installation we are actively or passively responsible for so that root has its own password. In dire situations like this, that will let us log in via our hosting provider’s console. At the very least we can rebuild and regain access for normal operations.
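For auditing a batch of servers, something like the following could confirm that root really has a usable password. Treat it as a sketch under assumptions: password_state is a hypothetical helper, and it relies on the status codes that passwd -S prints in its second field (P, L, NP on Debian-style systems; PS, LK, NP on Red Hat-style systems), which are worth double-checking against your distribution’s man page.

```shell
# Hypothetical helper: classify one line of `passwd -S <user>` output.
#   $1: the status line, e.g. the output of `passwd -S root` (run as root)
password_state() {
    # Split the line on whitespace; the second field is the status code.
    set -- $1
    case "$2" in
        P|PS) echo "usable" ;;   # a password is set and not locked
        L|LK) echo "locked" ;;   # password login is locked
        NP)   echo "none" ;;     # no password set
        *)    echo "unknown" ;;  # unexpected or empty input
    esac
}

# Example (as root): password_state "$(passwd -S root)"
```

Anything other than "usable" means the console login described above would not work on that server yet.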
This also helps our clients in case Maje Media goes *poof*. Since our clients all pay their own hosting bills and give us team access to their accounts to manage, we CANNOT be a single point of failure.
The next step is to figure out how to share these credentials with clients securely, in a way that lets them remember how to use them if the time ever comes that they need to.