i hate systemd


The Good, The Bad, The Ugly.


And The Survival Guide.

(Work In Progress)

The Systemd Survival Guide

So there's some Good, some Bad, some Ugly.

But to get paid, you're likely going to have to work with it, so here's a guide to help you to survive it.

At least until your cunning:
{ Ponzi Scheme |
Money Laundering |
Creating The Next Facebook / Twitter / Uber |
Marrying a Gorgeous yet Gullible Millionaire |
Insert Other Get-Rich-Quick Scheme Here
} plan

works out for you, and you can retire to a life of luxury.

In the meantime, hopefully this guide will be helpful. And please do remember to think of us from time to time when you're cruising in your luxury yacht, sipping a cocktail - with the beautiful yet gullible millionaire by your side, of course.

The Basics

The first thing you will need to get your head around is that systemd manages services. Just as Upstart or /etc/init.d/ scripts can start, stop, and tell you the status of your system's services (httpd, sshd, mysqld, etc.), systemd's systemctl provides roughly equivalent functionality.

Start and Stop Services

To start or stop a service such as SSH, you use the systemctl start sshd or systemctl stop sshd commands. True to the Unix tradition, it doesn't output anything at all on success; it just does what you tell it, whether that is starting or stopping the service.
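Since a successful start is silent, a script has to go by the exit status. Here is a minimal sketch of that, assuming a shell with systemctl on the PATH. The function name start_and_confirm is made up for this guide; it is not part of systemd.

```shell
# start_and_confirm is a hypothetical helper name, not part of systemd.
start_and_confirm() {
    # "systemctl start" says nothing on success, so go by its exit status.
    if ! systemctl start "$1"; then
        echo "failed to start $1" >&2
        return 1
    fi
    # "is-active --quiet" prints nothing and exits 0 only if the unit is active.
    if systemctl is-active --quiet "$1"; then
        echo "$1 is running"
    else
        echo "$1 did not stay up" >&2
        return 1
    fi
}
```

You would call it as start_and_confirm sshd, most likely as root.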

Check the Status of a Service

To check the status of a service, you guessed it - it's systemctl status sshd. This is where it gets complicated, quickly. There's no middle-ground with systemd. systemctl status gives you far more information than you generally need, and presents it in an often very misleading manner.

# systemctl status sshd
● sshd.service - OpenSSH server daemon
   Loaded: loaded (/usr/lib/systemd/system/sshd.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since Thu 2017-09-28 22:34:44 BST; 2min 14s ago
     Docs: man:sshd(8)
           man:sshd_config(5)
 Main PID: 691 (code=exited, status=0/SUCCESS)

Sep 19 23:08:32 fedora26.steve-parker.org systemd[1]: Starting OpenSSH server daemon...
Sep 19 23:08:33 fedora26.steve-parker.org sshd[691]: Server listening on 0.0.0.0 port 22.
Sep 19 23:08:33 fedora26.steve-parker.org sshd[691]: Server listening on :: port 22.
Sep 19 23:08:33 fedora26.steve-parker.org systemd[1]: Started OpenSSH server daemon.
Sep 19 23:08:46 fedora26.steve-parker.org sshd[1175]: Accepted password for steve from 192.168.8.73 port 37381 ssh2
Sep 21 22:52:38 fedora26.steve-parker.org sshd[26086]: Accepted password for steve from 192.168.8.73 port 47870 ssh2
Sep 28 22:34:44 fedora26.steve-parker.org sshd[691]: Received signal 15; terminating.
Sep 28 22:34:44 fedora26.steve-parker.org systemd[1]: Stopping OpenSSH server daemon...
Sep 28 22:34:44 fedora26.steve-parker.org systemd[1]: Stopped OpenSSH server daemon.
# 

This sshd is stopped. PID 691 is not running, but it was the main PID the last time sshd was active.

The Loaded: line tells you that the service is enabled - that is, it will start on boot, although the default for this distribution (Fedora) is that SSH is disabled. So it has been manually activated by the administrator.
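If all you want is the "will it start on boot" part of the Loaded: line, systemctl is-enabled prints just that one word. A small sketch, where starts_on_boot is a name invented for this guide, and where only the two commonest answers are spelled out (there are others, such as static and masked):

```shell
# starts_on_boot is a hypothetical helper name, not part of systemd.
starts_on_boot() {
    local state
    # is-enabled prints a single word; it exits non-zero for "disabled",
    # so "|| true" stops that from being treated as a script error.
    state="$(systemctl is-enabled "$1" 2>/dev/null)" || true
    case "$state" in
        enabled)  echo "$1: enabled, so it should start on boot" ;;
        disabled) echo "$1: disabled, so it is not set to start on boot" ;;
        *)        echo "$1: state is '${state:-unknown}'" ;;
    esac
}
```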

The Active: inactive (dead) line does not mean that it is active. It is dead, and has been for 2 minutes and 14 seconds. If it was active, it would say Active: active. Obviously.

The Main PID: line tells you that the process has exited ("code=exited"), and that it exited cleanly (status=0/SUCCESS). This is basically saying that the service was stopped successfully; it hadn't crashed, failed to stop, or otherwise ended up dead. It is inactive (oh sorry, this is systemd - it's listed as "Active" of course. But it's dead, alright?). It's dead because systemd was told to stop the service, and systemd successfully arranged that for you. Didn't it do well? Go and get yourself an ice-cream, systemd. Well done.
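If you would rather not squint at the Active: and Main PID: lines, systemctl show prints the same facts as KEY=VALUE pairs, which are much harder to misread. A rough sketch, assuming the ActiveState, SubState and ExecMainStatus properties are the ones you care about. The name unit_summary is invented here.

```shell
# unit_summary is a hypothetical helper name, not part of systemd.
unit_summary() {
    local unit="$1" props active sub status
    # "systemctl show" prints one KEY=VALUE pair per line.
    props="$(systemctl show "$unit" -p ActiveState -p SubState -p ExecMainStatus)" || return 1
    # Pick each value out by name, so the order of the lines doesn't matter.
    active="$(printf '%s\n' "$props" | sed -n 's/^ActiveState=//p')"
    sub="$(printf '%s\n' "$props" | sed -n 's/^SubState=//p')"
    status="$(printf '%s\n' "$props" | sed -n 's/^ExecMainStatus=//p')"
    echo "$unit: $active ($sub), main process exit status $status"
}
```

For the stopped sshd above, I would expect this to report inactive (dead) with an exit status of 0, which is the whole status output boiled down to one line.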

If it had failed to stop the sshd service, it would have looked something like this iscsid process which had failed to stop:

  Process: 12375 ExecStop=/sbin/iscsiadm -k 0 2 (code=exited, status=203/EXEC)

Some of the text here is red. systemd enjoys colouring in. It likes to show bad things in red, and good things in green. It isn't always quite sure what a "good" or "bad" message actually looks like, though, so don't assume that all red messages are bad, or that all green messages are good. Similarly, terribly fatal messages may not be highlighted at all. systemd is trying, though, and if it keeps the effort up, it might be allowed to use more colours next term.

The rest of the output is just some stuff that systemd noticed sshd saying during its run. Some of it is startup messages, some of it is messages as it's operating (a user called 'steve' logged in). Then it was killed (by systemd). These messages aren't even necessarily in order, particularly the ones from systemd itself. Note that sshd reported receiving signal 15 (SIGTERM, the polite "please terminate" signal, not SIGKILL) moments before systemd reported that it was about to stop it:

Sep 28 22:34:44 fedora26.steve-parker.org sshd[691]: Received signal 15; terminating.
Sep 28 22:34:44 fedora26.steve-parker.org systemd[1]: Stopping OpenSSH server daemon...
Sep 28 22:34:44 fedora26.steve-parker.org systemd[1]: Stopped OpenSSH server daemon.

So it's hard to take these lines too seriously. It's actually incredibly easy to trick yourself (or, if you're bored, trick somebody else, like a line manager, for example) into believing that it's saying something very different about the current state of the service than it actually is.
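When three lines all claim the same second, it can help to ask the journal directly for finer timestamps. This is only a sketch: unit_log is a made-up name, the default of 20 lines is arbitrary, and microsecond timestamps will show you how close together the entries were without necessarily explaining why they are in that order.

```shell
# unit_log is a hypothetical helper name, not part of systemd.
unit_log() {
    # -u: only this unit; -o short-precise: microsecond timestamps;
    # -n: last N entries (20 by default here, an arbitrary choice).
    journalctl -u "$1" -o short-precise --no-pager -n "${2:-20}"
}
```

So unit_log sshd 5 would show the last five journal entries for sshd.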

Summary

The return code for systemctl status should be zero if all's well, and non-zero otherwise. Often 3. Always 3? Really, I'm not sure. I will find out, and when I do, I'll be sure to make a note of it here.
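For what it's worth, my reading of the systemctl man page is that status uses LSB-style exit codes: 0 when the unit is active, 3 when it is not active, and 4 when there is no such unit. Treat that as an assumption to check against your own version. A sketch that turns the number into words, using the invented names describe_status_code and check_unit:

```shell
# Both function names are hypothetical; the code meanings are taken from
# the systemctl man page as I read it, so verify them on your system.
describe_status_code() {
    case "$1" in
        0) echo "active" ;;
        3) echo "not active" ;;
        4) echo "no such unit" ;;
        *) echo "unexpected code $1" ;;
    esac
}

check_unit() {
    local rc=0
    # Throw away the chatty output; only the exit status is wanted here.
    systemctl status "$1" >/dev/null 2>&1 || rc=$?
    describe_status_code "$rc"
}
```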

Unit Files

Unit files are ... TODO

Systemctl Commands

Systemctl commands ... TODO

Logs

systemd manages logs ... actually, you'd better get yourself a drink before we start on this one.

I'll wait.

What? No, not a frothy latte, that won't do you any good where we're going.
But okay, fine. Suit yourself.

One of the initial criticisms of systemd was that its logs are in a binary format. Because at 3am in a cold datacentre on a vt100 green screen (if you haven't been there, just don't ask, you don't want to have been there) what you really need is a clear, plain text log of what happened to the system as it went down in flames. But the designers of systemd decided that it would be better for our moral fibre to read the logs through their tools, so - as long as systemd itself is working fine - that's great. Really, it is. It's all absolutely fine. So long as systemd is working fine. And if you're in a cold datacentre, and it's now 3.30am, and you're trying to dig through systemd log files to find out why you're not tucked up safely in bed, then it seems pretty bloody safe to assume that systemd isn't working fine. So maybe it's not able to read the logs. Oops.

At least the systemd developers have a strong reputation for empathising with your situation and doing what they can to help you out.
Oh. Feck. Yeah, they don't have that reputation, do they? Quite the opposite, in fact. So. You're screwed. And now it's 4am in a cold datacentre. And you're no further along than you were before.

If that frothy latte hasn't gone cold, I hope it's still doing you some good.

I will write sensible stuff

Really, I will write some sensible stuff here about logs. Some of it is quite nice. Totally unnecessary, but yeah. Some of it isn't completely terrible.
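In the meantime, here is one sensible thing. journalctl can be pointed at a directory of journal files, so you can read a broken machine's logs from a rescue system or another box. A sketch, assuming the journal was being kept on disk in the first place (if it only lived in memory, it went down with the ship). The name read_offline_journal is invented, and the path in the usage note is just an example mount point.

```shell
# read_offline_journal is a hypothetical helper name, not part of systemd.
read_offline_journal() {
    # --directory: read journal files from this directory instead of the
    # running system's journal; --priority=err: errors and worse only.
    journalctl --directory="$1" --priority=err --no-pager
}
```

With the dead system's root filesystem mounted on, say, /mnt, that would be read_offline_journal /mnt/var/log/journal.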

@ihatesystemd