Linux Window Manager Part 3: Building a Window Manager

In the previous posts in this series I’ve talked about graphical interfaces, the role that window managers play in them, the X protocol and X11 library. In this post we will bring those things together and build a simple window manager. Sort of.

The Singleton Window Manager

I say “sort of”, because the program we build is barely going to actually manage windows. We are going to build an innovative new type of window manager called the Singleton window manager, SWM. The idea behind it is that, hey, you don’t need all those windows on the screen at the same time. You only have one pair of eyes, one mouse pointer, right? One app on screen is enough. Anything else is just distraction.1 1 Oh, and you don’t need any status bars either. Wasted real estate.

Now, obviously this isn’t true. There are plenty of reasons you might want multiple windows on screen. But the intention here is only to show you how a window manager works. Other simplifications we will make is that we will consider only one monitor, and will not have any concepts of multiple ‘workspaces’ or ‘tags’.

The behavior of SWM

Here is what we will be implementing:

We will be implementing the first two requirements today.

Getting set up for WM development

The key thing you need is a linux machine with X11 dev libraries installed. I won’t go into how to do that, but on Ubuntu something like sudo apt install libx11-dev should do the trick.

SWMv0: The core loop

Our first goal is to create a window manager that can be loaded by our machine. We will verify it’s working by making sure it’s picking up key presses. We also need to be able to exit out of it.

We start by creating the file swm0.c2 2 See the full file at the swm github repo with the imports and globals.

#include <X11/Xlib.h>
#include <X11/cursorfont.h>
#include <X11/keysym.h>
#include <stdbool.h>
#include <stdio.h>

static Display* display;
static int screen;
static int screen_w, screen_h;
static int running = true;
static Window root;

Cursor cursor;

FILE* logfile;
const char* logpath = "/home/joe/programming/projects/swm/log_swm0.txt";

We are loading 3 X11 libraries, the use of which will be clear from the name and how we will use them later.

For globals we will have an X Display pointer, and slots to put the identifier of the screen and “root” window of that screen. We’ll also store the width and height of that screen, since we will need to refer to it often.

Finally there is a boolean ‘running’ which we’ll use as our loop terminator, a Cursor (an X Object) and a file and path for logging.

Our main function is like this:

int main(int args, char* argv[]) {
  if (!(display = XOpenDisplay(NULL)))
    return 1;
  setup();
  run();
  cleanup();
  XCloseDisplay(display);
  return 0;
}

This is pretty standard, and will change not at all during the build. We open the connection to the Xorg display, we set ourselves up and enter the run loop. Then when we are finished we clean up and close the display.

Setup and Substructure

The setup function is a bit more complicated.

static void setup() {
  //1. logging
  logfile = fopen(logpath, "w");
  fprintf(logfile, "START SESSION LOG\n");

  //2. screen setup
  screen = DefaultScreen(display);
  screen_w = DisplayWidth(display, screen);
  screen_h = DisplayHeight(display, screen);
  root = RootWindow(display, screen);
  fprintf(logfile, "Root window %ld\n", root);

  //3. reconfiguring root
  XSetWindowAttributes wa;
  cursor = XCreateFontCursor(display, XC_left_ptr);
  wa.cursor = cursor;

  wa.event_mask =
      SubstructureRedirectMask | SubstructureNotifyMask | KeyPressMask;

  XMoveResizeWindow(display, root, 0, 0, screen_w, screen_h);
  XChangeWindowAttributes(display, root, CWEventMask | CWCursor, &wa);
}

The screen setup is pretty standard stuff: you get the screen from the display (and the dimensions of the screen), and get the root window of the screen.

Next we reconfigure the root window so it acts as the base of a window manager. In X re-configuring windows is done with an XSetWindowAttributes that is then applied to the window with XChangeWindowAttributes. For now we are changing two things: We are setting the cursor, and we are setting the ‘event_mask’.

The first is important but not interesting - you have to set up a cursor or the program won’t work.

The second is important and also crucial to the operation of a window manager. As we saw in part 2, an event mask is just a bit mask which tells which tell XOrg which events are ‘listened’ for in this window. Previously we’d seen things like KeyPressMask and MouseMotionMask. Here we have the SubstructureRedirectMask and SubstructureNotifyMask. What these two do is to tell XOrg that whenever windows under the root emit events, to not process them directly but send them to our program to deal with. This is what allows our WM to work: Xorg routes any event traffic from other programs to us, and the vast bulk of our program will be deciding how to handle these events.

A critical thing here is that Xorg will only allow one window to have this substructure redirect and notify settings at once. This means you can only have one WM running on a Linux session - for obvious reasons. A proper window manager will check for this, and fail elegantly if we try to launch a second WM, but for now we’ll just let it ride.

The loop and event handlers

static void run() {
  XEvent event;
  while (running && XNextEvent(display, &event) == 0) {
    switch (event.type) {
      case KeyPress:
        keypress_handler(&event);
        break;
      default:
        fprintf(logfile, "Unhandled event-type: %d\n", event.type);
    }
  }
}

static void keypress_handler(XEvent* event) {
  XKeyEvent* ev = &event->xkey;
  KeySym keysym = XKeycodeToKeysym(display, ev->keycode, 0);
  fprintf(logfile, "Keypress \t\t KeySym: %ld modifier = 0x%x\n", keysym, ev->state);
  if (keysym == XK_q)
    running = false;
}

This is the same for the X11 program examples we saw in part 2, and for most X11 programs: We enter a loop where we get the event (with XNextEvent), match on the event.type, and for each event we have a ‘handler’ function which determines how we deal with the event.

In this case, we have a single handler for keypresses3 3 Note that the ‘keysym’ way of handling is deprecated, and you’ll get a warning every time you compile. But that’s what dwm does, and if it’s good enough for them, it’s good enough for this. . Otherwise we’ll log our the event type so we can see what messages we’re getting. The keypress handler detects if the button we have pressed is a lowercase q, and if it is, sets running to false, which will quit the program.

Finally we have ‘cleanup’. Right now there’s nothing to clean up, so we are just closing the log file.

static void cleanup() {
  fprintf(logfile, "END SESSION LOG\n");
  fclose(logfile);
}

Running your window manager

First, compile your program with:

cc swm0.c -lX11 -o swm0

Once you have your binary swm0, move it to /usr/bin/

Next you need to get yourself into a linux session that doesn’t have another WM running. One way to do this is to have your OS set up so it doesn’t launch a WM by default, and loads you into a command line prompt. If you don’t have this, and you are in a graphical environment already (which if you are reading this you presumably are), you can switch to another tty session. Usually this is done with the command Ctrl+Alt+Fx, where x is the number of the tty you want to switch to4 4 having two X sessions running at once, even on different ttys, can cause some weirdness, even when the WM has proper error handling, which ours assuredly does not. Be prepared to restart your machine if things go wonky. 5 5 A third method, and one which will be used when you are done with development and actually want to use your WM, is to create a swm.desktop file in /usr/share/xsessions. This will be used by your display manager so you can load into our new WM as opposed to your normal one. .

Using either of the above methods you should be looking at a linux session where Xorg isn’t loaded. To start your X session using your WM, in the command prompt put:

 startx /etc/X11/Xsession swm0

Assuming this worked, congratulations, you have a working WM. Admittedly it’s just a blank screen. Try typing ‘hello’, then hit ‘q’, and it’ll dump you back to the tty. Now, if you look at the log file you configured, you should see something like this, indicating your WM is listening to your key commands.

START SESSION LOG
Root window 1980
Keypress        KeySym: 104 state = 0x0
Keypress        KeySym: 101 state = 0x0
Keypress        KeySym: 108 state = 0x0
Keypress        KeySym: 108 state = 0x0
Keypress        KeySym: 111 state = 0x0
Keypress        KeySym: 119 state = 0x0
Keypress        KeySym: 113 state = 0x0
END SESSION LOG

This is step one in our WM creation journey. In this session we have implemented the following points of our requirements list:

In the next post, we will do the next ones: