<h1>Booting multiple Ubuntu versions with EFI</h1>
<p>lxgr's blog (blog.lxgr.net), 2015-04-30</p>
<p>For an upcoming project, I will have to use Ubuntu 14.04, and since I didn't want to downgrade my main Ubuntu install on my laptop, I decided to install the second version on a spare partition of my primary hard disk. Sounds easy, right? That's what I thought too, and I was very wrong.</p>
<p>I expected to be able to simply point the Ubuntu installer to the spare partition and wait for the automatic setup to complete, like I used to when I was using plain old BIOS and MBRs to boot my system. My current laptop, however, supports something called UEFI and Secure Boot, and since the "secure" part piqued my interest, I had decided to give it a try. I have been using this setup for dual-booting Ubuntu and Windows 8 for more than a year now without any problems.</p>
<p>After watching the installer copy all the necessary files of Ubuntu 14.04 to the disk and install the bootloader, I booted into the new system, and everything worked as expected so far – I saw the new, old 14.04 desktop and was even able to open my main, encrypted 15.04 partition using Nautilus. But when I wanted to boot back into my main system, I realized that something had gone wrong: My 15.04 install didn't show up anywhere in the boot process. Not in my laptop's EFI boot menu, which I can access by pressing F12 during boot (I used that to boot Windows, because the secure boot machinery somehow interfered with grub's chain loading), and not in grub itself – it looked as if the installation had never existed in the first place (except for the fact that I was able to see all of its files in 14.04, of course).</p>
<p>After the initial rush of panic had subsided, I was able to restore the system to its original state by the "usual" process of <a href="http://howtoubuntu.org/how-to-repair-restore-reinstall-grub-2-with-a-ubuntu-live-cd">mounting and chrooting</a> into my previous system and executing <code>grub-install</code>. This worked, but now I was not able to boot into the new system anymore, short of executing all those steps again from within the old system. I realized I had to dig into the details of the Linux boot process on EFI (and secure boot) if I wanted to accomplish the triple-boot setup I had in mind. If you are unfamiliar with UEFI, I recommend <a href="https://www.happyassassin.net/2014/01/25/uefi-boot-how-does-that-actually-work-then/">reading an introduction</a> before reading the following paragraphs.</p>
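<p>For reference, the mount-and-chroot repair roughly looks like this (a sketch, not a definitive recipe – <code>/dev/sda2</code> and <code>/dev/sda1</code> are placeholders for your root and EFI system partitions):</p>
<div class="highlight"><pre># From a live CD/USB session; all device names are placeholders!
sudo mount /dev/sda2 /mnt
sudo mount /dev/sda1 /mnt/boot/efi
for fs in dev proc sys; do sudo mount --bind /$fs /mnt/$fs; done
sudo chroot /mnt grub-install
sudo chroot /mnt update-grub
</pre></div>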
<p>The main problem seemed to be that the <code>grub-install</code> utility of both Ubuntu systems (which is also executed from the installer) wrote to the same location – I just didn't know what that location was. I started to dig into the details of the EFI boot process – from previous triple-boot experiments on a Mac I vaguely remembered something about a so-called "EFI system partition" where the initial boot loaders of all installed operating systems are stored, and also that there were some parts of the UEFI configuration that were stored in a non-volatile memory on the mainboard. This is very different from the "legacy" boot process, where the trusty BIOS simply loads a chunk of code stored in the MBR of some disk (which can be configured in the BIOS setup, but usually defaults to a sequence of CD-ROM/USB/primary HDD) and executes it. That process is both simple (it doesn't depend on any stable configuration storage within the PC) and robust (in my experience, it was possible to migrate to a new machine simply by swapping the hard drive!), but is also showing its age – things like a dual-boot setup of uncooperative operating systems quickly become a mess, as everybody who has ever installed Windows after Linux on the same machine probably knows. Additionally, there is no mechanism that allows checking the integrity of the system before it is booted, enabling malware that hooks itself into the boot process and is virtually undetectable by any software or operating system mechanism.</p>
<p>UEFI solves the problem of multiple operating systems by specifying the "EFI system partition", which is basically just a plain old FAT partition with a special partition flag and a standardized folder structure where every operating system on the disk or even machine (more on that later) can store its initial bootloader as an executable file. This is a much more robust and future-proof way of storing the first-stage bootloader than the very limited MBR scheme that basically only allows a single primary bootloader which has to locate and execute all secondary boot loaders of all operating systems on the drive. However, it is unfortunately not enough (at least on my computer!) to just dump a bootloader in the correct location (which would have been a <em>really</em> nice EFI feature!) – the corresponding operating system also has to tell the EFI about the new bootloader (both the UUID of the EFI disk it resides on and the path on that disk), which then stores that information in its non-volatile configuration memory (a.k.a. NVRAM).</p>
<p>To sum up the EFI boot process: you need a folder on the EFI system partition containing your operating system's boot loader as an EFI binary (which in turn might be the first stage of a multi-stage boot loader that simply locates and executes its remaining parts) and a pointer (i.e. disk and path) to that file in the NVRAM. In the case of secure boot, the EFI binary will also be signed by some "trusted" entity, which could be your operating system vendor or, amusingly, Microsoft (even though you aren't actually using their operating system – this is because many hardware vendors opted to include their keys in their firmware, which was cause for much political discussion when secure boot was initially introduced).</p>
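<p>If you ever need to create such an NVRAM pointer by hand, <code>efibootmgr</code> can do it. A sketch (the disk, partition number, loader path and label are all placeholders for your setup):</p>
<div class="highlight"><pre># Register \EFI\myubuntu\shimx64.efi on partition 1 of /dev/sda
# as a new NVRAM boot entry labelled "myubuntu" (values are placeholders):
sudo efibootmgr --create --disk /dev/sda --part 1 \
    --loader '\EFI\myubuntu\shimx64.efi' --label myubuntu
</pre></div>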
<p>Fortunately, <code>grub-install</code> takes care of all of that automatically as long as all the correct flags are supplied to it – but unfortunately, in the case of Ubuntu and secure boot, this only works for a single installation per <em>machine</em> (i.e. <em>not</em> per disk, which kept me puzzling for hours!). I'm no expert on secure boot, but I think that this might not be easily fixable by Ubuntu, depending on how exactly the signature mechanism is implemented.</p>
<p>When invoked with no parameters on an EFI system, Ubuntu's <code>grub-install</code> installs its signed bootloader to the EFI system partition mounted at <code>/boot/efi</code> (which might or might not be the one you want to use on a multi-disk setup!), in the folder <code>/EFI/ubuntu/</code>. The signed loader consists of a shim signed by Microsoft that subsequently executes the actual EFI loader called <code>grubx64.efi</code> in the same directory. Finally, there is a <code>grub.cfg</code> configuration file which contains a pointer (both as a disk UUID and as a GPT number) to the "regular" grub boot disk, which is usually your Linux root partition (or a separate boot partition if you are using an encrypted root device).</p>
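<p>For illustration, that <code>/EFI/ubuntu/grub.cfg</code> contains little more than the pointer itself; on my system it looks roughly like this (the UUID, partition number and prefix path are placeholders and vary between setups):</p>
<div class="highlight"><pre>search.fs_uuid 01234567-89ab-cdef-0123-456789abcdef root hd0,gpt2
set prefix=($root)'/boot/grub'
configfile $prefix/grub.cfg
</pre></div>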
<p>Initially, the problem only seemed to be that both Ubuntu installs were trying to install their bootloader and configuration to the same EFI subdirectory – I thought that if I were to somehow convince <code>grub-install</code> to install the EFI loader to some other subdirectory of <code>/EFI</code>, I would be able to select the Ubuntu I wanted from the EFI boot screen. <code>grub-install</code> conveniently has an option for exactly that; you can either change the value of the <code>GRUB_DISTRIBUTOR</code> variable in <code>/etc/default/grub</code>, or directly supply it using the <code>--bootloader-id</code> parameter. When invoking <code>grub-install</code> this way, you can see that a new folder in <code>/EFI</code> will be created using the supplied name (and registered with the EFI NVRAM). Unfortunately, while I was able to boot from the newly created boot entry, I didn't seem to be able to change the disk that grub was booting from in any way. It took me hours to find out why.</p>
<p>Remember that grub uses the value stored in <code>/EFI/<loadername>/grub.cfg</code> to determine the disk where it will continue loading. With a lot of trial-and-error experimentation, I was finally able to determine that regardless of which boot entry I was using in the EFI boot manager, grub would always read the same <code>grub.cfg</code> from <code>/EFI/ubuntu</code>, <strong>regardless of the actual bootloader location</strong> (i.e. subfolder of <code>/EFI</code>)! This location is actually hardcoded in the <code>grubx64.efi</code> binary, which can be verified by using <code>strings</code> or simply opening it with a hex editor. This means that regardless of the Ubuntu install from which <code>grub-install</code> was executed, only the system that <em>most recently</em> installed the loader in the <em>default location</em> <code>/EFI/ubuntu</code> was actually able to change the partition that grub would boot from. (I think I found out about that hard-coded string from some bug report, which I will try to find and reference here.)</p>
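<p>You can reproduce this kind of inspection without a hex editor. The following self-contained sketch builds a mock "EFI binary" with an embedded path and greps for it – against a real system, you would point <code>grep -a</code> (or <code>strings</code>) at <code>grubx64.efi</code> instead:</p>

```shell
# Create a mock binary containing NUL bytes and an embedded config path,
# then search it the same way one would inspect the real grubx64.efi:
printf 'MZ\000binary junk\000/EFI/ubuntu\000more junk' > /tmp/mock.efi
grep -a -o '/EFI/ubuntu' /tmp/mock.efi   # prints /EFI/ubuntu
```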
<p>If the hard-coded string is modified to reflect the actual location of the boot entry in the <code>/EFI</code> directory everything works as expected (with secure boot enforcing disabled)! Now why would the Ubuntu team be so stupid as to hard-code a string that obviously would better be supplied by a parameter? The answer is secure boot: If you enable signature enforcing in the EFI configuration, the modified bootloader stops working. It seems that the string within the binary is covered by the asymmetric signature Canonical uses to certify their bootloader; they could either modify it (and break all systems where secure boot is enforced) or leave it as it is (and break multi-booting). They seem to have decided on the latter. (Maybe there is also a third way, where the configuration file location could be encoded relative to the binary location, i.e. <code>./grub.cfg</code>, but I don't know enough about EFI to say whether such a thing is possible.)</p>
<p>As I later realized, there is an easier way than modifying the signed grub binary. Since secure boot doesn't work with the modified loader anyway, I tried to invoke <code>grub-install</code> with the <code>--no-uefi-secure-boot</code> parameter and examined the resulting bootloader: Without secure boot, there is only a single EFI executable that is also called <code>grubx64.efi</code> (which confused me to no end, since the other files are <em>not</em> cleaned up by <code>grub-install</code>, and I assumed that the configuration file was still working), but it has a much simpler internal structure and, importantly, has the boot disk location hardcoded. This wasn't as easy to find as the suspicious <code>/EFI/ubuntu</code> string – it is only some kind of relative disk ID like <code>(,gpt2)</code> if your boot partition is the second partition of the volume on which the EFI partition resides, but a complete disk UUID if the boot partition is located on a <em>different</em> disk.</p>
<p>Finally, here is the complete guide on how to install two Ubuntu versions on a single disk:</p>
<ul>
<li>Disable secure boot in your EFI settings.</li>
<li>Install the first Ubuntu system on the disk. (If it already exists and you have spare space on your disk, you can obviously skip this.)</li>
<li>
<p>Backup the boot entry of the first disk by reinstalling it from within the system using a different name, without secure boot:</p>
<div class="highlight"><pre>grub-install --bootloader-id=myfirstubuntu --no-uefi-secure-boot
</pre></div>
</li>
<li>
<p>Install the second system using the regular Ubuntu installer where you want it. This will break the boot entry of the first system called <code>ubuntu</code>, but not the backup you just created.</p>
</li>
<li>Boot the second system and create a backup of the bootloader, e.g. <code>grub-install --bootloader-id=mysecondubuntu --no-uefi-secure-boot</code></li>
<li>(only if you want to primarily boot the first system) Boot the first system using your computer's EFI boot menu and execute <code>grub-install</code> without any parameters.</li>
</ul>
<p>Congratulations, you now have <em>two</em> Ubuntus running on a single machine!</p>
<p>If you want to use a similar setup, but using more than one disk, you can basically use the same steps if you don't mind that the same EFI partition of the first disk will be used for both systems, which means that you can never format or remove that disk without also disrupting the system on the other disk. grub will just put a pointer to the second disk in its binary that is executed from the EFI partition on the first disk, which should theoretically even survive partition and disk renumbering (but don't count on it!).</p>
<p>If that is a problem for you, there is also the possibility of using a second EFI partition on the second disk, but the Ubuntu installer will make your life even harder by stubbornly insisting on using the EFI partition on the first disk; I was able to solve this only by creating a backup of the first system's bootloader as described above, installing the second system, mounting the second EFI partition at <code>/boot/efi</code> instead of the first one and rerunning <code>grub-install --bootloader-id=...</code>.</p>
<p>You can verify that everything has been set up as you want by examining the EFI directory on the EFI system partition(s) on your disk(s) as well as the output of <code>efibootmgr -v</code>, which lists the content of the boot list in the EFI NVRAM.</p>
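<p>In practice that means checking both views against each other, for example like this (output omitted here, as it is specific to each machine):</p>
<div class="highlight"><pre>ls /boot/efi/EFI     # one subdirectory per installed loader
sudo efibootmgr -v   # NVRAM boot entries with partition GUIDs and loader paths
</pre></div>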
<p>There is also an option <code>--removable</code> which supposedly sets up the EFI directory on a removable device, which looks a bit different from the one for internal devices and, importantly, doesn't create an NVRAM entry (which wouldn't be available on different machines anyway). You might be able to use that to boot from an internal disk too, but I have not tried that approach.</p>
<p>Of course, if that sounds like a lot of headache and your computer still supports the legacy BIOS boot process (a.k.a. CSM in EFI parlance), you can just install the second system on a different disk with an MBR bootloader and configure your EFI for both CSM and EFI booting if it supports that; then you can just select the EFI entry of the first system or the second disk (which will start the second system's boot loader) in the EFI boot menu.</p>
<p>Let me know in the comments if you actually made it through that big wall of text and were able to solve your EFI boot problems!</p>
<h1>Update (2015-12-11):</h1>
<p>There is an <a href="https://www.kubuntuforums.net/archive/index.php/t-68588.html">interesting discussion</a> about the whole topic in the Kubuntu forums. Apparently, it should also be possible to use multiple EFI partitions to get multiple instances of Ubuntu working with secure boot. Thanks for that idea, and sorry for being unreachable. I will probably have to add my mail address here sometime. In the meantime, you can try "me" at lxgr dot net.</p>
<h1>How to fix video tearing on Chrome/Chromium and Compiz</h1>
<p>lxgr, 2015-04-27</p>
<p>One thing I really like about Netflix is their excellent device and browser support. Unlike a certain other streaming service (the one from the company also selling books and clouds), which wouldn't allow watching their streams using an Android tablet (bizarrely, smartphones were somehow allowed...?) and requires Flash and/or Silverlight in the browser, Netflix only requires a browser that supports the <a href="https://en.wikipedia.org/wiki/Media_Source_Extensions">HTML5 Media Source Extensions</a> (plain "old" HTML5 video tag support is not enough), the <a href="http://www.w3.org/TR/encrypted-media/">HTML5 DRM extensions</a> a.k.a. EME and one of the three supported DRM plugins (Microsoft's Playready, Apple's Fairplay, or Google's Widevine; used by and shipping with Windows/IE, Safari on OS X and Chrome, respectively). Of course, the DRM requirement is annoying (somehow, these things tend to be broken sooner rather than later and only make things inconvenient for legitimate customers), but it is much better than those horribly outdated and inefficient browser plugins.</p>
<p>The only thing bothering me was a particularly annoying case of <a href="https://en.wikipedia.org/wiki/Screen_tearing">screen tearing</a> when using Chrome in fullscreen on my laptop running Unity/Ubuntu. This was never an issue for me with other browsers (e.g. Firefox), video players or games, so I initially suspected a bug in Chromium and <a href="http://code.google.com/p/chromium/issues/detail?id=344141">filed a bug report</a>. Thanks to some <a href="https://code.google.com/p/chromium/issues/detail?id=344141#c20">very helpful comments</a> on that bug's discussion thread, I have been able to finally understand what is going on.</p>
<p>However, as it turns out, the problem seems to be actually caused by Compiz (the default window manager of Unity on Ubuntu), or more specifically, its "unredirect output" feature for fullscreen applications. Compiz is a <a href="https://en.wikipedia.org/wiki/Compositing_window_manager">compositing window manager</a>, which means that applications do not draw directly to the framebuffer, but to a texture in video memory; the window manager then uses the GPU to display all windows at their respective positions. This is both more efficient (for example, dragging a window doesn't require all affected applications to redraw their content up to 60 times per second anymore) and visually pleasing (it prevents those <a href="https://msdn.microsoft.com/en-us/library/windows/desktop/ff684179%28v=vs.85%29.aspx">ugly broken windows</a> that appear when dragging a window over another window whose application is not responding to redraw calls anymore).</p>
<p>Of course, while compositing all those windows/textures on the GPU is very efficient, it is still not free; especially when only a single full-screen application like a video player is being displayed, compositing is only a waste of resources. "Unredirect output" seems to allow such full-screen applications to again draw directly to the framebuffer (as opposed to a texture). However, some applications seem to have problems doing just that; they somehow get their timing wrong and draw their window contents at the wrong moments (i.e. not during the "vertical sync period" a.k.a. VSync), which leads to visual tearing.</p>
<p>It turns out that Compiz (at least on my distribution, Ubuntu 14.10) already comes with a pre-populated list of applications that are excluded from unredirecting. That list explains why I was experiencing tearing in Chrome, but not in any other application: All video players and other applications I have tried are preloaded on that list – except Chrome and Chromium!</p>
<div class="highlight"><pre>(any) & !(class=Totem) & !(class=MPlayer) & !(class=Vlc) & !(class=Plugin-container) & !(class=Firefox)
</pre></div>
<p>Adding filter clauses for both Chrome and Chromium completely fixes the issue for me (the list can be accessed and modified in the "Composite" tab of the "CompizConfig Settings Manager", which should be available in your distribution if it ships with Compiz):</p>
<div class="highlight"><pre>[...] & !(class=^Google-chrome) & !(class=Chromium)
</pre></div>
<p>This fix should solve the problem regardless of your graphics card manufacturer; if you are using an Intel GPU, you might also have luck with enabling the <a href="https://wiki.archlinux.org/index.php/Intel_graphics#Tear-free_video">TearFree</a> option of the video driver, which might or might not be more efficient and/or cause other problems with your graphics – I have decided to use the Compiz fix, since it aligns with the way all other applications already are drawing their fullscreen windows on my system.</p>
<p>If you are experiencing the same problem, let me know in the comments if this fix helps!</p>
<h1>vim and that weird one-second startup delay</h1>
<p>lxgr, 2014-05-15</p>
<p>Are you using <code>vim</code>, <code>tmux</code> and a graphical Linux desktop, and are you experiencing random sluggishness when starting your editor? If not, you can skip this one.</p>
<p>This is something that had been bugging me for ages, first at work on my workstation, then at home: Long-running <code>tmux</code> sessions would sporadically induce startup delays of the <code>vim</code> editor of exactly one second. Reattaching <code>tmux</code> didn't solve the problem; logging out and back into my desktop always did.</p>
<p>First I thought I was just being impatient, but after some profiling with <code>time</code>, I was getting curious. <code>strace</code> revealed that the delay was <em>exactly</em> one second: Something in <code>vim</code>'s startup process was calling the <code>nanosleep(2)</code> system call with one second as an argument!</p>
<p>To make a long story short: This is caused by some X library that is misled by a broken environment variable <code>SESSION_MANAGER</code> from a former X session. <code>tmux</code> tends to get rather attached to environment variables, which, in this instance, causes the sluggishness.</p>
<p>If the problem goes away after executing</p>
<div class="highlight"><pre>unset SESSION_MANAGER
</pre></div>
<p>or something similar for your shell, you can fix it permanently by appending the following line to your <code>.tmux.conf</code>:</p>
<div class="highlight"><pre>set-option -g -a update-environment " SESSION_MANAGER"
</pre></div>
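<p>To double-check the diagnosis before touching your configuration, you can time <code>vim</code>'s startup with and without the variable (a rough check; requires a <code>vim</code> built with X support and a stale <code>SESSION_MANAGER</code> in the environment):</p>
<div class="highlight"><pre>time vim +q                          # delayed by ~1s with the stale variable
time env -u SESSION_MANAGER vim +q   # should start instantly
</pre></div>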
<p>If you are now wondering why <code>vim</code> would need access to some X-related variables in the first place (as I was): It lets <code>vim</code> access your X clipboard! (Strangely, the variable <code>SESSION_MANAGER</code> is not actually needed for that, but you can verify it by overwriting some more critical X variable like <code>DISPLAY</code> or <code>XAUTHORITY</code> and subsequently trying to use the X clipboard from within <code>vim</code>.)</p>
<h1>On agents and keychains (Part 3)</h1>
<p>lxgr, 2014-05-12</p>
<p>In the previous posts of this series, I've <a href="//blog.lxgr.net/posts/2014/05/10/on-agents-and-keychains-part1/">described the operating environment</a> of a password or private key agent and <a href="//blog.lxgr.net/posts/2014/05/11/on-agents-and-keychains-part2/">given a summary of their tasks</a>. This time, we'll see how some real-world agents are implemented.</p>
<p>But before that, a disclaimer: I'm merely an interested observer of all of the tools mentioned below. All my knowledge is from looking at their documentation, source code and from practical experiments. If you plan to use any of them for your private, sensitive data, you should definitely not rely solely on this analysis.</p>
<h1>Part 3: Real-world password and private key agents</h1>
<h2><code>ssh-agent</code></h2>
<p>The first tool we'll be looking at is my personal favorite of the batch: <a href="http://manpages.ubuntu.com/manpages/trusty/en/man1/ssh-agent.1.html"><code>ssh-agent</code></a>. Its job is to protect a user's private SSH authentication keys.</p>
<p>Usually, those keys are stored in the user's home directory, encrypted with a symmetric key derived from a passphrase that has to be entered every time the key is used to connect to a remote server using SSH; <code>ssh-agent</code> was developed to avoid having to type it that often.</p>
<p>When an instance of <code>ssh-agent</code> is started, it creates a <a href="https://en.wikipedia.org/wiki/Unix_domain_socket">Unix domain socket</a>; the file system path of that socket will usually be stored in an environment variable called <code>SSH_AUTH_SOCK</code>. Starting the agent and setting the variable is usually handled by a few lines in the user's desktop and/or shell configuration files. This socket is then used by <code>ssh-agent</code>'s clients to request various operations.</p>
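<p>Starting the agent manually and wiring up the variable typically looks like this:</p>
<div class="highlight"><pre>eval "$(ssh-agent -s)"   # prints and sets SSH_AUTH_SOCK and SSH_AGENT_PID
echo "$SSH_AUTH_SOCK"    # e.g. /tmp/ssh-XXXXXXXX/agent.1234
</pre></div>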
<p>First of all, to be of any use, the private keys have to be actually loaded into the memory of the agent. This is performed by a tool called <a href="http://manpages.ubuntu.com/manpages/trusty/en/man1/ssh-add.1.html"><code>ssh-add</code></a>, which basically asks the user for their passphrase and the location of their private keys, decrypts them in memory, and sends them over the Unix socket to the agent.</p>
<p>The nice thing about <code>ssh-agent</code>'s <a href="http://www.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/PROTOCOL.agent?rev=HEAD;content-type=text%2Fplain">protocol</a> spoken over the socket is this: There is no command to extract a private key from it. Clients (mostly instances of the SSH client, <code>ssh</code>, really) can request the agent to sign some data on their behalf, which in turn allows them to authenticate against a remote SSH server. There are some other commands (e.g. to remove some or all private keys, temporarily lock or unlock the agent with a password, or get a list of currently loaded keys), but except for security bugs or other side channels, there is no way to make the agent reveal the private keys.</p>
<p>As I've mentioned, a security tool can only be as secure as the environment it's running in and whose security measures it relies on. In the case of <code>ssh-agent</code>, this is the user's Unix account, and in many cases some graphical desktop environment. <code>ssh-agent</code> (or at least the version of OpenSSH included in Ubuntu) tries to limit the ways other applications in the same context can interact with its virtual memory by <a href="http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/trusty/openssh/trusty/view/head:/ssh-agent.c#L1154">disabling</a> the <code>ptrace(2)</code> facility of the operating system. I'll write a lot more on that in a future post, but for now, it suffices to say that this (hopefully) makes it impossible for other processes to peek into an agent's memory space (using <code>gdb</code> or the <code>/proc/<pid>/mem</code> device).</p>
<h2><code>gpg-agent</code></h2>
<p>The next tool on our list seems to be quite similar to <code>ssh-agent</code>: <a href="https://www.gnupg.org/documentation/manuals/gnupg/Invoking-GPG_002dAGENT.html"><code>gpg-agent</code></a> also protects private keys, uses a Unix socket and an environment variable to answer to requests, and runs with the user's permissions, started in one of the various startup scripts of the desktop environment. It is used to protect a user's private (or secret) <a href="https://www.gnupg.org/">GnuPG</a> encryption and signature keys.</p>
<p>Unfortunately, the similarity ends when it comes to how <code>gpg-agent</code> protects the user's private keys. In fact, I couldn't believe my eyes when I ran <code>strace</code> on an instance of <code>gpg</code> while executing a private-key operation:</p>
<div class="highlight"><pre><span class="gp">$</span> strace gpg --armor --gen-revoke 2F5BBF5C
<span class="go">write(8, "GET_PASSPHRASE 1AA19BADB016B8BF3"..., 203) = 203</span>
<span class="gp">#</span> <span class="o">[</span>...<span class="o">]</span>
<span class="go">read(8, "OK 70617373776F7264", 1002) = 19</span>
</pre></div>
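<p>That <code>OK</code> reply is the passphrase itself, merely hex-encoded. A quick decode in pure bash confirms it (using the hex string from the trace above):</p>

```shell
# Decode the hex-encoded reply seen in the strace output:
hex=70617373776F7264
decoded=""
for ((i = 0; i < ${#hex}; i += 2)); do
    decoded+=$(printf "\x${hex:i:2}")
done
echo "$decoded"   # prints: password
```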
<p><code>gpg-agent</code> is not a private key agent at all! It merely caches the private key passphrase, handing it out to anyone asking nicely over the Unix domain socket (which means every application running with the user's privileges). This negates almost all of the security benefits of using an agent in the first place, and on top of that, it is even <em>less</em> secure than just storing the private key in the agent's memory: If the user's session is compromised, not only the private key, but also the passphrase can be recovered by an attacker.</p>
<p>I can only guess that there are various historical reasons for <code>gpg-agent</code>'s architecture, but <code>ssh-agent</code> shows that there is a better way to handle private key caching in the userspace.</p>
<h2>GNOME Keyring</h2>
<p>GNOME is used as the default desktop environment for many Linux distributions; and even more are using only some parts while providing their own user interface (window manager, compositor, default applications etc.) – Ubuntu is a famous example of the latter category.</p>
<p>The GNOME applications include a handy tool called <a href="https://wiki.gnome.org/Projects/GnomeKeyring/">GNOME Keyring</a>, which is primarily a password manager, but can also act as a private-key manager for both SSH and GnuPG. I'm not using its private-key features any more for various reasons, but it is still my password manager of choice for everything else (primarily for web browsers).</p>
<p>The documentation page of the software is <a href="https://wiki.gnome.org/Projects/GnomeKeyring/SecurityPhilosophy">very upfront</a> about what the tool can and cannot achieve: The developers openly state that for the current desktop architecture, secure privilege separation between applications is simply not possible.</p>
<p>There used to be some kind of access control system for applications, but as it is now, every application running with the user's privileges can store and request plaintext passwords to and from the agent, which in turn stores them in an encrypted database in the user's home directory.</p>
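<p>This is easy to demonstrate with libsecret's command-line client (a sketch; the <code>secret-tool</code> utility ships separately from GNOME Keyring itself, and any process running as the user can do the same):</p>
<div class="highlight"><pre>secret-tool store --label='demo' service demo-svc user demo-user   # prompts for the secret
secret-tool lookup service demo-svc user demo-user                 # prints it in plaintext
</pre></div>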
<p>The encryption key is derived from a passphrase defined by the user; if it is equal to their Unix login password, the keychain is conveniently "unlocked" (i.e., the encryption key is stored in the Keyring daemon's memory) as soon as the user logs in to their desktop.</p>
<p><code>gnome-keyring</code> doesn't perform any security theater to make it seem as if the passwords of an unlocked keyring were somehow more secure than they are, but like <code>ssh-agent</code>, its memory is protected from access by other user processes on my system (because its binary has a <a href="http://manpages.ubuntu.com/manpages/trusty/man7/capabilities.7.html">Linux capability</a> enabled).</p>
<h2>OS X Keychain</h2>
<p>Like GNOME, OS X provides users with a way to securely store their passwords on the disk while still granting automatic access to other applications as long as the user is logged in: <a href="https://developer.apple.com/library/mac/documentation/security/conceptual/keychainServConcepts/02concepts/concepts.html">OS X Keychain</a>. However, it aims to go even further than that: By using an ACL based on the code signature or binary hash of requesting programs, it <a href="https://developer.apple.com/library/mac/documentation/security/conceptual/keychainServConcepts/02concepts/concepts.html#//apple_ref/doc/uid/TP30000897-CH204-CJBIBIBC">restricts access</a> to the stored password to a subset of all applications running with the user's permissions.</p>
<p>The OS X desktop environment, at least theoretically, also seems to implement more security measures than X11: Applications do not seem to be able to <a href="https://developer.apple.com/library/mac/documentation/cocoa/conceptual/eventoverview/MonitoringEvents/MonitoringEvents.html">install global keyloggers</a> without first requesting special permissions from the user, and the <code>ptrace</code> system call (or its cousin, <code>task_for_pid</code>, as it is known on OS X) is only available to <a href="https://blogs.oracle.com/dns/entry/understanding_the_authorization_framework_on">privileged users</a> or <a href="http://wiki.lazarus.freepascal.org/GDB_on_OS_X_Mavericks_and_Xcode_5">signed debugging tools</a> (which in turn require user authentication).</p>
<p>Additionally, the Keychain service seems to be running with superuser privileges, so it might theoretically be able to perform some additional verifications of the process requesting a password (maybe the aforementioned checks of the binary hash and/or code signature).</p>
<p>But is that really enough to protect the stored passwords against all potentially malicious, <a href="http://juusosalonen.com/post/30923743427/breaking-into-the-os-x-keychain">non-root</a> accesses? That will be the topic of the article concluding this little series, but before that, we'll see how process memory isolation of binaries running with the same user permissions could possibly be achieved.</p>On agents and keychains (Part 2)2014-05-11T13:19:00+02:00lxgrtag:blog.lxgr.net,2014-05-11:posts/2014/05/11/on-agents-and-keychains-part2/<p>In the <a href="//blog.lxgr.net/posts/2014/05/10/on-agents-and-keychains-part1/">previous post</a> of this series, I've roughly described the operating environment of a password or private key agent; this time, I'll try to summarize the basic structure and tasks of such an agent.</p>
<h1>Part 2: What does an agent do?</h1>
<p>The job of a password or private key agent is to protect, but also to share, secrets. In the case of a password manager, the secrets are the plaintext authentication tokens, or passwords, for various user accounts and services – e.g. web and mail passwords, Wi-Fi pre-shared keys and many more; a private key agent like <code>ssh-agent</code> or <code>gpg-agent</code> protects one or more private keys used for signing and/or encryption of messages or for use in authentication protocols.</p>
<p>The entities that the secrets are (or are not) shared with are other processes on the user's computer, running with the permissions of their user identity.</p>
<h2>Restricting access to internal state</h2>
<p>In order to protect a user's secrets, the agent has to have some way of actually keeping them from other processes. At least on a classical Unix system, this is a difficult task when running with the same permissions as those processes: As I've tried to explain in the last article, the permission model is not really designed for privilege separation of user applications.</p>
<p>In fact, the task at hand might be better solved by a classical Unix daemon: a trusted server process running under a different user ID (either the superuser's or one specifically created for the daemon). (This is the approach Apple took for their OS X Keychain.)</p>
<p>However, this is not the approach taken by some popular agents – but that's actually the topic of another article in this series.</p>
<h2>Sharing secrets with trusted applications</h2>
<p>While keeping secrets is an important task for an agent, there has to be some way for trusted applications to gain direct (in the case of passwords) or indirect (for private keys) access to those secrets – an agent that simply keeps all the secrets to itself is perfectly secure, but also perfectly useless.</p>
<p>The concept of a "trusted application" is trickier than it might seem: How would the trustworthiness of an application be defined in the first place? One might be tempted to enumerate a set of such trusted applications, e.g. the web browser(s) of a user for web authentication passwords, their mail client for their mail passwords and so on. But how does the agent actually identify the requesting entity?</p>
<p>Any approaches based on heuristics like the name of the executable file of the requesting process don't work: Users can install binaries in their home directory, where they can be freely replaced or modified by attackers.</p>
<p>A more sophisticated way to ensure the integrity of a requesting process would be to hash the contents of its executable file and store that hash in the agent along with the secrets. This doesn't seem trivial to do, at least in user space – on Linux, the agent would probably have to work with the <code>/proc</code> virtual file system to identify the executable of a process, but any such checks would very likely be susceptible to <a href="https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use">TOCTOU</a> vulnerabilities. The operating system might theoretically be able to provide the agent with a trustworthy hash of a requestor, though – but I suspect that this is not possible on the stock Linux kernel, for example.</p>
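<p>A minimal sketch of such a check on Linux shows both the idea and its weakness (the helper below is hypothetical, not taken from any real agent):</p>

```python
import hashlib
import os

def executable_hash(pid: int) -> str:
    """Hash the executable backing a process, via /proc (Linux only).

    Hypothetical helper: an agent could compare this hash against a
    stored value before releasing a secret. Note the inherent race:
    between resolving the link and reading the file, the binary on
    disk can be swapped out -- a classic TOCTOU window.
    """
    exe_path = os.readlink(f"/proc/{pid}/exe")
    h = hashlib.sha256()
    with open(exe_path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

<p>Even ignoring the race, this only identifies the file on disk – it says nothing about what code the process is executing <em>right now</em> (think injected libraries or interpreters running attacker-supplied scripts).</p>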
<p>An alternative to hashing the requesting binary is code signing. Operating systems that allow executable files to be signed by their developers or some other entity could provide the agent with the identity of the signer, which would allow "safe" modifications of the requester by its original developer or a system administrator (e.g. software updates or security patches).</p>
<p>Unfortunately, even if the authenticity of the requestor could be determined beyond doubt, this is still not enough: What if the attacker is able to coerce an otherwise trusted application into making a request on their behalf and revealing the reply? This scenario will be part of another future article.</p>
<h2>Limitations to the agent model</h2>
<p>Before examining some agents in detail, it should be said that there are some fundamental limitations to what even a perfectly secure agent can achieve. If the user's machine (or even only his own account) is compromised in a global way, e.g. by an attacker that installs a key logger or is able to remotely control a user's session, the security benefits of the agent might very well be completely negated.</p>
<p>In other words, it is probably sufficient for an agent to be as secure as the environment it is operating in. An OS that does not provide even basic application isolation for graphical applications is very hard to protect indeed. (Unfortunately, almost all X11-based desktop environments are all but impossible to secure against untrustworthy applications.)</p>
<p>Even then, an agent might still be of some utility on such an untrustworthy system, as long as its task is only to grant indirect access to sensitive data. This is exactly the situation for an SSH or GPG agent: By design, such an agent will never expose a user's private keys, but will only execute various private key operations on requestor-provided data items.</p>
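<p>That indirection is easy to sketch: the key lives inside the agent, and only operations on caller-supplied data cross the boundary. (A toy, with HMAC standing in for a real private-key signature.)</p>

```python
import hashlib
import hmac
import os

class ToyAgent:
    """Minimal sketch of the agent model: the secret stays inside the
    agent; clients can only request operations on data they supply."""

    def __init__(self) -> None:
        self.__key = os.urandom(32)  # generated internally, never handed out

    def sign(self, data: bytes) -> bytes:
        # The only exposed operation: "sign" requestor-provided data.
        return hmac.new(self.__key, data, hashlib.sha256).digest()
```

<p>An attacker who can call <code>sign()</code> can use the key for as long as the compromise lasts, but can never extract it – which is precisely why the keys don't automatically have to be revoked afterwards.</p>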
<p>While an attacker with user privileges will be able to execute a number of such operations (e.g. logins on remote servers, decryption of single messages), after the compromise is detected, the keys do not necessarily have to be changed or revoked. (This might not be the case for some applications, though – e.g., if it is possible to sign new public keys with an existing private key and mark them as originating from the same user.)</p>On agents and keychains (Part 1)2014-05-10T14:22:00+02:00lxgrtag:blog.lxgr.net,2014-05-10:posts/2014/05/10/on-agents-and-keychains-part1/<p>Many people, myself included, use tools like ssh-agent or gpg-agent to protect their private keys from theft without sacrificing the convenience of password-less logins. Presumably even more people use some kind of password manager, whether that is the one included with their operating system or a third-party one. I've been using both for a long time, but only recently started to wonder about their internals: What is the threat model here, and how do those tools provide the necessary protection? This will be a series of posts on the subject; in this one, I will try to examine the necessity of such programs, and the way process separation is implemented in various operating systems.</p>
<h1>Part 1: The need for application isolation</h1>
<h2>Isolate what?</h2>
<p>Before looking at the tools in question, we have to look at the environment in which they are being used.
The various Unix operating systems have historically focused their security efforts on separating the actions of multiple users on a single system – i.e., Alice is not supposed to be able to read Bob's mail. Processes usually run with a user's permissions and are free to read and modify files in their home directory, as well as communicate with each other almost without restrictions (more on that later).</p>
<p>This model does not distinguish between a user and the programs they run. Whether a user runs a simple Unix command like <code>mv</code> or a complex application like a web browser, the operating system kernel assumes that all system calls made by the user's processes reflect the user's intentions.</p>
<p>While this assumption is still reasonable if all of the binaries our user might run are provided by the same people who provide their operating system, things start to get interesting once users bring their own software, whether voluntarily or accidentally (in the form of malware received through whatever vector).</p>
<p>Once the user runs any piece of "evil" code, they lose. The operating system will still isolate their requests from other users on the system, but that might be little consolation for a typical desktop user – more often than not, they are the only user on their system (at least as far as Unix permissions are concerned), and a malicious application running with their permissions amounts to a full system compromise.</p>
<p>At first glance, it seems impossible for a user of such a system to protect any piece of information against their own processes; still, the existence of password managers that use anything but an unencrypted plain-text database seems to indicate that at least their vendors think (or try to convince their customers to think) otherwise.</p>
<h2>A different model</h2>
<p>Before looking at the tools in question, I think it is interesting to examine some other ways of application privilege separation.</p>
<p>While most desktop systems essentially still operate under the same security paradigms, the situation is very different on mobile operating systems. (Presumably) inspired by the malware situation on the most common desktop operating systems, their creators have realized that in order to unleash an enormous wealth of third-party applications on their users by design, a stricter separation of privileges is in order.</p>
<p>Android has implemented application separation in a simple, yet very effective way: Every Android app has a unique Unix user account. This way, by default all application data is implicitly private. In order to use anything but their own data, applications have to use Android's library functions that will moderate access to system functions and potentially sensitive data. (Permissions to use those library functions are granted at the time an application is installed in an all-or-nothing fashion, but that is a design decision that could be modified to a more fine-grained model pretty easily.)</p>
<p>Apple's iOS uses a more traditional approach with regards to UIDs – all applications run as the user <code>mobile</code>. Sandboxing is instead explicitly implemented in the kernel, which restricts each application's system calls to a secure subset. Basically, reads and writes to anything but a list of allowed files and directories, as well as other security-critical system calls, will fail. Some exceptions to this can be granted by the user at run-time (e.g. access to the address book or location-based services).</p>
<h2>Apps on the desktop?</h2>
<p>Can such a security model be brought to the desktop without breaking almost any existing application? It seems that at least Apple thinks that this is the way to go. While it is still possible to run unrestricted applications on OS X, there is now also a sandboxing mechanism in place that allows a developer to whitelist the set of allowed system calls just like on iOS. This will presumably become a mandatory feature of new applications submitted to the Apple-curated Mac app store. Microsoft seems to be trying to do something similar with their Windows Store.</p>
<p>While this might solve the problem of application isolation, there are a lot of legacy applications that will probably never be ported to such a restrictive environment, and the third popular desktop operating system, Linux, does not provide such a sandboxed app-store model for obvious reasons.</p>
<p>Still, there are password managers and private key agents for all three major platforms – (how) do they work? This will be the topic for the following articles.</p>ssh-agent and the OS X Keychain2014-05-08T16:06:00+02:00lxgrtag:blog.lxgr.net,2014-05-08:posts/2014/05/08/ssh-agent-osx-keychain/<p>Are you relying on OS X's Keychain to protect your SSH key passphrases? You
shouldn't. (The "plain" ssh-agent is fine, though.)</p>
<p>To be continued!</p>How to fix slow DNS lookups on Ubuntu2013-11-18T22:03:00+01:00lxgrtag:blog.lxgr.net,2013-11-18:posts/2013/11/18/nsswitch-ubuntu-slow-dns-lookups/<p>If you're using a relatively recent version of Ubuntu, chances are that you have encountered spurious slowdowns that might be related to a very specific DNS failure. For me, it was the fact that <code>ping</code> to a host <em>without</em> a reverse DNS entry would only transmit a single ICMP request per second, even when a higher rate was specified via the <code>-i</code> option.</p>
<p>I've traced the DNS requests that are performed by <code>ping</code> by default (the effect does not occur when using the <code>-n</code> option, which disables host name lookup), and didn't notice anything out of the ordinary. The <code>NXDOMAIN</code> responses were occurring almost instantly, but nevertheless, it took precisely one second for this response to actually propagate to the <code>ping</code> process.</p>
<p>To make a long story short: The reason for this is that Ubuntu (or more precisely, the Name Service Switch) will by default try to look up DNS records not only via the regular DNS server configured in the network settings, but also using Zeroconf (a.k.a. Bonjour), a protocol that can be used to resolve hostnames locally by using multicast DNS requests and responses.</p>
<p>This is not an issue for DNS queries that can be answered positively by your regular DNS server (those will always take precedence over records received via Zeroconf), but it can be a problem for negative DNS responses (<code>NXDOMAIN</code>): When the resolving library receives one of those, it will try a Zeroconf lookup, and this can take a while – especially for a host that does not exist.</p>
<p>Since Zeroconf is only rarely used on Linux and is almost always limited to the <code>.local</code> top-level domain, this behavior seems useless at best, and can be pretty irritating.</p>
<p>To fix it, you can simply disable the Zeroconf DNS lookups in the configuration file <a href="http://man7.org/linux/man-pages/man5/nsswitch.conf.5.html"><code>/etc/nsswitch.conf</code></a> by changing the line</p>
<div class="highlight"><pre><span class="n">hosts</span><span class="o">:</span> <span class="n">files</span> <span class="n">myhostname</span> <span class="n">mdns4_minimal</span> <span class="o">[</span><span class="n">NOTFOUND</span><span class="o">=</span><span class="k">return</span><span class="o">]</span> <span class="n">dns</span> <span class="n">mdns4</span>
</pre></div>
<p>to</p>
<div class="highlight"><pre><span class="n">hosts</span><span class="o">:</span> <span class="n">files</span> <span class="n">myhostname</span> <span class="n">mdns4_minimal</span> <span class="o">[</span><span class="n">NOTFOUND</span><span class="o">=</span><span class="k">return</span><span class="o">]</span> <span class="n">dns</span>
</pre></div>
<p>This doesn't entirely disable Zeroconf – it only restricts the lookups to the <code>.local</code> domain, which is almost always the only place where they are useful anyway.</p>
<p>The effects should be immediately noticeable – just try to ping one of the previously slow to respond hosts and check if the ICMP requests are still limited to one per second.</p>
<p>If you think that this should be the default configuration for Ubuntu, you are not alone – there <a href="https://bugs.launchpad.net/ubuntu/+source/nss-mdns/+bug/94940">is a bug report</a> on Ubuntu's bug tracker that describes the problem, but since it's been known since 2007, I wouldn't bet on the default changing anytime soon.</p>Thoughts on a cloud-based password synchronization service2013-10-24T16:15:00+02:00lxgrtag:blog.lxgr.net,2013-10-24:posts/2013/10/24/thoughts-on-cloud-password-sync/<p>Today, Apple has enabled its cloud-based password synchronization service, iCloud Keychain. The service promises to safely store and synchronize passwords and other sensitive user data like credit card numbers among multiple devices. Apple claims that the information is protected with AES, but that alone is meaningless without knowing where that key is actually stored.</p>
<p>As usual, there is not much public documentation, but there is a <a href="http://support.apple.com/kb/HT5813?viewlocale=en_US">support document</a> that contains some interesting propositions:</p>
<ul>
<li>Adding a new device to an iCloud-synchronized Keychain displays a message on a previously registered device to accept or deny that new device. </li>
<li>When enabling Keychain sync, the user is given an option to create a backup code.</li>
<li>With the backup code, it's possible to recover the Keychain contents without the original device; without the code, (supposedly) not even Apple can access the contents.</li>
<li>The number of times a user can enter the security code is limited; Apple support can extend the limit, but eventually, the Keychain data will be deleted from the server.</li>
</ul>
<p>Starting from those propositions alone, I was wondering how it might be possible to implement a password storage and synchronization service that has all those properties. Is there a way to enable such a service without simply storing the AES key on the servers, and using the user password to retrieve it together with the data? The following is based on speculation alone; I haven't done any reverse engineering on the actual Keychain software or protocol.</p>
<p>The first statement about adding a new device sounds like there is some kind of key exchange going on, which the user can allow or deny. The new device could present a public key to the original device, and the old device could then encrypt the AES key with that public key. (Every iOS device already has at least one RSA key in the form of a certificate signed by Apple's certificate authority.) Without any kind of fingerprint verification, there is no way to verify that the public key actually belongs to the new device and not to some third party, though. </p>
<p>Disregarding any possible MITM attacks on the key exchange, this way of adding new devices could be used to safely share the password database and its encryption key among many devices. The shared key can also be used for efficient synchronization of future modifications to the database.</p>
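<p>The speculated key-wrapping step can be illustrated with textbook RSA – toy parameters, no padding, purely to show the shape of the exchange; a real implementation would use proper padding and, crucially, verified public keys:</p>

```python
# Toy textbook RSA key wrap -- illustration only, never use as-is.
p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17                        # new device's public exponent
d = pow(e, -1, phi)           # new device's private exponent

def wrap(key_int: int) -> int:
    """Old device: encrypt the database key under the new device's
    public key (n, e). Requires key_int < n."""
    return pow(key_int, e, n)

def unwrap(c: int) -> int:
    """New device: only the holder of d can recover the key."""
    return pow(c, d, n)
```

<p>Note that nothing in this exchange authenticates <code>(n, e)</code> – which is exactly the MITM gap described above, and why the user confirmation on the old device matters.</p>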
<p>The second and third statements about the backup code sound like a way to store a copy of the database encryption key on the server, which might be wrapped with yet another key derived from the backup code. The default strength of the backup code is only a four-digit number, which even when used with PBKDF2 with many iterations is barely more secure than plaintext, but it can be changed to a more secure alphanumeric password. When using a reasonably secure passphrase, this makes it impossible for the service provider to access the database contents.</p>
<p>The fourth claim about a limit to the number of attempts to enter the backup code could be implemented with a secure hash function. When the backup code is first created, it is not only used as an input to a key derivation function which is then used to wrap the database encryption key before it is sent to the server, but also hashed (optionally with a salt and a number of iterations). The resulting hash is also transmitted to the server together with the wrapped database key.</p>
<p>When the user later initiates a database restore, the server first transmits the salt (if there is one) to the client. The user then enters the backup code on the device, where it is hashed with the salt, and transmitted back to the server. Only when the response to that challenge is identical to the response stored on the server, the actual database will be sent to the client in its encrypted form. This way, the number of backup code attempts per second can be rate-limited on the server side.</p>
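<p>Under those assumptions, enrollment and the rate-limited restore check might look roughly like this – the names, salt handling and iteration count are mine, not Apple's actual protocol:</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000

def enroll(backup_code: str):
    """Client side (speculative sketch): derive a wrapping key and a
    separate server-side verifier from the backup code. Independent
    salts ensure the verifier reveals nothing about the wrapping key."""
    wrap_salt, verify_salt = os.urandom(16), os.urandom(16)
    wrap_key = hashlib.pbkdf2_hmac("sha256", backup_code.encode(),
                                   wrap_salt, ITERATIONS)
    verifier = hashlib.pbkdf2_hmac("sha256", backup_code.encode(),
                                   verify_salt, ITERATIONS)
    # Server stores (verify_salt, verifier) plus the wrapped database
    # key; wrap_key itself never leaves the client.
    return wrap_salt, verify_salt, verifier, wrap_key

def check_restore_attempt(backup_code: str, verify_salt: bytes,
                          verifier: bytes, attempts_left: int) -> bool:
    """Server side: only release the encrypted database if the
    challenge matches, and only while attempts remain."""
    if attempts_left <= 0:
        return False  # at this point, the stored data would be deleted
    candidate = hashlib.pbkdf2_hmac("sha256", backup_code.encode(),
                                    verify_salt, ITERATIONS)
    return hmac.compare_digest(candidate, verifier)
```

<p>The server never sees the backup code or the wrapping key – it can only count (and refuse) guesses.</p>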
<p>This would make it possible to prevent brute-force attacks on a weak backup code for other clients. Of course, it doesn't help against an untrustworthy service provider, who will be able to brute-force the encryption key without any limitation, since he necessarily holds the backup copy of the database and its wrapped encryption key.</p>
<p>I would be very interested in a detailed protocol analysis of Apple's solution, like the one that was recently published about the iMessage protocol. Using an architecture like the one outlined above would put Apple in a similar position as for iMessage with regards to lawful interception: While government access would be possible via a MITM attack on the device setup procedure, it wouldn't be as simple as demanding the user database and the corresponding encryption key. Everything else would more or less invalidate the unambiguous statement (as quoted from the <a href="http://support.apple.com/kb/HT5813?viewlocale=en_US">support page for iCloud Keychain</a>) regarding Apple's capabilities: "If you choose to not create an iCloud Security Code, Apple will not be able to recover your iCloud Keychain."</p>
<p>Of course, if a user chooses to use a four-digit numeric backup code (which the setup wizard proposes by default), the details of the implementation are rendered moot: There is no way such a weak password can provide any security against brute force attacks using any practical combination of hash function and iteration count. (This is probably also the reason why the service implements a rate-limiting feature for recovery access.) It would have been in the interest of Apple's user base to provide a strong, randomly generated alphanumeric string as a backup code by default, <a href="https://support.mozilla.org/en-US/kb/firefox-sync-data-secure-find-out-more">like Mozilla does</a> for their bookmark synchronization service.</p>
<p><strong>Update (2013-10-30):</strong>
Ars Technica has published an <a href="http://arstechnica.com/information-technology/2013/10/apple-claim-that-icloud-can-store-passwords-only-locally-seems-to-be-false/">interesting article</a> on the topic, with similar conclusions. They claim that there is a different recovery process depending on whether a four-digit security code or an actual high-entropy password is used, which is somewhat strange (if there is really no server-side brute-force protection for alphanumeric passwords, a four-digit passcode could actually provide better protection than a five-character alphanumeric password). Using a high-entropy password seems like the better choice in any case.</p>Safe deterministic (EC)DSA signatures are coming to OpenSSL2013-08-30T00:37:00+02:00lxgrtag:blog.lxgr.net,2013-08-30:posts/2013/08/30/deterministic-dsa-signatures-openssl/<p>By now, everybody involved in implementing algorithms using the DSA or the ECDSA signature schemes should <em>really</em> understand <a href="http://tools.ietf.org/html/rfc6979">the importance of a proper secret nonce</a> as one of the inputs for a signature.</p>
<p>This is easy to get wrong, both because PRNGs <a href="http://www.debian.org/security/2008/dsa-1571">are</a> <a href="http://kakaroto.homelinux.net/2012/01/how-the-ecdsa-algorithm-works/">really</a>, <a href="http://jbp.io/2013/08/15/android-securerandom-guess/">really</a>, <a href="http://android-developers.blogspot.com/2013/08/some-securerandom-thoughts.html">really</a> <a href="http://armoredbarista.blogspot.com/2013/03/randomly-failed-weaknesses-in-java.html">hard</a> to get right, and because not everybody implementing/using (EC)DSA expected to be needing randomness just for <em>signing stuff</em> (as opposed to creating key pairs).</p>
<p>Fortunately, there is a way out. <a href="http://crypto.stackexchange.com/users/452/poncho">Poncho on Stackexchange Crypto</a> has notified me about <a href="http://tools.ietf.org/html/rfc6979">an interesting RFC</a> in the comments on a <a href="http://crypto.stackexchange.com/a/9939/2538">nice answer</a> to a related question.</p>
<p>The really clever idea is that there is another way to (probabilistically) ensure that a secret nonce is used for every signature than just using a PRNG and hoping for the best.</p>
<p>Since reusing the same nonce for <em>the same message</em> signed by the same key will always give the same signature as an output (there are no other inputs to the signature algorithm), we just have to guarantee that the nonce is different and unpredictable for <em>different</em> messages.</p>
<p>By using a hash of the message and the private key as the nonce, these conditions can be satisfied even without a proper PRNG. Even better, it's possible to hash them together with some random data to provide backwards compatibility to implementations that react badly to deterministic (EC)DSA signatures. (Maybe some regression tests might interpret the lack of randomness as a fatal design flaw.)</p>
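<p>A compressed version of the idea looks like this. RFC 6979 itself specifies a more careful HMAC-DRBG loop (which, among other things, avoids the modulo bias taken as acceptable below), but the shape is the same:</p>

```python
import hashlib
import hmac

# Stand-in prime for the order of the signature group; a real
# implementation would use the curve's actual group order.
q = (1 << 255) - 19

def deterministic_nonce(private_key: bytes, message: bytes,
                        extra_entropy: bytes = b"") -> int:
    """Derive the signature nonce from the private key and the message,
    so the same (key, message) pair always yields the same nonce, and
    different messages yield unpredictable, distinct nonces.
    Optional extra randomness restores non-determinism (for
    compatibility) without ever risking nonce reuse."""
    digest = hmac.new(private_key,
                      hashlib.sha256(message).digest() + extra_entropy,
                      hashlib.sha256).digest()
    return int.from_bytes(digest, "big") % q
```

<p>Signing the same message twice now produces the same signature – which is harmless – while an attacker without the private key can't predict the nonce for any message.</p>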
<p>An (informal) RFC is nice, but actual code is even nicer, so I'm very happy that <a href="http://git.openssl.org/gitweb/?p=openssl.git;a=commitdiff;h=190c615d4398cc6c8b61eb7881d7409314529a75">a patch</a> implementing this method and <em>making it the default</em> in OpenSSL has already been accepted to the development version.</p>
<p>If you're interested in the details, there's <a href="https://www.imperialviolet.org/2013/06/15/suddendeathentropy.html">a blog post</a> by the author that has some more details.</p>
<p>I'm really looking forward to this patch shipping in a lot of OpenSSL binaries, whether as part of a distribution or embedded in some other software – there have been more than enough fatal PRNG-related (EC)DSA failures in the past for my liking.</p>TLS client certificates and Mobile Safari2013-08-27T16:02:00+02:00lxgrtag:blog.lxgr.net,2013-08-27:posts/2013/08/27/tls-client-certs-safari/<p><strong>Update (2013-08-31):</strong> Apple has asked me to refrain from publishing any details on this security-relevant bug for the time being; I hope that a fix will be released soon. When that happens (or after a reasonable amount of time has passed), the original post will be restored.</p>
<p>Until then, I would strongly advise against using Mobile Safari when any X.509 client certificates are stored on an iOS device, e.g. an S/MIME encryption/signing certificate. Other browsers on iOS, like Chrome, are not affected; neither are browsers on OS X (including Safari).</p>
<p><strong>Second update (2013-10-23):</strong> Since my original post, iOS 7 has been released; the bug described below seems to have been fixed. The issue is of course still present in iOS <= 6.1.4. Since it seems to be Apple's policy not to release security fixes for discontinued OS versions, this leaves older devices like the original iPad and the iPod touch (up to the 4th generation) vulnerable. That's unfortunate, but since I'm definitely not the only one who knows about the issue, here is my original post. Be sure to take care when using any client certificates on an older iOS device.</p>
<p><strong>tl;dr:</strong> If you have an S/MIME or other X.509 client certificate installed on your iOS device, Mobile Safari will hand it out to any web server that asks for it – without asking you.</p>
<p>Recently, I've looked into TLS with client certificates, specifically into how the various browsers and operating systems implement them.</p>
<p>In addition to authenticating a server and securing a connection between this server and an anonymous client, TLS also allows the client to identify itself to the server using its own X.509 certificate. This mode is only used by very few services using TLS, which could be attributed to the difficulty of issuing client certificates in the first place, and protecting them against both theft and loss later on.</p>
<p>However, I think that there are more issues with client certificates than that.</p>
<p>First of all, the client certificate is transmitted to the server <a href="https://tools.ietf.org/html/rfc5246#section-7.4.4">unencrypted</a>, which means that everybody between the client and the server is able to identify the user trying to connect. Since an X.509 certificate frequently contains personal information like the user's full name and mail address, this seems like a bad thing to do.</p>
<p>Additionally, TLS client certificates are used in a way that doesn't provide <a href="http://en.wikipedia.org/wiki/Deniable_authentication">deniable authentication</a>. To prove that the client is in possession of the private key corresponding to the X.509 certificate, it <a href="https://tools.ietf.org/html/rfc5246#section-7.4.8">signs all previous handshake messages</a>. Among other things, these contain a (client-provided) timestamp and the server certificate; the signature over those values <a href="http://crypto.stackexchange.com/questions/5455/does-a-trace-of-ssl-packets-provide-a-proof-of-data-authenticity">can be used to prove</a> that somebody with access to the private key initiated a connection to a specific server at a specific time. Even worse, this signature is also transmitted in plaintext (symmetric encryption and authentication aren't used before the next message, Finished, in the handshake).</p>
<p>Considering those (in my opinion substantial) disadvantages of the implementation of client certificate authentication in the current version of TLS, it might be better to perform authentication inside the secure TLS channel at the application layer, which is exactly how it's done for the vast majority of web services (via HTTP cookies) and other protocols protected by TLS.</p>
<p>(An even better solution would be a TLS extension that moves the client authentication inside the secure channel, or even uses something analogous to the server authentication in TLS, which might be able to provide deniable authentication for the client as well. But the adoption of TLS extensions and updates by software vendors is not exactly fast.)</p>
<p>Since the status quo seems to be exactly that (whether that's due to the difficulty of issuing certificates or to the mentioned disadvantages of them with TLS), is there anything left to worry about?</p>
<p>There is: Broken browsers.</p>
<p>Probably due to their minimal use in real-world applications, some browsers' TLS client certificate implementations are a bit sloppy. When an HTTP server requests a client certificate (using the <a href="https://tools.ietf.org/html/rfc5246#section-7.4.4">Certificate Request</a> message in the TLS handshake), most of them display a pretty technical-looking dialog to the user, who might or might not understand what's going on.</p>
<p><img alt="Chrome's client certificate selection dialog" src="//blog.lxgr.net/images/chromecert.png" /></p>
<p>This is clearly not an example of good user experience. So let's check how Apple does it in iOS...</p>
<p><img alt="Mobile Safari's lack of a certificate selection dialog" src="//blog.lxgr.net/images/ioscert.png" /></p>
<p>Oops. They don't. They just pick the first certificate available (in my case, this is an S/MIME certificate that includes my full name, my employer and my e-mail address), transmit it and authenticate to the server by non-repudiably signing the TLS handshake – all in plaintext. All the previously mentioned caveats apply, only that the user has no choice about the matter in the first place.</p>
<p>If you want to try it yourself, just visit <a href="https://www.mikestoolbox.net/">Mike's Toolbox</a> with Mobile Safari, accept the self-signed server certificate and look for your name or e-mail address on that page.</p>
<p>This problem has been mentioned before publicly <a href="http://forums.whirlpool.net.au/archive/1936101">at least once</a>, more than one year and one major OS version ago. On the desktop, this has already been fixed (with <a href="http://support.apple.com/kb/HT1679">OS X 10.5.3</a>); I'm really hoping it will be fixed with iOS 7 as well.</p>Android's SecureRandom - not even nonce2013-08-15T15:30:00+02:00lxgrtag:blog.lxgr.net,2013-08-15:posts/2013/08/15/android-securerandom-not-even-nonce/<p>There has been a bit of drama about the <a href="https://bitcointalk.org/index.php?topic=271486.0">theft of some 55 Bitcoins</a> (worth about $5500 at the current exchange rate), with the common denominator that all of the corresponding private keys were stored in Android wallets. While this is not nearly the first case of Bitcoin theft, it is probably the first one that is a direct result of a crypto bug.</p>
<p>In this case, the problem resulted from the (re)use of the nonce in the elliptic curve signatures used to generate Bitcoin transactions. As everybody familiar with (or even implementing) ECC-based encryption schemes should know well by now, reusing the signature nonce, or using a predictable value even once, <a href="http://blog.cryptographyengineering.com/2012/03/surviving-bad-rng.html">results in catastrophic failure</a>: The private key can then be trivially calculated from the signature(s). (If you don't believe me, just ask Sony...) This seems to be the method that was used to steal the Bitcoins in question.</p>
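<p>The algebra behind that "trivial calculation" can be sketched with plain modular arithmetic. The following is a toy demonstration I put together, not code from any of the wallets involved: it simulates only the ECDSA signature equation over the secp256k1 group order, with made-up values for the key, nonce and hashes, and a stand-in for <code>r</code> (which in real ECDSA is derived from a curve point).</p>

```python
# Toy sketch of ECDSA nonce-reuse key recovery, using only the modular
# algebra of the signature equation s = k^-1 * (z + r*d) mod n.
# Real signatures also involve curve points (r is the x-coordinate of
# k*G); here r is just a fixed stand-in, which is enough to show why
# reusing k leaks the private key d.
n = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141  # secp256k1 group order

d = 0x1234567890ABCDEF  # "private key" (illustrative value)
k = 0xCAFEBABE          # nonce, wrongly reused for two signatures
r = 0x42424242          # stand-in for the x-coordinate of k*G

z1, z2 = 0x1111, 0x2222                  # hashes of two different messages
s1 = pow(k, -1, n) * (z1 + r * d) % n    # two signatures made with the same k
s2 = pow(k, -1, n) * (z2 + r * d) % n

# The attacker sees (r, s1, z1) and (r, s2, z2).  Subtracting the two
# signature equations eliminates d and yields k, which then yields d:
k_rec = (z1 - z2) * pow(s1 - s2, -1, n) % n
d_rec = (s1 * k_rec - z1) * pow(r, -1, n) % n
```

Two signatures and a few modular inversions are all it takes; no brute force is involved at any point.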
<p>So far, so bad. The obvious question now is: Who was responsible for reusing the nonces in the first place? Since the flaw is not limited to a single wallet implementation, but only occurs on Android (even though some of the Bitcoin libraries are also used on desktop bitcoin clients), people quickly came to the conclusion that there must be a flaw in one of Android's cryptographic libraries.</p>
<p>In <a href="http://permalink.gmane.org/gmane.comp.bitcoin.devel/2714">an announcement</a> to the Bitcoin dev mailing list, Mike Hearn, the developer of the Java library bitcoinj, identified the offender in question as the <code>SecureRandom</code> class of the Android framework. The various wallets for Android were quickly patched to avoid that class and use <code>/dev/urandom</code> directly, and as far as their developers and users are concerned, the problem is now solved.</p>
<p>However, when there is a bug in a security primitive implemented in such a widely used library, chances are that other users are also affected. So what exactly went wrong, and what are the implications?</p>
<p>Shortly after the announcement of the bug, people were quick to point to <a href="http://armoredbarista.blogspot.com.au/2013/03/randomly-failed-weaknesses-in-java.html">a paper discussing several vulnerabilities</a> of the <code>SecureRandom</code> implementations of various Java frameworks, among them Apache Harmony, which is the base for Google's Android framework. </p>
<p>Indeed, Android <a href="http://androidxref.com/4.3_r2.1/xref/libcore/luni/src/main/java/org/apache/harmony/security/provider/crypto/SHA1PRNG_SecureRandomImpl.java">uses that implementation</a> – but only up to and including version 4.1. According to the paper, the flaw limits the entropy of an instance of <code>SecureRandom</code> to 64 bits. This is not enough for cryptographic applications like key generation or nonces, but it also doesn't explain why, in many cases, the exact same values were generated: <a href="http://en.wikipedia.org/wiki/Birthday_problem">on average</a>, 2^32 transactions would have to be generated to yield a single collision – all of them with the same key, i.e. the same Bitcoin address.</p>
<p>Another problem of the Harmony implementation of <code>SecureRandom</code> is that calling <code>setSeed()</code> on an instance replaces the existing entropy in the generator, instead of securely mixing the new seed with it. When used wrongly (e.g. by seeding with not-so-random data), this can also lead to predictable output from the instance. (Some applications even relied on this behavior, using <code>SecureRandom</code> together with a deterministic seed as some kind of key storage facility. Madness, I know...)</p>
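<p>A toy model makes the difference between the two seeding policies clear. This is my own illustration, not Harmony's actual generator: a trivial hash-chain PRNG where seeding either throws the internal state away or folds the seed into it. If seeding replaces the state, two instances seeded with the same guessable value become identical and fully predictable; if the seed is mixed in, the initial entropy still counts.</p>

```python
import hashlib
import os

class ToyPRNG:
    """Trivial hash-chain generator - purely illustrative, NOT the
    Harmony design; it only models the two seeding policies."""
    def __init__(self):
        self.state = os.urandom(20)          # start with real entropy
    def seed_replacing(self, seed):          # broken: discards the state
        self.state = hashlib.sha1(seed).digest()
    def seed_mixing(self, seed):             # correct: augments the state
        self.state = hashlib.sha1(self.state + seed).digest()
    def next_bytes(self):
        self.state = hashlib.sha1(self.state).digest()
        return self.state

# Two independent generators seeded with the same guessable value:
a, b = ToyPRNG(), ToyPRNG()
a.seed_replacing(b"device-boot-time")
b.seed_replacing(b"device-boot-time")
out_a, out_b = a.next_bytes(), b.next_bytes()   # identical output streams

c, d = ToyPRNG(), ToyPRNG()
c.seed_mixing(b"device-boot-time")
d.seed_mixing(b"device-boot-time")
out_c, out_d = c.next_bytes(), d.next_bytes()   # still unpredictable
```

The "key storage" abuse mentioned above only works at all because of the broken, replacing variant – which is exactly why fixing it broke those apps.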
<p>With Android 4.2, Google <a href="http://android-developers.blogspot.co.at/2013/02/security-enhancements-in-jelly-bean.html">finally switched</a> to <a href="http://androidxref.com/4.3_r2.1/xref/libcore/luni/src/main/java/org/apache/harmony/xnet/provider/jsse/OpenSSLRandom.java">a different implementation based on OpenSSL</a>. Since then, calls to the <code>setSeed()</code> method augment the internal entropy instead of replacing it, as they should. The new generator is even used when specifically asking for the Harmony-based one; obviously somebody at Google regarded the flaws as important enough to justify modifying the behavior of legacy apps. (Of course, <a href="http://android-developers.blogspot.co.at/2013/02/using-cryptography-to-store-credentials.html">this broke all off-label usages</a> of <code>SecureRandom</code> in the process.)</p>
<p>So, Android versions from 4.2 should be safe, right? As it turns out, <a href="http://android-developers.blogspot.co.at/2013/08/some-SecureRandom-thoughts.html">they are not</a>. In some circumstances (I haven't looked at the source yet), the new <code>SecureRandom</code> implementation goes horribly, horribly wrong and returns identical values for successive invocations. This is obviously very bad, not only for Bitcoin wallets, but basically for everybody using the Java cryptographic operations. According to the Android devs, that includes the generation of symmetric and asymmetric keys (using the <code>KeyGenerator</code>, <code>KeyPairGenerator</code> and <code>KeyAgreement</code> classes).</p>
<p>Ironically, calling <code>setSeed()</code> with proper random data avoids the bug in the OpenSSL implementation, which leads to the interesting situation that using <code>setSeed()</code> is discouraged on Android 4.1 and earlier, but is essential on 4.2. (Android 4.3 seems to avoid both bugs, according to the code in the blog post.) The Android developers were kind enough to provide a ready-to-use mitigation in the form of a drop-in replacement for <code>SecureRandom</code> that does exactly that. (You should probably still be careful on older Android versions using the Harmony implementation – all the caveats mentioned in the paper still apply.)</p>
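<p>The common thread of all the wallet fixes is to bypass the userspace generator entirely and read the kernel's CSPRNG directly. A sketch of that idea in Python (not the actual Java workaround, just the same approach):</p>

```python
import os

def secure_random_bytes(n):
    """Fetch n bytes straight from the kernel CSPRNG, bypassing any
    userspace PRNG state.  On Linux, os.urandom() reads /dev/urandom
    (or uses the getrandom() syscall on newer kernels/Pythons)."""
    return os.urandom(n)

# Two successive nonces - the whole point is that these never repeat:
nonce_a = secure_random_bytes(32)
nonce_b = secure_random_bytes(32)
```

As long as the kernel pool is properly seeded, this sidesteps every one of the library-level bugs discussed above; that is exactly why the wallets switched to it.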
<p>Every Android developer using one of the affected classes should evaluate their usage, implement <a href="http://android-developers.blogspot.co.at/2013/08/some-SecureRandom-thoughts.html">the workaround</a> and take the necessary countermeasures, which could include warning users to replace any keys generated with the weak random number generator. That is exactly what the developer of Bitcoin Wallet for Android <a href="https://code.google.com/p/bitcoin-wallet/source/detail?r=d9f996e218dd50c58855872c761e4acdd1ea1d15">did</a> promptly after learning about the vulnerability: the update generates a new, secure Bitcoin address and immediately transfers all funds to it. If you don't trust any of the Android implementations of <code>SecureRandom</code> anymore, you should take a look at <a href="https://code.google.com/p/bitcoin-wallet/source/browse/wallet/src/de/schildbach/wallet/util/LinuxSecureRandom.java">his implementation</a> based directly on <code>/dev/urandom</code>.</p>
<p>As a user, you should check if there is an update available for any of your security-relevant applications, and when in doubt, stop using keys generated on one of the vulnerable Android versions (Android 4.2, and to a lesser degree also everything before it – the 64 bits of entropy again). I plan to evaluate some of the obvious candidates like ConnectBot myself.</p>
<p>This is my preliminary analysis of the situation – if you have anything to share on the matter, or have spotted some wrong conclusions in my argumentation, please leave a comment below or drop me a message!</p>
<p>A side note on the Bitcoin side of the ECC nonce problem: The original Bitcoin implementation elegantly avoids the problem, since it only reveals the public key at the moment of a transaction, and sends the "change" to a newly generated address, whose ECC public key remains unknown until the next transaction. This doesn't apply universally (for example if you receive transactions to an address you have already used to send coins from, e.g. a well-known donation address; special transactions directly to public keys are also not protected). Since the method is based on a cryptographic hash of the public key, it even provides some protection against quantum-computing based attacks. I wonder what the exact motivation of the creators for using hashes was, but it sure is a nice trick! Unfortunately, bitcoinj currently sends all change back to the original address, whose key has already been revealed, which is why the attack worked in the first place. But there is a good reason to do this: Until we get deterministic address generation, creating ad-hoc addresses to transfer the change to makes wallet backups difficult and is prone to inadvertent loss of private keys. But that is probably the topic of another blog post.</p>
<p>In addition to the security implications of the model that is used by some of the current drivers (they allow the OpenGL client to send commands directly to the GPU, with the kernel only checking for illegal address references in the command stream, instead of using an actual IOMMU), there is a much simpler way to cause mischief when given access to an accelerated OpenGL implementation on a system: Uninitialized buffers.</p>
<p>Normally, when requesting memory from the operating system (for example through the malloc standard library function, which in turn uses an anonymous, private mmap memory mapping), the kernel goes through the effort of zeroing out the contents of the newly allocated chunk of main memory. While this is not required by the C language specification in any way, and one should never rely on that implementation detail (smaller allocations could be handled by the library in a different way, and those are not guaranteed to be zero-initialized), it's a pretty important <a href="http://stackoverflow.com/questions/1622196/malloc-zeroing-out-memory">security feature</a>.</p>
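<p>This kernel behaviour is easy to observe from userspace: a freshly created anonymous mapping always reads back as zeros, no matter what the backing physical frame contained before. Python's <code>mmap</code> module wraps the same <code>mmap(2)</code> call, so a quick check (my own illustration) looks like this:</p>

```python
import mmap

# Request one page of fresh anonymous, private memory from the kernel
# (fd -1 means "not backed by a file") - the same mmap(2) path that
# malloc uses for larger allocations.  Whatever previously lived in
# the backing physical frame, the kernel hands it over zero-filled.
page = mmap.mmap(-1, 4096)
data = page.read(4096)
page.close()
```

Run it as often as you like, on a machine however busy: the page is always all zero bytes, because the kernel scrubs it before mapping it into the process.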
<p>Just imagine what would happen if the physical memory block used to be allocated to your browser, and contained the session cookie for an online banking session, or worse, an instance of GPG with your private key... And while most security-relevant code will probably go to great lengths to prevent that kind of thing from happening by overwriting the relevant memory locations before deallocating them, there is always the possibility of application crashes, which will render those protections useless.</p>
<p>All in all, that operating system feature is really essential to guarantee the isolation among different users who are working on the same machine simultaneously. (As a side note, <a href="http://lwn.net/Articles/322823/">a similar thing</a> is performed for some file systems, which don't zero out newly-allocated blocks, but use a different method to achieve a similar effect, and to prevent users from gaining access to residual chunks of data in case of a system crash during the allocation.)</p>
<p>I was expecting the same thing to happen for GPU driver implementations, since nowadays, many window managers use OpenGL acceleration to draw the window contents to the right locations with various effects like transparency or animated window switching. Basically, the window content is stored as an OpenGL texture, which is later mapped to a rectangle on the graphical desktop. So, in many cases, their content is at least as security-critical as the content of main memory – just think about your terminal's or browser's window content. Well, it turns out I was wrong:</p>
<p><img alt="An uninitialized OpenGL texture" src="//blog.lxgr.net/images/gl-buffer-res.png" /></p>
<p>This screenshot shows a simple OpenGL demonstration program, which I modified just a tiny bit: I removed the part that loads the cube texture from memory, or more accurately, replaced the pointer to the image data with a null pointer (which is allowed by the OpenGL specification). The specification leaves the resulting texture contents undefined – an implementation may zero-initialize the buffer or leave it uninitialized – and the <a href="http://nouveau.freedesktop.org/wiki/">nouveau</a> driver for my Nvidia card seems to do the latter, apparently for performance reasons.</p>
<p>I asked the nouveau developers in the IRC channel for their view on the topic, and Dave Airlie told me that while video buffers in the main memory should be zero-initialized on nouveau, buffers residing in video memory are not overwritten by default, though that would be theoretically possible.</p>
<p>On integrated GPUs that use the main memory for all of their buffers, the problem could be even more severe – not only the content of other users' windows, but even arbitrary memory contents could theoretically be extracted with custom shader code. I retried the experiment on an Intel GPU, and was relieved to only see an untextured black cube. The same thing happens on Android, where I tried it on both an Adreno- and an Nvidia Tegra–equipped device. However, this does not mean that those platforms are safe – it only means that somewhere in their OpenGL implementation, the buffer is zeroed, which might as well happen only in the userspace library, and could therefore be circumvented by directly interfacing with the command buffer (which is admittedly much more difficult, and might well be impossible for things like WebGL, where direct access to those buffers is not possible for application code).</p>
<p>One possible mitigation for that security risk is very simple, and therefore widely used: Just don't give access to the video hardware to anyone but users that are physically present. Many Linux distributions do just that with the <code>allowed_users=console</code> setting of the <code>Xwrapper.config</code> configuration file. This reduces the attack surface significantly – most computers are only used for desktop logins by a single person at a time, and anybody who is able to run software in that user's X session (which seems to be an <a href="http://dri.freedesktop.org/wiki/DRM/">additional requirement</a> for GPU hardware access, at least on DRI/DRM) has much easier ways to grab arbitrary window contents.</p>
<p>But with WebGL becoming more and more popular, that situation is changing – now, web page authors can execute OpenGL code on any visitor's GPU hardware, and read back the content of the resulting images (with limits imposed by the same-origin policy). This might be one of the reasons why WebGL <a href="http://www.khronos.org/webgl/security/#Access_to_Uninitialized_Memory">specifically mandates</a> that implementations clear their buffers on allocation. That's obviously a very good idea, seeing that there is even a <a href="http://www.contextis.com/research/blog/webgl-more-webgl-security-flaws/">working exploit</a> for that particular loophole! Now let's hope that all browser vendors read that part of the specification carefully, and we should be safe – but only against that specific security threat of running untrusted code on hardware with direct access to the main memory...</p>
<p>On one hand, it's really simple: In almost every computing device, there is a GPU. This is basically a programmable, special-purpose, massively parallel CPU, and until recently, its only purpose was drawing triangles in different colors; and not just one or two, but lots of them – per second. Parts of it are dedicated to the triangle-drawing business, because that's still the most efficient way to do it, but most of the hard work happens in the programmable parts.</p>
<p>Since every device seems to need a driver, there is one for every GPU. And how hard can <em>that</em> be? Identify the triangle-drawing chip in question, figure out a way to talk to it, throw some triangle coordinates at it and marvel at the results.</p>
<p>But the more I think and read about those two components, the more I get the impression that it might not be that simple.</p>
<p>Concerning the GPU itself, I'm wondering what parts of the rendering pipeline (the process of interpreting large amounts of bits as triangle coordinates and textures and converting them to a rasterized 2D projection of a three-dimensional scene) are actually still happening in dedicated circuits, and how much of it really happens on general purpose CPUs, programmed by firmware internal to the GPU or possibly even by the driver, and therefore the CPU. From what I've learned so far (mostly by reading lots of introductions to <a href="http://duriansoftware.com/joe/An-intro-to-modern-OpenGL.-Table-of-Contents.html">OpenGL</a>, <a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">modern GPUs</a>, <a href="https://01.org/linuxgraphics/documentation/driver-documentation-prms">technical documentation</a> and <a href="http://kernel.org/">source code</a>), everything is possible – there are software renderers that run as software on the shaders of a GPU, and, on the other end of the spectrum, "hardware" components that are fed with ASCII representations of OpenGL shaders (with the help of <a href="http://www.phoronix.com/scan.php?page=news_item&px=MTIxNDk">not-so-open source drivers</a>).</p>
<p>Some GPUs need <a href="http://phoronix.com/forums/showthread.php?38445-Confused-by-firmware">blobs of firmware</a> in order to do their job (which hints at a partly software-like approach to the problem); others <a href="http://phoronix.com/forums/showthread.php?79473-Digging-Deeper-Into-AMD-s-UVD-Code-Drop&p=323934#post323934">don't</a> – but that doesn't say anything, since firmware can also be stored inside of a chip, similar to the microcode of common "CISC" CPUs.</p>
<p>The more I think about that, the more I realize that, for this topic as for almost every other technical subject, there is no easy or general answer, and finding it the hard way takes lots of time, as well as luck in finding the right documentation. Which brings me to the topic of open-source graphics drivers.</p>
<p>Since most of the magic seems to be happening at least partially in software, whether on the host CPU or in embedded DSPs of the GPU (though I realize that there are quite a few ASICs left), there is an understandable, but still annoying tendency of GPU vendors to treat their driver software with as much secrecy as their actual hardware products – simply because the actual product is the combination of the chip and the driver.</p>
<p>This brings us to the obvious problems that all closed-source drivers share: We have no way of fixing problems when they arise, and also no way of making assertions, or even educated guesses, about the security properties of software that runs with the highest privileges possible on millions, possibly billions, of machines storing sensitive data, both commercial and private.</p>
<p>Apart from actual vulnerabilities in the driver code running on the CPU, I'm <a href="http://security.stackexchange.com/questions/35634/is-opengl-a-security-problem">wondering</a> to what extent processes running on the GPU itself can access the main memory of the system, and how the various drivers ensure that such memory accesses don't circumvent the process separation that is now commonplace on most operating systems thanks to the memory virtualization provided by the combination of the memory management unit of the CPU and the security mechanisms of the operating system kernel.</p>
<p>Since shaders, the programs running on the GPU execution units, can be provided in source and sometimes also binary form by any user of the graphics (OpenGL) or general purpose (OpenCL) API, memory accesses of those shaders obviously have to be limited to something less than the whole system memory space. There seem to be two approaches:</p>
<ul>
<li>For some GPU drivers, that protection is provided by the driver verifying all commands that are submitted from the user space to the GPU. It checks for illegal memory accesses and other potentially dangerous operations.</li>
<li>Other, mostly newer models provide a hardware MMU themselves that can be programmed by the operating system or the driver to disallow all memory accesses, except for the ones for data that is located in buffers owned by the same user.</li>
</ul>
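<p>The first approach can be sketched as a toy validator. This is purely my own illustration – real command streams are binary, GPU-specific formats – but it captures the idea: the kernel walks every memory reference in the submitted commands and rejects the whole submission if any reference falls outside the buffers owned by the submitting process.</p>

```python
def validate_command_stream(commands, owned_buffers):
    """Toy model of driver-side command-stream validation.

    commands:      list of (address, length) memory references extracted
                   from the submitted GPU commands.
    owned_buffers: list of (start, size) ranges the submitting process
                   is allowed to touch.
    Returns True only if every reference lies inside an owned buffer.
    """
    for addr, length in commands:
        inside_owned = any(
            start <= addr and addr + length <= start + size
            for start, size in owned_buffers
        )
        if not inside_owned:
            return False        # reject the whole submission
    return True

owned = [(0x1000, 0x1000)]                               # one 4 KiB buffer
good = validate_command_stream([(0x1000, 16)], owned)    # inside the buffer
bad = validate_command_stream([(0x8000, 16)], owned)     # foreign memory
```

The per-submission scan is exactly the overhead that the second, IOMMU-based approach avoids: there, the hardware page tables enforce the same property without the kernel having to parse the command format at all.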
<p>According to a <a href="http://phd.mupuf.org/files/fosdem2013_drinext_drm2.pdf">presentation on the subject</a>, the first approach is currently used by the Linux drivers for AMD and Intel GPUs, while the second one seems to be only supported by the open nouveau driver for Nvidia GPUs.</p>
<p>The situation for OpenGL on Android seems different, even though it also uses the Linux kernel: Based on <a href="https://android.googlesource.com/kernel/tegra/+/android-tegra3-grouper-3.1-jb-mr1-fr/drivers/gpu/ion/">some references</a> in the kernel source code of almost all Android platforms I examined, I suspect that most or all of the Android drivers actually use an IOMMU, that is, the hardware approach to the problem – probably because it allows the mobile GPU vendors to open-source the kernel portion of their drivers: the verification approach can obviously only be implemented in the kernel (or in a trusted userspace daemon, with even more overhead), and it requires a lot of knowledge about the format of the command stream sent to the GPU, which would thereby be openly documented.</p>
<p>As I've mentioned, most of the drivers are released as closed-source by their vendors (with Intel, and possibly AMD – I've not done any research on them – being laudable exceptions), but there are some open-source alternatives, most of them created by tediously <a href="http://blog.emmanueldeloget.com/index.php?post/2013/03/08/The-SoC-GPU-driver-interview">reverse-engineering the GPUs</a>. At least for Nvidia's Tegra line of mobile GPUs, that might change, though; after <a href="http://www.wired.com/wiredenterprise/2012/06/torvalds-nvidia-linux/">fingers had been pointed</a> at each other, Nvidia finally seems to <a href="http://www.phoronix.com/scan.php?page=news_item&px=MTE5MTc">release a bit more</a> to the open source community in the form of both documentation and actual code commits. <a href="http://lwn.net/Articles/467769/">One of them</a> is especially interesting to me, since it confirms that the IOMMU approach is being used. On the mainline Linux kernel, it also <a href="https://gitorious.org/linux-tegra-drm/pages/Host1xIntroduction#Stream+validation">seems possible</a> to use the stream validation approach.</p>
<p>So what is my point? As I've said, I have a peculiar hobby, and somehow I find the topic of GPU drivers really interesting. I still don't know nearly enough even to be able to understand the kernel source code, but I'll continue to try to get a clearer overview nevertheless. If you've got any hints for me, please go ahead and write me (blog at lxgr dot net)!</p>A quine in x86-64 assembly2013-05-10T17:30:00+02:00lxgrtag:blog.lxgr.net,2013-05-10:posts/2013/05/10/a-quine-in-x86-64-assembly/<p>This summer term, I'm taking a really interesting course on computer security: While the lectures are pretty theoretical (one of the topics is a proof showing that verifying the general security properties of certain models is equivalent to the halting problem, which is done by implementing a Turing machine within the access model...), the homework assignments are partially about x86(-64) assembly programming. My only assembly programming experience until now was with MMIX, which is almost completely on the opposite end of the RISC/CISC spectrum from the good old x86 (not to mention that the architecture is completely theoretical and has never been implemented in hardware). To make a long story short, I finally have a reason to program some x86 assembly!</p>
<p>Our most recent exercise sounds quite simple, but kept me busy longer than I expected: We are supposed to write a quine in x86-64 assembly, that is, a program that has its own source code as its (only) output – identical down to the last byte. I was already familiar with quines, but I had never tried to create one before, and if I had, assembly would definitely not have been my language of choice, but since it was one of two non-optional homework exercises, I figured that it couldn't be so hard after all. So without further ado, here is a quine in x86-64 assembly (GNU assembler syntax):</p>
<div class="highlight"><pre><span class="na">.att_syntax</span> <span class="no">noprefix</span>
<span class="na">.globl</span> <span class="no">main</span>
<span class="nl">main:</span>
<span class="nf">pushq</span> <span class="no">rbp</span>
<span class="nf">movq</span> <span class="no">rsp</span><span class="p">,</span> <span class="no">rbp</span>
<span class="nf">mov</span> <span class="no">$.Cs</span><span class="p">,</span> <span class="no">rdi</span>
<span class="nf">mov</span> <span class="no">$0xa</span><span class="p">,</span> <span class="no">rsi</span>
<span class="nf">mov</span> <span class="no">$0x22</span><span class="p">,</span> <span class="no">edx</span>
<span class="nf">mov</span> <span class="no">$.Cs</span><span class="p">,</span> <span class="no">ecx</span>
<span class="nf">mov</span> <span class="no">$0x22</span><span class="p">,</span> <span class="no">r8d</span>
<span class="nf">mov</span> <span class="no">$0xa</span><span class="p">,</span> <span class="no">r9d</span>
<span class="nf">xor</span> <span class="no">eax</span><span class="p">,</span> <span class="no">eax</span>
<span class="nf">call</span> <span class="no">printf</span>
<span class="nf">xor</span> <span class="no">eax</span><span class="p">,</span> <span class="no">eax</span>
<span class="nf">leave</span>
<span class="nf">ret</span>
<span class="nl">.Cs:</span> <span class="na">.string</span> <span class="s">".att_syntax noprefix</span>
<span class="s">.globl main</span>
<span class="s">main:</span>
<span class="s">pushq rbp</span>
<span class="s">movq rsp, rbp</span>
<span class="s">mov $.Cs, rdi</span>
<span class="s">mov $0xa, rsi</span>
<span class="s">mov $0x22, edx</span>
<span class="s">mov $.Cs, ecx</span>
<span class="s">mov $0x22, r8d</span>
<span class="s">mov $0xa, r9d</span>
<span class="s">xor eax, eax</span>
<span class="s">call printf</span>
<span class="s">xor eax, eax</span>
<span class="s">leave</span>
<span class="s">ret%c.Cs: .string %c%s%c%c"</span>
</pre></div>
<p>It can be compiled, executed and verified for proper quine-ness like this:</p>
<div class="highlight"><pre>gcc quine.s -o quine <span class="o">&&</span> ./quine > output <span class="o">&&</span> diff quine.s output
</pre></div>
<p>As do most quines, this probably needs some explanation. Generally speaking, all quines (at least the ones I've come across) share a common structure: There is some code of the language in question, and one or more rather long strings, which contain most of that code in quoted form. The trick is to print the quoted code twice: Once verbatim, and once with quotation marks and some additional characters, so that the string declaration itself is printed out. To do that, the string quotation signs have to be escaped or stored in some modified form – otherwise, they would be interpreted simply as the end of the string.</p>
<p>The most obvious guess is to just escape them with a backslash (like so: <code>\"</code>), but that doesn't really help us – now we also have to print a backslash in our output! The solution here (and in many other quines) is to store them in another form that can be safely quoted verbatim inside of a string, but still evaluates to the desired character in the output. In this case, the <code>printf</code> C library function is used to recreate two occurrences each of two otherwise problematic characters: The aforementioned quotation mark and the newline character. Both characters are stored as their numeric (or more specifically, hexadecimal) representation according to the ASCII codepage: <code>0xa</code> for the newline, and <code>0x22</code> for the double quotation mark. All of this I learned from the very nice example of a quine in C given in the (German) <a href="http://de.wikipedia.org/wiki/Quine_(Computerprogramm)">Wikipedia article on the subject</a>, which also taught me the really neat trick of using the same string twice – once as a formatting string, and once again as one of the parameters of the <code>printf</code> function.</p>
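<p>The same-string-twice trick translates directly to other languages. As a side-by-side illustration (my own, not from the Wikipedia article), here it is in Python, where <code>%r</code> conveniently does the quoting and escaping for us, and <code>%%</code> plays the role of the escaped percent sign:</p>

```python
# Minimal quine core using the trick described above: one string serves
# both as the format template and as the datum substituted into itself.
# %r reproduces the string with quotes and escapes, %% yields a literal
# percent sign during formatting.
src = 'src = %r\nout = src %% src'
out = src % src
# 'out' now holds exactly the source of the two-line program above
# (without these comments), and running that source reproduces 'out'.
```

Executing the two lines contained in <code>out</code> yields the very same <code>out</code> again – the fixpoint property that makes it a quine.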
<p>My approach was then to find a valid translation of that quine to assembly, which revolved mostly around two problems: Generating the machine instructions for the <code>printf</code> call, and escaping all occurrences of problematic characters in the resulting program so that it can be stored as a valid string.</p>
<p>The first part of the problem can be easily solved by some experimentation with a similar C program and by disassembling the binary as compiled by GCC (and identifying the relevant lines in the output!) – it boils down to moving the address of the string and the literal values of the ASCII characters to the right registers (according to the x86-64 calling convention) and executing the call.</p>
<p>A minor detail of interest is the instruction <code>xor eax, eax</code> right before the function call: As it turns out, the x86-64 calling convention requires that functions with variable-length parameter lists like <code>printf</code> be told, in the <code>al</code> register (the lowest byte of <code>eax</code>), how many vector registers are used to pass arguments – in this case, exactly zero. (Note that the total number of arguments is <em>not</em> passed in a register!) I gather this has to do with optimization: saving the large vector registers is rather costly and should be avoided if not necessary. I only figured out the importance of zeroing the register when I tried the program on a workstation at my university – while I could get away without it on my own laptop, it would invariably crash there without that instruction.</p>
<p>Another problem would have been the newline characters: Since the GASM syntax requires a newline after every machine instruction statement, it's not possible to use the trick of the C quine from the mentioned article, which simply puts the whole program on a single line. Fortunately, GCC/GASM does the right thing when confronted with "raw" newline characters inside of a string, and just treats them the same way it would handle a proper <code>\n</code> newline. This causes some warnings from my version of GCC, but compiles/assembles nevertheless – otherwise, all the newline characters would probably have to be submitted as parameters for <code>printf</code>.</p>
<p>If you are familiar with the GASM assembly syntax, you might have noticed a minor oddity about the code: Register names are not prefixed with a percent sign as they usually have to be. The reason for that is that the percent sign has a very special meaning in the formatting string parameter of <code>printf</code> – it indicates that the following character(s) should be interpreted as a formatting directive, and replaced by a specific parameter of the function! This leads to a problem similar to simple backslash escaping: For every percent character, an ASCII-encoded percent sign has to be given to <code>printf</code> as a parameter, but for every new parameter, we need a new <code>mov</code> instruction to a register – which includes a percent sign...</p>
<p>This part of the problem is <a href="http://codegolf.stackexchange.com/a/609">actually easier to solve on x86</a> (the 32 bit variant): Since all function parameters are passed on the stack there, they can be pushed with the <code>push</code> instruction (<code>push $0x22</code>, <code>push $0x0a</code>, ...) – no percent sign necessary! On x86-64, the first six integer parameters are passed in processor registers instead, which means that additional parameters would have to be generated by using up the first few parameter slots in a way that still creates the same output – not impossible, but very tedious (both in manual execution and code size/readability).</p>
<p>A trick to circumvent that problem is the use of the <code>.noprefix</code> directive of GCC/GASM: Since a percent sign in front of a register name is only a visual aid to the programmer and not necessary to correctly parse the program, this directive allows us to simply omit all the percent prefixes – which is just what we need.</p>
<p>After the encoding has been taken care of, all that is left is the exact structure of the <code>printf</code> format string. As I've mentioned, every occurrence of a percent character is replaced by one of the function parameters in the output. By careful construction of the formatting string, together with the trick of using the very same string as both a formatting specification for printf and a parameter being substituted inside that formatting string, it is possible to create an output that is exactly identical to the source code – a quine!</p>Jumboframes on the Internet?2013-04-12T11:43:00+02:00lxgrtag:blog.lxgr.net,2013-04-12:posts/2013/04/12/jumboframes-on-the-internet/<p>Recently, I've been experimenting with Wireshark for my bachelor's thesis, monitoring the performance of TCP uploads from my notebook to my web server. A while ago, I had also swapped my router for a nicer model capable of Gigabit Ethernet and 5 GHz Wi-Fi (due to increasing congestion of the 2.4 GHz band in my apartment building), of course also running OpenWrt <a href="//blog.lxgr.net/posts/2013/01/28/my-openwrt-setup/">like the old one</a>.</p>
<p>Soon after the switch, I noticed an oddity in the Wireshark captures: Some of the outgoing TCP segments were reported as having a length of more than 1500 bytes, which is the upper payload limit for Ethernet, and therefore also for most, if not all, residential Internet connections.</p>
<p><img alt="Ethernet frames bigger than 1500 bytes" src="//blog.lxgr.net/images/wireshark_tcp_gso.png" /></p>
<p>Since I didn't really expect the path between my notebook and my server to be capable of an MTU higher than 1500, the obvious explanation would be IP fragmentation occurring in my router, which would be very unfortunate, to say the least.</p>
<p>To figure out if that was really the case, I started tcpdump on the server, with the following interesting result:</p>
<p><img alt="Ethernet frames containing TCP segments, as received by the server" src="//blog.lxgr.net/images/wireshark_tcp_normal.png" /></p>
<p>No sign of IP fragmentation whatsoever; the packets were arriving as if sent with an MTU of 1500! This matches the MTU configured on my notebook and also explains the pattern of ACKs received from the server – usually, every other segment should be ACKed by the recipient, but in this case, I was receiving far more ACKs than that.</p>
<p>After a bit of googling, I found the explanation: <a href="http://en.wikipedia.org/wiki/Large_segment_offload">TCP segmentation offload (TSO)</a>. This is (theoretically) a nice feature of some NICs that allows the operating system to delegate TCP segmentation and TCP and IP header generation to the network interface, relieving the CPU of those duties and possibly also increasing performance quite a bit. However, if there are bugs in the NIC firmware, this could lead to very obscure and hard-to-debug transmission errors, and it also makes debugging other network behavior more difficult, as I had experienced myself with my measurements.</p>
<p>There is an easy way to disable TSO on Linux:</p>
<div class="highlight"><pre>sudo ethtool -K eth0 tso off
sudo ethtool -K eth0 gso off
</pre></div>
<p>GSO is a very similar technology, which can be used to offload some of the higher-layer networking tasks from the kernel to the network interface for protocols other than TCP. It also has to be disabled because, according to my tests, it happily takes over TSO's duties once TSO is turned off, also causing strange results in Wireshark.</p>
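<p>To check the current state of both offloads before or after changing anything, <code>ethtool</code> with a lowercase <code>-k</code> prints (rather than sets) the settings. This is a sketch – <code>eth0</code> is just an example interface name, so substitute your own:</p>

```shell
# Query (not modify) the offload settings of the interface.
# "eth0" is an assumption; use the interface you are actually capturing on.
ethtool -k eth0 | grep -E 'tcp-segmentation-offload|generic-segmentation-offload'
```

If both lines report <code>off</code>, Wireshark should again see segments no larger than the interface MTU.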
<p>Apart from the confusion in my packet traces, I have yet to experience any negative side effects of TSO or GSO with my NIC (likely thanks to the driver, which is commendably developed by the NIC vendor itself), and plan to leave them enabled while I'm not working with Wireshark.</p>VPNs and IPv6, part 22013-03-29T17:28:00+01:00lxgrtag:blog.lxgr.net,2013-03-29:posts/2013/03/29/vpn-ipv6-part2/<p>As I've <a href="//blog.lxgr.net/posts/2013/03/06/vpn-circumvention-ipv6/">written before</a>, VPNs can lead to insecure situations when used on IPv6-enabled networks.</p>
<p>The easiest way to mitigate that problem is actually just to enable IPv6 tunneling over the VPN itself, provided your VPN gateway has IPv6 connectivity and you have a spare /64 subnet you can dedicate to the VPN clients. (Unfortunately, this is the smallest subnet size OpenVPN is willing to accept.) My provider has agreed to make an appropriate subnet available to my server, but I haven't been able to try it so far.</p>
<p>If that's not possible for you, e.g. due to IPv6 being unavailable at your VPN gateway, there is a simple workaround that breaks IPv6 connectivity for all connected clients: Just hand out bogus IPv6 addresses and routes to all clients, and drop all IPv6 traffic on the server. This is of course not as nice as an option to cleanly disable IPv6 connectivity, but at least for the Android client, I'm not aware of any other solution so far.</p>
<p>The following two lines in the OpenVPN server.conf should do the trick:</p>
<div class="highlight"><pre>server-ipv6 ::1/64
tun-ipv6
</pre></div>
<p>Make sure to disable IPv6 forwarding on the VPN server to avoid any surprises (e.g. link-local IPv6 connectivity to other servers on the same subnet):</p>
<div class="highlight"><pre>sysctl net.ipv6.conf.all.forwarding=0
</pre></div>
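<p>To make the dropping explicit rather than relying on forwarding being off, a firewall rule on the VPN server can discard the clients' IPv6 traffic outright. A sketch, assuming the OpenVPN tunnel device is called <code>tun0</code> (check your server's actual device name):</p>

```shell
# Drop all IPv6 traffic arriving from VPN clients.
# "tun0" is an assumption; verify the tunnel device name with "ip link".
ip6tables -A FORWARD -i tun0 -j DROP
ip6tables -A INPUT -i tun0 -j DROP
```
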
<p>Try the setup by connecting to the VPN and accessing one of the innumerable "what-is-my-IPv6" services from your client to make sure it works as expected.</p>Static blogs and HTTP caching2013-03-15T10:56:00+01:00lxgrtag:blog.lxgr.net,2013-03-15:posts/2013/03/15/static-blog-http-caching/<p>As you can see in the footer, this blog is powered by <a href="http://getpelican.com">Pelican</a>, a static blog generator written in Python. It's really simple to use and fits my requirements nicely: I can write posts offline on my notebook and view the results in my browser with the included web server; it doesn't require any insecure server-side software (the output is plain HTML, CSS and a bit of JavaScript for browsers that are not quite up to date); and it is very easy on server resources, because by default, almost everything can be cached by web browsers.</p>
<p>However, there is one annoying side effect of everything being cached: Since that also includes the landing page, new posts could be invisible to returning visitors for quite a while. In a bit more detail, here is what is going on at the HTTP level:</p>
<p>By default, my webserver, lighttpd, delivers all static HTML pages with no explicit caching headers, but includes the modification time of the resource and an <a href="http://en.wikipedia.org/wiki/HTTP_ETag"><code>ETag</code></a> (only the relevant headers are shown):</p>
<div class="highlight"><pre>Date: Fri, 15 Mar 2013 10:03:43 GMT
ETag: "4531062"
Last-Modified: Thu, 14 Mar 2013 20:12:06 GMT
</pre></div>
<p>The <code>ETag</code> is good to have (browsers can use it to unambiguously revalidate cached content with the server, as I'll explain later), but the <code>Last-Modified</code> header combined with <em>no</em> explicit statement about cacheability triggers a heuristic <a href="http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2.4">defined in HTTP</a> in most browsers. Basically, browsers calculate the difference between the time the resource was retrieved and the time it was last modified on the server, and cache the resource for 10% of that value <em>without revalidating with the server</em>.</p>
<p>This means that for a blog that is updated daily with new posts, users will see new posts within a few hours of their last visit, but for a blog that hasn't been updated for several weeks or months, ten percent of that time can be pretty significant.</p>
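<p>With the example headers above, the heuristic works out like this (a sketch using GNU <code>date</code>; the 10% factor is the common implementation choice suggested by the spec, not a hard requirement):</p>

```shell
# Heuristic freshness: 10% of (Date - Last-Modified), in seconds,
# using the header values from the example response above.
date_s=$(date -u -d 'Fri, 15 Mar 2013 10:03:43 GMT' +%s)
mod_s=$(date -u -d 'Thu, 14 Mar 2013 20:12:06 GMT' +%s)
echo $(( (date_s - mod_s) / 10 ))   # -> 4989, i.e. about 83 minutes
```

So a visitor fetching the page right then could be shown a cached copy for well over an hour without any request hitting the server.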
<p>A simple solution is to just manually define a cache validity in the HTTP headers for some or all resources. lighttpd has the <a href="http://redmine.lighttpd.net/projects/1/wiki/Docs_ModExpire"><code>expires</code> module</a> that does just that. Here is the relevant line in my <code>lighttpd.conf</code>:</p>
<div class="highlight"><pre><span class="k">expire.url</span> <span class="p">=</span> <span class="s">(</span> <span class="s">"/theme/"</span> <span class="p">=</span><span class="s">></span> <span class="s">"access</span> <span class="s">plus</span> <span class="mi">7</span> <span class="s">days",</span> <span class="s">""</span> <span class="p">=</span><span class="s">></span> <span class="s">"access</span> <span class="s">plus</span> <span class="mi">1</span> <span class="s">hours"</span> <span class="s">)</span>
</pre></div>
<p>The effect is that all resources in the subdirectory <code>theme</code> will have an <code>Expires</code> header 7 days in the future, and everything else will be valid for just an hour. This is a tradeoff between server and client resource usage and immediate updates: For me, an hour of delay is not a big deal, and users jumping back and forth between blog posts will be able to do so without any further HTTP requests. Here are the response headers of the main blog page:</p>
<div class="highlight"><pre>Cache-Control: max-age=3600
Date: Fri, 15 Mar 2013 10:23:06 GMT
ETag: "4531062"
Expires: Fri, 15 Mar 2013 11:23:06 GMT
Last-Modified: Thu, 14 Mar 2013 20:12:06 GMT
</pre></div>
<p>As you can see, the <code>max-age</code> directive explicitly states a validity of 3600 seconds, and the <code>Expires</code> header also points to a value one hour in the future.</p>
<p>Even when that time is reached, the whole resource doesn't have to be transferred again: Browsers can just perform a conditional HTTP request using the <code>ETag</code> or <code>Last-Modified</code> headers that they cache together with the resource itself. If the content is still the same, the server will be able to deduce that from the headers and reply with a <code>304 Not Modified</code> HTTP response. As long as your site is not very highly frequented or references many additional resources, cache revalidation is not too expensive.</p>
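<p>Such a conditional request is easy to observe with <code>curl</code>. This sketch reuses the <code>ETag</code> from the response above; the URL and tag are only illustrative, and the result naturally depends on the live server:</p>

```shell
# Send a conditional GET; -w prints only the HTTP status code.
# Expect 304 if the ETag still matches, or 200 with a full body otherwise.
curl -s -o /dev/null -w '%{http_code}\n' \
    -H 'If-None-Match: "4531062"' http://blog.lxgr.net/
```
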
<p>One thing that has also helped me tremendously in understanding HTTP caching was <a href="http://stackoverflow.com/a/385491/1016939">an answer on Stackoverflow</a> that explains how to force the various browsers to revalidate a resource or to completely bypass the cache – for debugging, it's very useful to know that there is a big difference between pressing <code>F5</code> or <code>Ctrl + F5</code> in most browsers.</p>Variable indirection in shell scripts2013-03-13T15:30:00+01:00lxgrtag:blog.lxgr.net,2013-03-13:posts/2013/03/13/variable-indirection-shell-scripts/<p>Recently, I had to find a way to do variable indirection in a shell script. More specifically, I wanted to write a function that takes two arguments and interprets one of them as a string, and the other one as a variable to which that string should be added – a simple append function.</p>
<p>Usually, that would be a good occasion to switch to some more comfortable scripting language than the unix shell, but sometimes that's not possible. So here is how to do it (thanks to <a href="http://tldp.org/LDP/abs/html/ivr.html">an article on TLDP</a>):</p>
<div class="highlight"><pre>append<span class="o">()</span> <span class="o">{</span>
<span class="c"># Appends the value of $1 to the variable indicated by $2</span>
<span class="nb">eval</span> <span class="nv">$2</span><span class="o">=</span><span class="se">\"\$</span><span class="nv">$2</span> <span class="nv">$1</span><span class="se">\"</span>
<span class="o">}</span>
</pre></div>
<p><a href="http://tldp.org/LDP/abs/html/internal.html#EVALREF"><code>eval</code></a> is a very useful shell built-in that converts a string to a command, performing the regular shell variable substitution. In the small function above, this means that when calling the function like this:</p>
<div class="highlight"><pre><span class="nv">A_VARIABLE</span><span class="o">=</span><span class="s2">"initial value"</span>
append <span class="s2">"some string"</span> <span class="s2">"A_VARIABLE"</span>
</pre></div>
<p>The line</p>
<div class="highlight"><pre><span class="nb">eval</span> <span class="nv">$2</span><span class="o">=</span><span class="se">\"\$</span><span class="nv">$2</span> <span class="nv">$1</span><span class="se">\"</span>
</pre></div>
<p>first becomes (by regular shell variable substitution)</p>
<div class="highlight"><pre><span class="nb">eval </span><span class="nv">A_VARIABLE</span><span class="o">=</span><span class="s2">"</span><span class="nv">$A_VARIABLE</span><span class="s2"> some string"</span>
</pre></div>
<p>which is then evaluated as a command, again with variable substitution:</p>
<div class="highlight"><pre><span class="nv">A_VARIABLE</span><span class="o">=</span><span class="s2">"initial value some string"</span>
</pre></div>
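<p>The whole chain can be checked end to end with a self-contained script (plain POSIX <code>sh</code>, same function as above):</p>

```shell
#!/bin/sh
append() {
    # Appends the value of $1 to the variable named by $2, via eval
    eval $2=\"\$$2 $1\"
}
A_VARIABLE="initial value"
append "some string" "A_VARIABLE"
echo "$A_VARIABLE"    # -> initial value some string
```

Note that this simple version breaks if the appended string contains double quotes or backslashes – another reminder that <code>eval</code> needs careful quoting.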
<p>At least, I hope that this is what is actually going on... Quote escaping in shell scripts can be tricky sometimes. Many more useful examples of indirect references can be found in the referenced article.</p>TLS and RC4 - not so secure after all2013-03-13T11:24:00+01:00lxgrtag:blog.lxgr.net,2013-03-13:posts/2013/03/13/tls-rc4-not-so-secure/<p>Turns out that TLS with RC4 (which was <a href="http://blog.phonefactor.com/2011/09/23/slaying-beast-mitigating-the-latest-ssltls-vulnerability/">supposed to protect us</a> against the BEAST and the CRIME attacks) is <a href="http://www.isg.rhul.ac.uk/tls/">not so secure after all</a>:</p>
<blockquote>
<p>The attacks arise from statistical flaws in the keystream generated by the RC4 algorithm which become apparent in TLS ciphertexts when the same plaintext is repeatedly encrypted at a fixed location across many TLS sessions.</p>
</blockquote>
<p>That sounds familiar... A few months ago, I read a very similar statement in a paragraph on attacks on RC4 <a href="http://tools.ietf.org/html/rfc4345#section-5">in RFC4345</a> (Improved Arcfour Modes for SSH):</p>
<blockquote>
<p>[...] A consequence of this is that encrypting the same data (for instance, a password) sufficiently many times in separate Arcfour keystreams can be sufficient to leak information about it to an adversary.</p>
</blockquote>
<p>Intrigued by that, I posted <a href="http://crypto.stackexchange.com/questions/3451/is-rc4-a-problem-for-password-based-authentication/">a question</a> on Stackexchange Cryptography, asking if the same problem wouldn't also apply to TLS, with pretty bad implications for password/cookie authentication. I got a very interesting response by a user named poncho, who claimed that he was able to successfully recover a password from 8 billion RC4 encrypted messages.</p>
<p>8 billion seems like too much for a practical attack, even when the attacker is able to provoke repeated retransmissions of the secret, but if there were a way to optimize that attack, TLS with RC4 would be in serious trouble. And this seems to be exactly what happened just now.</p>
<p>Matthew Green has published a very nice <a href="http://blog.cryptographyengineering.com/2013/03/attack-of-week-rc4-is-kind-of-broken-in.html">summary</a> of the new attack and the implications on his blog, and I completely agree with his conclusion – we need to stop using RC4.</p>Server relocation2013-03-08T11:46:00+01:00lxgrtag:blog.lxgr.net,2013-03-08:posts/2013/03/08/server-move-graz-vienna/<p>This weekend, the server on which this blog is hosted will be <a href="http://www.edis.at/de/support-und-service/blog/edis-zieht-um-nach-wien/">moved from Graz to Vienna</a>. If all goes well, there will be a short outage on Saturday evening/night, and much better connectivity afterwards.</p>VPNs and IPv62013-03-06T10:31:00+01:00lxgrtag:blog.lxgr.net,2013-03-06:posts/2013/03/06/vpn-circumvention-ipv6/<!--- Summary: Inadvertent VPN circumvention by IPv6 -->
<p>A while ago, I set up a small VPN for personal use (mostly for security when using public wireless networks) with <a href="http://openvpn.net/">OpenVPN</a>. The setup is pretty easy, thanks to a <a href="http://wiki.openvpn.eu/index.php/Konfiguration_eines_Internetgateways">very helpful tutorial (in German)</a> and the sensible default settings of OpenVPN itself. (Setting up the certificate infrastructure was a bit annoying, though – I would really prefer an SSH-like approach where the users can create a private key for themselves, and the VPN server has a list of the key/user mappings, but that's another story.)</p>
<p>Configuring the server to push a default route to the clients is as simple as setting the <code>push redirect-gateway def1</code> option in the server configuration, and mostly, that works as expected.</p>
<p>However, there is a huge caveat for IPv4-only clients. Since I don't have an IPv6 subnet big enough to provide IPv6 tunneling on my server as well (OpenVPN, or at least the version included in Ubuntu 12.04, seems to require a /64 subnet for now, but my provider only provides a tiny /112), I just didn't configure IPv6 and expected IPv6 connectivity to be broken. But that's not what's happening:</p>
<p>When connecting to the VPN from a client that has both IPv4 and IPv6 connectivity, only the IPv4 traffic will be routed over the VPN gateway, <em>but the IPv6 traffic will be routed locally</em> – and since the <a href="http://www.worldipv6launch.org/">world IPv6 launch</a>, there are quite a lot of hosts that are reachable over IPv6 and serve AAAA records to all users. Except for TLS protected connections, traffic to them will be unencrypted, and even then, the connection metadata (IP addresses etc.) will be plainly visible to anybody on the public network.</p>
<p>After thinking about that for a while, it kind of makes sense: At no point did I instruct OpenVPN to <em>break</em> my existing IPv6 connectivity, and since I didn't provide any IPv6 tunneling settings, my routes for that were just left alone. It can also be fixed easily enough – just configure a script that tears down IPv6 connectivity before connecting to the VPN, and restores it immediately after that. Maybe there is even a way to do that from the server via <code>push</code> instructions, but I've had no success with that so far.</p>
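<p>On Linux, such a script can be as small as a pair of <code>sysctl</code> calls. This is a sketch (it requires root, and note that it disables IPv6 on <em>all</em> interfaces, not just the one facing the untrusted network):</p>

```shell
# Before connecting to the VPN: turn off IPv6 entirely.
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1

# ... VPN session ...

# After disconnecting: restore IPv6 connectivity.
sysctl -w net.ipv6.conf.all.disable_ipv6=0
sysctl -w net.ipv6.conf.default.disable_ipv6=0
```
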
<p>Curious about the issue, I decided to check how other VPN solutions and clients handle that situation, with pretty much the same results:</p>
<p>Cisco's AnyConnect, when used with OpenConnect, behaves exactly like OpenVPN. The Android client, however, seems to specifically work around that problem – IPv6 connectivity breaks while connected to an IPv4-only VPN. I haven't been able to find out how that works, but I suspect that either some additional routes are pushed to the client, and the IPv6 traffic is discarded locally or at the gateway, or something is going on at the DNS level (since I get <code>NAME_NOT_RESOLVED</code> errors when visiting <em>what-is-my-IPv6</em>–like sites from the VPN).</p>
<p>OpenVPN for Android behaves just like the Linux client. Unfortunately, unlike Linux, Android provides no way for the user or the VPN application developer to disable IPv6, which makes a workaround pretty much impossible. I've reported that as <a href="https://code.google.com/p/ics-openvpn/issues/detail?id=142">a bug</a> to the developer, even though it is a problem with Android, not OpenVPN – maybe he'll figure out a solution. (I've also reported it as <a href="https://code.google.com/p/android/issues/detail?id=48417">an Android bug</a>, but I'm really not sure if there's anybody from Google watching that bug tracker...)</p>
<p>I'm not really sure what would be the best solution to the problem: Should VPN clients just break IPv6 connectivity by default, to protect the data of users who will most likely be using a VPN under the assumption that it will be doing just that? Or should it leave IPv6 alone by default, just providing an option to automatically disable their IPv6 connectivity when needed? I'm in favor of the first approach, but as it is, even the second one would be a big step forward – on Android, users have <a href="https://code.google.com/p/android/issues/detail?id=32216"><em>no</em> way of disabling IPv6 traffic circumventing their VPN connection</a>, and most will not even know that it's happening.</p>My OpenWrt setup2013-01-28T12:15:00+01:00lxgrtag:blog.lxgr.net,2013-01-28:posts/2013/01/28/my-openwrt-setup/<!--- Summary: My OpenWrt configuration -->
<p>This weekend, I finally reinstalled <a href="https://openwrt.org/">OpenWrt</a> on my home router. I've been using a nightly build for several months now, and it had been working just fine, but unfortunately, the opkg (OpenWrt's package manager) repositories for the nightly builds are updated every few days, and all of the kernel modules have hard dependencies on a specific kernel version. So in order to install a new kernel module, I would have to upgrade my OpenWrt version every time – not very convenient.</p>
<p>Luckily, there is a release candidate for the newest version, called "Attitude Adjustment", which is what I upgraded to. This went without any major problems and I was even able to keep my configuration, but I had to reinstall all packages not included by default. At this opportunity, I decided to document my configuration and installed packages.</p>
<h1>Avoiding Bufferbloat</h1>
<p><a href="http://www.bufferbloat.net/">Bufferbloat</a> is a nasty phenomenon that occurs mostly on residential internet connections. In a nutshell, hugely oversized buffers combined with static queue management in cable and DSL modems cause huge delays during large uploads, which can lead to a very bad experience for interactive applications. I use a SIP phone to make voice calls, and before I put this fix in place, it was nearly impossible to make a phone call while somebody was uploading pictures or videos on the same network – now it works just fine.</p>
<p>The solution to Bufferbloat is two-fold: First, the available bandwidth has to be limited on the router to just below the bandwidth actually available to the modem, in order to avoid any buffering in the modem. Then, some adaptive or even "fair" queueing algorithm like <a href="http://queue.acm.org/detail.cfm?id=2209336">CoDel</a> can be used. Both tasks can be achieved via the Linux packet scheduler and its management tool, <code>tc</code>. Both are available as OpenWrt packages:</p>
<div class="highlight"><pre>opkg install tc kmod-sched
</pre></div>
<p>uci, OpenWrt's configuration system, doesn't support custom packet schedulers yet, so my setup script is implemented as a shell script that is executed once at every boot of the router.</p>
<div class="highlight"><pre><span class="c">#!/bin/sh</span>
<span class="c"># Insert the necessary kernel modules</span>
insmod sch_htb
insmod sch_fq_codel
<span class="c"># Reset the queueing disciplines</span>
tc qdisc del dev eth1 root
tc qdisc del dev br-lan root
<span class="c"># Add a HTB queue to the internal interface (to limit upload speeds)</span>
tc qdisc add dev eth1 root handle 1: htb default 1
<span class="c"># Limit the upload speed to 2048 kbit/s (adjust this to just below your actual upload speed!)</span>
tc class add dev eth1 parent 1: classid 1:1 htb rate 2048kbit
<span class="c"># Enable CoDel as a queueing algorithm for the queue</span>
tc qdisc add dev eth1 parent 1:1 handle 11: fq_codel
<span class="c"># The same for download speeds - adjust accordingly</span>
tc qdisc add dev br-lan root handle 1: htb default 1
tc class add dev br-lan parent 1: classid 1:1 htb rate 32768kbit
tc qdisc add dev br-lan parent 1:1 handle 11: fq_codel
<span class="nb">exit </span>0
</pre></div>
<p>This solved the Bufferbloat issue completely for me, but whether it works for you depends on a lot of factors – test the script before enabling it at every boot.</p>
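<p>Whether the queueing disciplines are actually in place (and seeing traffic) can be checked with <code>tc</code>'s statistics output; <code>eth1</code> and <code>br-lan</code> are the interface names from the script above, so adjust them to your setup:</p>

```shell
# Show the installed qdiscs with packet/byte counters. You should see an
# htb root with fq_codel attached, and the counters should grow under load.
tc -s qdisc show dev eth1
tc -s qdisc show dev br-lan
```
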
<h1>IPv6 connectivity via an HE 6in4 tunnel</h1>
<p>OpenWrt works with IPv6 out of the box. If your provider isn't supplying you with native v6 connectivity yet (which is unfortunately still very likely), you can use a <a href="http://www.tunnelbroker.net">free tunnel</a> provided by <a href="http://he.net">Hurricane Electric</a>. The registration process is pretty straightforward, and HE even provide configuration samples for many routers and operating systems. The one for OpenWrt didn't work for me (which was probably my fault, so you should give it a try!), but this one does:</p>
<p>First, the 6in4 tunneling module has to be installed:</p>
<div class="highlight"><pre>opkg install 6in4
</pre></div>
<p>Then, the tunnel interface can be configured in <code>/etc/config/network</code>.</p>
<p>For the local network, an IPv6 address has to be configured for the router. It can be any address in one of the "Routed IPv6 Prefixes" shown in the tunnel details on tunnelbroker.net, so if you were assigned the prefix <code>2001:470:1b:1234::/64</code>, you could choose <code>2001:470:1b:1234::1/64</code>.</p>
<div class="highlight"><pre>config interface 'lan'
option ifname 'eth0'
option type 'bridge'
option proto 'static'
option ipaddr '192.168.1.1'
option netmask '255.255.255.0'
option ip6addr '2001:470:1b:1234::1/64'
</pre></div>
<p>The 6in4 tunnel interface has to be configured like this (assuming your "Client IPv6 address" is <code>2001:470:1a:1234::2</code>, your "Server IPv4 address" is <code>1.2.3.4</code>, and your "Tunnel ID" is <code>12345</code>):</p>
<div class="highlight"><pre>config interface 'henet'
option proto '6in4'
option peeraddr '1.2.3.4'
option ip6addr '2001:470:1a:1234::2/64'
option tunnelid '12345'
option username 'your.username'
option password 'yourpassword'
</pre></div>
<p>Username and password are <em>not</em> identical to your HE login credentials – they can be retrieved on the "Example Configurations" tab on the "Tunnel Details" page.</p>
<p>To configure the firewall, you can execute the command provided by HE (<code>uci set firewall.@zone[1].network='wan henet'</code>), or manually insert the line <code>option network 'wan henet'</code> into the <code>wan</code> block in the file <code>/etc/config/firewall</code>.</p>
<p>If you want to use one of IPv6's nicest features, stateless autoconfiguration, <code>radvd</code> has to be installed...</p>
<div class="highlight"><pre>opkg install radvd
</pre></div>
<p>...and configured like this (in <code>/etc/config/radvd</code>):</p>
<div class="highlight"><pre>config interface
option interface 'lan'
option AdvSendAdvert 1
option AdvManagedFlag 0
option AdvOtherConfigFlag 0
list client ''
option ignore 0
config prefix
option interface 'lan'
list prefix '2001:470:1b:1234::/64'
option AdvOnLink 1
option AdvAutonomous 1
option AdvRouterAddr 0
option ignore 0
</pre></div>
<h1>Dynamic DNS</h1>
<p>OpenWrt has built-in support for many dynamic DNS services, and the <a href="http://wiki.openwrt.org/doc/howto/ddns.client">documentation on their wiki</a> has all you need to configure it.</p>
<h1>Port forwarding</h1>
<p>Specific ports of devices behind the OpenWrt NAT can be made reachable by a block like this in <code>/etc/config/firewall</code>:</p>
<div class="highlight"><pre>config 'redirect'
option 'name' 'myhomeserver'
option 'src' 'wan'
option 'proto' 'tcp'
option 'src_dport' '22'
option 'dest_ip' '192.168.1.123'
option 'dest_port' '22'
option 'target' 'DNAT'
option 'dest' 'lan'
</pre></div>
<h1>Port mirroring with iptables</h1>
<p>My main motivation for upgrading to the release candidate was actually an iptables module called <code>tee</code>. It can be used to duplicate all or a subset of the packets going through the router and transmit them to some other host running a packet analyzer like Wireshark. This can be very useful to debug embedded wireless devices without having to resort to a Wi-Fi card running as a packet sniffer, which was my previous approach.</p>
<p>In order for the following command to work, the <code>tee</code> module has to be installed on the router. For some reason, <code>ip6tables</code> is also required, or loading the module will fail:</p>
<div class="highlight"><pre>opkg install iptables-mod-tee ip6tables
</pre></div>
<p>To forward all traffic going through the router to a machine in the private network at 192.168.1.123, simply execute this command in a shell:</p>
<div class="highlight"><pre>iptables -A POSTROUTING -t mangle -j TEE --gateway 192.168.1.123
</pre></div>
<p>The forwarded packets will still show the internal IP addresses, which makes finding a specific device much easier.</p>
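<p>The <code>TEE</code> target accepts the usual iptables match options, so the mirror can also be restricted to just the one device being debugged instead of all traffic. A sketch, where <code>192.168.1.50</code> stands in for the embedded device's address:</p>

```shell
# Mirror only traffic from and to a single device to the analyzer host.
# 192.168.1.50 is a hypothetical device address; 192.168.1.123 is the
# machine running Wireshark, as in the command above.
iptables -A PREROUTING -t mangle -s 192.168.1.50 -j TEE --gateway 192.168.1.123
iptables -A POSTROUTING -t mangle -d 192.168.1.50 -j TEE --gateway 192.168.1.123
```
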
<h1>To be continued</h1>
<p>That's my configuration for now – I hope to be able to update this blog regularly if I find some improvements to my setup.</p>Hello2013-01-13T12:00:00+01:00lxgrtag:blog.lxgr.net,2013-01-13:posts/2013/01/13/hello/<!--- Summary: Welcome to my blog -->
<p>This is going to be my new personal blog. Topics will vary from programming and technical stuff to random thoughts about (possibly non-technical) things.</p>