Debugging NFS Slowness

During patching for the recent GHOST bug, I updated all packages (including kernel) on a Ubuntu 14.04 file server (filer). This filer provided static content (mainly tens of thousands of images) to a number of web servers. You can see the effect in the following load graph from the filer:

Load average on the filer
Load average on the filer

You may notice from the above, that there were actually two issues. The first was solved by upgrading the filer from 14.04 to 14.10 based on a number of online references to symptoms and fixes. About an hour after this upgrade, a new form of NFS slowness manifested and, needless to say, sites that rendered in <1sec were now taking >15secs.

Diagnosing the second issue took a while longer but some tips and utilities include:

  • check /var/log and see if any log files are increasing rapidly;
  • check top and check any processes with high / unusual utilisation;
  • use iostat (apt-get install sysstat) and pay particular attention to any devices with high volumes of transactions per second. In my case it was the root filesystem rather than any of the mounted partitions exported by NFS.
  • use iotop (apt-get install iotop) and note any processes with high utilisation (in my case jbd2/xvda1-8 was at 100% and xvda1-8 is my root partition)

The jbd2 process is the ext4 journaling process. At this point you can evaluate fsck’ing your partition but I wanted to see if I could discover what was happening here. I enabled some debugging via:

# enable tracing:
echo 1 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable
# wait a couple of seconds and:
cat /sys/kernel/debug/tracing/trace
# and disable tracing:
echo 0 > /sys/kernel/debug/tracing/events/ext4/ext4_sync_file_enter/enable

What I found were lots of:

nfsd-2085  [001] .... 53730942.155573: ext4_sync_file_enter: dev 202,1 ino 276278 parent 149955 datasync 0
nfsd-2071  [001] .... 53730942.158743: ext4_sync_file_enter: dev 202,1 ino 276278 parent 149955 datasync 0

where every entry related to the same inode number (276278). We found this via:

find / -inum 276278

The solution was to stop nfs_kernal_server, remove that directory entirely, add it back and restart the nfs_kernel_server. We got the permissions wrong on the first attempt but this’ll be obvious from dmesg / kernel log messages such as:

kernel: [53731827.778104] NFSD: Failed to remove expired client state directory 8d97cccceb37641d3804a84683a9282a
kernel: [53731827.779204] NFSD: failed to write recovery record (err -13); please check that /var/lib/nfs/v4recovery exists and is writeableNFSD: Failed to remove expired client state directory 8d97cccceb37641d3804a84683a9282a

Apple OS X as an NFS Server (with Linux Clients)

For a customer, I had to set up a Linux-based virtualised environment on a MacBook Pro using VirtualBox. This environment included making a couple of 8TB external hard drives available under NFS to the Linux hosts.

In all fairness, what better use can one put OS X to than to virtualise Linux?!?  Just kidding fanboys… well, sort of 😉

Let’s begin with a quick description of the environment:

  • A MacBook Pro (MBP) with OS X 10.8.2
  • VirtualBox with it’s own network (MBP: for NFS as well as bridged adapters for general Internet access;
  • Multiple external HDDs – for simplicity, let’s just do one here which is mounted under /Volumes/DATA-1.

We want to export the DATA-1 volume to the Linux clients. That bit’s actually not too hard (see below), the main issue is we needed to match what on Linux is call no_root_squash – i.e. so the root user on the Linux clients would have root access to the NFS shares. That bit was harder.

I’ll assume root access / sudo use in the following commands.

To configure NFS, we edit / create /etc/exports (e.g. nano /etc/exports) such as:

/Volumes/DATA-1 -maproot=root:wheel -network -mask

In other words:

  • export /Volumes/DATA-1
  • map the clients root user to local root user and the clients root group to local group wheel (gid = 0)
  • allow the export to be accessed by any host on the private VirtualBox network.

With that entry, NFS can be enabled at boot and started via:

nfsd enable
nfsd start

On a Linux client, this can then be mounted at boot with an /etc/fstab entry: /mnt/data-1 nfs defaults 0 0

The problem was that no matter what variation of options I used, I could not get root access from the Linux clients.

The answer came by chance when I glanced an odd mount option on the external HDD:

/dev/disk2s2 on /Volumes/DATA-1 (hfs, NFS exported, local, nodev, nosuid, journaled, noowners)

noowners? What pray-tell is this? The internet provided some insight:

In Leopard, due to an unfortunate design decision by Apple, “admin” authentication is now required to make this change (no noowners) and non-admin users are no longer able to use “Get Info” to change this setting, even on devices they own and have mounted themselves.

An unfortunate design decision indeed. The temporary solutions is to execute:

mount -u -o owners /Volumes/DATA-1

Thereafter, I now have root access / effective UID from the Linux clients. This of course needs to be entered each time – if someone has a more permanent solution, I’m all ears (see below for a cron script I have implemented for this).

Just as an aside, we have a lot of NFS activity which required some tuning. First, additional NFS threads by adding nfs.server.nfsd_threads=16 to /etc/nfs.conf (execute nfsd restart after that). I’ve also added the following line to /etc/rc.local:

sysctl -w kern.aiomax=64 kern.aioprocmax=32 kern.aiothreads=4

Cron Script for Automatically Removing noowners

As mentioned above, removing this mount option every time you connect these HDDs is damn annoying at best and error prone at worst. I have a script for this now which I locate in /var/root/bin/ which is:

#! /bin/bash

NOOWNERS=`/sbin/mount | grep "/Volumes/DATA-1" | grep noowners | wc -l`

if [[ "X${NOOWNERS//[[:space:]]/}X" = "X1X" ]]; then
    /sbin/mount -u -o owners /Volumes/DATA-1;

This is then executed via a new line in /etc/crontab:

* * * * *    root    /var/root/bin/