The site I’m working on builds everything on Ubuntu, with Puppet. Lately a lot of people have been complaining (they’re ops people, what else do you expect them to do?) that our Puppet runs are too slow. And they’re right. They’re incredibly, painfully slow. But why? What’s going on? We don’t do an awful lot to the boxes. We use a masterless setup, so the catalog has to be compiled every time, but my Wavefront Puppet reporter shows me that by far the bulk of the time is taken by the actual run, not the compilation.
Puppet spends most of its time exec()ing external programs, getting them to do the real work. So I thought it might be interesting to see which externals were being called, and how much time was spent doing what.
I could have used sysdig, but I’m far more comfortable with DTrace, and I already do the vast majority of my dev work in LX branded zones on SmartOS. For those who don’t know, SmartOS is a Solaris-derived platform which can already run safe, secure containers, virtualize the network stack, and use a parallel init system with restarters; it allows safe dynamic tracing, and has a filesystem/volume manager not rooted in the early 1980s. Normally you get the SmartOS user environment, which is based on Solaris but incorporates most of the modern GNU toolchain. However, if you wish, you can run Linux (LX) branded zones, which are containers with the ability to map the Linux ABI to the SunOS one: this means most Linux stuff works perfectly and seamlessly. You also get /native, which lets you run proper, grown-up things like dtrace(1) and the proper versions of the ptools.
A DTrace Script
I wanted this process to be completely external to the normal bootstrapping and Puppetizing process. That is, I didn’t want to have to change anything, or interact with the bootstrapping zone at all.
Though you can run DTrace in the local zone, I chose to run it from the global (hypervisor) zone. This meant I was looking in from the outside, not interfering at all with the standard bootstrapping and Puppet processes.
I played with a couple of ways of doing this. In the end I decided to grab the contract ID of the Puppet process when it started, and trace processes only with that same contract ID.
Contracts are groups of related processes. They’re used primarily by SMF (think systemd, but finished) as a way of knowing when a process has died and guaranteeing it is restarted. I’m a little murky on exactly how they work in LX zones, but poking around leads me to think that my bootstrapper and all its children will run under their own unique contract.
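If you want to see which contract the wrapper actually lands in, a one-liner along these lines will print it from the global zone as soon as the process execs. This is just a sketch; it uses the same pr_contract field the main script below relies on:

# dtrace -n 'proc:::exec-success /execname == "puppet-apply.sh"/ { trace(curpsinfo->pr_contract); }'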
So, I wrote the following DTrace script:
#pragma D option bufpolicy=fill
#pragma D option bufsize=1m
BEGIN {
contract = -1;
}
proc:::exec-success
/execname == "puppet-apply.sh" && contract == -1/
{
contract = curpsinfo->pr_contract;
}
proc:::exec-success
/curpsinfo->pr_contract == contract/
{
time_arr[curpsinfo->pr_pid] = timestamp;
}
proc:::exit
/time_arr[curpsinfo->pr_pid]/
{
printf("%d\t%s\t%s\n", timestamp - time_arr[curpsinfo->pr_pid],
curpsinfo->pr_fname, curpsinfo->pr_psargs);
}
proc:::exit
/execname == "puppet-apply.sh"/
{
exit(0);
}
What does it do? Well, the #pragmas tell DTrace to buffer up the output rather than writing it to stdout. It will be flushed when the script exits.

The BEGIN block, like its awk namesake, runs once, right at the start of execution. I set the contract variable to -1 because a contract can never have that ID, but I need it to be set to something so as not to mess up the predicates later in the script.
The first proper block of code runs when the puppet-apply.sh process starts. proc:::exec-success refers to a probe, which fires each time a process is successfully kicked off by exec(), or execve(), or one of their friends. The block of code in { braces } is only run if the /predicate/ passes. In this case we only want to run our code if the probe was triggered by puppet-apply.sh running for the first time. If it was, the script remembers the contract ID which that puppet-apply.sh has. I found out how to access the contract ID from the psinfo struct by looking in /usr/include/sys/procfs.h.
The next block of code is very similar, but the predicate is different. Whereas before we looked to match the execname of the process which triggered the probe, this time we compare the contract ID of the process with the value we stored in the earlier block. For any matching process (loosely speaking, any part of the Puppet run) we record an entry in an array called time_arr, keyed on the process PID, with the value being the time at which the process started: i.e. now.
The next block runs when a process exits. We don’t care why the process exits, just that it does. The /predicate/ says to use the PID of the exiting process to make a lookup in the time_arr array. It then prints out the time for which the process existed. This is the simple delta between start time and stop time. I don’t care how long it was blocked, on CPU, or anything else. I could do this quite easily if I wished, by accessing other data in the psinfo struct.
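For instance, here is a sketch of an alternative exit clause (assuming, as /usr/include/sys/procfs.h suggests, that pr_time holds the process’s cumulative user plus system CPU time) which would print CPU time alongside the wall-clock delta:

proc:::exit
/time_arr[curpsinfo->pr_pid]/
{
	/* wall-clock delta, then usr+sys CPU time, both in nanoseconds */
	printf("%d\t%d\t%s\t%s\n",
	    timestamp - time_arr[curpsinfo->pr_pid],
	    curpsinfo->pr_time.tv_sec * 1000000000 + curpsinfo->pr_time.tv_nsec,
	    curpsinfo->pr_fname, curpsinfo->pr_psargs);
}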
I chose to use pr_fname, which is simply the name of the executable, as well as pr_psargs, which gives a (shortened) command line, including options and arguments. I want a very broad-brush view: a general flavour of what’s happening, and in my preliminary experiments I could see that the same binaries were being referred to implicitly and explicitly, and I didn’t want to have to amalgamate things later. More data means more options when I come to analyze it later.
Finally, I catch the wrapper script exiting, and terminate my tracing at that point.
An LX Branded Zone
I ran the D script in one window, in the global zone:
# dtrace -qs puppet-apply.d >out
then from a second terminal, spun up an Ubuntu LX zone using this manifest
{
"brand": "lx",
"image_uuid": "5917ca96-c888-11e5-8da0-e785a1ad1185",
"alias": "ubuntu",
"kernel_version": "3.13.0",
"hostname": "xxxx_base",
"alias": "xxxx_base",
"max_physical_memory": 2048,
"resolvers": ["192.168.1.26", "8.8.8.8"],
"nics": [
{
"nic_tag": "admin",
"ip": "192.168.1.84",
"netmask": "255.255.255.0",
"gateway": "192.168.1.1",
"primary": "1"
}
],
"customer_metadata": {
"user-script": "wget -P /var/tmp http://httpserv.localnet/xxxx_bootstrap.sh;
chmod 700 /var/tmp/xxxx_bootstrap.sh; /var/tmp/xxxx_bootstrap.sh
puppet-ops/ops:dev:base >/var/log/xxxx_bootstrap.log 2>&1"
}
}
and this command.
# time vmadm create -f xxxx_base.json
Successfully created VM ca42db41-161d-438b-94e7-ea8d2a8dd87a
real 0m3.411s
user 0m0.846s
sys 0m1.002s
Doesn’t take long, does it?
The customer_metadata fetches a special bootstrap script from a local HTTP server, and runs it. This script is particular to my SmartOS setup (no one else where I work uses SmartOS) and acts as a thin wrapper around our actual bootstrap script. Essentially, it makes a fresh zone look like one of our standard Vagrant boxes. It then downloads and runs our real bootstrapper, which sets up repos and some external facts, then runs Puppet via a wrapper script called puppet-apply.sh.

image_uuid refers to an Ubuntu 14.04 base image supplied by Joyent. Everything else should be self-explanatory.
I flipped back to my DTrace window, and waited for stuff to occur.
When the DTrace exited and gave me back my prompt, I knew the Puppet run was complete, and I had data.
Results
$ wc -l out
3741 out
$ sed 10q out
1408894 kmod /sbin/lsmod
1543007 grep grep -q vbox
1541005 uname /bin/uname -s
1405667 hostname /bin/hostname
1858056 hostname /bin/hostname -f
4196447 dash sh -c /bin/hostname -f 2> /dev/null
1767484 hostname /bin/dnsdomainname
4072433 dash sh -c /bin/dnsdomainname 2> /dev/null
1770362 hostname /bin/hostname -f
4078016 dash sh -c /bin/hostname -f 2> /dev/null
So lots of programs ran, and we have recorded the run time, name, and arguments of each one. To recap, the first column is the time the command took, in nanoseconds; the second is the program which ended up running; and the remainder is the command that was invoked. The first line illustrates that lsmod is a symlink to kmod, and later we see that Ubuntu’s sh is really dash.
I recently re-read The AWK Programming Language (if you haven’t read it, do; if you have, read it again, because you’ve probably forgotten how good awk is), so there seems a very obvious next step. Let’s tot up all those exec times for actual program calls with a little script called sum.awk and find out what we spent all that time doing.
{ cmd[$2] = cmd[$2] + $1; }
END { for (c in cmd)
print cmd[c] / 1e9, c;
}
$ awk -f sum.awk < out | sort -nr | sed 20q
428.482 dash
401.509 puppet-apply.sh
401.497 flock
401.477 puppet
345.846 apt-get
213.512 dpkg
41.5535 https
41.0096 http
20.8534 dpkg-preconfigur
18.8725 sshd
16.02 apt-extracttempl
14.8066 gem1.9.1
10.6841 man-db.postinst
10.4272 mandb
8.32055 frontend
8.25165 dpkg-deb
6.274 gpgv
5.77478 copy
5.35583 postfix.postinst
.442013 apt-show-version
We can ignore the first four entries, which are puppet-apply.sh and the shell it used (plus the odd other sh command), the lock which wraps our Puppet runs, and the Puppet process itself. We can also disregard sshd. Not sure quite how that crept in. Possibly it ended up in the same contract as Puppet when it was restarted after a config change.
So, clearly, the reason our Puppet runs take a long time is because we’re pulling in a lot of packages. Let’s use the long command fields to find out a bit more.
{ full_cmd = substr($0, index($0, $3));
cmd[full_cmd] = cmd[full_cmd] + $1;
}
END { for (c in cmd)
print cmd[c] / 1e9, c;
}
$ awk -f sum.awk < out | sort -nr | grep apt-get | sed 10q
116.858 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install td-agent
37.9265 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install bgch-complian
25.3062 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install mailutils
23.8725 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install awscli
14.5637 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold --force-yes install b
10.9446 /usr/bin/apt-get update
7.62643 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install apparmor-util
6.90385 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install gettext
5.84451 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install apt-show-vers
5.36272 /usr/bin/apt-get -q -y -o DPkg::Options::=--force-confold install ssh
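As a rough cross-check (just a sketch, reusing the out file and the per-command sum.awk above), totting up every summed line that mentions apt-get should land somewhere near the 345 seconds the first summary attributed to apt-get, give or take any shell wrappers whose arguments also happen to mention it:

$ awk -f sum.awk < out | grep apt-get | awk '{ total += $1 } END { print total }'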
The fluentd (td-agent) package takes a good while to install (unless you compare it to any package installed, ever, with IPS). I don’t know why we even install mailutils. We’d already trimmed out packages with heavy dependency lists (check out the dependencies on the Ubuntu collectd package), a couple of which took for ever to install (Aide was one). But still, the vast majority of time on the initial Puppet runs was spent downloading and installing packages.
Upshot: I re-baked our AMIs and Vagrant boxes so they already contained the packages installed by our base Puppet role. You could, I’m pretty sure, have guessed what was taking all the time, but it was fun to do a bit of DTracing for the first time in ages, and it’s always interesting to see what’s going on underneath the veneer.