Following on from last time, I need to
migrate twenty-plus zones from a mix of Solaris 11 and SmartOS to my new
OmniOS machine. The Solaris zones are defined in Puppet, the SmartOS ones as
vmadm JSON files. The configuration of the operating system inside the zone
is done by Puppet in both cases.
One of my big beefs with Solaris 11 was that, because of IPS, it took Solaris
10’s lightweight zone management, and made it heavy and slow. 10 used what we
called “sparse” zones, where filesystems like /usr were linked from the
global zone. This meant there was little to install, and patching the global
patched the locals. (You could also install “whole root” zones if you needed
more control, but they took quite a bit longer and had to be patched
independently. Avoiding using them, usually by way of loopback mounts, became
something of an art.)
In 11, Sun made zones much more flexible, but at a cost. Zones were installed by adding packages to a fresh ZFS dataset: “sparse” mode was no longer a thing, and spinning up a zone – particularly if it was the first one on a system and you didn’t have a local package repo – took ages.
The solution to this was to start off not with an empty ZFS dataset, but with a clone of a pre-built zone. This made zone creation much, much faster and, in days when such things still mattered, saved a lot of disk space. On my Solaris 11 box I had a shell script which spun up a “golden zone”, and dropped into it a fully configured Puppet. Then, in the global zone’s Puppet config, I defined the local zones. Each time Puppet ran, it ensured that the right zones existed.
OmniOS tries to combine the best of the Solaris 10 and Solaris 11 worlds. It
still uses IPS and ZFS, but it brings back sparse-root zones with the sparse
brand, and offers as its default lipkg (linked ipkg), which shares fewer
global-zone directories with the local zone than sparse does. You can also
install a full-fat ipkg zone, which works in the same way as Solaris 11 and,
of course, you can still clone an ipkg zone from a golden image.
You can also run Linux if you like, by way of KVM or Bhyve emulation, or using lx brand zones, which use system-call translation. But I don’t use those, so they won’t get discussed here.
Puppet?
I was pretty certain that the Puppet zone building code wouldn’t work in OmniOS without some modification, but I didn’t bother checking. Though it was a simple thing to do, and felt “correct”, I was never overly keen on building the zones via Puppet, preferring the SmartOS way of defining the zone in a file and running a one-shot script which built it.
Oozone!
Lots of people have already come up with a way of doing exactly that. (I knew
about vmadm, I’ve since found zcage, zap, and zadm). But I happened to
have a Ruby programming interview coming up, so to sharpen myself a little, I
decided to reinvent the wheel and write a Ruby imitation of SmartOS’s vmadm.
Or, at least, the bits of it I needed.
Originally I called it ozone, but there was already a gem called that, and
there are two ‘o’s in ‘OmniOS’, so I renamed it oozone. I like to use a
rising inflection on the long ‘oo’, sort of like you’re pleasantly surprised,
but feel free to make that initial syllable annoyed, owl-ish, or comically
erotic. That’s the beauty of open source. It’s up to you.
I appreciate that Ruby is not a native OmniOS thing. Python is there (for
pkg(5)), and Perl is there (for intrd – I was sure we’d been weaned off
the Perl dependency, though maybe that was just SmartOS). But I’m using Puppet,
so I have to have Ruby on there.
oozone is on GitHub, along with some
documentation. In short, you define a zone as a YAML file (people can’t get
enough YAML), including all the things zonecfg(1m) requires, and a bunch of
“special” things that oozone understands. Then you run the script with one
or more of those files as arguments, and it builds you the zones. It will
create any dependent ZFS datasets, configure the zone’s DNS settings, and
run commands once the zone is up. It will also upload files or install
packages prior to running those commands, so it’s easy for me to bootstrap
Puppet. Here’s a sample zone configuration.
---
brand: lipkg
zonepath: /zones/mariadb
autoboot: true
net:
  - physical: mariadb_net0
    'global-nic': auto
    allowed-address: 192.168.1.152/24
    'defrouter': 192.168.1.1
dataset:
  - name: fast/zone/mariadb
dns:
  domain: localnet
  nameserver:
    - 192.168.1.26
    - 192.168.1.1
facts:
  role: mariadb
  environment: lab
packages:
  - 'ooce/runtime/ruby-26'
run_cmd:
  - '/opt/ooce/bin/gem install puppet -v 5.5.17 --no-document'
  - '/opt/ooce/bin/puppet agent -t'
Note the facts hash. That gets converted into Puppet facts inside the zone.
If you add facts, oozone drops in a free zbrand fact, which is helpful
if you only need to perform certain operations for certain zone types.
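To make the idea concrete, here is a minimal Ruby sketch of how a facts hash like the one above could be turned into the contents of a Facter external-facts file, with the brand added as zbrand. This is an illustration only, not oozone's actual code: the external_facts helper is hypothetical, and writing the result to a YAML file under an external-facts directory such as /etc/puppetlabs/facter/facts.d inside the zone is an assumption about how the facts would be delivered.

```ruby
require 'yaml'

# Hypothetical helper: build the body of an external-facts YAML file
# from the `facts` hash of a zone definition. The zone's brand is
# added as `zbrand`, but only when the definition has facts at all.
def external_facts(zone)
  facts = zone['facts']
  return nil if facts.nil? || facts.empty?

  facts.merge('zbrand' => zone['brand']).to_yaml
end

# A cut-down version of the sample definition, already parsed.
zone = {
  'brand' => 'lipkg',
  'facts' => { 'role' => 'mariadb', 'environment' => 'lab' }
}

puts external_facts(zone)
```

Facter reads any .yaml file in its external-facts directory as a flat set of key/value facts, so a Puppet manifest could then branch on the role, environment, or zbrand values.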
Everything Was Better in the Old Days
For a “dead” operating system, Solaris 11 has been acquiring features at a fair old rate, and there are some zone-related ones that I miss.
Datasets aren’t Pools
First, OmniOS’s ZFS dataset handling isn’t quite as nice. In Solaris 11 you
get to give the dataset a name, and it appears in the local zone as a pool
with that name. I always call mine local, and it gives me a lovely clear
way of handling zone data.
$ zpool list
NAME    SIZE  ALLOC   FREE  CAP  DEDUP  HEALTH  ALTROOT
local   224G   124G  99.8G  55%  1.00x  ONLINE  -
rpool   199G  47.5G   152G  23%  1.00x  ONLINE  -
In OmniOS, the dataset appears as itself:
$ zfs list rpool/zonedata/mariadb
NAME                     USED  AVAIL  REFER  MOUNTPOINT
rpool/zonedata/mariadb   168K   440G    24K  none
Though the two are functionally identical, I prefer the Solaris way of doing it. Having dedicated pools makes the zone feel more like a “real” computer. The lack of the distinct pool broke a lot of my Puppet config too, but it was easy enough to fix.
Zones aren’t Services
Another nice Solaris feature is that as of 11.4, individual zones map to SMF services, which means you can define dependencies between them. Though this is clearly open to abuse (there’s never any excuse for a system which needs VMs to start in the right order), I did like being able to have my telemetry system up and running before the application zones came up.
Solaris also lets you specify how many zones it should start up (or shut down) at once. This is great on heavily zoned systems where startup gets intense as a hundred new-borns all pile onto the same resources.
Put together, these two features let you get the important stuff up quickly and painlessly, with the lessers biding their time, and I miss them. Never mind: you can’t have everything.
Zones can’t NFS…
Solaris 11 lets you run NFS servers in zones, and I use this. On OmniOS:
$ svcs -H nfs/server
disabled Oct_16 svc:/network/nfs/server:default
$ pfexec svcadm enable nfs/server
$ svcs -H nfs/server
disabled 15:45:12 svc:/network/nfs/server:default
$ cat /var/svc/log/network-nfs-server:default.log
...
[ Oct 16 15:21:54 Executing start method ("/lib/svc/method/nfs-server start"). ]
The NFS server is not supported in a local zone
…Unless!
I was interested by something I’d seen on the brands(5) man page.
+--------+-------------------------------------------------------+
|illumos | An independent illumos environment running under the |
| | shared OmniOS kernel. |
+--------+-------------------------------------------------------+
illumos brand zones, which I presume are analogous to Solaris’s solaris-kz
“kernel” zones, are made to let you run something like SmartOS or Tribblix in
a zone. I wondered whether that level of virtualization might accommodate
an NFS server. So, I made one.
Instead of zoneadm install using locally cached packages or loopback mounts,
an illumos brand install blasts a ZFS image into the zone’s root pool in a
similar way to the Kayak installer.
So I made a quick “golden” image with oozone:
---
brand: ipkg
zonepath: /zones/tornado-gold
Then I made that zone, unconfigured it, and took a snapshot.
$ pfexec oozone create tornado-gold.yaml
$ pfexec zlogin tornado-gold
root@tornado-gold:~# /usr/lib/brand/ipkg/system-unconfigure
root@tornado-gold:~# halt
root@tornado-gold:~# ^D
$ pfexec zfs snapshot rpool/zone/tornado-gold/ROOT/zbe@cloner
$ pfexec zfs send -Lce rpool/zone/tornado-gold/ROOT/zbe@cloner \
| gzip -c9 > illumos_golden_image.zfs.gz
Here I ran into my first proper snag. Whatever I did, I couldn’t get a working network connection in my Illumos zone; at least not on first boot. After a bit of study I found that if I manually created an interface on top of the zone’s VNIC:
# ipadm create-if $(dladm show-link -polink)
then rebooted, it all worked.
It seemed like the illumos brand install mechanism was probably, under the
hood, identical to a standard zone clone operation. So I tried cloning zones
and got the same behaviour.
This felt like a bug, so I reported it in the OmniOS Gitter channel, and in an hour or so I had a patch that fixed it.
But better than that: one of the maintainers had spun off a hotfix patch of a branch which properly supported NFS in zones. Wow. How’s that for service?
I abandoned the illumos experimentation (I don’t think it would have worked
anyway) and ended up running my NFS server from a zone, having oozone
install the hotfix as a post-build command. Seamless!
UPDATE: as of the r151034 release, OmniOS supports NFS in zones.
Back to the “problems”.
Zones are Not Immutable
You get a lot of talk about “immutable infrastructure” these days. It isn’t really of course: you can change it, you just don’t. It’s like doing functional programming in Python by just choosing to create new data structures instead of modifying existing ones. It might make you feel cool, but you’re kidding yourself.
But Solaris 11 lets you make zones that are genuinely immutable. Inside one,
you cannot change anything. This is managed by the global zone’s kernel, and
it’s non-negotiable unless the global zone administrator turns it off, or
sneaks into the zone via the trusted path. (You can also configure zones which
allow modification of files under, say, /var, should you need to perform
such 20th century actions as logging to disk.)
OmniOS may not have immutable zones as such, but sparse zones are reasonably close.
root@sparse:~# df -h -Flofs
Filesystem             Size   Used  Available  Capacity  Mounted on
/lib                   443G  1.00G       442G        1%  /lib
/sbin                  443G  1.00G       442G        1%  /sbin
/usr                   443G  1.00G       442G        1%  /usr
/usr/lib/libc/libc_hwcap1.so.1
                       443G  1.00G       442G        1%  /lib/libc.so.1
root@sparse:~# touch /lib/a /sbin/a /usr/a
touch: cannot create /lib/a: Read-only file system
touch: cannot create /usr/a: Read-only file system
touch: cannot create /sbin/a: Read-only file system
Smaller, faster, more secure. I like sparse zones.
OmniOS doesn’t Have Many Packages…
As of right now, the OmniOS pkg repo has 759 packages in it. The Solaris 11.4
equivalent has 6444.
Sounds pretty rubbish, doesn’t it? Except it’s not, because having that small
number of core packages makes IPS work. Using pkg(5) on OmniOS is a
different experience from using it on Solaris, even if you also have the
extra repo, which serves up things like Ruby.
…Except!
But what good is a fast base system if it can’t run any applications? What if you need stuff like Clojure, or Elixir? I had those in some of my SmartOS zones, pulled straight from Joyent’s pkgsrc repo.
If you need all kinds of crazy software, OmniOS lets you create a pkgsrc
branded zone. This builds a base image in the same way as other zones, but
installs and configures everything you need to access the aforementioned
Joyent repo. It even installs things in /opt/local just like SmartOS does.
All my SmartOS tooling worked with no changes at all.
The only downside of the pkgsrc brand is that /usr is mounted read-only,
so you cannot install (most) normal OmniOS ipkg packages. (I say “most”
because things from the extra repo install into /opt/ooce, so are fine.)
This caught me out, as things seemed to have installed but hadn’t. It’s not
a big deal once you’re aware of it, and so far as I can tell everything the
OmniOS repos offer is also in pkgsrc.
I think the pkgsrc brand is great, and it’s a good example of the creative,
elegant thinking that typifies Illumos. The Illumos way is not to throw code
at a problem, or to invent a slightly different wheel, but to take a step
back, look at what’s there, have a think, and adapt. The different zone types
are all driven by a few shell scripts hooking into the branded zones
framework. It’s a slight bending of a proven solution, and that’s my kind of
engineering.
In building my first half-dozen zones I’d found uses for lipkg (faster and
smaller than ipkg but still lets you add packages); sparse (faster and
smaller than lipkg, for when you don’t need much configuration); and
pkgsrc (for when you need funny applications).
Doing It
I built my test machine off the old Solaris Puppet zone. When the time came to nuke that and build OmniOS on the old hardware, I hit the ouroboros of building the Puppet server.
Puppet’s perfectly happy to Puppet itself and add all the fine-detail like
users and whatnot, so all I had to do was build a zone which could do that.
Fortunately there’s a decent Puppet package in the Joyent pkgsrc repo. Here’s
the oozone config that builds a zone, installs the needful, and Puppets
itself.
---
brand: pkgsrc
zonepath: /zones/cube-puppet
autoboot: true
net:
  - physical: puppet_net0
    'global-nic': auto
    allowed-address: 192.168.1.51/24
    'defrouter': 192.168.1.1
fs:
  - dir: /home
    special: /export/home
    type: lofs
dataset:
  - name: fast/zone/puppet
dns:
  domain: localnet
  search: localnet
  nameserver:
    - 192.168.1.26
    - 192.168.1.1
facts:
  role: puppet
  environment: lab
upload:
  'files/cube-puppet/puppet-master.xml': /lib/svc/manifest/site/puppet-master.xml
  'files/cube-puppet/puppet.conf': /etc/puppetlabs/puppet/puppet.conf
run_cmd:
  - 'yes | /opt/local/bin/pkgin in ruby26-puppet-5.5.2'
  - '/opt/local/bin/gem install fast_gettext -v 1.1.2 --no-document'
  - '/usr/sbin/groupadd -g 40 puppet'
  - '/usr/sbin/useradd -u 40 -g 40 -s /bin/false -d /var/tmp puppet'
  - '/usr/sbin/svccfg import /lib/svc/manifest/site/puppet-master.xml'
  - '/bin/mkdir -p /opt/ooce/bin'
  - '/bin/ln -s /opt/local/bin/puppet /opt/ooce/bin/puppet'
  - '/bin/sleep 10'
  - '/opt/ooce/bin/puppet agent -t'
The Puppet code is brought into the zone through the delegated dataset, and the code in said dataset is kept under version control. That dataset also contains additional configuration files for things like automatic certificate signing.
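The fs and dataset entries in that config correspond to zonecfg(1m) resources, while keys like run_cmd and upload are the “special” ones. As a rough illustration of the mapping, here is a minimal Ruby sketch that turns such a definition into zonecfg subcommands. It is not oozone's real implementation: the zonecfg_commands helper and the two constant lists are hypothetical, and it assumes every property value can be written unquoted.

```ruby
require 'yaml'

# Keys assumed to map to zonecfg resources (each list entry becomes an
# `add ... end` stanza) and keys assumed to be handled by the tool
# itself rather than by zonecfg. Both lists are illustrative.
RESOURCES = %w[net fs dataset].freeze
SPECIAL   = %w[dns facts packages upload run_cmd].freeze

# Hypothetical helper: turn a parsed zone definition into a list of
# zonecfg subcommands, one per line.
def zonecfg_commands(zone)
  cmds = ['create -b']
  zone.each do |key, value|
    next if SPECIAL.include?(key)

    if RESOURCES.include?(key)
      value.each do |resource|
        cmds << "add #{key}"
        resource.each { |prop, val| cmds << "set #{prop}=#{val}" }
        cmds << 'end'
      end
    else
      # Anything else is treated as a global property.
      cmds << "set #{key}=#{value}"
    end
  end
  cmds
end

zone = YAML.safe_load(<<~ZONE)
  brand: pkgsrc
  zonepath: /zones/cube-puppet
  autoboot: true
  fs:
    - dir: /home
      special: /export/home
      type: lofs
  dataset:
    - name: fast/zone/puppet
  run_cmd:
    - '/bin/sleep 10'
ZONE

puts zonecfg_commands(zone)
```

The resulting lines could be written to a file and handed to zonecfg with its -f option. A real tool would also need to quote values containing spaces and validate property names, which this sketch skips.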
I’d never put a sleep in anything that mattered, but this is a task run so
rarely, and always under supervision, that I’ll take a deep breath and let it
go. We just need the Puppet server to be up, and it probably takes less than a
second. Yes, sleep is awful, but OmniOS doesn’t offer watch(1) [UPDATE: it
does now] and I don’t have a great deal of love for the idea of embedding
shell loops in a YAML file to later be executed by Ruby.
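For what it’s worth, if the waiting were done on the Ruby side rather than in the YAML, the fixed sleep could be replaced by polling until the server accepts connections. The following is only a sketch of that idea, not something oozone does: wait_for_port is a hypothetical helper, and treating “the Puppet server is up” as “something is listening on TCP port 8140” (Puppet’s default server port) is an assumption.

```ruby
require 'socket'

# Hypothetical helper: poll until something accepts TCP connections on
# host:port, or give up after `timeout` seconds. Returns true if a
# connection succeeded in time, false otherwise.
def wait_for_port(host, port, timeout: 30, interval: 0.5)
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
  loop do
    begin
      # Socket.tcp closes the socket when the block exits.
      Socket.tcp(host, port, connect_timeout: interval) { return true }
    rescue SystemCallError, SocketError, IOError
      # Nothing listening yet (refused, timed out, unresolved): retry.
    end
    return false if Process.clock_gettime(Process::CLOCK_MONOTONIC) >= deadline

    sleep interval
  end
end

# Example use, commented out so the sketch has no side effects:
# abort 'Puppet server never came up' unless wait_for_port('localhost', 8140)
```

This bounds the wait rather than guessing at it, at the cost of a few more lines than a sleep.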
Next I built the DNS zone, with an additional command to temporarily pop
puppet.localnet into /etc/hosts so it will always find the Puppet server.
After that, building zones was entirely straightforward.
Any zone can be rebuilt from a single command. I have full telemetry going into Wavefront, with alarms for all things traditional as well as failed SMF services, failed Puppet runs, failed cron jobs.
The Solaris to OmniOS migration was not only painless, but enjoyable. OmniOS feels very professional, polished, and focused. Its user community is small, but extremely helpful and inclusive. Long may it prosper.