Devops malarkey

Success. Failure. Cake.

Not Invented Here (Kubernetes Remix)

For the last yea-many months, I’ve been hacking on actual Kubernetes. It has been/is being really very interesting, and that’s only partly the precursor to an ‘interesting times’ riff. I have learned many things.

  • Anyone who says you don’t need any ops experience/people is lying through their teeth. The doc about how to make yr site ‘production ready’ advises you to carve out your own netblock(s) from within a lump of 10/8, allocate subnets for various systems, string them together with load balancers, consider capacity plans and redundancy, logging, monitoring, backups, security and access control. None of these things are beyond the wit of most people, but they’re more likely to be back-of-envelope stuff for yr average jobbing Unix admin than most other technical types.

  • Deployment tools/strategies seem, thus far at least, to consider it a job well done if they start/bounce/replace a single container. Which, well, is really quite a lot of a Noddy job.

  • There is a thing called Helm, which is jolly handy. It’s a package-manager for Kubernetes. If you’ve come from an ops background, then you’ll be familiar with the idea that a running (web)app is only half of the job, and generating something versioned and deployable is a much more useful target. Yes I know containers are versioned and deployable, but see how far you get with that attitude with K8S as a target. Just a simple matter of YAML, is it? Fine. I’ll wait while you grovel through a pile of similar-looking files, looking for the right version tag.

The first thing can be solved (FAVO ‘solved’) with Terrorform. The only nice thing I can say about Terrorform is that it works, mostly as advertised. A quite startling feature of the thing is that if you want to change something about a K8S cluster on GCloud with Terrorform, it will blow the cluster away and build you a new one. Now I guess this is probably less of a surprise if you’ve a set of redundant clusters in multiple zones, but it’s still not the sort of behaviour that one might hope for.

After that, Helm is a useful tool.

However, The Product is a collection of helm charts flying in close formation, and rightly so since generating a monolith is a daft idea and being able to tinker with subsystems in isolation is a useful thing. So I found myself typing ‘helm install thing, helm install other-thing, helm upgrade this-thing’ quite a lot, which became trivially fat-fingerable. As I discovered when a routine upgrade toasted a beta cluster.

I poked around at a couple of the available things which alleged they could make this sort of malarkey go away, and they were either complete workflows which assume their way is best (Ho ho ho. No.) or just seemed to be complicated YAML generators. I am really very much not interested in generating very similar piles of impenetrable YAML that I have to keep in version control. The impenetrable YAML should only ever exist on the target cluster and not really be the thing that is fiddled with by humans, unless they like stabbing themselves in the leg with a fork or something.

So I lashed up a thing that ran helm commands in the right order for me, made sure that the cluster I thought I was looking at was the actual target, did some elementary access config and made sure the storage classes I expected were where they should be. Then I could put far fewer things into the git repository that defined a given site/cluster and repeat myself as seldom as possible.

Here’s a really simple example that installs Prometheus, Grafana, a cert-manager and an Ingress. Obviously you’d not do this in an environment where you wanted Prometheus to work, since there’s nothing for it to monitor.

---

context: (your kubectl context goes here. D'you think I'd leave one of ours to look at?)

secrets:
  - dockerhub

logfile: notgrove.log
debug: false

charts:
  naunton:
    values: helm-values/nginx.yaml
    chart:  stable/nginx-ingress
  shuthonger:
    values: helm-values/kube-lego.yaml
    chart:  stable/kube-lego
  turkdean:
    values: helm-values/prometheus.yaml
    chart:  stable/prometheus
  queenhill:
    values: helm-values/grafana.yaml
    chart: stable/grafana

Because I come from Puppet-land, that config file gets called a manifest.yaml.

  • context is the name of the target cluster. The tool expects to have the right permissions to access same.
  • secrets are, er, a list of k8s secrets you may need to have configured before Helm will work.
  • logfile should be obvious.
  • debug ditto.
  • charts is a list of Helm charts: where to find each one and the local values file that overrides the chart’s defaults. (There’s a sketch of one of those values files just below.)
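
A minimal sketch of one of those values files (helm-values/nginx.yaml, say) might look something like the following. The key names are my recollection of the stable/nginx-ingress chart of roughly that era, so check them against the chart’s own values.yaml rather than taking my word for it:

---
# Illustrative overrides only; anything not set here falls back to the chart defaults.
controller:
  replicaCount: 2          # two ingress controllers rather than the default one
  service:
    type: LoadBalancer     # let the cloud hand us an external address
rbac:
  create: true             # the chart creates its own roles and bindings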

Those helm-values paths imply there’s some directory structure. Here it is:

drwxrwxr-x  5 jhr jhr 4096 Jul  9 13:22 .
drwxrwxr-x 22 jhr jhr 4096 Oct  2 10:39 ..
drwxrwxr-x  2 jhr jhr 4096 Jul  9 13:22 access
drwxrwxr-x  2 jhr jhr 4096 Jul  9 13:22 cluster-setup
drwxrwxr-x  2 jhr jhr 4096 Jul  9 13:22 helm-values
-rw-rw-r--  1 jhr jhr  467 Jul  9 13:22 manifest.yaml
-rw-rw-r--  1 jhr jhr 1514 Jul  9 13:22 README.md

  • access contains cluster config to do with roles and/or pod security. (The sketch just after this list gives the flavour.)
  • cluster-setup contains things like storage class config.
  • helm-values is (somewhat unsurprisingly) where the values.yaml files for the Helm charts live.
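
By way of illustration, the sort of thing that tends to end up in access on a cluster of that vintage is the RBAC plumbing that Helm 2’s tiller wanted before it would do anything useful: a service account plus a deliberately blunt cluster-admin binding. This is a sketch of the general shape, not a copy of ours:

---
# Service account for tiller to run as
apiVersion: v1
kind: ServiceAccount
metadata:
  name: tiller
  namespace: kube-system
---
# Blunt instrument: bind that account to cluster-admin
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: tiller
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
  - kind: ServiceAccount
    name: tiller
    namespace: kube-system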

As you might guess, the tool processes the contents of those directories in order, so you can be fairly sure that your storage classes will exist before something in a Helm chart tries to allocate a PersistentVolumeClaim that you’d prefer was SSD instead of default spinny rust.
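
For example, a storage class for SSD-backed volumes on GCloud is only a few lines of cluster-setup YAML along these lines (a sketch, assuming the GCE persistent-disk provisioner):

---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ssd
provisioner: kubernetes.io/gce-pd
parameters:
  type: pd-ssd   # rather than the default spinny rust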

And that implies there’s a tool with some options:

km [OPTION] ... (install|DELETE|upgrade)

-h, --help:
  show help

-f, --file [name]:
  filename of manifest. Usually manifest.yaml

-n, --namespace [namespace]:
  k8s namespace to create and use for this installation.

-j, --jfdi
  Don't stop and ask for confirmation that you're about to do something terrible.

install, upgrade or delete. Actions. Yes.

That’s quite enough typing for one evening. I should imagine that I’ll return to this to explain it better, show what it looks like running and point at the relevant github repos.

Technical Intermission

You’ll have noticed (or not, since with any luck I am typing into a void. Or Google. Which is more or less the same thing.) that content has been somewhat sparse.

As you might imagine, in the intervening time entire new technical edifices have been founded, matured and then fallen. Much like the careers of minor Britpop beat combos. If I cared I would go look for some evidence of the internet version of, say, Powder or, er, Menswear for a laugh. MongoDB? Who knows.

What I’m failing to get at is that I’d forgotten how to drive Octopress-on-github, had failed to commit and push the last live post (the one about DNS), Ruby had changed and no longer worked right because I’d forgotten how/why to drive rbenv, the way that Github Pages works had changed and it wasn’t obvious that ‘bundle exec rake deploy’ was even doing a damn thing.

I would like to present that as an example of an entirely rubbish workflow.

So obviously I tried the same things as before, because doing the same thing twice always works. Then I thought ‘This workflow is rubbish. I shall replace it with something I barely understand.’ because obviously.

What thing do I barely understand best? Go.

Are there Octopress->Hugo migration tools? Yes.

Do they work as expected? Patches welcome!

This is why people who don’t have ‘git push -u origin master’ burned into their muscle memory use Wordpress. If your competition is the functional equivalent of a swift rochambeau to the trouser area, this is a warning.

So anyway. This is a long-arse version of a ‘Test. Please ignore.’ message. Let’s see if sarcasm works better than upgrading rbenv.

More And/or Different Tumbleweed

… And then I left.

You know how they say ‘If you can’t say anything nice, then don’t say anything at all’?

Well then.

Stirrings From the Pit: DNS

So anyway. DNS. It seems that every time I opine on the FB about the state of our own DNS rig, someone or other will grouse about the rubbish state of the name services at their own place of work. If it’s not carefully hand-curated hosts tables to prop up and/or bodge around stuff that no-one can change, it’s having to keep a sheet of paper with columns of IP addresses that belong to the servers you’re expected to use. Which, oh look. It’s just rubbish and there’s no excuse.

Here are the things we do at Future.

  • The zonefiles live in a Gitlab instance. Other source-code repositories are available. As is the old zonefile for futurenet.co.uk, which has about a page of comments at the top: instructions not to fiddle, to always log what changes you made, really don’t fiddle, and if you break it you are in so much trouble. Stuff like that is rubbish. Don’t do it. The DNS is a bunch of text files which respond well to versioning. There’s no excuse not to. Even RCS is more than good enough if you don’t have git to hand, but keeping it on a different server does mean you’ve a backup if something unfortunate happens to your nameserver.

  • The zonefiles are pushed out to all the nameservers automatically, which makes it quite hard to have a zone mismatch. (It’s possible though. I shall explain below.) How you do that is best hacked up locally, because our rig (Git pulls triggered by a pubsub message bus) would be somewhat top-heavy for just this one job. Gitlab has multiple triggers. Use the sort of thing you like best.

  • The config is managed by Puppet. If you’re still managing server config by hand, please stop, have a mug of something warming and try to work out why you hate yourself so much.

  • Because Gitlab contains a CI rig that uses containers, we test the zonefiles on every commit by sparking up a container with a complete nameserver install inside and then making sure that the forward zones match the reverse ones, the zonefiles actually parse, the SOA records and NS records match and that the serial number on the zone hasn’t been fat-fingered to overflow its type. These are all things that can, will and have gone wrong for us, so having the machines rather than the customers do a spot of sanity-checking is likely a Good Thing. (There’s a sketch of the shape of that CI config at the end of this list.)

  • The code that configures the container to run NSD, build its config and sanity-check the zones is the production code that runs the production nameservers.

  • This means that pretty much anyone can check out the zones, patch them and submit a merge request, which lowers the load on the people who know DNS best (er, me) and allows the less confident to make their own changes, knowing that the machinery will flag up problems before they escape to production. See above about not fiddling. That’s a terrible way to run anything. No-one will learn a damn thing if they’re too scared to make mistakes. So all the work will be queued up for the High Priestesses, and that will breed resentment because oh god can’t you people do anything for yourselves here look it’s simple.

  • You will also have to be able to rebuild the config files automatically when you add or remove a domain. While DNS knows about secondary servers, there’s no in-band signalling to allow for that sort of thing. Our git repo contains a subdirectory of zonefiles, another containing a big list of domains, and a scripts subdir where all the testing bits live.

  • In our case, we have a pile of domains that are more or less The Same. So we have a generic zonefile that just contains some NS records and a set of A and AAAA records that point to a webserver that does db-based 301 redirects. That’s the sort of thing that happens when you experiment with Nginx, embedded Ruby and Redis. Still, it’s less worse than the previous versions. Unsurprisingly, the big YAML table of redirects is also held in Gitlab and runs up a container to test itself on commit. You can probably sense a theme here.

  • A thing we’re working towards is programmatic generation of the reverse zones from the forward ones, mostly prompted by the utter impossibility of working with ip6.arpa addresses if you’re even slightly dyslexic. Obviously the logical endpoint for such thinking is a return to using the H2N(-HP) script for generating zones from hosts tables. (HHOS)

  • There’s probably a better way of doing it.
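
To give a flavour of the CI bit mentioned above, the .gitlab-ci.yml needs to be nothing grander than this sort of shape. The image name and the script steps are made up for illustration; the real ones are whatever wraps your production nameserver build and checks:

stages:
  - test

check-zones:
  stage: test
  # hypothetical image with NSD and the test scripts baked in
  image: registry.example.com/nsd-test:latest
  script:
    # build the nameserver config from the domain list, then sanity-check the zones
    - scripts/build-nsd-config
    - scripts/check-zones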

Tumbleweed

… And then everybody left.

I could write a thing about burnout, but I was too fried to notice when it happened. The interesting/exciting/disturbing thing about being properly stressed is that it becomes entirely normal and you only realise something is broken when the music stops. Then all the things in your head that you’ve been ignoring pitch up at once and are like ‘HELLO’ and waving cartoon cricket-bats with nails and broken glass embedded.

Re-inventing the Wheel - Square and on Fire

I am incompetent and I can’t make Vagrant work.

At least, that’s the excuse I’ve been making for not joining in with the rest of the floor in using that for Puppet (among other things) development. Instead, my dev-rig is a VM running as a puppetmaster that’s tracking the changes I make to a given branch in our Git repo via the magic of post-commit hooks and another VM in which I can run ‘puppet agent --debug --blah --server (first VM)’. Once in a while I remember to blow away the second VM so I can make sure everything builds in the right order. However, even with snapshots it’s just slightly too painful to happen regularly.

Meanwhile, quite a lot of the recent developments at Future have involved rigs of between eight and twelve boxes. Generating a worthwhile test/dev version of one of those is rather tiresome because even if you’ve got the spare horsepower lying about, you have to spend yea-long wiring it all together, sanitising the static and/or test data and it all quickly gets completely oh god why did i even get out of bed i should have been a farmer like my dad mind you if i had done that i’d have been out of bed at six to go feed the sheep on long barrow bank perhaps not after all…

So when a mildly broken Dell R620 arrived back from one of the DCs coincidentally with me wanting to have a play with this Docker business, it all seemed a bit convenient.

I am incompetent and I can’t make Docker work. On Debian.

LXC, on the other hand, was slightly simpler than falling off a log.

Given that it’s simple to build a puppetmaster that’s the same as one of the live ones, and that all the machine config I currently care about is in a manifest, it should be pretty easy to generate a container and have it puppet itself up tolerably quickly.

This indeed turned out to be the case. However, having to hand-allocate IP addresses and fiddle about with container naming such that they picked up the configs in use on the live rig was all a bit too hands-on and really not what I wanted.

DNSMasq fixed the first problem. It is a surprisingly useful tool.

A rakefile which read a list of made-up machine names, generated softlinks to the actual hiera node configs and then instantiated the relevant containers fixed the second.

I also spent quite some time building a Wheezy ‘image’ that minimised apt-get as much as possible.

Result - fully puppeted containers come up in circa a minute. Somewhat longer if you have to install PHP. If I didn’t have quite such a rational hatred of golden images and all who sail in them, it would likely be faster still.

The next part is a bit fiddly.

The example problem I now have is that some parts of my collection of yea-many VMs want to connect to other parts. For instance if I have a redis slave, I need to know what the master’s IP address might be during the puppet run. At Future, we generate a location fact and use that in our hiera, er, hierarchy to configure things like message brokers, smarthosts and DNS ordering. I could just add yet another location - testbox, or something - allocate a block of IP space and then add some extra indirection. And then I could do that again for each person and/or project that wanted to run up a test-rig. At which point one has just run into a behaviour pattern that should probably be named ‘It’s OK, I can fix that for you.’

I first came across this in, er, 1991 when doing some NHS-related coding. One of the other chaps had written a thing which had to deal with, oh I don’t know, ten items or something. Because he was a forward-thinking sort, he allocated sixteen slots in his array and beetled off all smug for a coffee and a corned beef sandwich. As you might expect, a few months later one site or other had a list of seventeen items and a bug report. ‘It’s OK, I can fix that for you!’ went our chap and expanded the array to the clearly ludicrous value of some twenty-three slots…

There’s scope for an Eric Berne knockoff book of tiresome technical behaviour antipatterns, isn’t there?

Anyway, I’m using DHCP, and I wanted the entire edifice to work with little or no extra typing.

CoreOS’s etcd looked like a good fit. Emit salient facts to the etcd database when bringing up (say) the redis master, then query same via Garethr’s hiera-etcd when bringing up the slave. Profit!
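
The moving parts on the hiera side are small. Here is a rough sketch of the hiera.yaml, assuming hiera 1.x and Garethr’s hiera-etcd; the backend keys are from memory, so check them against the hiera-etcd README before believing me. The location entry in the hierarchy is the location fact mentioned above:

---
:backends:
  - etcd    # facts emitted by other containers end up here
  - yaml    # everything else stays in the usual hieradata
:etcd:
  :host: 127.0.0.1   # check the hiera-etcd README for the real key names
  :port: 4001
:yaml:
  :datadir: /etc/puppet/hieradata
:hierarchy:
  - "%{::location}"  # the custom location fact
  - common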

That bit did take a little tinkering to get right.

It seems to me that the notion of a reactive puppet configuration is really rather interesting. Other people may well be screaming in terror and jabbering about things like ‘deadly embrace’ and ‘terrible feedback loops are fine for the Jesus and Mary Chain (Or A Place to Bury Strangers if that’s too retro for you) but have no place in a theoretically stable configuration.’ However, just as a top-down decision process enforced by rigid hierarchy is a hateful idea for a workplace environment, so it is for a machine environment.

TL;DR - code in Github, patches welcome.

Treating People Like Dicks (Distance Learning Edition)

Today one of the old Solaris boxes expired. Well, I say ‘box’ and ‘expired’. I mean ‘1U Solaris-X86-what-were-we-thinking machine’ and ‘fell into maintenance mode while I was eating breakfast’. And, in a truly extraordinary amount of digression and rambling, when I say ‘what were we thinking’ I probably mean ‘The kit had actually managed to serve MySQL tolerably reliably for some 1500 days.’

I don’t know if you lot remember the uptime wars, but they were medium sized in the late nineties. Rather like Sleeper or Menswear, but with fewer annoying tunes and rather more waiting around. We learned better as soon as someone equated long uptimes with being an obvious target for some bollix with a copy of Metasploit.

Anyway. A machine that hadn’t been restarted since it was shoved in a rack, that was host to a pile of Solaris zones. What could possibly have gone wrong with that?

It transpired that one set of binary logs or another had experienced a Jolly Interesting Time and had managed to confuse the zpool enough that the alleged hypervisor had thrown a strop and gone into maintenance mode. Which, um, okay…

Thankfully there are no beard-fondling Solaris types around to tell me that the next move was a Bad Idea, but mucking out the disks, clearing maintenance mode and restarting the beast looked to be the least-worst option.

That is until we discover that the running network config had never been written back to various bits of /etc and indeed there were no build notes or valid excuses on either the deceased wiki or the somewhat shiny new one.

There now followed a swearing competition.

I suspect that what happened in 2009 was pretty much like what happened this morning. After the eighth or twelfth reboot, the people wanting the databases back won over ‘I would like to make this network config survive a reboot surely the combined wit of the Sun/Oracle doc and three dozen assorted blogs and HOWTOs can’t all be missing the vital something or other that we can’t spot either…’

It’s still an unpleasant trick, though.

Short Commercial Break

Trigger warning: contains talk of horrible old Unix kit running horrible old Unix

There’s a good chance I’m probably a massive arsehole and I get paid for it. Which, I dunno, maybe I’m supposed to be pleased with myself about it because being disruptive is seen as a good thing these days. The last time I came across people being described as such, they were the hyperactive (or just sugared-up) kids at junior school who seemed to be convinced that it was all about them and if it wasn’t they’d throw a massive strop and wander round with a lower lip hung out like a soup-plate. I’m assuming a disruptive technology doesn’t have a howling fit in the middle of the organic vegetable section of the supermarket if it doesn’t get its own way, but then I wouldn’t be surprised if it did.

The thing is, the arseholedom isn’t malevolent in the slightest; it’s more a case of going up to someone and asking them why they’re nailing their legs to the table. You get this very weird selection of looks when you do things like that. As if they’re expecting you to come up with something sarcastic about using the contact adhesive on the shelf. Then they’ll say something like ‘Well we’ve always nailed our legs to the table in this department because it keeps the bees from flying to Winchcombe.’ Which, um, okay…

I mean, there’s no answer to that. Especially when some manager piles out of the end office going “D’you want the bees to go to Winchcombe? Do you? Because that’s what’s going to happen if you don’t buck your ideas up and crack on with that hammering.”

But you have to try. You point to one of the chairs in the corner and suggest that using those would be much less unpleasant. That’s when the trouble really starts. The manager goes pop-eyed and kicks off about ‘You smart buggers in IT think you know everything coming down here with your ideas I don’t have time for ideas there’s barely enough time to send Bob here down to the hardware shop for more nails, what with the bleeding and the Tetanus jabs and now you want us to cross-train to chairs I’m glad you think we all sit around with glue-guns like you wasters someone should sort you lot out once and for all.’

So you pull a chair over and they look at you like you just shat out a railway station.

What this is really about is that years ago (HP-UX 10.20 ago, in fact) I was given an HP9000 to look after. In poking around the filesystem to see what dreadful sort of albatross I’d been handed, I found a whole pile of cron-jobs that ran scripts to monitor sendmail and some more scripts that re-started sendmail and further scripts that tested the state of earlier scripts. It all seemed a bit pointless because even then sendmail could more or less be left alone to generate remote root exploits and sometimes deliver mail.

I asked one of the longer-serving chaps and he came over all leg-nailing. Apparently it wasn’t to be touched because sendmail was dreadfully unreliable and crashed every half hour.

I nodded, smiled and went off to throw away all the junk and upgrade the sendmail install to $latest.

It didn’t crash.

The point being that writing long-lived daemon processes is really very well known science and instead of mucking about with multiple layers of monitoring and backup, you’re much better off making the daemon work right.

There Are Two Hard Problems in Computer Science: Caching

Title stolen from one of the myriad on the internet more cleverer and witty than wot I am. However, Octopress seems to have added its own twist, so you’ll have to do the rest of the ‘joke’ yourselves…

Years ago, not long after a visit to the Anarchist Bookshop and having become mildly peeved with the names of computers at Previous Employ (The failover pair named after the (in)appropriate Southpark characters, the ones that were funny if you were twelve… Mind, we were all twelve; that was part of the fun. Mind also that our American management decided to call us all ‘spanners’ because of I don’t know what made up terrible morale-boosting exercise. Tip for the MBAs out there - if your entire English team has a fit of the giggles in an ‘all hands’, you have just said something hysterically inappropriate and they are not going to let on until you have the t-shirts printed), I started naming kit I built after anarchists. I think I got as far as kropotkin and bakunin before the option of voluntary redundancy came up and I followed my political convictions and ran pell-mell towards the £MONEY.

The Americans had something of a sense of humour failure (or actually maybe they didn’t in retrospect) and started naming machines nasdaq, bourse et al.

Last year, self & Sam(oth) started calling the notion of Devops ‘anarcho-syndicalism in action’.

Actually, I think he found the reference elsewhere, but it totally struck home because a lot of the alleged problems that the modern middle-class white male technocratic elite have to put up with (the only decent latte is halfway across town, nowhere to dry yr bike kit in the office) are best approached with an eye to Solidarity (with other teams. Don’t let ‘managers’ or ‘stakeholders’ play at divide and conquer), Direct Action (fix those problems yourselves. You know your environment best. ‘Management’ ‘control’ is bollocks) and Workers’ Self-Management (do not replicate process with code. Optimise it out. Build the environment in which you wish to work. No-one will do it for you.)

And, obviously, this is a debased and pitiful version of a full-on political movement. Which is generally home to misogynistic rape-apologist dickheads it seems. (Who act like the polis at the first sign of trouble because that’s the only model of dissent-management they have. There is a policeman inside all of our heads; it must be stopped.)

You may imagine my lack of surprise at discovering a tool called ‘Serf’ which lives at ‘serfdom.io’.

Again, that could well be irony so sufficiently advanced that it is indistinguishable from reality. However, such Hayek-followers as I have come across didn’t hold with that sort of malarkey.

I guess this sort of thinking fits in well with the sort of sods who talk about being ‘disruptive’ but actually just want other people to provide free services for which they can charge rent.

There’s probably another ‘talk’ in this, but I think it’s the sort of thing better done by the likes of Shanley Kane.

Giving Stuff Away on the Internet Is Probably a Good Thing.

For reasons that will become sadly apparent when these posts are read in the wrong order, I’ve been engaged in the job of interviewing people who’ve expressed interest in the notion of coming to work for Future. At least one of those people was keen to point out that they’d been looking at our code on Github and wanted to come along and play with it.

Which was nice.