Life is Too Short for Jenkins
About 9 months ago, I requested a transfer to the team working on the company’s CI tooling. In my judgment, CI was a serious productivity blocker for the whole organization, and I hoped I would be able to help improve it and make a broad, positive impact.
At the time, CI ran on Jenkins 1, which had three main problems:
- Everybody’s CI pipeline was described in text boxes in the Jenkins UI, which meant pipelines weren’t version controlled or discoverable, and editing/testing new configurations was a painful experience.
- The web interface was dated and unpleasant to use.
- Developers had little control over the environment in which their jobs ran, because the VMs running as Jenkins nodes were centrally managed.
My team considered two options.
Option 1: Switch tools
The head of SRE championed GitLab CI. I resisted this idea because I, the relatively inexperienced manager of a nascent team, was daunted by the prospect of trying to supplant Jenkins, GitHub, and JIRA all at once.
On a previous team I had used Concourse CI to some extent, but I wasn’t really blown away by the experience. Travis and Circle were mentioned. I was a fool. I should have committed to seriously researching some of the contenders and making a more informed decision, but I lacked the willpower and the discernment.
Option 2: Upgrade to Jenkins 2
On the face of it, Jenkins 2 seemed to meet all our needs. It:
- Supports defining your CI job as a “declarative pipeline” that can live as a Jenkinsfile in the root of your repository. Hooray, configuration as code!
- Boasts a UX facelift called “Blue Ocean” that looks more modern.
- Allows pipelines to request to be run on a Docker “agent”, which lets application developers control the environment their job runs in by specifying a Docker image or Dockerfile. (A minimal sketch of such a pipeline is below.)
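For orientation, here is roughly what a declarative pipeline with a Docker agent looks like. This is a minimal sketch, not one of our actual pipelines; the image name and commands are invented for illustration.

pipeline {
    agent {
        // Run every stage of this job inside a container built from this (hypothetical) image.
        docker { image 'golang:1.12' }
    }
    stages {
        stage('Test') {
            steps {
                // Shell commands are wrapped in the 'sh' step.
                sh 'go test ./...'
            }
        }
    }
}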
A Taxonomy of Mistakes
The worst mistakes come in two distinct flavors: catastrophic and insidious.
A catastrophic mistake is like triggering an outage, or deleting production data. The moment you realize what you’ve done is the worst single moment of your career. Your heart pounds in your chest. Is this a nightmare? Maybe in a moment, you’ll wake up? No, it’s real. Hopefully, you’ve got a healthy culture at work, and you desperately describe the situation to your teammates, who rally to your side. Somebody with a cool head thinks of some way to make the best of things, and somehow – maybe that night, maybe the next day – you make it through. Things return to normal. You write a postmortem, count your losses, and go back to work – a little less innocent, and a little wiser.
An insidious mistake, by contrast, doesn’t reveal itself in a moment. It makes you suffer a little bit here, and a little bit there, until one day you wake up and realize that there’s a gaping hole where your humanity used to be. You’re a miserable husk of a person, with cruelty on your lips and bile in your heart. You still greet your colleagues with that jolly smile of yours – but the sweetness in your smile is the saccharine of cynicism, not the honeyed optimism it was in the days before, when life was cheerful and your burden was light. The light in your eyes used to be the hope for a better tomorrow. Now it’s the glint of madness.
What’s wrong with Jenkins
Choosing Jenkins was the insidious kind of mistake. Warning – I’m going to rant for many, many paragraphs. My advice is to skim.
The worst thing about Jenkins is that it works. It can meet your needs. With a liiittle more effort, or by adopting sliiiightly lower standards, or with a liiiiittle more tolerance for pain, you can always get Jenkins to do aaaaalmost what you want it to. But let’s talk specifics. Jenkins features:
A high level of indirection between you and the execution of your code
For me, the bulk of the actual work of a CI pipeline takes the form of shell commands. In a Jenkins pipeline, there’s an ‘sh’ “step” that executes shell commands.
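For example, an inline step might look something like this (a sketch; the commands themselves are invented for illustration):

sh """
  # Hypothetical commands, just to show the shape of an inline 'sh' step.
  if [ -d build ]; then
    rm -rf build
  fi
  make test
"""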
So instead of writing Bash directly, you’re writing Bash inside Groovy. But:
- Your editor won’t syntax highlight the Bash inside Groovy.
- You can’t run shellcheck (or any sort of linter) on the Bash inside the Groovy.
- You can’t very easily execute your shell commands to test them.
There are two ways to try and address this:
- Write your shell in a separate Bash file that you execute from Groovy, and avoid putting it inline in your pipeline.
- Try to avoid writing shell at all – instead, implement everything as Groovy methods.
I think #1 is actually the better approach. We started out there. The trouble was, we started wanting to abstract our pipeline steps and turn them into “shared libraries” and so we gravitated toward #2, so that we could share steps easily across pipelines.
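To make the contrast concrete, here is a sketch of the two approaches. The file names, step name, and commands are all hypothetical.

// Approach 1: keep the real logic in a plain shell script and just invoke it.
sh './ci/deploy.sh staging'

// Approach 2: a shared-library step, e.g. vars/deployApp.groovy,
// which pipelines then invoke as deployApp('staging').
def call(String environment) {
    sh "kubectl --context ${environment} apply -f manifests/"
}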
The trouble is: Groovy is a much, much worse language for executing commands than Bash. Bash is interpreted, has a REPL that is great for experimentation, doesn’t require a ton of imports, and has lightweight syntax. Groovy has none of these things. The way that developers test their Groovy steps is by triggering a job on the remote Jenkins server to run them. The feedback loop is 2 orders of magnitude slower than it is for just executing Bash locally.
Are there ways to execute the Groovy steps locally? The way you’re supposed to do it is with JenkinsPipelineUnit, which is an excellent idea – it lets you write unit tests against your Jenkins pipeline, and gives you an interface for mocking various Jenkins things. But there are two problems:
- As noted in the README, Groovy doesn’t run the same way on Jenkins as it does in your unit test, because the Groovy DSL is “serialized” by Jenkins before running.
- “Declarative” pipelines are not supported – a huge problem for us, since that’s how we’ve implemented everything, because it seemed to be the newest and most modern thing to be doing.
So basically, that’s a huge bust. Especially since we weren’t a Java shop. My team was barely able to sort of piece this together because it’s our job to work on the CI system, but there’s absolutely no way that any of the PHP/JavaScript/Golang/Python application developers who want to write pipelines will be able to download Gradle, figure out they need to run gradle init, install the pipeline unit testing library, and figure out the right way to initialize the “PipelineTestHelper”.
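For reference, here is roughly the shape of a JenkinsPipelineUnit test against a shared-library step. This is a sketch following the library’s documented pattern; the step under test (vars/deployApp.groovy) and its argument are hypothetical.

import com.lesfurets.jenkins.unit.BasePipelineTest
import org.junit.Before
import org.junit.Test

class DeployAppTest extends BasePipelineTest {

    @Before
    void setUp() {
        super.setUp()
        // Stub out the 'sh' step so nothing actually executes locally.
        helper.registerAllowedMethod('sh', [String], { cmd -> println "stubbed sh: ${cmd}" })
    }

    @Test
    void runsWithoutErrors() {
        // Load the hypothetical shared-library step and exercise it.
        def step = loadScript('vars/deployApp.groovy')
        step.call('staging')
        printCallStack()
    }
}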
So we’re basically resigned to the workflow of running shell commands defined in methods used by a DSL embedded in Groovy, transmitted to the CI master node, serialized and handed to a CI worker node, and executed there.
There’s a “replay script” feature that lets you edit your pipeline right in the web interface, which helps cut down on the feedback time a little bit if you don’t care about version controlling your changes or being able to use your own editor/tools. I personally am not willing to make that sacrifice.
TL;DR, the feedback loop sucks. You’ll never be able to effectively test any of the code running in your pipeline. Your best bet is to write it all entirely in Bash, and build your own mechanism for testing it and sharing functionality. The ability to write Groovy shared libraries is a trap and leads only to misery.
A low level of discoverability
A lot of the functionality that Jenkins has in the web UI – especially the functionality that comes from plugins – is also possible to define in pipelines, but the way to do this isn’t well-documented. For example, there’s this plugin that lets you “throttle” a job so that multiple jobs don’t fire at once, which you can configure inside the UI. After probably half a day of Googling and trial and error, and thanks to a stroke of luck, I figured out that I could accomplish what I wanted by putting the following in my Jenkinsfile:
properties([[
$class: 'ThrottleJobProperty',
maxConcurrentTotal: 5,
throttleEnabled: true,
throttleOption: 'project'
]])
Maybe if I were a Java/Groovy expert I could have read the source code for the plugin and determined this was possible. But I shouldn’t have to be. And the application developers trying to implement their own pipelines for their code definitely shouldn’t have to be.
There are two tricks I’ve developed to help the discovery of these magic incantations. Trick 1 is the “Snippet Generator”, which is basically a drop down box in the Jenkins UI with a pretty comprehensive list of options to explore and can help you find what you need maybe 15% of the time. Even if you can’t produce something usable, the snippet generator can give you an idea what to Google for.
Trick 2 I’ve had much more success with. Use the snippet generator or Google things just enough to find a function name or keyword relevant to whatever you’re trying to do. Then, go to github.com/search and put filename:Jenkinsfile <keyword>. You’ll probably find something you can copy and paste. It’s worked 90% of the time, for me.
Really, this experience sucks. I don’t do any other sort of engineering like this, because sane systems have better documentation, more obvious abstractions, and better interactivity. I feel like a script kiddie, blindly typing in incantations to make magic happen through trial and error. Hugely demoralizing.
Blue Ocean is Incomplete and Unintuitive
Blue Ocean looks more modern than the classic Jenkins UI – I’ll give it that. Unfortunately, it’s missing functionality, so you’ll have to use and become familiar with the classic Jenkins UI anyway. It’s also just not a pleasant UI to use! I’m not much of a design person or a front-end developer, so I can’t articulate precisely what it is that makes the interface unpleasant, but it always seems to take several clicks in places I don’t expect in order to do what I’m trying to do – usually, I just want to run the build, or see the output of the build.
Docker
It is possible to have your builds run inside docker containers. Jenkins 2 does let the job author specify a docker image, or dockerfile – even kubernetes configurations for autoscaling! So, in principle, the problem of letting job authors own their job’s environment is solved.
The only problem is that this problem is solved by incorporating the idea of a “Jenkins worker” INTO the idea of a Docker container. These two ideas don’t always play well together. For example, one thing I kind of expected/hoped for was that defining a Jenkinsfile to use a Dockerfile, and then giving it a build step like the following (a sketch; the commands here are invented for illustration)
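pipeline {
    // Hypothetical sketch: build the image from the repo's Dockerfile and run the job inside it.
    agent { dockerfile true }
    stages {
        stage('Build') {
            steps {
                sh 'make build'   // made-up build command
            }
        }
    }
}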
would be approximately the same thing as
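# Equivalent-looking commands outside Jenkins (image tag and command are made up).
docker build -t myjob .
docker run myjob make build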
But it differs in one very significant way. With docker run, your cwd is whatever the Dockerfile defined. In a Jenkins job, the cwd is the Jenkins workspace – which is bind mounted in from the host node. Basically, Jenkins tries to turn your docker container into a regular old Jenkins worker. This makes a degree of sense, but has a number of inconveniences.
- You probably can’t be root inside your docker container. If your build produces any sort of persistent artifact in the workspace, that artifact will be owned by root and will end up on the filesystem of the host. Jenkins on the host doesn’t run as root, so it doesn’t have permissions to wipe the workspace when it needs to, and you’ll get janky permissions errors.
So what we ended up doing is creating a user inside the dockerfile with the same UID as the user that Jenkins runs as. Passed through via a build arg. This is not something I’d really mind doing once – but you have to do this trick for every single job that is defined. So it’s not just something we could solve for everybody on the CI team. Every application developer who wanted to define their own job ran up against this problem. And it’s a confusing problem – it took me days to really make sense of what Jenkins was trying to do. We documented it internally about as well as we could, but still we ended up guiding probably at least a dozen application developers through this particular confusion.
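A sketch of that trick, assuming a base image where useradd is available; the user name, the default UID, and the use of the declarative dockerfile agent’s additionalBuildArgs option are illustrative, not our exact setup.

# In the job's Dockerfile: create a user whose UID matches the Jenkins user on the host.
ARG JENKINS_UID=1000
RUN useradd --create-home --uid ${JENKINS_UID} ci
USER ci

and in the Jenkinsfile, pass the host UID through as a build arg:

agent {
    dockerfile {
        additionalBuildArgs '--build-arg JENKINS_UID=1000'
    }
}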
- You’re probably going to have to define a docker image just for the build.
One of the mostly-false promises of Docker, as it was sold to me by the true believers who introduced me to it, was that, if you do it right, you can run the same docker image, and therefore have basically the same environment, in production, in CI, and on your local development machine. I’ve never actually seen this happen, but I can tell you right now – you’re going to have to define a special docker image just for Jenkins, because of how strangely it interacts with the world of containers.
Lest this turn into a rant against Docker – a tool I am also seriously disappointed with – I’ll end here. Long story short, we used Jenkins 2. It kind of solved our problems. So now our problems are kind of solved, which is the worst kind of solved.
Postlude
It’s a month after I started writing this post. Now I work at a different, bigger company. I no longer work on CI. What’s more, one of the principles I had never even thought to question at my old company – “everybody should be writing and maintaining their own CI jobs” – is just not at play here. There’s a team that seems almost completely to own CI and all CI jobs. I’m in week 4, and I know Jenkins is there, somewhere, lurking behind the scenes. But I have never interacted with it, and it seems like there are a lot of smart people working so that I never, ever need to. What a strange new world this is.