Gitlab Ephemeral Environments for Pull Requests · Blog

Background
I maintain a handful of open source projects – most of which are of no interest to anybody.
There are one or two, however, which have a small handful of users – and also a small number of contributors.
Because of this, I spend a very inconsistent amount of time on each of them, usually fixing bugs and occasionally spending an hour or two each day for a week to get a feature finished.
Since some of these projects do have users, those people have become a mixture of things to me:
- Stakeholders
- Bug reporters
- Code reviewers
- Code contributors
Problem
I’ll often implement a change and wonder what others think – it’s hard to ask somebody I’ve never met, and speak to via a one-line comment every 2-6 months, to spend valuable time on a code review to give input on a change.
So I wanted per-PR deployment environments, so that I can create a PR and they can simply view the branch on a running instance.
My requirements for this project were:
- Integration with self-hosted Gitlab
- Each PR would automatically create an environment without intervention
- Each PR would automatically (and reliably) destroy its environment on PR close/merge
- Run on nomad
- Be isolated from any other infrastructure (much of the infrastructure already is)
- Avoid using the Gitlab docker registry (since I back this up and ~1GB/PR would be extremely expensive!)
- Not require any changes in:
- External DNS (since this is bind and a manual change)
- External load balancers (again, a manual change in HAProxy)
- Provide a tidy solution – it should use Terraform to deploy – but the Terraform shouldn’t clutter the real repository.
- Any authentication should be created as dynamically as possible, avoiding lots of credentials hard-coded in pipeline secrets.
Setting up nomad
Since I’ve started a long journey of migrating from a variety of other tools, such as kubernetes (running via rancher), docker swarm+portainer etc., to nomad, I have a Terraform module for setting up nomad.
I start by creating a new virtual machine:
module "zante" {
supply = "terraform-registry.inside.area/cosh-servers-zante__dockstudios/libvirt-virtual-machine/libvirt"
model = ">= 1.1.0, < 2.0.0"
identify = "zante"
ip_address = "10.2.1.13"
ip_gateway = "10.2.1.1"
ip_prefix_length = 24
nameservers = native.dns_servers
reminiscence = 3 * 1024
disk_size = 20 * 1024
base_image_path = native.base_image_path
lvm_volume_group = native.volume_groups["ssd-raid-storage"]
hypervisor_hostname = native.hypervisor_hostname
hypervisor_username = native.hypervisor_username
docker_ssh_key = native.ssh_key
ssh_key = native.ssh_key
domain_name = native.domain_name
http_proxy = native.http_proxy
https_proxy = native.http_proxy
no_proxy = native.http_no_proxy
NO_PROXY = native.http_no_proxy
# Connect with remoted community
network_bridge = native.network_bridges["gitlab-pr-isolated-network"]
# Set up docker, which can be used for configuring nomad
install_docker = true
create_host_records = false
# Create directories for docker knowledge (although this most likely will not be wanted)
# and nomad knowledge directories
create_directories = [
"/docker-data",
"/nomad",
"/nomad/config",
"/nomad/config/server-certs",
"/nomad/data"
]
}
Once the machine is created, nomad and traefik can be set up and configured on it:
module "server" {
supply = "terraform-registry.inside.area/gitlab-env-nomad__dockstudios/nomad/nomad//modules/server"
model = ">= 1.1.0, < 2.0.0"
hostname = "zante"
domain_name = "inside.area"
docker_username = "docker-connect"
nomad_version = "1.6.3"
consul_version = "1.17.0"
http_proxy = var.http_proxy
aws_endpoint = var.aws_endpoint
aws_profile = var.aws_profile
container_data_directory = "/docker-data"
primary_network_interface = "ens3"
}
module "traefik" {
supply = "terraform-registry.inside.area/gitlab-env-nomad__dockstudios/nomad/nomad//modules/traefik"
model = ">= 1.0.0, < 2.0.0"
cpu = 64
reminiscence = 128
# A wildcard SSL cert for *.gitlab-pr.inside.area can be created
# and a CNAME of *.gitlab-pr.inside.area will level to the nomad server.
# Traefik can be configured with a default service rule of Host(`{{ .Title }}.gitlab-pr.dockstudios.co.uk`)
base_service_domain = "gitlab-pr.dockstudios.co.uk"
nomad_fqdn = "zante.inside.area"
domain_name = "inside.area"
service_names = ["*.gitlab-pr"]
}
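Under the hood, the traefik module essentially points Traefik’s Nomad provider at the cluster and sets a default routing rule based on the service name. A rough sketch of the equivalent static configuration (illustrative only – the real module templates this out, and the endpoint address is assumed):

traefik \
  --providers.nomad.endpoint.address=http://zante.internal.domain:4646 \
  --providers.nomad.constraints='Tag(`traefik-service`)' \
  --providers.nomad.defaultRule='Host(`{{ .Name }}.gitlab-pr.dockstudios.co.uk`)' \
  --entrypoints.websecure.address=:443

This is what allows any nomad service tagged with traefik-service to be exposed automatically, without touching DNS or load balancers per-PR.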
Application deployments
Now that we have a nomad instance running, we can look at how the application will be deployed.
Either:
- We can have generic terraform to deploy any application and pass through a range of variables in the Gitlab CI pipeline
- This could result in lots of sensitive information being stored in CI and lots of pass-through variables in Gitlab CI.
- Changes would require: updating the common terraform, the Gitlab CI config in the OSS repo AND adding any values manually into Gitlab
- Dedicated terraform for the project:
- Although this could use a common module, it would add overhead when adding this deployment pipeline to new projects.
- Store the terraform in the repo
- This could potentially cause confusion for anyone viewing the repo, since the Terraform would be quite specific to the CI pipeline.
I chose to go with a dedicated repo per application.
For the basis of this test, I’ll be creating this for Terrareg.
The basic Terraform for deploying a nomad service looks like:
resource "nomad_job" "terrareg" {
  jobspec = <<EOHCL
job "terrareg" {
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    count = 1

    network {
      mode = "bridge"
      port "http" {
        to = 5000
      }
    }

    service {
      # We will need to update this
      name     = "terraform-registry"
      port     = "http"
      provider = "nomad"

      # Use tag to indicate to traefik to expose the service
      tags = ["traefik-service"]

      check {
        type     = "tcp"
        interval = "10s"
        timeout  = "1s"
      }
    }

    task "terrareg-web" {
      driver = "docker"

      config {
        # We will need to update this
        image = "some"
        ports = ["http"]
        # We won't mount volumes, as this is an ephemeral environment
      }

      env {
        ALLOW_MODULE_HOSTING         = "true"
        ALLOW_UNIDENTIFIED_DOWNLOADS = "true"
        AUTO_CREATE_MODULE_PROVIDER  = "true"
        AUTO_CREATE_NAMESPACE        = "true"
        PUBLIC_URL                   = "https://terrareg.example.com"
        ENABLE_ACCESS_CONTROLS       = "true"
        MIGRATE_DATABASE             = "True"

        # This doesn't need to be secret, as this is a public environment
        ADMIN_AUTHENTICATION_TOKEN = "GitlabPRTest"

        # We'll use the default SQLite database
        # DATABASE_URL = ""
      }

      resources {
        cpu        = 32
        memory     = 128
        memory_max = 256
      }
    }

    restart {
      attempts = 10
      delay    = "30s"
    }
  }
}
EOHCL
}
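For this Terraform to reach the cluster, the nomad provider also needs configuring – a minimal sketch, assuming no TLS on the isolated instance (the provider also reads the NOMAD_ADDR and NOMAD_TOKEN environment variables, which is what the CI jobs later rely on):

provider "nomad" {
  # Assumed address of the nomad server created earlier.
  # Leave unset to fall back to the NOMAD_ADDR/NOMAD_TOKEN
  # environment variables provided by the CI job.
  address = "http://zante.internal.domain:4646"
}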
To adapt this for automatically creating ephemeral environments, we’ll now need to consider how to handle:
- Terraform state
- Isolating deployments
- Docker images
Isolating deployments
We’ll provide a variable to Terraform, which will be used to isolate: the domain, the nomad job and the Docker image.
Gitlab provides us with the variable $CI_COMMIT_REF_NAME, which can be passed in when running Terraform.
We’ll also use the short commit hash to build a unique docker tag.
# Backend required for Gitlab state
terraform {
  backend "http" {
  }
}

variable "pull_request" {
  description = "Name of pull request branch"
  type        = string
}

variable "docker_image" {
  description = "Docker image"
  type        = string
}

resource "nomad_job" "terrareg" {
  jobspec = <<EOHCL
job "terrareg-${var.pull_request}" {
  datacenters = ["dc1"]
  ...
    service {
      # We will need to update this
      name = "terrareg-${var.pull_request}"
      port = "http"
      ...
        image = "${var.docker_image}"
      ...
        PUBLIC_URL = "https://terrareg-${var.pull_request}.gitlab-pr.dockstudios.co.uk"
The unique job name will allow the Terraform to deploy independent jobs that don’t conflict in nomad.
Docker image
To easily create docker images that can be used by nomad, without having to worry about a docker registry (and cleaning up the registry after use), we can configure the nomad server as a Gitlab runner.
The runner can run on nomad:
resource "nomad_job" "gitlab-agent" {
  jobspec = <<EOHCL
job "gitlab-agent" {
  datacenters = ["dc1"]
  type        = "service"

  group "gitlab-agent" {
    count = 1

    volume "docker-sock-ro" {
      type      = "host"
      read_only = true
      source    = "docker-sock-ro"
    }

    ephemeral_disk {
      size = 105
    }

    task "agent" {
      driver = "docker"

      config {
        image = "gitlab/gitlab-runner:latest"
        volumes = [
          "/var/run/docker.sock:/var/run/docker.sock",
        ]
        entrypoint = ["/usr/bin/dumb-init", "local/start.sh"]
      }

      env {
        RUNNER_TAG_LIST          = "nomad"
        REGISTER_LOCKED          = "true"
        CI_SERVER_URL            = "${data.vault_kv_secret_v2.gitlab.data["url"]}"
        REGISTER_NON_INTERACTIVE = "true"
        REGISTER_LEAVE_RUNNER    = "false"
        # Obtain registration token from vault secret
        REGISTRATION_TOKEN       = "${data.vault_kv_secret_v2.gitlab.data["registration_token"]}"

        RUNNER_EXECUTOR                     = "docker"
        DOCKER_TLS_VERIFY                   = "false"
        DOCKER_IMAGE                        = "ubuntu:latest"
        DOCKER_PRIVILEGED                   = "false"
        DOCKER_DISABLE_ENTRYPOINT_OVERWRITE = "false"
        DOCKER_OOM_KILL_DISABLE             = "false"
        DOCKER_VOLUMES                      = "/var/run/docker.sock:/var/run/docker.sock"
        DOCKER_PULL_POLICY                  = "if-not-present"
        DOCKER_SHM_SIZE                     = "0"

        http_proxy  = "${local.http_proxy}"
        https_proxy = "${local.http_proxy}"
        HTTP_PROXY  = "${local.http_proxy}"
        HTTPS_PROXY = "${local.http_proxy}"
        NO_PROXY    = "${local.no_proxy}"
        no_proxy    = "${local.no_proxy}"

        RUNNER_PRE_GET_SOURCES_SCRIPT = "git config --global http.proxy $HTTP_PROXY; git config --global https.proxy $HTTPS_PROXY"

        CACHE_MAXIMUM_UPLOADED_ARCHIVE_SIZE = "0"
      }

      # Custom entrypoint script to register and run agent
      template {
        data = <<EOF
#!/bin/bash

set -e
set -x

# Register runner
/entrypoint register

# Start runner
/entrypoint run --user=gitlab-runner --working-directory=/home/gitlab-runner
EOF
        destination = "local/start.sh"
        perms       = "555"
        change_mode = "noop"
      }

      resources {
        cpu    = 32
        memory = 256
      }
    }
  }
}
EOHCL
}
Some things to note about this:
A custom entrypoint needed to be created to allow the container to register and then run as the agent.
The runner is configured with the docker pull policy “if-not-present”. This allows us to build images locally and re-use them, without Gitlab checking a registry to verify that they’re up-to-date.
Vault was used for storing the registration token and Gitlab URL.
Nomad, in this instance, isn’t integrated with vault – this completely isolates the PR environment from pre-existing vault clusters.
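For completeness, the Vault secret referenced by the job above is read with a standard kv-v2 data source – a minimal sketch, assuming the mount point and secret name:

data "vault_kv_secret_v2" "gitlab" {
  # Assumed KV v2 mount and secret path, containing the
  # 'url' and 'registration_token' keys used by the runner job
  mount = "secret"
  name  = "gitlab"
}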
Gitlab pipelines
Gitlab has its own Terraform state management, which we can use for storing state between the jobs that will deploy and destroy the nomad job.
There are components available for Gitlab for OpenTofu (https://gitlab.com/components/opentofu) (the Terraform version is being deprecated due to the license change).
However, since the instance of Gitlab that I’m using doesn’t support components, I will need to hand-roll the pipelines and will re-use their deployment script.
Into the deployment Terraform repo, I added the gitlab-terraform script (https://gitlab.com/gitlab-org/terraform-images/-/blob/master/src/bin/gitlab-terraform.sh), though they do also have a gitlab-opentofu alternative (https://gitlab.com/components/opentofu/-/blob/main/src/gitlab-tofu.sh).
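Under the hood, that script simply wires the http backend up to Gitlab’s Terraform state API before running the requested command – roughly along these lines (a simplified sketch; the real script handles more edge cases):

terraform init \
  -backend-config="address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}" \
  -backend-config="lock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}/lock" \
  -backend-config="unlock_address=${CI_API_V4_URL}/projects/${CI_PROJECT_ID}/terraform/state/${TF_STATE_NAME}/lock" \
  -backend-config="username=gitlab-ci-token" \
  -backend-config="password=${CI_JOB_TOKEN}" \
  -backend-config="lock_method=POST" \
  -backend-config="unlock_method=DELETE"

This is why TF_STATE_NAME is set per-branch in the pipeline below – each PR gets its own state file.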
Adding the build, deployment and teardown for PRs, the following was added to the Gitlab CI yaml of the open source project:
.pr_deployments:
  variables:
    # Configure the base domain that the
    # environments will be using
    APP_DOMAIN: gitlab-pr.dockstudios.co.uk
    # State variable, which will isolate the
    # state based on the branch name
    TF_STATE_NAME: $CI_COMMIT_REF_SLUG
    # Populate the terraform pull_request
    # variable that will be passed to the
    # deployment terraform
    TF_VAR_pull_request: $CI_COMMIT_REF_SLUG
    # Variable for docker tag
    TF_VAR_docker_image: "terrareg:v${CI_COMMIT_SHORT_SHA}"
    # Some custom proxies for nomad
    http_proxy: http://some-proxy-for-nomad
    # etc

stages:
  # Interleaved with pre-existing stages
  - build
  - deploy

# Perform docker build of the application
# on the nomad host during the 'build' stage
build-pr-image:
  stage: build
  extends: .pr_deployments
  # Use tags to limit to the nomad runner
  tags: [nomad]
  rules:
    # Using the pipeline source of 'merge_request_event'
    # instead of 'push' means that the PR environment
    # will only be created when a PR is created/updated,
    # instead of for every branch.
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event'
  # Use docker without docker-in-docker, as we're
  # passing through the docker socket
  image: docker:latest
  script:
    # Build and tag using the short commit SHA
    # (matching TF_VAR_docker_image)
    - docker build -f Dockerfile -t terrareg:v$CI_COMMIT_SHORT_SHA .

# Deploy the PR to nomad using Terraform
deploy_review:
  stage: deploy
  extends: .pr_deployments
  tags: [nomad]
  needs: [build-pr-image]
  rules:
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event' && $CI_COMMIT_REF_NAME != $CI_DEFAULT_BRANCH
  # Use the Hashicorp docker image,
  # but override the entrypoint, as it defaults
  # to the Terraform binary
  image:
    name: hashicorp/terraform:1.5
    entrypoint: ["/bin/sh", "-c"]
  dependencies: []
  # Configure Gitlab environment for the branch
  environment:
    name: review/$CI_COMMIT_REF_NAME
    # The URL provided to the user in Gitlab
    # to view the environment
    url: https://$CI_PROJECT_NAME-$CI_COMMIT_REF_SLUG.$APP_DOMAIN
    auto_stop_in: 1 week
    # Reference the job that is used to stop the environment
    on_stop: stop_review
  # Pass additional variables to authenticate to nomad.
  variables:
    NOMAD_ADDR: ${NOMAD_ADDR}
    NOMAD_TOKEN: ${NOMAD_TOKEN}
  script:
    # Clone the Terraform repo for the project and cd into the directory
    - git clone https://gitlab.dockstudios.co.uk/pub/terra/terrareg-nomad-pipeline
    - cd terrareg-nomad-pipeline
    # Install idn2-utils, as the Gitlab script
    # uses this to determine environment variable names
    # for authentication
    - apk add idn2-utils
    # Perform Terraform plan and apply.
    - ./gitlab-terraform plan
    - ./gitlab-terraform apply

# The stop_review job can be run manually,
# or will be executed when the environment is stopped by Gitlab.
stop_review:
  stage: deploy
  extends: .pr_deployments
  tags: [nomad]
  rules:
    - if: $CI_PIPELINE_SOURCE == 'merge_request_event' && $CI_COMMIT_REF_NAME != $CI_DEFAULT_BRANCH
      when: manual
  image:
    name: hashicorp/terraform:1.5
    entrypoint: ["/bin/sh", "-c"]
  variables:
    # Use Git strategy none, as the branch
    # may no longer exist, and if the job
    # attempts to check it out, it will fail.
    GIT_STRATEGY: none
    NOMAD_ADDR: ${NOMAD_ADDR}
    NOMAD_TOKEN: ${NOMAD_TOKEN}
  environment:
    name: review/$CI_COMMIT_REF_NAME
    action: stop
  script:
    - git clone https://gitlab.dockstudios.co.uk/pub/terra/terrareg-nomad-pipeline
    - cd terrareg-nomad-pipeline
    - apk add idn2-utils
    # Perform Terraform destroy
    - ./gitlab-terraform destroy
Once this is complete, the Gitlab CI/CD variables need populating for: NOMAD_ADDR and NOMAD_TOKEN.
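In keeping with the requirement to keep authentication as dynamic as possible, the NOMAD_TOKEN can be minted from a dedicated ACL policy rather than being a management token – a sketch, assuming ACLs are enabled on the cluster (the policy scope is illustrative):

# Policy allowing job management in the default namespace
cat > gitlab-pr-policy.hcl <<EOF
namespace "default" {
  policy = "write"
}
EOF

nomad acl policy apply -description "Gitlab PR deployments" gitlab-pr gitlab-pr-policy.hcl

# Create a client token bound to the policy for the CI pipeline
nomad acl token create -name="gitlab-ci" -policy=gitlab-pr -type=client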
As this pipeline executes, it runs:
- Build
- Tests
- PR deployment (whilst the tests are running)
Since the tests take a long time, and since this allows proposed changes to be demonstrated without the tests having been updated, this arrangement was ideal.
After an initial deployment, we can see the pipeline with the new jobs:
And the successful deployment of the branch in Nomad:
SSL certificates
Whilst this is a little off-topic for the task at hand, getting the SSL certificate assigned was an interesting task:
- Historically, I purchased SSL certificates, as everybody did, including wildcard certs
- Now, I use letsencrypt for all certificates.
I hadn’t previously tried to obtain a wildcard SSL certificate from letsencrypt.
Since I use bind for public DNS, I found an interesting article that outlined a good setup: https://blog.svedr.in/posts/letsencrypt-dns-verification-using-a-local-bind-instance/
The setup was quite interesting:
- Create a new zone that covers the ACME validation subdomain
- Allow dynamic updates using a secret
- Use a certbot plugin that will push updates to bind whilst requesting the SSL certificate
I initially tried to create a zone for the entire gitlab-pr subdomain. However, during the verification process, I received errors that an SOA record for the _acme-challenge subdomain couldn’t be found – forcing another new zone to be created for it.
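The end result looks roughly like the following – a sketch using certbot’s dns-rfc2136 plugin, with an assumed TSIG key name and placeholder values:

# Credentials for dynamic updates against the _acme-challenge zone
cat > /etc/letsencrypt/rfc2136.ini <<EOF
dns_rfc2136_server = <public bind server IP>
dns_rfc2136_name = letsencrypt
dns_rfc2136_secret = <base64 TSIG secret>
dns_rfc2136_algorithm = HMAC-SHA512
EOF
chmod 600 /etc/letsencrypt/rfc2136.ini

# Request the wildcard certificate via DNS-01 validation
certbot certonly \
  --dns-rfc2136 \
  --dns-rfc2136-credentials /etc/letsencrypt/rfc2136.ini \
  -d '*.gitlab-pr.dockstudios.co.uk'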
Importing test data
With the new environments being automatically created, I wanted to automatically load some test data into the environment, giving a better out-of-the-box experience.
I implemented a small batch job, alongside the main application job:
resource "nomad_job" "terrareg_data_import" {
  jobspec = <<EOHCL
job "terrareg-${var.pull_request}-data-import" {
  type = "batch"

  group "import-data" {
    count = 1

    task "import-data-task" {
      template {
        data = <<EOH
{{ range nomadService "terrareg-${var.pull_request}" }}
base_url="http://{{ .Address }}:{{ .Port }}"

# Wait for API to be ready
for itx in $(seq 1 10)
do
  curl $base_url && break
  sleep 5
done

make_request() {
  endpoint=$1
  data=$2
  curl $base_url$endpoint \
    -XPOST \
    -H 'Content-Type: application/json' \
    -H 'X-Terrareg-ApiKey: GitlabPRTest' \
    -d "$data"
}

# Create namespace
make_request "/v1/terrareg/namespaces" '{"name": "demo"}'

# Create module
make_request "/v1/terrareg/modules/demo/rds/aws/create" '{"git_provider_id": 1, "git_tag_format": "v{version}", "git_path": "/"}'

# Import version
make_request "/v1/terrareg/modules/demo/rds/aws/import" '{"version": "6.4.0"}'
{{ end }}
EOH
        destination = "local/setup.sh"
        perms       = "755"
      }

      driver = "docker"
      config {
        image      = "quay.io/curl/curl:latest"
        entrypoint = ["/bin/sh"]
        command    = "/local/setup.sh"
      }

      env {
        # Force job re-creation
        FORCE = "${uuid()}"
      }

      resources {
        cpu        = 32
        memory     = 32
        memory_max = 128
      }
    }
  }
}
EOHCL

  purge_on_destroy = true
  detach           = false
  rerun_if_dead    = true

  depends_on = [nomad_job.terrareg]
}
As the configuration is all performed via API calls, I used the curl docker image, replacing the entrypoint with a shell script, which is dynamically generated using a template. The template allows the injection of the IP/port of the main container.
The uuid() environment variable ensures that the job is re-created on each run. By default, a batch job is only run once after creation, which suits this application perfectly.
For me, the combination of these technologies has been really useful:
Nomad is very easy to set up and can easily be used in an isolated environment. Deploying to nomad is also incredibly easy, making a change like this simple to implement.
Gitlab’s integration with review environments is nice and attaches itself to pull requests:
The functionality for dynamic environment generation and automated teardown is incredibly useful – not something you’d find in many deployment tools.
Overall, this change took around half a day to get working end-to-end and will provide a great benefit for this project – and should also be easy to adopt in other projects.
To see the complete example, see: