What is docker machine?

Docker Machine was a project started by Docker in 2014 for managing remote virtual machines on providers such as AWS and Rackspace; drivers such as OpenStack extended it to self-hosted deployments. Its primary feature was to install Docker on remote hosts automatically and configure the Docker client to connect to them over the Docker API, managing TLS certificates, authentication, and so on for you.

It creates servers, installs Docker on them, then configures the Docker client to talk to them. (from project’s readme)

GitLab supports autoscaling for CI/CD and self-hosted runners

Docker no longer maintains Docker Machine, but GitLab maintains a fork, which is currently the preferred mechanism for autoscaling self-hosted runners.

GitLab intends to replace docker-machine eventually, but that has not happened so far.

A couple of open issues with more details:

Executing code on the virtual machines instead of docker containers

This approach allows using the docker executor on a cluster of autoscaling VMs. Because the jobs run inside containers, it is not very helpful if you intend to test infrastructure that involves things like:

  • systemd
  • iptables
  • system level tracing

It is possible to mount the Docker socket from the host into the containers running the CI/CD jobs. Alternatives like kaniko are generally a better choice, but they are not useful for our use case, so we are going to need a mounted Docker socket for the docker executor ;)

sample runner config to allow mounting docker socket:

  [runners.docker]
    tls_verify = false
    image = "docker:20.10.16"
    privileged = false
    disable_cache = false
    volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
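Before a job relies on the mounted socket, it can help to confirm that the socket actually reached the container. A minimal sketch of such a check, assuming the default socket path from the config above; the helper name `socket_state` is hypothetical and not part of GitLab or Docker:

```shell
# Hypothetical helper: report whether a Docker socket looks usable at the given path.
# The default path matches the volume mount in the runner config above.
socket_state() {
  sock="${1:-/var/run/docker.sock}"
  if [ ! -S "$sock" ]; then
    echo "absent"        # nothing mounted there, or the path is not a Unix socket
  elif [ -r "$sock" ] && [ -w "$sock" ]; then
    echo "accessible"    # permissions allow the current user to open the socket
  else
    echo "denied"        # socket exists but permissions block access
  fi
}

socket_state
```

A job could call this as its first script step and stop early when the result is anything other than `accessible`.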

Docker socket access in container == root on the host

Docker mostly runs as root on the host, and adding a user to the docker group (which allows access to the Docker socket) is equivalent to granting that user root access, unless user namespacing is in use or Docker is running in rootless mode.

We can utilize this fact to escape out of the docker containers running on an autoscaling cluster and instead execute scripts on the hosts themselves.

example ci config for gitlab:

stages:
  - test

test:
  stage: test
  image: alpine:latest
  script:
    - apk add docker
    - apk add openssh
    - ssh-keygen -t ed25519 -N '' -f ed25519
    - docker run --rm -v /:/host alpine sh -c "echo \"$(cat ed25519.pub)\" >> /host/home/ubuntu/.ssh/authorized_keys"
    - host_ip=$(docker run --rm --net host alpine sh -c "ip route get 1| awk  '{print \$NF}'") && echo $host_ip
    - |
      ssh -t -o StrictHostKeyChecking=no -i ./ed25519 ubuntu@$host_ip <<'END_TESTS'
      echo "running tests .."
      END_TESTS
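The `host_ip` step in the job above takes the last field of the `ip route get 1` output. Which field comes last can depend on the `ip` implementation, so picking the value that follows `src` may be more robust. A small sketch; the function name `src_addr` and both sample lines are assumptions made up for illustration, not captured output:

```shell
# Hypothetical helper: print the address that follows the "src" keyword.
src_addr() {
  awk '{ for (i = 1; i < NF; i++) if ($i == "src") { print $(i + 1); exit } }'
}

# Assumed sample where the source address happens to be the last field.
echo "1.0.0.0 via 10.0.0.1 dev eth0 src 10.0.0.5" | src_addr

# Assumed sample with extra trailing fields, where `print $NF` would not yield the address.
echo "1.0.0.0 via 10.0.0.1 dev eth0 src 10.0.0.5 uid 0" | src_addr
```

Both calls print the address after `src`, regardless of what follows it on the line.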

This approach also ensures that the exit code of the commands executed over ssh is passed on to the container running the job, so a non-zero exit on the host fails the job.
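One caveat worth keeping in mind: a shell that reads its script from stdin reports the status of the last command it ran, so an early failure can be masked unless the script starts with `set -e`. The sketch below illustrates this with a local `sh -s` standing in for the ssh session (an assumption made so that it runs without a remote host); `remote_shell` is a hypothetical name:

```shell
# Local stand-in for `ssh -i ./ed25519 ubuntu@$host_ip` (assumption for this sketch):
# like the remote shell, it reads the script from stdin.
remote_shell() { sh -s; }

# Without `set -e` the status is that of the last command, so the failure is masked.
masked_status=0
remote_shell <<'END_TESTS' || masked_status=$?
false
echo "still running"
END_TESTS

# With `set -e` the shell stops at the first failing command and returns its status.
strict_status=0
remote_shell <<'END_TESTS' || strict_status=$?
set -e
false
echo "never printed"
END_TESTS

echo "masked=$masked_status strict=$strict_status"
```

For the job above, this would mean putting `set -e` as the first line inside the `END_TESTS` heredoc.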