Reviving the Blog

9 minute read

Introduction

High time to revive the blog!

I didn’t have the source in git, and there was only a manual “deployment” in place. As the first post after the revival, I’ll document the improvement process. This post will cover:

  • Moving my GitLab instance
  • Automatic mirroring from GitLab to GitHub
  • CI/CD pipeline on GitLab to build and create a release for the blog

Moving and removing public access from my GitLab instance

I have locked down my self-hosted GitLab instance from the public. I am not comfortable staying on top of each and every security issue, and in the past I have received abuse mails because people scanning the internet for vulnerable instances thought mine had issues. There have been some serious security issues, some of them allowing RCE, and GitLab has become popular enough that people actively hunt for vulnerable instances.

The Omnibus docker container I was running used bind-mounted volumes. To move the instance, all I did was stop the container on the source machine and then use rsync to transfer everything over to the target machine: both the contents of the persistent volumes and the docker-compose.yml.
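The move described above can be sketched roughly as follows. This is a minimal sketch, not the exact commands that were run: the host name, the source directory and the compose file locations are hypothetical, and only the container name and the target volume path come from the compose file shown further down. By default it only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the move; prints the commands unless DRY_RUN=0 is set.
set -euo pipefail

TARGET_HOST="${TARGET_HOST:-target.example.com}"                   # hypothetical host name
SRC_DIR="${SRC_DIR:-/srv/gitlab}"                                  # hypothetical source volume path
DST_DIR="${DST_DIR:-/var/lib/pv/gitlab}"                           # volume path from the compose file
COMPOSE_SRC="${COMPOSE_SRC:-/srv/compose/docker-compose.yml}"      # hypothetical
COMPOSE_DST="${COMPOSE_DST:-/srv/compose/compose.yml}"             # hypothetical
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, execute it otherwise.
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

migrate_gitlab() {
  # Stop first so the database files are not changing during the copy.
  run docker stop gitlab
  # -a keeps ownership, modes and timestamps; --numeric-ids avoids remapping IDs by name.
  run rsync -a --numeric-ids "$SRC_DIR/" "$TARGET_HOST:$DST_DIR/"
  run rsync -a "$COMPOSE_SRC" "$TARGET_HOST:$COMPOSE_DST"
}

migrate_gitlab
```

Running it with DRY_RUN=0 would execute the commands instead of printing them. Stopping the container before copying is the important part, since a copy of a running database directory may be inconsistent.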

On the target machine I decided to run podman instead of docker and, while not strictly needed, I renamed the docker compose file to compose.yml. After updating the volumes section to match the new situation, I was all done:

version: '2'

services:
  gitlab:
    #image: gitlab/gitlab-ce:11.11.0-ce.0
    #image: gitlab/gitlab-ce:12.10.14-ce.0
    #image: gitlab/gitlab-ce:13.0.0-ce.0
    #image: gitlab/gitlab-ce:13.3.2-ce.0
    #image: gitlab/gitlab-ce:13.9.4-ce.0
    #image: gitlab/gitlab-ce:13.9.6-ce.0
    #image: gitlab/gitlab-ce:13.12.15-ce.0
    #image: gitlab/gitlab-ce:14.0.12-ce.0
    #image: gitlab/gitlab-ce:14.1.8-ce.0
    #image: gitlab/gitlab-ce:14.5.2-ce.0
    #image: gitlab/gitlab-ce:14.7.3-ce.0
    #image: gitlab/gitlab-ce:14.8.2-ce.0
    #image: docker.io/gitlab/gitlab-ce:14.9.0-ce.0
    #image: docker.io/gitlab/gitlab-ce:14.9.5-ce.0
    #image: docker.io/gitlab/gitlab-ce:14.10.5-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.0.3-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.3.3-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.3.5-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.4.0-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.7.1-ce.0
    #image: docker.io/gitlab/gitlab-ce:15.7.2-ce.0
    image: docker.io/gitlab/gitlab-ce:15.8.3-ce.0
    #image: gitlab/gitlab-ce:14.10.2-ce.0
    hostname: git.example.com
    domainname: example.com
    container_name: gitlab
    ports:
      - "8222:22"
      - "8700:80"
    volumes:
      - /var/lib/pv/gitlab/data:/var/opt/gitlab
      - /var/lib/pv/gitlab/logs:/var/log/gitlab
      - /var/lib/pv/gitlab/config/ssh_host_ecdsa_key:/etc/gitlab/ssh_host_ecdsa_key:ro
      - /var/lib/pv/gitlab/config/ssh_host_ecdsa_key.pub:/etc/gitlab/ssh_host_ecdsa_key.pub:ro
      - /var/lib/pv/gitlab/config/ssh_host_ed25519_key:/etc/gitlab/ssh_host_ed25519_key:ro
      - /var/lib/pv/gitlab/config/ssh_host_ed25519_key.pub:/etc/gitlab/ssh_host_ed25519_key.pub:ro
      - /var/lib/pv/gitlab/config/ssh_host_rsa_key:/etc/gitlab/ssh_host_rsa_key:ro
      - /var/lib/pv/gitlab/config/ssh_host_rsa_key.pub:/etc/gitlab/ssh_host_rsa_key.pub:ro
      - /var/lib/pv/gitlab/config/gitlab-secrets.json:/etc/gitlab/gitlab-secrets.json
    environment:
      GITLAB_OMNIBUS_CONFIG: |
        external_url 'https://git.example.com'
        nginx['listen_port'] = 80
        nginx['listen_https'] = false
        nginx['real_ip_trusted_addresses'] = [ '10.0.0.0/8' ]
        nginx['real_ip_header'] = 'X-Forwarded-For'
        nginx['real_ip_recursive'] = 'on'
        gitlab_rails['gitlab_ssh_host'] = 'git.example.com'
        gitlab_rails['gitlab_email_from'] = 'gitlab@example.com'
        gitlab_rails['gitlab_shell_ssh_port'] = 8222
        gitlab_rails['smtp_enable'] = true
        gitlab_rails['smtp_address'] = "exim"
        gitlab_rails['smtp_port'] = 25
        user['git_user_email'] = "gitlab@example.com"
        sidekiq['concurrency'] = 1
        prometheus['enable'] = true
        prometheus['monitor_kubernetes'] = false
        grafana['enable'] = true
        grafana['gitlab_application_id'] = 'redacted'
        grafana['gitlab_secret'] = 'redacted'
    restart: always
    depends_on:
      - exim
    shm_size: '256m'
    # leaving out the exim service

I like to keep track of which versions I have been running, and I think the commented-out image lines are a good place to do that.

I switched from using a copy of gitlab.rb to the GITLAB_OMNIBUS_CONFIG environment variable, because configuring through environment variables is much more flexible. One mistake that took me a while to figure out was that I had written external_url = 'https://git.example.com', with an equals sign, and that caused GitLab to not configure itself properly behind the HTTPS reverse proxy. I didn’t notice any errors, and the external_url line appeared to be ignored. GitLab was still able to figure out the hostname, I would guess from the X-Forwarded or Host headers, but it thought it was on http.

Symptoms were that the clone URL on repo pages used http://, the GitLab runners failed to pull properly, and the releases API thoroughly confused release-cli. All that happened even though an HTTP redirect is returned; I remember there being something about PUT and POST requests when redirects are in play. I wish I had saved the specific error messages, because I did find some people having the same issues, but nobody had posted a cause or solution.

For a few days I worked around some of the issues by explicitly specifying the Custom Git clone URL for HTTP(S) in the global admin settings. I knew this was not a proper fix, but I was stuck. While this did get me past the GitLab runner’s pulling issue, release-cli was not using the same environment vars that are used for cloning.
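The stray equals sign is easy to miss because nothing complains about it. A likely explanation, since the Omnibus config is Ruby, is that external_url = '...' assigns a local variable instead of calling the external_url setting, so the value is silently dropped. Below is a minimal sketch of a pre-flight check that flags this form; the function name check_external_url is hypothetical, and it only looks at the text, it does not talk to GitLab.

```shell
#!/usr/bin/env bash
# Flags `external_url = '...'` in Omnibus config text read from stdin.
check_external_url() {
  local bad
  # -n adds line numbers; `|| true` keeps a no-match result from counting as an error.
  bad="$(grep -nE '^[[:space:]]*external_url[[:space:]]*=' || true)"
  if [ -n "$bad" ]; then
    printf 'suspicious external_url assignment: %s\n' "$bad" >&2
    return 1
  fi
  return 0
}

# The intended form passes, the form with the equals sign is reported.
check_external_url <<< "external_url 'https://git.example.com'" && echo "looks fine"
check_external_url <<< "external_url = 'https://git.example.com'" || echo "would be ignored"
```

It could be run against the compose file before starting the container, for example check_external_url < compose.yml.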

The mounted volumes are all on a ZFS dataset which I created with:

zfs create rpool/var/lib/pv/gitlab

Finally, the way I locked down this instance from the public is by only allowing specific IPs on the NGINX reverse proxy:

server {
    listen 443 http2 ssl;
    listen [::]:443 http2 ssl;
    server_name git.example.com;

    include /etc/nginx/conf.d/ssl_params;
    include /etc/nginx/conf.d/common_locations;

    # Default is HTTP/1, keepalive is only enabled in HTTP/1.1
    proxy_http_version 1.1;
    proxy_set_header Connection "";
    chunked_transfer_encoding off;

    client_max_body_size 0;
    
    proxy_buffering off;
    proxy_request_buffering off;

    location / {
        # location 1
        allow x.x.x.x;
        # location 2
        allow x.x.x.x;
        # public ips of the box running the GitLab Instance
        allow x:x:x:x::1;
        allow x.x.x.x;
        # internal, podman network is in this range
        allow 10.0.0.0/8;

        deny all;

        proxy_pass http://localhost:8700;
        proxy_set_header Host              $host;
        proxy_set_header X-Real-IP         $remote_addr;
        proxy_set_header X-Forwarded-By    $server_addr:$server_port;
        proxy_set_header X-Forwarded-For   $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_set_header X-Forwarded-SSL   on;

        proxy_connect_timeout 300;
    }
}

Automatic mirroring from GitLab to GitHub

After finding the source to the Jekyll blog, I created a new repo on my GitLab instance.

Since I locked down this GitLab instance from the public, I decided to mirror some public repos to GitHub so they remain available.

This was easy to set up. I followed the instructions from the GitLab docs.

First I created the target repo on GitHub and a Personal Access Token in the developer settings section. I chose the new fine-grained option. The options I selected are:

  • Only select repositories, where I selected the newly created daniel5gh/blog repo; I will add others to this as needed
  • Read and Write Contents
  • Read Only Metadata (selected by default and mandatory)

I selected a lifetime, generated the token and stored it in a secure location.

Then on GitLab I set up push mirroring. Under Mirroring Repositories in the project’s repository settings, I filled out the URL field. It was important to include the username, because without it GitLab will try to push anonymously. I used https://daniel5gh@github.com/daniel5gh/blog.git, set the mirror direction to push and entered the Personal Access Token as the password.

I did not select Keep divergent refs, because I will not have any divergent refs on the target.

Behold the mirrored repo for this blog: https://github.com/daniel5gh/blog

CI/CD pipeline on GitLab to build and create a release for the blog

I’ll go over the CI/CD setup section by section:

image: docker.io/ruby:3.1

Jekyll is a Ruby project, so we select this image as the main image for the jobs. The docker.io prefix is usually omitted, but I have configured podman to not have any unqualified-search-registries (in /etc/containers/registries.conf), so I need to be specific. I also had to add allowed_images = ["docker.io/*:*", "registry.gitlab.com/gitlab-org/*:*"] to the GitLab runner config files.

variables:
  JEKYLL_ENV: production
  LC_ALL: C.UTF-8

setvars:
  stage: .pre
  script:
    - |
      if [[ -n $CI_COMMIT_TAG ]]; then
        echo "VERSION=$CI_COMMIT_TAG" >> version.env
      else
        echo "VERSION=$CI_COMMIT_REF_SLUG-$CI_COMMIT_SHORT_SHA" >> version.env
      fi
      source version.env
      echo "TARBAL_FILENAME=blog-$VERSION.tar.bz2" >> version.env
    - cat version.env
  artifacts:
    reports:
      dotenv: version.env

Here I set up global env variables in the first stage, which are then available in all subsequent stages and jobs. The dotenv report artifact is responsible for loading these variables in those jobs.

When the length of the string CI_COMMIT_TAG is non-zero, it means this pipeline runs after a git tag has been created. In this case I want the VERSION to be that tag name. A commit can have multiple tags, and I don’t know what CI_COMMIT_TAG contains in that case, maybe space-separated tags. I’m willing to take this risk and I will be sure to quote any usage of VERSION.

In all other cases the VERSION will consist of the git ref slug and short SHA, for example: master-d381768f. Because the GitLab predefined variable specifically says slug, I am assuming no spaces can be in there.

Lastly I want to define a tarball filename here, so I can use it in both the build and release jobs. Because it uses the VERSION, which is chosen conditionally, we’ll have to source it into the env first.

cat version.env is just there for verbosity.
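To try out both branches of this logic outside of a pipeline, the script can be reproduced as a small function. This is a minimal sketch: compute_version_env is a hypothetical name, the three arguments stand in for CI_COMMIT_TAG, CI_COMMIT_REF_SLUG and CI_COMMIT_SHORT_SHA, and the tag v1.0.0 is made up for illustration.

```shell
#!/usr/bin/env bash
# Local reproduction of the setvars logic.
# $1: tag name (empty when not a tag pipeline), $2: ref slug, $3: short SHA
compute_version_env() {
  local tag="$1" slug="$2" sha="$3" version
  if [[ -n "$tag" ]]; then
    version="$tag"
  else
    version="$slug-$sha"
  fi
  # Same two lines the job appends to version.env.
  printf 'VERSION=%s\n' "$version"
  printf 'TARBAL_FILENAME=blog-%s.tar.bz2\n' "$version"
}

compute_version_env "" "master" "d381768f"
compute_version_env "v1.0.0" "v1-0-0" "d381768f"
```

The first call takes the branch path and yields VERSION=master-d381768f, the second takes the tag path and yields VERSION=v1.0.0, each followed by the matching tarball filename.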

build:
  stage: build
  script:
    - gem install bundler
    - bundle install
    - bundle exec jekyll build -d blog
    - tar cjvf "$TARBAL_FILENAME" ./blog
    - echo "DOWNLOAD_LINK=$CI_JOB_URL/artifacts/raw/$TARBAL_FILENAME?inline=false" >> download.env
    - cat download.env
  artifacts:
    paths:
      - "*.bz2"
    reports:
      dotenv: download.env
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
    - if: $CI_COMMIT_TAG

There are two things happening in the build job: first the Jekyll site is built and packaged, and second we add another environment var for the job artifact’s download link.

The build and tar.bz2 creation is straightforward. The tarball gets uploaded as an artifact to the build job. This build job’s ID is encoded in CI_JOB_URL and it is specific for this exact build job. The next job which is for release will have another ID.

Because the release job has another ID, we have to store the DOWNLOAD_LINK in this job. I chose the raw link, because I want to easily download the latest release with curl when deploying. Without the raw part in the URL, we’d be presented with the artifact download or preview page.

Lastly the rules are to run this job only if a commit is made to the default branch or when a tag has been created, or both.
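How the raw link is put together, and how it could later be fetched with curl, can be sketched like this. It is a minimal sketch under assumptions: the function names, the project path daniel/blog and the job ID 123 are hypothetical, and the download assumes the artifact is readable without authentication from an allowed IP. A private project would need credentials, which are left out here.

```shell
#!/usr/bin/env bash
# Builds the raw artifact link the same way the build job does.
# $1: job URL (CI_JOB_URL in the pipeline), $2: artifact file name
artifact_raw_url() {
  printf '%s/artifacts/raw/%s?inline=false\n' "$1" "$2"
}

# Downloads a tarball; defined here but not called in this sketch.
# $1: URL, $2: output file
fetch_tarball() {
  # --fail makes curl exit non-zero on HTTP errors instead of saving an error page.
  curl --fail --location --silent --show-error --output "$2" "$1"
}

artifact_raw_url "https://git.example.com/daniel/blog/-/jobs/123" "blog-v1.0.0.tar.bz2"
```

On the deployment side, something like fetch_tarball "$DOWNLOAD_LINK" blog.tar.bz2 followed by tar xjf blog.tar.bz2 would unpack the site into ./blog.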

release_job:
  stage: deploy
  image: registry.gitlab.com/gitlab-org/release-cli:latest
  rules:
    - if: $CI_COMMIT_TAG                 # Run this job when a tag is created
  script:
    - echo "running release_job"
  release:                               # See https://docs.gitlab.com/ee/ci/yaml/#release for available properties
    tag_name: $CI_COMMIT_TAG
    name: 'Release $CI_COMMIT_TAG'
    description: 'Release created for version $VERSION.'
    assets:
      links:
        - name: '$TARBAL_FILENAME'
          url: '$DOWNLOAD_LINK'

The release job uses the release-cli command line tool. I followed the instructions in the GitLab documentation on how to use the release keyword, and the only thing missing there was how to get the link to an artifact built in a prior job of this pipeline. I chose to store the download link in an environment variable named DOWNLOAD_LINK and this seems to work out fine.

I did wonder how release-cli knows where to connect to and what defaults to use, but apparently this is just picking it up from the predefined CI_ environment variables.
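To the best of my knowledge, the variables release-cli relies on here are CI_SERVER_URL, CI_PROJECT_ID and CI_JOB_TOKEN. Since CI_SERVER_URL follows from the instance’s configured URL, a small sanity check on its scheme could catch an http/https mix-up like the one caused by a misconfigured external_url. A minimal sketch, with check_ci_server_url being a hypothetical helper name:

```shell
#!/usr/bin/env bash
# Checks that the server URL a job sees uses https.
# $1: the URL to check (CI_SERVER_URL inside a pipeline)
check_ci_server_url() {
  case "${1:-}" in
    https://*)
      echo "ok: $1"
      ;;
    http://*)
      echo "warning: $1 is plain http, check external_url"
      return 1
      ;;
    *)
      echo "CI_SERVER_URL is empty or unexpected: '${1:-}'"
      return 1
      ;;
  esac
}

# Outside of a pipeline the variable is unset, so fall back to an example value.
check_ci_server_url "${CI_SERVER_URL:-https://git.example.com}"
```

Inside a job this could run as an early script line, so a wrong scheme shows up before release-cli fails with a less obvious error.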

Conclusion

Finally, another blog post after a short 8-year break. I figured I would make it a bit meta by writing about the blog itself and how I revived it. One part is missing, which is how I am deploying the automatically built tarballs. This is not automatic yet, and I will have to think about how I want to do that. It’ll be a topic in another post. I don’t yet know if I want to push it from my CI/CD pipeline, because then I’ll need to have some credentials on there. Pulling it from a cron job on the target box could also work, but is a bit lame I think.

Other ideas for topics include some research and maybe implementation of a commenting system. I want the blog to remain a static HTML site, so that’ll be interesting. And I am also working again on a tower defense game, this time using Bevy Engine in Rust, and I want to share my learnings on Entity Component Systems, which I think are super cool. For the game I’ll also be using AI art generation and maybe ChatGPT to help me with some story elements!