GitHub Actions for Self-Hosted Deployment Pipelines
GitHub Actions is well-documented for deployments to cloud PaaS platforms — Vercel, Railway, Fly.io, and similar managed environments all have first-party integrations. But in reality, a significant share of production infrastructure runs on self-hosted VPS instances or bare-metal servers where the deployment model is fundamentally different. There is no push-to-deploy API. You need to reach the server, copy files, and execute commands. This post covers the patterns we use for deploying Docker Compose applications to a self-hosted Hetzner server via GitHub Actions, including secrets management and rollback strategies.
Infrastructure Context
Our production server runs Debian on a Hetzner CX53. Docker Compose manages the service stack, and Traefik acts as the reverse proxy and TLS termination layer. Deployments involve building a Docker image, pushing it to a registry, pulling it on the server, and restarting the affected service. The web frontend is a static site deployed via rsync rather than Docker, which has a simpler deployment path but the same access model.
GitHub Actions runners are hosted by GitHub (we do not use self-hosted runners). They connect to our server via SSH using a deployment key. This is a standard pattern but has security implications: the deployment key must have access to the server, which means it is a high-value credential that needs careful handling in both GitHub Secrets and on the server itself.
SSH Deployment Setup
The deployment key is an Ed25519 keypair generated specifically for CI. The private key is stored as a GitHub Secret; the public key is added to ~/.ssh/authorized_keys on the server for the deployment user. We use a dedicated system user (deploy) with limited shell access and ownership only over the directories the deployment process needs to write to. This user cannot sudo and cannot access other users’ home directories.
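Generating the keypair is a one-off step; a minimal sketch, with an illustrative filename and comment:

```shell
# Generate a CI-only Ed25519 keypair with no passphrase.
# The -C comment makes the key identifiable in authorized_keys later.
ssh-keygen -t ed25519 -C "github-actions-deploy" -f ./ci_deploy_key -N ""

# The private key (./ci_deploy_key) is stored as a GitHub Secret;
# the public key is appended to the deploy user's authorized_keys:
#   cat ci_deploy_key.pub >> /home/deploy/.ssh/authorized_keys
```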
Connecting via SSH in a GitHub Actions workflow requires loading the private key into the SSH agent before any remote commands run. The standard pattern uses ssh-agent and ssh-add in the workflow steps, or a community action like webfactory/ssh-agent which handles key loading and known hosts configuration. We add the server’s host key to the workflow’s known hosts at setup time to prevent interactive host verification prompts from blocking the pipeline.
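A sketch of those setup steps as workflow YAML, assuming the webfactory/ssh-agent action and secret names SSH_PRIVATE_KEY and SSH_KNOWN_HOSTS (the version pin and secret names are illustrative):

```yaml
steps:
  - uses: actions/checkout@v4

  # Load the deployment key into an ssh-agent for subsequent steps.
  - uses: webfactory/ssh-agent@v0.9.0
    with:
      ssh-private-key: ${{ secrets.SSH_PRIVATE_KEY }}

  # Pin the server's host key so ssh never blocks on an
  # interactive verification prompt.
  - name: Add server to known hosts
    run: |
      mkdir -p ~/.ssh
      echo "${{ secrets.SSH_KNOWN_HOSTS }}" >> ~/.ssh/known_hosts
```

Storing the host key as a secret (rather than running ssh-keyscan at deploy time) means the pipeline verifies it is talking to the expected server, not merely a server.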
For rsync deployments — our static site, for example — the workflow builds the output, then runs rsync -avz --delete over SSH to synchronise the dist/ directory to the server. The --delete flag removes files from the destination that no longer exist in the source, which matters for static sites where stale files can cause unexpected behaviour. The connection string uses the deployment user and a non-standard SSH port if applicable.
Docker Compose Remote Execution
Deploying a Docker Compose service requires a different approach. The workflow needs to build the new image (or use a pre-built image from a registry), and then, on the remote server, pull the new image and restart the service with minimal downtime.
We use a two-step remote execution pattern. The first SSH command handles the pull: ssh deploy@server "docker pull registry/image:tag". The second handles the restart: ssh deploy@server "docker compose -f /opt/stack/docker-compose.yml up -d --no-deps service_name". The --no-deps flag prevents Docker Compose from restarting dependent services unnecessarily. Running pull and restart as separate commands means a pull failure does not leave the service in a partially updated state.
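The two steps can be sketched as a small deploy script. Everything here is a placeholder — the host, registry path, compose file location, and service name are illustrative — and SSH_BIN is overridable so the script can be dry-run without a server:

```shell
#!/bin/sh
set -eu

REMOTE="${REMOTE:-deploy@server}"
TAG="${TAG:-example}"              # normally the git commit SHA
IMAGE="registry/image:$TAG"
SSH_BIN="${SSH_BIN:-ssh}"          # e.g. SSH_BIN=echo for a local dry run

pull_image() {
  "$SSH_BIN" "$REMOTE" "docker pull $IMAGE"
}

restart_service() {
  "$SSH_BIN" "$REMOTE" \
    "docker compose -f /opt/stack/docker-compose.yml up -d --no-deps service_name"
}

# In the workflow, the && ensures a failed pull never triggers the restart:
# pull_image && restart_service
```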
For services that require database migrations before the new version starts, we add a third SSH command that runs the migration inside the new container image before the service restart: docker run --rm --env-file /opt/stack/.env registry/image:tag migrate. Migrations run against the current database before traffic switches to the new container. This assumes migrations are backwards-compatible — a requirement that deserves its own discussion but is a prerequisite for zero-downtime deployments regardless of your deployment tooling.
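A sketch of that migration step in the same style (host, env-file path, and image name are placeholders; SSH_BIN is overridable for dry runs):

```shell
#!/bin/sh
set -eu

REMOTE="${REMOTE:-deploy@server}"
IMAGE="${IMAGE:-registry/image:example}"
SSH_BIN="${SSH_BIN:-ssh}"

# Run migrations from the *new* image against the current database,
# before the service restart switches traffic to the new container.
run_migrations() {
  "$SSH_BIN" "$REMOTE" \
    "docker run --rm --env-file /opt/stack/.env $IMAGE migrate"
}

# run_migrations   # called between the pull and restart steps
```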
We pass the image tag as a workflow input or derive it from the git commit SHA. Tagging images with the commit SHA rather than “latest” provides an unambiguous record of what is running in production and makes rollback straightforward — you can deploy any previous tag without needing to reason about what “latest” was at a given point in time.
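Deriving the tag might look like the following; the registry path is a placeholder, and GITHUB_SHA is set automatically by Actions on hosted runners (the default here exists only so the snippet runs outside of CI):

```shell
# Tag the image with the full commit SHA for an immutable, auditable tag.
SHA="${GITHUB_SHA:-0000000000000000000000000000000000000000}"
IMAGE="registry.example.com/app:${SHA}"

# Build and push would then reference the immutable tag:
#   docker build -t "$IMAGE" . && docker push "$IMAGE"
echo "$IMAGE"
```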
Secrets Management
GitHub Secrets stores credentials that the workflow needs at runtime: SSH private keys, registry credentials, environment variable values. GitHub masks these values in workflow logs, which prevents accidental exposure in build output. Secrets are referenced in workflow steps with the expression syntax ${{ secrets.SSH_PRIVATE_KEY }}, typically by mapping them into a step's environment or action inputs.
Application secrets — the values that go into the .env file on the server — are a separate concern. We do not store application secrets in GitHub Secrets for injection at deployment time. Instead, the .env file lives on the server and is managed independently of the deployment pipeline. Deployment does not update or replace the env file; it only updates the running code. This means changes to application secrets require a separate manual step on the server, which creates an intentional gate rather than making environment variable changes part of every deploy.
The alternative — storing all application secrets in GitHub and injecting them during deployment — is simpler to reason about but concentrates credential exposure. If your GitHub account or repository is compromised, an attacker with the ability to trigger a workflow run would have access to all application secrets. Keeping secrets on the server means an attacker needs both GitHub access and server access to extract them.
Rollback Strategies
The most straightforward rollback strategy for a Docker-based deployment is redeployment of the previous image tag. Because we tag images with git commit SHAs, rolling back means re-running the deployment workflow with the previous commit’s SHA as the image tag. This can be done by reverting the git branch or by manually triggering a workflow dispatch with the target tag as an input parameter.
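The manual trigger can be modelled as a workflow_dispatch input. A sketch, assuming a hypothetical deploy script at ./scripts/deploy.sh that reads the tag from a TAG environment variable:

```yaml
on:
  workflow_dispatch:
    inputs:
      image_tag:
        description: "Commit SHA to deploy (a previous SHA for a rollback)"
        required: true
        type: string

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Deploy selected tag
        run: ./scripts/deploy.sh
        env:
          TAG: ${{ inputs.image_tag }}
```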
For rsync deployments — the static site — rollback is a redeploy of the previous build artifact. We retain build artifacts as GitHub Actions workflow artifacts for 30 days. For a production incident, we download the previous artifact and rsync it manually. This is infrequent enough that a manual process is acceptable; automating it would add workflow complexity that is not justified by the rollback frequency.
A common failure mode in self-hosted deployment pipelines is a partial deployment that leaves the service in an inconsistent state. The Docker Compose approach handles this well because container restarts are atomic from the service’s perspective — either the new container starts successfully or the old one continues running. The more dangerous scenario is a migration that runs successfully on a database before a container restart fails. At that point, rolling back the code may not be safe if the migration altered the schema in a way the previous version cannot handle. We address this by requiring all migrations to be backward-compatible and by running a smoke test against the new container before the final service restart step in the workflow.
Observability in the Pipeline
GitHub Actions provides built-in logging for every workflow run, but the logs are ephemeral — they are deleted after a retention period and are not a substitute for application-level observability. We treat the workflow logs as diagnostic information for CI failures and rely on server-side logging (structured JSON logs shipped to a log aggregator) for production incident investigation.
One addition that has paid off: a final step in every deployment workflow that runs a health check against the deployed service. A simple HTTP request to the service’s health endpoint with a timeout and a non-200 assertion triggers a workflow failure if the service did not come up cleanly. Combined with an alerting integration on workflow failures, this gives a near-real-time signal that a deployment has left the service in a broken state, before customers encounter it.
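A minimal sketch of such a health-check step; the endpoint URL, retry count, and interval are placeholders to tune per service:

```shell
#!/bin/sh
# Poll the health endpoint; return non-zero (failing the workflow step)
# if it never answers with a 2xx within the allowed attempts.
check_health() {
  url="$1"
  attempts="${2:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP error responses (e.g. a 500).
    if curl -fsS --max-time 5 "$url" > /dev/null 2>&1; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep 3
    i=$((i + 1))
  done
  echo "service failed health check" >&2
  return 1
}

# check_health "https://app.example.com/healthz"
```

Because a non-zero exit code fails the workflow step, any alerting integration wired to workflow failures fires automatically when the service does not come up.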