A Ruby Script to Deploy Rails Docker App to Hetzner
When new code is committed, I need to build a new Docker image and repeat the process of:
- Transfer the image: docker save locally, piped over ssh into docker load on the server
- Stop old Rails app Docker container
- Start new Rails app Docker container
- Remove old Rails app Docker image
- Tag new Rails app Docker image
The Problem
This whole process is tedious, so I want to automate it, the way "kamal deploy" and "git push dokku main" do.
Ideally, I can just run "ruby deploy.rb" and go get a cup of tea, and when I come back the new code is live under my domain name!
A Ruby Script for Automated Deploy
Once I had set up the server and deployed the first image manually, I wrote a Ruby script to automate subsequent deploys.
First I need to set some project-specific variables, such as environment variables, the app name, the Docker network, and container names.
require 'colorize'
app_name = "my-app" # specify your app name
rails_master_key = File.read('config/master.key').strip # strip a trailing newline so it can't break the shell command later
db_password = "dumbpass" # same as postgresql container's setting
db_host_or_ip = "app-pg" # the container name of pg db, to let Rails database.yml know where to connect
docker_subnet = "app-net" # the internal Docker network shared by the rails, db, frp, and redis containers; can be "bridge", "host", or a custom network created beforehand with docker network create
server_user = "root" # the user on the remote server (a Pi, a Hetzner VPS, etc.)
server_ip = "docker-app-ip" # the IP of the server
use_bzip = false # whether to compress the Docker image before sending it to the server. Skip compression if the server decompresses slowly or bandwidth is plentiful; compress if the connection to the server is slow.
Then I wrote two procs to run commands on my local development machine or on the remote server (the VPS). I also record the time elapsed for each task, so I know which step caused an error and how long each one takes.
deploy_start_at = Process.clock_gettime(Process::CLOCK_MONOTONIC) # To count how many seconds is used for the whole deploy process
# Run a shell command on remote machine
run_remote = Proc.new do |cmd|
start_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "Start CMD: ssh #{server_user}@#{server_ip} #{cmd} 2>&1".colorize(:blue) # 2>&1 will show the stderr in return value
out = `ssh #{server_user}@#{server_ip} #{cmd} 2>&1`
finish_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "Finished CMD: #{(finish_at - start_at).round(1)} seconds.\n\n".colorize(:green)
out
end
# Run a shell command on local machine
run_local = Proc.new do |cmd|
start_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "Start CMD: #{cmd} 2>&1".colorize(:blue)
out = `#{cmd} 2>&1`
finish_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
puts "Finished CMD: #{(finish_at - start_at).round(1)} seconds.\n\n".colorize(:green)
out
end
Then I wrote a proc for the health check. I borrowed the idea from Kamal, but I didn't look into Kamal's implementation; I just use a curl Docker image running in the same Docker network to request the "/up" route.
# Health check: confirm the Rails app is running by inspecting the container status and then requesting GET /up.
test_app_running = Proc.new do |container_name|
10.times do |i|
puts "#{server_ip}: Healthcheck new container status - Round: #{i}"
out = run_remote.call("docker inspect --format \"{{.State.Running}}\" #{container_name}") # prints "true" when the container is running
if out.include? "true"
out = run_remote.call("docker run --rm --network #{docker_subnet} curlimages/curl --silent -LI -o /dev/null #{container_name}:3000/up -w '%{http_code}\n'")
if out.include? "200"
puts "#{server_ip}: Container is ready."
puts "#{server_ip}: Success!"
break
else
puts "#{server_ip}: Container is up but Rails server is not ready, waiting 3 seconds..."
sleep(3)
end
elsif i < 9
puts "#{server_ip}: Container is not ready, waiting 1 second..."
sleep(1)
else
raise "#{server_ip}: Container is not ready after waits. Aborting. Manual fix is required."
end
end
end
Those are the helpers to be invoked. Next I need to run some commands on my Mac and on the Hetzner VPS.
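The two command procs above return only the combined output, so a failed command has to be detected by reading that text. As a minimal sketch of an alternative, the helper below uses Open3.capture2e from Ruby's standard library to get the exit status as well and raises when it is non-zero. The name run_checked is hypothetical and not part of the original script. Since ssh exits with the status of the remote command (or 255 if the connection itself fails), the same helper should also work for commands prefixed with "ssh user@host".

```ruby
require 'open3'

# Hypothetical helper: runs a local shell command, returns the combined
# stdout/stderr, and raises if the command exits with a non-zero status.
def run_checked(cmd)
  start_at = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  out, status = Open3.capture2e(cmd)
  elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start_at).round(1)
  puts "Finished CMD in #{elapsed} seconds (exit status #{status.exitstatus})."
  raise "Command failed (exit #{status.exitstatus}): #{cmd}\n#{out}" unless status.success?
  out
end

puts run_checked("echo hello")
```

With a helper like this, a failing step stops the deploy right away, without depending on specific words appearing in the command output.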
The Deploy Process
Phase 1 builds the Docker image on my development machine using the app-specific variables defined earlier. If the build succeeds, the script pipes docker save over ssh into docker load, which loads the image on the Hetzner server under a new Docker image tag, "new_build".
# Phase 1: Build Docker Image for Rails App
puts "local: Building Docker image for #{app_name}."
build_image_cmd = "docker build -t #{app_name}:new_build ."
out = run_local.call(build_image_cmd)
if out.include? "DONE"
puts "local: Docker Build Successful."
elsif out.include? "ERROR"
puts out
raise "local: Docker Build Error."
else
raise "local: Docker build process status unknown, check manually."
end
puts "local+remote: Compressing Image and Loading it in the Remote Server"
out = run_local.call("docker save #{app_name}:new_build #{use_bzip ? "| bzip2 |" : "|"} ssh #{server_user}@#{server_ip} docker load")
Phase 2 tests whether the new image can run on the server. During this phase, the old and new Rails app containers run side by side.
# Phase 2: Test the new Rails app image on the remote server. If it works, stop the test container and start the image again later.
# Port mappings and container names can't conflict, and FRP refers to a static web app container name in frps.ini that can't be changed dynamically.
# So the test container has to be stopped, and the new image started again under the final name.
puts "#{server_ip}: Starting Rails app with new built image"
out = run_remote.call("docker run --name #{app_name}-app-new-build -e RAILS_MASTER_KEY=#{rails_master_key} -e POSTGRES_PASSWORD=#{db_password} -e DB_HOST=#{db_host_or_ip} --network #{docker_subnet} -d #{app_name}:new_build")
puts "#{server_ip}: #{out}"
puts "Test if app can start"
test_app_running.call("#{app_name}-app-new-build")
puts "#{server_ip}: Stopping new container, will restart it later after stopping the old one."
out = run_remote.call("docker container stop #{app_name}-app-new-build")
puts "#{server_ip}: Removing new build container"
out = run_remote.call("docker container rm #{app_name}-app-new-build")
Since I can't find an easy way to switch the reverse proxy over to the new container, I have to stop both the old and new containers and then start a new container again. This causes a few seconds of downtime, and I haven't found a good solution to that yet. Kamal handles this with Traefik, but I don't use Traefik, and I heard Kamal 2 won't be using Traefik either, so I didn't take the time to look into their implementation. This script works for me because I don't need zero-downtime deploys.
Phase 3 starts the new container and does some cleanup: removing the old container and image, and re-tagging the new image from "new_build" to "latest".
# Phase 3: Stop old container, start new container for the Rails app
puts "#{server_ip}: Stopping old Rails container"
out = run_remote.call("docker container stop #{app_name}-app")
puts "#{server_ip}: Removing old Rails container"
out = run_remote.call("docker container rm #{app_name}-app")
puts "#{server_ip}: Removing old Rails Image"
out = run_remote.call("docker image rm #{app_name}")
puts "#{server_ip}: Renaming new rails image tag from 'new_build' to 'latest'"
out = run_remote.call("docker image tag #{app_name}:new_build #{app_name}:latest")
out = run_remote.call("docker image rm #{app_name}:new_build")
puts "#{server_ip}: Starting Rails app..."
out = run_remote.call("docker run --name #{app_name}-app -e RAILS_MASTER_KEY=#{rails_master_key} -e POSTGRES_PASSWORD=#{db_password} -e DB_HOST=#{db_host_or_ip} --network #{docker_subnet} -d #{app_name}")
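Both docker run commands above interpolate the master key and database password straight into a shell string. That string is parsed twice, first by the local shell and then by the remote shell that ssh starts, so a value containing spaces, "$", or quotes could break the command. A minimal sketch of one way to guard against that, using Ruby's standard Shellwords module, follows. The helper name ssh_command and the example values are made up for illustration.

```ruby
require 'shellwords'

# Hypothetical helper: builds the local command line for running argv on the
# remote host. The command is escaped once for the remote shell (shelljoin)
# and once more for the local shell (Shellwords.escape).
def ssh_command(user, host, argv)
  remote_cmd = argv.shelljoin
  "ssh #{Shellwords.escape("#{user}@#{host}")} #{Shellwords.escape(remote_cmd)}"
end

# Example values are made up; the password contains characters a shell would interpret.
argv = ["docker", "run", "-d", "-e", "POSTGRES_PASSWORD=p@ss word$1", "my-app:new_build"]
puts ssh_command("root", "203.0.113.10", argv)
```

The resulting string could be passed to the local command proc in place of a hand-built "ssh ..." string. Building the command as an array first also keeps each docker argument visibly separate.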
The last phase tests that the new Rails app container is working correctly. Once it's done, I can manually run any database migrations through docker exec -it.
If I'm confident, I can also remove the old image and re-tag the new image on my development machine from "new_build" to "latest". The risk is that if the new container doesn't work properly, I can't roll back to the old version quickly. So whether or not to remove the old image depends on how critical uptime is for the project.
# Phase 4: Test new container is ready.
puts "Test if app is running"
test_app_running.call("#{app_name}-app")
puts "Deployment finished in #{(Process.clock_gettime(Process::CLOCK_MONOTONIC) - deploy_start_at).round(1)} seconds."
# Optional: Clean up local docker images
puts "local: Retag new build Docker image to latest. Remove old image file."
old_image_id = run_local.call("docker images -q #{app_name}:latest").strip
out = run_local.call("docker image tag #{app_name}:new_build #{app_name}:latest")
out = run_local.call("docker image rm #{old_image_id}")
out = run_local.call("docker image rm #{app_name}:new_build")
And that's it! With this deploy script, the Docker build takes around 30-60 seconds if no significant changes were made to the codebase. The whole process finishes within 2 minutes, including transferring a 180MB image (with bzip compression and decompression) to the server at 5MB/s.
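On the rollback concern mentioned earlier, one possible approach is to keep the outgoing image around under an extra tag before removing it, so the old version can be started again quickly. The sketch below assumes the deploy script ran "docker image tag my-app:latest my-app:previous" on the server before Phase 3 removed the old image. The original script does not do this, and both the "previous" tag and the rollback_commands helper are hypothetical.

```ruby
# Hypothetical helper: returns the remote commands for a rollback, assuming the
# deploy script tagged the outgoing image as "<app_name>:previous" before
# replacing it (the original script does not do this).
def rollback_commands(app_name, env_flags, network)
  [
    "docker container stop #{app_name}-app",
    "docker container rm #{app_name}-app",
    "docker image tag #{app_name}:previous #{app_name}:latest",
    "docker run --name #{app_name}-app #{env_flags} --network #{network} -d #{app_name}:latest"
  ]
end

rollback_commands("my-app", "-e DB_HOST=app-pg", "app-net").each { |cmd| puts cmd }
```

Each command could then be sent through the remote command proc, followed by the same health check used after a normal deploy.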