Self-hosted Runners with ACA Job and KEDA

By combining Azure Container Apps (ACA) Jobs with the KEDA (Kubernetes Event-driven Autoscaling) github-runner scaler, you can build ephemeral self-hosted runners that scale on demand based on the GitHub Actions workflow queue.

What are Self-hosted Runners?

GitHub Actions offers two types of "runners" that execute jobs.

| Comparison | GitHub-hosted Runner | Self-hosted Runner |
| --- | --- | --- |
| Infrastructure management | Managed by GitHub | You provision and manage |
| OS | Ubuntu / Windows / macOS | Any (Linux, Windows, macOS, containers, etc.) |
| Network | Public internet | Customizable (private network possible) |
| Custom software | Limited | Freely installable |
| Cost | Billed by usage time | Infrastructure cost only |
| Startup time | Typically 1–3 minutes | Customizable |

Self-hosted runners run the runner agent on infrastructure you manage.

Use Cases for Self-hosted Runners

1. Access to Private Networks Required

GitHub-hosted runners connect from the public internet, so they cannot directly reach resources on an Azure Virtual Network (VNet) that are protected by private endpoints, such as Azure SQL Database, ACR, or Key Vault. Placing self-hosted runners inside the VNet enables private access to these resources.

2. Specific Hardware or Specs Required

  • ML workflows requiring GPU: Use GPU instances for model training or inference testing
  • Large memory builds: .NET or Java large-scale projects requiring 32GB+ memory
  • Fast storage: NVMe SSD for cache-heavy builds

3. Security and Compliance Requirements

  • Data sovereignty: GDPR and similar regulations may prevent placing code or artifacts on GitHub's infrastructure
  • Proprietary code protection: Source code or build artifacts cannot be placed on external shared cloud infrastructure
  • Security auditing: Need full control and auditability of the runner execution environment
  • Secret management: Integration with private secret management tools like Azure Key Vault

4. Custom Environment Requirements

  • Pre-installed internal tools: Specific SDKs, licensed software, internal tools
  • Fixed IP addresses: When external services require IP whitelisting
  • Stateful caching: Persist Docker layer caches or dependency caches

5. Cost Optimization

  • Large-scale CI/CD pipelines: For organizations with high GitHub-hosted Runner usage, self-hosted infrastructure may be more cost-efficient
  • Spot instance utilization: Reduce costs with Azure Spot VMs or ACA Jobs spot features

What are Ephemeral Runners?

Traditional self-hosted runners were "always-on." However, this approach has problems:

  • Security risk: Execution environment is shared across multiple jobs — secrets and artifacts may persist
  • Resource waste: Runners stay running even when there are no jobs
  • Scaling difficulty: Cannot handle bursts of jobs

Ephemeral runners (with the --ephemeral flag) are disposable runners that automatically deregister after executing one job. They are recommended as best practice for both security and efficiency.

What are Azure Container Apps (ACA) Jobs?

Azure Container Apps Jobs are a mechanism for running container-based tasks.

Key Container Apps Concepts

Job Types

| Type | Description | Use Case |
| --- | --- | --- |
| Manual | Triggered manually via API or CLI | Batch processing, migration tasks |
| Scheduled | Cron-based scheduling | Periodic reports, cleanup |
| Event-driven | Auto-execution based on KEDA scaler | Self-hosted runners ← here |

Event-driven jobs automatically create and delete job instances based on KEDA scaling rules, proportional to the number of events.

KEDA and the github-runner Scaler

What is KEDA?

KEDA (Kubernetes Event-driven Autoscaling) is an open-source component that scales containers based on external event sources (queues, topics, metrics, etc.). It is a CNCF project widely adopted across the industry, and Azure Container Apps uses KEDA internally.

How the github-runner Scaler Works

The KEDA github-runner scaler monitors the Actions workflow queue for a specific GitHub repository or organization, and scales ACA Job instances based on the number of pending jobs in the queue.

Scaling Logic

KEDA determines the number of instances based on the targetWorkflowQueueLength value:

desired replicas = ⌈ pending jobs / targetWorkflowQueueLength ⌉

For example, if there are 5 jobs in the queue and targetWorkflowQueueLength is 1, 5 runner instances will start.
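As a rough illustration of the formula above, the following Python sketch computes the replica count. The cap mirrors the max_executions setting used in the Terraform examples in this article; the function name is hypothetical, and the real scaler may differ in details.

```python
import math


def desired_replicas(pending_jobs: int,
                     target_queue_length: int = 1,
                     max_executions: int = 10) -> int:
    """Apply ceil(pending / target), capped at max_executions."""
    if target_queue_length < 1:
        raise ValueError("target_queue_length must be >= 1")
    if pending_jobs <= 0:
        return 0  # nothing queued: stay scaled to zero
    wanted = math.ceil(pending_jobs / target_queue_length)
    return min(wanted, max_executions)


if __name__ == "__main__":
    # (pending jobs, targetWorkflowQueueLength) pairs to try
    for pending, target in [(5, 1), (5, 2), (25, 1)]:
        print(pending, target, desired_replicas(pending, target))
```

With targetWorkflowQueueLength set to 1, every queued job gets its own runner until the max_executions cap is reached; a larger target value trades queue latency for fewer concurrent runners.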

Overall Architecture

Setup Guide

1. Create a GitHub App

Using a GitHub App is strongly recommended over Personal Access Tokens (PATs). A GitHub App authenticates with short-lived tokens and can generate Just-in-Time (JIT) runner tokens, which makes it more secure.

  1. GitHub organization settings → Developer settings → GitHub Apps → New GitHub App
  2. Configure the following permissions:
    • Repository permissions
      • Actions: Read-only
    • Organization permissions
      • Self-hosted runners: Read and write
  3. Save the App ID and Private Key
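To show how the App ID and private key saved above are typically used, here is a minimal Python sketch of the first half of the GitHub App authentication flow: building the JWT claims and the REST endpoint that exchanges the signed JWT for an installation access token. Signing the claims with RS256 is left to a JWT library and is not shown; the function names and the installation ID are hypothetical.

```python
import time
from typing import Optional

GITHUB_API = "https://api.github.com"


def app_jwt_claims(app_id: str, now: Optional[int] = None) -> dict:
    """Claims for the JWT that authenticates as the GitHub App itself."""
    now = int(time.time()) if now is None else now
    return {
        "iat": now - 60,     # backdated 60 s to tolerate clock drift
        "exp": now + 540,    # must be no more than 10 minutes in the future
        "iss": str(app_id),  # the App ID of the GitHub App
    }


def installation_token_url(installation_id: int) -> str:
    """Endpoint (POST) that returns a short-lived installation access token."""
    return f"{GITHUB_API}/app/installations/{installation_id}/access_tokens"


if __name__ == "__main__":
    print(app_jwt_claims("123456"))
    print(installation_token_url(987654))  # hypothetical installation ID
```

The resulting installation token is what a runner container would use to request a registration or JIT token from the GitHub API before starting the runner.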

2. Prepare Azure Resources

Container Apps Environment (VNet integrated)
resource "azurerm_container_app_environment" "runner_env" {
  name                           = "cae-github-runners"
  location                       = var.location
  resource_group_name            = var.resource_group_name
  infrastructure_subnet_id       = azurerm_subnet.aca.id
  internal_load_balancer_enabled = true

  tags = var.tags
}

3. Build the Runner Container Image

You can use the official actions/runner as a base image, though you'll often want to add custom tools.

Dockerfile
FROM ghcr.io/actions/actions-runner:latest

# Install custom tools (e.g., Azure CLI)
USER root
RUN apt-get update && apt-get install -y \
    azure-cli \
    && rm -rf /var/lib/apt/lists/*

USER runner

Info: ghcr.io/actions/actions-runner is the official runner image provided by GitHub. Because it is updated regularly, rebuild your derived image periodically via CI/CD rather than staying pinned to an old tag indefinitely.

4. ACA Job Terraform Definition

terraform/modules/aca-runner/main.tf
resource "azurerm_container_app_job" "github_runner" {
  name                         = "caj-github-runner"
  location                     = var.location
  resource_group_name          = var.resource_group_name
  container_app_environment_id = var.aca_environment_id

  # Ephemeral runner: exits after 1 job
  replica_timeout_in_seconds = 1800 # 30-minute timeout
  replica_retry_limit        = 0    # No retries (failures managed by job)

  # KEDA github-runner scaler
  event_trigger_config {
    parallelism              = 1
    replica_completion_count = 1

    scale {
      min_executions              = 0  # Zero-scale when idle
      max_executions              = 10 # Max concurrent executions
      polling_interval_in_seconds = 30 # Polling interval

      rules {
        name             = "github-runner-scaler"
        custom_rule_type = "github-runner"
        metadata = {
          owner                     = var.github_org
          runnerScope               = "org" # "repo", "org", or "ent"
          targetWorkflowQueueLength = "1"   # 1 runner per job
          labels                    = "self-hosted,linux,azure"
          # GitHub App authentication for the scaler
          applicationID = var.github_app_id
          # Assumption: this variable is not defined elsewhere in this article
          installationID = var.github_app_installation_id
        }
        authentication {
          # Reuses the Key Vault-backed secret declared below
          secret_name       = "github-app-private-key"
          trigger_parameter = "appKey"
        }
      }
    }
  }

  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.runner.id]
  }

  template {
    container {
      name   = "runner"
      image  = "${var.acr_login_server}/github-runner:latest"
      cpu    = 2.0
      memory = "4Gi"

      env {
        name  = "GITHUB_APP_ID"
        value = var.github_app_id
      }
      env {
        name        = "GITHUB_APP_PRIVATE_KEY"
        secret_name = "github-app-private-key"
      }
      env {
        name  = "GITHUB_ORGANIZATION"
        value = var.github_org
      }
      env {
        name  = "RUNNER_LABELS"
        value = "self-hosted,linux,azure"
      }
      env {
        name  = "EPHEMERAL"
        value = "true" # Enable ephemeral mode
      }
    }
  }

  secret {
    name                = "github-app-private-key"
    identity            = azurerm_user_assigned_identity.runner.id
    key_vault_secret_id = azurerm_key_vault_secret.github_app_private_key.id
  }

  registry {
    server   = var.acr_login_server
    identity = azurerm_user_assigned_identity.runner.id
  }
}

5. Workflow Configuration

.github/workflows/ci.yml
name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  build:
    # Use self-hosted runner
    runs-on: [self-hosted, linux, azure]
    steps:
      - uses: actions/checkout@v4
      - name: Access private resources
        run: |
          # Access via private endpoint within VNet
          az login --identity
          az acr login --name myprivateregistry

Best Practices

Security

1. Always Use Ephemeral Runners

# Specify --ephemeral flag when starting the runner
./config.sh --url ... --token ... --ephemeral

Ephemeral runners automatically deregister after executing one job. This ensures:

  • No secret leakage between jobs
  • No environment contamination
  • Each job runs in a clean environment
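The container's entrypoint has to turn the environment variables from the ACA Job definition into a config.sh invocation. The following Python sketch shows one way to assemble that command line, assuming the variable names used in this article (GITHUB_ORGANIZATION, RUNNER_LABELS, EPHEMERAL) and a registration token obtained separately; the function name is hypothetical.

```python
import shlex
from typing import List, Mapping


def build_config_command(env: Mapping[str, str], token: str) -> List[str]:
    """Build the config.sh argument list from the container's environment."""
    org = env["GITHUB_ORGANIZATION"]
    cmd = [
        "./config.sh",
        "--url", f"https://github.com/{org}",
        "--token", token,
        "--unattended",  # never prompt; there is no TTY in a job container
    ]
    labels = env.get("RUNNER_LABELS", "")
    if labels:
        cmd += ["--labels", labels]
    if env.get("EPHEMERAL", "true").lower() == "true":
        cmd.append("--ephemeral")  # deregister after a single job
    return cmd


if __name__ == "__main__":
    demo_env = {
        "GITHUB_ORGANIZATION": "my-org",  # placeholder organization
        "RUNNER_LABELS": "self-hosted,linux,azure",
        "EPHEMERAL": "true",
    }
    # Print with a redacted token so the secret never reaches the logs
    print(shlex.join(build_config_command(demo_env, "<redacted>")))
```

Defaulting EPHEMERAL to true when the variable is absent keeps a misconfigured container from silently registering as a long-lived runner.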

2. Use GitHub Apps Instead of Personal Access Tokens

| Approach | Security | Recommendation |
| --- | --- | --- |
| Personal Access Token (PAT) | Tied to user, difficult expiration management | ❌ Not recommended |
| Fine-grained PAT | Can restrict permissions but still user-tied | △ Acceptable |
| GitHub App + JIT Token | App-dedicated, least privilege, auto-expiring | ✅ Recommended |

3. Leverage Managed Identity (Workload Identity)

Authenticate Key Vault secret retrieval and ACR login with Managed Identity, avoiding long-lived secrets embedded in containers.

# Grant Managed Identity access to Key Vault
resource "azurerm_role_assignment" "runner_kv_secrets" {
  scope                = azurerm_key_vault.main.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_user_assigned_identity.runner.principal_id
}

# Grant Managed Identity access to ACR
resource "azurerm_role_assignment" "runner_acr_pull" {
  scope                = azurerm_container_registry.main.id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_user_assigned_identity.runner.principal_id
}

4. Restrict Execution with Runner Groups

Create Runner Groups at the organization level to limit which repositories can use self-hosted runners.

Runner Group settings in org settings
# Organization Settings > Actions > Runner Groups
# - Group name: azure-private-runners
# - Repository access: Allow selected repositories only
# - Allow public repositories: Off (important!)

Warning: Using self-hosted runners with public repositories allows malicious PRs to execute code in the runner environment. For public repositories, restrict pull_request_target usage or use GitHub-hosted runners instead.

Performance and Scalability

5. Configure Scaling Parameters Appropriately

scale {
  # Full zero-scale when idle (cost savings)
  min_executions = 0
  # Set based on peak CI/CD job count for your organization
  max_executions = 20
  # KEDA polling interval (be mindful of API rate limits if too low)
  polling_interval_in_seconds = 30
}

6. Size CPU and Memory for Your Workload

container {
  # For heavy builds (.NET / Java, etc.)
  cpu    = 4.0
  memory = "8Gi"

  # For simple script execution
  # cpu    = 0.5
  # memory = "1Gi"
}

7. Runner Timeout Configuration

resource "azurerm_container_app_job" "github_runner" {
  # Timeout (set longer than the max workflow execution time)
  replica_timeout_in_seconds = 3600 # 1 hour
}

Cost Optimization

8. Leverage Zero-scaling

Setting min_executions = 0 means no runners start when there are no jobs, bringing idle costs to zero.

9. Dependency Caching Strategy

In ephemeral environments where containers start fresh every time, caching strategy is critical for reducing build times.

.github/workflows/ci.yml
steps:
  # Restore and save NuGet packages through the GitHub Actions cache service,
  # which survives across ephemeral runner instances
  - uses: actions/cache@v4
    with:
      path: ~/.nuget/packages
      key: ${{ runner.os }}-nuget-${{ hashFiles('**/*.csproj') }}
      restore-keys: |
        ${{ runner.os }}-nuget-

Operations and Monitoring

10. Container Image Lifecycle Management

.github/workflows/update-runner-image.yml
name: Update Runner Image

on:
  schedule:
    # Rebuild the runner image every Monday at 01:00 UTC
    - cron: '0 1 * * 1'
  workflow_dispatch:

jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Note: the push needs a prior registry login step (for example
      # docker/login-action), which is not shown here
      - name: Build and push runner image
        uses: docker/build-push-action@v5
        with:
          context: ./runner
          push: true
          tags: |
            ${{ secrets.ACR_LOGIN_SERVER }}/github-runner:latest
            ${{ secrets.ACR_LOGIN_SERVER }}/github-runner:${{ github.sha }}

11. Monitor Runners with Azure Monitor

Collect and configure alerts for the following metrics and logs in Azure Monitor:

  • Job execution count: Sudden spikes or drops in jobs per hour
  • Execution time: Alerts for jobs approaching replica_timeout_in_seconds
  • Failure rate: Early detection of increased job failures

Azure Monitor alert example
resource "azurerm_monitor_metric_alert" "runner_job_timeout" {
  name                = "github-runner-job-timeout"
  resource_group_name = var.resource_group_name
  scopes              = [azurerm_container_app_job.github_runner.id]
  description         = "Detect GitHub Runner job timeouts"

  criteria {
    metric_namespace = "Microsoft.App/jobs"
    metric_name      = "FailedCount"
    aggregation      = "Total"
    operator         = "GreaterThan"
    threshold        = 3
  }

  action {
    action_group_id = var.alert_action_group_id
  }
}

12. Separate Runners by Purpose Using Labels

When using runners for multiple purposes, separate them with labels.

# Standard runner
env {
  name  = "RUNNER_LABELS"
  value = "self-hosted,linux,azure"
}

# High-spec runner (for ML / large builds)
env {
  name  = "RUNNER_LABELS"
  value = "self-hosted,linux,azure,high-memory"
}

Specifying in workflow
jobs:
  ml-training:
    runs-on: [self-hosted, linux, azure, high-memory]

Troubleshooting

Runner Fails to Start

  1. Check KEDA scaler logs: Review system logs in the ACA Environment
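  2. GitHub API rate limits: A PAT is limited to 5,000 requests/hour; a GitHub App installation gets at least 5,000 requests/hour, rising to 15,000 for organizations on GitHub Enterprise Cloud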
  3. JIT token expiration: Just-in-Time tokens expire quickly, so the registration process must be fast
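To judge whether a given polling interval risks exhausting the API budget, a back-of-the-envelope estimate can help. The Python sketch below is only that: requests_per_poll is a hypothetical figure you would have to measure for your own scope, and the hourly limit and headroom are parameters rather than facts about your account.

```python
def requests_per_hour(polling_interval_s: int, requests_per_poll: int) -> int:
    """Estimate API calls per hour made by one scaler."""
    if polling_interval_s < 1:
        raise ValueError("polling_interval_s must be >= 1")
    polls = 3600 // polling_interval_s
    return polls * requests_per_poll


def fits_rate_limit(polling_interval_s: int, requests_per_poll: int,
                    hourly_limit: int, headroom: float = 0.5) -> bool:
    """True if the estimate stays within a fraction of the hourly limit."""
    used = requests_per_hour(polling_interval_s, requests_per_poll)
    return used <= hourly_limit * headroom


if __name__ == "__main__":
    # Assumed values: 20 requests per poll, 5,000 requests/hour budget
    for interval in (30, 10):
        print(interval, requests_per_hour(interval, 20),
              fits_rate_limit(interval, 20, 5000))
```

Reserving headroom matters because the same credential is usually shared with runner registration and other automation, not just the scaler.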

Job Not Being Assigned

  1. Runner label mismatch: Verify that every label listed in runs-on in the workflow is set on the runner; a job is only routed to runners that carry all requested labels
  2. Runner group repository access: Confirm the runner group allows access to the target repository
  3. KEDA polling delay: The default polling interval (30s) causes a delay before new jobs become visible
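When debugging a label mismatch, it can help to compare the two label sets mechanically. The Python sketch below checks whether a runner's comma-separated RUNNER_LABELS value covers everything a workflow's runs-on requests. The helper names are hypothetical, and the case-insensitive comparison is an assumption about how labels are matched.

```python
from typing import Iterable, Set


def _normalize(labels: Iterable[str]) -> Set[str]:
    """Trim whitespace, lowercase, and drop empty entries."""
    return {label.strip().lower() for label in labels if label.strip()}


def missing_labels(runs_on: Iterable[str], runner_labels: str) -> Set[str]:
    """Labels requested by the workflow that the runner does not carry."""
    return _normalize(runs_on) - _normalize(runner_labels.split(","))


def runner_matches(runs_on: Iterable[str], runner_labels: str) -> bool:
    """A runner qualifies only if it carries every requested label."""
    return not missing_labels(runs_on, runner_labels)


if __name__ == "__main__":
    wanted = ["self-hosted", "linux", "azure", "high-memory"]
    print(runner_matches(wanted, "self-hosted,linux,azure"))
    print(missing_labels(wanted, "self-hosted,linux,azure"))
```

A runner with extra labels still qualifies, which is why a standard job can land on a high-memory runner unless the job and the scaler labels are kept distinct.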

Network Connection Errors

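  1. Outbound rules: Allow outbound HTTPS (443) to *.github.com and *.githubusercontent.com. NSGs cannot filter by FQDN, so enforce these names in Azure Firewall or another FQDN-aware egress control and keep the subnet NSG from blocking port 443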
  2. VNet integration check: Confirm the ACA Environment is correctly integrated with your VNet

Summary

The combination of ACA Job + KEDA github-runner scaler provides an excellent self-hosted runner platform with the following strengths:

  • Zero idle cost: Runners start only when jobs arrive, so no runner compute is billed while idle
  • Fully ephemeral execution: Each job runs in a clean environment, minimizing security risk
  • VNet integration: Seamless access to private Azure resources
  • Managed Identity: Passwordless access to Azure resources
  • Terraform-managed: Declarative, infrastructure-as-code management

These characteristics solve the challenge of "GitHub-hosted Runners cannot meet our requirements, but always-on self-hosted runners are too costly to manage."