Self-hosted Runners with ACA Job and KEDA
By combining Azure Container Apps (ACA) Jobs with the KEDA (Kubernetes Event-driven Autoscaling) github-runner scaler, you can build ephemeral self-hosted runners that scale on demand based on the GitHub Actions workflow queue.
What are Self-hosted Runners?
GitHub Actions offers two types of "runners" that execute jobs.
| Comparison | GitHub-hosted Runner | Self-hosted Runner |
|---|---|---|
| Infrastructure management | Managed by GitHub | You provision and manage |
| OS | Ubuntu / Windows / macOS | Any (Linux, Windows, macOS, containers, etc.) |
| Network | Public internet | Customizable (private network possible) |
| Custom software | Limited | Freely installable |
| Cost | Billed by usage time | Infrastructure cost only |
| Startup time | Typically 1–3 minutes | Customizable |
Self-hosted runners run the runner agent on infrastructure you manage.
Use Cases for Self-hosted Runners
1. Access to Private Networks Required
GitHub-hosted runners connect from the public internet, so they cannot directly reach resources inside an Azure Virtual Network (VNet), such as Azure SQL Database, ACR, or Key Vault instances protected by private endpoints. Placing self-hosted runners inside the VNet gives them private access to these resources.
2. Specific Hardware or Specs Required
- ML workflows requiring GPU: Use GPU instances for model training or inference testing
- Large memory builds: .NET or Java large-scale projects requiring 32GB+ memory
- Fast storage: NVMe SSD for cache-heavy builds
3. Security and Compliance Requirements
- Data sovereignty: GDPR and similar regulations may prevent placing code or artifacts on GitHub's infrastructure
- Proprietary code protection: Source code or build artifacts cannot be placed on external shared cloud infrastructure
- Security auditing: Need full control and auditability of the runner execution environment
- Secret management: Integration with private secret management tools like Azure Key Vault
4. Custom Environment Requirements
- Pre-installed internal tools: Specific SDKs, licensed software, internal tools
- Fixed IP addresses: When external services require IP whitelisting
- Stateful caching: Persist Docker layer caches or dependency caches
5. Cost Optimization
- Large-scale CI/CD pipelines: For organizations with high GitHub-hosted Runner usage, self-hosted infrastructure may be more cost-efficient
- Spot instance utilization: Reduce costs with Azure Spot VMs or ACA Jobs spot features
What are Ephemeral Runners?
Traditional self-hosted runners were "always-on." However, this approach has problems:
- Security risk: Execution environment is shared across multiple jobs — secrets and artifacts may persist
- Resource waste: Runners stay running even when there are no jobs
- Scaling difficulty: Cannot handle bursts of jobs
Ephemeral runners (with the --ephemeral flag) are disposable runners that automatically deregister after executing one job. They are recommended as best practice for both security and efficiency.
What are Azure Container Apps (ACA) Jobs?
Azure Container Apps Jobs run containerized tasks that execute for a finite duration and then exit, in contrast to Container Apps, which run continuously as services.
Key Container Apps Concepts
Job Types
| Type | Description | Use Case |
|---|---|---|
| Manual | Triggered manually via API or CLI | Batch processing, migration tasks |
| Scheduled | Cron-based scheduling | Periodic reports, cleanup |
| Event-driven | Auto-execution based on KEDA scaler | Self-hosted runners ← here |
Event-driven jobs automatically create and delete job instances based on KEDA scaling rules, proportional to the number of events.
KEDA and the github-runner Scaler
What is KEDA?
KEDA (Kubernetes Event-driven Autoscaling) is an open-source component that scales containers based on external event sources (queues, topics, metrics, etc.). It is a CNCF project widely adopted across the industry, and Azure Container Apps uses KEDA internally.
How the github-runner Scaler Works
The KEDA github-runner scaler monitors the Actions workflow queue for a specific GitHub repository or organization, and scales ACA Job instances based on the number of pending jobs in the queue.
Scaling Logic
KEDA determines the number of instances based on the targetWorkflowQueueLength value:
desired replicas = ⌈ pending jobs / targetWorkflowQueueLength ⌉
For example, if there are 5 jobs in the queue and targetWorkflowQueueLength is 1, 5 runner instances will start.
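The formula above can be checked with a few lines of Python. This is a minimal sketch, not KEDA's actual implementation: the function name `desired_executions` is made up for illustration, and clamping the result to `min_executions` and `max_executions` is a simplifying assumption about how the limits are applied.

```python
import math


def desired_executions(pending_jobs: int, target_queue_length: int,
                       min_executions: int = 0, max_executions: int = 10) -> int:
    """Apply ceil(pending / target), then clamp to the configured limits."""
    if target_queue_length < 1:
        raise ValueError("target_queue_length must be at least 1")
    wanted = math.ceil(pending_jobs / target_queue_length)
    return max(min_executions, min(wanted, max_executions))


if __name__ == "__main__":
    print(desired_executions(5, 1))   # 5 pending jobs, one runner per job
    print(desired_executions(25, 1))  # queue larger than max_executions
    print(desired_executions(0, 1))   # empty queue
```

With a target of 1, every queued job gets its own runner until the upper limit is reached. A larger target would start fewer runners than jobs, which rarely makes sense for ephemeral runners that exit after one job.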
Overall Architecture
Setup Guide
1. Create a GitHub App (Recommended)
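Using a GitHub App is strongly recommended over Personal Access Tokens (PATs). A GitHub App issues short-lived installation access tokens (they expire after one hour) that are not tied to a user account, and these can be used to request Just-in-Time (JIT) runner configurations, making the setup more secure.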
- GitHub organization settings → Developer settings → GitHub Apps → New GitHub App
- Configure the following permissions:
  - Repository permissions
    - Actions: Read-only
  - Organization permissions
    - Self-hosted runners: Read and write
- Save the App ID and Private Key
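To show what the App ID and private key are used for, here is a hedged sketch of the claims a GitHub App puts into the JSON Web Token it signs to authenticate as itself. The function name `github_app_jwt_claims` and the sample App ID are hypothetical. Signing is deliberately left out: in practice the claims are signed with RS256 using the private key (for example with a JWT library), and the signed token is exchanged for an installation access token at `POST /app/installations/{installation_id}/access_tokens`.

```python
import time
from typing import Dict, Optional, Union


def github_app_jwt_claims(app_id: str, now: Optional[int] = None) -> Dict[str, Union[int, str]]:
    """Build the unsigned JWT claims for authenticating as a GitHub App."""
    current = int(time.time()) if now is None else now
    return {
        "iat": current - 60,   # issued 60 s in the past to tolerate clock drift
        "exp": current + 540,  # stays under GitHub's 10-minute maximum lifetime
        "iss": app_id,         # the App ID saved in the step above
    }


if __name__ == "__main__":
    # "123456" is a placeholder App ID used only for this example
    print(github_app_jwt_claims("123456"))
```

Because the token lives for minutes rather than months, a leaked value is far less useful to an attacker than a leaked PAT.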
2. Prepare Azure Resources
resource "azurerm_container_app_environment" "runner_env" {
  name                           = "cae-github-runners"
  location                       = var.location
  resource_group_name            = var.resource_group_name
  infrastructure_subnet_id       = azurerm_subnet.aca.id
  internal_load_balancer_enabled = true
  tags                           = var.tags
}
3. Build the Runner Container Image
You can use the official actions/runner as a base image, though you'll often want to add custom tools.
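FROM ghcr.io/actions/actions-runner:latest
# Install custom tools (e.g., Azure CLI)
USER root
# Use Microsoft's install script: the azure-cli package in Ubuntu's own repositories tends to be far out of date
RUN apt-get update && apt-get install -y --no-install-recommends ca-certificates curl \
    && curl -sL https://aka.ms/InstallAzureCLIDeb | bash \
    && rm -rf /var/lib/apt/lists/*
USER runner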
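ghcr.io/actions/actions-runner is the official runner image provided by GitHub. Since it is regularly updated, avoid pinning tags too tightly and rebuild periodically via CI/CD. The image does not register a runner by itself: the environment variables passed to the container in the next step (GITHUB_APP_ID, EPHEMERAL, and so on) are assumed to be read by an entrypoint script that you add, which obtains a token and then calls config.sh and run.sh.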
4. ACA Job Terraform Definition
resource "azurerm_container_app_job" "github_runner" {
  name                         = "caj-github-runner"
  location                     = var.location
  resource_group_name          = var.resource_group_name
  container_app_environment_id = var.aca_environment_id
  # Ephemeral runner: exits after 1 job
  replica_timeout_in_seconds = 1800 # 30-minute timeout
  replica_retry_limit        = 0    # No retries (failures managed by job)
  # KEDA github-runner scaler
  event_trigger_config {
    parallelism              = 1
    replica_completion_count = 1
    scale {
      min_executions              = 0  # Zero-scale when idle
      max_executions              = 10 # Max concurrent executions
      polling_interval_in_seconds = 30 # Polling interval
      rules {
        name             = "github-runner-scaler"
        custom_rule_type = "github-runner"
        metadata = {
          owner                     = var.github_org
          runnerScope               = "org" # "repo", "org", or "ent"
          targetWorkflowQueueLength = "1"   # 1 runner per job
          labels                    = "self-hosted,linux,azure"
          # GitHub App authentication for the scaler
          applicationID  = var.github_app_id
          installationID = var.github_app_installation_id # assumed variable, not defined elsewhere in this article
        }
        authentication {
          # The scaler reads the App private key from the secret defined below
          secret_name       = "github-app-private-key"
          trigger_parameter = "appKey"
        }
      }
    }
  }
  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.runner.id]
  }
  template {
    container {
      name   = "runner"
      image  = "${var.acr_login_server}/github-runner:latest"
      cpu    = 2.0
      memory = "4Gi"
      env {
        name  = "GITHUB_APP_ID"
        value = var.github_app_id
      }
      env {
        name        = "GITHUB_APP_PRIVATE_KEY"
        secret_name = "github-app-private-key"
      }
      env {
        name  = "GITHUB_ORGANIZATION"
        value = var.github_org
      }
      env {
        name  = "RUNNER_LABELS"
        value = "self-hosted,linux,azure"
      }
      env {
        name  = "EPHEMERAL"
        value = "true" # Enable ephemeral mode
      }
    }
  }
  secret {
    name                = "github-app-private-key"
    identity            = azurerm_user_assigned_identity.runner.id
    key_vault_secret_id = azurerm_key_vault_secret.github_app_private_key.id
  }
  registry {
    server   = var.acr_login_server
    identity = azurerm_user_assigned_identity.runner.id
  }
}
5. Workflow Configuration
name: CI
on:
  push:
    branches: [main]
  pull_request:
jobs:
  build:
    # Use self-hosted runner
    runs-on: [self-hosted, linux, azure]
    steps:
      - uses: actions/checkout@v4
      - name: Access private resources
        run: |
          # Access via private endpoint within VNet
          az login --identity
          az acr login --name myprivateregistry
Best Practices
Security
1. Always Use Ephemeral Runners
# Specify --ephemeral flag when starting the runner
./config.sh --url ... --token ... --ephemeral
Ephemeral runners automatically deregister after executing one job. This ensures:
- No secret leakage between jobs
- No environment contamination
- Each job runs in a clean environment
2. Use GitHub Apps Instead of Personal Access Tokens
| Approach | Security | Recommendation |
|---|---|---|
| Personal Access Token (PAT) | Tied to user, difficult expiration management | ❌ Not recommended |
| Fine-grained PAT | Can restrict permissions but still user-tied | △ Acceptable |
| GitHub App + JIT Token | App-dedicated, least privilege, auto-expiring | ✅ Recommended |
3. Leverage Managed Identity (Workload Identity)
Authenticate Key Vault secret retrieval and ACR login with Managed Identity, avoiding long-lived secrets embedded in containers.
# Grant Managed Identity access to Key Vault
resource "azurerm_role_assignment" "runner_kv_secrets" {
  scope                = azurerm_key_vault.main.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_user_assigned_identity.runner.principal_id
}
# Grant Managed Identity access to ACR
resource "azurerm_role_assignment" "runner_acr_pull" {
  scope                = azurerm_container_registry.main.id
  role_definition_name = "AcrPull"
  principal_id         = azurerm_user_assigned_identity.runner.principal_id
}
4. Restrict Execution with Runner Groups
Create Runner Groups at the organization level to limit which repositories can use self-hosted runners.
# Organization Settings > Actions > Runner Groups
# - Group name: azure-private-runners
# - Repository access: Allow selected repositories only
# - Allow public repositories: Off (important!)
Using self-hosted runners with public repositories allows malicious PRs to execute code in the runner environment. For public repositories, restrict pull_request_target usage or use GitHub-hosted Runners instead.
Performance and Scalability
5. Configure Scaling Parameters Appropriately
scale {
  # Full zero-scale when idle (cost savings)
  min_executions = 0
  # Set based on peak CI/CD job count for your organization
  max_executions = 20
  # KEDA polling interval (be mindful of API rate limits if too low)
  polling_interval_in_seconds = 30
}
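To judge whether a polling interval is too aggressive, it helps to estimate how many GitHub API requests the scaler generates per hour. The sketch below is back-of-the-envelope arithmetic only. The parameter `calls_per_poll` is an assumption: the real number of API calls per polling cycle depends on the runner scope and on how many repositories the scaler has to inspect, so measure it in your own environment. The hourly limit is whatever applies to your credential.

```python
def polls_per_hour(polling_interval_seconds: int) -> float:
    """Number of polling cycles the scaler runs in one hour."""
    if polling_interval_seconds <= 0:
        raise ValueError("polling_interval_seconds must be positive")
    return 3600 / polling_interval_seconds


def fits_rate_limit(polling_interval_seconds: int, calls_per_poll: int,
                    hourly_limit: int) -> bool:
    """True if the estimated request volume stays within the hourly limit."""
    estimated = polls_per_hour(polling_interval_seconds) * calls_per_poll
    return estimated <= hourly_limit


if __name__ == "__main__":
    print(polls_per_hour(30))
    print(fits_rate_limit(30, calls_per_poll=10, hourly_limit=5000))
    print(fits_rate_limit(5, calls_per_poll=10, hourly_limit=5000))
```

Halving the interval doubles the request volume, so shortening it to shave a few seconds off pickup latency is only worthwhile when the credential has headroom.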
6. Size CPU and Memory for Your Workload
container {
  # For heavy builds (.NET / Java, etc.)
  cpu    = 4.0
  memory = "8Gi"
  # For simple script execution
  # cpu    = 0.5
  # memory = "1Gi"
}
7. Runner Timeout Configuration
resource "azurerm_container_app_job" "github_runner" {
  # Timeout (set longer than the max workflow execution time)
  replica_timeout_in_seconds = 3600 # 1 hour
}
Cost Optimization
8. Leverage Zero-scaling
Setting min_executions = 0 means no runners start when there are no jobs, bringing idle costs to zero.
9. Dependency Caching Strategy
In ephemeral environments where containers start fresh every time, caching strategy is critical for reducing build times.
steps:
  # actions/cache stores entries in GitHub's cache service, so they outlive each ephemeral runner
  - uses: actions/cache@v4
    with:
      path: ~/.nuget/packages
      key: ${{ runner.os }}-nuget-${{ hashFiles('**/*.csproj') }}
      restore-keys: |
        ${{ runner.os }}-nuget-
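The cache key above changes whenever any project file changes, which is what forces a fresh restore. The following sketch illustrates that idea in Python. It is a simplified stand-in, not the actual algorithm behind `hashFiles`: the function name `cache_key` and the in-memory `files` mapping are assumptions made so the example runs without touching the file system.

```python
import hashlib
from typing import Dict


def cache_key(os_name: str, files: Dict[str, bytes]) -> str:
    """Derive a key from the OS name and a combined hash of file contents."""
    combined = hashlib.sha256()
    for path in sorted(files):  # sorted so the key does not depend on dict order
        combined.update(hashlib.sha256(files[path]).digest())
    return f"{os_name}-nuget-{combined.hexdigest()}"


if __name__ == "__main__":
    before = cache_key("Linux", {"app/app.csproj": b"<PackageReference Version='1.0' />"})
    after = cache_key("Linux", {"app/app.csproj": b"<PackageReference Version='2.0' />"})
    print(before != after)  # a dependency bump produces a new key
```

When the exact key misses, the `restore-keys` prefix lets the job fall back to the most recent cache with the same prefix, so only the changed packages need to be downloaded.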
Operations and Monitoring
10. Container Image Lifecycle Management
name: Update Runner Image
on:
  schedule:
    # Update runner image every Monday
    - cron: '0 1 * * 1'
  workflow_dispatch:
jobs:
  build-and-push:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # A registry login step is required before pushing to ACR; it is omitted here
      - name: Build and push runner image
        uses: docker/build-push-action@v5
        with:
          context: ./runner
          push: true
          tags: |
            ${{ secrets.ACR_LOGIN_SERVER }}/github-runner:latest
            ${{ secrets.ACR_LOGIN_SERVER }}/github-runner:${{ github.sha }}
11. Monitor Runners with Azure Monitor
Collect and configure alerts for the following metrics and logs in Azure Monitor:
- Job execution count: Sudden spikes or drops in jobs per hour
- Execution time: Alerts for jobs approaching replica_timeout_in_seconds
- Failure rate: Early detection of increased job failures
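The checks in the list above boil down to simple threshold logic. As a rough illustration, the sketch below flags executions that come close to the timeout and computes a failure rate from raw counts. The function names and the 80% threshold are assumptions chosen for the example. In practice the inputs would come from logs or metrics exported by Azure Monitor, which is not shown here.

```python
from typing import List


def near_timeout(durations_seconds: List[float], timeout_seconds: float,
                 threshold: float = 0.8) -> List[float]:
    """Return the durations that used at least `threshold` of the timeout."""
    limit = timeout_seconds * threshold
    return [d for d in durations_seconds if d >= limit]


def failure_rate(failed: int, total: int) -> float:
    """Fraction of executions that failed; 0.0 when nothing ran."""
    return failed / total if total else 0.0


if __name__ == "__main__":
    # Durations in seconds against a 30-minute replica timeout
    print(near_timeout([300, 1500, 1790], timeout_seconds=1800))
    print(failure_rate(failed=3, total=12))
```

Jobs that repeatedly show up in the first list are a sign that the timeout should be raised or the workflow split before executions start getting killed.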
resource "azurerm_monitor_metric_alert" "runner_job_timeout" {
  name                = "github-runner-job-timeout"
  resource_group_name = var.resource_group_name
  scopes              = [azurerm_container_app_job.github_runner.id]
  description         = "Detect GitHub Runner job timeouts"
  criteria {
    metric_namespace = "Microsoft.App/jobs"
    metric_name      = "FailedCount"
    aggregation      = "Total"
    operator         = "GreaterThan"
    threshold        = 3
  }
  action {
    action_group_id = var.alert_action_group_id
  }
}
12. Separate Runners by Purpose Using Labels
When using runners for multiple purposes, separate them with labels.
# Standard runner
env {
  name  = "RUNNER_LABELS"
  value = "self-hosted,linux,azure"
}
# High-spec runner (for ML / large builds)
env {
  name  = "RUNNER_LABELS"
  value = "self-hosted,linux,azure,high-memory"
}
jobs:
  ml-training:
    runs-on: [self-hosted, linux, azure, high-memory]
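The routing rule behind these labels is a subset check: a runner is eligible for a job only if it carries every label listed in `runs-on`. The sketch below models that rule. The function name `runner_can_take` is hypothetical, and comparing labels case-insensitively is an assumption of this example.

```python
from typing import Iterable


def runner_can_take(runs_on: Iterable[str], runner_labels: str) -> bool:
    """True if the runner's comma-separated labels cover every requested label."""
    wanted = {label.strip().lower() for label in runs_on}
    offered = {label.strip().lower() for label in runner_labels.split(",")}
    return wanted <= offered


if __name__ == "__main__":
    standard = "self-hosted,linux,azure"
    high_spec = "self-hosted,linux,azure,high-memory"
    ml_job = ["self-hosted", "linux", "azure", "high-memory"]
    ci_job = ["self-hosted", "linux", "azure"]
    print(runner_can_take(ml_job, standard))   # standard runner lacks high-memory
    print(runner_can_take(ml_job, high_spec))
    print(runner_can_take(ci_job, high_spec))  # extra labels do not exclude a runner
```

The last case is worth noting: because the high-spec runner also carries the standard labels, ordinary jobs can land on it. If that is not wanted, give the standard runners a label of their own and request it in the standard workflows.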
Troubleshooting
Runner Fails to Start
- Check KEDA scaler logs: Review system logs in the ACA Environment
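- GitHub API rate limits: A PAT allows 5,000 requests per hour; GitHub App installation tokens start at 5,000 requests per hour and allow 15,000 for organizations on GitHub Enterprise Cloud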
- JIT token expiration: Just-in-Time tokens expire quickly, so the registration process must be fast
Job Not Being Assigned
- Runner label mismatch: Verify runs-on in the workflow exactly matches the labels set on the runner
- Runner group repository access: Confirm the runner group allows access to the target repository
- KEDA polling delay: The default polling interval (30s) causes a delay before new jobs become visible
Network Connection Errors
- Subnet NSG rules: Allow outbound to *.github.com (443) and *.githubusercontent.com (443)
- VNet integration check: Confirm the ACA Environment is correctly integrated with your VNet
Summary
The combination of ACA Job + KEDA github-runner scaler provides an excellent self-hosted runner platform with the following strengths:
- Zero idle cost: Starts only when jobs arrive; zero cost when idle
- Fully ephemeral execution: Each job runs in a clean environment, minimizing security risk
- VNet integration: Seamless access to private Azure resources
- Managed Identity: Passwordless access to Azure resources
- Terraform-managed: Declarative, infrastructure-as-code management
These characteristics solve the challenge of "GitHub-hosted Runners cannot meet our requirements, but always-on self-hosted runners are too costly to manage."