How eBPF Prevents Deployment Disasters at GitHub

GitHub faces a peculiar challenge: because it hosts its own source code on github.com, an outage could cut engineers off from the very code needed to fix it. This creates a dangerous circular dependency. To make deployments safer, GitHub adopted eBPF to monitor and block network calls that might introduce circular dependencies. Below, we answer key questions about this approach.

What is the primary circular dependency GitHub faces?

GitHub’s core service is github.com, which hosts all of the company's source code, including the code for github.com itself. If the site goes down, engineers cannot reach the repository to push fixes. That is the simplest circular dependency: deploying GitHub requires GitHub. To address this, GitHub maintains a mirror of critical code and pre-built assets for rollbacks. Yet this only scratches the surface: deployment scripts can create their own circular dependencies on internal services or on GitHub itself.

Source: github.blog

How does GitHub mitigate the basic code access issue?

GitHub keeps a separate mirror of its source code that remains accessible even when github.com is down. It also stores built artifacts (such as binaries) that can be rolled back without fetching anything from the live site. During an outage, engineers can therefore still roll forward with a fix or revert to a previous stable state. However, these measures don’t eliminate all circular dependencies, especially those introduced by deployment scripts that inadvertently call out to GitHub or other services.

What are the three types of circular dependencies in deployments?

GitHub identified three categories: direct, hidden, and transient. A direct dependency occurs when a deploy script explicitly downloads something from GitHub; during an outage the download fails, and the script fails with it. A hidden dependency occurs when a local tool, such as a servicing utility, checks GitHub for updates and hangs or errors out if it can’t connect. A transient dependency arises when a script calls another internal service, which in turn tries to fetch from GitHub, propagating the failure back to the script.
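
To make the three categories concrete, here is a minimal sketch that models a deployment as a call graph and labels each path that ends at github.com. The graph contents, the node names (`deploy_script`, `servicing_tool`, `internal_api`), and the helper functions are all hypothetical and exist only for illustration; they are not taken from GitHub's tooling.

```python
from typing import Dict, List, Set

# Hypothetical call graph: each key calls the names in its list.
CALLS: Dict[str, List[str]] = {
    "deploy_script": ["github.com", "servicing_tool", "internal_api"],
    "servicing_tool": ["github.com"],  # assumed: update check on startup
    "internal_api": ["github.com"],    # assumed: fetches a release on request
}


def paths_to(graph: Dict[str, List[str]], start: str, target: str) -> List[List[str]]:
    """Return every acyclic call path from start to target (depth-first)."""
    found: List[List[str]] = []

    def walk(node: str, path: List[str]) -> None:
        if node == target:
            found.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:  # skip nodes already on this path
                walk(nxt, path + [nxt])

    walk(start, [start])
    return found


def classify(path: List[str], local_tools: Set[str]) -> str:
    """Label a path using the article's three categories."""
    if len(path) == 2:
        return "direct"  # the script itself contacts the target
    # The first hop decides: a tool on local disk, or another service.
    return "hidden" if path[1] in local_tools else "transient"


if __name__ == "__main__":
    for p in paths_to(CALLS, "deploy_script", "github.com"):
        print(classify(p, {"servicing_tool"}), " -> ".join(p))
```

Only the first path is visible in the script's own text. The other two depend on what the tool and the internal service do at runtime, which is why they are harder to find by reading the script.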

Can you give an example of a direct dependency during a MySQL outage?

Imagine a MySQL outage makes github.com unable to serve release data. An engineer runs a deploy script to apply a configuration change to affected MySQL nodes. The script tries to pull the latest release of an open-source tool directly from GitHub. Since GitHub is down, the download fails and the script cannot complete. That’s a classic direct circular dependency—the deployment relies on the very service it’s trying to fix.
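
Direct dependencies are the one category that a simple text scan can surface, because the URL appears in the script itself. The sketch below is an illustrative pre-flight check, not something the source describes GitHub running: the function name, the regular expression, and the sample script are all assumptions.

```python
import re
from typing import List, Tuple

# Matches http(s) URLs whose host is github.com or one of its subdomains.
# Assumption: other GitHub-owned domains are out of scope for this sketch.
GITHUB_URL = re.compile(r"https?://(?:[\w-]+\.)*github\.com[^\s\"']*")


def find_direct_references(script_text: str) -> List[Tuple[int, str]]:
    """Return (line number, URL) for each github.com URL in a script."""
    hits: List[Tuple[int, str]] = []
    for lineno, line in enumerate(script_text.splitlines(), start=1):
        for match in GITHUB_URL.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


# Hypothetical deploy script; the repository path is made up.
SAMPLE_SCRIPT = """#!/bin/sh
curl -L https://github.com/example/tool/releases/latest -o tool
./tool apply
"""

if __name__ == "__main__":
    for lineno, url in find_direct_references(SAMPLE_SCRIPT):
        print(f"line {lineno}: {url}")
```

A scan like this says nothing about what `./tool apply` does once it runs, so it cannot detect hidden or transient dependencies.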

What is a hidden dependency and how does it manifest?

A hidden dependency is more subtle: the deploy script uses a tool already present on the machine’s disk, so you might think it’s safe. But that tool, when executed, checks GitHub for an update. During an outage, it can’t reach GitHub and may either fail or hang, depending on its error handling. For example, a servicing tool might block indefinitely waiting for a response. The deployment stalls even though the script itself never directly accessed GitHub.
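
The difference between failing and hanging usually comes down to whether the tool sets a timeout on its update check. The following sketch shows one defensive pattern, assuming a hypothetical tool that probes its update server before doing real work; the function names, the default timeout, and the returned messages are invented for illustration.

```python
import socket


def update_server_reachable(host: str, port: int = 443, timeout_s: float = 2.0) -> bool:
    """Try a TCP connection with a bounded wait instead of blocking forever."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def run_tool(host: str, port: int = 443, timeout_s: float = 2.0) -> str:
    """Hypothetical tool entry point: the update check is best-effort."""
    if update_server_reachable(host, port, timeout_s):
        return "checked for updates, then ran"
    return "skipped update check, ran with installed version"
```

A tool written this way degrades gracefully during an outage. A deploy script cannot assume every tool on disk behaves like this, which is part of why enforcement at the network level is attractive.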

How does eBPF help GitHub prevent these dependencies?

eBPF allows GitHub to selectively monitor and block network calls made by deployment scripts. By writing eBPF programs, they can intercept connections to specific hosts or ports and either log them, rate-limit them, or outright deny them. This prevents scripts from accidentally creating circular dependencies on services like GitHub during an outage. Previously, teams had to manually review scripts for potential dependency issues. eBPF automates that enforcement, making deployments safer and reducing incident response time.
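
Real eBPF programs are written in restricted C and run inside the kernel, so the sketch below is only a userspace model of the kind of decision such a program might make for each outbound connection. The `Policy` structure, the `Action` values, the suffix-matching rule, and the domain names are assumptions for illustration, not GitHub's implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Set


class Action(Enum):
    ALLOW = "allow"
    LOG = "log"
    DENY = "deny"


@dataclass
class Policy:
    """Hypothetical per-deployment policy keyed by destination domain."""
    deny: Set[str] = field(default_factory=set)
    log: Set[str] = field(default_factory=set)


def _matches(host: str, domains: Set[str]) -> bool:
    """True if host equals a listed domain or is a subdomain of one."""
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in domains)


def decide(host: str, policy: Policy) -> Action:
    """Deny rules take precedence over log rules; the default is allow."""
    if _matches(host, policy.deny):
        return Action.DENY
    if _matches(host, policy.log):
        return Action.LOG
    return Action.ALLOW


if __name__ == "__main__":
    policy = Policy(deny={"github.com"}, log={"internal.example"})
    for host in ("api.github.com", "svc.internal.example", "db.local"):
        print(host, decide(host, policy).value)
```

Matching on the domain suffix rather than the exact host means a rule for github.com also covers api.github.com, while an unrelated host such as notgithub.com is left alone.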
