GitHub Services in April 2026: Key Incidents and Lessons Learned
Overview of GitHub Availability in April 2026
Throughout April 2026, GitHub recorded 10 separate incidents that temporarily degraded performance across various services. These events prompted the team to increase transparency, culminating in a detailed blog post covering the most significant disruptions, both of which occurred on April 1. Additionally, enhancements were made to the GitHub status page to provide more granular updates. As the engineering team continues to invest in both short-term fixes and long-term reliability improvements, we appreciate the patience and feedback from the developer community.

April 1: Code Search Service Disruption
What Happened
On April 1, 2026, between 14:40 and 17:00 UTC, GitHub’s code search service experienced a complete outage. During this 2-hour and 20-minute window, all search queries failed. Service was partially restored by 17:00 UTC, but search results were temporarily stale, reflecting repository content as of approximately 07:00 UTC. Full recovery with up-to-date indexing was achieved by 23:45 UTC, meaning the overall incident lasted 9 hours and 5 minutes.
Root Cause
The disruption originated during a routine infrastructure upgrade to the messaging system that powers code search. An automated change was applied too aggressively, causing a coordination breakdown between internal services. This halted search indexing, leading to stale results. While the team worked to restore the messaging layer, an unintended service deployment cleared internal routing state, escalating the staleness into a total outage.
Recovery Actions
Engineers restored messaging coordination through a controlled restart, then reset the search index to a point before the disruption. Importantly, no repository data was lost—the search index is a secondary structure derived from unaffected Git repositories. Once re-indexing completed, all results reflected the current repository state.
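Because the search index is derived data, recovery could safely discard stale index state and rebuild it from the repositories themselves. The Python sketch below illustrates that pattern in the abstract; `SearchIndex` and `reindex_from_source` are hypothetical names, not GitHub's actual indexing pipeline.

```python
from datetime import datetime, timezone

# Illustrative only: SearchIndex and reindex_from_source are hypothetical
# stand-ins, not GitHub's indexing pipeline. The point is that a derived
# index can be reset to a known-good point and rebuilt from the source of
# truth (the repositories) without any repository data being at risk.

class SearchIndex:
    def __init__(self) -> None:
        self.snapshot_time: datetime | None = None
        self.documents: dict[str, str] = {}

    def reset_to_snapshot(self, snapshot_time: datetime) -> None:
        """Roll the index back to a point before the disruption."""
        self.snapshot_time = snapshot_time
        self.documents.clear()

    def add(self, repo_name: str, content: str) -> None:
        self.documents[repo_name] = content


def reindex_from_source(index: SearchIndex, repos: dict[str, str],
                        snapshot_time: datetime) -> None:
    # Repositories are the source of truth; re-reading them restores
    # freshness, so no data is lost even though the index was reset.
    index.reset_to_snapshot(snapshot_time)
    for name, content in repos.items():
        index.add(name, content)


if __name__ == "__main__":
    idx = SearchIndex()
    current_repos = {"octo/widgets": "def hello(): ..."}
    reindex_from_source(idx, current_repos,
                        datetime(2026, 4, 1, 7, 0, tzinfo=timezone.utc))
    assert idx.documents == current_repos
```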
Preventive Measures
To avoid similar issues, GitHub is implementing:
- Gradual upgrades with enhanced health checks to catch problems before they cascade (see the sketch after this list).
- Deployment safeguards that prevent unintended changes during active incidents.
- Faster recovery tooling to reduce time-to-restore for search services.
- Better traffic isolation to contain impacts from unexpected traffic spikes during outages.
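As a rough illustration of the first two items above, the sketch below stages an upgrade across progressively larger fractions of a fleet, checks health after each stage, and refuses to deploy while an incident is active. It assumes hypothetical `apply_to_fraction`, `healthy`, and `incident_in_progress` callbacks rather than any real GitHub tooling.

```python
import time

# Illustrative sketch of a staged rollout: check service health after each
# step and refuse to proceed while an incident is active. The callbacks
# apply_to_fraction, healthy, and incident_in_progress are hypothetical.

ROLLOUT_STAGES = [0.01, 0.05, 0.25, 0.50, 1.00]  # fraction of hosts upgraded


def gradual_upgrade(apply_to_fraction, healthy, incident_in_progress,
                    soak_seconds: int = 300) -> bool:
    for stage in ROLLOUT_STAGES:
        if incident_in_progress():
            # Deployment safeguard: never push changes during an active incident.
            print(f"Rollout paused before the {stage:.0%} stage: incident in progress")
            return False
        apply_to_fraction(stage)
        time.sleep(soak_seconds)  # let metrics accumulate before judging health
        if not healthy():
            print(f"Health check failed at {stage:.0%}; halting rollout")
            return False
    return True


if __name__ == "__main__":
    # Toy run: everything healthy and no active incident, so the rollout completes.
    ok = gradual_upgrade(
        apply_to_fraction=lambda f: print(f"upgraded {f:.0%} of hosts"),
        healthy=lambda: True,
        incident_in_progress=lambda: False,
        soak_seconds=0,
    )
    print("rollout complete" if ok else "rollout halted")
```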
April 1: Audit Log Service Interruption
What Happened
Later the same day, between 15:34 and 16:02 UTC, the audit log service lost connectivity to its backing data store due to a failed credential rotation. During this 28-minute window, audit log history was unavailable via both the API and the web UI, resulting in 5xx errors for 4,297 API actors and 127 github.com users. New audit log events created during the incident were delayed by up to 29 minutes before appearing on github.com and in audit log streaming. No events were lost, however; all of them were ultimately written and streamed successfully. Customers using GitHub Enterprise Cloud with data residency were unaffected.

Response and Follow-up
GitHub’s monitoring system alerted the team at 15:40 UTC—six minutes after the failure began. The credential rotation process has been reviewed to prevent recurrence, and additional validation steps are being added to similar automated operations.
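One common shape for such a validation step is to confirm that a newly issued credential can actually reach the backing data store before the old credential is retired. The sketch below is illustrative only; `issue_new`, `probe_connection`, `activate`, and `revoke_old` are assumed helper functions, not GitHub's rotation tooling.

```python
# Illustrative only: issue_new, probe_connection, activate, and revoke_old are
# hypothetical helpers. The shape of the check is what matters: prove the new
# credential works before retiring the old one, so a failed rotation cannot
# sever connectivity to the data store.

def rotate_datastore_credentials(issue_new, probe_connection, activate, revoke_old) -> None:
    new_cred = issue_new()
    if not probe_connection(new_cred):
        # Validation failed: keep serving traffic with the existing credential.
        raise RuntimeError("new credential cannot reach the data store; rotation aborted")
    activate(new_cred)   # switch the service over to the validated credential
    revoke_old()         # retire the previous credential only after cutover succeeds
```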
Transparency and Status Page Enhancements
In response to the April incidents, GitHub has committed to providing more detailed and timely updates. The GitHub status page now includes richer information about incident timelines, affected services, and root causes. Future availability reports will continue to follow this transparent format, with post-incident analyses published promptly.
Continuous Reliability Investments
The incidents of April 2026 underscore the complexity of operating a global platform at scale. GitHub’s engineering teams are pursuing both near-term fixes and long-term architectural improvements to enhance resilience. Key focus areas include:
- Infrastructure hardening for critical services like code search and audit logs.
- Automated rollback mechanisms to quickly undo problematic changes (a sketch follows this list).
- Improved testing for credential rotations and configuration updates.
- Cross-service impact analysis to prevent secondary failures during incidents.
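To make the automated rollback item concrete, the sketch below compares post-deploy error rate and latency against pre-deploy baselines and reverts the change if either regresses beyond a budget; `deploy`, `rollback`, and `read_metrics` are placeholders for real deployment tooling, and the thresholds are purely illustrative.

```python
# Hypothetical sketch of an automated rollback gate: deploy, rollback, and
# read_metrics are placeholders for whatever deployment tooling is in use.
# The gate compares post-deploy metrics against pre-deploy baselines and
# reverts automatically if any regress beyond a budget.

ERROR_RATE_BUDGET = 1.5   # tolerate at most a 1.5x increase over baseline
LATENCY_BUDGET = 1.2      # tolerate at most a 1.2x increase over baseline


def deploy_with_auto_rollback(deploy, rollback, read_metrics) -> bool:
    baseline = read_metrics()
    deploy()
    current = read_metrics()
    regressed = (
        current["error_rate"] > baseline["error_rate"] * ERROR_RATE_BUDGET
        or current["p99_latency"] > baseline["p99_latency"] * LATENCY_BUDGET
    )
    if regressed:
        rollback()  # undo the problematic change instead of debugging it live
        return False
    return True
```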
We thank all users for their continued trust and patience as these improvements are implemented.