How Concourse weathered the CrowdStrike outage and ensured minimal downtime for our customers
It's been a tough week for Windows systems and the IT teams managing them. I live in a glass house here, so no stones to throw. However, this situation highlights the value of Concourse’s methodical architectural approach to end-to-end managed private cloud and security services for Windows and SQL applications.
What is CrowdStrike?
CrowdStrike is a cybersecurity company that has experienced significant growth over the past several years. Their main product, Falcon, is an Endpoint Detection and Response (EDR) tool, which is a modern anti-virus/anti-malware solution.
What Happened?
Recently, CrowdStrike pushed a problematic update. The CEO referred to this as a "content update," but that's not accurate. CrowdStrike operates differently from traditional anti-virus programs, which consist of a core program updated with new virus signatures. These signatures are essentially lists of known threats.
Traditional anti-virus solutions that rely solely on signature updates are not sophisticated enough for modern cybersecurity needs, because attackers can evade a signature by altering their malware until it no longer matches. CrowdStrike became popular by focusing on behavioral analysis to detect malware, rather than relying on signature-based updates.
When CrowdStrike updates a Falcon agent to include new detection capabilities, it's more than just a content update. These updates often include changes to drivers with special access to the Windows kernel, operating at a level more privileged than normal programs.
How Could Microsoft Allow Such Low-Level Access?
Before Windows Vista, anti-virus vendors could patch and hook the kernel itself, which is the core of the operating system. This created stability issues for Microsoft. With 64-bit Vista, Microsoft enforced Kernel Patch Protection (PatchGuard), which blocked this kind of kernel modification and pushed vendors toward less privileged techniques such as user-mode hooks.
Anti-virus vendors were unhappy with this change and threatened to find ways to regain their previous access levels. Microsoft eventually compromised by providing kernel callback notifications, offering a middle ground: security drivers still load in kernel mode and are notified of events such as process creation, but they can no longer modify the kernel itself.
CrowdStrike operates in this special area. After the operating system loads, drivers with kernel callback notifications can run, collecting telemetry data sent to the main program in user mode. However, if these drivers malfunction, the kernel crashes by design to protect itself.
How Was Concourse Impacted?
At Concourse Hosting, we use CrowdStrike Falcon as part of our security architecture. Unlike many organizations, we do not adopt the latest updates immediately. Instead, we run the n-1 (second-to-latest) release, so that potential issues are identified and resolved before they reach our systems. Additionally, we roll out updates in a staggered manner based on time zones, which limits how many systems a bad update can reach at once.
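To make the policy above concrete, here is a minimal Python sketch of the two ideas: choosing the n-1 release and spacing rollout waves by time zone. It is an illustration only, not Concourse's actual tooling or any CrowdStrike API. The function names, host names, version numbers, and the four-hour gap are all hypothetical.

```python
from datetime import datetime, timedelta


def parse_version(version: str) -> tuple[int, ...]:
    """Turn a dotted version such as '7.15' into (7, 15) so it sorts numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_n_minus_1(available: list[str]) -> str:
    """Return the second-to-latest release from a list of version strings."""
    ordered = sorted(set(available), key=parse_version)
    if len(ordered) < 2:
        raise ValueError("need at least two releases to pick n-1")
    return ordered[-2]


def stagger_by_timezone(
    host_offsets: dict[str, int], start: datetime, gap: timedelta
) -> dict[str, datetime]:
    """Assign each host an update time, one wave per distinct UTC offset.

    host_offsets maps a host name to its UTC offset in hours (hypothetical data).
    Waves are ordered by offset and separated by `gap`.
    """
    offsets = sorted(set(host_offsets.values()))
    wave_start = {offset: start + i * gap for i, offset in enumerate(offsets)}
    return {host: wave_start[offset] for host, offset in host_offsets.items()}


# Hypothetical example data.
target = pick_n_minus_1(["7.9", "7.16", "7.15"])
schedule = stagger_by_timezone(
    {"nyc-sql-01": -5, "lon-web-01": 0, "nyc-web-02": -5},
    start=datetime(2024, 1, 1, 0, 0),
    gap=timedelta(hours=4),
)
```

The gap between waves is what buys reaction time: if the first wave misbehaves, later waves can be held back before they ever receive the update.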
Why Our Approach Works
Most of our systems were unaffected by the recent CrowdStrike issue because they did not receive the problematic update immediately. For the few that did, enough time had passed for CrowdStrike to release a fix. A simple reboot resolved the issue in most cases. In the rare instances where further action was needed, we booted into safe mode, manually removed the problematic file, and then rebooted.
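For readers who want to script the manual cleanup step, the sketch below shows one cautious way to do it in Python: list the files that match a pattern, review them, and delete only when explicitly asked. It is not the procedure our engineers ran. The helper names are hypothetical, and the directory and pattern are inputs you must supply. CrowdStrike's public remediation guidance pointed to files matching C-00000291*.sys in the CrowdStrike drivers directory, but that detail comes from their guidance, not from this article, so verify it against the vendor's instructions.

```python
from pathlib import Path


def find_matching_files(directory: Path, pattern: str) -> list[Path]:
    """Return the files in `directory` whose names match the glob `pattern`."""
    return sorted(p for p in directory.glob(pattern) if p.is_file())


def remove_files(paths: list[Path], dry_run: bool = True) -> list[Path]:
    """Delete the given files, or only report them when dry_run is True.

    Returns the list of paths that were (or would be) removed.
    """
    handled = []
    for path in paths:
        if not dry_run:
            path.unlink()
        handled.append(path)
    return handled


# Example usage (assumed path and pattern; confirm against vendor guidance):
# drivers = Path(r"C:\Windows\System32\drivers\CrowdStrike")
# candidates = find_matching_files(drivers, "C-00000291*.sys")
# print(remove_files(candidates))            # dry run: lists files only
# remove_files(candidates, dry_run=False)    # actually deletes them
```

Defaulting to a dry run is deliberate: on a machine that is already unstable, it is worth confirming exactly which files will be touched before removing anything.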
The Concourse Advantage
Our high-quality managed detection and alert systems, along with our 24/7/365 technical support, ensured the problem was caught immediately. A real person received the alert about the issue on Friday evening. This experienced engineer quickly checked CrowdStrike's updates on Twitter and used their deep technical skills to follow early guidance and remediate the problem.
My thoughts are with the IT Ops teams dealing with these challenges. It's a tough job, and we all strive to keep our systems running smoothly under such pressures.
Learn More
If you'd like to learn more about how Concourse Hosting can provide stable and secure private cloud solutions for your organization, visit Concourse Hosting | Private Cloud Solutions & Services.