Global Impact of a 1% Windows Outage: Insights from the CrowdStrike Issue and the Benefits of Managed Desktop as a Service
In today's digitally interconnected world, even a seemingly small security update can have catastrophic consequences. The recent incident involving a problematic update from CrowdStrike, a well-known cybersecurity solution, highlights this fact. The update affected less than one percent of Windows devices, yet the implications were massive and global.
In this blog post you will learn more about the CrowdStrike update incident, the lessons learned, and future precautions from Dizzion, a major Desktop as a Service (DaaS) vendor. We will explore why the outage had such a large global impact and how DaaS solutions, including those with a full-service managed option like Dizzion's, can ensure faster recovery with minimal impact on operations.
The CrowdStrike Incident
Microsoft estimates that CrowdStrike's update affected 8.5 million Windows devices, or less than one percent of all Windows machines. Despite this seemingly small percentage, the impact was significant because so many business systems run Windows. Cybersecurity incidents like this one are often like icebergs: you see only the tip, while far more vulnerability hides under the surface. Many more customers likely experienced issues that were never publicly reported. Companies often claim they are down for "maintenance" to soften the impact of this kind of problem and make it seem routine.

For some, the CrowdStrike update led to the infamous Microsoft Blue Screen of Death (BSOD) on Windows PCs and servers, from which a machine often does not recover without manual intervention. These crashes disrupted operations in airlines, banks, retail and healthcare.

The crash affected Windows only; Mac and Linux PCs and servers were not affected. However, *nix systems are not impervious to unplanned crashes. Similar issues were reported a couple of months earlier, affecting Debian and Rocky Linux machines.
While CrowdStrike issued an update shortly after the issue was discovered, mitigation was not always automatic and typically required manual intervention by IT professionals to recover from the BSOD and reboot the Windows machine. For companies managing CrowdStrike themselves, their own IT departments were responsible for mitigating the issue. For companies using a managed service, like Dizzion DaaS, the service provider took care of the issue, leveraging 24x7 operations to respond immediately. For example, for Dizzion Managed DaaS customers in the US, the issue was detected and resolved by Dizzion in the early morning hours, before most US businesses started their day on Friday, July 19.
Encryption, Modern Device Management and Device Proliferation
The proliferation of both BYOD and corporate devices, along with their varied locations, has significantly complicated the IT infrastructure landscape. The problem has been worsened by the post-COVID-19 shift to remote work and the growing reliance on cloud services. Companies may have thousands of remote locations, each adding layers of device management complexity. Devices on home networks often aren't connected to the corporate domain and may require a VPN for management, including management of device encryption solutions such as BitLocker. This lack of connectivity puts many devices at risk: end users cannot access them if issues arise, and in some cases cannot even send an email to open an IT support ticket. Failures during device boot and the boot loops that follow are particularly challenging because they require physical intervention. Managing this complexity is key, and the need for skilled IT professionals is evident, especially with labor cuts and outsourcing. Modern workspace solutions such as virtualized applications and Desktop as a Service can address these challenges efficiently, since management is centralized and secure access to applications and desktops is device independent. Any device with a browser can securely access corporate applications.
How can DaaS solutions address today's security and availability challenges?
Desktop as a Service is a cloud computing solution that delivers virtual desktops and applications to end-users over the internet, enabling centralized management and scalability. Dizzion DaaS solutions allow users to access their virtual desktops from any device with a browser, ensuring flexibility and efficiency in IT operations. DaaS solutions can significantly improve recovery efforts during disruptions like the recent CrowdStrike incident. Here are some of the key advantages of using DaaS:
Central Image Deployment and Management
DaaS enables centralized management and deployment of virtual desktops and applications, simplifying the management of the operating system, applications and third-party security solutions such as CrowdStrike and Microsoft Defender for Endpoint. This centralization reduces the time and effort required to address issues across multiple endpoints, ensuring a swift response to any problem with the OS, an application or a third-party security product.
Rapid Deployment
DaaS facilitates the rapid deployment of virtual desktops and applications. This capability minimizes downtime and enables employees to resume work quickly, which is crucial during recovery periods.
Scalability and Multi-Cloud
Modern DaaS solutions such as Dizzion Frame allow organizations to scale resources up or down based on demand. This scalability ensures that critical services remain available even during peak recovery periods, helping businesses maintain operations without significant interruptions. Multiple regions and multiple cloud providers can be used to ensure availability and capacity.
Good Practices for Non-Persistent Images
Disabling automatic updates for non-persistent machines is a leading practice in DaaS environments. If end-users with their non-persistent machines encounter issues, rebooting them resets them to a known good state. This approach can be particularly useful in mitigating the effects of problematic updates like the one from CrowdStrike.
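The reset-on-reboot behavior described above can be illustrated with a small model. This is only a sketch, not Dizzion's implementation: the `GoldenImage` and `NonPersistentDesktop` classes and the package names are hypothetical, and the model assumes automatic updates are disabled on the image, so nothing re-applies the bad change after the reboot.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass(frozen=True)
class GoldenImage:
    """A validated, read-only base image (hypothetical model)."""
    version: str
    packages: Tuple[str, ...]


@dataclass
class NonPersistentDesktop:
    """A session VM whose in-session changes are discarded on reboot."""
    image: GoldenImage
    session_changes: List[str] = field(default_factory=list)

    def apply_change(self, change: str) -> None:
        # Anything installed during the session lives only in this list.
        self.session_changes.append(change)

    def installed(self) -> List[str]:
        # Effective state = golden image contents plus session changes.
        return list(self.image.packages) + self.session_changes

    def reboot(self) -> None:
        # Rebooting drops the session layer and returns to the golden image.
        self.session_changes.clear()


golden = GoldenImage(version="validated", packages=("os-base", "sensor-known-good"))
desktop = NonPersistentDesktop(image=golden)

desktop.apply_change("sensor-bad-content-update")  # problematic update arrives
desktop.reboot()                                   # user or admin reboots the VM
print(desktop.installed())
```

In this model the golden image itself never changes during a session, which is why validating updates before they are baked into the image matters as much as the reboot itself.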
Diverse Endpoints
DaaS supports a variety of endpoint devices, including Windows, macOS and Linux PCs, thin clients (HP, IGEL, Unicon, Stratodesk, 10Zig), Chrome OS devices, or simply an HTML5 browser on any device. With no data stored on the endpoint, it simply serves as a gateway to virtual desktops and applications, adding an extra layer of resilience against Windows-specific corporate issues.
How Dizzion Ensures Swift Mitigation and Recovery
The primary lesson from this incident for businesses is the necessity for clear communications, robust preparedness and efficient, fast recovery mechanisms. IT disruptions of this scale underscore the importance of having powerful and proven recovery protocols in place to minimize downtime and maintain business continuity. Dizzion demonstrated the benefits of using a fully managed DaaS service during the CrowdStrike incident. Many of our customers use CrowdStrike in their Dizzion DaaS environments, including government, retail, technology, and education sectors. For these customers, Dizzion's managed service offering ensured swift mitigation and recovery.
While CrowdStrike issued an update shortly after the problematic one, mitigation was not automatic and often required manual intervention. For our Managed customers using CrowdStrike, Dizzion was able to detect, manage, and correct the issue before the workday on Friday began. As a result, Dizzion's customers in the US did not experience significant issues, highlighting the value of a fully managed DaaS service.
Lessons worth sharing from Dizzion - a major DaaS provider
- Controlled Updates: Be selective about automatically approving and deploying Windows operating system updates or third-party application updates, such as those from CrowdStrike, without validation. Control and stability are crucial for every business, though this must be balanced with the need for timely updates to minimize security risks.
- Backup, easy rollback and recovery: Implementing systematic backup and rollback capabilities can significantly aid in recovery during outages caused by problematic updates. Dizzion DaaS provides capabilities to manually or automatically schedule backups of non-persistent desktops, golden images, personal desktops and utility servers.
- Robust Incident Response: When bad things happen, there needs to be a process implemented immediately that handles communication, remediation, and recovery in the least amount of time. Incident response teams are on standby, ready to quickly respond to any issue, ensuring coordinated efforts, identifying root causes, and prioritizing recovery to resume business as usual swiftly.
- High Availability and Multi-Cloud Configurations: The Dizzion DaaS control plane is highly available, with built-in platform redundancy and the ability for customers to run workload VMs across regions and infrastructures. Azure, AWS, GCP, IBM Cloud and Nutanix AHV are supported, helping maintain accessibility during hardware or software failures.
- Global Managed Service: Dizzion customers and partners benefit from a global support organization with extensive experience in handling enterprise clients and international deployments. For example, when issues arose in EU deployments, lessons learned were swiftly applied to mitigate similar problems for US customers.
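One common way to put the "Controlled Updates" lesson into practice is a ring-based (staged) rollout, where an update must pass a health check on a small group of machines before it reaches a larger one. The sketch below is illustrative only and is not a description of how Dizzion or CrowdStrike deploy updates: the `Ring` class, the `staged_rollout` function, the device names and the health check are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class Ring:
    """A named group of devices that receives an update together."""
    name: str
    devices: List[str]


def staged_rollout(
    rings: List[Ring],
    install_and_check: Callable[[str], bool],
    max_failure_rate: float = 0.0,
) -> Tuple[List[str], Optional[str]]:
    """Deploy ring by ring and halt at the first unhealthy ring.

    Returns (names of rings that completed, name of the ring that
    halted the rollout or None if every ring passed).
    """
    completed: List[str] = []
    for ring in rings:
        # install_and_check returns True when the device is healthy afterwards.
        failures = sum(1 for d in ring.devices if not install_and_check(d))
        rate = failures / len(ring.devices) if ring.devices else 0.0
        if rate > max_failure_rate:
            return completed, ring.name  # later rings are never touched
        completed.append(ring.name)
    return completed, None


rings = [
    Ring("canary", ["test-vm-1", "test-vm-2"]),
    Ring("pilot", ["it-vm-1", "it-vm-2", "it-vm-3"]),
    Ring("broad", ["prod-vm-%d" % i for i in range(100)]),
]

touched: List[str] = []


def bad_update(device: str) -> bool:
    # Stand-in for "install the update, then verify the machine boots".
    touched.append(device)
    return False  # simulate an update that crashes every machine


completed, halted_at = staged_rollout(rings, bad_update)
print(completed, halted_at, len(touched))
```

With a faulty update, only the two canary machines are affected and the pilot and broad rings are left alone. The trade-off noted in the list above still applies: each extra ring delays the point at which security fixes reach every machine.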
Conclusion
The recent CrowdStrike update incident serves as a powerful reminder of the critical need for robust IT recovery protocols, balanced update procedures and clear communication strategies. Even a small disruption, affecting less than one percent of Windows PCs and Servers, can have widespread consequences across various sectors. This underscores the importance of deploying robust systems and being prepared with efficient recovery mechanisms to minimize downtime and maintain business continuity.
DaaS solutions, such as those offered by Dizzion, play a crucial role in addressing these challenges. With customer-focused global support, mature incident response processes, industry expertise, centralized image management, rapid re-deployment, and scalable resources, Dizzion Managed ensures immediate mitigation and recovery during IT disruptions. Dizzion's managed service demonstrated its value by efficiently handling the CrowdStrike incident, ensuring that our customers experienced minimal impact.
Collective Wisdom: The Value of Sharing Information
The resources below are helpful for understanding the CrowdStrike incident and for mitigating potential security-related issues:
- Template for Executive Leadership Notification by Team8 - via Ross Young, former CIA and CISO
- CISO advice for addressing cyber-risk management challenges
- Internal Plan of Action Template by Anshu Gupta
- Windows Group Policy Recommendations
- CrowdStrike Statement on Falcon Content Update for Windows Hosts
- Technical Details: Falcon Content Update for Windows Hosts
- CrowdStrike published a technical update blog post, for anyone curious about the 'why'
- Windows Restore Guidance
- Microsoft Azure Guidance
- AWS EC2 Guidance and How do I recover AWS resources that were affected by the CrowdStrike Falcon agent?
- Bob and Alice in Kernel-land
- Microsoft releases recovery tool to help repair Windows machines hit by CrowdStrike issue
- CrowdStrike IT Outage Explained by a Windows Developer
- Microsoft calls for Windows changes and resilience after CrowdStrike outage
- CrowdStrike Preliminary Post Incident Review
Thanks for reading; if you have any questions, don't hesitate to get in touch with me!