On 19th July 2024, Microsoft users witnessed one of the largest IT outages in history. The ‘blue screen of death’ took Windows machines offline for around five hours, affected an estimated 8.5 million devices worldwide by Microsoft's own count, and disrupted critical systems across the globe. Three pertinent questions arise: what made a tech giant go non-functional for five hours? What was the damage? What’s being done to ensure this is not repeated?
What happened?
CrowdStrike is an independent cyber security company whose platform many organisations use to protect their Windows devices from attack. It does this through software installed on each device that protects both the device and its users. This software is called the Falcon sensor.
The Falcon sensor watches everything happening on a device. It observes, monitors and analyses user activity: how many folders there are, which files are open and what kind of files they are, which devices are connected over Bluetooth or Wi-Fi, what the user is doing in a web browser, and so on. It gathers this information and uses it to protect the device against malware and other cyber attacks.
On 19th July 2024, CrowdStrike released an update to the Falcon sensor, which was pushed to the devices running it on the Windows operating system (Microsoft’s proprietary software). Instead of updating normally, those devices crashed to a blue screen because of a bug in the update. Every device that received the update loaded the blue screen, completely cutting people off from their own machines.
This had a catastrophic effect on organisations that rely on Microsoft and its services.
What was the damage?
A single fault forced a tech-dependent world to run its operations manually, rewinding the clock to the 1950s. It severely damaged industries that rely on Windows, including airlines, hospitals, banks, stock markets and other businesses.
Web check-in became impossible, and airlines had to complete the process manually. Some 21,000 flights were delayed globally in just five hours, a few were cancelled, and ground halts were extended.
The UK National Health Service reported that access to patients’ medical histories became difficult and that scans and blood tests could not be carried out. In the US, patient care was disrupted at cancer centres. 911 services also took a hit: calls could be seen coming in but could not be answered. Critically ill patients were put at risk.
Many financial institutions ran into complications because of the crash and had to fall back on their backup systems. ATMs were out of service for a few hours. Staff at institutions such as Bank of America Corporation, JP Morgan Chase & Co. and Nomura Holdings had trouble logging in to their devices. Many customers were preparing to file insurance claims through Marsh, the world’s largest insurance brokerage.
The automobile industry was also affected by the outage. Supplies got stuck and production was disrupted at Renault. Elon Musk, CEO of Tesla Inc., announced on ‘X’ that he would no longer use CrowdStrike software.
The outage also hit share prices and market values. RNS, the service companies use to publish price-sensitive announcements, stopped working and could not publish news on the London Stock Exchange Group’s website. Microsoft’s shares fell about 1% to $437.11.
Ultimately, all of this weighed on CrowdStrike’s share price. That day, its shares closed 11% lower at $304.96 in New York trading, wiping out about $9 billion in market value. Shares of CrowdStrike’s rivals rose between 2% and 8% on 19th July.
What is being done to ensure that this is not repeated?
On 20th July, a Microsoft vice president described how the company had been helping its customers through the crash caused by CrowdStrike, an independent cyber security company, while developing a scalable resolution. Interestingly, Microsoft claimed no liability for the resulting losses, relying on clause 6B of the Microsoft Service Agreement.
Fixing the crash was difficult, although it took CrowdStrike’s employees only 79 minutes to find the source of the outage. CrowdStrike took responsibility for the damage and apologised on various social media platforms. It said that the cause was an error in its software and clarified that it was not a cyber attack.
Microsoft extended support to its users soon after the incident, and the two companies worked together to resolve the crisis. CrowdStrike also set out measures to reduce the risk of future outages: it will not push an update without testing it repeatedly; it will add further layers of checking and testing before an update reaches customers; it will improve its configuration system; and Falcon sensor code will be reviewed more strictly. These steps should make the digital world less vulnerable and such incidents less likely.
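To make the idea of layered checks before an update reaches everyone more concrete, here is a minimal sketch of a staged rollout in Python. It is an illustration only and not CrowdStrike's actual process: the `Ring` class, the `staged_rollout` function, the ring names and the 1% failure threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Ring:
    """A hypothetical deployment ring: a name and the devices in it."""
    name: str
    devices: List[str]


def staged_rollout(
    rings: List[Ring],
    apply_update: Callable[[str], bool],
    max_failure_rate: float = 0.01,  # assumed threshold, for illustration
) -> List[str]:
    """Apply an update ring by ring and stop at the first unhealthy ring.

    apply_update(device) returns True if the device is healthy after the
    update. Returns the names of the rings that completed successfully.
    """
    completed = []
    for ring in rings:
        if not ring.devices:
            completed.append(ring.name)  # nothing to update in this ring
            continue
        failures = sum(1 for device in ring.devices if not apply_update(device))
        if failures / len(ring.devices) > max_failure_rate:
            break  # halt: later, larger rings never receive the update
        completed.append(ring.name)
    return completed


# Illustrative run: a faulty update that crashes every device it touches.
rings = [
    Ring("internal-test", ["lab-1", "lab-2"]),
    Ring("early-adopters", [f"pilot-{i}" for i in range(50)]),
    Ring("everyone", [f"device-{i}" for i in range(10_000)]),
]
touched = []


def faulty_update(device: str) -> bool:
    touched.append(device)
    return False  # the device did not come back healthy


print(staged_rollout(rings, faulty_update))  # []
print(len(touched))                          # 2
```

In this sketch the faulty update is caught on the two test machines, so the 10,000 devices in the last ring are never touched. The design choice is simply that the smallest group goes first and every later group waits for it to stay healthy.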
Conclusion:
An ordinary software bug single-handedly created chaos around the globe. It shows how heavily we depend on technology and how damaging it is when that technology fails. From airlines to healthcare, most industries were left vulnerable by this incident. It highlighted the critical reliance on cloud services and the importance of robust contingency planning for organisations worldwide.