All our servers and company laptops went down at pretty much the same time. Laptops have been bootlooping to blue screen of death. It’s all very exciting, personally, as someone not responsible for fixing it.
Apparently caused by a bad CrowdStrike update.
Edit: now being told we (who almost all generally work from home) need to come into the office Monday as they can only apply the fix in-person. We’ll see if that changes over the weekend…
Reading into the updates some more… I’m starting to think this might just destroy CrowdStrike as a company altogether. Between the mountain of lawsuits almost certainly incoming and the total destruction of any public trust in the company, I don’t see how they survive this. Just absolutely catastrophic on all fronts.
Agreed, this will probably kill them over the next few years unless they can really magic up something.
They probably don’t get sued - their contracts will have indemnity clauses against exactly this kind of thing, so unless they seriously misrepresented what their product does, this probably isn’t a contract breach.
If you are running crowdstrike, it’s probably because you have some regulatory obligations and an auditor to appease - you aren’t going to be able to just turn it off overnight, but I’m sure there are going to be some pretty awkward meetings when it comes to contract renewals in the next year, and I can’t imagine them seeing much growth
Don’t most indemnity clauses have exceptions for gross negligence? Pushing out an update this destructive without it getting caught by any quality control checks sure seems grossly negligent.
I think you’re on the nose, here. I laughed at the headline, but the more I read the more I see how fucked they are. Airlines. Industrial plants. Fucking governments. This one is big in a way that will likely get used as a case study.
The London Stock Exchange went down. They’re fukd.
Testing in production will do that
Not everyone is fortunate enough to have a separate testing environment, you know? Manglement has to cut costs somewhere.
What lawsuits do you think are going to happen?
Forget lawsuits, they’re going to be in front of congress for this one
For what? At best it would be a hearing on the challenges of national security with industry.
Don’t we blame MS at least as much? How does MS let an update like this push through their Windows Update system? How does an application update make the whole OS unable to boot? Blue screens on Windows have been around for decades, why don’t we have a better recovery system?
Crowdstrike runs at ring 0, effectively as part of the kernel. Like a device driver. There are no safeguards at that level. Extreme testing and diligence is required, because these are the consequences for getting it wrong. This is entirely on crowdstrike.
Yeah my plans of going to sleep last night were thoroughly dashed as every single windows server across every datacenter I manage between two countries all cried out at the same time lmao
I always wondered who even used windows server given how marginal its marketshare is. Now i know from the news.
Marginal? You must be joking. A vast number of servers run on Windows Server. Where I work alone we have several hundred and many companies have a similar setup. Statista put the Windows Server OS market share over 70% in 2019. While I find it hard to believe it would be that high, it does clearly indicate it’s most certainly not a marginal percentage.
I’m not getting an account on Statista, and I agree that its marketshare isn’t “marginal” in practice, but something is up with those figures, since overwhelmingly internet hosted services are on top of Linux. Internal servers may be a bit different, but “servers” I’d expect to count internet servers…
Most servers aren’t Internet-facing.
There are a ton of Internet-facing servers, and the vast majority of cloud instances run Linux, as does every cloud provider except Microsoft (and their in-house “windows” for Azure hosting is somehow different, though they aren’t public about it).
In terms of on premise servers, I’d even say the HPC groups may outnumber internal windows servers. While relatively fewer instances, they all represent racks and racks of servers, and that market is 100% Linux.
I know a couple of retailers and at least two game studios are keeping at scale windows a thing, but Linux mostly dominates my experience of large scale deployment in on premise scale out.
Linux is so common in scenarios with lots of horizontal scaling that it’s hard to imagine Windows still holding a majority share of the market when all is said and done, given that it’s usually deployed in smaller quantities in any given place.
It’s stated in the synopsis, below where it says you need to pay for the article. Anyway, it might be true as the hosting servers themselves often host up to hundreds of Windows machines. But it really depends on what is measured and the method used, which we don’t know because who the hell has a statista account anyway.
since overwhelmingly internet hosted services are on top of Linux
This is a common misconception. Most internet hosted services are behind a Linux box, but that doesn’t mean those services actually run on Linux.
It’s only marginal for running custom code. Every large organization has at least a few of them running important out-of-the-box services.
Well, I’ve seen some, but they usually don’t have automatic updates and generally do not have access to the Internet.
Almost everyone, because the Windows server market share isn’t marginal at all.
Not too long ago, a lot of Customer Relationship Management (CRM) software ran on MS SQL Server. Businesses made significant investments in software and training, and some of them don’t have the technical, financial, or logistical resources to adapt - momentum keeps them using Windows Server.
For example, small businesses that are physically located in rural areas can’t use cloud based services because rural internet is too slow and unreliable. It’s not quite the case that there’s no amount of money you can pay for a good internet connection in rural America, but last time I looked into it, Verizon wanted to charge me $20,000 per mile to run a fiber optic cable from the nearest town to my client’s farm.
How many cups of coffee have you drunk in the last 12 hours?
I work in a data center
I lost count
What was Dracula doing in your data centre?
Because he’s Dracula. He’s twelve million years old.
THE WORMS
I work in a datacenter, but no Windows. I slept so well.
A couple of years back some ransomware that also impacted Linux ran through, but I got to sleep well because it only bit people with easily guessed root passwords. It bit a lot of other departments at the company though.
This time even the Windows folks were spared, because CrowdStrike wasn’t the solution they infested themselves with (they use other providers, who I fully expect to screw up the same way one day).
There was a point where words lost all meaning and I think my heart was one continuous beat for a good hour.
Did you feel a great disturbance in the force?
Oh yeah I felt a great disturbance (900 alarms) in the force (Opsgenie)
How’s it going, Obi-Wan?
CrowdStrike: It’s Friday, let’s throw it over the wall to production. See you all on Monday!
They did it on Thursday. All of SFO was BSODed for me when I got off a plane at SFO Thursday night.
This is going to be a Big Deal for a whole lot of people. I don’t know all the companies and industries that use Crowdstrike but I might guess it will result in airline delays, banking outages, and hospital computer systems failing. Hopefully nobody gets hurt because of it.
Big chunk of New Zealand’s banks apparently run it, cos 3 of the big ones can’t do credit card transactions right now
It was mayhem at PakNSave a bit ago.
Several 911 systems were affected or completely down too
Clownstrike
Crowdshite haha gotem
CrowdCollapse
An offline server is a secure server!
Been at work since 5AM… finally finished deleting the C-00000291*.sys file in CrowdStrike directory.
182 machines total. Thankfully the process in and of itself takes about 2-3 minutes. For virtual machines, it’s a bit of a pain, at least in this org.
lmao I feel kinda bad for those companies that have 10k+ endpoints to do this to. Eff… that. Lots of immediate short term contract hires for that, I imagine.
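For anyone scripting the cleanup described above, here is a rough sketch of just the file-matching step. The `C-00000291*.sys` pattern comes from the comment; the function name, the `dry_run` flag, and the idea of passing the directory in as an argument are my own assumptions, and this does nothing about actually getting into a bootlooping machine in the first place.

```python
import tempfile
from pathlib import Path


def remove_channel_files(directory, pattern="C-00000291*.sys", dry_run=True):
    """Return the files in `directory` matching `pattern`.

    With dry_run=True (the default) nothing is deleted, so the list can be
    reviewed first. With dry_run=False every matching file is removed.
    """
    matches = sorted(Path(directory).glob(pattern))
    if not dry_run:
        for path in matches:
            path.unlink()
    return matches


if __name__ == "__main__":
    # Demo against a throwaway directory, not a real CrowdStrike install.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "C-00000291-example.sys").write_text("bad")
        (Path(tmp) / "C-00000290-example.sys").write_text("fine")
        print([p.name for p in remove_channel_files(tmp)])  # review only
        remove_channel_files(tmp, dry_run=False)             # actually delete
        print(sorted(p.name for p in Path(tmp).iterdir()))
```

Defaulting to a dry run is deliberate: on 182 machines you probably want to see what would be deleted before deleting it.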
How do you deal with places with thousands of remote endpoints??
Call the remote people in, deputize anyone who can work a command line, and prioritize the important stuff.
That’s one of those situations where they need to immediately hire local contractors to those remote sites. This outage literally requires touching the equipment. lol
I’d even say, fly out each individual team member to those sites… but even the airports are down.
Lots of immediate short term contract hires for that, I imagine.
I think sysadmins union should be created today
lol
too bad me posting this will bump the comment count though. maybe we should try to keep the vote count to 404
I can only see 368 comments rn, there must be some weird-ass puritan server blocking .ml users. It’s not beehaw as I can see comments from there.
I can only conclude that it is probably some liberals trying to block “Tankies” and no comment of value was lost.
Linux and Mac just got free advertisement.
The words ‘Mac’ and ‘free’ aren’t allowed in the same sentence.
Also, enterprise infrastructure running on a Mac? It’s something I’ve never heard of in over a decade and a half of working in tech. And now I’m curious. Is it a thing?
I’ve never heard of Macs running embedded systems - I think that would be a pretty crazy waste of money - but Mac OS Server was a thing for years. My college campus was all Mac in the G4 iMac days, running MacOS Server to administer the network. As far as I understand it was really solid and capable, but I guess it didn’t really fit Apple’s focus as their market moved from industry professionals to consumers, and they killed it.
Depending on what your definition of “enterprise” is, I’ve attended what was at the time a fairly large and prestigious art school that ran everything on Macs.
They even preferred that we didn’t bring windows laptops, although after some… rather intense protests by pretty much anyone under 25 we did get to bring our own peripherals.
Edit: I’ll also add that outside the shitty keyboards and mice, the server system they had set up with our accounts on etc was completely fine.
Never had a single issue with it and it was my first ever touching a Mac.
When you say ran everything, what kinda stuff?
This is almost 20 years ago today, so my memory is a bit hazy, but basically each student had an account with a certain amount of server space. I can’t remember the size, but given the amount of digital files we produced it would’ve been at minimum 500GB+/student. We could also “see” the account folder for everyone else in our class for file sharing and stuff.
There were also accounts/folders for each teacher which were used to turn in the primary copy of whatever assignment we had done if it was in digital form. Physical art was scanned or photographed also, as a sort of backup. We were also required to back every project up via USB sticks, ofc.
There was also a rack with individual docks for each digital camera that they had which allowed us to get our photographs transferred to our own folders. Since we could access those files from our accounts it also was a part of that server system.
There were also several networked and customised Macs used for single tasks, like larger printing projects and also for an airgapped paintgun for a lack of a better description. We avoided having to wear masks when we printed large sheets in single colours with it, for example. I have no idea what software that thing used, I think I used it like once or twice.
Now, I’ll freely admit that I haven’t touched a Mac since I left that school, and I’ve never had any interest in them whatsoever, so I don’t know what they used or if it even exists anymore. Someone with more knowhow maybe does?
I do remember them specifically (proudly) telling us it all ran on Macs, otherwise I probably wouldn’t have any reason to believe so. The “server room” was basically what looked like a glorified closet with a rack and a couple of Macs that didn’t look like the ones we students used. This was just before the all-in-one models were introduced, iirc.
Heard linux systems had a similar issue a few months ago
Yeah the more I read it seems some crucial security software deployed a bad patch.
I’m sure whatever OS would dominate would face widespread issues just by the numbers.
But to be fair it’s fun to make fun of Windows/MSFT :p
crowdstrike sent a corrupt file with a software update for windows servers. this caused a blue screen of death on all the windows servers globally for crowdstrike clients, even people in my company. luckily i shut off my computer at the end of the day and missed the update. It’s not an OTA fix. they have to go into every data center and manually fix all the computer servers. some of these servers have encryption. I see a very big lawsuit coming…
they have to go into every data center and manually fix all the computer servers.
Jesus christ, you would think that (a) the company would have safeguards in place and (b) businesses using the product would do better due diligence. Goes to show there are no grown ups in the room inside these massive corporations that rule every aspect of our lives.
I’m calling it now. In the future there will be some software update for your electric car, and due to some jackass, millions of cars will end up getting bricked in the middle of the road where they have to manually be rebooted.
Laid off one too many persons, finance bros taking over
I work for one of these behemoths, and there are a lot of adults in the room. When we began our transition off the prior, well known corporate AV, I never even heard of crowd strike.
The adults were asking reasonable questions: why such an aggressive migration timeline? Why can’t we have our vendor recommended exclusion lists applied? Why does this need to be installed here when previously agentless technologies were sufficient? Why is crowd strike spending monies on a Superbowl ad instead of investing back into the technology?
Either something fucky is afoot, as in this was mandated to our higher ups to make the switch (why?), or, as is typically the case, the decision was made already and this ‘due diligence’ is all window dressing to CYA.
Who gives a shit about fines on SLAs if your vendor is going to foot the bill.
Why does this need to be installed here when previously agentless technologies were sufficient
As someone who works in offensive Cybersecurity doing Red Teamings, where most of my job is to bypass and evade such solutions, I can say that bypassing agentless technologies is so much easier than agented ones. While you can access most of the logs remotely, having an agent helps enormously with catching 0-day malware, since you can scan memory (that one is a bitch to bypass and usually how we get caught), or hook syscalls which you can then correlate.
Oh, an unknown unsigned process just called RWX memory allocation, loaded a crypto binary, and spawned a thread in another process that’s trying to execute it? Better scan that memory and see what it’s up to. That is something you cannot do remotely.
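To make the "hook syscalls and correlate" idea a bit more concrete, here is a toy sketch of that kind of rule. Everything in it is hypothetical: the event names, the dict layout, and the function name are made up for illustration, and a real agent works on kernel-level telemetry with far more context than this.

```python
from collections import defaultdict

# Hypothetical event kinds, in the order described in the comment above.
SUSPICIOUS_CHAIN = ("rwx_alloc", "crypto_load", "remote_thread")


def flag_for_memory_scan(events):
    """Return the set of pids that should get a memory scan.

    `events` is an iterable of dicts with keys "pid", "signed" and "kind".
    A pid is flagged when an unsigned process produces every step of
    SUSPICIOUS_CHAIN in order (other events may occur in between).
    """
    progress = defaultdict(int)  # pid -> number of chain steps seen so far
    flagged = set()
    for event in events:
        if event["signed"]:
            continue  # this toy rule only looks at unsigned processes
        pid = event["pid"]
        step = progress[pid]
        if step < len(SUSPICIOUS_CHAIN) and event["kind"] == SUSPICIOUS_CHAIN[step]:
            progress[pid] = step + 1
            if progress[pid] == len(SUSPICIOUS_CHAIN):
                flagged.add(pid)
    return flagged
```

The point of the sketch is only that no single event is damning on its own; it is the sequence within one process that triggers the closer look, and that sequence is only visible to something running on the host.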
Insane that these people are the ones making the decisions
A few years ago when my org got the ask to deploy the CS agent in linux production servers and I also saw it getting deployed in thousands of windows and mac desktops all across, the first thought that came to mind was “massive single point of failure and security threat”, as we were putting all the trust in a single relatively small company that will (has?) become the favorite target of all the bad actors across the planet. How long before it gets into trouble, either because of its own doing or due to others?
I guess that we now know
No bad actors did this, and security goes in fads. Crowdstrike is king right now, just as McAfee/Trellix was in the past. If you want to run around without edr/xdr software be my guest.
All of the security vendors do it over enough time. McAfee used to be the king of them.
https://www.zdnet.com/article/defective-mcafee-update-causes-worldwide-meltdown-of-xp-pcs/
My dad needed a CT scan this evening and the local ER’s system for reading the images was down. So they sent him via ambulance to a different hospital 40 miles away. Now I’m reading tonight that CrowdStrike may be to blame.
We had a bad CrowdStrike update years ago where their network scanning portion couldn’t handle a load of DNS queries on start up. When asked how we could switch to manual updates we were told that wasn’t possible. So we had to black hole the update endpoint via our firewall, which luckily was separate from their telemetry endpoint. When we were ready to update, we’d have FW rules allowing groups to update in batches. They since changed that but a lot of companies just hand control over to them. They have both a file system and network shim so it can basically intercept **everything**
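The batching part of that workaround is easy to sketch. This is not the actual firewall config, just an illustration of splitting hosts into waves so that only one wave at a time is allowed to reach the update endpoint; the function names and the wave numbering are assumptions of mine.

```python
from itertools import islice


def batches(hosts, size):
    """Yield successive lists of at most `size` hosts."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(hosts)
    while chunk := list(islice(iterator, size)):
        yield chunk


def rollout_plan(hosts, size):
    """Map wave number (starting at 1) to the hosts unblocked in that wave."""
    return dict(enumerate(batches(hosts, size), start=1))
```

For example, `rollout_plan(["a", "b", "c", "d", "e"], 2)` gives three waves: `["a", "b"]`, then `["c", "d"]`, then `["e"]`. If wave 1 falls over, the rest never get the update.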
I’m so exhausted… This is madness. As a Linux user I’ve been busy all day telling people with bricked PCs that Linux is better but there are just so many. It never ends. I think this outage is going to keep me busy all weekend.
🙄 and then everyone clapped
A month or so ago a crowdstrike update was breaking some of our Linux vms with newer kernels. So it’s not just the os.
How? I’m really curious to learn.
Crowdstrike bricked networking on our linuxes for quite a few versions.
I don’t know how on either one. I just know it happened.
You’re the comment I came looking for. You get a standing ovation or something.