Max Kilger is a University of Texas at San Antonio professor in the departments of management science and statistics and information systems and cybersecurity. He is a core faculty member of the university’s school of data science.
University of Texas at San Antonio
In the early hours of Friday, as airline workers, baristas and programmers were faced with blank blue screens, University of Texas at San Antonio professor Max Kilger was already piecing together what led to the biggest information technology outage in history.
When news broke that a bug in a security software update was responsible, Kilger wasn’t surprised. And he thinks the global outage is an important reminder of how quickly technological failures can lead to major economic and societal fallout.
Kilger, who has a doctorate in social psychology, is a core faculty member in the UTSA school of data science and in the department of information systems and cyber security. He spoke with the San Antonio Express-News about the Microsoft outage, what it means and what we can expect in the coming days. The following has been edited for clarity and length.
Q: Can you take us through a high-level survey of what happened and the effects so far?
A: Basically, CrowdStrike is a major supplier of information security software to enterprise solutions for corporations and the government as well. Early (Friday) morning, they had sent an update, configuration or channel update to all of the machines that use their software (hundreds of thousands, probably millions of machines), and there was a small bug in it.
That bug basically caused the computers to do what’s called “the blue screen of death” on Windows machines only. The machines crashed. And every time they would try to start it back up, it would just crash again. So it had a major global effect. Analysts that I have talked with say it’s probably the largest global IT outage in history.
Q: Why is it affecting such a broad range of industries, from relatively minor inconveniences like mobile ordering at Starbucks to major issues like planes grounded around the world?
A: CrowdStrike is a major, major supplier of cybersecurity software, and it’s used in all sorts of industries all over the globe: airlines, railways, health systems, Starbucks, television stations. It was an incredibly significant outage.
Q: From what you can tell, how far are we from getting back to normal?
A: It’s a huge outage, and it’s going to take a while, even after you get the fix. So, for example, the airline system: you have tens of thousands of people in airports all over the country and around the world who are sitting there going, “Now what?” It causes this huge sort of domino effect for some industries like the airline industry.
Q: You mentioned that this is probably the largest such outage in history. Is there anything you’re aware of that has happened that even approaches this in similarity?
A: I can’t think of anything that even begins to touch this.
Q: When you heard about this, were you surprised to learn that it was an error, rather than an attack?
A: I wasn’t surprised, as I was beginning to hear things on the grapevine, as the scope and the magnitude of the outage became apparent. I’m going, “Yeah, probably not a cyberattack. It sounds like some major vendor has made some mistake.”
Q: I imagine updates are being run all the time overnight. So is it usually going to be early in the morning when complications become obvious?
A: Usually, they spend a number of months making these updates and then rigorously test them on machines to make sure that there aren’t any issues with different operating versions and different hardware and things. So they’re usually very, very careful before they do this kind of release. I was kind of surprised that it sounds like they just went for it, for the whole thing only once, instead of, perhaps, updating a certain sector or a certain set of customers. It’s not really clear yet exactly what happened and how much testing was done before it was released.
Q: Is there anything that you see as an important takeaway for the average reader watching this situation unfold?
A: It just echoes everywhere. One of the things that cybersecurity experts need to think about is: How can we maybe spread this risk across a wider landscape so that we don’t have these huge failures? It’s a really tough ask.
For the average American, it’s a bit of a wake-up call to suddenly discover that services and resources and things they depend upon are subject to disruption and can not function like they’re supposed to. So, hopefully, that is going to focus both the public’s attention as well as industry’s attention and government policymakers’ attention on investing more in cybersecurity infrastructure and policies and regulation to help make sure that these kinds of things have less of an impact in the future.
Because this is not the last one. We’re going to see it again.