Event Thinking: What it is and why you need it
Article by SolarWinds head geek Leon Adato
2019 brings with it a host of new arrivals to IT’s buzzword bingo sheet, including this one: “event thinking.” Yes, this relates to IT management, and no, it doesn’t have anything to do with Coachella. In fact, Gartner’s pundits believe that by 2020, event thinking will revolutionise the way in which businesses handle IT, with 80% of digital solutions requiring it to succeed. That sounds like a lot of incoming disruption until you realise that “event thinking” isn’t something fundamentally new for most IT pros.
The term refers to what Gartner also calls “real-time situational awareness”: systems immediately detecting and responding when certain conditions are met. Any set of such conditions can count as an event. Every IT pro deals with these, en masse, as part of their everyday job; SNMP traps and syslog alerts are good examples of what events look like today. They’re warning signs that reveal something may be wrong, or at least worth investigating. But as an organisation brings more and more digital solutions into play, it’ll need to watch for increasingly complex and subtle events, the sort that even the most skilful IT pros may struggle to spot without assistance.
No Syslog, Sherlock!
Traditionally, events existed within clearly defined parameters, limited to individual devices or systems. If your server flashed a disk error, you fixed the server, and that was about as complex as it got. Today, however, IT pros look after numerous interconnected, integrated systems, some under their direct control and some not. They also face increasingly sophisticated threats and risks to these networks of systems, where an attack or imminent instability may not sound alarm bells on any one system.
That requires IT pros to act more like detectives than traffic controllers: correlating feedback from different platforms and systems and identifying when the sum of that feedback might indicate a risk to the organisation’s infrastructure. Event thinking, in that sense, requires a different mental model. You’re applying a new level of critical thinking, analysis, and suspicion to what your systems tell you, looking for events that can be made up of tens or even hundreds of otherwise innocuous indicators. It’s anything but elementary.
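To make the detective work concrete, here is a minimal sketch of cross-system correlation. It is an illustration only, not how any particular monitoring product works: the Indicator type, the correlate function, the five-minute window, and the three-system threshold are all assumptions chosen for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Indicator:
    """One low-level signal from one system (hypothetical structure)."""
    timestamp: float  # seconds since an arbitrary starting point
    source: str       # illustrative system name, e.g. "firewall"
    message: str


def correlate(indicators, window_seconds=300.0, min_sources=3):
    """Group indicators into composite events.

    A composite event is reported when at least `min_sources` distinct
    systems have reported something within `window_seconds` of each other.
    Both thresholds are assumed values for illustration.
    """
    window = deque()
    composite_events = []
    for ind in sorted(indicators, key=lambda i: i.timestamp):
        window.append(ind)
        # Drop indicators that have aged out of the time window.
        while ind.timestamp - window[0].timestamp > window_seconds:
            window.popleft()
        sources = {i.source for i in window}
        if len(sources) >= min_sources:
            composite_events.append(list(window))
            window.clear()  # start fresh so one burst is reported once
    return composite_events
```

Fed a firewall warning, a VPN warning, and a database warning that arrive within a couple of minutes of one another, this sketch reports a single composite event, even though none of the three would raise an alarm on its own.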
Don’t Get Caught Up in the Past
IT pros need to adopt event thinking because if they don’t, they’ll ultimately overlook risks and threats that, unless picked up early, could tear their digital infrastructure apart. But shifting mindsets and changing processes doesn’t come easily, especially when doing so could pile extra work onto already-stretched monitoring teams. IT leaders should keep three things in mind as they redefine what an event is and how they look out for one.
The first is the difference between events and metrics. Events indicate what’s happening right now, in real time, with implications for present operations. Metrics (your standard read-outs collected from systems on a regular basis, like performance logs) tell you what has happened in the past, helping you identify what “normal” operation looks like in the present and future. Metrics play a critical role in the health of any IT infrastructure, but they won’t save you from real-time threats or unexpected risks to your systems. With that in mind, IT leaders should ensure their teams have tools in place that use the context offered by metrics to identify potential events.
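As a rough illustration of metrics supplying the context for events, the sketch below treats a live reading as a potential event when it strays far from the baseline established by past samples. The is_event function and the three-standard-deviation threshold are assumptions made for this example, not a recommendation from the article; real platforms typically use richer baselining.

```python
from statistics import mean, stdev


def is_event(history, current, threshold=3.0):
    """Return True if `current` deviates sharply from past metric samples.

    `history` holds historical readings (the metrics); `current` is the
    live reading being judged. The threshold of 3 standard deviations is
    an assumed value for illustration.
    """
    if len(history) < 2:
        return False  # not enough history to know what "normal" is
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # Perfectly flat history: any change at all is unusual.
        return current != baseline
    return abs(current - baseline) / spread > threshold
```

With CPU readings that have hovered around 40%, a sudden reading of 95% would be flagged, while a reading of 42% would not.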
The second is the need for solutions that can separate an event’s signal from the surrounding noise. The ability to absorb multiple data types, deduplicate redundant information, and automatically flag events based on ever-changing criteria will help IT pros deal with the growing subtlety of events in increasingly complex “systems of systems.” Event thinking may well get a hand from AI in the future, but for now most IT leaders should focus on robust system and network monitoring platforms that can help them not only detect these less visible events but also guide them towards more relevant event criteria as systems, platforms, and their risks evolve over time.
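Deduplication is the simplest of these noise-reduction steps to picture. The sketch below assumes alerts arrive as (timestamp, source, message) tuples already sorted by time, and suppresses repeats of the same source and message that arrive within a minute of the last one kept. The deduplicate function, the tuple layout, and the 60-second interval are all hypothetical choices for illustration.

```python
def deduplicate(alerts, suppress_seconds=60.0):
    """Drop repeated alerts from the same source with the same message.

    `alerts` is an iterable of (timestamp, source, message) tuples,
    assumed to be sorted by timestamp. A repeat is dropped if it arrives
    less than `suppress_seconds` after the last kept copy.
    """
    last_kept = {}  # (source, message) -> timestamp of last kept alert
    kept = []
    for timestamp, source, message in alerts:
        key = (source, message)
        previous = last_kept.get(key)
        if previous is None or timestamp - previous >= suppress_seconds:
            kept.append((timestamp, source, message))
            last_kept[key] = timestamp
    return kept
```

The same message from a different device is kept, since it may be a separate clue rather than a redundant one.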
Thinking in Events: Is This a Triumph?
The final, most important, and probably most difficult point is that event thinking remains a second-class buzzword. It lacks the lustre of more glamorous terms like blockchain, edge computing, or machine learning, and it takes much longer to bear fruit than methodology-based buzzwords like DevOps. Yet IT leaders have a responsibility to at least try to convince the business that event thinking really does matter and that, in fact, the business needs it to survive. Without it, the business will learn of systemic cybersecurity threats, flaws in software architecture, or latent performance degradation only when a breach or outage shuts operations down.
Many IT pros will hold fond memories of the phrase “now you’re thinking with portals,” from the video game Portal, which requires players to radically rethink how they move through space to solve seemingly impossible problems. Here, there’s no homicidal AI goading them to start “thinking in events,” only the promise of averting major digital crises before they occur. But in a similar fashion, the fundamental rules of the game remain unchanged: stay alert, look for the bigger picture, and don’t get caught with your logs down.