GenAI poised to overhaul the observability arena, ServiceNow says

  • Like many segments, observability is poised to be transformed by GenAI

  • Major potential lies in the technology's ability to filter and sort data to pinpoint issues faster and speed incident response

  • But experts say humans will still need to stay in the loop, as some root causes simply can't be identified by software

Generative artificial intelligence (GenAI) is quickly eclipsing observability in terms of how broadly it’s being applied, ServiceNow VP and GM of Cloud Observability Ben Sigelman told Silverlinings. But the two technologies together present a promising future for holistically observing cloud environments, he added.

Across incident response, data interpretation and overall control of system monitoring, GenAI can supercharge the observability space and “completely revolutionize how human beings interact with these really complex systems,” Sigelman explained. “They'll be woken up to fewer false positives, and they’ll also have a much easier time getting a bird's-eye view of what’s going on, which is really the first step in incident response.”

One of the biggest opportunities for GenAI in this space, according to Sigelman, is to swiftly summarize what can often be a confusing and overwhelming number of signals — which a human can then contextualize.

“Where monitoring and observability solutions were originally just positioned as a way of getting data,” in the last decade “that's been largely replaced by open source,” he said. Sigelman highlighted the OpenTelemetry project, an open-source initiative he co-founded that aims to simplify observability in cloud-native environments, as an example of this trend.
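For readers unfamiliar with the project, the sketch below shows roughly what OpenTelemetry instrumentation looks like using its Python SDK. It is illustrative only: the service name, span name and attribute are hypothetical, and a real deployment would export telemetry to a collector or observability backend rather than the console.

```python
# A minimal, hypothetical sketch of OpenTelemetry instrumentation in Python,
# assuming the opentelemetry-sdk package is installed.
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Configure a tracer provider that exports spans; in production the exporter
# would typically ship data to a vendor-neutral collector instead of stdout.
provider = TracerProvider()
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer("checkout-service")  # hypothetical service name

def handle_checkout(order_id: str) -> None:
    # Each unit of work is recorded as a span; attributes carry the context
    # an engineer (or a summarization tool) can later use during an incident.
    with tracer.start_as_current_span("handle_checkout") as span:
        span.set_attribute("order.id", order_id)
        # ... business logic would go here ...

if __name__ == "__main__":
    handle_checkout("order-123")
```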

“Solutions are shifting more towards a unification [and] consolidation story where, instead of focusing on getting the data, they're focused on bringing it together and analyzing it in some kind of coherent way,” Sigelman continued.

But there's still “a very real problem” when it comes to reducing the time to resolution for incidents that span many different domains. Enter GenAI.

Making sense of it all

A large language model (LLM) can dramatically improve the signal-to-noise ratio during incident response.

“The opportunity to take a somewhat noisy signal of potential insights, look at that holistically and reduce it down to events that a human might actually care about… that's an area where LLMs do have a very significant part to play,” especially in emergency scenarios, he explained.
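As a rough illustration of that workflow — not something drawn from ServiceNow or the interview — the sketch below feeds a handful of hypothetical alerts to an LLM and asks for a short triage summary, assuming the OpenAI Python SDK and an API key in the environment. The alert data, model choice and prompt are all illustrative.

```python
# Hypothetical sketch: compress noisy alerts into a short incident summary
# with an LLM, assuming the OpenAI Python SDK (pip install openai) and an
# OPENAI_API_KEY set in the environment.
from openai import OpenAI

client = OpenAI()

# Illustrative alert feed; in practice this would come from a monitoring system.
alerts = [
    "CPU > 90% on checkout-7f2 for 12m",
    "p99 latency 4.8s on /api/checkout (baseline 300ms)",
    "Error rate 7% on payment-gateway, mostly 502s",
    "Disk usage 58% on checkout-7f2 (within normal range)",
]

prompt = (
    "You are an SRE assistant. Summarize these alerts into the 2-3 events "
    "an on-call engineer should actually care about, and note likely links "
    "between them:\n" + "\n".join(f"- {a}" for a in alerts)
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": prompt}],
)

# The model's summary becomes the bird's-eye view a responder starts from;
# a human still decides what to do with it.
print(response.choices[0].message.content)
```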

Sigelman stressed that no human is getting replaced by GenAI, as sometimes “the ultimate root cause [of an incident] can be very surprising.”

He recalled a memorably outlandish example of a serious incident during his time at Google. The root cause? A shark biting a cable at the bottom of the Atlantic Ocean. “You're not going to build software that's going to tell you that,” he said. Still, “there's an opportunity to consolidate and integrate into something greater than the sum of its parts, mainly as a communication tool.”

Separating the wheat from the chaff

Sigelman said 2023 was a “more interesting year than most for the development of any data-based technology,” thanks to what he referred to as both “genuine innovations” and “marketing thrash” around AI. “It's always a combination of the two.”

Marketing thrash indeed accompanied AI’s mainstream emergence over the last year, with businesses “slapping an AI moniker on whatever their product lines” were, as Cockroach Labs CEO Spencer Kimball described to Silverlinings in a recent interview.

“It's in a very early stage of the hype cycle,” Thomas King, CTO of internet exchange operator DE-CIX, told Silverlinings, expressing a similar sentiment.

But in separating the wheat from the chaff, valuable developments did begin materializing in 2023. As an example, data management platform Elastic launched its AI assistant for observability to help in incident response scenarios like those described by Sigelman.

King believes value will really begin to take shape next year as AI becomes “productized.” Moving beyond hype and broad public services like ChatGPT, “we will see real products… I think that will be the year 2024.”

Sigelman has no doubt GenAI will revolutionize observability, but amid the technology's unprecedented pace of adoption this year, he has observed people “conflating a lot of different things with the same word, which is GenAI or LLMs.”

He concluded that one of the most important things the industry can do in 2024 is to be clear with its terminology, avoid speaking broadly about GenAI and “give people the language they need to compare solutions.”