Dynatrace today announced its artificial intelligence (AI) Observability extensions for large language models (LLMs). Delivered on the company’s unified observability platform, the extensions are designed to address the challenges emerging on the frontiers of generative AI (genAI), bringing better observability to these workloads and, in turn, helping control costs. There are many indications that genAI expenses are difficult to tame.

GenAI hardware and infrastructure are expensive, said Bob Wambach, VP of product marketing at Dynatrace. “If you bought it and you're not using it, then that's not giving any value. If you're using it, but you're not getting value out of AI, that's not any better than not using it,” he said.

To answer questions about how effectively teams are using the complex infrastructure involved in genAI, Wambach said, they must observe and analyze the value and efficiency of the applications they build on top of that infrastructure.

The need for observability spans from base GPUs up to the application layer, he continued.

Dynatrace’s AI observability extensions, tailored to handle complex LLM ecosystems, were front and center at the software observability and security vendor’s annual Dynatrace Perform 2024 conference in Las Vegas.

GenAI tokens observed

Dynatrace’s AI observability spans the end-to-end development process, and benefits from the company’s established efforts to support AIOps and MLOps.

LLM application observability software allows costs to be measured on the basis of LLM token consumption, an increasingly common means of gauging genAI processing. At the same time, Dynatrace AI Observability monitors semantic caches and vector databases, which are finding wide use in new AI deployments. The software supports extensions to Google AI Platform, Amazon SageMaker and other development environments.
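
As a rough illustration of how token-based cost accounting works, the sketch below tallies spend from per-call token counts. The price table, model names and call data are hypothetical placeholders, not Dynatrace’s implementation or any provider’s actual rates.

```python
# Minimal sketch of token-based LLM cost accounting (illustrative only).
# The per-token prices and model names below are hypothetical placeholders,
# not Dynatrace's or any provider's actual rates.
from dataclasses import dataclass

# Hypothetical price table: (input cost, output cost) in USD per 1,000 tokens.
PRICE_PER_1K_TOKENS = {
    "example-small-model": (0.0005, 0.0015),
    "example-large-model": (0.01, 0.03),
}

@dataclass
class LLMCall:
    model: str
    prompt_tokens: int       # tokens sent to the model
    completion_tokens: int   # tokens generated by the model

def call_cost(call: LLMCall) -> float:
    """Estimate the cost of a single LLM call from its token counts."""
    in_price, out_price = PRICE_PER_1K_TOKENS[call.model]
    return (call.prompt_tokens / 1000) * in_price + (call.completion_tokens / 1000) * out_price

# Aggregate cost across observed calls, e.g. pulled from request logs or traces.
calls = [
    LLMCall("example-small-model", prompt_tokens=1200, completion_tokens=300),
    LLMCall("example-large-model", prompt_tokens=800, completion_tokens=500),
]
total = sum(call_cost(c) for c in calls)
print(f"Estimated spend across {len(calls)} calls: ${total:.4f}")
```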

Wambach indicated that observability software is taking on a greater role today, as LLM-based and other AI proofs of concept move to production. Data quality becomes an acute need in this regard, he said: as teams continue to develop and evolve these applications, they need to ensure they are constantly feeding them with the best data available.

Those goals are reflected in updates to the Dynatrace Data Observability platform, which measures data “freshness” and monitors outliers from expected dataset values, as well as related usage for servers, networking and storage.
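
The two checks named here, freshness and outlier detection, can be sketched in a few lines. The staleness threshold, z-score rule and sample data below are illustrative assumptions, not the Dynatrace Data Observability platform’s actual logic.

```python
# Minimal sketch of two data observability checks: dataset "freshness" and
# outlier detection. Thresholds and sample data are illustrative assumptions.
from datetime import datetime, timedelta, timezone
from statistics import mean, stdev

def is_stale(last_updated: datetime, max_age: timedelta = timedelta(hours=1)) -> bool:
    """Flag a dataset as stale if it has not been refreshed within max_age."""
    return datetime.now(timezone.utc) - last_updated > max_age

def outliers(values: list[float], z_threshold: float = 3.0) -> list[float]:
    """Return values more than z_threshold standard deviations from the mean."""
    if len(values) < 2:
        return []
    mu, sigma = mean(values), stdev(values)
    if sigma == 0:
        return []
    return [v for v in values if abs(v - mu) / sigma > z_threshold]

# Example: a feed that should refresh hourly, with values normally near 100.
last_refresh = datetime.now(timezone.utc) - timedelta(hours=3)
readings = [100 + 0.5 * i for i in range(20)] + [412.0]  # 412.0 is an injected anomaly

print("stale:", is_stale(last_refresh))  # True: the feed is 3 hours old
print("outliers:", outliers(readings))   # [412.0]
```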

Also at Dynatrace Perform 2024, the company released OpenPipeline for managing petabyte-scale data ingestion on its platform. It allows teams to ingest and route observability, security and business events data from sources such as OpenTelemetry and Dynatrace OneAgent.
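
For a sense of the kind of telemetry such a pipeline ingests, the sketch below emits a trace span with the open-source OpenTelemetry Python SDK. It uses a console exporter purely for illustration; in practice an OTLP exporter would be configured to point at the chosen backend, and the span attributes shown are illustrative rather than a standard schema.

```python
# Minimal sketch of emitting OpenTelemetry trace data of the kind an
# ingestion pipeline could consume. Uses the open-source opentelemetry-sdk
# with a console exporter for illustration only.
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor, ConsoleSpanExporter

# Identify the emitting service; "service.name" follows OTel semantic conventions.
provider = TracerProvider(resource=Resource.create({"service.name": "genai-demo-service"}))
provider.add_span_processor(BatchSpanProcessor(ConsoleSpanExporter()))
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)

# Record one unit of work as a span, with attributes a backend could aggregate.
# The attribute keys here are illustrative, not a standard schema.
with tracer.start_as_current_span("llm.completion") as span:
    span.set_attribute("llm.model", "example-large-model")
    span.set_attribute("llm.prompt_tokens", 1200)
    span.set_attribute("llm.completion_tokens", 300)
```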

According to Wambach, Dynatrace is tapping its own experiences building out AI, real-time data ingestion and data lakehouses as part of its work to build its observability and security platform. Central pieces are the Davis AI engine, which performs root cause analysis, anomaly detection and task automation, and Grail, which supports unified storage for logs, metrics, traces and events.

“We’ve been AI experts for over a decade,” Wambach said. “People have depended upon our causal AI and predictive AI to very quickly identify, resolve and prevent problems.”

Monitoring power-hungry processing

Dynatrace has also announced it is working to improve measurement of carbon emissions related to IT operations. The company said it is collaborating with customer Lloyds Banking Group to further develop its Dynatrace Carbon Impact offering, which works with its topology and dependency mapping tools to establish benchmarks for green coding initiatives.

The software translates operations metrics, including processing, memory, disk and network I/O, into CO2-equivalent measures. This has clear import for emerging genAI efforts that have been criticized for prodigious energy use.
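
The general shape of such a translation can be sketched as a simple conversion from utilization to energy to emissions. The power coefficients and grid carbon intensity below are illustrative placeholders, not the Carbon Impact model.

```python
# Minimal sketch of translating utilization metrics into a CO2-equivalent
# estimate. The power coefficients and grid emission factor are illustrative
# placeholders, not Dynatrace's Carbon Impact model.

# Assumed average power draw attributable to each resource, in watts.
WATTS_PER_CPU_CORE = 10.0
WATTS_PER_GB_MEMORY = 0.4
WATTS_PER_TB_DISK = 6.0

# Assumed grid carbon intensity in kilograms of CO2e per kWh.
KG_CO2E_PER_KWH = 0.4

def co2e_kg(cpu_cores: float, memory_gb: float, disk_tb: float, hours: float) -> float:
    """Estimate CO2-equivalent emissions (kg) for a workload over a time window."""
    watts = (cpu_cores * WATTS_PER_CPU_CORE
             + memory_gb * WATTS_PER_GB_MEMORY
             + disk_tb * WATTS_PER_TB_DISK)
    kwh = watts * hours / 1000  # convert watt-hours to kilowatt-hours
    return kwh * KG_CO2E_PER_KWH

# Example: an 8-core, 32 GB host with 2 TB of disk running for a day.
print(f"{co2e_kg(cpu_cores=8, memory_gb=32, disk_tb=2, hours=24):.3f} kg CO2e")
```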

The power-hungry nature of today’s LLM-driven AI was underscored, for example, by news late last year of a Carnegie Mellon University and Allen Institute for AI study, which found that some carbon-intensive AI image-generation models emit as much CO2 as is required to drive an average gasoline-powered car more than 4 miles.

Dynatrace’s Carbon Impact software has uses beyond AI apps, the company said, supporting customers’ hybrid and multicloud environments as carbon consumption and power utility costs join the observability metrics to be monitored in real time.

Meanwhile, the road to transformative AI applications is not going to build itself, though some genAI advocates see that capability as just around the corner. For now, the work of IT shops resolutely centers on getting useful output from LLMs, an effort that remains as much art as science. The balance will shift toward science as efforts like Dynatrace’s enter the space.