Cribl launched a new data lake as an alternative to traditional data lake solutions whose complex, expensive and time-consuming management processes hinder IT teams.

Cribl Lake helps IT teams collect, analyze and route IT and security data from across the enterprise, giving them a complete view of it. “Data lakes act as a central repository, making them vulnerable to attacks. Storing sensitive data without proper access controls or masking can result in data breaches and security incidents,” former Gartner analyst and Cribl Senior Director of Market Strategy and Competitive Intelligence Nick Heudecker told SDxCentral.

IT and security data comes in a variety of shapes and formats, making it difficult to structure. The issue will only be exacerbated as data variety and volume increase, particularly with the rise of artificial intelligence (AI) workload deployments.

“As the volume and complexity of data continues to escalate, enterprises need a data lake solution that consolidates large volumes of data from disparate sources to make it easy to share and retrieve for future use,” Enterprise Strategy Group (ESG) Senior Analyst Jon Brown said.

Disparate data types stored in separate places, paired with the nonstop flow of new data, create significant challenges in building and maintaining custom parsers, pipelines and dashboards.

To address those barriers, Cribl Lake enables IT teams to onboard significant volumes of data under specific security policies, using open formats that promote interoperability. Users can store raw, formatted, structured or unstructured data; unify search, security, compliance and regulatory policies; and reduce storage costs.

“This frees up crucial resources to look at what’s in the data, rather than dealing with headaches around storing and managing the data itself,” Heudecker said.

The data lake is a turnkey service for storing, managing and accessing data, and for enforcing policy on it. Because the data lake is built on open formats, no predetermined schema is required, and users can run queries without needing to move data. “Cribl handles the heavy lifting so data can easily be usable and valuable to the teams and tools that need it,” he said.
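For readers unfamiliar with the schema-on-read approach that open formats enable, here is a minimal sketch of the general idea using DuckDB in Python. The log path is hypothetical, and this is a generic illustration of querying raw data in place without a predefined schema, not Cribl's actual interface.

```python
# A minimal sketch of schema-on-read over open formats -- a generic
# illustration, not Cribl's API. The schema is inferred at query time
# from the raw files themselves; nothing is loaded or transformed first.
import duckdb

con = duckdb.connect()

# "logs/*.json" is a hypothetical local path; with DuckDB's httpfs
# extension, the same query could point at an S3 bucket instead.
rows = con.sql("""
    SELECT level, count(*) AS events
    FROM read_json_auto('logs/*.json')
    GROUP BY level
    ORDER BY events DESC
""").fetchall()

print(rows)
```

The point of the pattern is that the query engine discovers the structure at read time, so new or oddly shaped data can be stored first and interpreted later.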

Cribl Lake is designed to streamline workflows and ingest, store, route, replay and search data located across various environments. To that point, the service “complements and optimizes other data lake solutions out there,” Heudecker said. “Customers can choose to let Cribl manage it all for them, or store their data where they currently have it – S3 buckets, Azure Blob, etc.”

Cribl Lake also includes an integrated role-based access control (RBAC) system with dataset-level access controls. This feature allows IT leaders to configure each user’s access rights and other controls within the vendor’s Stream and Search capabilities. In addition, the data lake offers audit logs for monitoring data lake activity and identifying potential security breaches.
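The dataset-level pattern Cribl describes can be sketched in a few lines. The roles, dataset names and logger below are hypothetical; this illustrates dataset-scoped RBAC with audit logging in general, not Cribl's implementation.

```python
# A hypothetical sketch of dataset-level RBAC with audit logging --
# an illustration of the pattern, not Cribl's implementation.
import logging
from dataclasses import dataclass, field

logging.basicConfig(format="%(asctime)s AUDIT %(message)s", level=logging.INFO)
audit = logging.getLogger("audit")

@dataclass
class Role:
    name: str
    readable_datasets: set = field(default_factory=set)

# Hypothetical roles: analysts see only debug logs, admins see everything.
ANALYST = Role("analyst", {"debug_logs"})
ADMIN = Role("admin", {"debug_logs", "prod_logs"})

def can_read(role: Role, dataset: str) -> bool:
    allowed = dataset in role.readable_datasets
    # Every access attempt is audited, granted or denied, so unusual
    # activity can be spotted when reviewing the logs later.
    audit.info("role=%s dataset=%s allowed=%s", role.name, dataset, allowed)
    return allowed

can_read(ANALYST, "prod_logs")  # denied, and recorded in the audit log
```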

Lowering costs with data value-based tiered storage

Cribl claims its new data lake will help customers save money on data storage bills. “Cribl Lake stores low-value data for short-term or long-term retention, ensuring teams can still access and work with data,” Heudecker said.

High-value and frequently accessed data can be routed downstream and stored in security information and event management (SIEM) or application performance management (APM) tools, while less relevant or low-value data can be stored in Cribl Lake for long-term retention. “For example, low-value debug logs can be stored in Cribl Lake and accessed with Search, and only production logs get routed to downstream tools, like Splunk or Datadog, to save on costs,” he said.
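In code, this kind of value-based routing reduces to a small classification function. The sketch below is a hypothetical illustration of the idea, not Cribl Stream's routing syntax; the destination names and the definition of “value” are assumptions made for the example.

```python
# A hypothetical sketch of value-based routing: production logs go to an
# expensive, frequently searched tool; debug logs go to cheap lake storage.
# This illustrates the strategy, not Cribl Stream's configuration syntax.
HIGH_VALUE_LEVELS = {"error", "warn", "info"}

def route(event: dict) -> str:
    if event.get("env") == "prod" and event.get("level") in HIGH_VALUE_LEVELS:
        return "siem"   # e.g. Splunk or Datadog: high-value, queried daily
    return "lake"       # low-value: retained cheaply, searched on demand

events = [
    {"env": "prod", "level": "error", "msg": "payment failed"},
    {"env": "dev",  "level": "debug", "msg": "cache warm-up"},
]
for e in events:
    print(route(e), "<-", e["msg"])
```

The design choice is that routing happens before storage, so the expensive downstream tools only ever see the slice of data worth their per-gigabyte cost.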

This tiered storage data management strategy based on data value has long been popular with some of the vendor’s most advanced customers.

“This strategy recognizes that the value of data fluctuates over time and is dependent on a number of things, including the age of the data, its current relevance to the business, the likelihood of the need for access and regulatory/compliance requirements,” Heudecker said. “Low-value logs under normal operating conditions may become high-value logs in the event of an outage or attack.”