Young and hungry big data firm Databricks secured a $400 million Series F funding round and, in tandem, brought on a new chief financial officer: former Splunk CFO Dave Conte.

The latest round was led by returning investor Andreessen Horowitz’s Late-Stage Venture Fund, which has been on board since day one, having led the company's $14 million Series A in 2013 and its $250 million Series E in February.

According to Crunchbase, this round brings Databricks’ total funding to $897 million and more than doubles its valuation to a lofty $6.2 billion. The company went from zero to a $200 million revenue run rate in less than four years.

The new funding will go toward research and development and toward continued global expansion. The company plans to invest about $111 million in its new European data center in Amsterdam over the next three years.

Databricks appears to be the popular kid that customers just can't get enough of and that startups are itching to partner with. At this stage in the game, it's not a matter of whether the company will file for an initial public offering; it's a matter of when.

Open Pockets for Open Source

In the last week, Databricks contributed its open source software project Delta Lake to the Linux Foundation. The Delta Lake Project launched in April as a storage layer that sits on top of Spark SQL and Parquet files stored in the Databricks File System within data lakes to manage large data sets.

It has seen significant growth since joining the open source community earlier this year, with more than 4,000 organizations using it and more than 2 exabytes of data processed in the last month alone, a Databricks spokesperson told SDxCentral.

“We’re excited to foster even more growth by partnering with the Linux Foundation. We are joined by Alibaba, Booz Allen Hamilton, Intel, and Starburst in the announcement to develop Delta Lake support not just for Apache Spark, but also Apache Hive, Apache NiFi, and Presto,” the spokesperson added.

And just yesterday, the company announced a partnership with StreamSets that combines capabilities from StreamSets’ DataOps platform and Databricks’ Delta Lake, with the aim of easing the move into data lakes, accelerating analytics projects, and reducing cycle times.

The integrated service aims to provide insight beyond error diagnoses so users can understand the health of their Apache Spark projects at a granular level when errors occur.