Data management and processing company Databricks recently released two new data ingestion features: the Data Ingestion Network and the Databricks Ingest service. TechCrunch reported that the two features aim to make combining data warehouses and data lakes easier.
According to the Databricks blog, the new Data Ingestion Network “enables an easy and automated way to populate [customers’] lakehouse from hundreds of data sources into Delta Lake.” The Ingest service, meanwhile, includes an Auto Loader that lets users “incrementally ingest data into Delta Lake.”
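The core idea behind incremental ingestion is that each source file is processed exactly once: a ledger records which files have already been loaded, so repeated runs pick up only what is new. The sketch below is a minimal, hypothetical illustration of that idea in plain Python; the actual Auto Loader is a Spark Structured Streaming source and does not use these function or file names.

```python
import json
from pathlib import Path

def ingest_new_files(source_dir: str, ledger_path: str) -> list:
    """Illustrative incremental ingestion: process only files not yet
    recorded in a ledger, so each file is ingested exactly once.
    (Conceptual sketch only -- not the Databricks Auto Loader API.)"""
    ledger_file = Path(ledger_path)
    seen = set(json.loads(ledger_file.read_text())) if ledger_file.exists() else set()

    new_records = []
    for f in sorted(Path(source_dir).glob("*.json")):
        if f.name not in seen:
            # First time we see this file: load its records and remember it.
            new_records.extend(json.loads(f.read_text()))
            seen.add(f.name)

    # Persist the updated ledger so the next run skips these files.
    ledger_file.write_text(json.dumps(sorted(seen)))
    return new_records
```

Running the function twice over the same directory returns the new records the first time and an empty list the second, which is the behavior that makes incremental loads cheap compared with re-scanning an entire source.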
The Data Ingestion Network, built on Delta Lake, connects Databricks with partners including Fivetran, Qlik, Infoworks, StreamSets, and Syncsort. The goal of these partnerships is to load data from the partners’ sources into Delta Lake and let the system process it.
Delta Lake is the firm’s open-source data management project, overseen by the Linux Foundation. It offers “a new storage layer to data lakes that helps users manage the lifecycles of their data.” It also helps guarantee data quality through practices such as schema enforcement and transaction logging.
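Schema enforcement means a write is rejected if its fields or types do not match the table’s declared schema, preventing bad records from silently corrupting the data. The snippet below is a simplified, hypothetical illustration of that check in plain Python; Delta Lake itself enforces schemas at the storage layer via Spark, and none of these names come from its API.

```python
def validate_against_schema(record: dict, schema: dict) -> dict:
    """Reject a record whose fields or types don't match the declared
    schema -- the same idea as Delta Lake's schema enforcement.
    (Illustrative only; Delta Lake enforces this at the table level.)"""
    if set(record) != set(schema):
        extra_or_missing = sorted(set(record) ^ set(schema))
        raise ValueError(f"Field mismatch: {extra_or_missing}")
    for field, expected_type in schema.items():
        if not isinstance(record[field], expected_type):
            raise ValueError(
                f"Field {field!r} expected {expected_type.__name__}, "
                f"got {type(record[field]).__name__}"
            )
    return record
```

A record such as `{"id": 1, "name": "a"}` passes a schema of `{"id": int, "name": str}`, while one with a string `id` raises an error instead of being written, which is how enforcement keeps downstream consumers from encountering malformed rows.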
Databricks co-founder and CEO Ali Ghodsi said that most companies traditionally keep big data and structured data separate for their respective uses. He noted that this approach silos data in lakes and warehouses, leading to slow processing and incomplete results.
With the firm’s new ingestion systems, companies can take advantage of the warehouse’s reliability and the lake’s expansive scale on a case-by-case basis. The lakehouse combines the benefits of the two types of silos, letting customers enjoy the best of both worlds. The Ingest service, in turn, helps process the data uploaded to the system.
Bharath Gowda, vice president of product marketing, said that the new network and service will allow clients to analyze even their most recent information. Ultimately, this lets businesses become “more responsive” even as they gather new information.
Moreover, Gowda remarked that the new systems will let customers make better use of their structured and unstructured data when building more effective machine learning models.
Meanwhile, he noted that the new technology will not replace traditional functions such as conventional analytics. Instead, it enables companies to process a much larger portion of their data rather than only a small sample, making report generation more efficient.