Author: Data Virtuality

Logical Data Warehouse - the perfect combination of data federation, physical data integration, and a common query language, SQL

A modern data integration strategy employs what’s known as “best-fit engineering”, whereby each part of the data management infrastructure uses the most appropriate technology for its role, including storing data as dictated by business requirements and Service Level Agreements (SLAs). Unlike a data lake, this architecture takes a distributed approach, aligning the choice of information storage with information use and leveraging multiple data technologies that are each fit for a specific purpose. A hybrid approach can also significantly reduce costs and time to delivery when changes or additions to the warehouse are required.

One term for this new architecture is logical data warehouse. Another is virtual data lake. In either case, the premise is that there is no single data repository. Instead, the logical data warehouse is an ecosystem of multiple fit-for-purpose repositories, technologies, and tools that work together to manage data storage and provide performant enterprise analytical capabilities.

The original, and so far unfulfilled, analytical requirements of the traditional data warehouse were to retrieve data using a single query language, to get speedy query responses, and to quickly assemble different data models or views of the data to meet specific needs. By combining data federation, physical data integration, and a common query language (SQL), the logical data warehouse approach achieves all three of these goals without the need to copy or move all the data to a central location.
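To make the idea of one SQL query spanning several repositories concrete, here is a minimal sketch. It is not Data Virtuality's API: it uses SQLite's ATTACH DATABASE as a stand-in for a federation layer, and the schema names crm and shop, the tables, and the sample rows are all hypothetical.

```python
import sqlite3


def build_federated_connection():
    """Open one connection that can see two separate (hypothetical) stores."""
    conn = sqlite3.connect(":memory:")
    # Each ATTACH creates an independent in-memory database; together they
    # stand in for two different source systems behind one SQL interface.
    conn.execute("ATTACH DATABASE ':memory:' AS crm")
    conn.execute("ATTACH DATABASE ':memory:' AS shop")
    conn.execute("CREATE TABLE crm.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE shop.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO crm.customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.executemany(
        "INSERT INTO shop.orders VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )
    return conn


def revenue_per_customer(conn):
    """Join across both stores with a single SQL statement."""
    return conn.execute(
        """
        SELECT c.name, SUM(o.amount) AS revenue
        FROM crm.customers AS c
        JOIN shop.orders AS o ON o.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()


if __name__ == "__main__":
    print(revenue_per_customer(build_federated_connection()))
```

The point of the sketch is only that the query author writes one statement and never copies either table into the other store. A real logical data warehouse would additionally have to push work down to heterogeneous engines.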

Physical data integration is a robust feature of the logical data warehouse: it ensures fast query response by decoupling performance from the source data stores and shifting the workload to the logical data warehouse repository. In this manner, the effort-intensive physical transfer of the data is minimized and simplified, effectively removing lengthy data movement delays from the critical path of data integration projects.
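One way to picture this decoupling is selective materialization: the result of a query against a source is stored once in the warehouse's own repository, and later queries read that copy. The sketch below assumes SQLite as a stand-in for both the source and the repository. The names shop, orders, and orders_summary are hypothetical, and this is not a description of how Data Virtuality implements the feature.

```python
import sqlite3


def materialize_orders_summary():
    """Copy an aggregate from a source store into the local repository."""
    conn = sqlite3.connect(":memory:")  # "main" plays the warehouse repository
    conn.execute("ATTACH DATABASE ':memory:' AS shop")  # hypothetical source
    conn.execute("CREATE TABLE shop.orders (customer_id INTEGER, amount REAL)")
    conn.executemany(
        "INSERT INTO shop.orders VALUES (?, ?)",
        [(1, 120.0), (1, 80.0), (2, 50.0)],
    )
    # Physical integration step: only the aggregate that is needed is stored.
    conn.execute(
        """
        CREATE TABLE main.orders_summary AS
        SELECT customer_id, SUM(amount) AS revenue
        FROM shop.orders
        GROUP BY customer_id
        """
    )
    conn.commit()
    # Detach the source to show that later reads no longer depend on it.
    conn.execute("DETACH DATABASE shop")
    return conn


def read_summary(conn):
    """Serve the query from the materialized copy only."""
    return conn.execute(
        "SELECT customer_id, revenue FROM orders_summary ORDER BY customer_id"
    ).fetchall()


if __name__ == "__main__":
    print(read_summary(materialize_orders_summary()))
```

Because only the aggregate is copied, the physical transfer stays small, and query latency depends on the repository instead of the source system.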

In “Understanding the Logical Data Warehouse: The Emerging Practice”, Gartner weighed in on this approach, pointing out that it offers flexibility for companies that have different data requirements at different times. For example, many use cases require a central repository, such as a traditional data warehouse or analytic database, where data that is needed frequently or with the greatest retrieval speed can be stored and optimized for performance.

Logical data warehouse

Increasingly, data analysts must be able to explore data freely with reliably adequate query performance. Frequent use cases along these lines are sentiment analysis or fraud detection analysis. These use cases require a distributed technology to store the massive amounts of data available through social media feeds, clickstream activity logs, and many other sources. Additionally, they demand direct access to data sources via data federation. As Gartner rightly indicates, a logical layer is needed on top of these technologies to unify the architecture and to allow queries and processes to operate on all systems concurrently as needed.

As the first logical data warehouse, Data Virtuality provides this uniform layer over numerous data storage technologies, unifying these data stores and facilitating the use cases suggested by Gartner. By routing queries among data stores behind the scenes as needed, the Data Virtuality technology offers significant benefits to business users.

The business can use the same platform for a variety of use cases, far more than could be handled by a traditional data warehouse or any other single technology. New approaches to data integration also become possible, allowing users to put business needs first while the technology platform adapts as needed.

By decoupling the semantic unified data access layer, with which business users interact, from the actual data sources, changes that occur in the original data source can be kept from interfering with analytical processes. In a profound departure from past data accessibility strategies, business users can interact with data comfortably and easily, focusing on their objectives rather than the technological underpinnings.
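The isolation described here can be illustrated with an ordinary SQL view acting as the access layer. In this minimal sketch, SQLite stands in for the platform, and the view customers_v, both source tables, and the sample data are hypothetical. The analyst's query stays the same while the underlying source is replaced by one with a different layout.

```python
import sqlite3

# The only statement a business user writes; it never names a source table.
ANALYST_QUERY = "SELECT customer_name FROM customers_v ORDER BY customer_name"


def run_demo():
    """Return the analyst query result before and after a source change."""
    conn = sqlite3.connect(":memory:")

    # Original source layout.
    conn.execute("CREATE TABLE crm_customers (id INTEGER, name TEXT)")
    conn.executemany(
        "INSERT INTO crm_customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    conn.execute(
        "CREATE VIEW customers_v AS "
        "SELECT id AS customer_id, name AS customer_name FROM crm_customers"
    )
    before = conn.execute(ANALYST_QUERY).fetchall()

    # The source system is replaced: new table name and new column names.
    conn.execute("DROP VIEW customers_v")
    conn.execute("DROP TABLE crm_customers")
    conn.execute("CREATE TABLE crm_accounts (account_id INTEGER, full_name TEXT)")
    conn.executemany(
        "INSERT INTO crm_accounts VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    # Only the mapping in the access layer is updated.
    conn.execute(
        "CREATE VIEW customers_v AS "
        "SELECT account_id AS customer_id, full_name AS customer_name "
        "FROM crm_accounts"
    )
    after = conn.execute(ANALYST_QUERY).fetchall()
    return before, after


if __name__ == "__main__":
    print(run_demo())
```

The design choice is that the mapping lives in one place, so a change in a source costs one view redefinition instead of a rewrite of every report that uses the data.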

By consolidating relational and non-relational data sources, including real-time data, Data Virtuality enables immediate analysis using SQL. Data Virtuality provides a central data dashboard, which allows all data sources, whether analytical or operational, to freely interchange data.

Integrated connectors allow data to be immediately processed using analysis, planning, or statistics tools, or written back to source systems as needed. In addition, the logical data warehouse automatically adjusts to changes in the IT landscape and user behavior, thereby offering the highest possible degree of flexibility and speed, with little administrative overhead.

In a logical data warehouse project, a few clicks can seamlessly connect all data-producing and data-processing systems, including ERP and CRM systems, web shops, social media applications, and just about any SQL or NoSQL data source, all in real time. With instant access to the data, users can begin experimenting with these connections and joins until they achieve the results they want.

Considering the vast compatibility with data sources, it is a natural fit to include cloud-based MPP analytical stores in your Logical Data Warehouse. Whether these platforms are primary databases or used for limited specific purposes, the integration and performance through the Logical Data Warehouse is seamless and dependable and removes the siloing factor from their use. Where MPP platforms exist on-premises, the typical progressive growth in CPU and storage costs can be halted and, if desired, the platforms can be phased out altogether, as the data within these expensive platforms can be migrated to the logical data warehouse.

Unlike traditional ETL solutions, the Logical Data Warehouse does not require the data to be moved before it can be analyzed. This significantly reduces development and database structuring time and costs. Equally flexible and responsive, the logical data warehouse is a completely different data integration paradigm, providing capabilities and approaches that were not previously possible.

In our next blog post you will learn how the Logical Data Warehouse works and how it can help you to ensure data governance.

Interested? Visit us at Big Data Expo, booth #64, and watch a live demo. Or schedule an appointment with one of our experts.

