Building Blocks of Data Warehouse
A Data Warehouse (DW) is a database that collects data and information from different sources. It is subject-oriented and stores large amounts of data. It also stores a series of snapshots of an organization’s operational data generated over a period of time. The DW represents the flow of data through time. Data is periodically uploaded then time-dependent data is recomputed.
Business Intelligence and Data Warehouse solutions and tools help to improve the availability of data and information. They significantly simplify the data analytics architecture. They also provide data security and business continuity. These tools and solutions impact the business bottom-line positively.
Main components of data warehouse
There are several logical and physical components. Apart from the logical components such as raw or source data coming from various sources, schemas, tables, reports, data visualization, data sets, and so on, and physical components such as the data warehouse itself, platforms, and so on, the following are the main components of a DW.
Database: It is the foundation or core component of DW architecture.
Metadata: It is data about data, and it describes the DW and its objects. It is used for building, maintaining, managing, implementing, and using the DW. It is managed with the help of accompanying software.
There are three main categories: operational, business, and technical. Operational metadata holds information in terms of the currency of data and data lineage. Currency of data refers to the data being active, archived, or purged. The lineage of data is about the history of data migrated, and the transformation applied to it. It also holds business information, algorithms for granularity, summarization, aggregation, and so on, rules related to data refresh, data purge, data transformation, and so on. Business metadata contains a business definition, data ownership information, and changing policies. Technical metadata contains primary and foreign key attributes, tables and columns, database system names, data types, and so on.
Metadata is a shorthand depiction of data that leads us to detailed data. It is a directory that helps us to locate objects in a DW. It is often used to describe rules, aggregations, transformations, and so on.
Load manager: It is the front-end component, and it performs all the operations associated with the extraction and loading processes of data. It is responsible for transforming and preparing the data before it enters the DW; it performs transformations into a structure similar to one in the DW. The higher the degree of overlap between the data sources, the larger the load manager will be.
Warehouse manager: It performs operations associated with the management of the data in the DW. It deals with operations like analysis of data to ensure consistency, creation of indexes and views, generation of denormalization and aggregations, transformation and merging of source data, and archiving and backing up data. It consists of C programs, shell scripts, third-party software programs, and so on.
It is also responsible for creating partition views, indexes, and business views. It updates the existing aggregations and creates new ones as well. It keeps track of the data to perform referential integrity checks. It also performs backup operations on the data in the DW.
Query manager: It is the back-end component, and it performs all the operations related to the management of user queries. By directing the queries to appropriate tables, the query manager speeds up the query request and response process. It is responsible for scheduling the execution of the queries, and it also presents the data to the users in an understandable format.
End-user access tools: There are various areas related to the DW where such tools can be utilized. They are used for data reporting and querying, data mining, online analytical processing, executive information systems, and so on. Data dippers are basic tools that facilitate the generation of standard reports. Reporting tools provide reports in various formats. OLAP tools facilitate the generation of complex charts and graphs and provide complex online analysis against live data.
Hope this was helpful.