Datawarehouse Concepts

avs sridhar
5 min readJan 18, 2022

This Article Covers all basic data warehouse concepts.

Dataware house

Data Warehouse is like a relational database designed for analytical needs.

It functions on the basis of OLAP (Online Analytical Processing).

It is a central location where consolidated data from multiple locations (databases) are stored.

Data warehousing

Data Warehousing is the act of organizing & storing data in a way so as to make its retrieval efficient and insightful.

It’s also called the process of transforming data into information.

OLAP-Online Analytical Analytical Processing

OLAP is a flexible way for you to make a complicated analyses of multidimensional data.

DWH is modeled on the concept of OLAP. DBS is modeled on the concept of OLTP (Online Transaction Processing).

OLTP systems use data stored in the form of two-dimensional tables, with rows and columns.

Advantages of OLAP over OLTP

1. Opens up new views of looking at data.

2. Supports filtering/ sorting of data.

3. Data can be refined.

Types of OLAP Cubes

MOLAP

MOLAP is a form of OLAP that processes and stores the data directly into a multidimensional database.

Advantage:- Excellent performance; Can perform complex calculations.

Disadvantage: Only limited data can be handled.

ROLAP

ROLAP is a form of OLAP that performs dynamic multidimensional analysis of data stored in a relational database rather than in a multidimensional database.

Advantage:- Greater amount of data can be processed.

Disadvantage:- Requires more processing time/ disk space.

HOLAP

It is a combination of both ROLAP and MOLAP.

Advantage: It can drill through from the cube into the underlying database.

OLAP Operations

Let the cube be as follows before any operation is performed

Drill Down

In drill-down operation, the less detailed data is converted into highly detailed data. It can be done by:

  • Moving down in the concept hierarchy
  • Adding a new dimension

Roll Up

It is just the opposite of the drill-down operation. It performs aggregation on the OLAP cube. It can be done by:

  • Climbing up in the concept hierarchy
  • Reducing the dimensions

Slice

It selects a single dimension from the OLAP cube which results in a new sub-cube creation. In the cube given in the overview section, Slice is performed on the dimension Time = “Q1”.

Pivot

It is also known as rotation operation as it rotates the current view to get a new view of the representation. In the sub-cube obtained after the slice operation, performing pivot operation gives a new view of it.

Dice

It selects a sub-cube from the OLAP cube by selecting two or more dimensions.

Dimensions in Data Warehouse

In data warehousing, a dimension is a collection of reference information about a measurable event.

The tables that describe the dimensions involved are called Dimension tables.

Dividing a Data Warehouse project into dimensions provides structured information for analysis & reporting.

Facts and Measures

A fact is a measure that can be summed, averaged, or manipulated.

A Fact table contains 2 kinds of data — a dimension key and a measure.

Every Dimension table is linked to a Fact table.

Schema in Data Warehouse

A schema gives the logical description of the entire database.

It gives details about the constraints placed on the tables, key values present & how the key values are linked between the different tables.

A database uses a relational model, while a data rehouse uses Star, Snowflake, and Fact Constellation schema

Star Schema

Each dimension in a star schema is represented with a one-dimension table that contains a set of attributes.

The fact table is at the center, which contains keys to every dimension table & attribute.

Snowflake Schema

Dimension tables are normalized here. They are split into different tables.

Galaxy Schema

Also known as Fact Constellation schema. Contains more than 1 Fact table.

Below, there are two fact tables: Revenue and Product.

Dimensions that are shared are called Conformed Dimensions.

If you found this article helpful. Consider clapping and follow me on medium.

Let’s connect via Linkedin or Twitter for any further discussions on big data and ML

References

  1. https://www.youtube.com/watch?v=CHYPF7jxlik&t=19s
  2. https://www.guru99.com/database-vs-data-warehouse.html

--

--