Welcome to MIND of YOUTH

BE THE CHANGE ! MAKE A DIFFERENCE IN THE LIVES OF YOUTH..........

What is data warehouse ?

What is data warehouse ?

Agenda

  • What is the need for Business Intelligence?
  • What is data warehousing?
  • Key terminologies related to DWH architecture
  1. OLTP VS OLAP
  2. ETL
  3. DATA MART
  4. META DATA
  • DWH architecture

Why Business Intelligence?

Business Intelligence is the activity which contribute to the growth of any company.

What is business Intelligence?

Business intelligence is the act of transforming raw/operational data into useful information for business analysis.

How does it work?

  1. Business Intelligence is based on data ware house technology extracts information from company’s operational systems.
  2. The data is transformed(cleaned and integrated) and loaded into data warehouses.
  3. Since this data is credible it is used for business insights.

But Why Data Warehousing?

  • Let us understand the challenges in achieving Business Intelligence.
  • Data collected from various sources and stored in various databases can not be directly visualized.
  • The data first needs to be integrated and then processed before visualization takes place.

What is a Data Warehouse?

  • A central location where consolidated data from multiple locations(data houses) are stored.
  • DWH is maintained separately from organization’s operational database.
  • End users access it whenever any information is needed.

Note: Data Warehouse is not loaded every time new data is added to database.

What are the advantages of Data Ware House?

  • Strategic questions can be answered by studying trends.
  • Data warehousing is faster and more accurate.

Note: Data Ware House is not a product that a company can go and purchase, it needs to be designed and depends entirely on the company’s requirement.

Properties of a Data Ware House?

A Data Ware House is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of managements decision-making process. – Bill In man, Father of Data Ware Housing

  1. Subject Oriented: Data is categorized and stored by business subject rather than by application.
  2. Integrated: Data on a given subject is collected from disparate sources and stored in single place.
  3. Time-variant: Data is stored as a series of snapshots, each representing a period of time.
  4. Non-Volatile: Typically data in Data Ware House is not updated or deleted.

Key Terminologies related to Data Ware Housing

Information System : OLTP(DB) VS OLAP(DWH)

OLTP is an Online Transaction Processing system

OLAP is an Online Analytical Processing system

 
     Relational Database(OLTP)                                            Analytical DwH(OLAP)                     
Based on entity relationship model.                     Based on Star, Snowflake and fact constellation schema.
Provides primitive and highly detailed data.      Provides summarized and consolidating data.
Used for writing data into the database.              Used for reading data from the data warehouse.
Database size ranges from 100 MB to 1GB.          Data Warehouse size ranges from 100GB to 1TB.
Fast, provides high performance.                          Highly flexible, but not fast.
Number of records accessed is in tens                  Number of records accessed is in millions.
Ex: All Bank transactions made by a customer.  Bank transactions made by a customer at a particular time.

OLTP Examples:

  1. A super market server which records every single product purchased at that market.
  2. A bank server which records every time a transaction is made for a particular account.
  3. A railway reservation server which records the transactions of a passenger.

OLAP Examples:

  1. Bank Manager wants to know how many customers are utilizing the ATM of his branch. Based on this he may take a call whether to continue with the ATM or relocate it.
  2. An insurance company wants to know the number of policies each agent has sold. This will help in better performance management of agent.

ETL(Extract, Transform and load)

ETL is the process of extracting the data from various sources and then transforming this data to meet requirement and then loading it into a target data warehouse.

Data Mart

  • Data mart is a smaller version of the data ware house which data deals with a single subject.
  • They are focused on one area. Hence they draw data from a limited number of sources.
  • Time taken to build data marts is very less compared to the time taken to build a data warehouse.

Difference between Data Warehouse and Data Marts

Data Warehouse                               Data Marts 
Enterprise wide data.                        Department wide data
Multiple data sources.                       Limited data sources.
Occupies large memory.                   Occupies Limited memory.
Longer time to implement.              Shorter time to implement.

Types of Data Mart

  1. Dependent
  2. Independent
  3. Hybrid

Explanation:

1.Dependent

  • The data is first extracted from the OLTP system and then populated in the central DwH.
  • From the DwH, the data travels to the Data Mark.

2. Independent

  • The data is directly received from the source system.
  • This is suitable for small organizations or smaller groups within an organisation.

3. Hybrid

  • In hybrid system the data is fed both from OLTP systems as well as the Data Warehouse.

Meta Data

  • Meta Data is defined as data about data.
  • It is DwH defines the source data i.e Flat file, Relational Database and the other objects.
  • Is used to define which table is source table and target. And which concept is used to build business logic called transformation to actual output,

Data Ware House Architecture

Monika Khilrani

Hii, I am Monika Khilrani amazed by Computer Science, I am a technology enthusiast, author and founder of www.mindofyouth.com, writing is not only my profession, it is my passion too because DATA=Knowledge + experience.