The term "Data Warehouse" was first coined by Bill Inmon in 1990. He defined a data warehouse as a subject-oriented, integrated, time-variant and nonvolatile collection of data. This data supports the decision-making process of analysts in an organization.
An operational database processes transactions every day, so its data changes frequently. Suppose a business executive later wants to analyse historical data about a product, a supplier, or a consumer. The analyst would have nothing to analyse, because the earlier values have been overwritten by subsequent transactions.
Data warehouses provide generalized and consolidated data in a multidimensional view. Along with this view of the data, data warehouses also provide Online Analytical Processing (OLAP) tools. These tools support interactive and effective analysis of data in a multidimensional space. This analysis results in data generalization and data mining.
Data mining functions such as association, clustering, classification and prediction can be integrated with OLAP operations to enhance interactive mining of knowledge at multiple levels of abstraction. That is why the data warehouse has become an important platform for data analysis and online analytical processing.
In short, a data warehouse is a subject-oriented, integrated, time-variant and nonvolatile collection of data that supports management's decision-making process.
Data Warehouse Features
The key features of a Data Warehouse, namely Subject Oriented, Integrated, Time-Variant and Non Volatile, are discussed below:
Subject Oriented - A Data Warehouse is subject oriented because it provides information organized around a subject rather than around the organization's ongoing operations. These subjects can be products, customers, suppliers, sales, revenue etc. The data warehouse does not focus on ongoing operations; it focuses on the modelling and analysis of data for decision making.
Integrated - A Data Warehouse is constructed by integrating data from heterogeneous sources such as relational databases, flat files etc. This integration makes analysis of the data more effective.
Time-Variant - The data in a Data Warehouse is identified with a particular time period. The data in a data warehouse provides information from a historical point of view.
Non Volatile - Non volatile means that previous data is not removed when new data is added. The data warehouse is kept separate from the operational database, so frequent changes in the operational database are not reflected in the data warehouse.
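The Time-Variant and Non Volatile features can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `operational` dictionary stands in for an operational database, the `warehouse` list for a warehouse table, and `snapshot` and `price_history` are hypothetical helpers, not part of any real product.

```python
from datetime import date

# Operational store: one current row per product, overwritten in place.
operational = {"P1": {"price": 10.0}}

# Warehouse: an append-only list of dated snapshots (hypothetical layout).
warehouse = []

def snapshot(as_of):
    """Copy the current operational rows into the warehouse, stamped with a date."""
    for product_id, row in operational.items():
        warehouse.append({"product_id": product_id, "as_of": as_of, **row})

def price_history(product_id):
    """Return (date, price) pairs for one product, oldest first."""
    rows = [r for r in warehouse if r["product_id"] == product_id]
    return [(r["as_of"], r["price"]) for r in sorted(rows, key=lambda r: r["as_of"])]

snapshot(date(2024, 1, 1))
operational["P1"]["price"] = 12.0   # a transaction overwrites the old value
snapshot(date(2024, 2, 1))

print(operational["P1"]["price"])   # only the current price survives here
print(price_history("P1"))          # both prices survive in the warehouse
```

The operational store keeps only the latest price, while the warehouse keeps every dated snapshot, so historical analysis remains possible after the update.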
Data Warehouse Applications
As discussed before, a Data Warehouse helps business executives organize, analyse and use their data for decision making. A Data Warehouse serves as a core part of a plan-execute-assess "closed-loop" feedback system for enterprise management. Data Warehouses are widely used in the following fields:
Financial services
Banking services
Consumer goods
Retail sectors
Controlled manufacturing
Data Warehouse Types
Information Processing, Analytical Processing and Data Mining are the three types of data warehouse applications discussed below:
Information Processing - A Data Warehouse allows us to process the information stored in it. The information can be processed by means of querying, basic statistical analysis, and reporting using crosstabs, tables, charts, or graphs.
Analytical Processing - A Data Warehouse supports analytical processing of the information stored in it. The data can be analysed by means of basic OLAP operations, including slice-and-dice, drill down, drill up, and pivoting.
Data Mining - Data Mining supports knowledge discovery by finding hidden patterns and associations, constructing analytical models, and performing classification and prediction. The mining results can be presented using visualization tools.
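The basic OLAP operations named above can be sketched over a small in-memory fact table. This is only an illustration under stated assumptions: the `facts` rows, the dimension names and the helper functions `slice_`, `dice`, `roll_up` and `pivot` are all hypothetical, and a real OLAP tool would run these operations inside its own engine.

```python
from collections import defaultdict

# Hypothetical fact table: one row per (year, region, product) with a sales measure.
facts = [
    {"year": 2023, "region": "East", "product": "Pen", "sales": 100},
    {"year": 2023, "region": "West", "product": "Pen", "sales": 150},
    {"year": 2023, "region": "East", "product": "Ink", "sales": 40},
    {"year": 2024, "region": "East", "product": "Pen", "sales": 120},
    {"year": 2024, "region": "West", "product": "Ink", "sales": 60},
]

def slice_(rows, dim, value):
    """Slice: fix one dimension to a single value."""
    return [r for r in rows if r[dim] == value]

def dice(rows, **allowed):
    """Dice: keep rows whose dimensions fall within the allowed value sets."""
    return [r for r in rows if all(r[d] in vals for d, vals in allowed.items())]

def roll_up(rows, dims):
    """Drill up: total the sales measure over the dimensions that are kept."""
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r["sales"]
    return dict(totals)

def pivot(rows, row_dim, col_dim):
    """Pivot: arrange totals as a crosstab of row_dim by col_dim."""
    table = defaultdict(dict)
    for (row_key, col_key), total in roll_up(rows, [row_dim, col_dim]).items():
        table[row_key][col_key] = total
    return dict(table)

print(roll_up(facts, ["year"]))            # drill up to yearly totals
print(roll_up(facts, ["year", "region"]))  # drill down by adding a dimension
print(slice_(facts, "year", 2024))         # slice on year = 2024
print(dice(facts, year={2023}, region={"East"}))
print(pivot(facts, "region", "year"))      # regions as rows, years as columns
```

In this sketch, drill down is simply `roll_up` called with more dimensions, which reflects the idea that drilling down moves from a summarized view to a more detailed one.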
Data Warehouse Tools and Utility Functions
The following are the functions of Data Warehouse tools and utilities:
Data Extraction - Data Extraction involves gathering data from multiple heterogeneous sources.
Data Cleaning - Data Cleaning involves finding and correcting errors in the data.
Data Transformation - Data Transformation involves converting data from the legacy format to the warehouse format.
Data Loading - Data Loading involves sorting, summarizing, consolidating, checking integrity, and building indices and partitions.
Refreshing - Refreshing involves propagating updates from the data sources to the warehouse.
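The five functions above can be chained into a small pipeline. The sketch below is a simplified illustration, not a description of any particular tool: the two source formats, the field names, the cleaning rule (dropping rows with a non-numeric amount) and the `warehouse` dictionary layout are all assumptions. Its `load` step only sorts, checks key integrity and builds an index, and leaves out summarizing and partitioning.

```python
# Hypothetical source records in two different legacy formats.
source_a = [{"id": "1", "name": " Alice ", "amount": "10.50"},
            {"id": "2", "name": "Bob", "amount": "n/a"}]
source_b = [("3", "CAROL", 7)]

def extract():
    """Extraction: gather rows from heterogeneous sources into one shape."""
    rows = list(source_a)
    rows += [{"id": i, "name": n, "amount": str(a)} for i, n, a in source_b]
    return rows

def clean(rows):
    """Cleaning: trim names and drop rows whose amount is not numeric."""
    cleaned = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue
        cleaned.append({"id": r["id"], "name": r["name"].strip(), "amount": amount})
    return cleaned

def transform(rows):
    """Transformation: convert to the warehouse format (int keys, title-case names)."""
    return [{"customer_id": int(r["id"]), "customer_name": r["name"].title(),
             "amount": r["amount"]} for r in rows]

def load(warehouse, rows):
    """Loading: sort, check key integrity, and build an index on customer_id."""
    for r in sorted(rows, key=lambda r: r["customer_id"]):
        if r["customer_id"] in warehouse["index"]:
            continue  # skip keys already loaded so a refresh does not duplicate rows
        warehouse["index"][r["customer_id"]] = len(warehouse["rows"])
        warehouse["rows"].append(r)

def refresh(warehouse):
    """Refreshing: rerun the pipeline so new source rows reach the warehouse."""
    load(warehouse, transform(clean(extract())))

warehouse = {"rows": [], "index": {}}
refresh(warehouse)                  # initial load
source_b.append(("4", "dave", 3))   # a new row appears in a source
refresh(warehouse)                  # only the new row is added

for row in warehouse["rows"]:
    print(row)
```

Here the refresh only appends rows with new keys. That is a deliberate simplification, since a real refresh would also need a policy for source rows that change after they have been loaded.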