What is a Data Warehouse?
A Data Warehouse is a central location where consolidated data from multiple locations are stored. Data Warehouse is not loaded every time when a new data is generated but the end-user can assess it whenever he needs some information. There are certain timelines determined by the business as to when Data Warehouse needs to be loaded whether on a daily, monthly or once in a quarter basis.
Data Warehouse Architecture
Different data warehousing systems have different structures. Some may have a small number of data sources while some can be large.
There are multiple transactional systems, source 1 and other sources as mentioned in the image. The source can be SAP or flat files and hence, there can be a combination of sources. The ETL (Extract, Transfer, Load) is used to load the data warehouse in the data marts. The difference between a data house and a data mart is that data warehouse is used across organisations, while data marts are used for individual customized reporting.
For example, there are multiple departments in a company like the finance department which is very different from a marketing department. They all draw data from different sources and they need customized reporting. The finance department is concerned mainly with the statistics while the marketing department is concerned with the promotions. The marketing department doesn’t require any information on finance.
For customized reporting, subsets of data warehouse called data marts is required. There are two approaches to loading it. First, load the data warehouse and then load the marts or vice-versa. In the reporting scenario which is the data access layer, the user accesses the data warehouse and generates the report. All these reporting tools are meant to make the front interface extremely easy for the consumer since people at the decision-making level are not concerned with technical information. They are primarily concerned with a neat usable report.
Therefore, all these reporting tools perform at the front end but at the back end, they generate the queries and hit the database and the user gets the report just in time. These reporting tools can schedule the jobs to run and generate the reports.
Got a question for us? Please mention them in the comments section and we will get back to you.