Featured Post

Reference Books and material for Analytics

Website for practising R on Statistical conceptual Learning: https://statlearning.com  Reference Books & Materials: 1) Statis...

Thursday, February 15, 2018

Designing Data warehouse: Good Practices




We always aspire (though difficult) for a flawless data warehouse design for a successful BI system. Listed down some best practices  to make your data warehouse perfect.
A data warehouse lies at the foundation of any business intelligence (BI) system. But if this foundation is flawed, the towering BI system cannot possibly be stable. Given BI’s importance as a decision enabler today, such flaws are undesirable. For a flawless data warehouse design and process, avoid the following common mistakes.  

1) Designing Data warehouse on future needs rather than only on Current need
Data warehouse (in some cases Common data Model) need to be designed for long term. Designing it to get the benefit only in short term would not reap much benefits. While designing the Data warehouse, 3 to 5 years organization business roadmap needs to be in mind.  Data warehouse designer should pay as much attention to as much attention to business strategy as to technical aspects
2) No shortcomings or negligence while creating the metadata layer
While designing a data warehouse, poor design of the metadata has far-reaching implications. Metadata is the integrator between data models, extract, transform, load (ETL) and BI. However, the metadata layer often is created only to fit short-sighted data criteria and its documentation is haphazard.
It is necessary to add descriptions to the tables or columns at the data warehouse design stage itself. When business users reject BI reports they cannot decipher, the problem can usually be traced to poorly designed data models that lack easy-to-understand descriptions, and have inconsistent naming conventions. Prevent this by setting the appropriate metadata strategy at the data modeling stage of data warehouse design.

 3) Weightage to ad hoc querying and self-service BI
Generating a simple report can sometimes expend considerable bandwidth and be a drain on productivity for the IT team. However, a self-service BI can simplify the task by relying on the metadata layer to generate the reports, without affecting the sanctity of the underlying data model. For example, to produce a report on top-performing branches, the user simply selects hierarchical columns with the titles “region,” “branch,” “number of customers” and “relationship packages.” Thus, incorporation of a self-service or ad hoc query layer in the data warehouse process can help you gain user acceptance.

4) Not getting carried away on visual appeal and focus on Speed
When developing the reporting layer of a data warehouse, the design should focus on ease of use and speedy action Although business users tend to be swayed by fancy charts and reports, do not succumb to the temptation of sacrificing speed at the altar of beautification. Indeed, in the course of data warehouse project implementation, IT teams have noted that speedy response time in report generation is of prime importance for the BI system to gain popularity amongst users. To give a comparison, if a simple report takes a couple of seconds to load, the chart may take three minutes. Speed should therefore be given priority over visual appeal when designing data warehouse process.

5) Clarity on data quality prior to finalization of data warehouse design
A large amount of aggregation takes place at the data mart level. The data warehouse is the source of data, and the data contained therein should be clean and accurate. If not, the output from the system is likely to show discrepancies, and the data warehouse design as well as process, is unfairly thought of as the culprit.
Rather, active monitoring of dimensional data should be incorporated right at the data warehouse design stage. Quality control is governed by usage too, and the business could point out faults that the technology may forego; hence, the data warehouse design must provide for strong data governance processes in order to maintain clean data.

6) Not considering Data warehousing only as an IT initiative.
Focusing on data warehouse implementation as a pure IT project can amount to diluting its essence. Ideally, the benefits of a data warehouse design and process have to be quantified; if not as monetary returns on investments, then at least in terms of growth in business (defined by key performance indicators). This growth can be viewed and measured in terms of increased usage of the BI system and efficient data management, leading to smoother operations, and increased efficiency for top-level management.
It must be understood that discrepancies usually creep in at transactional data levels. For instance, consider a bank customer whose transaction details are captured correctly, but whose mobile number on record is incorrect. A change in the mobile number could have occurred subsequent to recording the original form. This example relates to dimensional data. In such cases, it may not be feasible to stop the data warehouse process and cleanse the data.


No comments:

Post a Comment