We always aspire (though difficult) for a flawless data
warehouse design for a successful BI system. Listed down some best practices to make your data warehouse perfect.
A data warehouse lies at the
foundation of any business intelligence (BI) system. But if this foundation is
flawed, the towering BI system cannot possibly be stable. Given BI’s importance
as a decision enabler today, such flaws are undesirable. For a flawless data
warehouse design and process, avoid the following common mistakes.
1) Designing Data warehouse on
future needs rather than only on Current need
Data warehouse (in some cases Common
data Model) need to be designed for long term. Designing it to get the benefit
only in short term would not reap much benefits. While designing the Data warehouse,
3 to 5 years organization business roadmap needs to be in mind. Data warehouse designer should pay as much attention
to as much attention to business strategy as to technical aspects
2) No shortcomings or negligence while creating the metadata layer
While designing a data warehouse,
poor design of the metadata has far-reaching implications. Metadata is the
integrator between data models, extract, transform, load (ETL) and BI.
However, the metadata layer often is created only to fit short-sighted data
criteria and its documentation is haphazard.
It is necessary to add descriptions to the tables
or columns at the data warehouse design stage itself. When business users
reject BI reports they cannot decipher, the problem can usually be traced to
poorly designed data models that lack easy-to-understand descriptions, and have
inconsistent naming conventions. Prevent this by setting the appropriate
metadata strategy at the data modeling stage of data warehouse design.
3) Weightage to ad hoc querying and self-service BI
Generating a simple report can
sometimes expend considerable bandwidth and be a drain on productivity for the
IT team. However, a self-service BI can simplify the task by relying on the
metadata layer to generate the reports, without affecting the sanctity of the
underlying data model. For example, to produce a report on top-performing
branches, the user simply selects hierarchical columns with the titles
“region,” “branch,” “number of customers” and “relationship packages.” Thus,
incorporation of a self-service or ad hoc query layer in the data warehouse process can help you gain
user acceptance.
4) Not getting carried away on visual appeal and focus on Speed
When developing the reporting layer
of a data warehouse, the design should focus on ease of use and speedy action
Although business users tend to be swayed by fancy charts and reports, do not
succumb to the temptation of sacrificing speed at the altar of beautification.
Indeed, in the course of data warehouse project implementation, IT teams have noted that speedy response time in report
generation is of prime importance for the BI system to gain popularity amongst
users. To give a comparison, if a simple report takes a couple of seconds to
load, the chart may take three minutes. Speed should therefore be given
priority over visual appeal when designing data warehouse process.
5) Clarity on data quality
prior to finalization of data warehouse design
A large amount of aggregation takes
place at the data mart level. The data warehouse is the source of data, and the
data contained therein should be clean and accurate. If not, the output from
the system is likely to show discrepancies, and the data warehouse design as
well as process, is unfairly thought of as the culprit.
Rather, active monitoring of
dimensional data should be incorporated right at the data warehouse design
stage. Quality control is governed by usage too, and the business could point
out faults that the technology may forego; hence, the data warehouse design
must provide for strong data governance processes in order to maintain clean data.
6) Not considering Data
warehousing only as an IT initiative.
Focusing on data warehouse
implementation as a pure IT project can amount to diluting its essence.
Ideally, the benefits of a data warehouse design and process have to be
quantified; if not as monetary returns on investments, then at least in terms
of growth in business (defined by key performance indicators). This growth can
be viewed and measured in terms of increased usage of the BI system and efficient data management, leading to smoother operations, and increased efficiency
for top-level management.
It must be understood that discrepancies usually
creep in at transactional data levels. For instance, consider a bank customer
whose transaction details are captured correctly, but whose mobile number on
record is incorrect. A change in the mobile number could have occurred
subsequent to recording the original form. This example relates to dimensional
data. In such cases, it may not be feasible to stop the data warehouse process
and cleanse the data.