Data quality measurement is fundamental to business success

By Bryn Davies for InfoBluePrint
Johannesburg, 05 Apr 2011

Organisations are realising that the negative impacts of unreliable data about core business entities such as customers, suppliers, employees, materials and products are inhibiting their attempts to cut costs, boost revenue or meet compliance.

These business impacts are symptoms of years, if not decades, of poor to non-existent data quality management practices, and are now manifesting primarily through analytical initiatives relating to business performance improvement, such as those delivered through business intelligence (BI).

Impacts are also being felt in operational areas such as ERP and CRM, and within governance, risk and compliance (GRC) programmes such as those relating to King III and the Protection of Personal Information (POPI) Bill. To address these problems, corporates are launching data governance and/or master data management initiatives, with a view to getting their core data assets under control.

However, many are unsure of where to start and how to navigate this new and uncharted information management territory. What quickly becomes clear is that, just like business performance improvement, it is impossible to make any progress without basic metrics such as KPIs.

In other words, it is necessary to measure that which you are trying to improve, and in this case it is the quality of the data itself which requires a solid, objective and repeatable measurement process.

But unlike measuring the quality of tangible goods, measuring the "quality" of something like data is a real challenge for most. While there are a number of sophisticated software tools on the market that help deliver data quality metrics, the real trick is to apply them in a way that effectively supports the main objectives of the business.

First and foremost, therefore, it is essential to take into account the ultimate business objectives that are driving the need to measure data quality, before deciding how to go about measuring it, how to present the results, and to whom. In other words, if time and money are spent on a data quality improvement initiative that cannot be correlated with measurable business improvements, then it is not time and money well spent.

Experience has shown that it is never a matter of simply pointing data quality software at the data and pressing a "measure" button. While most data profiling software is very capable, an out-of-the-box data profiling exercise does little to surface meaningful data quality metrics. Besides producing reams of statistics about your data, such profiling is typically only suitable as input to the technical aspects of a data migration or data warehouse project.

To properly measure data quality, a lot of up-front work has to be done to analyse and design an objective and repeatable framework for data quality measurement. Besides the software tool's capabilities in this area, this has to take into account things like whether data content as well as structures need to be assessed; data volumes and sampling techniques; and whether a view on levels of inter- and intra-database duplication is required.
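As a rough illustration of what a duplication measure might look like, the sketch below counts duplicates within one data set and overlap between two. The record layout, the `crm` and `erp` sample data and the matching key (normalised name plus postcode) are all assumptions made for this example; commercial data quality tools typically use far more sophisticated fuzzy matching.

```python
from collections import Counter
from typing import Dict, List, Tuple

Record = Dict[str, str]


def match_key(record: Record) -> Tuple[str, str]:
    """Hypothetical matching key: lower-cased, whitespace-collapsed name plus postcode."""
    name = " ".join(record["name"].lower().split())
    return (name, record["postcode"].strip())


def duplicate_rate(records: List[Record]) -> float:
    """Share of records that are surplus copies of another record in the same set."""
    if not records:
        return 0.0
    counts = Counter(match_key(r) for r in records)
    surplus = sum(n - 1 for n in counts.values())
    return surplus / len(records)


def cross_source_overlap(a: List[Record], b: List[Record]) -> int:
    """Number of distinct entities (by matching key) that appear in both sources."""
    return len({match_key(r) for r in a} & {match_key(r) for r in b})


# Invented sample data, for illustration only.
crm = [
    {"name": "Acme Trading", "postcode": "2196"},
    {"name": "acme  trading", "postcode": "2196 "},
    {"name": "Bluegum Foods", "postcode": "8001"},
    {"name": "Protea Logistics", "postcode": "4001"},
]
erp = [
    {"name": "ACME TRADING", "postcode": "2196"},
    {"name": "Karoo Metals", "postcode": "6001"},
]

intra = duplicate_rate(crm)             # duplication inside one database
inter = cross_source_overlap(crm, erp)  # entities shared across databases
print(f"CRM duplicate rate: {intra:.0%}, entities in both CRM and ERP: {inter}")
```

Whether exact matching on a normalised key is good enough, or whether fuzzy matching is needed, is itself one of the up-front design decisions for the measurement framework.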

The generic metrics provided by most tools, such as "completeness", "consistency of format" and others, are useful to get an overall feel for the shape and form of your data, but these rarely provide insight into actual business impacts or give guidance for planning and prioritising quality improvement interventions. In the end, what is always important is the ability to establish the extent to which the data complies with business rules, because business rules are the basis of meaningful data quality rules.
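The difference between a generic metric and a business-rule metric can be sketched in a few lines. The example below is illustrative only: the `customers` records are invented, and the rule that a VAT number must be exactly 10 digits is a hypothetical business rule chosen for the example, not one taken from any particular tool or regulation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]


@dataclass(frozen=True)
class Rule:
    """A named business rule: check returns True when a record complies."""
    name: str
    check: Callable[[Record], bool]


def completeness(records: List[Record], field: str) -> float:
    """Generic metric: share of records where the field is populated at all."""
    if not records:
        return 0.0
    filled = sum(1 for r in records if r.get(field) not in (None, ""))
    return filled / len(records)


def compliance(records: List[Record], rule: Rule) -> float:
    """Business-rule metric: share of records that satisfy the rule."""
    if not records:
        return 0.0
    return sum(1 for r in records if rule.check(r)) / len(records)


def _vat_is_ten_digits(record: Record) -> bool:
    vat = record.get("vat_number")
    return vat is not None and len(vat) == 10 and vat.isdigit()


# Hypothetical rule, assumed for illustration.
vat_rule = Rule(name="VAT number is exactly 10 digits", check=_vat_is_ten_digits)

# Invented sample data.
customers: List[Record] = [
    {"id": "C1", "vat_number": "4123456789"},
    {"id": "C2", "vat_number": "12345"},
    {"id": "C3", "vat_number": "N/A"},
    {"id": "C4", "vat_number": None},
]

print(f"Completeness: {completeness(customers, 'vat_number'):.0%}")
print(f"Compliance with '{vat_rule.name}': {compliance(customers, vat_rule):.0%}")
```

In this sample the field is 75% complete, yet only 25% of records comply with the rule, which is the kind of gap a generic profiling statistic would not reveal.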

Another crucial piece of this puzzle is how to represent and display the resultant data quality metrics, and this is largely dependent on the target audience(s). For example, a database developer will require the technical detail output by a data profiling tool, while a manager or data steward requires a high-level status of the data's compliance with business rules. An executive, however, is unlikely to be interested in anything other than the rand value or graphical representations of data quality issues integrated into an existing dashboard.

Whatever the initial drivers for data quality measurement, be it a second-generation BI project, a new application implementation or a compliance initiative, the long-term benefits of being able to consistently repeat the measurement processes have become clear to those who have got it right. For one thing, data quality metrics are excellent indicators of the efficacy of a data governance programme, as they form the basis of a system for monitoring and controlling the ability of core data to support the business.

Being able to monitor data quality levels, that is, to repeatedly measure and report results against an established baseline, will soon be a hallmark of leading organisations that have realised the value of their core data assets to their ability to grow and succeed.
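Reporting against a baseline can be as simple as comparing each metric's latest score with its recorded starting point. The sketch below is a minimal illustration under stated assumptions: the metric names, the scores and the two-percentage-point tolerance are all invented for the example.

```python
from typing import Dict


def compare_to_baseline(
    baseline: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.02,
) -> Dict[str, str]:
    """Classify each baseline metric as improved, stable, degraded or missing.

    A metric is 'stable' when it stays within +/- tolerance of its baseline.
    """
    status: Dict[str, str] = {}
    for metric, base in baseline.items():
        now = current.get(metric)
        if now is None:
            status[metric] = "missing"
        elif now < base - tolerance:
            status[metric] = "degraded"
        elif now > base + tolerance:
            status[metric] = "improved"
        else:
            status[metric] = "stable"
    return status


# Invented scores (share of records complying with each rule).
baseline = {"vat_number_valid": 0.80, "email_present": 0.90, "no_duplicates": 0.95}
current = {"vat_number_valid": 0.86, "email_present": 0.89, "no_duplicates": 0.90}

for metric, state in compare_to_baseline(baseline, current).items():
    print(f"{metric}: {baseline[metric]:.0%} -> {current[metric]:.0%} ({state})")
```

The same status values could feed a steward's report or be rolled up into an executive dashboard, depending on the audience.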
