Garbage in, garbage out. This motto has held true since the days of punched cards and teletype terminals. Today’s sophisticated IT systems depend just as much on good-quality data to bring value to their users, whether in accounting, production, or business intelligence. However, data doesn’t automatically format itself properly, any more than it proactively tells you where it’s hiding or how it should be used. No, data just is. If you want your business data to satisfy criteria of availability, usability, integrity, and security, you need a data governance strategy.
Data governance, in general, is an overarching strategy for organizations to ensure the data they use is clean, accurate, usable, and secure. Data stakeholders from business units, the compliance department, and IT are best positioned to lead data governance, although the matter is important enough to warrant CEO attention too. Some organizations go as far as appointing a Data Governance Officer to take overall charge. The high-level goal is to have consistent, reliable data sets to evaluate enterprise performance and make management decisions.
Ad-hoc approaches are likely to come back to haunt you. Data governance has to become systematic, as big data multiplies in type and volume and users seek to answer more complex business questions. Typically, that means setting up standards and processes for acquiring and handling data, as well as procedures to make sure those processes are being followed. If you’re wondering whether it’s all worth it, the following five reasons may convince you.
Reason 1: Ensure data availability
Even business intelligence (BI) systems won’t look very smart if users cannot find the data needed to power them. In particular, self-service BI means that the data must be easy enough to locate and to use. After years of hearing about the sinfulness of organizational silos, it should be clear that even if individual departments “own” data, the governance of that data must be done in the same way across the organization. Authorization to use the data may be restricted, as in the case of sensitive customer data, but users should not be left unaware of its existence when it could help them in their work.
Availability is also a matter of having appropriate data that is easy enough to use. With the current trend toward storing unstructured data from different sources in non-relational databases or data lakes, it can be difficult to know what kind of data is being acquired and how to process it. Data governance is, therefore, a matter of first setting up data capture to acquire what your enterprise and its different departments need, rather than everything under the sun. Governance then also ensures that data schemas are applied to organize data when it is stored, or that tools are available for users to process data, for example, to run business analytics from non-relational (NoSQL) databases.
Reason 2: Ensure users are working with consistent data
When the CFO and the COO work from different sets of data and reach different conclusions about the same subjects, things are going to be difficult. The same is true at all other levels in an enterprise. Users must have access to consistent, reliable data so that comparisons make sense and conclusions can be checked. This is already a good reason for making sure that data governance is driven across the organization, by a team of executives, managers, and data stewards with the knowledge and authority to make sure the same rules are followed by all.
Global data governance initiatives may also grow out of attempts to improve data quality at departmental levels, where individual systems and databases were not planned for information sharing. The data governance team must deal with such situations, for instance, by harmonizing departmental information resources. Increased consistency in data means fewer arguments at executive level, less doubt about the validity of data being analyzed, and higher confidence in decision making.
Reason 3: Determine which data to keep and which to delete
The risks of data hoarding are the same as those of physical hoarding. IT servers and storage units full of useless junk make it hard to locate any data of value or to do anything useful with it afterward. Users rely on stale or irrelevant data as the basis for important business decisions, IT department expenses mushroom, and vulnerability to data breaches increases. The problem is, unfortunately, common. According to the Veritas Data Genomics Index 2016 survey, 40 to 60 percent of the data stored by organizations is simply ROT (redundant, obsolete, or trivial).
Yet things don’t have to be that way. Most data does not have to be kept for decades, “just in case.” As an example, retailing leader Walmart uses only the last four weeks’ transactional data for its daily merchandising analytics. It is part of good data governance to carefully consider which data is important to the organization and which should be destroyed. Data governance also includes procedures for employees to make sure data is not unnecessarily duplicated, as well as policies for systematic data retirement (for instance, for archiving or destruction) according to age or other pertinent criteria.
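A retirement policy based on age can be sketched in a few lines. The example below is only an illustration: the data categories, the thresholds in days, and the function name retirement_action are all hypothetical, and a real policy would be set by the data governance team and any applicable legal retention requirements.

```python
from datetime import date

# Hypothetical retention rules, for illustration only:
# category -> (archive after N days, destroy after N days)
RETENTION_RULES = {
    "transactions": (28, 365),
    "web_logs": (7, 90),
}


def retirement_action(category: str, created: date, today: date) -> str:
    """Return 'keep', 'archive', or 'destroy' for a record of the given age."""
    archive_after, destroy_after = RETENTION_RULES[category]
    age_days = (today - created).days
    if age_days > destroy_after:
        return "destroy"
    if age_days > archive_after:
        return "archive"
    return "keep"


if __name__ == "__main__":
    today = date(2024, 3, 1)
    for created in (date(2024, 2, 20), date(2024, 1, 1), date(2022, 1, 1)):
        print(created, retirement_action("transactions", created, today))
```

Keeping the rules in one shared table, rather than in each department's scripts, is what makes the policy systematic: everyone applies the same thresholds, and an audit only has to check one place.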
Reason 4: Resolve analysis and reporting issues
An important dimension of data governance is the consistency across an organization of its metrics, as well as the data driving them. Without clearly recorded standards for metrics, people may use the same word, yet mean different things. Business analytics are a case in point when analytics tools vary from one department to another. Self-service analytics or business intelligence can be a boon to an enterprise, but only if people interpret metrics and reports in a consistent way.
When reports lack clarity, the temptation is often to blame technology. The root cause, however, is often the misconfiguration of the tools and systems involved. It may even lie in their faulty application, as when reporting tools are wrongly applied to production databases, triggering performance problems that mean neither transactions nor analytics are satisfactorily accomplished. Ripping out and replacing fundamentally sound systems is not the solution. Instead, improved data governance brings more benefit, faster, and at far lower cost.
Reason 5: Security and compliance with laws concerning data governance
The consequences of non-compliance with data regulations can be enormous, especially where private individuals’ information is concerned. A case in point is the European General Data Protection Regulation (GDPR), applicable from May 2018, which sets non-compliance fines of up to €20 million (some $22 million) or four percent of the offender’s worldwide annual turnover, whichever is higher, for data misuse or breaches affecting individuals in the European Union.
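The "whichever is higher" rule is easy to misread, so a small worked calculation may help. The sketch below assumes the fixed ceiling of EUR 20 million set in the regulation (roughly the $22 million quoted above) and works in whole euros. The function name is hypothetical, and the result is only the upper limit of a possible fine, not a prediction of what a regulator would actually impose.

```python
# Fixed ceiling in whole euros (assumption: the upper tier of GDPR fines).
FIXED_CAP_EUR = 20_000_000
# Alternative ceiling: four percent of worldwide annual turnover.
TURNOVER_PERCENT = 4


def max_fine_eur(worldwide_turnover_eur: int) -> int:
    """Upper limit of the fine: the higher of the fixed cap and 4% of turnover."""
    turnover_cap = worldwide_turnover_eur * TURNOVER_PERCENT // 100
    return max(FIXED_CAP_EUR, turnover_cap)


if __name__ == "__main__":
    for turnover in (100_000_000, 500_000_000, 2_000_000_000):
        print(f"turnover EUR {turnover:,}: ceiling EUR {max_fine_eur(turnover):,}")
```

The two ceilings meet at a turnover of EUR 500 million. Below that, the fixed amount applies, so smaller organizations are proportionally the most exposed; above it, the ceiling grows with the size of the business.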
Effective data governance helps an organization to avoid such issues by defining how its data is to be acquired, stored, backed up, and secured against accidents, theft, or misuse. These definitions also include provision for audits and controls to ensure that the procedures are followed. Realistically, organizations will also conduct suitable awareness campaigns to make sure that all employees working with confidential company, customer, or partner data understand the importance of data governance and its rules. Education and awareness campaigns will become increasingly important as user access to self-service solutions increases, as will the levels of data security already inherent in those solutions.
Conclusion
If you think about data as a strategic asset, the idea of governance becomes natural. Company finances must be kept in order with the necessary oversight and audits, and workplace safety must be guaranteed in line with the relevant regulations, so why should data, often a key differentiator and a confidential commodity, be any different? As IT self-service and end-user empowerment grow, the importance of good data governance increases too. Business user autonomy in spotting trends and taking decisions can help an enterprise become more responsive and competitive, but not if it is founded on data anarchy.
Effective data governance is also a continuing process. Policy definition, review, adaptation, and audit, together with compliance reviews and quality control, are all carried out and repeated regularly as a data governance life cycle. As such, data governance is never finished, because new data sources, uses, and regulations keep emerging. For contexts such as business intelligence, especially in a self-service environment, good data governance helps users to use the right data in the right way, to generate business insights correctly and take sound business decisions.