Businesses and organizations around the world are scrambling to ensure their operations will be compliant with the sweeping changes that the General Data Protection Regulation (GDPR) will usher in next May. Among the businesses most critically impacted by GDPR are credit reporting firms, which collect and disseminate enormous volumes of personal data on millions of individuals and businesses.
Oslo-based Creditsafe is one such business. This year the firm will deliver more than 150 million credit reports while receiving hundreds of thousands of new credit records daily, many of them in unstructured formats. Complying with the stringent requirements of GDPR therefore took on great urgency at Creditsafe.
Tough Challenges
Specifically, Creditsafe faced several pressing challenges, including:
- Cataloging data and bringing it into an inventory database in an industry where the data flooded in from multiple disparate sources and multiple countries
- Improving common processes for managing data across different countries
- Making data accessible and usable to Creditsafe employees who understand the core business, not just to highly skilled engineers
- Creating a common data structure when data was held in more than a dozen separate data silos, which made it hard to share data across different businesses, each with its own language and methods of describing the data
Solution: A Data Catalog
The solution for these and other challenges at Creditsafe was a centralized, searchable data catalog of all disparate data sources, according to Cato Syversen, CEO at Creditsafe. “We needed a great platform and good environment for managing our great volumes of very different data types,” he recalled.
An essential element of the chosen catalog solution is the ability to automate data profiling and data tagging. This allows Creditsafe to discover which information within sensitive files is personally identifiable. Creditsafe can now take those tags and feed them into its metadata processing engine, allowing the company to anonymize that data at every stage from raw files to the initial data load. This capability supports compliance with GDPR. “This helps us to meet the GDPR requirements quickly in that we can immediately identify each data item right back to its original source,” notes Angus Gow, Creditsafe Chief Technology and Chief Content Officer.
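To make the tag-then-anonymize flow concrete, here is a minimal Python sketch. It is not Creditsafe's implementation or the catalog vendor's API. The patterns, the field names, and the functions tag_fields and anonymize are all hypothetical, and the hashing step is strictly pseudonymization rather than full anonymization.

```python
import hashlib
import re

# Hypothetical patterns for illustration only; a real data catalog
# would profile values with far richer rules than two regexes.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "phone": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}


def tag_fields(record):
    """Return {field: tag} for each field whose whole value matches a pattern."""
    tags = {}
    for field, value in record.items():
        for tag, pattern in PII_PATTERNS.items():
            if pattern.fullmatch(str(value).strip()):
                tags[field] = tag
                break
    return tags


def anonymize(record, tags):
    """Replace tagged values with a truncated SHA-256 digest.

    Note: hashing is pseudonymization, used here as a stand-in for
    whatever anonymization the real pipeline applies.
    """
    out = dict(record)
    for field in tags:
        digest = hashlib.sha256(str(record[field]).encode("utf-8")).hexdigest()
        out[field] = digest[:12]
    return out


record = {"company": "Acme AS", "contact": "ola@example.com", "phone": "+47 22 00 00 00"}
tags = tag_fields(record)
masked = anonymize(record, tags)
```

In this sketch the tags act as metadata that travels with the record, so any downstream step can decide how to treat a field without inspecting its value again.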
Immediate, Positive Business Results
The positive business impact of the data catalog approach at Creditsafe was felt almost immediately. Right away, business analysts gained far greater visibility into the raw data the company processes. Cost savings accrued from the de-duplication capabilities of the data catalog solution: Creditsafe could more quickly identify duplicate data sent to the company by different data sources, so it pays only once for the same data.
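One common way to spot duplicate records arriving from different suppliers is to fingerprint a normalized form of each record. The Python sketch below illustrates that general idea under stated assumptions. The key fields, the normalization rule, and the functions fingerprint and deduplicate are hypothetical, and nothing here describes how the catalog product itself detects duplicates.

```python
import hashlib


def fingerprint(record, key_fields):
    """Hash the lower-cased, whitespace-collapsed values of the key fields."""
    parts = []
    for field in key_fields:
        value = str(record.get(field, ""))
        parts.append(" ".join(value.lower().split()))
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()


def deduplicate(records, key_fields):
    """Split records into first occurrences and later duplicates."""
    seen = {}
    duplicates = []
    for rec in records:
        fp = fingerprint(rec, key_fields)
        if fp in seen:
            duplicates.append(rec)
        else:
            seen[fp] = rec
    return list(seen.values()), duplicates


# Hypothetical records from two suppliers describing the same company.
records = [
    {"name": "Acme AS", "reg_no": "123456789", "source": "supplier_a"},
    {"name": "  acme  as ", "reg_no": "123456789", "source": "supplier_b"},
    {"name": "Nordic Fish AS", "reg_no": "987654321", "source": "supplier_a"},
]
unique, duplicates = deduplicate(records, ["name", "reg_no"])
```

The source field is deliberately left out of the fingerprint, so the same company reported by two suppliers collapses to one entry while the duplicate remains available for review.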
Prior to implementing the data catalog, it would take a month or more of laborious manual work to onboard new data sources or new countries. That process now takes a couple of days or less, as the data catalog automates discovery and tagging formerly done by hand.
Finally, governance and overall GDPR compliance got a major boost, as Creditsafe can now track the origin of all data files back to the first supplier, tagging the location automatically.
In all, Gow says the data catalog will be an integral part of Creditsafe’s five-year goal of becoming the biggest source of crowd-sourced business data globally.