Data Cleansing vs Data Maintenance: Which One Is Most Important?
Often, businesses ask us which process is more important: data cleansing or data maintenance. In the long term, which one should they focus on? Unfortunately there is no simple answer, but there is an easy way to understand the difference between them.
An Apple A Day…
When we think about data, we can compare it to caring for our health. In particular, data maintenance is a lot like brushing your teeth. We brush our teeth at least twice a day to stop decay from taking hold. If we didn’t, the sugar that we consume would gnaw away at the enamel and cause rot to set in.
The longer we leave it between brushings, the more vulnerable our teeth become. Similarly, our database must be continually cared for and maintained.
Why?
Data in a database decays over time in much the same way as teeth do. Frequent data maintenance keeps the data in good health and stops the rot from progressing to a catastrophic stage. That is the case for data maintenance, and it is why maintenance is an unavoidable task that all businesses must commit to.
But what about cleansing data?
Facing Facts
Brushing your teeth helps to stop them from crumbling and decaying, but we also need regular visits to the dentist. At these appointments, our teeth are thoroughly checked and professionally cleaned, and any damage is repaired before it escalates. Brushing does not mean these visits can be skipped.
We might not find the dentist’s chair pleasant, and there are certainly more enjoyable things to spend time and money on. But these regular appointments are essential if we want our teeth to last.
In the same way, data needs to be checked and validated by an expert. In our analogy, that expert is data quality software. This is your database’s ‘dentist’s appointment’ – the chance to catch and fix errors that have built up over time. Using sophisticated matching techniques, automated processes can pick out likely duplicates and find data that doesn’t play by the rules.
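To make the ‘dentist’s appointment’ a little more concrete, here is a minimal Python sketch of the two checks described above: spotting likely duplicates and finding records that break a rule. It is an illustration only, not how any particular data quality product works. The sample records, the field names, the 0.85 similarity threshold and the simple email rule are all assumptions made for this example; the matching uses `difflib.SequenceMatcher` from the Python standard library, which is far simpler than the techniques commercial tools rely on.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

# Hypothetical sample records; the field names are assumptions for illustration.
RECORDS = [
    {"id": 1, "name": "Jon Smith", "email": "jon.smith@example.com"},
    {"id": 2, "name": "John Smith", "email": "jon.smith@example.com"},
    {"id": 3, "name": "Mary Jones", "email": "mary.jones@example"},
    {"id": 4, "name": "Alan Brown", "email": "alan.brown@example.com"},
]

# A deliberately simple rule: something@something.something, no spaces.
EMAIL_RULE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def normalise(text):
    """Lower-case the text and collapse runs of whitespace."""
    return " ".join(text.lower().split())


def similarity(a, b):
    """Return a 0..1 similarity score between two strings."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(records, threshold=0.85):
    """Return pairs of record ids whose names look suspiciously alike."""
    pairs = []
    for left, right in combinations(records, 2):
        if similarity(left["name"], right["name"]) >= threshold:
            pairs.append((left["id"], right["id"]))
    return pairs


def rule_violations(records):
    """Return ids of records whose email does not match EMAIL_RULE."""
    return [r["id"] for r in records if not EMAIL_RULE.match(r["email"])]


if __name__ == "__main__":
    print("Likely duplicates:", likely_duplicates(RECORDS))
    print("Rule violations:", rule_violations(RECORDS))
```

With this sample data, records 1 and 2 are flagged as a likely duplicate pair, and record 3 is flagged because its email address has no domain suffix. Comparing every pair of records is fine for a small sketch but grows quickly with the size of the database, which is one reason this work is usually left to dedicated software.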
Activity | Typical Cleansing | Ideal Maintenance |
Prevention | 10% | 45% |
Detection | 30% | 30% |
Repair | 60% | 25% |
Don’t Depend on Dentures
If you don’t look after your teeth, you’ll end up with nothing – at best, you might get a set of false ones for your old age. If you don’t care for data, all the effort and money that was invested in collecting it will turn out to be wasted. And it will be impossible to build meaningful reports based on the scraps of accurate data that you have left. The only way to continue will be to start from scratch, buying a new set of data from someone else.
Aside from that, even a successful business with no reliable data faces a perilous future. Deprived of its most important asset – the information it needs to make sensible decisions – it must navigate without knowing who its customers are.
There is no shortcut to good data quality: neither cleansing nor maintenance can be skipped.