I just finished publishing a five-part series of articles on a data matching methodology for dealing with the common data quality problem of identifying duplicate customers.
The article series was published on Data Quality Pro, which is the leading data quality online magazine and free independent community resource dedicated to helping data quality professionals take their career or business to the next level.
To read the series, please follow these links:
- Identifying Duplicate Customers (Part 1)
- Identifying Duplicate Customers (Part 2)
- Identifying Duplicate Customers (Part 3)
- Identifying Duplicate Customers (Part 4)
- Identifying Duplicate Customers (Part 5)
Topics covered in the series:
- Why a symbiosis of technology and methodology is necessary when approaching the common data quality problem of identifying duplicate customers
- How performing a preliminary analysis on a representative sample of real project data prepares effective examples for discussion
- Why using a detailed, interrogative analysis of those examples is imperative for defining your business rules
- How both false negatives and false positives illustrate the highly subjective nature of this problem
- How to document your business rules for identifying duplicate customers
- How to set realistic expectations about application development
- How to foster collaboration between the business and technical teams throughout the entire project
- How to consolidate identified duplicates by creating a “best of breed” representative record