Data Quality: Why is the water brown?

At one point in my life I lived in a house that drew its water from a natural spring well.  In the spring, when the rain was persistent, the water coming out of the faucet would turn brown.  Needless to say, we didn’t drink it.

After a time we installed a filtration system and the water ran clear.  We still didn’t drink it, though there was the impression that because the output was clear it was pure.  Interestingly enough, by installing a filter (but not complete purification), we had inadvertently added a maintenance step to a formerly maintenance-free system.  Now we had the appearance of ‘pureness’, along with added cost, without actually benefiting from a ‘pure’ water supply.  Does that sound like some of the data issues you grapple with?

What is the point of this in an MDM blog post?  Well, the Enterprise Data Ecology works much the same way.  Somehow data got into the system, whether through users, automation (capture) or external feeds.  Is it ‘pure’?  Just because it seems ‘clear’ doesn’t mean it’s pure.  What steps are in place to filter, purify and test?  If your company’s data were water coming out of a faucet, would you drink it?

In the past I’ve helped firms evaluate their data quality, processes and governance, and have recommended and implemented opportunities for improvement.  Several of these engagements involved external data.  At one point I was approached about developing a repeatable quality audit solution for a specific data provider.  Though that was never funded, I have helped several firms ‘reassess’ their SLAs and contracts with various providers, improving overall data quality while reducing external data costs (through reduced fees tied to quality benchmarks).
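To make the idea of a repeatable audit against quality benchmarks a little more concrete, here is a minimal sketch in Python.  It is not the solution mentioned above (which was never built); the field names, the list of valid country codes and the threshold values are all hypothetical, chosen only to illustrate measuring a feed against agreed numbers.

```python
from dataclasses import dataclass

# Hypothetical rules for an external feed; a real SLA would define its own.
REQUIRED_FIELDS = ("id", "name", "country")
VALID_COUNTRIES = {"US", "CA", "GB"}


@dataclass
class AuditResult:
    total: int      # records examined
    complete: int   # records with every required field populated
    valid: int      # complete records that also pass the country check

    @property
    def completeness(self) -> float:
        return self.complete / self.total if self.total else 0.0

    @property
    def validity(self) -> float:
        return self.valid / self.total if self.total else 0.0


def audit_feed(records):
    """Count complete and valid records in an iterable of dicts."""
    total = complete = valid = 0
    for rec in records:
        total += 1
        # A field counts as populated only if it is non-empty after trimming.
        if all(str(rec.get(f) or "").strip() for f in REQUIRED_FIELDS):
            complete += 1
            if rec["country"] in VALID_COUNTRIES:
                valid += 1
    return AuditResult(total, complete, valid)


def meets_benchmark(result, min_completeness=0.98, min_validity=0.95):
    """Compare measured rates with (hypothetical) contractual thresholds."""
    return (result.completeness >= min_completeness
            and result.validity >= min_validity)


sample = [
    {"id": "1", "name": "Acme", "country": "US"},
    {"id": "2", "name": "", "country": "CA"},        # missing name
    {"id": "3", "name": "Globex", "country": "XX"},  # unknown country
    {"id": "4", "name": "Initech", "country": "GB"},
]
result = audit_feed(sample)
print(result.completeness, result.validity, meets_benchmark(result))
```

Running the same checks on every delivery gives both sides a shared, measurable definition of ‘pure’, which is what makes a fee adjustment tied to quality possible to negotiate.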

Today I noticed a good article for those incorporating external data, “External Data in Enterprise Data Warehousing – Trendy and Trying” on information-management.com.  Mr. Ramakrishnan does a good job of pointing out some of the major challenges and issues that arise when bringing external data into your organization.  He also provides practical advice for incorporating it.  NOTE:  I have no relationship with Mr. Ramakrishnan and offer this as my (and only my) objective opinion.

I hope this article and my plumbing analogy will help the next time you are considering bringing in external data.  Note that the point of this post is not to discourage external data integration.  Indeed, these days a closed-off enterprise is not realistic in most cases.  However, remember that the cost isn’t simply the cost of the data feed.  There is an added maintenance cost from increased complexity and management.  In addition, think of the potential cost of ‘organizational sickness’ if dirty data impacts revenues, costs, risk or compliance.

On the other hand, weigh the potential ‘lost opportunity cost’ if your organization remains in its pond of denial.  I know, that is a weak pun and I’ll stop now while I’m (hopefully) ahead.

One Response to “Data Quality: Why is the water brown?”

  1. Info Says:

    I was very interested in reading your excellent article on data management and data quality. We have published a number of white papers on the area of data management – in particular reference data management
    have a look at:
    http://www.polarlake.com/resources/

    Any feedback would be greatly appreciated
    Polarlake – reference data management
