A common acronym that comes up in managing Master/Reference Data is SoR (System of Record). As in, where did the data come from? What is the SoR? Is an information source, by definition, also the SoR?
I’d like to introduce this topic with an analogy to the natural world. I like to contrast SoR with what I call SoLaR (System of Last Report…or Resort). I think it’s time we Green our MDM. If asked for the source of light, most people would reply: the sun. Even at night, sunlight reflects off the moon and provides light. Is the moon the SoR for night? With the moon we meet a couple of the challenges of SoR: timeliness and indirect source.
Another challenge for SoR is context. If you were a navigator in the northern hemisphere, your most important light might be the North Star (Polaris). Sometimes there is no moon visible, and even when it is, it is not a reliable navigational tool. Of course, this context relies on one being on Earth and in the northern hemisphere. The data is getting more complex! Personally, I’ll admit I use a GPS and not Polaris to navigate at night.
Beyond the challenges of timeliness, source, and context, there is one of process. Most of those reading this article get their light from electricity. That electricity more than likely came from coal or oil (a clear exception is wind). Coal and oil can both be linked ultimately to the sun by virtue of the sun’s causality in the life from which they originated. So is it true that electricity is SoLaR (System of Last Report…or Resort)?
I’ve gone a bit far afield to make a point crucial to SoR. Without knowing the requirement (what is needed?) and the context (who will use it and for what?), one runs the risk of providing a solution for which there is no problem or, worse, not providing the business with any solution. Where we get the data, from whom, and when is typically seen as a technology problem, not a business problem. The necessary partnership will have to be a topic for another day.
Here is a link to an article by Daniel Linstedt that makes relevant points on this topic. In addition, please note the comment at the bottom of the article concerning SoE (System of Entry). Also, please feel welcome to join the LinkedIn group (Master) Reference Data Management. I will be posting a question looking to solicit broader feedback.
Daniel provides three definitions for SoR:
- SoR Definition 1: “The data that exists in the source system, in other words, where the data is entered, or originates for the first time.” It should be auditable and compliant. It may not be “clean, quality checked, or integrated”.
- SoR Definition 2: “This data resides in a NORMALIZED enterprise data warehouse, and is auditable and compliant.” He then goes into some discussion of his Data Vault approach.
- SoR Definition 3: “Master Data, or Conformed Dimensions – data that has been cleansed, quality checked, duplicates removed. In the master data set there is only a SINGLE copy of each (customer / part / work order / supplier etc..) item. This is an SoR by business standards because it represents value to the business in eliminating duplicates and understanding how the business looks “TODAY”. It’s a snapshot of the current consistent, and quality cleansed information that feeds the rest of the source systems.”
Ultimately we come to the reminder that technologists need to work to understand the requirements and build practical and auditable data management systems. After all, solving a business need is usually the goal. I recommend keeping a matrix (I’ll discuss an ‘active process’ solution later) that tracks each object back to its SoE, SoR, or whatever label it has been tagged with. In short, know thy data.
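To make the matrix idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the `LineageEntry` fields, the system names (`crm_app`, `mdm_hub`, and so on), and the sample rows are hypothetical and do not come from Daniel's article or from any particular MDM product. It only shows one possible shape for such a matrix: one row per data object, recording its SoE, its SoR, which of the three SoR definitions applies, and the business context.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LineageEntry:
    """One row of a hypothetical SoE/SoR tracking matrix."""
    data_object: str       # business object, e.g. "customer"
    system_of_entry: str   # SoE: where the data is first entered
    system_of_record: str  # SoR: the system treated as authoritative
    sor_definition: int    # 1, 2, or 3, per the three definitions above
    business_context: str  # who uses the data and for what


# Illustrative rows only; all system names are made up.
MATRIX: List[LineageEntry] = [
    LineageEntry("customer", "crm_app", "mdm_hub", 3, "billing and marketing"),
    LineageEntry("work_order", "field_service_app", "field_service_app", 1,
                 "operations audit"),
    LineageEntry("supplier", "erp", "enterprise_dw", 2, "compliance reporting"),
]


def lookup(matrix: List[LineageEntry], data_object: str) -> Optional[LineageEntry]:
    """Return the matrix row for a data object, or None if it is untracked."""
    for entry in matrix:
        if entry.data_object == data_object:
            return entry
    return None


def indirect_sources(matrix: List[LineageEntry]) -> List[str]:
    """List objects whose SoR differs from their SoE (the 'moon' case)."""
    return [e.data_object for e in matrix
            if e.system_of_entry != e.system_of_record]
```

A call such as `lookup(MATRIX, "customer")` returns the row naming the SoE and SoR for that object, while `indirect_sources(MATRIX)` flags the objects whose SoR is not where the data was first entered. Those are the cases where the timeliness and indirect-source questions deserve a closer look.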
Hopefully this brief (at least that was the intent) article adds some flavor to the subject of SoR and some context for solutions. Thanks for your time and please feel free to comment.