Those of you who have worked with me, know of my fondness for Operational Data stores. I have always believed in the importance of having an enterprise or holistic view of the data requirements for every application. An Operational Data Store seemed to be the perfect vehicle to ensure this happened. Perhaps my fondness was related to not wanting to stray too far from the normalization rules that I knew quite well. In this way, it was a new-ish discipline or context that really wasn’t new.
I always looked kinda sideways at those weird Dimensional modelers with their Star Schemas and Snowflakes. I mean if they really put their mind to it, they would be able to figure out how to solve their data needs with a nice relational normalized Operational Data Store, Only exceptional and massive amounts of data require the Dimensional modeling constructs that these models typically use right? I mean what is so complicated about a model with only 100 main tables? Shouldn’t everyone know how to write SQL by hand?
On my latest project, I have had the opportunity to become re-introduced to Dimensional Analysis and modeling and I have found the process fascinating and very valuable. Besides the obvious benefits that are being realized by being able to model the data in a way that allows the clients to efficiently write and execute queries, there was an unexpected benefit.
Taking a normalized Data Model and attempting to translate it into a Dimensional Model really challenges and validates your data model. It is easy to create a model with a multitude of complex relationships than it is to distill in down to a handful of FACTs and Dimensions. With so many relationships, it is possible to inconsistencies to exist and hide in the data model. I found multiple modeling errors in the process on trying to create a Dimensional model from my relational model. When you distill a relational model down to a Dimensional model, inconsistencies and errors become very apparent in the creation of the FACTs and Dimensions.
Dimensional Analysis also forces you to look at the data in a different way. Instead of a relational/hierarchical way, I find it forces me to look at the data in a chronological way and forces me to consider data changes, data history, and data latency in ways I may not have considered before. Not having to account for data across time and verify consistency at every point is quite a bit simpler.
I am a convert of using Dimensional Analysis on all my data models for validation of the data model and additional analysis of the data. I’ve discovered that I need to understand the data better to create a Dimensional Model than a normalized model. More factors need to be considered and creating the Dimensional model with fewer objects requires that the data model has greater consistency, integrity, and cohesion.
Simple is hard.