Data Modeling, Data Warehouse, Database

Why #Dimensional Analysis should be done on every #datamodel

Those of you who have worked with me, know of my fondness for Operational Data stores. I have always believed in the importance of having an enterprise or holistic view of the data requirements for every application. An Operational Data Store seemed to be the perfect vehicle to ensure this happened. Perhaps my fondness was related to not wanting to stray too far from the normalization rules that I knew quite well. In this way, it was a new-ish discipline or context that really wasn’t new.

I always looked kinda sideways at those weird Dimensional modelers with their Star Schemas and Snowflakes. I mean if they really put their mind to it, they would be able to figure out how to solve their data needs with a nice relational normalized Operational Data Store, Only exceptional and massive amounts of data require the Dimensional modeling constructs that these models typically use right? I mean what is so complicated about a model with only 100 main tables? Shouldn’t everyone know how to write SQL by hand?

On my latest project, I have had the opportunity to become re-introduced to Dimensional Analysis and modeling and I have found the process fascinating and very valuable. Besides the obvious benefits that are being realized by being able to model the data in a way that allows the clients to efficiently write and execute queries, there was an unexpected benefit.

Taking a normalized Data Model and attempting to translate it into a Dimensional Model really challenges and validates your data model. It is easy to create a model with a multitude of complex relationships than it is to distill in down to a handful of FACTs and Dimensions. With so many relationships, it is possible to inconsistencies to exist and hide in the data model. I found multiple modeling errors in the process on trying to create a Dimensional model from my relational model. When you distill a relational model down to a Dimensional model, inconsistencies and errors become very apparent in the creation of the FACTs and Dimensions.

Dimensional Analysis also forces you to look at the data in a different way. Instead of a relational/hierarchical way, I find it forces me to look at the data in a chronological way and forces me to consider data changes, data history, and data latency in ways I may not have considered before. Not having to account for data across time and verify consistency at every point is quite a bit simpler.

Summary

I am a convert of using Dimensional Analysis on all my data models for validation of the data model and additional analysis of the data.  I’ve discovered that I need to understand the data better to create a Dimensional Model than a normalized model. More factors need to be considered and creating the Dimensional model with fewer objects requires that the data model has greater consistency, integrity, and cohesion.

Simple is hard. 

 

About Terry Bunio

Terry Bunio has worked for Protegra for 14+ years because of the professionalism, people, and culture. Terry started as a software developer and found his technical calling in Data Architecture. Terry has helped to create Enterprise Operational Data Stores and Data Warehouses for the Financial and Insurance industries. Along the way Terry discovered that he enjoys helping to build teams, grow client trust and encourage individual career growth, completing project deliverables, and helping to guide solutions. It seems that some people like to call that Project Management. As a practical Data Modeller and Project Manager, Terry is known to challenge assumptions and strive to strike the balance between the theoretical and real world approaches for both Data Modelling and Agile. Terry considers himself a born again agilist as Agile implemented according to the Lean Principles has made him once again enjoy Software Development and believe in what can be accomplished. Terry is a fan of Agile implemented according to the Lean Principles, the Green Bay Packers, Winnipeg Jets, Operational Data Stores, 4th Normal Form, and asking why

Discussion

3 thoughts on “Why #Dimensional Analysis should be done on every #datamodel

  1. What were your first “ah-ha” steps in your re-introduction to Dimensional Modeling with your project? Book? Course? Good examples? Teammate?

    Posted by Luc | September 11, 2012, 12:36 pm
    • Hi Luc, it was a combination of things. I first did some reading on ‘traditional’ data analysis to refresh my memory. There are exceptional books out there by David C. Hay, John Giles, and Len Silverstein on the subject. I then read various articles on dimensional modeling and reviewed Kimball’s books. The book that brough it all together was the Star Schema reference by Chris Adamson.

      But the real ah-ha moment was taking all this knowledge and doing dimensional analysis and seeing how the process challenged the knowledge I had of the data and exposed modeling errors in my model…

      Posted by bornagainagilist | September 11, 2012, 8:23 pm

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: