Data Warehouse

This category contains 13 posts

#SAP, Breaking Data, and Re-enabling #SQLServer Database Referential Integrity Constraints #Microsoft #FTW

As data professionals, we often no longer have full control over the quality of data in the source systems. I am discussing SAP in my example, but I could just as easily have mentioned PeopleSoft, Salesforce, or any number of other purchased solutions. Usually those solutions are purchased and then we are tasked with maintaining those … Continue reading

Creating my own #ETL data validation #FTW

Recently, on the same project where I created an Agile Data Warehouse and an Extract, Transform, and Load automated test suite, I was tasked with creating a data validation process. We needed to create a foundational process that could be leveraged to provide ongoing data validation for the data load process. We were responsible for loading data … Continue reading

How to create 10,000 Extract, Transform, and Load automated tests using 4 tables #agile #data

The thing I love about my chosen profession is the ability to learn new things and improve on lessons learned from past projects. Recently I was able to take on a problem that I have experienced on multiple past projects. “How can we easily create automated tests for a Data Migration or Extract, Transform, and … Continue reading

12 Rules to create a Dimensional Model from a Normalized Model in an #agile way

Recently I have been on a project where I have been fortunate enough to develop both a normalized Operational Data Store and a Dimensional Data Warehouse. In addition to this, I have been fortunate to be able to recreate the Dimensional Data Warehouse three times over. This has been because the project has been done … Continue reading

Adaptive Data Model – #Agile or Anathema?

I have seen the concept of an Adaptive Data Model proposed as an Agile method to Data Modelling lately. (Most recently in Ken Collier’s excellent book – “Agile Analytics”) The theory is that you can be more Agile using an Adaptive Data Model instead of a traditional Data Model of the business domain. Definition An … Continue reading

Why #Dimensional Modeling matters

I’ve recently completed a data modeling initiative on a major project. After doing this I’ve come to two major conclusions: The coverage area in Insurance is probably the most devious and twisted area of data that I have ever modeled. Dimensional Modeling should be done on every model to ensure you can simply model the … Continue reading

#SDEC12 Conference Review #Agile

Well, another Software Development and Evolution conference has come and gone. (You’ve always wondered what SDEC stood for, didn’t you?) It was a lot of work and effort to make it all happen, but in the end it was very enjoyable. I learned an immense amount and can’t wait until next year. My Highlights The … Continue reading

#Data #Hoarding and #Norm-aholics

Are you a sufferer of Data Hoarding? This is an affliction that affects thousands of businesses every day. Common symptoms are terabytes of data being kept around just in case you ever need them. The data ends up being stored in a multitude of disconnected tables just piled on top of one another with no … Continue reading

Why #Dimensional Analysis should be done on every #datamodel

Those of you who have worked with me know of my fondness for Operational Data Stores. I have always believed in the importance of having an enterprise or holistic view of the data requirements for every application. An Operational Data Store seemed to be the perfect vehicle to ensure this happened. Perhaps my fondness was … Continue reading

Why do we #DataModel at all?

People in the Database world take Normalization and Data Modeling as something that should be done without question. I compare it to best practices like versioning software. No one expects that anyone would create software without version control anymore. But more and more often lately, I get questioned and challenged on why we need to normalize and model … Continue reading