
How an #Agile Data Warehouse leveraged an Innovation Game #agile2012 #sdec2012

On my latest project I have been doing a lot of research and learning on how to conduct a Data Warehouse project in an Agile manner. Actually I’m really honoured to be able to combine my two passions in my career: Data Modeling and Agile.

This investigation has led me to what I feel is a unique approach to an Agile Data Warehouse project. Many of the Agile Data Warehouse projects I hear of and read about seem to follow a somewhat standard process. This process usually involves building a Data Warehouse in increments by working with departments within the corporation to understand their data requirements. (In many situations, this is a very detailed analysis of their data requirements.) In this way, the Data Warehouse is built up over time until an Enterprise Data Warehouse is created. While this is far better than trying to analyze the data requirements for the entire corporation, I do believe there are some limitations with this approach.


1) If the Data Modelers are not experts in the existing data domain, this process may involve significant rework as data modeled for one department may need to be modified as new information is learned and existing data structures need to be integrated with data structures designed in subsequent releases. This risk can be significantly reduced if the data modelers are experts in the data domain, but this can be a real risk for consultants that are not experts.

2) We are still not implementing a Data Warehouse in an iterative fashion, only incrementally. While this is a step in the right direction, it still falls short of implementing a Data Warehouse iteratively and getting even quicker feedback. Since we are essentially time-boxing the implementation of data requirements on a department-by-department basis, we are only working on the most important data requirements within each department. We are not working on the prioritized data requirements for the entire corporation. If every department receives the same time allocation, we may work on some data requirements while other more important ones are left in the backlog. And finally, the data requirements for each department can get quite detailed. In many ways, we are still doing Big Design Up Front – in chunks.
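To make the prioritization point concrete, here is a small sketch in Python. The departments, requirement names, and priority numbers are all invented for illustration; they are not from our project. It compares giving every department one slot against simply taking the top items for the whole corporation.

```python
# Hypothetical backlog: (department, requirement, corporate priority).
# Priority 1 is the most important item for the corporation as a whole.
backlog = [
    ("Finance", "F1", 1),
    ("Finance", "F2", 2),
    ("Finance", "F3", 3),
    ("Claims", "C1", 4),
    ("Claims", "C2", 7),
    ("Sales", "S1", 8),
    ("Sales", "S2", 9),
]


def pick_per_department(items, slots_per_department):
    """Every department gets the same number of slots."""
    picked = []
    taken = {}
    for dept, name, _priority in sorted(items, key=lambda item: item[2]):
        if taken.get(dept, 0) < slots_per_department:
            picked.append(name)
            taken[dept] = taken.get(dept, 0) + 1
    return picked


def pick_corporate(items, total_slots):
    """Take the most important items regardless of department."""
    ranked = sorted(items, key=lambda item: item[2])
    return [name for _dept, name, _priority in ranked[:total_slots]]


per_department = pick_per_department(backlog, slots_per_department=1)
corporate = pick_corporate(backlog, total_slots=3)
print(per_department)  # ['F1', 'C1', 'S1']
print(corporate)       # ['F1', 'F2', 'F3']
```

With the same three slots, the department-by-department approach delivers S1 (priority 8) while F2 (priority 2) waits in the backlog.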

Our Approach

My background in Data Modeling has always been aligned with Bill Inmon’s thoughts rather than Ralph Kimball’s. So, to no surprise, I felt I needed a good understanding of the data domain for the Enterprise before I could start to model the data. To address this I proposed we create an Agile Enterprise Data Model. An Agile Enterprise Data Model is an Enterprise Data Model that takes 6 weeks to create instead of 6 months (or 6 years). Its purpose is to validate the entities, relationships, and primary attributes. It is not intended to drive down into every single attribute and ensure alignment. In my experience, this excessive detail has derailed other Enterprise Data Model projects, as consensus cannot be reached or the data is always evolving. But in creating an Agile Enterprise Data Model, we now understand enough of the enterprise’s data so that even though we may only be modeling one department’s data, we know what integrations and relationships are required in the future.

I felt this addressed the first limitation.

The second limitation was much harder. It was difficult to engage clients in small slices or iterations of their data requirements in a light-weight fashion that wouldn’t require reviewing their current data requirements in detail. How could we get a picture of the data requirements for the corporation at a high level? Many of the clients were concerned that if the project operated in iterations they would not get all of their requirements satisfied. In addition, our current project had a list of existing reports that the clients would like to be migrated to the new environment. How could we engage our clients and work with them to determine what their corporation’s data requirements were in an iterative, light-weight manner? How could we provide visibility of what the entire corporation’s data requirements were?

I turned to Innovation Games.

I am a fan of Innovation Games and any light-weight process that can help to visualize data and engage the clients better. I was searching for a way to use a variant of the ‘Prune the Product Tree’ game. But I didn’t feel that just placing data requirements on the Product Tree would truly help the engagement of the client and provide the visibility required. What we really needed was a variant of the Product Tree that helped the clients ‘see’ their data requirements to confirm that we got it right and that showed the requirements for the entire corporation on one wall.

We ended up merging a ‘Prune the Product Tree’ game with a ‘Spider Web’ game with some unique additions.

Here is what we are doing:

1) We are seeding the visualizations with 30-40 enterprise level reports that are used internally and externally.

2) We have asked for the top 20 reports from each department to further seed the visualizations. This will help to populate the visualization and I believe help the clients to generate other data requirements that we are missing.

3) We are capturing data requirements or reports in the simplest mode possible. We decided to leverage the Goal, Question, Metric method proposed by Victor Basili. Our data requirement stories will specify the following:

  • Noun : The object of primary interest
  • Question : The main inquiry about the noun
  • Reason : What is the business reason for the question
  • Frequency : How often do I need the answer?

An example is provided below:

Noun | Question | Reason | Frequency
Claims | Amount > $1,000 | To generate audits | Monthly
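For anyone who wants to keep these stories in something other than sticky notes, the four fields map naturally onto a small record. This is only a sketch in Python; the class name `DataRequirementStory` and the `as_sentence` helper are my own invention for illustration, not part of the Goal, Question, Metric method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataRequirementStory:
    """One data requirement story, using the four fields listed above."""

    noun: str       # the object of primary interest
    question: str   # the main inquiry about the noun
    reason: str     # the business reason for the question
    frequency: str  # how often the answer is needed

    def as_sentence(self) -> str:
        # Render the story as a single line that fits on a card.
        return f"{self.noun}: {self.question} ({self.reason}; {self.frequency})"


# The Claims example from the table above.
claims_story = DataRequirementStory(
    noun="Claims",
    question="Amount > $1,000",
    reason="To generate audits",
    frequency="Monthly",
)
print(claims_story.as_sentence())
# Claims: Amount > $1,000 (To generate audits; Monthly)
```

Keeping the record this small is deliberate: anything more detailed starts to drift back toward design up front.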

We will capture these additional data requirement stories in Silent Brainstorming sessions with users. This Silent Brainstorming will occur after we present the visualizations and review the data requirements that have seeded the visualization.

The final piece of the puzzle

Even with those first three pieces, it was still unclear how we could visualize the data and reporting requirements and gain engagement with the clients. Then, during an Innovation Games course that was teaching us how to customize and create our own game, it struck me. The Product Tree is the wrong metaphor for our data requirements. We need something that visualizes the data requirements themselves. We iterated from a spider web metaphor, through a three-dimensional axis metaphor, until we settled on a hexagon. I’d recommend you find the shape that best fits your data domain. The hex and its six dimensions fit our data domain nicely.

The hex diagram below is our customized Innovation Game we will use to seed data requirements and generate new requirements.

The Concept

Usually for all data requirements or reports there is a person object at the centre. This is no exception. We will have a data hex for each type of person that the corporation needs to know about. The six axes were chosen by understanding the data domain and by reviewing existing reports that are used by the client. The existing data requirements and reports will be translated into data requirement stories and placed on the hex. Sessions will be facilitated to add additional stories onto the data hexes.

Once all of the stories have been placed on the data hexes, we will prioritize them by moving them toward or away from the centre. Data requirement stories that are high priority will be moved near the centre. Stories of lower priority will be moved to the outer rings.
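The wall version is the real thing, but the same structure can be sketched in Python to show how the rings turn into a backlog. Everything named here is an assumption for illustration: the classes `DataHex` and `PlacedStory`, the six axis names, the person type, and the "Payments overdue" story are all made up, since the actual axes depend on your data domain.

```python
from dataclasses import dataclass, field


@dataclass
class PlacedStory:
    text: str  # the data requirement story written on the card
    axis: str  # which of the six axes the card sits on
    ring: int  # 1 = next to the centre (highest priority)


@dataclass
class DataHex:
    person_type: str  # the person object at the centre
    axes: tuple       # the six axis names
    stories: list = field(default_factory=list)

    def __post_init__(self):
        if len(self.axes) != 6:
            raise ValueError("a data hex needs exactly six axes")

    def place(self, text, axis, ring):
        if axis not in self.axes:
            raise ValueError(f"unknown axis: {axis}")
        if ring < 1:
            raise ValueError("ring must be 1 or greater")
        self.stories.append(PlacedStory(text, axis, ring))

    def move(self, text, ring):
        # Prioritizing means moving a card toward or away from the centre.
        if ring < 1:
            raise ValueError("ring must be 1 or greater")
        for story in self.stories:
            if story.text == text:
                story.ring = ring
                return
        raise KeyError(text)


def backlog(hexes):
    """All stories across every hex, innermost ring first."""
    placed = [(s.ring, h.person_type, s.text) for h in hexes for s in h.stories]
    return sorted(placed, key=lambda item: item[0])


# Hypothetical axes and stories, for illustration only.
AXES = ("Claims", "Policies", "Payments", "Contacts", "Locations", "Services")
customer = DataHex("Customer", AXES)
customer.place("Claims with amount > $1,000", "Claims", ring=3)
customer.place("Payments overdue", "Payments", ring=2)
customer.move("Claims with amount > $1,000", ring=1)
print(backlog([customer]))
```

Reading the hexes from the centre outward gives one prioritized list for the whole corporation rather than one list per department.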

The Rationale

Using this metaphor we are able to do the following:

1) Visualize the hundreds of data requirements and reports of the corporation at once and prioritize them.

2) Ensure we are working on the true data requirements across the entire corporation.

3) Work in a true iterative fashion in how we deliver data requirements and ultimately build the Data Warehouse iteratively.

4) Use a light-weight method that limits detailed design up front.


If you want to hear how this process worked, the results will be presented at Agile 2012 in Dallas and at SDEC2012 in Winnipeg! I’ll also have a subsequent blog post that publishes the results. So far the feedback has been encouraging and we will be expanding the use of the process over the next few weeks.

About Terry Bunio

Terry Bunio has worked for Protegra for 14+ years because of the professionalism, people, and culture. Terry started as a software developer and found his technical calling in Data Architecture. Terry has helped to create Enterprise Operational Data Stores and Data Warehouses for the Financial and Insurance industries. Along the way Terry discovered that he enjoys helping to build teams, grow client trust and encourage individual career growth, completing project deliverables, and helping to guide solutions. It seems that some people like to call that Project Management. As a practical Data Modeller and Project Manager, Terry is known to challenge assumptions and strive to strike the balance between the theoretical and real world approaches for both Data Modelling and Agile. Terry considers himself a born again agilist as Agile implemented according to the Lean Principles has made him once again enjoy Software Development and believe in what can be accomplished. Terry is a fan of Agile implemented according to the Lean Principles, the Green Bay Packers, Winnipeg Jets, Operational Data Stores, 4th Normal Form, and asking why.


3 thoughts on “How an #Agile Data Warehouse leveraged an Innovation Game #agile2012 #sdec2012”

  1. Nice post!

    I have a few questions:
    1. Is “Agile Enterprise Data Model” similar to something else I’ve heard… the Canonical Model?
    2. Once you’ve done this planning, how short can your data/report iterations get (2 weeks?)? …a big challenge I see with Agile DW or Agile BI.

    Posted by Luc | June 27, 2012, 5:40 pm
    • Thanks Luc.

      Actually I believe this goes beyond a canonical model. My understanding of canonical models is that they are more like data taxonomies and don’t go into the detail of relationships between entities and distinctive attributes for the entities.

      The real benefit of this approach is that you have a ‘report backlog’ that can then allow even weekly iterations. You pull report stories and refine them and work on them in priority sequence. We will run into challenges where we may not have production data to test the reports until the ETL processes are done. This is something we will have to work through and see if we can co-ordinate.

      If you already have a warehouse that is populated, I don’t see any issue with having weekly or even daily iterations!


      Posted by bornagainagilist | June 28, 2012, 5:21 pm


  1. Pingback: How an #Agile Data Warehouse leveraged an #InnovationGame – Iteration 2! | bornagainagilist - August 1, 2012
