Friday, February 5, 2010

The "Million Dollar" BI Architecture Diagram

Anyone who has worked with me or taken training from me has seen some variant of this diagram. It's not an original concept, although this version has been refined to include my personal experience with many Business Intelligence implementations, as well as my bias towards both the Kimball Business Dimensional methodology and using Microsoft SQL Server (especially Analysis Services) as part of successful deployments.




This diagram is conceptual, although I have designed systems that had a separate physical server for each of the servers represented, and these worked very, very well.


I have lots to say about this, and will continue to do so on this blog. Perhaps the most significant concept to mention at the outset is that this is a recipe for success with Business Intelligence. Put another way, the level of success I've had with BI environments corresponds pretty closely with how well the architecture of these environments matched this diagram. (Also note that I said "a" recipe for success - not "the" recipe for success. I acknowledge that there are other approaches, but this is the one I've used successfully for over a decade.)


Two important concepts inherent in this diagram are:


a) Each layer in this diagram has a primary purpose. From an architecture standpoint, designing these functional layers to do only a single thing very well creates a robust, yet simple, environment.


b) There are well-defined interfaces between the functional layers. Having clearly defined boundaries allows the processes to be moved from server to server (which enables important operational benefits like high availability, disaster recovery, and performance tuning).
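
To make point (b) a little more concrete, here is a minimal Python sketch. It is only an illustration: the `WarehouseReader` interface, the two warehouse classes, and the sample row are all hypothetical names and data invented for this example, not part of any real system. The idea is that a downstream layer depends only on the interface, so the process behind it can be moved to another server without the consumer changing.

```python
from typing import Protocol


class WarehouseReader(Protocol):
    """Hypothetical boundary between the warehouse layer and its consumers."""

    def read_facts(self) -> list[dict]:
        ...


class PrimaryWarehouse:
    """Stand-in for the warehouse running on its usual server."""

    def read_facts(self) -> list[dict]:
        return [{"customer": "Acme", "amount": 150.0}]


class StandbyWarehouse:
    """Stand-in for a copy of the same data on a different server."""

    def read_facts(self) -> list[dict]:
        return [{"customer": "Acme", "amount": 150.0}]


def total_sales(source: WarehouseReader) -> float:
    # The consumer only knows about the interface, not which server answers.
    return sum(fact["amount"] for fact in source.read_facts())
```

Calling `total_sales(PrimaryWarehouse())` and `total_sales(StandbyWarehouse())` gives the same answer, which is the property that makes failover or relocation a purely operational decision.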


What are these primary purposes mentioned above, and how do they benefit the overall architecture?
  1. Source Systems - OLTP, Line of Business, ERP systems, and other sources - All BI systems are 100% dependent upon being able to source data from somewhere. The systems that run the business are our sources. All the rich data that we analyze and/or aggregate comes from there.
  2. Data Quality - Extraction, Transformation, and Loading - Because the systems in #1 are generally concerned with running the business and not with pristine and perfect data, there must be intentional processes that clean and consolidate data from these sources. That happens in layer 2.
  3. Single Vision - Enterprise Data Warehouse - This is what an organization "knows" about itself, and enables a single vision of the truth. This is the consolidated, consistent historical data store, ideally becoming the "system of record" after the operational processes are no longer active. Note that under this model, this layer does not have to be tuned for interactive query performance - the primary purpose here is to act as a sound repository of enterprise information.
  4. Business Intelligence - Unified Dimensional Model - Nearly all successful implementations include a layer like this, either informally or formally. In my view, formally is the superior approach. In this layer, performance is improved by creating locally (or proximally) cached copies of information. This is where "Business Intelligence" is exposed - i.e. the metrics and KPIs that provide useful information. All presentation tools should source from this database.
  5. Presentation - By keeping the Business Intelligence in layer 4, the presentation tools are simply about delivering information to the various audiences in the format and vehicle most convenient for each. This allows different tools to be used as needed.
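
As a rough illustration of the five layers above, here is a toy Python sketch in which each layer does one job and hands its output to the next. Everything in it is an assumption made for the example: the function and class names, the sample rows, and the cleaning rules are hypothetical, and a real deployment would use an ETL tool, a relational warehouse, and something like Analysis Services rather than in-memory lists.

```python
from collections import defaultdict


def extract_sources():
    """Layer 1: raw rows as the operational systems happen to store them."""
    return [
        {"system": "erp", "customer": " Acme ", "amount": "100.50"},
        {"system": "pos", "customer": "ACME", "amount": "49.50"},
        {"system": "pos", "customer": "Globex", "amount": None},  # dirty row
    ]


def clean(rows):
    """Layer 2: drop unusable rows and consolidate names and types."""
    cleaned = []
    for row in rows:
        if row["amount"] is None:
            continue  # skip rows with no amount
        cleaned.append({
            "customer": row["customer"].strip().title(),
            "amount": float(row["amount"]),
        })
    return cleaned


class Warehouse:
    """Layer 3: consolidated historical store; not tuned for fast queries."""

    def __init__(self):
        self.facts = []

    def load(self, rows):
        self.facts.extend(rows)


def build_model(warehouse):
    """Layer 4: pre-aggregated copy of the data that holds the metrics."""
    totals = defaultdict(float)
    for fact in warehouse.facts:
        totals[fact["customer"]] += fact["amount"]
    return dict(totals)


def render(model):
    """Layer 5: presentation only formats what layer 4 already computed."""
    return [f"{customer}: {total:.2f}" for customer, total in sorted(model.items())]


warehouse = Warehouse()
warehouse.load(clean(extract_sources()))
report = render(build_model(warehouse))
```

Note that `render` contains no business logic: the total is computed once in `build_model`, so a second presentation tool reading the same model would show the same number.
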
BTW - I call this the "Million Dollar" BI Architecture Diagram because:

  • If you're an organization using BI this effectively, it should represent (at least) a million dollars in increased revenue or reduced cost.
  • If you're a BI practitioner, it should be worth that in revenue or, perhaps even more significantly, a million dollars in reduced stress - because it represents a plan that is proven and works.
