Information in Context

Friday, July 23, 2010

self-exemplify - Dr. Edward Tufte's seminar in Chicago July 23

In his book The Visual Display of Quantitative Information, 2nd edition, Edward Tufte describes how he wanted to make the book self-exemplifying. In other words, the book should use the techniques for the effective display of information to describe those same techniques; it should be an example of itself. Dr. Tufte's seminar today in Chicago did the same thing.

I often find myself experiencing a great sense of relief and identification when encountering a thought leader who has wrestled with the same issues as I have in my work, and aspired towards some effective execution, whether this is with technology issues or softer issues like managing complexity, personal productivity, etc. I felt the same sense of relief in Dr. Tufte's extremely concise and coherent exposition on effective information display.

This was a fairly dense 5 hours. I took some notes, but for the most part I tried to pay attention and absorb what was presented. Dr. Tufte was not shy about his opinions and would frequently illustrate points with succinct, boldly worded statements about best practices (or the lack thereof).

I am reproducing some of these statements and impressions here, as hastily noted. If you do anything with the presentation of information (whether that be PTA newsletters or CPM Dashboards), please do yourself a favor and attend one of these wonderful seminars if you have the chance.

Here are Dr. Tufte's ideas and statements in roughly chronological order. It's possible I have paraphrased him inaccurately; I'll blame any mistakes on the heat of the moment. Dr. Tufte is definitely more lucid than my note-taking ability. This is intentionally left in "brain-dump" format. I think many of these statements are provocative on their own, but I hope they might intrigue anyone reading to seek out Dr. Tufte's books (Beautiful Evidence; Visual Explanations: Images and Quantities, Evidence and Narrative; Envisioning Information; The Visual Display of Quantitative Information, 2nd edition) and/or his seminar.
  • Presenting (to others) is a moral act.
  • Examination of the Music Animation Machine. Especially how there is no legend or guidelines yet it's a very dense and intuitive presentation of information
  • Dr. Tufte used this as an example that refutes the idea of "information overload" and he cited several examples of dense presentation throughout the seminar (e.g. 800,000 data points on 2 sides of an 11x17 page)
    • "There's no such thing as information overload, just lousy design"
  • A worthwhile diagram deserves the same amount of intention as the text that would impart the equivalent amount of information
  • Graphics are frequently used to depict causality
    • Policy and prevention missions both need to analyze causality
  • Every linking line should be annotated
  • The map is the gold standard for effective presentation
  • "Chart junk" should always be replaced by information
  • In a graphic presentation, are you using the results of evidence, or "evidence selection" i.e. are you cherry-picking favorable data?
  • One should assume that presenters have similar motivations as the audience (and vice-versa)
  • You want an open mind, but not an empty head.
  • Maximize content reasoning time; minimize content interpreting time.
  • Paper has 10 times the resolution of a computer display. Paper has 100 times the resolution of projected slides.
  • Authoritarian presenters tend to distrust their audience. This creates the tendency to stint information. (3 points per slide - sound familiar?)
  • Rather than "know your audience", "know your content". Respect your audience instead.
  • Do whatever it takes to impart the content. e.g. Sock Puppets, real objects, physical models. Don't be constrained by convention
  • Every time you can get a real object in a presentation, do so.
  • If possible, see how data is originally collected.
    • Example of water being collected from the cleanest part of the river in a pollution impact study
  • 1 + 1 = 3 - describes the phenomenon that 2 graphic elements create 3 effects - the effect of each, and the effect of the juxtaposition of the 2 objects
  • Local Optimization = Global Pessimization
  • The goal is to zero out the interface
  • Omit grids. Good typography supplies enough guidelines.
  • Tufte then went through 8 fundamental principles which are discussed in his book, Beautiful Evidence
    1. show comparisons - "Compared to what?"
    2. Illustrate causality
    3. show multivariate data. Translation: enrich your data with dimensional attributes
    4. integrate all content. There shouldn't be different modes to view the comprehensive presentation
    5. Document all sources, scales, and any missing data. It enables the credibility of your presentation.
    6. Content counts most of all, over presentation, style, and formatting.
    7. Locate important comparisons in a common space. Use small multiples.
  • The point of information display is to assist thinking.
    • Most design can be placed in its decade because it is based on fashion. This is not necessarily a complete evil.
  • Design is based on human factors.
  • After all, a 2D drawing is just that: 2D. Perspective drawing is something like 2.33 dimensions.
  • Navigation instruction is a 4D presentation:
    • 3 physical dimensions
    • and time (the 4th dimension)
  • Information resolution = the ability to communicate more bits per area unit  and/or per time unit
  • Galileo's telescope was the first increase of information resolution beyond the capabilities of the human eye
  • Since 1610, information resolution has increased 10 million to 100 million-fold
  • Make displays worthy of the human eye/brain system
    • The human eye/brain system was measured to a capacity of 10 megapixels per second per optic nerve
    • Tufte asked, "Why are we looking at these moronic displays?" (PowerPoint)
  • Label directly; don't use legends
  • How do you solve the flat-land problem (i.e. displaying 3-dimensional artifacts on a 2-dimensional surface [screen, paper, iPhone])?
    • Use a model
  • The principles of analytic design come from the principles of analytic thinking
  • Quoting Steve Jobs (?), "Real artists ship"
  • Interface design
    • (Quoting someone, not sure who - Alan Cooper) "No matter how beautiful your interface is, it would be better if there were less of it."
  • Forming your summary:
    • State what the problem is
    • State who cares
    • State the solution
  • Other key pieces of advice
    • Show up early
    • Finish early (Which Tufte did)

Friday, May 28, 2010

Inexpensive iPad holder - less than $5

I just needed a temporary solution to hold my iPad at my desk. It's worked out pretty well. Thought I'd post the tip:

It's an Office Depot Plate Holder, Clear Item # 544474

Thursday, May 27, 2010

Resetting "Windows XP Mode" to initial settings

Under Windows 7 64-bit ultimate edition, I've had the need to reset my "Windows XP Mode" to initial settings several times over the past few days. Initially I did this because as I used the XP virtual environment, the VHD file grew to around 10 Gig. Not huge, but I wasn't using it for anything except to run a client's 32-bit only VPN software, so it was wasting a fair amount of space. My main machine has a 256 Gig SSD for the main drive, so I am conservative with space consumption.


The second reason was that I was helping my fiancée (a high-school teacher) obtain some (public domain) video for use in her classroom. Unfortunately this video seemed only to be available in a heavily watermarked version on YouTube (ugh) or in a Real Media stream (double-ugh). So I wound up needing to use a mixture of open source and very old software to convert this video to a format that was playable in the classroom. As a rule I don't install software like that on my production machine - a virtual environment is a great "sandbox" for momentary needs like this one. After I converted this video, I no longer needed all these (rather buggy) utilities, so I wanted to start clean.


I didn't see much on the web on how to reset the "Windows XP Mode" to initial settings. I tried this experiment, which worked pretty well.


To reset Windows XP to factory new condition:
  1. Right click the vmcx file. Select Settings. (On my machine the file is called Windows XP Mode.vmcx and is located in C:\Users\Tom\Virtual Machines)
  2. Note the location of Hard Disk 1 (On my machine this location is called C:\Users\Tom\AppData\Local\Microsoft\Windows Virtual PC\Virtual Machines\Windows XP Mode.vhd)
  3. Exit the Settings Dialog
  4. Delete the files associated with your "Windows XP Mode" environment. On my machine, these are located in C:\Users\Tom\AppData\Local\Microsoft\Windows Virtual PC\Virtual Machines and are called:
    1. Windows XP Mode.vhd
    2. Windows XP Mode.vmc
    3. Windows XP Mode.vmc.vpcbackup
    4. Windows XP Mode.vsv
  5. Now, find the original Start Menu shortcut for Windows XP Mode and launch it. As long as you've left the Parent Disk in place, the system will prompt you to create a new environment.

voilà! - New XP environment (after a few other dialogs and a few minutes building)

Standard cautionary notes: 
    • I haven't researched whether this is the "proper" way to do this. But I've done this multiple times on 2 different machines and it takes less than 10 minutes to reset, install anti-virus, and reinstall VPN software
    • Obviously when I suggest you delete files on your system, I am assuming you know the implications of this and have everything backed up or have determined that you no longer need anything from the files you are deleting
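
The deletion in step 4 can also be scripted. What follows is only a minimal sketch, not something from my original experiment: the function names `find_xp_mode_files` and `reset_xp_mode` and the `dry_run` flag are hypothetical, and the folder you pass in is assumed to be the one noted in step 2. It defaults to listing the files so you can check them before anything is removed.

```python
from pathlib import Path

# Suffixes of the per-machine files listed in step 4 of the post.
XP_MODE_SUFFIXES = (".vhd", ".vmc", ".vmc.vpcbackup", ".vsv")


def find_xp_mode_files(vm_dir, base_name="Windows XP Mode"):
    """Return the step-4 files that exist in vm_dir (hypothetical helper).

    Nothing is deleted here; files with other names are ignored.
    """
    folder = Path(vm_dir)
    candidates = [folder / (base_name + suffix) for suffix in XP_MODE_SUFFIXES]
    return [path for path in candidates if path.is_file()]


def reset_xp_mode(vm_dir, dry_run=True):
    """List (dry_run=True) or delete (dry_run=False) the XP Mode files.

    Assumes the Parent Disk lives elsewhere and is left untouched.
    """
    found = find_xp_mode_files(vm_dir)
    for path in found:
        if dry_run:
            print("would delete:", path)
        else:
            path.unlink()
    return found
```

Calling `reset_xp_mode(folder)` with the default `dry_run=True` only prints the matches; pass `dry_run=False` once the list looks right. The same cautions as above apply.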

    Tuesday, May 25, 2010

    Burn the Ships

    Although it's probably an apocryphal story, legend has it that Cortez instructed his men to "Burn the Ships" upon encountering the New World. This has become (in my parlance, anyway) a catchphrase for being irrevocably committed to a particular course of action.

    In that spirit, yesterday I blew away v1 of my website and started up this version. This will remain a work in progress for a while, since other client commitments will have me quite busy through June.

    Anyone using WordPress who has themes or plugins that they love, please leave a comment.

    Wednesday, April 14, 2010

    Time (Not Date) Dimension Table SQL code

    I am seeing more call for Time Dimensions. By that I mean Time-of-Day Dimensions. So now I have to retrain myself to call Date Dimensions "Date" and Time Dimensions "Time".


    As a follow-up to this post, here's some DDL and a quick routine to generate a time-of-day table. The resolution is to the second, which so far has proved sufficient for my clients' purposes.


    Enjoy


    SET nocount ON
    /*
    CREATE TABLE [dbo].[dim_Time](
      [TimeId] [int] NOT NULL,
      [Time] [time](7) NULL,
      [Hours] [tinyint] NULL,
      [Minutes] [tinyint] NULL,
      [Seconds] [tinyint] NULL,
     CONSTRAINT [PK_dim_Time] PRIMARY KEY CLUSTERED
    (
      [TimeId] ASC
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
    ) ON [PRIMARY]

    TRUNCATE TABLE [dbo].[dim_Time]
    GO
    SELECT * FROM dbo.dim_time
    */


    DECLARE @second1 INT = 0
    DECLARE @currtime TIME
    DECLARE @msg VARCHAR(30) = ''


    WHILE @second1 < 86400
      BEGIN
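          -- Compute the time for the current second first, then advance the counter,
          -- so the rows run from 00:00:00 through 23:59:59 in order.
          SELECT @currtime = Dateadd(ss, @second1, CAST('00:00:00' AS TIME))
          SELECT @second1 = @second1 + 1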
          --  SELECT   
          --DatePart(hh, @currtime) * 10000 + 
          --DatePart(mi, @currtime) * 100 +
          --DatePart(ss, @currtime) AS TimeID
          --, @currtime AS [Time]
          --, DatePart(hh, @currtime) AS [Hours]
          --, DatePart(mi, @currtime) AS [Minutes]
          --, DatePart(ss, @currtime) AS [Seconds]
          INSERT dbo.dim_time
                 ([TimeID],
                  [Time],
                  [Hours],
                  [Minutes],
                  [Seconds])
          VALUES (
                  Datepart(hh, @currtime) * 10000 + 

                  Datepart(mi, @currtime) * 100 + 
                  Datepart(ss, @currtime),
                   @currtime,
                   Datepart(hh, @currtime),
                   Datepart(mi, @currtime),
                   Datepart(ss, @currtime) )
          IF @second1 % 1000 = 0
            BEGIN
                SELECT @msg = 'Now Processing ' + CAST(@second1 AS VARCHAR(12))
                PRINT @msg
            END
      END 
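
To sanity-check what the table should contain, the same rows can be sketched outside the database. This is a minimal Python sketch, not part of the routine above; the function name `time_dimension_rows` is hypothetical. It uses the same TimeId scheme as the INSERT (hours * 10000 + minutes * 100 + seconds), so one day yields 86,400 rows with ids from 0 to 235959.

```python
def time_dimension_rows():
    """Yield (time_id, time_text, hours, minutes, seconds) for each second of a day.

    Hypothetical helper mirroring the TimeId formula used in the T-SQL INSERT.
    """
    for second_of_day in range(86400):
        hours, remainder = divmod(second_of_day, 3600)
        minutes, seconds = divmod(remainder, 60)
        time_id = hours * 10000 + minutes * 100 + seconds
        time_text = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
        yield (time_id, time_text, hours, minutes, seconds)


rows = list(time_dimension_rows())
```

Because the id is just the digits of the time, it stays readable (134507 is 13:45:07) and sorts in time order, which is the reason for choosing it over a plain second counter.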

    Sunday, April 4, 2010

    Adventures with Virtualization

    Those of us needing to run BI software in virtual environments are well familiar with the challenges of getting things running smoothly, especially if you are using a laptop as your lab or demo environment. We have a new set of challenges with Windows 7, and with SharePoint 2010.



    Challenge #1 - SharePoint 2010 requires 64 bit OS, and the only MS virtualization software that allows for 64-bit is Hyper-V.

    The only problem with running Hyper-V is that you cannot sleep a machine with Hyper-V, so if you’re using your laptop as a lab machine, you need to retain an OS that will run productivity apps and sleep. I set up my old laptop using the technique described here and it works well. I haven’t repeated the process for my new laptop yet, but I plan to.

    Challenge #2 - Windows 7 Virtual PC requires Hardware Assisted Virtualization

    Apparently this is no longer true, but I wrestled with this one for months. In fact, in early March I returned a machine that didn’t support hardware assisted virtualization because I needed to run Windows 7 VPC. MS just released a version of Virtual PC that does not require hardware assisted virtualization.

    Other tips that have worked well for me:
    • Use a fast disk - I was using a 7200 RPM eSATA external drive, which is faster than the 5400 RPM disk commonly found on laptops. Now I am using an SSD - even better.
    • Store the VHD in a compressed folder. This one struck me as counter-intuitive but anecdotally seems to work.

    Monday, March 22, 2010

    Brilliantly stupid MDX debugging technique

    Whenever I do something really stupid, rather than keep it to myself, I prefer to blog about it for the entire world to see. Even better - I came up with a debugging technique to save myself from my own, er, stupidity.


    The scenario is this: I am developing some SQL Server Analysis Services Date Calculations in MDX for a client. They have three different Date Hierarchies, and wanted the standard Previous Period, Period to Date, etc. calculations. (For a very good explanation of the techniques involved, read this PDF first, then study this refinement from Mosha Pasumansky.)


    The MDX looks like this: 


    Scope([Calculation].[Calculation].[Previous Period]);      
        ([Calendar].[Calendar].[Date]) = ([Calculation].[Current Period], ParallelPeriod([Calendar].[Calendar].[Calendar Year],1));      
        ([Calendar].[Week].[Calendar Week].MEMBERS) = ([Calculation].[Current Period], [Calendar].[Week].Lag(52));      
        ([Calendar].[Calendar].[Calendar Month].MEMBERS) = ([Calculation].[Current Period], [Calendar].[Calendar Month].Lag(12));      
        ([Calendar].[Calendar].[Calendar Quarter].MEMBERS) = ([Calculation].[Current Period], [Calendar].[Calendar Quarter].Lag(4));      
        ([Calendar].[Calendar].[Calendar Year].MEMBERS) = ([Calculation].[Current Period], [Calendar].[Calendar Year].Lag(1));      
        ([Calendar].[NRF].[NRF Week].MEMBERS) = ([Calculation].[Current Period], ParallelPeriod([Calendar].[NRF].[NRF Week],1));
        ([Calendar].[NRF].[NRF Period].MEMBERS) = ([Calculation].[Current Period], [Calendar].[NRF Period].Lag(13));
        ([Calendar].[NRF].[NRF Quarter].MEMBERS) = ([Calculation].[Current Period], [Calendar].[NRF Quarter].Lag(5));
        ([Calendar].[NRF].[NRF Year].MEMBERS) = ([Calculation].[Current Period], ParallelPeriod([Calendar].[NRF].[NRF Year],1));
    End Scope;      

    Pretty standard stuff (sorry about the line wraps).

    The symptom was that my Calendar Date Hierarchy was working fine, but the other two (Week and NRF [National Retail Federation]) were not. I worked my way through the data, and it all looked correct. Even a dynamic MDX query in SSMS worked fine. But it was wrong in the cube.

    When debugging MDX, I find that it pays to go in small increments and to do patently obvious things like setting Scoped statements to constant values, as in: 


    Scope([Calculation].[Calculation].[Previous Period]);      
        ([Calendar].[Calendar].[Date]) = 7777;
    End Scope;  


    I had done this with this MDX code in an attempt to see what was going on, and it showed me that the scoped overwrites were working, but the calculations they were over-writing with were not. Then I got an idea to change how I was debugging: 


    Scope([Calculation].[Calculation].[Previous Period]);      
        ([Calendar].[Calendar].[Date]) = 1;
        ([Calendar].[Week].[Calendar Week].MEMBERS) = 10;
        ([Calendar].[Calendar].[Calendar Month].MEMBERS) = 100;
        ([Calendar].[Calendar].[Calendar Quarter].MEMBERS) = 1000;
        ([Calendar].[Calendar].[Calendar Year].MEMBERS) = 10000;
        ([Calendar].[NRF].[NRF Week].MEMBERS) = .0001;
        ([Calendar].[NRF].[NRF Period].MEMBERS) = .001;
        ([Calendar].[NRF].[NRF Quarter].MEMBERS) = .01;
        ([Calendar].[NRF].[NRF Year].MEMBERS) = .1;
    End Scope;      


    Essentially I am using something like a bit-mask, with one decimal position for each level. I split the two different hierarchies at the decimal point. This allowed me to see which scoped overwrite statement was in effect as I browsed the cube, and it led me to the solution.

    It turns out I had defined my scoped overwrite assignment statements in reverse order (i.e. from the highest level (Year) to the lowest (Date)). Once I switched them so the highest level statement was evaluated last, everything worked as desired. This was a very stupid mistake, and once it was defined as part of the cube MDX script, it was a difficult one to spot. Luckily the debugging technique described above made it pretty obvious what was happening.
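
With more levels, typing the marker constants by hand gets error-prone, so here is a minimal sketch of generating them. It is Python, it only builds text, and nothing in it talks to Analysis Services. The helper names (`marker_values`, `render_scope`, `level_for`) are hypothetical; the only thing taken from the technique above is the idea of giving each level its own power of ten, whole numbers for one hierarchy and fractions for the other.

```python
from decimal import Decimal


def marker_values(whole_levels, fractional_levels):
    """Map each level expression to a distinct power of ten (hypothetical helper).

    whole_levels get 1, 10, 100, ...; fractional_levels get 0.1, 0.01, ...
    Decimal is used so markers such as 0.0001 compare exactly.
    """
    markers = {}
    for position, level in enumerate(whole_levels):
        markers[level] = Decimal(10) ** position
    for position, level in enumerate(fractional_levels, start=1):
        markers[level] = Decimal(10) ** -position
    return markers


def render_scope(scope_member, markers):
    """Render the debugging Scope block as MDX text; nothing is executed."""
    lines = [f"Scope({scope_member});"]
    for level, value in markers.items():
        lines.append(f"    ({level}) = {format(value, 'f')};")
    lines.append("End Scope;")
    return "\n".join(lines)


def level_for(value, markers):
    """Return the level whose marker equals a value seen in the cube, else None."""
    seen = Decimal(str(value))
    for level, marker in markers.items():
        if marker == seen:
            return level
    return None
```

Passing the calendar levels from Date up to Year as `whole_levels`, and the NRF levels from Year down to Week as `fractional_levels`, reproduces the constants in the block above. `level_for("0.01", markers)` then names the statement behind a value of .01 seen while browsing.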
