
Tuesday, November 26, 2013

This Is Not a Data Visualization.


Data 'visualization' is different from data 'illustration'. An easy way to think about this is to remember that we must see things before we can explain or communicate them. The distinction matters: using either one in the wrong context can disrupt communication, comprehension and learning.

Data Illustration

Tables, graphs and charts are critical visual tools for illustrating patterns of data. For example, we use them to show trending or correlation. Many helpful authorities (a few are listed in the 'labels' section of this blog) provide rules and pedantic guidelines for formatting and displaying these visual tools. These are data 'illustrations', because they are manufactured post-discovery of the trends or correlations. They can then be used: 
  • carefully, for advancing theories; 
  • transparently, with citations and qualifications for journalism or story-telling;
  • sometimes manipulatively, in a sales or propaganda context.  

Data Visualizations

Data visualizations, on the other hand, are visual displays of information that generate discovery and greater perspective. They can be subjective in their message, meaning the reader may be free to draw their own conclusion. Or they may even be subjective in their context, meaning the viewer may simply appreciate the beauty of the shapes, patterns, and animation that the visualizations provide.

However, their context is fragile. Visualizations have to be treated like laboratories or theater, where we structure the experience and carefully select the audience. Viewers leave the laboratory or theater having formed their perspectives and opinions, but the original performance is not syndicated to other audiences.  The viewers must then shape their own arguments, rhetoric, and illustrations.  

A Data Illustration + A Discovery Context = Confusion

Last week I attended two completely different events. The first was a conference for a higher education program, and the second was a museum hackathon.   

At the higher education conference, I joined a team that had been working on a study of graduate hiring trends for the higher education program that was sponsoring the conference. In order to explain our work, we brought the requisite 'poster' for the poster session. Our audience: professors and educators whose livelihood depends upon theory, experimentation, and conclusion. 

'What did you find?' was the opening question from anyone who walked up to read the poster. I pointed to this chart:




This was a somewhat flummoxing moment for many of the viewers. The chart was in a data illustration format - a bar graph. But it wasn't supporting a theory or our findings. It was instead a discovery tool in a static format, meant to speak to as broad an audience as possible.

The people who came up to the poster had come prepared for our theory, experiment, and conclusion, and had sharpened their critical thinking in advance. That's their job.  

If you break the rules, you need to be ready to handle the consequences. We were standing there to establish the context, and to direct people to the section of the graph that represented their teaching discipline, where they could then see the mix of hiring company types for their discipline. Once we added that scaffolding, the chart became the jumping-off point for the conversation we'd wanted to have with people: where are graduates in that discipline now working?

The conference attendees were able to walk away with a sense of the hiring mix in their discipline, and how it compared to the overall hiring mix. For example, we found that math graduates were far more likely than health graduates to be hired by a large company. This was an interesting discovery for people from both disciplines, and we worked with them to think about how we could further explore this story.

But without our personal presence and our guidance through the chart, this illustration tool, placed in a visualization and discovery context, might have quickly broken down and failed.

A Data Visualization + A Discovery Context = Excitement
  
In the same week, I also co-hosted a museum hackathon. The hackathon explored the depiction of smiles in art. We searched the museum's collection for smiles, and we found many across different periods and media.

This got us interested: what kinds of patterns might we see?  What would we discover if we aggregated and explored some of the information about the pieces of work that depicted smiles?  

After aggregating the smiles using some basic crowdsourcing protocols, Adam Vigiano (Transmogrifier) quickly put a viewer together in Tableau to organize and explore the smiles by age and type of artistic media.  


Now we had a really exciting portal into seeing things about the collection that we would never have seen before. The viewer gave us a sense of the scale and proportion of the number of smiles in the museum's collections. Furthermore, we were able to dive down into the collection to see each and every piece, if we so chose.

Nowhere in this visualization were we putting forward a theory or expressing a point of view. We could subsequently form hypotheses such as 'hardly anyone was smiling during the dark ages (1000-1400 AD)'. And we could then generate a bar chart (or, blasphemy, a pie chart!) to show that observation, relative to the population. Of course, that would be an absurd conclusion. But the point here is that data visualization gives us perspectives that stimulate exploration, hypotheses, and scientific experimentation. And it was fun.

The Map Is Not The Territory 

Alfred Korzybski's celebrated quote "the map is not the territory" neatly reminds us that representations are not the objects themselves. And Magritte reminded us that 'this is not a pipe'. Visualization is a way of organizing one's perspective of an abstraction, where the abstraction is due to a dimension or scale that is beyond our grasp. We are too small to see the whole territory. We are too in the moment to see the progress. We need this help.

But visualizations are not the data. The data is not the sum of the experience. We've been inappropriately using data visualizations as the basis for statements and conclusions. We're leaving out rigorous statistical analysis, and appropriate qualifiers such as confidence intervals. It's exciting that we've become more and more a society of pattern-seekers. But it's important that we don't become lazy and cavalier with what we do with those observations. We also have to remember our audience, and how the audience puts into context what they see. 

2 comments:

  1. Thanks a lot for sharing this with all folks you really recognise what you are talking about! In this complex environment business need to present there company data in meaningful way.Sqiar (http://www.sqiar.com/solutions/technology/tableau/) which is in UK,provide services like Tableau and Data Warehousing etc .In these services sqiar experts convert company data into meaningful way.

  2. An insightful post . . . http://clioviz.wordpress.com/2014/07/06/data-illustration-vs-data-visualization/
