LectureBlog: Ben Shneiderman - Information Visualisation For Knowledge Discovery
Ben Shneiderman - HCIL University of Maryland
www.cs.umd.edu/hcil ben@cs.umd.edu @benbendc
11th June 2012, Future Interaction Lab, Swansea University.
Brief History
Ben has worked on design ideas, input devices, output devices, social media, help tutorials, teaching, search and visualisations.
Pride in serving 5 Billion users with a diverse multitude of apps and interfaces.
Successfully affected the development of a wide variety of interfaces across a huge range of platforms.
Info Vis
Visual bandwidth is huge, and the human perception of it is remarkable
Trends, clusters and outliers are easy to spot; humans are very good at recognising patterns.
A lot of big businesses are buying visualisation companies.
E.g. Spotfire: a pioneer of software that ran real-time queries, selected data and applied multiple filters over millions of data items
Used in big pharma, to identify the role of Retinol in embryos and vision.
Over time, found multiple 2D displays better than fewer 3D ones - coordinated multiple displays are highly useful, all 2D
100M pixels and more, spatially stable displays, arranged in meaningful relationships via proximity
E.g. corporate headquarters and NASA control rooms
Smaller screens such as tablets and mobile phones have become increasingly popular too.
Information Visualisation Mantra
“Overview, zoom & filter, details on demand”
Show user everything first, no matter how complex and messy
Allow user to zoom and/or filter data
Then allow user to query details on demand
Written up in a paper that then got 2k cites and attracted lots of interest & discussion
Tried to represent a human way of navigating data, similar to the way we navigate and interact with the world around us.
Info Vis: Data Types
SciViz; 1D Linear, 2D Map, 3D World
InfoViz; Multi-Var, Temporal, Tree, Network
Multi-variate, high-dimensional visualisations can be difficult to create, display and use in infovis
flowingdata.com, infovis.org, infoasthetics.com, infovis.net all have some great and not so great examples of visualisations
Why Visualise?
Anscombe’s Quartet:
4 sets of data, each with 11 rows of x and y points. Hard to see any patterns when the data is just in tabular format.
Takes a while to identify data trends, points of interest.
Very easy to see when plotted on simple charts.
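A minimal sketch of why the tables are so unhelpful, using the published Anscombe values: the usual summary statistics come out practically identical for all four sets, so only a plot reveals how different they are. The helper names (mean, pearson, summarise) are illustrative, not from the talk.

```python
from math import sqrt

# Anscombe's quartet as published (Anscombe, 1973).
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarise(quartet):
    """Return {set name: (mean x, mean y, correlation)}, rounded to 2 d.p."""
    return {
        name: (round(mean(xs), 2), round(mean(ys), 2), round(pearson(xs, ys), 2))
        for name, (xs, ys) in quartet.items()
    }


for name, stats in summarise(QUARTET).items():
    print(name, stats)  # every set prints (9.0, 7.5, 0.82)
```

Plotting the same four sets as scatter charts shows a linear trend, a curve, a line with one outlier and a vertical stack with one outlier, none of which the numbers above hint at.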
Also very hard to see errors in large data sets
A hospital thought its average age statistics were off, so it visualised the data.
Only then did they notice multiple patients had an age recorded as 999 years old!
Also found other monthly data series that were missing April’s data.
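Once a plot has exposed errors like these, they can be turned into routine checks. A small sketch, assuming ages arrive as a plain list and months as numbers 1-12; the 120-year cut-off, the function names and the sample values are all invented for illustration.

```python
MAX_PLAUSIBLE_AGE = 120  # assumed cut-off, not a figure from the talk


def implausible_ages(ages, max_age=MAX_PLAUSIBLE_AGE):
    """Return (row index, age) pairs whose age falls outside 0..max_age."""
    return [(i, age) for i, age in enumerate(ages) if not 0 <= age <= max_age]


def missing_months(months_present):
    """Return the month numbers (1-12) that never appear in the series."""
    return sorted(set(range(1, 13)) - set(months_present))


ages = [34, 71, 999, 58, 999, 12]  # invented sample with placeholder ages
print(implausible_ages(ages))  # [(2, 999), (4, 999)]

months = [1, 2, 3, 5, 6, 7, 8, 9, 10, 11, 12]  # invented series with no April
print(missing_months(months))  # [4]
```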
Some Examples and Previous Projects: Multi Var and Temporal visualisations
Timesearcher
V1.3 Designed for time series data, used for stocks, weather and genes
Users specified patterns, supported rapid search
Design goal: 200 periods, 5000 stocks, 100ms updates required
k-d trees, quadtrees and grid files break down beyond 6-8 dimensions
Instead built around a rapid linear search
Uses the above mantra - allows users to see an initial overview and immediately identify points of interest (POI).
V2.0 allowed for 10,000 points, multi-var data
Allowed controlled precision of match, tightness of fit (linear, offset, noise, match)
V3.0 includes forecasting etc.
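A rough sketch of the linear search idea, under the assumption that a user-specified pattern is a set of boxes, each meaning "between periods start and end, the value stays within low..high". The box representation, function names and stock data are illustrative assumptions, not TimeSearcher's actual code.

```python
def matches_box(series, start, end, low, high):
    """True if every value in periods start..end (inclusive) lies in [low, high]."""
    return all(low <= value <= high for value in series[start:end + 1])


def linear_search(all_series, boxes):
    """Scan every series once and keep the names that satisfy all boxes."""
    return [
        name
        for name, series in all_series.items()
        if all(matches_box(series, *box) for box in boxes)
    ]


# Invented data: three "stocks" over five periods.
stocks = {
    "AAA": [10, 12, 15, 14, 9],
    "BBB": [10, 11, 11, 12, 13],
    "CCC": [20, 18, 15, 11, 12],
}

# Pattern: starts between 9 and 13, ends between 11 and 14.
boxes = [(0, 1, 9, 13), (3, 4, 11, 14)]
print(linear_search(stocks, boxes))  # ['BBB']
```

The scan touches each value at most once per box, so the cost grows linearly with the number of series and periods and does not depend on an index that degrades as dimensions are added.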
Lifelines: Patient Histories
Historical medical data, visual overview of issues, updates and an idea of magnitude of each event
Lifelines 2: Contrast + Creatinine
Large amount of patient data and histories- millions of people over 20 years.
Allows identification of generally slow or hard to spot patterns.
Designed around ARF: Align, Rank and Filter
Ability to align data by certain events, rank it and create sequence filters.
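The three ARF operations can be sketched on toy records, assuming each patient history is a list of (day, event) pairs. The record format, function names and sample patients are assumptions for illustration, not the Lifelines 2 data model.

```python
def align(record, event):
    """Shift times so the first occurrence of `event` is day 0 (None if absent)."""
    anchor = next((day for day, name in record if name == event), None)
    if anchor is None:
        return None
    return [(day - anchor, name) for day, name in record]


def rank(records, event):
    """Order patient ids by how often `event` occurs, most frequent first."""
    def occurrences(pid):
        return sum(name == event for _, name in records[pid])
    return sorted(records, key=occurrences, reverse=True)


def has_sequence(record, sequence):
    """True if the events in `sequence` occur in that order (gaps allowed)."""
    events = iter(name for _, name in sorted(record))
    return all(step in events for step in sequence)


# Invented patient histories.
records = {
    "p1": [(0, "admission"), (3, "heart attack"), (9, "discharge")],
    "p2": [(0, "admission"), (2, "heart attack"), (5, "heart attack"),
           (7, "heart attack"), (12, "discharge")],
    "p3": [(0, "admission"), (4, "discharge")],
}

print(align(records["p1"], "heart attack"))
# [(-3, 'admission'), (0, 'heart attack'), (6, 'discharge')]
print(rank(records, "heart attack"))  # ['p2', 'p1', 'p3']
print([pid for pid in records
       if has_sequence(records[pid], ["heart attack", "discharge"])])  # ['p1', 'p2']
```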
Lifeflow: Aggregation Strategy.
Temporal categorisation of data > data in lifelines 2 format > tree of event sequences > lifeflow aggregation
Visualisation of sequences of hospital visits - where did people go, and what happened to them
Allows identification of “bounce backs”- Patients arrive, are treated in ICU, sent to ward and then bounced back to ICU - means staff missed something.
Can align by any event - enables identification of patients who went to ward before ICU etc
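A minimal sketch of the aggregation step, assuming each patient visit has already been reduced to an ordered list of locations. Sequences that share a prefix are merged into one tree node with a count, and a separate check looks for the ICU, ward, ICU bounce-back pattern. The nested-dict tree, function names and sample visits are illustrative assumptions.

```python
def build_sequence_tree(sequences):
    """Merge event sequences into a prefix tree that counts patients per node."""
    root = {"count": 0, "children": {}}
    for sequence in sequences:
        node = root
        node["count"] += 1
        for event in sequence:
            node = node["children"].setdefault(event, {"count": 0, "children": {}})
            node["count"] += 1
    return root


def has_bounce_back(sequence, pattern=("ICU", "Ward", "ICU")):
    """True if `pattern` appears as consecutive events in the sequence."""
    n = len(pattern)
    return any(tuple(sequence[i:i + n]) == pattern
               for i in range(len(sequence) - n + 1))


# Invented visit sequences.
visits = [
    ["Arrival", "ICU", "Ward", "Discharge"],
    ["Arrival", "ICU", "Ward", "ICU", "Ward", "Discharge"],
    ["Arrival", "Ward", "Discharge"],
    ["Arrival", "ICU", "Ward", "Discharge"],
]

tree = build_sequence_tree(visits)
arrival = tree["children"]["Arrival"]
print(arrival["count"])                     # 4 patients arrived
print(arrival["children"]["ICU"]["count"])  # 3 went to ICU first
print([i for i, seq in enumerate(visits) if has_bounce_back(seq)])  # [1]
```

Drawing each tree node as a bar whose height is its count gives the aggregated overview; the per-patient detail stays available underneath.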
Treemap: Gene Ontology
Space filling, space limited, colour coding, size coding, but requires learning.
Practical example: www.smartmoney.com/marketmap
Provides a spatially stable map, enables identification of differences over time, variance etc.
An example of visualisations giving you answers to questions you didn’t know you had.
newsmap.jp - example of Google news treemap
hivegroup.com - example of logistics and supply-chain treemap
Spotfire.com bond portfolio analysis, NY Times, Guardian all used treemaps in the past.
Voronoi treemaps in NY Times piece on inflation.
Also used treemaps for vis of hard drives across multiple computers in an organisation,
shows wasted trash space, directories mirrored across loads of machines.
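To make "space filling" and "size coding" concrete, here is a sketch of a slice-and-dice layout, the simplest treemap scheme: each level splits its rectangle among its children in proportion to size, alternating between side-by-side and stacked splits. The tools named above may well use other layouts (squarified, Voronoi); the dict-based tree and the tiny disk example are assumptions for illustration.

```python
def size(node):
    """Total size of a node: its own size for leaves, else the sum of children."""
    children = node.get("children")
    if children:
        return sum(size(child) for child in children)
    return node["size"]


def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Return (name, x, y, width, height) for every leaf under `node`."""
    if out is None:
        out = []
    children = node.get("children")
    if not children:
        out.append((node["name"], x, y, w, h))
        return out
    total = size(node)
    offset = 0.0  # fraction of the rectangle already handed out
    for child in children:
        share = size(child) / total
        if depth % 2 == 0:  # even levels: children side by side
            slice_and_dice(child, x + offset * w, y, share * w, h, depth + 1, out)
        else:               # odd levels: children stacked vertically
            slice_and_dice(child, x, y + offset * h, w, share * h, depth + 1, out)
        offset += share
    return out


# Invented directory tree, sizes in arbitrary units.
disk = {"name": "root", "children": [
    {"name": "docs", "size": 2},
    {"name": "media", "children": [
        {"name": "photos", "size": 3},
        {"name": "video", "size": 3},
    ]},
]}

rects = slice_and_dice(disk, 0, 0, 8, 4)
for rect in rects:
    print(rect)
```

Every leaf's area is proportional to its size and the leaves together fill the 8 x 4 rectangle exactly, which is what makes wasted space on a disk stand out at a glance.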
Network analysis
Visualcomplexity.com (not visual simplicity! Some great, some bad examples of network visualisations)
Discovery Process: Social Action
Network linking US Senators who tend to vote similarly to each other.
Filtering weak links shows distinct clustering between two communities - Republicans and Democrats
Shows strong, weak and middle positions within parties.
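A small sketch of the filtering effect, assuming each link carries a voting-agreement weight between 0 and 1. Connected components stand in here for whatever clustering the real tool applies, and the senators, weights and 0.6 threshold are invented.

```python
def filter_weak_links(edges, threshold):
    """Keep only the (a, b, weight) links whose weight reaches the threshold."""
    return [(a, b, w) for a, b, w in edges if w >= threshold]


def communities(nodes, edges):
    """Group nodes into connected components using a depth-first search."""
    neighbours = {node: set() for node in nodes}
    for a, b, _ in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)
    seen, groups = set(), []
    for start in nodes:
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            group.add(node)
            stack.extend(neighbours[node] - seen)
        groups.append(group)
    return groups


# Invented senators and agreement weights.
senators = ["R1", "R2", "R3", "D1", "D2", "D3"]
links = [
    ("R1", "R2", 0.90), ("R2", "R3", 0.85),
    ("D1", "D2", 0.92), ("D2", "D3", 0.80),
    ("R3", "D1", 0.45), ("R1", "D3", 0.30),  # weak cross-party links
]

print(len(communities(senators, links)))  # 1: everything is connected
strong = filter_weak_links(links, 0.6)
print(communities(senators, strong))      # two groups, the R's and the D's
```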
NodeXL
codeplex.com/nodexl
Network overview tool for discovery and exploration in Excel.
Also shows senate voting pattern discussed above
Allows lots of social network data to be used by people with limited programming skills
“Group in a box” layout: treemaps with node links
Innovation clusters: people, locations, companies.
11k nodes, 26k links
Visualising clusters for the term “mouse” shows the animal, the input device and Mickey Mouse
Summary of projects.
Check out the book Analyzing Social Media Networks with NodeXL
All work has tried to affect the world, not just restricted to minor unconnected problems
Tried to focus on UN Millennium Development Goals - a worthy task
Q: WRT clustering algos - How can I prove that this is a sensible network?
A: NodeXL uses 3 different clustering algos; current fashions push for multiple community membership.
Metrics for clusterings are important. Most clustering works on network connections, but should work on node values too - this is starting to happen
Goal of vis is insight not pretty pictures, learning better how to do it. Vis integrated with stats is the way to go, produces clues on next stats methods to use.
NodeXL's goal was to allow access to stats tools without needing to program.
Q: Have you done any work to identify and explore poor quality data sets (medical as example)?
Favourite example: Medical data showed patient admitted to hospital 14 times, but discharged twice.
Being admitted is a billing event, discharge isn’t and gets less attention.
Outliers are an obvious indication of POI, whether genuine or errors.
There is an overall commercial need for cleanup of data
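The admission/discharge example suggests a simple consistency check. A sketch, assuming the data is a flat list of (patient id, event type) pairs; the event labels, function name and sample data are invented for illustration.

```python
from collections import Counter


def admission_discharge_mismatches(events):
    """Return {patient: (admissions, discharges)} where the two counts differ."""
    admits, discharges = Counter(), Counter()
    for patient, kind in events:
        if kind == "admit":
            admits[patient] += 1
        elif kind == "discharge":
            discharges[patient] += 1
    patients = sorted(set(admits) | set(discharges))
    return {
        patient: (admits[patient], discharges[patient])
        for patient in patients
        if admits[patient] != discharges[patient]
    }


# Invented records mirroring the example: 14 admissions, 2 discharges.
events = (
    [("p1", "admit")] * 14
    + [("p1", "discharge")] * 2
    + [("p2", "admit"), ("p2", "discharge")]
)
print(admission_discharge_mismatches(events))  # {'p1': (14, 2)}
```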
Sub Q: If you don’t know the truth for events, if there are more than one, how do you choose?
Use context: e.g. align medical data by heart attacks and rank by number of attacks.
Someone had 6 attacks, but this was actually 2 attacks that had each been reported 3 times by different people.
Make public aware of how poor medical data is!
Q: What would you say are the low hanging fruit in UN Millennium Development Goals?
Approach to research has shifted dramatically. Previously, promoting empirical controlled studies was the heart of HCI.
Increasingly aware these don’t pay off for questions of insight.
These require much longer term programs.
Some projects require weeks & months to get used to vis tools: 2-4 weeks with a domain expert, 2-4 weeks more on their own, then time to solve the problem - much longer term than standard small user studies.
An answer is to find agencies and commercial entities, ask them which small problems they are having while solving bigger problems, and help solve them.
Believe in breaking the typical separations of research: work should be both basic and applied, be mission driven and curiosity driven.
Getting companies to work with you is hard, 3-4 months to get to talk to people. But resulting conversations, networking and projects are worth it.
Also the case that tools built for one problem can be equally suited to a number of other problems.