Connectograms and Circos Visualization Tool

Yesterday (May 16) the Public Library of Science (PLoS) published a fascinating article titled “Mapping Connectivity Damage in the Case of Phineas Gage”. It analyzes the brain damage which the famous trauma victim sustained when an accident drove a steel rod through his skull. Railroad worker Phineas Gage survived the accident and lived for another 12 years, albeit with significant behavioral changes and anomalies. Those changes were severe enough that he had to give up his work and became estranged from his friends, who stated he was “no longer Gage”. His case has become one of the most studied examples of the impact of brain damage on behavior. Since the accident happened more than 150 years ago, there are no autopsy data or brain scans of Gage’s brain. So how did the scientists reconstruct the likely damage?

For a few years now there has been growing interest in the human connectome. Just as the genome is a map of human genes, the connectome is a map of the connectivity of the human brain. The human brain is enormously complex: common estimates put the number of neurons at roughly a hundred billion and the synaptic interconnections in the hundreds of trillions. Using diffusion-weighted imaging (DWI) and magnetic resonance imaging (MRI), one can identify detailed neuronal connectivity. This is such a challenging endeavor that it drives the development of many new technologies, including data visualization. The image resolution and post-processing power of modern instruments are now sufficient to create detailed connectomes that show the major pathways of neuronal fibers within the human brain.

The authors, from the Laboratory of Neuro Imaging (LONI) in the Neurology Department at UCLA, studied the connectomes of a population of N = 110 healthy young males (similar in age and handedness to Phineas Gage at the time of his accident). From this they constructed a typical healthy connectome and visualized it as follows:

Circular representation of cortical anatomy of normal males (Source: PLoS ONE)

Details of the graphic are explained in the PLoS article. The outermost ring shows the various brain regions by lobe (fr – frontal, ins – insula, etc.). The left (right) half of the connectogram represents the left (right) hemisphere of the brain, and the brain stem sits at the bottom of the graph, at the 6 o’clock position.

Connectograms are circular representations introduced by LONI researchers in their NeuroImage article “Circular representation of human cortical networks for subject and population-level connectomic visualization“:

This article introduces an innovative framework for the depiction of human connectomics by employing a circular visualization method which is highly suitable to the exploration of central nervous system architecture. This type of representation, which we name a ‘connectogram’, has the capability of classifying neuroconnectivity relationships intuitively and elegantly.

Back to Phineas Gage: His skull has been preserved and is on display at a museum. Through sophisticated spatial and neurobiological reasoning the researchers reconstructed the pathway of the steel rod and thus the damaging effects on white matter structure.

Phineas Gage Skull with reconstructed steel rod pathway and damage (Source: PLoS ONE)

By overlaying this spatial model of the damaged brain on the typical connectogram of the healthy population, they created another connectogram indicating the connections between brain regions that were lost or damaged in the accident.

Mean connectivity affected in Phineas Gage by the accident damage (Source: PLoS ONE)

From the article:

The lines in this connectogram graphic represent the connections between brain regions that were lost or damaged by the passage of the tamping iron. Fiber pathway damage extended beyond the left frontal cortex to regions of the left temporal, parietal, and occipital cortices as well as to basal ganglia, brain stem, and cerebellum. Inter-hemispheric connections of the frontal and limbic lobes as well as basal ganglia were also affected. Connections in grayscale indicate those pathways that were completely lost in the presence of the tamping iron, while those in shades of tan indicate those partially severed. Pathway transparency indicates the relative density of the affected pathway. In contrast to the morphometric measurements depicted in Fig. 2, the inner four rings of the connectogram here indicate (from the outside inward) the regional network metrics of betweenness centrality, regional eccentricity, local efficiency, clustering coefficient, and the percent of GM loss, respectively, in the presence of the tamping iron, in each instance averaged over the N = 110 subjects.

The point of the above quote is not neuroscientific precision. Experts can interpret these images and advance our understanding of how the brain works – I’m certainly not an expert in this field, not even close. The point is to show how advances in imaging and data visualization technologies enable interdisciplinary research that just a decade ago would have been impossible to conduct. There is also an artistic quality to these images, which reinforces the notion that data visualization is both art and science.

The tool used for these visualizations is called Circos. It was originally developed for genome and cancer research by Martin Krzywinski at the Genome Sciences Centre in Vancouver, Canada. Circos can be used for circular visualizations of any tabular data, and the above connectome visualization is a great application. Martin’s website is very interesting in terms of both visualization tools and projects. I have already started using Circos – which is available both for download and in an online tableviewer version – for some visualization experiments which I may blog about in the future.
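Circos itself is driven by configuration and plain tabular input files, and the online tableviewer makes quick experiments easy. To get a rough feel for the underlying idea – entities placed around a circle, with links between them weighted by connection strength – here is a minimal matplotlib stand-in in Python. The region names and weights are made up for illustration; this is a sketch of the concept, not the Circos or LONI pipeline:

```python
import math
import matplotlib.pyplot as plt

# Made-up brain regions and connection weights (illustration only).
regions = ["frontal-L", "parietal-L", "occipital-L",
           "frontal-R", "parietal-R", "occipital-R"]
links = [("frontal-L", "frontal-R", 5), ("frontal-L", "parietal-L", 3),
         ("parietal-R", "occipital-R", 2), ("occipital-L", "occipital-R", 4)]

# Place each region at an equal angle on the unit circle.
angle = {r: 2 * math.pi * i / len(regions) for i, r in enumerate(regions)}
pos = {r: (math.cos(a), math.sin(a)) for r, a in angle.items()}

fig, ax = plt.subplots(figsize=(5, 5))
for a, b, w in links:           # chord width encodes connection strength
    (x1, y1), (x2, y2) = pos[a], pos[b]
    ax.plot([x1, x2], [y1, y2], linewidth=w, alpha=0.5)
for r, (x, y) in pos.items():   # label the regions on the ring
    ax.annotate(r, (x, y), ha="center", va="center")
ax.set_aspect("equal")
ax.axis("off")
plt.show()
```

Circos adds the curved ribbons, segment scaling, and the many concentric annotation rings that make the real connectograms so information-dense.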

 

Faceplant with Facebook?

With the Facebook IPO coming up this Friday there is a lot of attention around its business model and financials. I’m not an expert in this area, but my hunch is that a lot of people will lose a lot of money by chasing after Facebook shares. Why?

I think there are two types of answers: one from reasoning and one from intuition.

For reasoning, one needs to look at a more technical assessment of the business model and financials. Some have written extensively about the comparative lack of innovation in Facebook’s business model and core product. Some have compared Facebook’s advertising performance to Google’s – estimates put Google’s ad performance at 100x that of Facebook. Some have pointed out that many of Facebook’s core metrics, such as visits per person, pages per visit, or click-through rates, have been declining for two years, and go as far as calling this the Facebook ad scam. One can question the wisdom of the Instagram acquisition – buying a company with 12 employees and zero revenue for $1B. One can question the notion that the 28-year-old founder will hold 57% of the voting rights of the public company. One can look at stories about companies discontinuing their Facebook ad efforts, such as the Forbes article about GM pulling a $10m account because they found it ineffective. The list goes on.

Here is a more positive-leaning infographic from an article looking at “Facebook: Business Model, Hardware Patents and IPO”:

Analysis Infographic of pre-IPO Facebook (source: Gina Smith, anewdomain.net)

Valuing a startup at 100x last year’s income seems extremely high – but then Amazon’s valuation is in similarly lofty territory. As for reasoning one’s way to a prediction of the financial success of Facebook’s IPO: people can cite numbers to justify their beliefs either way. At the end of the day it’s unpredictable, and nobody can know for sure.

The other answer to why I am not buying into the hype is more intuitive and comes from my personal experience. Here is a little thought experiment about how valuable a company is to your personal life: imagine for a moment that the company, with all its products and services, disappeared overnight. How much of an impact would that have on you as an individual? If I think about companies like Apple, Google, Microsoft, or Amazon, the impact for me would be huge. I use their products and services every day. Think about it:

No Apple = no iPhone, no iPad, no iTunes music on the iPod or via AppleTV on our home stereo. That would be a dramatic setback.

No Google = no Google search, no GMail, no YouTube, no Google maps, no Google Earth. Again, very significant impact for me personally. Not to mention the exciting research at Google in very different areas such as self-driving vehicles.

No Facebook = no problem (at least for me). I deactivated my own Facebook account months ago simply because it cost me a lot of time and I got very little value out of it. In fact, I got annoyed with compulsively looking at updates from mere acquaintances about mundane details of their lives. Why would I care? I finally got around to actually deleting my account, although Facebook makes that somewhat cumbersome (which probably inflates the account numbers somewhat).

I’m not saying Facebook isn’t valuable to some people. Having nearly 1B user accounts is very impressive. Hosting by far the largest photo collection on the planet is extraordinary. Facebook exploded because it satisfied our basic need for sharing, just like Google did with search, Amazon with shopping, or eBay with selling. But the barrier to entry in sharing is low (see LinkedIn, Twitter, or Pinterest), and Facebook doesn’t seem particularly well positioned for mobile.

I strongly suspect that Facebook’s valuation is both inflated from the start – the $50-per-account estimate from early social networks doesn’t scale up with the demographics of the massive user base – and lately hyped up by greedy investors who sense an opportunity to make a quick buck. My hunch is that FB will trade below its IPO price within the first year, possibly well below. But then again, I have been surprised before…

I’m not buying the hype. What am I missing? Let me know what you think!

UPDATE 8/16/2012: Well, here we are after one quarter, and Facebook’s stock valuation hasn’t done so well. Look at the chart of FB’s first 3 months:

First 3 months of Facebook stock price (Screenshot of StockTouch on iPad)

What started as a $100b market valuation is now at $43b. One has to hand it to Mark Zuckerberg: he really extracted maximum value out of those shares. It turns out sitting on the sidelines was the right move for investors in this case.

 

Sankey Diagrams

Whenever you want to show the flow of a quantity (such as energy or money) through a network of nodes you can use Sankey diagrams:

“A Sankey diagram is a directional flow chart where the width of the streams is proportional to the quantity of flow, and where the flows can be combined, split and traced through a series of events or stages.”
(source: CHEMICAL ENGINEERING Blog)

One area where this can be applied very well is that of costing. By modeling the flow of cost through a company one can analyze the aggregated cost and thus determine the profitability of individual products, customers or channels. Using the principles of activity-based costing one can create a cost-assignment network linking cost pools or accounts (as tracked in the General Ledger) via the employees and their activities to the products and customers. Such a Cost Flow can then be visualized using a Sankey diagram:

Cost Flow from Accounts via Expenses and Activities to Products

The direction of flow (here from left to right) is indicated by the color assignment from nodes to their outflowing streams. Note also the intuitive notion of zero-loss assignment: for each node, the sum of the inflowing streams equals the sum of the outflowing streams (both equal the height of that node). Hence all the cost is accounted for; nothing is lost. If you stacked the nodes of each stage on top of one another, every stage would rise to the same total height. (Random data, for illustration purposes only.)

The above diagram was created in Mathematica using modified source code originally from Sam Calisch who had posted it in 2011 here. Sam also included a “SankeyNotes.pdf” document explaining the details of the algorithms encoded in the source, such as how to arrange the node lists and how to draw the streams.

I find this a perfect example of how a manual drawing can go a long way toward illustrating the ideas behind an algorithm, which makes the source code much easier to understand and reuse. Thanks to Sam for this code and documentation. Sam, by the way, used the code to illustrate the efficiency of energy use (vs. waste) in Australia:

Energy Flow comparison between New South Wales and Australia (Sam Calisch)

Note the sub-flows within each stream to compare a part (New South Wales) against the whole (Australia).
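If you want to experiment with the basic mechanics yourself without Mathematica, matplotlib ships a simple Sankey class. Here is a minimal sketch with made-up cost-flow numbers for a single “Activities” node; note how the flows must sum to zero, mirroring the zero-loss assignment described above:

```python
import matplotlib.pyplot as plt
from matplotlib.sankey import Sankey

fig, ax = plt.subplots()
sankey = Sankey(ax=ax, unit="$k", scale=0.01)  # scale flows down for drawing
# Inflows are positive, outflows negative; they sum to zero ("zero loss"),
# otherwise matplotlib warns about the imbalance.
sankey.add(patchlabel="Activities",
           flows=[60, 40, -55, -45],
           labels=["Account A", "Account B", "Product X", "Product Y"],
           orientations=[0, 0, 0, 0])  # all flows enter and exit horizontally
sankey.finish()
ax.set_title("Zero-loss cost flow: inflows equal outflows")
plt.show()
```

Multi-stage diagrams like the cost-flow example above chain several such nodes together (via the `prior` and `connect` arguments of `add`), though for elaborate layouts dedicated code like Sam’s gives far more control.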

Another interesting use of Sankey diagrams was published a few weeks ago on ProPublica, showing the flow of campaign finance. It is particularly useful because it is interactive (click on the image to get to the interactive version).

Tangled Web of Campaign Finance Flow

Note the campaigns in green and the Super PACs in brown. The data is sourced from the FEC and the New York Times Campaign Finance API. In the interactive version you can click on any source on the left or any destination on the right to see its outgoing and incoming streams.

Finance Flow From Obama-For-America

Finance Flow to American Express

Here are some more examples. Sankey diagrams are also used in Google Analytics’ flow reports (Event Flow, Goal Flow, Visitors Flow). I wouldn’t be surprised to see Sankey diagrams make their way into modern data visualization tools such as Tableau or QlikView, perhaps even into Excel some day… Here are some Visio shapes and links to other resources.

 

Quarterly Comparison: Apple, Microsoft, Google, Amazon

Last quarter we looked at the financials and underlying product & service portfolios of four of the biggest technology companies in the post “Side by Side: Apple, Microsoft, Google, Amazon“. With the recent reporting of Q1 2012 results, it is a good time to revisit this subject.

Comparison of Financials Q4 2011 and Q1 2012 for Apple, Microsoft, Google, and Amazon.

Market cap grew by roughly 25% for both Apple and Amazon, whereas Microsoft and Google added 5% or less. A sequential quarter comparison can be misleading due to seasonal effects, which impact different industries and business models in different ways. For example, Google’s ad revenue is somewhat less affected by seasonal shopping than the other companies’ businesses.

Sequential quarter comparison of financials

Apple and Microsoft seem to be affected by seasonal changes in a similar way. For Amazon, which already has by far the lowest margin of the four companies, operating income decreased by 40% while headcount increased by 17%. This leads to much lower income per employee and, together with the increased stock price, to a doubling of its already very high P/E ratio. I’m not a stock market analyst, but Amazon’s P/E ratio of now nearly 200 seems extraordinarily high. By comparison, the other companies look downright cheap: Apple 8.8, Microsoft 10.5, Google 14.5.

Horace Dediu of asymco.com has also revisited this topic in his post “Which is best: hardware, software, or services?“. What’s striking is that the other three companies – all except Amazon – now have operating margins between 30% and 40%, very high for such large businesses, with Apple at the top near 40%. Over the last 5 years Apple has doubled its margin (from 20% to 40%), whereas Microsoft (35–40%) and Google (30–35%) have remained near their levels.

(Source: Asymco.com)

Long term, the most important aspect of a business is not how big it has become but how profitable it is. In that regard Amazon is the odd one out. Its operating income last quarter was about 1% of revenue: Amazon needs to move $100 worth of goods to earn $1. It employs 65,000 people and had revenue of $13.2b last quarter, yet earned only $130m in that time! Apple earns more money just with its iPad covers! Amazon’s strategy is to subsidize the initial Kindle Fire sale, hoping to make money on additional purchases over the lifetime of the product. In light of these numbers, do you think Amazon has a future with its Kindle Fire tablet against the iPad?
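A quick back-of-the-envelope check of those numbers (figures as quoted above):

```python
# Amazon's quarter, as cited in the post (approximate figures).
revenue = 13.2e9           # $13.2b quarterly revenue
operating_income = 130e6   # $130m quarterly operating income
employees = 65_000

print(f"operating margin:    {operating_income / revenue:.1%}")              # ~1.0%
print(f"income per employee: ${operating_income / employees:,.0f}/quarter")  # ~$2,000
```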

But what really struck me about the extreme differences in profitability is this comparison of Apple and Microsoft product lines (source: @asymco twitter):

(source: @asymco twitter)

This shows what an impressive and sustained success the iPhone has been. And the iPad is on track to grow even faster. Horace Dediu guesses that Apple’s iPad will be a bigger profit generator than Windows in one quarter, and a bigger profit generator than Google (yes, all of Google) in three quarters. We will check on those predictions when the time comes…

 

Tube Maps

I just got back from a combined business and vacation trip to Germany and Austria around Easter. In Europe, public transportation is an important part of the infrastructure. Especially in the big cities, many people commute daily by train or subway; some even live without a car.

One of the most important pieces of information for train and subway systems is the tube map. It is a schematic transit map showing the lines, stations, and connections of the train or subway system. Its main feature is that it abstracts away geographic detail (where things actually are) and focuses on topology: which line do I need to transfer to, and where, to get to a particular station?

London Tube Map (Source: Wikipedia)

The Wikipedia tube map article details the origins of this map type around the London subway system, which was called the Tube (hence the name for this type of map), dating back to the first schematic maps drawn in 1931 by Harry Beck:

“Beck was a London Underground employee who realised that because the railway ran mostly underground, the physical locations of the stations were irrelevant to the traveller wanting to know how to get to one station from another — only the topology of the railway mattered.”

This style of map has been widely adopted and successively refined. Having grown up in Munich and used its train (S-Bahn) and subway (U-Bahn) system for some 25 years, I came to realize that it is not only a convenient tool for the traveller: it can form the basis of a mental model of a city’s topology. The first lines of the Munich S- and U-Bahn system were built for the Olympic Games in 1972. The history and evolution of the train and subway system over the 40 years since has been documented on this website. Let’s look at the tube maps and their evolution in roughly 10-year intervals.

Munich Tube Map 1971

1971: Note the basic shape of a central West–East trunk shared by all S-Bahn lines, which then fan out radially into the suburbs. The 45° angles help with the text labels and add simplicity to the layout. This simplicity is one key element that allows such tube maps to become mental models of the city’s topology, i.e. of knowing what is where and how to get to it. Note that initially there were only two U-Bahn lines, sharing most of their underground tracks.

Munich tube map 1980

1980: The design of the map evolves to “stretch” the line graph, both to fill the entire available rectangular space and to free up more room in the center, where two additional U-Bahn lines require space – also because U-Bahn stations lie closer together than the S-Bahn stations in the periphery. The label “P+R” is introduced to designate Park & Ride facilities at stations for commuters.

Munich tube map 1992

1992: Some additional U-Bahn lines and stations fill in the center. One of the S-Bahn lines is renamed (S3 → S8) and extended to the north to connect to the new Munich airport (near Erding). There are also a few minor map changes (new color scheme, font, and legend).

Munich tube map 2001

2001: The S1 now also reaches the new airport, which simplifies travel from the western part of the city and effectively creates a northern loop. The map changes in the top section to reflect this new topology, which graphically compresses the U-Bahn system in the upper half. A new color (blue) for the stations marks the inner zone; together with the new label “XXL” it represents tariff boundaries. (A similar approach with blue font for inner-zone station names was dropped after a brief appearance in 1997; it looked confusing.)

Munich tube map 2012

2012: The current map adds several graphical elements, such as concentric rings of background color for tariff boundaries, a new font for a cleaner look and fewer line breaks, and icons for long-distance train connections. It also shows some geographic features, such as the Isar river and the two lakes in the south-west, as well as icons for tourist attractions and landmarks such as the new soccer stadium, the ‘Deutsches Museum’, or the Zoo. For a hi-res map see this PDF file.

Such a sequence shows the evolution of schematic concepts and visual representations over the decades. When you take away some of the simplifying tube-map abstractions, such as the 45° angles, you get topographical maps like this:

Topographical map of Munich U-Bahn 2010

While such a map gives you a more precise idea of where you are at any given station in the city, it is much harder to remember and to reconstruct in your head. I believe it is this simplicity-by-design that makes the modern tube map such a strong candidate for forming the basis of a mental model of city topology.

Here is an interesting variation on the Munich transit system: a so-called isochrone map, which uses colors to display transit times, say from the center to other destinations in the city. Robin Clarke created the following map and describes in this post how he did it.

Munich transit system Isochrone Map (Source: Robin Clarke)

A final example of using tube maps in an interactive graphic comes from Tom Carden. He created an applet that lets you click on any of the 200 London subway stations and get an isochronic map showing transit times from that origin to every other station. While not laid out as cleanly as the Beck-style tube maps, this interactive graphic represents 200 different maps in one! (Click on the image to get to the interactive version.)

Interactive London Tube Map (Source: Tom Carden)
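Under the hood, an isochronic map boils down to a single-source shortest-path computation: from the chosen origin, compute the travel time to every station, then color the stations by time band. Here is a minimal sketch using Dijkstra’s algorithm on a toy graph; the station pairs and travel times are hypothetical, and this is not Carden’s actual code:

```python
import heapq

# Toy transit graph: adjacent stations with travel times in minutes
# (hypothetical numbers for illustration only).
graph = {
    "Hauptbahnhof": [("Marienplatz", 2), ("Donnersbergerbruecke", 3)],
    "Marienplatz": [("Hauptbahnhof", 2), ("Ostbahnhof", 4)],
    "Donnersbergerbruecke": [("Hauptbahnhof", 3)],
    "Ostbahnhof": [("Marienplatz", 4)],
}

def travel_times(origin):
    """Dijkstra: minutes from origin to every reachable station."""
    dist = {origin: 0}
    queue = [(0, origin)]
    while queue:
        d, station = heapq.heappop(queue)
        if d > dist.get(station, float("inf")):
            continue  # stale queue entry
        for neighbor, minutes in graph.get(station, []):
            nd = d + minutes
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist

# Bucket stations into 5-minute isochrone bands for coloring.
bands = {}
for station, t in travel_times("Hauptbahnhof").items():
    bands.setdefault(t // 5 * 5, []).append(station)
print(bands)  # {0: ['Hauptbahnhof', 'Marienplatz', 'Donnersbergerbruecke'], 5: ['Ostbahnhof']}
```

Carden’s applet effectively precomputes this for each of the 200 stations, which is why it packs 200 maps into one.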

See also the more recent blog post London Tube Map for additional examples of graph visualizations using the London Underground as the object of illustration.

As travellers arriving in an unknown city, we often take such subway infrastructure and its documentation for granted. What amazes me is the amount of cumulative work – planning, design, construction, logistics, etc. – that has gone into building such an infrastructure. A few interesting facts about the Munich U-Bahn (subway) system: 6 lines, 100 stations, 103 km, ~1 million passengers per day (source: Wikipedia). Building a subway costs on the order of $100 million per km, so this represents an investment of about $10 billion! Think about that the next time you try to find your way through a new city…

 

Khan Academy and Interactive Content in Digital Education

Online education has received a lot of attention lately. Many factors have contributed to the rise of online educational content, including higher bandwidth, free video hosting (YouTube), mobile devices, growing global audiences, improved customization mechanisms (scoring, similarity-based recommendations), gamification (earning badges, friendly competitions, etc.), and others. Interactivity is an important ingredient in any form of learning.

“I tell you and you forget. I show you and you remember. I involve you and you understand.” [Confucius, 500 BC]

While learning, a student forms a mental model of the concepts. Understanding a concept means having a model detailed enough to answer questions, solve problems, and predict a system’s behavior. The power of interactive graphics and models comes from the student’s ability to “ask questions” by modifying parameters and to receive specific answers that help refine or correct the evolving mental model.

Digital solutions are bringing innovations to many of these areas. One of the most innovative approaches is the Khan Academy. What started just a few years ago as an experiment in recording short, narrated video lessons and sharing them via YouTube with family and friends has grown into a broad-based approach to revolutionize learning. Over the years, founder Sal Khan has developed a collection of more than 3,000 such videos. Backed by prominent endorsers such as Bill Gates, the not-for-profit Khan Academy has developed a web-based infrastructure that can handle large numbers of users and collect and display valuable statistics for students and teachers. The Khan Academy has received plenty of media attention as well, with coverage on CBS’ 60 Minutes, a TED talk, and more. The videos have by now been viewed more than 130 million times!

Another high-profile experiment was launched in the fall of 2011 at Stanford University, where three computer science courses were made available online for free, including the introductory course on Artificial Intelligence by Sebastian Thrun and Peter Norvig. In a physical classroom a professor can teach several dozen to at most a few hundred students; in a virtual classroom these limits are obviously far higher. Exceeding all expectations, some 160,000 students in 190 countries signed up for this first course!

The basic pillar of online learning continues to be the recorded video of a course unit. Students can watch the video whenever and wherever they like, learning at their own pace and on their own schedule. One can pause, rewind, and replay as often as needed to better understand the content. Of course, if that were the only way to interact, it would be fairly rudimentary. Unlike in a real classroom or with a personal tutor, one can’t ask the teacher in the video a question and receive an answer. One can’t try out variations of a model and see their impact.

Sample Khan Academy Profile Graph

That’s where the tests come in. Testing the understanding of a concept usually involves a series of sample questions or problems that can only be solved repeatedly and reliably with such an understanding. Both Khan Academy and the Stanford AI course have test examples, exams, and grading mechanisms to determine whether a student has likely understood a concept. In the Khan Academy, testable concepts revolve around mathematics, where an unlimited number of specific problem instances can be generated for test purposes. The answers to test questions are recorded and can be plotted.

Khan Academy Knowledge Map of testable concepts

The latter form of interactivity may be among the most useful. The system records how often you take tests, how long you take to answer, how often you get the answers right, and so on. All of this can then be plotted in some sort of dashboard – both for yourself as an individual student and for an entire class if you are a coach. This shows at a glance where you or your students are struggling and how far along they have progressed.

Concepts are related to one another in a taxonomy, so one gets guidance on which concepts to master first before building higher-level concepts on top of the simpler ones. Statistical models can suggest the most plausible next steps based on prior observations.
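That “master the prerequisites first” guidance is essentially a topological ordering of the concept graph. A minimal sketch with a hypothetical prerequisite map (not the Khan Academy’s actual data or algorithms):

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical prerequisite DAG: each concept maps to the set of
# concepts that should be mastered first.
prerequisites = {
    "addition": set(),
    "subtraction": {"addition"},
    "multiplication": {"addition"},
    "division": {"multiplication", "subtraction"},
    "fractions": {"division"},
}

# static_order() yields a study sequence that never reaches a concept
# before all of its prerequisites.
print(list(TopologicalSorter(prerequisites).static_order()))
# e.g. ['addition', 'subtraction', 'multiplication', 'division', 'fractions']
```

On top of such an ordering, the recorded test statistics can weight which of the currently “unlocked” concepts is the most plausible next suggestion.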

Founder Sal Khan deserves a lot of respect for having almost single-handedly recorded some 3,000+ video lessons and for changing the world of online education so much for the better with his not-for-profit organization. From an interactive-content perspective, imagine if at the end of some Khan video lessons you could download the underlying model, play with its parameters, and maybe even extend the model definition. This may not be feasible in every taught domain, but many areas seem ripe for such additional interactivity. We’ll look at one in the next post.

 

Interactive Documents – Roambi Flow

One year ago I purchased my own iPad 2. When I used it in meetings, it quickly became apparent how much potential there is to make presented information much more interactive. I posted last June about Interactive and Visual Information. In the meantime, more and more software aims at making documents more interactive, especially on the iPad, to leverage mobility and touch.

In this post we will look at Roambi Flow, a product that lets you compose documents with interactive elements. Roambi is a set of business intelligence products by the San Diego-based company MeLLmo, designed from the ground up to take advantage of iOS features such as rich graphics and the touch interface. On Roambi’s product website you will find detailed descriptions of each of these products.

Roambi Analytics Views

Roambi Analytics introduced a series of so-called Views. Each of these views is interesting in its own right and warrants more in-depth coverage; I’ll just enumerate them briefly.

Blink gives you cube analytics displaying various measures in selected dimensions, swiping and scrolling through a data set.

Cardex is a visual metaphor for organizing sets of elementary reports and visually comparing them side-by-side like a mini comparison dashboard.

CataList lets you browse top-level lists and drill into a detailed view with sliders to see data points over time and display highlighted information.

Elements allows you to compose dashboards of connected, basic chart elements to explore multi-dimensional data.

Layers specializes in the display and navigation of hierarchically grouped data sets – such as continent, country, city – through scroll, pinch, and zoom gestures.

PieView is a variation on the pie chart theme. Its main innovation is to allow rotating the entire pie chart, similar to the original Apple iPod click wheel. (It doesn’t eliminate the shortcomings of pie charts per se, but it makes them a little easier to live with and a lot more fun to explore.)

Squares uses the heat map concept in a very intuitive way to display data organized along two main axes – such as the global sales performance of various products in various countries. Dragging along rows or columns highlights them one at a time; tapping on a row or column “explodes” its content into a matrix with more detail, in which one can again navigate, sort, etc. Tap & hold on the heat map generates a fish-eye view with the tapped element maximized for more detail, and moving while holding pans the fish-eye to other areas of interest (see image below).

SuperList is a generic view for lists with numeric information that lets you sort, filter, toggle between bars and numbers, etc. Think of it as a starting point for tabular data display on the iPad.

Fish-Eye view in Squares, one of the Roambi Analytics views

Each view has a Help-style description with a short one-minute video overview. This goes to show that seeing these views in narrated action is much more intuitive and easier to understand than just reading about them – it literally leverages “show & tell”. The best way to explore these views is to download the free Roambi Viewer app on the iPad and play with it. It comes with stored sample data sets, so you can visually explore the views even while offline. Roambi also features brief videos and tutorials on their website.

But back to Roambi Flow: you want your data to tell stories. This is best done through a combination of text explaining the context, perhaps some multimedia demonstrating the highlights, and some interactive elements allowing readers to explore visually on their own. This is where Roambi Flow comes in. It’s a publishing container that allows you to embed the above views (and other multimedia content) into regular text documents. The reader navigates the content at the top level like a traditional book, either by clicking on the table of contents or by literally flipping through the pages. The app even simulates the page turning we are used to from Apple’s iBooks.

Page transition in Roambi Flow; Note the embedded, interactive element on the next page.

The individual elements can be double-tapped, which expands them to full screen and enables their full visual exploration capabilities. The views can be linked to backend data sources to automatically stay in sync with up-to-date information. View displays can be bookmarked and shared with others. But the main point is that the reader does not merely see a static image: one can interact with and manipulate the views to obtain a richer understanding of the underlying data sets.

Roambi Flow page with two interactive view elements.

Given the rapid adoption of iPads in corporate environments, it is easy to see such interactive documents spreading both within companies and in their external communications. Imagine reading an annual report, a sales pitch, or a research paper where you can interact with the financials, the offered product, or the proposed scientific model! With interactive content, reading will never be the same.

 

Mobile Business Intelligence Market Study

Dresner Advisory Services publishes an annual study on Mobile Business Intelligence vendors, most recently in October 2011. It focuses on the mobile capabilities of BI platform vendors, similar to those in the Gartner Magic Quadrant for Business Intelligence we looked at recently.

The ~50-page document has a good executive summary and provides insight from industry surveys, including changes between 2010 and 2011. In terms of data visualization, however, it generally does a poor job of conveying the study’s findings. There is an abundance of pie charts and stacked bar charts with often very confusing color codes. For example, consider this chart on BI vendors’ mobile platform priorities:

Mobile BI Vendor Platform Priority (source: DAS)

Rank information shouldn’t be conveyed by color; vertical position works much better. Here it is very hard to see which platforms gained or lost in the ranking. A data visualization should first and foremost make it easy to spot patterns and thus provide insight – not every dataset makes for a good Excel bar chart.
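To illustrate the alternative, a bump chart encodes rank by vertical position and traces each platform from year to year, making gains and losses visible at a glance. A quick matplotlib sketch with made-up rank data (not the study’s actual figures):

```python
import matplotlib.pyplot as plt

# Hypothetical ranks (1 = highest priority) across two survey years;
# illustration only, not the Dresner study's data.
ranks = {"iPad": [2, 1], "iPhone": [1, 2], "Android": [3, 3]}
years = [2010, 2011]

fig, ax = plt.subplots()
for platform, r in ranks.items():
    ax.plot(years, r, marker="o")
    ax.annotate(platform, (years[-1], r[-1]), xytext=(5, 0),
                textcoords="offset points", va="center")

ax.invert_yaxis()  # rank 1 belongs at the top
ax.set_xticks(years)
ax.set_yticks([1, 2, 3])
ax.set_ylabel("rank (1 = top priority)")
ax.set_title("Bump chart: rank encoded by vertical position")
plt.show()
```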

All that said, I found one very useful chart, which shows all vendors’ Mobile BI capabilities at a glance:

Mobile BI Vendor Scores (source: DAS)

Regarding the vendor scoring, from the study:

Using the data that was provided by twenty-four different BI vendors, we constructed a model which scores them based on mobile platform support, platform integration and numbers of supported BI features (Figure 33).

Please carefully review the detailed vendor and product profiles on pages 47 – 52 and to consider both dimensions (i.e., platform and features) independent of each other.

It should be noted that this model reflects only two dimensions of a BI vendor’s product capability and is not intended to indicate “market leadership” only a convergence of capabilities for Mobile BI. Readers are encouraged to use other tools to understand the many other dimensions of vendor capability, such as our own Wisdom of Crowds Business Intelligence Market Study ®.

The full report can be downloaded from the Yellowfin website here.

 

Probabilistic Project Management at NASA with Joint Confidence Level (JCL-PC)

On the Strategic Project and Portfolio Management blog by Simon Moore one can find many fascinating stories about project failures, as well as a related collection of project management case studies. One entry there links to a project management method NASA has mandated internally since 2009 to estimate the costs and schedules of its various aerospace projects. The method is called Joint Confidence Level – Probabilistic Calculator (JCL-PC). It is a sophisticated method that uses historical data and insights from estimation psychology (such as optimism bias) to arrive at corrective multipliers for project estimates, based on project completion percentage and a required confidence level. It also uses Monte Carlo simulations to determine outcomes, leading to scatterplots of the simulated project runs on a cost-vs.-schedule plane. From there one can determine, say at a 70% confidence level, what the cost and schedule overruns will likely be.

If you’re already familiar with the method, or if you are very good at abstract thinking, the above paragraph will have meant something to you. If it didn’t, bear with me. In this post I make a brief attempt to explain what I understood about the method, using the data visualizations from two sources (a 100+ page report and a 12-page FAQ). The report is fascinating on many levels, as it deals with the history of high-profile project overruns (Apollo program, Space Shuttle, Space Station) and the pervasive culture of underestimation (optimism bias), driven by not accounting for project risks that are unknown but historically evident.

JCL starts with historical observations of similar projects with regard to cost and schedule overruns. For example, the above-cited report contains best-fit histogram distributions for robotic missions.

Overrun Distributions of Cost and Schedule for Robotic Missions (Source: NASA)

The idea is to use a set of such distributions for probabilistic estimates of cost and schedule. The set of distributions must account for the fact that in the early stages of a project there are more unknowns and thus a higher risk of overruns. From the report:

The JCL-PC estimating method is based on the hypotheses that in the beginning phases of a project there are many unknown risks – and over time the project will have a high probability of exceeding estimated costs and scheduled duration. … Work as it was initially planned will inevitably change. Quantifiable risks become clearer and NASA’s S-Curves will tend to lay down as the work goes forward. Keep in mind that it’s not the project that is becoming inherently riskier. It’s a matter of participants fully identifying the real work that was “out there” all along. Even though the scope of the work wasn’t fully perceived “back when” – progress has continued to identify the risks and quantify the corrective actions. History is written in real time and that history differs to a greater or lesser degree from what was anticipated. The JCL-PC helps us better plan for and manage that difference.

The JCL-PC method strikes a needed balance between subjectivity and anticipated risk variability, leaving only one remaining probability influence factor to deal with – namely, assigning the percentage complete of the subject project. This % complete factor includes both subjective and objective elements.

One of the key elements is the notion of a multiplier, which implements this reduced uncertainty over time as well as a so-called optimism corrector and other project risk, in line with historical aerospace project overruns. The multiplier is plotted below as a function of the project % complete parameter for different confidence levels:

Multiplier as function of project % complete for various confidence levels

The concept is illustrated via two charts of a fictitious $1m project (applied here to cost overruns, but equally applicable to schedule overruns): the first shows a point estimate and its S-curves (confidence bands) per project % complete.

The second shows the S-Curves after applying “the optimism corrector and some minor project risk, through a more typical project life cycle with project scope creep … As the project evolves the S-Curve moves slightly to the right and becomes more and more vertical.”

It would be great to have an interactive graphic where the S-curves are plotted in response to sliding the project % complete between 0% and 100%. The report lists the above multipliers in a numerical table spanning project % complete (in 1% increments) and four confidence levels (50%, 60%, 70%, 80%). Rather than copying the entire table, I filtered it down to 10% increments of project % complete. This table tells NASA officials, at various confidence levels, how much money they will have to spend on a $1m project as a function of project % complete:

Cost Estimate Table with project % complete and confidence levels

The data point highlighted in yellow is described as follows:

When the project is 50% complete, you’ll notice that a 50% confidence level suggests that the project can be completed for the anticipated $1,000,000. However, if we adhere to the NASA standard of a 70% confidence level, we see that another $400,000+ will likely be needed to complete the project. No matter how well a project is managed, it rarely compensates for ultra-optimistic budget estimates that sooner or later return with a vengeance and overcome the most skillful leaders.

As a final illustration, the FAQ document includes this scatterplot as JCL-PC output:

Scatter Plot of Monte Carlo simulation with JCL-PC

A frontier curve represents all combinations of cost and schedule that yield a given JCL percentage. The plot shows the frontier curve for a 70% JCL in yellow. The green dots are simulated runs with outcomes below the selected cost and schedule (blue cross-hairs, yellow labels). White dots have either a cost or a schedule overrun; red dots have both.
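To make the joint-confidence idea concrete, here is a toy Monte Carlo sketch – my own illustration with assumed lognormal overrun distributions, not NASA’s actual model. It shows why estimates that each carry 50% confidence individually yield only about 25% confidence jointly:

```python
import random

random.seed(42)
N = 10_000

# Assumed right-skewed cost (in $m) and schedule (in months) outcomes
# for a project with a $1.0m / 24-month point estimate. The two draws
# are independent here; in reality cost and schedule are correlated.
runs = [(1.0 * random.lognormvariate(0.0, 0.3),
         24.0 * random.lognormvariate(0.0, 0.2)) for _ in range(N)]

def jcl(cost_cap, schedule_cap):
    """Fraction of runs finishing under BOTH caps (the 'green dots')."""
    hits = sum(1 for c, s in runs if c <= cost_cap and s <= schedule_cap)
    return hits / N

print(jcl(1.0, 24.0))  # point estimate: ~0.25 joint confidence, not 0.5
print(jcl(1.4, 28.0))  # padded estimate: close to the 70% JCL standard
```

Sweeping over a grid of (cost_cap, schedule_cap) pairs and keeping those where jcl() equals 0.70 traces out the yellow frontier curve of the scatterplot above.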

The report makes bold claims about the potential of JCL-PC, but also about the challenges inherent in attempting to change an entire management culture. I am not qualified to comment on these claims, but my impression is that such probabilistic project management methods will raise the bar in the field and should lead to more accurate estimates.

The more I think about such abstract concepts, the more I’m convinced that mental models are inherently visual. We remember a few key visualizations or charts and anchor our understanding of a concept around those visual images. We also use them to communicate or teach concepts to each other – hence the value of the whiteboard or even the napkin drawing. The increasing computational ability to produce such visual images, and ideally even interactive graphics, is therefore an important element of academic and scientific endeavors.

 

Gartner’s Magic Quadrant for Business Intelligence

Note: See also the more recent update on the Magic Quadrant for Business Intelligence 2013.

The Gartner Group publishes an annual report called the Magic Quadrant for Business Intelligence. It compares various vendors along two dimensions: Ability to Execute and Completeness of Vision. These two dimensions span four quadrants (leaders, challengers, visionaries, niche players).

The key graphic in the Gartner reports is the so-called Magic Quadrant diagram. Here is the 2012 version (click the image to see the full report):

Magic Quadrant of BI 2012 (Source: Gartner)

Similar charts have been published for 2011, 2010, 2009, and 2008 (source: Google Image Search).

From these snapshots in time one can create a time series and compare the relative movement of vendors. Here is an interactive version of such a chart, created with Tableau Public (click on the chart below to interact):

Interactive BI_MagicQuadrant 2008-2012

Disclaimer: There are at least two caveats here. One is the limited quality of the data; the other is the limited applicability of this type of visualization.

Quality: I contacted two of the authors at Gartner and asked for the (x, y) coordinates underlying those Magic Quadrants. However, Gartner’s policy is not to disclose these data, so I screen-scraped the coordinates off the publicly available images. This brings with it the limited accuracy of measuring positions from images and the possibility of (my) clerical error in entering the data into a spreadsheet.
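For the curious, the screen-scraping step boils down to mapping pixel positions measured on the chart image into the quadrant’s normalized score space. A small sketch; the frame corner pixels below are hypothetical values one would measure off a given year’s image:

```python
# Pixel coordinates of the chart frame, measured in an image editor
# (hypothetical values; they differ for each year's image).
FRAME = {"left": 62, "right": 548, "top": 40, "bottom": 526}

def to_scores(px, py):
    """Map an image pixel (px, py) to (vision, execution) in [0, 1].

    The y axis is flipped because image rows grow downward while
    'ability to execute' grows upward.
    """
    x = (px - FRAME["left"]) / (FRAME["right"] - FRAME["left"])
    y = (FRAME["bottom"] - py) / (FRAME["bottom"] - FRAME["top"])
    return round(x, 3), round(y, 3)

# A vendor dot measured at pixel (300, 180):
print(to_scores(300, 180))  # -> (0.49, 0.712)
```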

Applicability: The contacted authors (James Richardson and John Hagerty) both emphasized that, due to subtle changes in how the dimension scores are calculated each year, such sequential comparisons are not supported by Gartner. In other words, the data may suggest misleading or unintended conclusions.

Discussion: Of course the original Gartner reports provide a tremendous amount of detail, both on the methodology (which factors contribute to the Vision and Execution scores) and on the various vendors, their products, and other relevant business aspects such as sales channels. One also needs to bear in mind that some of these companies emerge or disappear over time.

That said, the interactive time-series chart has many advantages over the individual snapshots:

  • You can select a subset of companies (for example all public companies)
  • Companies are identified by label and color
  • History can be traced for consecutive years
  • Trends are more easily detected (see also Disclaimer above)

For example, smaller but rapidly growing companies like Tibco (Spotfire) and Tableau have somewhat vertical trajectories, leading them into the “challengers” quadrant with strong increases in the ability to execute. Tibco and QlikTech are the only two (of 24) companies to change quadrants in the last 5 years, from visionary to challenger (Tibco) and to leader (QlikTech), respectively.

MQ trajectory for Tableau, Tibco, and QlikTech

Some big public companies like IBM, SAP, and Microsoft have invested heavily in the BI space over the last few years. This has resulted in more horizontal trajectories within the leaders quadrant, as they have increased the completeness of their vision – among other things through acquisitions of smaller companies (SAP bought Business Objects, IBM bought Cognos).

MQ trajectory for IBM, SAP, and Microsoft

Some individual trajectories are more dynamic than others. For example, MicroStrategy had strong increases first in vision (2008–2009) and then in ability to execute (2010–2012). By contrast, Actuate fell behind the others in both execution and vision over the first three years, only to stop (2011) and reverse (2012) that trend.

MQ trajectory for Actuate and MicroStrategy

Bottom line: Data presented via interactive charts invites exploration, discovery, and better understanding. Through Tableau Public these charts can easily be shared with others. The Magic Quadrant data is originally curated and presented by Gartner in the traditional moment-in-time snapshot format. IMHO, in this interactive time-series format the data comes to life and yields additional insight. I’d be interested to hear your thoughts and comments on the authors’ caveats about the limited applicability of the time-series animation.

 