Monthly Archives: November 2011

World Cartogram of Mobile Phone Adoption

Under the slogan “Our Changing World”, FedEx has developed a website with various cartograms showing world-wide socio-economic changes based on publicly available data from sources such as World Bank, UNESCO, World Health Organization and others.

Cartograms visualize a particular metric by adjusting each country’s size to correspond to that metric. They leave country neighborhood relationships (which we blogged about here) intact, but inflate or deflate countries, often dramatically so. Here is a series of three cartograms showing the adoption of mobile phones in the years 1995, 2000, and 2008. The size of each country is proportional to the density of mobile phones (average number of mobile phones per 100 people).

Mobile Phone Density 1995

Mobile Phone Density 2000

Mobile Phone Density 2008
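To make the sizing rule concrete, here is a minimal sketch of my own (not FedEx's method). It assumes that "size" means drawn area, so a country's outline has to be stretched by the square root of the metric ratio in each direction. Real cartogram algorithms additionally deform shapes so that neighbors stay attached; that part is not shown. The function name, the country labels and the reference value are hypothetical.

```python
import math

def linear_scale_factor(metric_value, reference_value):
    """Factor by which to stretch an outline in both directions so that
    its drawn AREA changes in proportion to metric_value / reference_value."""
    if metric_value < 0 or reference_value <= 0:
        raise ValueError("metric must be non-negative and reference positive")
    return math.sqrt(metric_value / reference_value)

# Hypothetical densities (mobile phones per 100 people), for illustration only.
densities = {"A": 3.0, "B": 48.0, "C": 108.0}
reference = 48.0  # assumed density at which a country keeps its true size

for name, density in densities.items():
    print(name, round(linear_scale_factor(density, reference), 2))
```

A country with four times the reference density is therefore drawn only twice as wide and twice as tall.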

From the Topic Info on the Mobile Phone Presence display:

In 1996, mobile phones were a Nordic phenomenon. A Swede was twice as likely as an American to own one, and five times as likely as a German. Skip forward four years and the picture changed radically. Mobile phone usage boomed ten-fold across Europe; most European nations caught up with their northern neighbours. Eight years later, Africa suddenly loomed large. Mobile-phone penetration in some emerging economies now outstrips that of the developed world; Algeria tops the US. In most countries, mobile phone use is now ubiquitous. Lacking a mobile phone is more striking today than possessing one.

Indeed, it’s hard to find a country with a very small mobile phone presence – and then to pinpoint it on the cartogram. One country I found was Cuba: while most countries in the Americas have between 50 and 100 mobile phones per 100 people, Cuba has only 3.

A few months ago Nathan Yau covered this topic on his FlowingData Blog here. As he already suggested, there are many more data to explore on FedEx’s website, so check it out for yourself here.

 
Posted by on November 20, 2011 in Industrial, Scientific, Socioeconomic

 

The Observatory of Economic Complexity

In this second part we will look at the online interactive visualizations as a companion to the first part’s Atlas of Economic Complexity. It’s interesting that the authors chose the title “Observatory”, as if to convey that with a good (perhaps optical) instrument you can reveal otherwise hidden structure. To repeat one of the fundamental tenets of this Blog: Interactive graphics allow the user to explore data sets and thus to develop a better understanding of the structure and potentially create otherwise inaccessible insights. This is a good example.

The two basic dimensions for exploration of trade data are products and countries. The most recent world trade data is from 2009, and the series reaches back between 20 and 50 years (varying by country). I worked with three types of charts: TreeMaps, Stacked Area Charts, and the Product Space network diagram. Let’s start with Germany’s Exports in 2009:

Hovering the cursor over a node highlights its details, here “Printing Presses”, a product type where Germany enjoys a high degree of Revealed Comparative Advantage (RCA). (For details on RCA or any other aspects of the product space concept and network diagram, please see the previous post on the Atlas of Economic Complexity.) We can now explore which other countries are exporting printing presses:

While Germany clearly dominates this world market with 55% at $2.7b in 2009 with RCA = 5.6, the time slider at the bottom (with data since 1975) reveals that it has actually held an even bigger lead for most of the last 35 years. For example, with its exports in Printing Presses Germany commanded 72% at $3.7b in 2001 with RCA = 6.3. From the timeline one can also see how the United States captured about 20% of this (then much smaller) market for a brief period between 1979 and 1983. During this time its RCA for Printing Presses was just a bit above 1.0 – which shows as a black square in the Product Space – but the United States has since lost this advantage and not seen any significant exports in this product type. Printing Presses being a fairly complex product, only a handful of countries are exporting them, almost all of them European, plus Japan. There might be an interesting correlation between complexity and inequality, as the capabilities for the production of complex products tend to cluster in a few countries worldwide which then dominate world exports accordingly.
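For readers who want to see what an RCA figure like the ones above measures, here is a small sketch. It assumes the standard Balassa definition: the product's share in a country's exports divided by the product's share in world exports. The function name, the data layout and the toy numbers are my own and are not taken from the Observatory.

```python
def revealed_comparative_advantage(exports, country, product):
    """Balassa-style RCA.

    `exports` maps country -> {product -> export value}.
    A result above 1.0 means the country exports more than its
    'fair share' of the product.
    """
    country_total = sum(exports[country].values())
    world_product = sum(p.get(product, 0.0) for p in exports.values())
    world_total = sum(sum(p.values()) for p in exports.values())
    country_share = exports[country].get(product, 0.0) / country_total
    world_share = world_product / world_total
    return country_share / world_share

# Made-up toy numbers, NOT real trade data.
toy_exports = {
    "X": {"presses": 30.0, "grain": 70.0},
    "Y": {"presses": 10.0, "grain": 290.0},
}

for c in toy_exports:
    print(c, round(revealed_comparative_advantage(toy_exports, c, "presses"), 2))
```

In the toy data, presses make up 30% of X's exports but only 10% of world exports, so X has an RCA of about 3 in presses, while Y stays well below 1.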

Another powerful instrument is the Stacked Area Chart. Here you can see how a country’s Imports or Exports evolve over time, either in terms of absolute value or relative share of product types. For example, let’s look at the last 30 years (1978-2008) of Export data for the United States:

This GIF file (click if not animated) shows several frames. In Value display style one can see the absolute size and how Exports grew roughly 10-fold from about $100b to $1t over the course of those 30 years. The Share display style focuses on relative size, with all Exports always representing 100%. In the Observatory one can hover over any product type and thus highlight that color band to see the evolution of this product type’s Exports over time. In the highlighted example here, we can see how ‘Cereal and Vegetable Oil’ (yellow band) shrank from around 15% in the late seventies to around 5% since the late nineties. ‘Chemicals and Health Related Products’ (purple band) has remained more or less constant around a 10% Export share. ‘Electronics’ bloomed in the mid eighties from less than 10% to 15-20% and stayed on the high end of that range until around the year 2000 before shrinking in the last decade down to about 10%.
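The difference between the two display styles is just a normalization step. The following sketch shows it under my own assumptions: the function name and the two-category data are hypothetical, with round numbers that only loosely echo the magnitudes mentioned above.

```python
def to_shares(values_by_year):
    """Convert absolute values per category into percentage shares,
    so that the categories of every year add up to 100."""
    shares = {}
    for year, by_category in values_by_year.items():
        total = sum(by_category.values())
        shares[year] = {
            category: 100.0 * value / total
            for category, value in by_category.items()
        }
    return shares

# Hypothetical export values in $b, for illustration only.
values = {
    1978: {"cereals": 15.0, "other": 85.0},
    2008: {"cereals": 50.0, "other": 950.0},
}

shares = to_shares(values)
print(shares[1978]["cereals"], shares[2008]["cereals"])
```

Note how a category can more than triple in value and still lose two thirds of its share, which is why it pays to look at both styles.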

As a final example, look at the relative size of imports of the United States over the last 40 years (1968 – 2008, sorted by final value):

The biggest category is crude petroleum products at the bottom. During the two oil shocks in the seventies the percentage peaked near 30% of all imports. Then it went down and stayed below 10% between 1985 and 2005. Since then its percentage has been steadily rising and has reached about 15% again. (The data isn’t up-to-date enough to illustrate the impact of the 2008 recession.) Such high expenses are crowding out other categories. When the consumer pays more at the pump there is less to spend on other product types. Another interesting aspect of this last chart is that the bottom two bands represent opposite ends of the product complexity spectrum: Petroleum (brown) on the low end, cars (blue) on the high end.

As always, the real power of interactive visualizations comes from interacting with them. So I encourage you to explore these data at the Observatory of Economic Complexity.

Caveats: I noticed a few minor areas which seem to be either incomplete, counter-intuitive, poor design choices or simply implementation bugs. To start, there is no help or documentation of the visualization tool itself. Many of the diagram types on the left are grayed out and it is not always apparent what selection of products, countries or chart type will enable certain subselections. For example, there is a chart type “Predictive Tools” with two subtypes “Density Bars” and “Stepping Stone” that always seem to be grayed out. The same applies to Maps (presumably geographic maps) – all subtypes are grayed out. Perhaps I am missing something – I would appreciate any comments if that’s the case.

In the TreeMaps for import and export one cannot see the total value of the overall trade (top-level rectangle) or of any of the categories (second-level rectangles). Only the tooltips show the value of a specific product type or country (third-level rectangle). The color legend is designed for the product space and designates the 34 communities of product types. When you hover the mouse over one product type, say garments (in green), then all imports / exports other than that product type are grayed out. When you show a product import / export chart, however, those same colors are used to designate groups of countries, with color indicating continents (blue for Europe, red for the Americas, green for Asia etc.). Yet when you hover over a product icon in the legend (say garments), only the countries drawn in that icon’s color remain highlighted, which doesn’t make sense and can be misleading.

When you play the timeline in a TreeMap, the frequent change in layout can be confusing. A change from one year to the next played back and forth slowly or multiple times can be instructive, but a quick series of too many changes (particularly without seeing the labels) is just confusing.

In the stacked area charts when you click on Build Visualization it always comes up in “Value” style, even if “Share” is selected. To get to the Share style, you have to select Value and then Share again.

TreeMaps and Stacked Area Charts critically depend on the availability of data for all products / countries displayed. For years before 1990 there appear to be pockets of only sparsely available data, which then falsely suggests world market dominance of those products or countries. For example, the TreeMap for Imports in Printing Presses for 1983 shows the United States with 97%, taking practically the entire market. In 1984, its share shrinks to a more balanced 28% even though its imports grew very rapidly, simply because data for other countries from Europe, Asia etc. seem not to be available prior to 1984. In such cases it would have been better to show the rest as a gray rectangle instead of leaving it out (if world import data are available) or just not display any chart for years with grossly incomplete data.

Navigation is somewhat limited. For example, looking at a country chart (say United Kingdom), it would be great to click on any product type (say crude petroleum) and get to a corresponding Stacked Area Chart diagram for that product type. One can do so using the drop-down boxes on the right, but that’s less intuitive.

There are two export formats (PDF and SVG). SVG is a good choice since, as a vector format, it renders fonts cleanly even in small print. I obtained poor results with PDF, however, as the texts in TreeMaps were often not aligned properly and were printed on top of one another.

None of the above is a serious problem or even a showstopper. It would be great, however, if there were a feedback link to provide such info back to the authors and help improve the utility of this observatory.

 
Posted by on November 14, 2011 in Industrial, Scientific, Socioeconomic

 

The Atlas of Economic Complexity

Here is a recipe: Bring together renowned faculties like the MIT Media Lab and Harvard’s Center for International Development. Combine novel ideas about economic measures with years of solid economic research. Leverage large sets of world trade data. Apply network graph theory algorithms and throw in some stunning visualizations. The result: The Atlas of Economic Complexity, a revolutionary way of looking at world trade and understanding variations in countries’ paths to prosperity.

The main authors are Professors Ricardo Hausmann from Harvard and Cesar Hidalgo from MIT (whose graphic work on Human Development Indices we have reviewed here). The underlying research began in 2006 with the idea of the product space, which was published in Science in 2007. This post is the first in a two-part series covering both the atlas (theory, documentation) and the observatory (interactive visualization) of economic complexity. This research is an excellent example of how the availability of large amounts of data, computing power and free distribution via the Internet enables entirely new ways of looking at and understanding our world.

The Atlas of Economic Complexity is rooted in a set of ideas about how to measure economies based not just on the quantity of products traded, but also on the knowledge and capabilities required to produce them. World Trade data allows us to measure import and export product quantities directly, leading to indicators such as GDP, GDP per capita, Growth of GDP etc. However, we have no direct way to measure the knowledge required to create the products. A central observation is that complex products require more capabilities to produce, and countries that manufacture more complex products must possess more of these capabilities than those that do not. From Part I of the Atlas:

Ultimately, the complexity of an economy is related to the multiplicity of useful knowledge embedded in it. For a complex society to exist, and to sustain itself, people who know about design, marketing, finance, technology, human resource management, operations and trade law must be able to interact and combine their knowledge to make products. These same products cannot be made in societies that are missing parts of this capability set. Economic complexity, therefore, is expressed in the composition of a country’s productive output and reflects the structures that emerge to hold and combine knowledge.

Can we analyze world trade data in such a way as to tease out relative rankings in terms of these capabilities?

To this end, the authors start by looking at the trade web of countries exporting products. For each country, they examine how many different products it is capable of producing; this is called the country’s Diversity. And for each product, they look at how many countries can produce it; this is called the product’s Ubiquity. Based on these two measures, Diversity and Ubiquity, they introduce two complexity measures: The Economic Complexity Index (ECI, for a country) and the Product Complexity Index (PCI, for a product).
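Diversity and Ubiquity are simply the row and column counts of a country-by-product table. Here is a minimal sketch with made-up data; the function name, the dictionary layout and the toy countries are my own assumptions, and the further step from these counts to ECI and PCI is deliberately left out.

```python
def diversity_and_ubiquity(makes):
    """`makes` maps each country to the set of products it exports
    (for example, those with RCA above 1).

    Returns (diversity, ubiquity):
      diversity[country] = number of products the country exports
      ubiquity[product]  = number of countries exporting the product
    """
    diversity = {country: len(products) for country, products in makes.items()}
    ubiquity = {}
    for products in makes.values():
        for product in products:
            ubiquity[product] = ubiquity.get(product, 0) + 1
    return diversity, ubiquity

# Toy data, for illustration only.
toy_makes = {
    "A": {"logs", "shirts", "scanners"},
    "B": {"logs", "shirts"},
    "C": {"logs"},
}

diversity, ubiquity = diversity_and_ubiquity(toy_makes)
print(diversity)
print(ubiquity)
```

In this toy world "logs" are ubiquitous (three exporters) while "scanners" are rare and made only by the most diverse country, which is the kind of pattern the complexity indices build on.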

The mechanics of how these measures are calculated are somewhat sophisticated. Yet they encode some straightforward observations and are explained with some examples:

Take medical imaging devices. These machines are made in few places, but the countries that are able to make them, such as the United States or Germany, also export a large number of other products. We can infer that medical imaging devices are complex because few countries make them, and those that do tend to be diverse. By contrast, wood logs are exported by most countries, indicating that many countries have the knowledge required to export them. Now consider the case of raw diamonds. These products are extracted in very few places, making their ubiquity quite low. But is this a reflection of the high knowledge-intensity of raw diamonds? Of course not. If raw diamonds were complex, the countries that would extract diamonds should also be able to make many other things. Since Sierra Leone and Botswana are not very diversified, this indicates that something other than large volumes of knowledge is what makes diamonds rare.

A useful question is this: If a good cannot be produced in a country, where else can it be produced? Countries with higher economic complexity tend to produce more complex products which cannot easily be produced elsewhere. The algorithms are specified in the Atlas, but we will skip over these details here. Let’s take a look at the ranking of some 128 world countries (selected above minimum population size and trade volume as well as for reliable trade data availability).

Why is Economic Complexity important? The Atlas devotes an entire chapter to this question. The most important finding here is that ECI is a better predictor of a country’s future growth than many other commonly used indicators that measure human capital, governance or competitiveness.

Countries whose economic complexity is greater than what we would expect, given their level of income, tend to grow faster than those that are “too rich” for their current level of economic complexity. In this sense, economic complexity is not just a symptom or an expression of prosperity: it is a driver.

They include a lot of scatter-plots and regression analysis measuring the correlation between the above and other indicators. Again, the interested reader is referred to the original work.

Another interesting question is how Economic Complexity evolves. In some ways this is like a chicken & egg problem: For a complex product you need a lot of capabilities. But for any capability to provide value you need some products that require it. If a new product requires several capabilities which don’t exist in a country, then starting the production of such a product in the country will be hard. Hence, a country’s products tend to evolve along the already existing capabilities. Measuring the similarities in required capabilities directly would be fairly complicated. However, as a first approximation, one can deduce that products which are more often produced by the same country tend to require similar capabilities.

So the probability that a pair of products is co-exported carries information about how similar these products are. We use this idea to measure the proximity between all pairs of products in our dataset (see Technical Box 5.1 on Measuring Proximity). The collection of all proximities is a network connecting pairs of products that are significantly likely to be co-exported by many countries. We refer to this network as the product space and use it to study the productive structure of countries.
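To illustrate the co-export idea, here is a small sketch. It assumes one common formulation of proximity, which I believe is the one in the Atlas but have not checked against Technical Box 5.1: the number of countries exporting both products divided by the larger of the two products' ubiquities (in other words, the smaller of the two conditional probabilities). Function name and toy data are hypothetical.

```python
from itertools import combinations

def proximities(makes):
    """`makes` maps each country to the set of products it exports.

    Returns {(p, q): proximity} for every unordered pair of products,
    with proximity = |exporters of both| / max(|exporters of p|, |exporters of q|).
    """
    products = sorted(set().union(*makes.values()))
    exporters = {
        p: {country for country, ps in makes.items() if p in ps}
        for p in products
    }
    result = {}
    for p, q in combinations(products, 2):
        both = len(exporters[p] & exporters[q])
        result[(p, q)] = both / max(len(exporters[p]), len(exporters[q]))
    return result

# Toy data, for illustration only.
toy_makes = {
    "A": {"shirts", "trousers"},
    "B": {"shirts", "trousers", "engines"},
    "C": {"engines"},
}

for pair, phi in proximities(toy_makes).items():
    print(pair, phi)
```

Shirts and trousers are always exported together in the toy data (proximity 1.0), whereas engines share only one exporter with either of them (proximity 0.5).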

Then the authors proceed to visualize the Product Space. It is a graph with some 774 nodes (products) and edges representing the proximity values between those nodes. Only the top 1% strongest proximity edges are shown to keep the average degree of the graph below 5 (showing too many connections results in visual complexity). Network Science Algorithms are used to discover the highly connected communities into which the products naturally group. Those 34 communities are then color-coded. Using a combination of Minimum-Spanning-Tree and Force-Directed layout algorithms the network is then laid out and manually optimized to minimize edge crossings. The resulting Product Space graph looks like this:

Here the node size is determined by world trade volume in the product. If you step back for a moment and reflect on how much data is aggregated in such a graph it is truly amazing! One variation of the graph determines size by the Product Complexity as follows:

In this graph one can see that products within a community are of similar complexity, supporting the idea that they require similar capabilities, i.e. have high proximity. From these visualizations one can now analyze how a country moves through product space over time. Specifically, in the report there are graphs for the four countries Ghana, Poland, Thailand, and Turkey over three points in time (1975, 1990, 2009). From the original document I put together a composite showing the first two countries, Ghana and Poland.

While Ghana’s ECI doesn’t change much, Poland grows into many products similar to those where they started in 1975. This clearly increases Poland’s ECI and contributes to the strong growth Poland has seen since 1975. (Black squares show products produced by the country with a Revealed Comparative Advantage RCA > 1.0.)

In all cases we see that new industries –new black squares– tend to lie close to the industries already present in these countries. The productive transformation undergone by Poland, Thailand and Turkey, however, look striking compared to that of Ghana. Thailand and Turkey, in particular, moved from mostly agricultural societies to manufacturing powerhouses during the 1975-2009 period. Poland, also “exploded” towards the center of the product space during the last two decades, becoming a manufacturer of most products in both the home and office and the processed foods community and significantly increasing its participation in the production of machinery. These transformations imply an increase in embedded knowledge that is reflected in our Economic Complexity Index. Ultimately, it is these transformations that underpinned the impressive growth performance of these countries.

The Atlas goes on to provide rankings of countries along five axes such as ECI, GDP per capita Growth, GDP Growth etc. The finding that higher ECI is a strong driver for GDP growth allows for predictions about GDP Growth until 2020. In that ranking, Sub-Saharan East African countries are at the top (8 of the Top 10), led by Uganda, Kenya and Tanzania. Here is the GDP Growth ranking in graphical form – the band around the Indian Ocean is where the most GDP Growth is going to happen during this decade.

Each country has its own Product Space map. It shows which products and capability sets the country already has, which other similar products it could produce with relatively few additional capabilities and where it is more severely lacking. As such it can provide either the country itself or a multi-national firm looking to expand with useful information. The authors sum up the chapter on how this Atlas can be used as follows:

A map does not tell people where to go, but it does help them determine their destination and chart their journey towards it. A map empowers by describing opportunities that would not be obvious in the absence of it. If the secret to development is the accumulation of productive knowledge, at a societal rather than individual level, then the process necessarily requires the involvement of many explorers, not just a few planners. This is why the maps we provide in this Atlas are intended for everyone to use.

We will look at the rich visualizations of the data sets in this Atlas in a forthcoming second installment of this series.

 
Posted by on November 10, 2011 in Industrial, Scientific, Socioeconomic

 

Implementation of TreeMap

After posting on TreeMaps twice before (TreeMap of the Market and original post here) I wanted to better understand how they can be implemented.

In his book “Visualize This” – which we reviewed here – author Nathan Yau has a short chapter on TreeMaps, which he also published on his FlowingData Blog here. He is working with the statistical programming language R and uses a library which implements TreeMaps. While this allows for very easy creation of a TreeMap with just a few lines of code, from the perspective of how the TreeMap is constructed this is still a black box.

I searched for existing implementations of TreeMaps in Mathematica (which I am using for many visualization projects). Surprisingly I didn’t find any implementations, despite the 20 year history of both the Mathematica platform and the TreeMap concept. So I decided to learn by implementing a TreeMap algorithm myself.

Let’s recap: A TreeMap turns a tree of numeric values into a planar, space-filling map. A rectangular area is subdivided into smaller rectangles with sizes in relation to the values of the tree nodes. The color can be mapped based on either that same value or some other corresponding value.

One algorithm for TreeMaps is called slice-and-dice. It starts at the top-level and works recursively down to the leaf level of the tree. Suppose you have N values at any given level of the tree and a corresponding rectangle.
a) Sort the values in descending order.
b) Select the first k values (0<k<N) which sum to at least the split-ratio of the values total.
c) Split the rectangle into two parts according to split-ratio along its longer side (to avoid very narrow shapes).
d) Allocate the first k values to the split-off part, the remaining N-k values to the rest of the rectangle.
e) Repeat as long as you have sublists with more than one value (N>1) at current level.
f) For each node at current level, map its sub-tree onto the corresponding rectangle (until you reach leaf level).

As an example, consider the list of values {6,5,4,3,2,1}. Their sum is 21. If we have a split-ratio parameter of say 0.4, then we split the values into {6,5} and {4,3,2,1}, since 6/21 = 0.29 falls short of 0.4 while (6+5)/21 = 0.52 > 0.4. We then continue with {6,5} in the first portion of the rectangle and with {4,3,2,1} in the other portion.
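For readers without Mathematica, the steps and the example above can be sketched in Python. This is my own illustrative port, not the CDF source code. All names are hypothetical, a tree is represented as a nested list of positive numbers, and one detail is an assumption on my part: the rectangle is cut at the actual value fraction of the first group (11/21 in the example) rather than at the split-ratio itself, so that every tile's area stays proportional to its value.

```python
def weight(node):
    """Value of a leaf, or the sum of all leaves below an inner node."""
    if isinstance(node, (int, float)):
        return node
    return sum(weight(child) for child in node)

def split_index(values, split_ratio):
    """Step b: smallest k (0 < k < N) whose first k values reach
    split_ratio of the total; `values` must be sorted descending."""
    total = sum(values)
    running = 0.0
    for k, value in enumerate(values[:-1], start=1):
        running += value
        if running >= split_ratio * total:
            return k
    return len(values) - 1

def layout(nodes, rect, split_ratio):
    """Steps a to e: tile `rect` = (x, y, w, h) with one rectangle per node."""
    nodes = sorted(nodes, key=weight, reverse=True)            # step a
    if len(nodes) == 1:
        return [(nodes[0], rect)]
    weights = [weight(n) for n in nodes]
    k = split_index(weights, split_ratio)                      # step b
    fraction = sum(weights[:k]) / sum(weights)  # assumption: cut by value share
    x, y, w, h = rect
    if w >= h:                                                 # step c: cut the longer side
        first = (x, y, w * fraction, h)
        rest = (x + w * fraction, y, w * (1 - fraction), h)
    else:
        first = (x, y, w, h * fraction)
        rest = (x, y + h * fraction, w, h * (1 - fraction))
    return (layout(nodes[:k], first, split_ratio)              # steps d and e
            + layout(nodes[k:], rest, split_ratio))

def treemap(node, rect, split_ratio=0.4):
    """Step f: recurse into sub-trees; returns a list of (leaf value, rectangle)."""
    if isinstance(node, (int, float)):
        return [(node, rect)]
    tiles = []
    for child, child_rect in layout(node, rect, split_ratio):
        tiles.extend(treemap(child, child_rect, split_ratio))
    return tiles

for value, tile in treemap([6, 5, 4, 3, 2, 1], (0.0, 0.0, 21.0, 10.0)):
    print(value, tile)
```

The sketch only computes rectangles; drawing and coloring them is left to whatever plotting tool you prefer. Empty lists and non-positive values are not handled.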

Let's look at the results of such an algorithm. Here I'm using a two-level tree with a branching factor of 6 and random values between 0 (dark) and 100 (bright). The animation is iterating through various split-ratios from 0.1 to 0.9:

Notice how the layout changes as a result of the split-ratio parameter. If it’s near 0 or 1, then we tend to get thinner stripes; when it’s closer to 0.5 we get more square shaped containers (i.e. lower aspect ratios).

The recursive algorithm becomes apparent when we use a tree with two levels. You can still recognize the containers from level 1 which are then sub-divided at level 2:

One of the fundamental tenets of this Blog is that interactive visualizations lead to better understanding of structure in the data or of the dynamic properties of a model. You can interact with this algorithm in the TreeMap model in Computable Document Format (CDF). Simply click on the graphic above and you get redirected to a site where you can interact with the model (requires one-time loading of the free CDF Browser Plug-In). You can change the shape of the outer rectangle, adjust the tree level and split-ratio and pick different color-schemes. The values are shown as Tooltips when you hover over the corresponding rectangle. You also have access to the Mathematica source code if you want to modify it further. Here is a TreeMap with three levels:

Of course a more complete implementation would allow the user to vary the color-controlling parameter, to filter the values and to re-arrange the dimensions as different levels of the tree. Perhaps someone can start with this Mathematica code and take it to the next level. The previous TreeMap post points to several tools and galleries with interactive applications so you can experiment with those.

Lastly, I wanted to point out a good article by the creator of TreeMaps, Ben Shneiderman. In this 2006 paper called “Discovering Business intelligence Using Treemap Visualizations” he cites various BI applications of TreeMaps. Several studies have shown that TreeMaps allow users to recognize certain patterns in the data (like best and worst performing sales reps or regions) faster than with other more traditional chart techniques. No wonder that TreeMaps are finding their way into more and more tools and Dashboard applications.

 
Posted by on November 9, 2011 in Industrial, Scientific

 

7 Billion

World population has just reached 7 Billion this week. Exploring the growth of population and related aspects such as consumption, land use, urbanization etc. lends itself very well to data visualization. In this context, the National Geographic Society has released a free iPad app called “7 Billion” together with its Special Series: 7 Billion website.

The iPad app features some interesting charts under the heading “The Shape Of Seven Billion”. These visualizations come in the form of cartograms, a type of map that ignores a country’s true physical size and scales the size according to other data. Here they show population (current 2011 vs. 1960, when world population was around 3 Billion).

Population Cartogram 2011 (Source: National Geographic iPad App 7 Billion)

The position of countries is roughly preserved, the size is proportionate to the country population, and the color legend shows the amount of growth since 1960. The strongest growth (red, more than 300%) happened in Africa and the Middle East. Europe, Russia and Japan had the least amount of growth (blue, under 50%). India and China are by far the most populous countries, with India growing faster than China.

Another interesting cartogram illustrates consumption (as measured in Gross Domestic Product, GDP). Here the reference year is 1980 and is shown first in black & white:

Consumption Chart 1980 (Source: National Geographic, iPad App 7-Billion)

Compare this to the current Consumption or GDP distribution as of 2011:

World Consumption Chart 2011 (Source: National Geographic iPad App 7 Billion)

The size of the countries here is proportionate to their GDP (in constant international dollars using purchasing power parity rates). The color scale has red (more than $40,000 per capita) and blue (less than $3,000 per capita) at the two ends of the spectrum. While the United States is clearly dominating this picture, Europe has about the same size and China isn’t far behind. However, China has had the world’s largest GDP increase of 1,506% since 1980 (ending at roughly 16 times its 1980 level), whereas the GDP of the U.S. grew by 119% (a bit more than doubled) during the same period of time.

Ideally one would be able to see this cartogram animated over time with sizes of countries shrinking or growing and changing colors over time, similar to the Bubble Charts we looked at earlier on this Blog.

There are many other interesting charts in this interactive eBook style app. For example, here is a chart showing the population growth over time – a good visualization of the power of exponential growth.

World Population Growth and Projection (Source: National Geographic 7 Billion iPad App)

One graphic aims at explaining the main drivers behind the explosive growth over the last two centuries after relatively slow growth for millennia – the improvements in health care and resulting drop in death rate led to a period of far greater birth rates than death rates.

Population Growth as Function of Birth Rate minus Death Rate

An interesting visualization idea has been published in a video by NPR using a bucket for each continent and visualizing the birth rate as water drops falling into the bucket and the death rate as drops leaking out of it. It is obvious that when more water is dropping in at the top (births) than dropping out at the bottom (deaths), the buckets fill up.
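The bucket picture translates directly into a few lines of code. This is a toy model of my own with hypothetical rates, not data from NPR or National Geographic: each year the population changes by births minus deaths, with both rates given per 1,000 people.

```python
def simulate(population, birth_rate, death_rate, years):
    """Return the population at the start and after each year.

    birth_rate and death_rate are per 1,000 people per year and are
    held constant, which real rates of course are not.
    """
    history = [population]
    for _ in range(years):
        population += population * (birth_rate - death_rate) / 1000.0
        history.append(population)
    return history

# Hypothetical rates, for illustration only.
filling = simulate(1000.0, birth_rate=30, death_rate=10, years=35)
steady = simulate(1000.0, birth_rate=10, death_rate=10, years=35)
print(round(filling[-1]), round(steady[-1]))
```

With a gap of 20 per 1,000 (2% per year) the bucket doubles in about 35 years, while equal rates leave the level unchanged.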

As a final example, consider this chart visualizing our even faster growing environmental impact: Since there is not just the Population size, but at least two other factors – Affluence and Technology – the multiplicative impact is growing even faster. With the use of three dimensions and the formula I = P * A * T this yields a simple but effective illustration.

Multiplicative Human Impact through Population, Affluence and Technology
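A quick back-of-the-envelope sketch shows why the product in I = P * A * T grows so much faster than any single factor. The function name and the growth figures are my own, chosen purely for illustration.

```python
def impact_growth(population_growth, affluence_growth, technology_growth):
    """Fractional growth of I = P * A * T, given the fractional growth
    of each factor (0.5 means +50%, 1.0 means a doubling)."""
    return ((1 + population_growth)
            * (1 + affluence_growth)
            * (1 + technology_growth)) - 1

# Hypothetical: every factor doubles.
print(impact_growth(1.0, 1.0, 1.0))   # impact grows by 700%, i.e. 8 times the start

# Hypothetical: each factor grows by a modest 50%.
print(impact_growth(0.5, 0.5, 0.5))   # impact grows by 237.5%
```

Three doublings do not add up to a sixfold impact; they multiply to eight times the starting level.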

Of course a short Blog post can’t do justice to all aspects of an app or eBook. There is a lot more to this app than shown here. But I hope you got an impression as to how interactive graphics can help communicate abstract and quantitative ideas in a more intuitive way.

 
Posted by on November 4, 2011 in Socioeconomic

 
