
Monthly Archives: June 2012

Interactive Tournament Map

I hadn’t followed the UEFA 2012 European football championship (called soccer in the US) and wanted to catch up on where things stand. Enter the interactive tournament map on the official UEFA website:

Row selection highlights games at that stadium

When you first enter the map, it animates the timeline from left to right by drawing the colored lines for each team. The tabular layout shows time in daily columns from left to right and teams in rows, arranged by the four tournament groups. Today’s day column is always highlighted. Here are some of the interactive elements:

  • Mousing over any of the colored lines highlights the corresponding team’s games along its timeline.
  • Clicking on a particular day column header highlights the games played on that date.
  • Clicking on the stadium symbol at the right end highlights the games played at that stadium.
  • Clicking on any circle brings up a dialog with details for that game.
  • Clicking on a row header on the left brings up a dialog with details for that team.
  • Selecting a tournament stage at the bottom (quarter-finals, semi-finals, final) moves the view to the corresponding date interval.

Detail for team Spain

Spain is the reigning football world champion, so they are clearly one of the favorites of this tournament and will actually play their semi-final against Portugal later this evening.

The final will be played in the Olympic Stadium in Kyiv, capital of participating host country Ukraine.

Detail with game schedule for stadium

From these details you can click on the games and get to yet more detail (videos, comments, etc.) for that particular game.

When I first looked at the map, the amount of information displayed had me a bit confused. The colors are often difficult to tell apart, for example the three orange-red tones in Group B. The black background feels attractive, although I could do without the pattern overlay, which doesn’t add information and only distracts. Lastly, I could do without the colorful advertisements around the map. At first glance I thought the stadium symbols on the right were also just colored ads.

The interactive nature made the map grow on me. It’s intuitive and the tabular layout is easy to navigate. You may not have a screen wide enough to see the map in its entirety, but I suppose you wouldn’t want to see time down the vertical axis, would you?

Postscript 7/1/12: Sure enough, Spain beat Italy 4:0 in today’s final and went on to become the European football champion 2012.

 

Posted by on June 27, 2012 in Recreational

 


Self-publishing to Apple bookstore

Over the last couple of weeks I finished writing the book about my adventure of a lifetime: Panamerican Peaks, cycling from Alaska to Patagonia and climbing the highest mountain of every country along the way. By now I have successfully self-published the book to the Apple bookstore. This post gives a recap of the steps involved in that process, with a focus on the tools, logistics and finally some numbers and sales stats.

Disclaimer: In my personal life I am an avid Apple fan, and this post is heavily biased towards Apple products. In particular, the eBook is only available for the iPad. So the tools and publishing route described below may not be for everybody, but the process and lessons learnt may still be of interest.

Path to self-publishing on Apple bookstore

Creating Content

The first step is obviously to create, select and edit the content of the book. During the actual trip I tried documenting my experiences via the following:

  • Taking about 10,000 photos with digital cameras (Olympus and Panasonic)
  • Taking daily notes with riding or climbing stats (on iPhone or NetBook)
  • Shooting about 200 video clips (Flip Mino)
  • Uploading photos (to Picasa) and videos (to YouTube)
  • Writing posts on my personal Blog

In the months after coming home I refined some of the above material. Using iMovie I created movies of about 5 minutes each based on video clips, photos and map animations, typically with some iTunes song in the background and a bit of explanatory text or commentary. I shared those videos on my personal Blog and on my Panamerican Peaks YouTube channel.

I loaded all photos into Aperture on our iMac and tagged and rated them. That allowed me to organize them by topic or as required. The ‘Smart Folders’ feature of Aperture comes in handy here, as it lets you set up filters and select a subset of photos without having to copy them. For example, if I wanted photos rated 4 stars or higher related to camping, or photos of mountains in Central America, I just needed to create another Smart Folder. This was very useful, for example, for the Panamerican Peaks Synopsis video, which features quick photo sets by topic (cycling, climbing, camping, etc.).

Google Earth proved to be a very useful tool as I could easily create maps of the trip based on the recorded GPS coordinates from my SPOT tracker. One can even retrace the trip in often astonishing detail thanks to Google Street View. For example, in many places along the Pacific Coast I can look at campgrounds or road-side restaurants where I stopped during my journey. I even created a video illustrating the climbing route on Mount Logan from within Google Earth.

The heart and soul of any book is of course the story and the text used to tell it. I created multiple chapters using MS Word because I am so used to it, but one can of course use any modern text writing tool. In addition, I created some slides for presentations I gave last summer using Keynote.

Book Layout

Once all the ingredients were available, it was time to compose the actual book. As I had decided to build an eBook for the iPad, I used Apple’s new iBooks Author tool on my MacBook Pro. This meant choosing the layout and including the text and media. iBooks Author provides a few interactive widgets and accepts all widgets that can be installed into the OS X dashboard. This in particular allowed me to link to the various YouTube videos. I could always get a preview of the book copied out to my attached iPad 2.

After many weeks of busy work putting the finishing touches on the book and adding various edits from a few trusted friends I got to the point where I needed to figure out how to get the book published in Apple’s bookstore. There are two steps required here:

  1. Creating a developer account with Apple via iTunes Connect
  2. Managing one’s content via iTunes Producer

The creation of the account is fairly straightforward through the web browser. To get started, I visited Apple’s Content Provider FAQ page and filled out an application. One submits basic information such as name, address, tax ID, credit card information, and ties it all to an existing Apple account. It can take a while. I never received the account validation email I was promised. So after a few days I started inquiring in Apple’s support forum. This had happened to others. Finally I just tried connecting via web browser to itunesconnect.apple.com and it worked – I had an account to publish from.

The packaging of all material and uploading is done via the free iTunes Producer app on the Mac. iBooks Author exports the book in .ibooks format, which becomes part of the iTunes Producer package. One can also provide a free sample for the book. This can be any subset or variation of the full book, unlike with Amazon’s bookstore, where the free sample is always the first N pages.

Next, one needs to provide additional metadata such as book category, description, author name, optional sample screen shots etc. One also has to provide an ISBN (International Standard Book Number) for the book. These can be obtained from publishers or directly purchased from Bowker. This stems from the need to catalogue and identify physical books in inventory or libraries, but seems a bit anachronistic for electronic books. The prices for ISBNs are very high, especially for small volumes (1 for $125, 10 for $250, 100 for $500, 1000 for $1000). But since Bowker has a monopoly in the US you don’t have a choice in that matter. This expense seemed to be the only marginal upfront cost to publishing the book (aside from the tools to create the content).
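To make the volume discount concrete, here is a minimal sketch that divides each block price quoted above by its block size. The dictionary and function names are made up for illustration; the prices are simply the ones listed in this post.

```python
# Bowker price tiers as quoted in the post: block size -> total price in USD.
isbn_tiers = {1: 125, 10: 250, 100: 500, 1000: 1000}


def price_per_isbn(tiers):
    """Return the cost of a single ISBN for each block size."""
    return {size: total / size for size, total in tiers.items()}


per_isbn = price_per_isbn(isbn_tiers)
for size, unit_price in per_isbn.items():
    print(f"{size:>4} ISBNs: ${unit_price:,.2f} each")
```

A single ISBN costs $125, while each ISBN in the block of 1000 works out to $1, which is why the pricing hits authors with a single title hardest.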

Finally one can determine the pricing and the markets where the content is to be sold. Apple follows the agency model of book publishing: As author you get to set the price. As distributor Apple takes a share of your proceeds, here 30%. (By contrast, in the wholesale model you sell to the distributor at a discount, say 50% of the suggested retail price; the retailer then has sole discretion to set the price.)

Book Review

Much has been written about the very restrictive terms and conditions Apple puts on authors using their iBooks Author tool. Essentially it locks you in as an author to sell only through Apple. For many authors that is not a viable option. It also allows Apple to reject your work at their sole discretion. So as an author you are completely at the mercy of Apple’s review process.

Apple is also strict about enforcing certain rules regarding the content it allows you to sell. For example, your book cannot contain any links to YouTube videos or Amazon books. They rejected my first revision with YouTube links and suggested embedding all videos. This would have bloated the download size of the book by more than 1 GB. As a compromise, I created short 1 min teaser versions of all videos and included those. At the end they display a screen to go to the companion website (my personal Blog) for the full versions.

After 3 revision cycles and about a week, I finally had my book on sale in 24 countries around the world, for $9.99 or the equivalent in Euro or other countries’ currencies.

Book Marketing

Publishing is not selling. Here are some of the things I did to promote my own book:

  • Email – Customized note to Hotmail contacts (~ 300 contacts)
  • Twitter – Tweets and direct messages to influencers for retweets (~ 2000 followers)
  • Facebook – My daughter posted on her wall (1,000+ friends)

Sending the emails was not without hiccups. I used MS Word and Outlook to do a mail merge with text blocks and individual text from an Excel spreadsheet. First, the Mail Merge Filter condition dialog has a bug which replicates the last AND condition and adds it as an OR condition. This screws up your filter and ends up selecting lots of folks you didn’t mean to. I found this bug during a test with the first 5 addresses. (I sent them each an apologetic email explaining this.) Then after I did the filtering all in the spreadsheet it worked and Outlook cranked out the emails. After a short while, Hotmail decided that my account had apparently been hacked and used for spam, so they locked my account down! In a way this is good, but I didn’t consider my carefully crafted and personalized emails spam. So I had to change my password and unlock my account again.

The email was very effective. I got lots of positive responses and a few folks decided to buy right away. I had sold my first copy. Every journey of 1000 miles starts with a single step.

As a result of my daughter posting the news on Facebook I noticed a spike (4x average) in the views of my Blog and Book page. I also offered promo codes for a free book download to influential Twitter users if they would retweet the book announcement to their followers. Within a couple of days a handful of them accepted the offer and retweeted, which exposed the tweet to a total of 2,000+ followers.

I had emailed the Apple bookstore, and to my delight they actually featured my book in their Travel & Adventure category.

My book featured in Apple’s bookstore, Travel & Adventure section

Book Sales

With all these promotion efforts I couldn’t wait until the next morning to see the sales numbers. (iTunes Connect updates its sales numbers only once a day.) I had the first ratings and reviews come in, all at 5 stars. Naturally, I hoped to see the sales numbers go up. After all, I had reached hundreds, if not thousands of people, most of whom either know me or are somewhat interested in adventure. The result? Tiny sales numbers. To date, after one week, I have sold 14 copies, with a maximum of four (4) copies per day. At my $10 price and 70% share this amounts to just under $100 for the first week. Not exactly enough to retire on.
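The first-week figure follows directly from the agency model: copies sold, times list price, times the author's 70% share. Here is a minimal sketch of that arithmetic; the function name is made up, and the calculation assumes every copy sold at the US price of $9.99, ignoring currency differences between countries.

```python
def author_proceeds(copies, list_price, author_share=0.70):
    """Author's proceeds under the agency model (the distributor keeps the rest)."""
    return copies * list_price * author_share


# Assumption: all 14 first-week copies sold at the US list price of $9.99.
first_week = author_proceeds(copies=14, list_price=9.99)
print(f"First-week proceeds: ${first_week:.2f}")
```

This comes out to roughly $97.90, matching the "just under $100" above.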

I’ll revisit this topic at some point in the future when I have more data. Obviously, the iPad is just a fraction of the entire book market with Kindle, Nook and other devices. (Although, the iBook looks much better on the iPad than on many other readers, in particular the smaller black & white e-Ink display Kindle readers.) While the selection of titles seems comparable on Apple’s and Amazon’s bookstores, about 1.35 million each (see a spreadsheet of my recent sample here), there don’t appear to be many shoppers in Apple’s bookstore. Of course, Travel & Adventure is only a small fraction of the book market. But even there, on a day where I sold two copies my book briefly ranked 30th in the Top Charts. 30th out of 11,800 titles (in Travel & Adventure)! That means the other 11,770 titles sold even fewer copies than mine (i.e. one or none) during the sampling time interval. Book sales appear very unevenly distributed, another case of huge online inequality.

But more importantly, most of the people reached by my promotional efforts don’t engage to the level of actually following the links, downloading the sample and finally buying the book. From my experience, one needs to reach more than 100 people for every one book sold. Fellow adventure traveller and author Andrew Hyde – whose book coincidentally is featured just above mine in the screen-shot above – has recently written about his book sales here. His stats show a similar small fraction of sales to views. I just don’t have the millions of Twitter followers to generate meaningful sales this way!

 

Posted by on June 21, 2012 in Recreational

 


Visualign Blog – View Stats for first year and a half

I started this Data Visualization Blog back at the end of May 2011. WordPress provides decent analytics to measure things like views, referrer, clicks, etc. The built-in stats show bar charts by day/week/month, views by country, top posts and pages, search engine terms, comments, followers, tags and so on. I have accumulated the view data and wanted to share some analysis thereof.

At this point there are 17,000 views and 56 posts (about 1 post per week). The weekly views have grown as follows:

Weekly Views of Visualign Blog

The WordPress dashboard for monthly views looks like this:

Assuming an exponential growth process, this amounts to a doubling roughly every 3 months. This may not sound like much, but if it were to continue, it would lead to a 16x increase per year or a 4096x increase in 3 years. Throughout the first year this model has been fairly accurate and allowed me to predict when certain milestones would be reached (such as 10k views, reached in Apr-2012, or 100k views, predicted by Jan-2013).
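The growth factors and the milestone estimate can be checked with a few lines. This is only a sketch of the pure exponential model, assuming a constant 3-month doubling time and starting from the roughly 17,000 views mentioned above; the function names are made up for illustration.

```python
import math


def growth_factor(months, doubling_months=3):
    """Multiplicative growth over `months`, given a constant doubling time."""
    return 2 ** (months / doubling_months)


def months_to_reach(target, current, doubling_months=3):
    """Months until `current` grows to `target` under the same model."""
    return doubling_months * math.log2(target / current)


print(growth_factor(12))   # growth over one year
print(growth_factor(36))   # growth over three years
print(round(months_to_reach(100_000, 17_000), 1))  # months until 100k views
```

Four doublings per year give the 16x, twelve doublings the 4096x, and going from 17,000 to 100,000 views takes between 7 and 8 months under this model.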

However, the underlying process is not a simple exponential growth process. Instead it is the result of multiple forces, some increasing, some decreasing, such as level of interest of fresh content for target audience, rather short half-life of web content, size of audience, frequency of emails or tweets with links to the content etc. So I expect growth to slow down and consequently the 100k views milestone to be pushed out past Jan-2013.

Views come from some 112 countries, albeit very unevenly distributed.

Views by Country (10244 views since Feb-25, 2012)

The Top 2 countries (United States and United Kingdom) contribute nearly half of the views, and the Top 10 countries (9% of all countries) nearly 75% of all views. The fairly high Gini index of this distribution (~0.83) indicates strong dependency on just a few countries. The only surprise for me in the Top 10 list was South Korea, ranking fifth and slightly ahead of India. Germany is probably a bit over-represented due to my German business partner (RapidBusinessModeling) and related network.

Views by country with Top 10 list
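For readers who want to compute a Gini index for their own stats, here is a minimal sketch using the standard rank-weighted formula. The per-country numbers in `sample_views` are invented for illustration and are not the Blog's actual data, so the result will not match the ~0.83 quoted above.

```python
def gini(values):
    """Gini index of non-negative values: 0 = perfectly even, near 1 = concentrated."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # with x_i sorted ascending and i starting at 1.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical views per country, NOT the real numbers behind this post.
sample_views = [5000, 2500, 800, 400, 200, 100, 50, 25, 10, 5]
print(round(gini(sample_views), 2))
```

Two sanity checks help with intuition: equal views in every country give 0, and all views coming from one of n countries give (n - 1) / n.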

One interesting analysis comes from looking at the distribution of views over weekdays. Not every weekday is the same. Thursday is the busiest day, Saturday the quietest. After a little more than one year, averaging over some 56 weeks, the distribution looks like this.

Weekday variation of Blog views averaged over 1st year

Of course, time zone boundaries may cause some distortions here, but it looks like the view activity builds during the week until it hits a peak on Thursday. Then it falls sharply to a low on Saturday, and builds from there again. This fits with intuition: One would expect the weekend days to be low as well as Monday and Friday to be lower than the mid-week days. It’s tempting to correlate that with the amount of work or research getting done by professionals. The underlying assumption is that people discover or revisit my Blog when it fits into their work.
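The weekday profile is easy to reproduce from a list of daily view counts. The sketch below assumes you have exported views per calendar day; the two weeks of numbers in `sample` are invented to mimic the pattern described here and are not taken from the shared data.

```python
from datetime import date, timedelta

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def average_views_by_weekday(daily_views):
    """Map each weekday name to the mean views over all dates falling on it."""
    totals = [0] * 7
    counts = [0] * 7
    for day, views in daily_views.items():
        idx = day.weekday()  # Monday == 0 ... Sunday == 6
        totals[idx] += views
        counts[idx] += 1
    return {WEEKDAYS[i]: totals[i] / counts[i] for i in range(7) if counts[i]}


# Hypothetical two weeks of daily views, starting on Monday, June 4, 2012.
start = date(2012, 6, 4)
sample = [40, 45, 50, 55, 42, 25, 30, 44, 47, 52, 57, 40, 27, 32]
daily_views = {start + timedelta(days=i): v for i, v in enumerate(sample)}

averages = average_views_by_weekday(daily_views)
busiest = max(averages, key=averages.get)
quietest = min(averages, key=averages.get)
print(averages)
print(busiest, quietest)
```

Note that the dates here are whatever the stats tool reports, so the time zone caveat above applies to this calculation as well.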

A large fraction (> 65%) of referrals comes from search engines. Within those, it’s mostly Google (>90% summed across many countries) with just a small amount of others like Bing. It’s safe to say that without Google search my Blog would have practically no views. Chances are that your first exposure to this Blog came from a Google search as well. One unexpected insight for me was to see a high ratio of image to text searches, typically 3:1 or 4:1. In some ways it shouldn’t be surprising that a blog on data visualizations gets discovered more often by searching for visual elements than for text. It also jibes with the enormous growth of image related sites such as Instagram or Pinterest. I just would not have expected the ratio to be that high.

The beginning is always slow. But any exponential growth sooner or later leads to rather large numbers. So the real question is how one can keep the exponential growth process going. I’d love to hear your comments. If you want to compare this against your own Blog stats, I have shared the underlying data as a Google doc here. I have no idea how this compares to other blog stats in similar domains. If you know of any other public Blog stats analysis, please comment with a pointer below. Thanks.

Addendum 7/11/2012: Today my Blog reached 20,000 views. I noticed over the last few weeks that the deviation from an exponential growth model was getting quite large. For an exponential trend line R² = 0.9886.

Daily views with 20,000 total view milestone

If the weekly views are instead modeled as growing linearly, the total views grow quadratically. Curve fitting the total views with a 2nd order polynomial yields a very good fit (R² = 0.9977).

Total views growth curve with quadratic curve fit

Linear growth of weekly views is compatible with approximately linear increase in content (steady frequency of about 1 post / week) and thus increased chance of Google search indexing new content (with Google search the main source of view traffic). Quadratic growth of total views is also nonlinear, but far slower than exponential growth. For example, the 100,000 view milestone is now projected to be reached in 08/2013 instead of in 01/2013, i.e. in 13 months as compared to 7 months.
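Why linear weekly views imply quadratic totals can be seen by simply accumulating a linear series. The sketch below uses made-up parameters (`base` and `slope` are hypothetical, not fitted to the Blog data) and checks that the second differences of the running total are constant, which is the signature of a quadratic.

```python
from itertools import accumulate


def weekly_views(week, base=50, slope=10):
    """Hypothetical linear model: views in a given week (1-based)."""
    return base + slope * week


def differences(seq):
    """Differences between consecutive elements."""
    return [b - a for a, b in zip(seq, seq[1:])]


weeks = range(1, 57)                      # 56 weeks, as in the first year
weekly = [weekly_views(w) for w in weeks]
total = list(accumulate(weekly))          # running total of views

second_diffs = differences(differences(total))
print(set(second_diffs))                  # a single constant value: the slope
print(total[-1])                          # total after 56 weeks
```

In closed form the total after n weeks is base * n + slope * n * (n + 1) / 2, a 2nd order polynomial in n, which is consistent with the good polynomial fit described above.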

Addendum 11/1/2012: The Blog reached 30,000 views on Oct-19 and here is a chart of the monthly views through Oct-2012:

Monthly Blog views through Oct-2012

August and September have been slow, presumably seasonal variation. I also didn’t post between late August and mid October. The view data of the last couple of months no longer support the theory of significant growth in view frequency. Instead, multiple dynamic factors come into play. At times views spike due to a mention or a post of temporary interest – such as the recent post on visualizing superstorm Sandy. But such spikes quickly fade away according to the very limited half-life of web information these days. The undulating 4 week trailing average in weekly views below visualizes this clearly. The net effect has been a plateau in view frequency around 3000 per month.

Weekly Views with average Nov 2012
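A trailing average like the one in the chart takes only a few lines to compute. This is a sketch with invented weekly numbers (one artificial spike included), not the actual series; the function name and the handling of the first weeks, where fewer than four values exist, are my own choices for illustration.

```python
def trailing_average(values, window=4):
    """Mean of the current and up to `window - 1` preceding values."""
    averaged = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


# Hypothetical weekly views with a one-week spike in week 3.
weekly = [700, 720, 1500, 760, 740, 710, 730]
smoothed = trailing_average(weekly)
for week, (raw, avg) in enumerate(zip(weekly, smoothed), start=1):
    print(f"week {week}: {raw:>5} views, 4-week average {avg:7.1f}")
```

The spike lifts the average for four weeks and then drops out of the window entirely, which produces the undulating shape seen in the chart.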

I continue to see most of the referrals coming from Google searches, still with a majority of those being image searches. Engagement growth has been anemic, with relatively few comments, back links or other forms of engagement. It seems to me that growth proceeds in phases, with growth spurts interspersed by plateaus of varying length. One such growth spurt has been reported by Andrei Pandre on his Data Visualization Blog through the use of Google+. Perhaps it’s time to extend this Blog to Google+ as well.

Variation of views by weekday

With regard to variation of views by weekday, the qualitative pattern remains. Tuesday is now emerging as the day with the most views, with Monday, Wednesday, and Thursday slightly behind, but still above average. Friday is slightly below average, Saturday is the lowest day with only half the views and Sunday in between.

I’m not sure whether this means that important posts should be published on a particular weekday. Again, most views come from Google searches and are accumulated over time, so perhaps only the height of the initial spike will vary somewhat based on the publishing weekday.

 

Posted by on June 12, 2012 in Scientific

 

Venn Diagrams

The private library Blog had a post with some word play relating to sound, spelling and meaning of words in the English language. From their post on Homographic Homophones:

English is one of the most difficult languages in the world for a non-native speaker to learn. One of the reasons why this is so is that English has a large number of words that are pronounced the same as other words (i.e., they are homophones) even though they have quite different meanings. Homophones such as pare, pair and pear, for example, have the same pronunciation but are spelled differently and have different meanings (heterographic homophones). Other homophones — tender (locomotive), tender (feeling) and tender (resignation), for instance — are spelled the same and pronounced the same (homographic homophones) but have different meanings (i.e., they are homonyms).

Got all that?  Wikipedia has a nice Venn diagram that may help you sort it out:

Venn Diagram displaying meaning, spelling, and pronunciation of words (Source: Wikipedia)

Of course, you could also list the above combinations in a table. If you’re interested, Carol Moore has done just that on her Buzzy Bee riddle page.

A beautifully symmetric 5 set Venn diagram drawn from ellipses has been proposed by Branko Grünbaum and drawn by Wikipedia contributor Cmglee:

Symmetrical_5-set_Venn_diagram (Source: Wikipedia)

Such set-based diagrams invite a more mathematical notation. Cmglee annotates his image with this snippet:

Labels have been simplified for greater readability; for example, A denotes A ∩ Bᶜ ∩ Cᶜ ∩ Dᶜ ∩ Eᶜ (or A ∩ ~B ∩ ~C ∩ ~D ∩ ~E), while BCE denotes Aᶜ ∩ B ∩ C ∩ Dᶜ ∩ E (or ~A ∩ B ∩ C ∩ ~D ∩ E).
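The simplified labels can be generated mechanically: every non-empty combination of the five set names is one region of the diagram. The sketch below enumerates them and expands a short label back into the full intersection in the ~ notation; the function names are made up for illustration.

```python
from itertools import combinations

SETS = "ABCDE"


def region_labels(names=SETS):
    """All simplified region labels: every non-empty combination of set names."""
    labels = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            labels.append("".join(combo))
    return labels


def expand(label, names=SETS):
    """Spell out a simplified label as an intersection, using ~ for complement."""
    return " ∩ ".join(n if n in label else "~" + n for n in names)


labels = region_labels()
print(len(labels))      # number of regions inside at least one set
print(expand("A"))
print(expand("BCE"))
```

Five sets give 2⁵ − 1 = 31 regions (not counting the area outside all sets), and the two expansions reproduce the examples in the quoted snippet.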

If you search the Wolfram Demonstrations Project for ‘Venn Diagram’, you get several interactive diagrams.

Venn Diagram Demonstration Projects (Source: Wolfram Demonstrations Project)

These diagrams are interactive. For example, they allow you to click on any subset and then have that set highlighted and the corresponding mathematical set notation displayed accordingly. Interesting and fun to learn.

Speaking of fun: Venn diagrams are also effectively used in many different areas, two of which I’d like to leave you with here:

Data Science Venn Diagram (Source: drewconway.com)

And last but not least, Stephen Wildish’s Pancake Venn Diagram:

 

Posted by on June 10, 2012 in Linguistic, Scientific

 


Graphic comparing highest mountains

In mountaineering, 8000m peaks are the ultimate test of high-altitude climbing. It so happens that there are 14 Eight-thousanders. In 1986 Reinhold Messner became the first person to have climbed all 14 8000m peaks. It has become a coveted trophy of mountaineering, with only about 30 people having done so since.

A different, but somewhat related challenge is to climb the highest mountain on every continent, the so-called Seven Summits. This was first completed by Dick Bass in 1985. It has become a more mainstream mountaineering challenge, and about 300 people have repeated that feat. That has also led to significant and often problematic overcrowding on those seven summits.

Interestingly, it was noted that the second highest mountain on each continent is typically harder to climb than the highest. Hence yet another challenge was born to complete the first ascent of the Seven Second Summits. Hans Kammerlander claims to have done so in 2010 – although some doubts have arisen regarding whether he stood on the right summit on Mount Logan, Canada. Others have suggested combining the Seven Summits and the Seven Second Summits, giving again 14 peaks.

On Wikipedia I found an interesting graphic comparing the 14 Eight-thousanders with the Seven + Seven Second Summits. It was created and shared there by contributor Cmglee.

Comparison of highest mountains (Source: Wikipedia)

This is an interesting chart, created as an .svg file and thus rendering in high definition on large wide-format screens. It is also interesting to follow the revision history on the talk page and the suggestions about coloring and labeling coming from interested readers. In some ways, this shows how published charts can be improved collaboratively. Contributor ‘Cmglee’ has contributed several .svg graphics to Wikipedia as per the User talk page, including a 5-set Venn diagram, life-expectancy bubble charts and Earthquake intensity bubble charts.

I have a personal interest in mountaineering. In 2009-2010 I embarked on my own adventure of a lifetime called the ‘Panamerican Peaks’: cycling between Alaska and Patagonia (Panamerican Highway) and climbing the highest mountain of every country along the way. You can find out more about that adventure on my Panamerican Peaks website. Coincidentally, there are a minimum of 14 countries and peaks in that set as well: United States, Canada, Mexico, Guatemala, El Salvador, Honduras, Nicaragua, Costa Rica, Panama, Colombia, Ecuador, Peru, Chile, Argentina.

Position and elevation of 14 Panamerican Peaks

Prior to starting my adventure journey I had mapped out the height of those 14 mountains. Interestingly, except for a few peaks in Central America, the country high-points get higher the further North or South they are located.

Heights of 14 Panamerican Peaks

Four of those peaks are included in the Seven (Second) Summit lists above: Denali, Logan (North America) and Aconcagua, Ojos del Salado (South America). It would be great to include the other 10 Panamerican Peaks in a similar graphic. About time for me to look into generating .svg graphics…

And sure enough, Wikipedia contributor Cmglee provided me with a version of the above .svg chart comparing the 14 Panamerican Peaks with the 14 Seven (Second) Summits as follows:

Comparison of 14 Panamerican Peaks with Seven (Second) Summits

Thanks to Cmglee for the quick turn-around.

 

Posted by on June 4, 2012 in Recreational

 
 