
Inequality on Twitter

06 Dec

A lot has been written about economic inequality as measured by distribution of income, wealth, capital gains, etc. In previous posts such as Inequality, Lorenz-Curves and Gini-Index or Visualizing Inequality we looked at various market inequalities (market share and capitalization, donations, etc.) and their respective Gini coefficients.

With the recent rise of social media we have other forms of economy, in particular the economy of time and attention. And we have at least some measures of this economy in the form of people’s activities, subscriptions, etc. Whether it’s Connections on LinkedIn, Friends on Facebook, or Followers on Twitter – all of the social media platforms have some social currency for attention. (Influence is different from attention, and measuring influence is more difficult and controversial – see for example the discussions about Klout scores.)

Another interesting aspect of online communities is that of participation inequality. Jakob Nielsen did some research on this and coined the well-known 90-9-1 rule:

“In most online communities, 90% of users are lurkers who never contribute, 9% of users contribute a little, and 1% of users account for almost all the action.”

The above linked article has two nice graphics illustrating this point:

Illustration of participation inequality in online communities (Source: Jakob Nielsen)

As a user of Twitter for about 3 years now, I decided to do some simple analysis, wondering about the degrees of inequality I would find there. Imagine you want to spread the word about some new event and send out a tweet. How many people you reach depends on how many followers you have, how many of those retweet your message, how many followers they have, how many other messages they send out and so on. Let’s look at my first Twitter account (“tlausser”); here are some basic numbers of my followers and their respective followers:

Followers of tlausser Followers on Twitter

Some of my followers have no followers themselves; one has nearly 100,000. On average, they have about 3,600 followers; however, the total of about 385,000 followers is extremely unequally distributed. Here are three charts visualizing this astonishing degree of inequality:

Of 107 followers, the top 5 have ~75% of all followers that can be reached in two steps. The corresponding Gini index of 0.90 is an example of extreme inequality. From an advertising perspective, you would want to focus mostly on getting these top 5 (roughly 5% of my followers) to react to your message (i.e. retweet). In a chart with linear scale the bottom half barely registers.
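For readers who want to try this kind of measurement on their own follower data, here is a minimal Python sketch of the two figures used above: the Gini index and the share held by the top k accounts. It is not the code behind the charts in this post. The function names `gini` and `top_share` and the sample list are made up for illustration, and the Gini formula is the standard rank-weighted form for a finite sample.

```python
def gini(values):
    """Gini index of non-negative numbers (0 = perfectly equal, near 1 = concentrated)."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the ascending values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


def top_share(values, k):
    """Fraction of the total held by the k largest values."""
    total = sum(values)
    if total == 0:
        return 0.0
    return sum(sorted(values, reverse=True)[:k]) / total


# Made-up follower counts, NOT the data behind the charts in this post.
sample = [0, 3, 12, 40, 150, 300, 800, 2500, 9000, 95000]
print(round(gini(sample), 2), round(top_share(sample, 1), 2))
```

Note that for a finite sample of n accounts this form of the index tops out at (n - 1)/n rather than 1, which matters little at n = 107 but is worth keeping in mind for very small samples.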

Most of my followers have between 100 and 1,000 followers themselves, as can be seen from this log-scale histogram.

What kind of distribution does the number of followers follow? It seems that Log[x] is roughly normally distributed, i.e. the follower counts are roughly log-normal.
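A log-scale histogram like the one above can be sketched in a few lines of Python. This is only an illustration under assumptions: it bins integer counts by power of ten, puts accounts with zero followers in their own bin (the logarithm of zero is undefined), and reports the mean and standard deviation of log10(x) as a first, rough look at whether the logs cluster around a central value. The function names `decade_histogram` and `log_moments` are hypothetical.

```python
import math
from collections import Counter


def decade_histogram(counts):
    """Count integer values per power-of-ten bin; zeros get a bin of their own."""
    bins = Counter()
    for c in counts:
        if c <= 0:
            bins["0"] += 1
        else:
            # Number of digits minus one gives the decade without float rounding.
            lo = 10 ** (len(str(int(c))) - 1)
            bins[f"{lo}-{lo * 10 - 1}"] += 1
    return dict(bins)


def log_moments(counts):
    """Mean and (population) standard deviation of log10 over the positive values."""
    logs = [math.log10(c) for c in counts if c > 0]
    mean = sum(logs) / len(logs)
    var = sum((v - mean) ** 2 for v in logs) / len(logs)
    return mean, math.sqrt(var)


# Made-up follower counts for illustration only.
sample = [0, 5, 50, 99, 100, 250, 999, 1000, 40000]
print(decade_histogram(sample))
print(log_moments(sample))
```

A mean and standard deviation alone do not establish normality; a proper check would compare the full histogram of the logs against a fitted normal curve.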

As for participation inequality, let’s look at the number of tweets that those (107) followers send out.

Some of them have not tweeted anything; the chattiest has sent more than 16,000 tweets. On average, each follower has 1,280 tweets; the total of 137,000 tweets is again highly unequally distributed, with a Gini index of 0.77.

The top 10 make up about 2/3 of the entire conversation.

Again the bottom half hardly contributes to the number of tweets; however, the ramp in the top half is longer and not quite as steep as with the number of followers. Here is the log-scale Histogram:

I did the same type of analysis for several other Twitter users in the central range (between 100 and 1,000 followers). The results are similar, but the samples are too small to rule out statistical sampling errors. (A larger-scale analysis would require a higher Twitter API limit than my free 350 requests per hour.)

These preliminary results indicate that there are high degrees of inequality regarding the number of tweets people send out and even more so regarding the number of followers they accumulate. How many tweets Twitter users send out over time is more evenly distributed. How many followers they get is less evenly distributed and thus leads to extremely high degrees of inequality. I presume this is caused in part by preferential attachment as described in Barabasi’s book “Linked: The new science of networks”. As with all forms of attention, who people follow depends a lot on who others are following. There is a very long tail of small numbers of followers for the vast majority of Twitter users.
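To make the idea of preferential attachment concrete, here is a toy simulation in Python. It is a deliberately simplified sketch, not a model of Twitter's real follow graph and not taken from Barabasi's book: each newcomer follows exactly one earlier account, chosen with probability proportional to that account's current follower count plus one. The function name `simulate_followers` and all parameters are hypothetical.

```python
import random


def simulate_followers(n_users, seed=0):
    """Toy preferential attachment: each newcomer follows one earlier account,
    picked with probability proportional to (current followers + 1)."""
    rng = random.Random(seed)
    followers = [0]  # account 0 starts with no followers
    tickets = [0]    # one baseline ticket per account, plus one per follower
    for new_user in range(1, n_users):
        target = rng.choice(tickets)
        followers[target] += 1
        tickets.append(target)    # the target is now more likely to be picked again
        followers.append(0)
        tickets.append(new_user)  # baseline ticket for the newcomer
    return followers


counts = simulate_followers(1000)
print(sorted(counts, reverse=True)[:5])  # the most-followed accounts
print(sum(1 for c in counts if c == 0))  # accounts nobody follows
```

The "rich get richer" step is the extra ticket an account receives for every follower: accounts that are already popular become more likely to be chosen by the next newcomer, while most latecomers are rarely chosen at all.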

That said, the degree of participation inequality I found was lower than the 90-9-1 rule implies, which corresponds to an extreme Gini index of about 0.96. Perhaps that’s a sign of the Twitter community having evolved over time? Or perhaps it is just a sign of my analysis sample being too small and not representative of the larger Twitterverse.

In some ways these new media are refreshing as they allow almost anyone to publish their thoughts. However, it’s also true that almost all of those users remain in relative obscurity and only a very small minority gets the lion’s share of all attention. If you think economic inequality is too high, keep in mind that attention inequality is far higher. Both are impacting the policy debate in interesting ways.

Turning social media attention into income is another story altogether. In his recent blog post “Turning social media attention into income”, author Srinivas Rao muses:

“The low barrier to entry created by social media has flooded the market with aspiring entrepreneurs, freelancers, and people trying to make it on their own. Standing out in it is only half the battle. You have to figure out how to turn social media attention into social media income. Have you successfully evolved from blogger to entrepreneur? What steps should I take next?”

 

Posted by on December 6, 2011 in Industrial, Scientific, Socioeconomic

 


10 responses to “Inequality on Twitter”

  1. visualign

    December 9, 2011 at 1:01 pm

    I found a much more systematic research paper titled “Income Inequality in the Attention Economy” by Kevin McCurley from Google here: http://www.mccurley.org/papers/effective/

    It finds that attention inequality on the web is increasing, is bigger for web browsing compared to searching, and seems to be amplified by increasing commercialization of the web.

    With Gini index values of 0.985 – 0.994 – which are truly staggering levels of inequality – this trend “holds potentially serious consequences for the monetization of web content, since attention will continue to be a prerequisite for monetization.”

     
  2. danica

    December 12, 2011 at 9:33 am

    Attention is an omnipresent topic in the social media ecosystem, one that draws on socio-technological and psychological factors. Nice quantitative representation of inequality on Twitter. What program/software did you use for the data analysis?

     
    • visualign

      December 12, 2011 at 11:27 am

      Danica,
      Thanks. I am doing almost all analysis and graphs on my Blog in Mathematica 8 on a Mac. For the Twitter analysis I am obtaining data directly from the Twitter APIs using various Java libraries. This was a bit more complicated than I thought at first since Twitter has recently changed the authentication model from Basic to OAuth and there were no available Mathematica packages for that yet. I commented on how to get the data from Twitter on a LinkedIn group and can give more details if you’re interested.
      Thomas.

       
  3. Alan parham

    December 15, 2011 at 5:34 am

    Fascinating. As a non specialist, or lay person some issues of translation, but this is very good stuff, well done!

     
  4. visualign

    December 15, 2011 at 5:26 pm

    Another interesting data point is for my VisualignCorp Twitter account. It has a paltry 20 followers, and the sum total of all their followers is about 17,000. Yesterday my tweet about this Blog Post “Inequality on Twitter” was re-tweeted by a Twitter user with ~318,000 followers! Over the course of the last 12 hours there were a few additional retweets which brought the number of followers reached to about 400,000 – more than 20x bigger than if all my own followers had retweeted to all their followers.
    I expected the number of Blog visits to this post to spike. However, I only received 8 hits coming from Twitter. 8 (eight)! Out of 400,000. That’s only about 1 in 50,000. Goes to show that raw numbers of followers may not really lead to high engagement.
    And as we know from the half-life analysis (see https://visualign.wordpress.com/2011/09/09/bit-ly-link-analysis-on-half-life-of-web-content/), we can expect to see about half of all clicks on a Twitter link within 3 hours, so there won’t be many more coming in.
    Most of my Blog visits come from search engines (mostly Google in various countries and a few others), and those from image search outnumber regular search by about 4 to 1.

     
  5. alastairhumphreys

    April 27, 2012 at 8:50 am

    Wow – this is a bit more scientific than my own musings on the topic!
    Al

     
