a batch data processing to visualize social network maps

Notice: Although the main subject of this article (batch data processing) is still current, the article relies on two obsolete services: Google Trends for Websites, which is no longer available, and the Google Image Charts API, which is now deprecated.

Several times I have read an interesting article or piece of news and had an idea for digging deeper into the matter, only to give up because the task would have required collecting and analysing a lot of online data, and I had neither the time nor the right tool to do it.

It was the same the first time I saw the World map of social networks. The chart is nice, but the information it gives is rather poor. What is the penetration of Facebook, or of whichever social network is most popular, in each country? How much did it change in each country between two successive dates? How large is the gap in number of users between the most popular social network and the second one? After all, a situation where the users of the two most used social networks are in a ratio of about 1:1 is very different from one where the ratio is about 1000:1. In any case, of the two declared data sources, Alexa offers only rank statistics for free, as far as I know, and Google Trends for Websites requires hundreds of queries to obtain the necessary data, which was very discouraging.

No time and no tool, I said. But over the last year I have realized that the Google spreadsheet functions for external data can often save time and provide the right tool. For example, when I disputed the CounterPunch article on the effects of the Fukushima accident in the USA, these functions spared me the tedious job of copying down all the data involved. The case of social network maps is an even better illustration of how to take advantage of Google tools.

Trying to answer the questions above, I have built five different maps, which are shown in the gallery below and also at the bottom of this post.

Obviously my analysis is far from perfect. For example, I haven’t considered those social networks which are the most used in only one or a few countries. Doing so would have made my work far too complicated. Nevertheless, I think that my maps bring out some interesting facts: the growth of Facebook and its uneven penetration across countries, the widespread collapse of MySpace, and the popularity of Twitter compared to Facebook in Japan.

Indeed, my purpose here is to explain how Google resources make it possible to handle social network statistics with little effort and great flexibility, and more generally how well suited they are to some mechanical tasks in data analysis, in particular online data collection and chart generation. In other words, I think that my work is of general interest: you can simply treat it as a case study to take your cue from and apply to your own area of interest.

In the step-by-step description of my work, I refer to this Google Spreadsheet, which contains all the necessary data and calculations. The key elements are the function =IMPORTXML(), which retrieves specific portions of the HTML code of a given page, Google Apps Script, used to process a large amount of data, and the Google Image Charts API, which generates the maps. Here are the main operations carried out to build the spreadsheet.
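To give an idea of the chart-generation element, here is a minimal sketch in Python (not the Apps Script used in the spreadsheet) of how a set of per-country values could be turned into a map URL. The parameter names (cht, chtm, chld, chd, chco, chs), the base URL, the colours and the sample figures are all assumptions: they follow the map-chart format of the now-deprecated Image Charts API as best it can be reconstructed and are not taken from the spreadsheet, so the resulting URL is not expected to work today.

```python
from urllib.parse import urlencode


def scale_to_percent(values):
    """Rescale values linearly to the 0-100 range used by text-encoded chart data."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values equal: place every country in the middle of the gradient.
        return [50.0 for _ in values]
    return [round(100.0 * (v - lo) / (hi - lo), 1) for v in values]


def build_map_url(rates_by_country):
    """Build a map-chart URL from a {country code: value} dictionary.

    Parameter names are assumptions based on the deprecated API.
    """
    codes = sorted(rates_by_country)
    scaled = scale_to_percent([rates_by_country[c] for c in codes])
    params = {
        "cht": "t",                    # assumed: map chart type
        "chtm": "world",               # assumed: geographical area
        "chs": "440x220",              # assumed: chart size in pixels
        "chld": "".join(codes),        # assumed: concatenated two-letter country codes
        "chd": "t:" + ",".join(str(v) for v in scaled),
        "chco": "FFFFFF,EDF0D4,13390A",  # hypothetical default colour and gradient
    }
    return "https://chart.googleapis.com/chart?" + urlencode(params)


# Hypothetical penetration rates, for illustration only.
sample = {"IT": 0.35, "DE": 0.20, "JP": 0.05}
url = build_map_url(sample)
print(url)
```

The only real work is the rescaling step: whatever the indicator is, it is mapped onto a fixed range so that the lowest and highest values land at the two ends of the colour gradient.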

  • Facebook penetration
    (ratio between number of unique visitors and population size)

    lowest rate, highest rate

  • Facebook penetration
    (ratio between number of unique visitors and number of internet users)

    lowest rate, highest rate

  • Relative variation of Facebook visitors in the last year
    (ratio between number of visitors on June 30, 2010 and on June 30, 2011)

    highest drop, no variation, highest growth

  • Relative variation of MySpace visitors in the last year
    (ratio between number of visitors on June 30, 2010 and on June 30, 2011)

    highest drop, no variation, highest growth

  • Ratio between Facebook and Twitter visitors

    highest ratio in favour of Twitter, 1:1 ratio, highest ratio in favour of Facebook
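The indicators behind the five maps are simple ratios. The following Python sketch shows one way to compute them; it is an illustration, not the formulas actually used in the spreadsheet. The function names, the choice of expressing the variation as (new - old) / old, the use of a base-10 logarithm for the Facebook/Twitter ratio (so that 1:1 sits at zero and 1000:1 and 1:1000 are symmetric), and all the figures are assumptions made for the example.

```python
import math


def penetration(visitors, base):
    """Share of a base population (inhabitants or internet users) that visits the site."""
    return visitors / base


def relative_variation(old, new):
    """Relative change between two dates: 0 means no variation, negative means a drop."""
    return (new - old) / old


def log_ratio(a, b):
    """Base-10 logarithm of a:b. 0 means a 1:1 ratio, 3 means 1000:1, -3 means 1:1000."""
    return math.log10(a / b)


# Hypothetical figures for one country, for illustration only.
population = 60_000_000
internet_users = 30_000_000
facebook_2010 = 15_000_000
facebook_2011 = 20_000_000
twitter_2011 = 2_000_000

print(penetration(facebook_2011, population))         # share of the whole population
print(penetration(facebook_2011, internet_users))     # share of internet users
print(relative_variation(facebook_2010, facebook_2011))
print(log_ratio(facebook_2011, twitter_2011))
```

A logarithmic scale is one reasonable choice for the last map because the ratios can span several orders of magnitude in either direction.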

This work is released under the CC BY-NC-SA license.

7 thoughts on “a batch data processing to visualize social network maps”

  1. Pingback: sei-uno-zero-nove » Blog Archive » creazione automatica di mappe tematiche sui social network

  2. Pingback: sei-uno-zero-nove » Blog Archive » use google spreadsheet as a proxy

  3. Good stuff!
    However, when I access your spreadsheet and login with a google-account I cannot edit and play with the data. Making it editable (without save ;-) would be a great option.

  4. Why don’t you simply make your own copy and then apply all the edits you want?
    Open the spreadsheet, then do File > Make a copy.

  5. Uuups. The only excuse for missing this is a general reluctance to copy paste operations in Germany due to some of last year’s political events. ;-)
    Txs!

  6. Pingback: use google spreadsheet as a proxy | FreeShareHere

  7. Pingback: campione non solo d’inverno, forse | sei-uno-zero-nove
