Wednesday, July 24, 2013

Archival, Analysis, and Visualization of #ISMBECCB 2013 Tweets

As the 2013 ISMB/ECCB meeting is winding down, I archived and analyzed the 2000+ tweets from the meeting using a set of bash and R scripts I previously blogged about.

The archive of all the tweets tagged #ISMBECCB from July 19-24, 2013 is and will forever remain here on Github. You'll find some R code to parse through this text and run the analyses below in the same repository, explained in more detail in my previous blog post.

Number of tweets by date:


Number of tweets by hour:


Most popular hashtags, other than #ismbeccb. With separate hashtags for each session, this really shows which other SIGs and sessions were well-attended. It also shows the popularity of the unofficial ISMB BINGO card.


Most prolific users. I'm not sure who or what kind of account @sciencstream is - seems like spam to me.


And the obligatory word cloud:


5 comments:

  1. I wonder if there is a way of filtering out generic words from word cloud, like "will", "see", "get", "us", "just", etc.

    ReplyDelete
    Replies
    1. The 'tm' package in R has the function stopwords() which has a collection of words like this, but obviously doesn't catch everything.

      Delete
    2. This comment has been removed by the author.

      Delete
    3. This comment has been removed by the author.

      Delete
  2. Something like:

    corpus=tm_map(corpus, function(x) removeWords(x, c("will","see","get","us","just")))

    But I think it's okay anyway.

    ReplyDelete

Note: Only a member of this blog may post a comment.

Creative Commons License
Getting Genetics Done by Stephen Turner is licensed under a Creative Commons Attribution-NonCommercial 3.0 Unported License.