Issue Distance on Twitter

In June 2013 I attended the Digital Methods Initiative’s Summer School at the University of Amsterdam. That year’s topic was social media APIs and their potential for social research. DMI’s goal is to think critically about the affordances and limitations of using the web and social media both as an object of study and as an instrument for social research. In week one, I conducted a project on rape discourse on Twitter together with Saskia Kok, Richard Rogers, and Carlo de Gaetano and Stefania Guerra from the DensityDesign team (IT). View the result (a data visualisation) here: scatterplot-1

Issue Distance on Twitter: The 2012 rape case in Delhi, India within the #rape discourse

Research question: How far from the top of Twitter’s #rape hashtag is the Delhi rape case? Put differently: what is the issue distance between the Delhi gang rape of December 16, 2012 and the general issue of rape on Twitter in the three months following the rape itself (the time of the legislation revisions and the trial of the offenders)?

8.491.020 tweets by 3.589.777 unique users from 2013-01-15 to 2013-06-27 containing the hashtags #gangrape or #rape


Using the TCAT (Twitter Capture and Analysis Toolset)

  • Locate the top of the general rape discourse by finding the top 15 most retweeted tweets with the #rape hashtag
  • Locate the specific discourse around the rape case in India by finding the top 15 most retweeted tweets in a subset (only #rape tweets that also contain India AND rape)
  • Show the issue distance between the overall top 15 tweets and the top 15 tweets from the subset within the entire #rape set
  • Visualize the issue distance in a scatter plot incorporating tweets and a timeline of key events from the December 2012 gang rape case
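The first three steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the TCAT implementation: the record fields (`id`, `text`, `retweets`), the helper names and the sample tweets are all hypothetical, and "issue distance" is read here simply as the rank position that the subset's top tweets occupy in the full #rape ranking.

```python
from typing import Any, Dict, List

Tweet = Dict[str, Any]  # hypothetical record: {"id": ..., "text": ..., "retweets": ...}


def top_retweeted(tweets: List[Tweet], n: int = 15) -> List[Tweet]:
    """Return the n most retweeted tweets, highest first."""
    return sorted(tweets, key=lambda t: t["retweets"], reverse=True)[:n]


def matches_all(text: str, terms: List[str]) -> bool:
    """Naive AND query: every term must occur (case-insensitively) in the text."""
    lowered = text.lower()
    return all(term.lower() in lowered for term in terms)


def rank_in_full_set(tweets: List[Tweet], subset_top: List[Tweet]) -> List[int]:
    """1-based rank of each subset tweet within the full set, ordered by retweets."""
    ranked = sorted(tweets, key=lambda t: t["retweets"], reverse=True)
    position = {t["id"]: i + 1 for i, t in enumerate(ranked)}
    return [position[t["id"]] for t in subset_top]


# Invented sample data, for illustration only.
tweets = [
    {"id": 1, "text": "general #rape tweet A", "retweets": 4000},
    {"id": 2, "text": "India court hears #rape case", "retweets": 800},
    {"id": 3, "text": "general #rape tweet B", "retweets": 1500},
    {"id": 4, "text": "Protest in India after rape verdict", "retweets": 300},
    {"id": 5, "text": "general #rape tweet C", "retweets": 50},
]

top_all = top_retweeted(tweets, n=2)
subset = [t for t in tweets if matches_all(t["text"], ["India", "rape"])]
top_subset = top_retweeted(subset, n=2)
ranks = rank_in_full_set(tweets, top_subset)
```

With real data, `n` would be 15 and the ranks would show how far down the overall ranking the India-related tweets sit.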


Time period for both datasets:
15.01.2013 – 15.03.2013

Frequencies: daily

Retweet threshold: 20

Dataset 1: General discourse on rape
Filter tweets by query: rape
Number of tweets: 3.191.591, Number of distinct users: 1.655.910

Dataset 2: Specific discourse on the Delhi rape case
Filter anew with query: India AND rape
Number of tweets: 117.784, Number of distinct users: 46.725
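The selection settings above (time period, query, retweet threshold, daily frequencies) could be reproduced on an exported tweet list roughly as follows. This is a sketch under assumptions: the field names (`text`, `retweets`, `created_at`) and sample tweets are invented, the retweet threshold is assumed to mean "keep tweets with at least 20 retweets", and the query is a naive substring match rather than TCAT's own query handling.

```python
from collections import Counter
from datetime import date, datetime
from typing import Any, Dict, List

START = date(2013, 1, 15)   # start of the time period
END = date(2013, 3, 15)     # end of the time period (inclusive)
RETWEET_THRESHOLD = 20      # assumed meaning: minimum number of retweets

Tweet = Dict[str, Any]  # hypothetical record with "text", "retweets", "created_at"


def select(tweets: List[Tweet], terms: List[str]) -> List[Tweet]:
    """Keep tweets inside the window, at or above the threshold, matching all terms."""
    selected = []
    for t in tweets:
        in_window = START <= t["created_at"].date() <= END
        popular = t["retweets"] >= RETWEET_THRESHOLD
        # Naive substring match: "rape" would also match "grape".
        matches = all(term.lower() in t["text"].lower() for term in terms)
        if in_window and popular and matches:
            selected.append(t)
    return selected


def daily_counts(tweets: List[Tweet]) -> Counter:
    """Number of selected tweets per calendar day (daily frequency)."""
    return Counter(t["created_at"].date() for t in tweets)


# Invented sample data, for illustration only.
sample = [
    {"text": "India debates new rape law", "retweets": 25,
     "created_at": datetime(2013, 2, 3, 10, 0)},
    {"text": "India rape trial update", "retweets": 5,
     "created_at": datetime(2013, 2, 3, 12, 0)},   # below the threshold
    {"text": "Comment on #rape", "retweets": 40,
     "created_at": datetime(2013, 2, 4, 9, 0)},
    {"text": "India rape case verdict", "retweets": 90,
     "created_at": datetime(2013, 4, 1, 9, 0)},    # outside the window
]

dataset1 = select(sample, ["rape"])            # general discourse
dataset2 = select(sample, ["India", "rape"])   # Delhi subset (India AND rape)
per_day = daily_counts(dataset1)
```

A word-boundary match (for example with the `re` module) would be a safer choice than substring matching if this were applied to real tweets.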



Explanation of the scatterplot graph: The upper set of highlighted tweets are the top fifteen most retweeted tweets in the #rape dataset, and the lower set of highlighted tweets are the top fifteen most retweeted tweets when filtered for India and Delhi.

Fragment from the top of the #rape conversation (the most retweeted tweet reached around 4000 retweets)

Fragment from an issue distance: conversations about rape in India (the top retweeted tweet in the subset reached at most 800 retweets)


  • There is a great distance between the top tweets in the #rape dataset and those in the subset (around 4000 versus at most 800 retweets).
  • The substance of the most popular retweets is distinct from that of the Delhi rape case retweets. The most popular retweets are cynical, showing “contempt for accepted standards of (…) morality” (dictionary definition).
  • The Delhi subset is a serious content space, made up of quality news as well as engagement with women’s issues.
  • Twitter seems to remain a vernacular (vulgar) space, where query and filtering techniques are required to locate substance of some urgency.

Characterization of the hashtag
e.g. Who uses it? Where are the users located?