First of all let me clarify: **this is an exercise in Social Media monitoring and analysis**, not a **full-featured Intelligence** report. Not by far. We do Social Media Monitoring for a living and we use the same technology you’ll see in these images and tables mainly to monitor *Corporate events* and to spot the *Influencers* in online conversations about a particular subject. Nothing too fancy, but often useful and full of insights.
**I’m not affiliated** with the Intelligence community, but I have a somewhat extensive background in *IT Security* and have been in the US under the *IVLP Program on “Combating Cybercrime”*. I also teach *Open Source Intelligence* as a Contract Professor. That said, I’m *not really a US Fanboy* and, moreover, not really a fan of [@Th3j35t3r][jester]. We sort of respect each other, though, and have chatted occasionally.
**In addition to this**, please understand **this is not a scientific paper**: it’s just an analysis done to find out if software we routinely use in some context (business) could be used in different contexts (intelligence). It is but a draft, but I hope you’ll find it interesting enough to spark a little bit of conversation about the results.
**Finally:** I do not understand or speak Arabic, and this analysis is based on a purely mathematical and statistical approach to the users. Please, be a good fellow and not an idiot: use this data to **muse on correlations**, not to **point out anyone** in these charts and graphs as a terrorist and/or a terrorist’s ally. Deal?
If you have any further questions you can [contact me][contact] (mf @ matteoflora.com) and/or comment below. Or you can talk about this on the [Reddit Page][reddit]. Hope you’ll enjoy! M.
## What is this stuff?
On August 20th 2014, the famous grey hat and hacktivist (or computer vigilante, depending on your point of view) [Th3j35t3r][jester] started a *Tango Down*[^tango] operation against several (19) Twitter accounts disseminating pro-jihad news, during the Twitter outbreak of the **#ISISMediaBlackOut** campaign[^isisbo].
All the accounts were referred to [Th3j35t3r][jester] manually by some Twitter users and were not correlated in any way, aside from the sheer fact that they were promoting pro-jihad news.
The *Tango Down*ed Twitter accounts were the following (many of them are now online again, but I’ll not provide links).
Looking at the accounts and the similarity of the topics they were discussing, it occurred to me that finding other accounts with similar behaviour is not only possible, but extremely simple from our point of view, since we do this kind of thing all the time.
To be clear, no, we don’t find jihad news sites every day, but the problem of **finding out who is talking about a particular topic** and who, in this group, is the **most influential** hub for disseminating information is a routine job on many different topics such as politics, fashion, automotive, banking, technology and so on…
It is something that is normally called Social Media Influencers Analysis and is used to map those accounts that are most “influential” over content diffusion.
Every company has its **own methodology** to spot who the **Influencers** are, and so do we. It doesn’t really matter which method is better: different methods suit different situations, and in this case we’ll apply a little bit of **Engagement Analysis**[^engagement] and a little bit of **Eigenvector Analysis**[^eigen] to see what comes out of it.
## Methodology and extracted data
To build the analysis we **acquired[^data] historical data** for the **last month** of Twitter conversations that included a mention of **any of the 19 accounts**. We explicitly excluded data after August 19th, so that mentions of the accounts made **after the Tango Down** would not distort the data about the interactions.
What we want to achieve here is **spotting the community** and **how the community interacts** with its various members. To do so we **did not download** all the tweets **made by those accounts** but only the tweets that **mentioned** or **retweeted** (a different kind of mention, if you think about it) them. In this way we’re able to spot:
* The endorsements and retweets
* All the conversations happening between two (or more) accounts
* All the questions/answers between two (or more) accounts
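As a rough illustration of this collection step, the sketch below turns raw tweets into directed (author, mentioned account) edges. It is not the platform's actual code: the tweet dictionaries, the field names `user` and `text`, and the `TARGETS` set are hypothetical stand-ins, and a retweet is recognised simply because its text starts with `RT @`.

```python
import re

# Hypothetical stand-ins for the 19 monitored accounts (not the real list).
TARGETS = {"account_a", "account_b"}

# A handle is an "@" followed by letters, digits or underscores.
MENTION_RE = re.compile(r"@(\w+)")


def mention_edges(tweets, targets=TARGETS):
    """Return (author, mentioned, kind) tuples for tweets touching a target.

    `tweets` is an iterable of dicts with the assumed keys "user" and "text".
    `kind` is "retweet" when the text starts with "RT @", else "mention".
    """
    edges = []
    for tweet in tweets:
        author = tweet["user"].lower()
        text = tweet["text"]
        kind = "retweet" if text.startswith("RT @") else "mention"
        for name in MENTION_RE.findall(text):
            name = name.lower()
            # Keep only edges that point at a monitored account.
            if name in targets and name != author:
                edges.append((author, name, kind))
    return edges


sample = [
    {"user": "u1", "text": "RT @account_a: some news"},
    {"user": "u2", "text": "@account_b @u1 is this true?"},
    {"user": "u3", "text": "unrelated tweet"},
]
edges = mention_edges(sample)
```

Every edge keeps its direction (who mentioned whom), which is what makes it possible later to tell the most mentioned accounts apart from the most active ones.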
In the time frame between July 17th 2014 and August 18th 2014 we were able to find **23,677 tweets** (of which **20,850 retweets**) made by a total of **5,504 unique users**, with an average of about 4 tweets per account. The peak in conversations was on July 18th at 06:27 AM GMT+1, with 37 conversations in a single minute.
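Figures like these come out of very plain counting. The sketch below shows one way to get them, assuming each tweet is a dictionary with the hypothetical keys `user`, `is_retweet` and `created_at` (a `datetime`); the three sample tweets are invented for illustration.

```python
from collections import Counter
from datetime import datetime


def summarise(tweets):
    """Compute volume figures from a non-empty list of tweet dicts."""
    total = len(tweets)
    retweets = sum(1 for t in tweets if t["is_retweet"])
    users = {t["user"] for t in tweets}
    # Truncate timestamps to the minute to find the busiest minute.
    per_minute = Counter(
        t["created_at"].replace(second=0, microsecond=0) for t in tweets
    )
    peak_minute, peak_count = per_minute.most_common(1)[0]
    return {
        "tweets": total,
        "retweets": retweets,
        "unique_users": len(users),
        "avg_per_user": total / len(users),
        "peak_minute": peak_minute,
        "peak_count": peak_count,
    }


sample = [
    {"user": "u1", "is_retweet": True,
     "created_at": datetime(2014, 7, 18, 6, 27, 5)},
    {"user": "u2", "is_retweet": True,
     "created_at": datetime(2014, 7, 18, 6, 27, 40)},
    {"user": "u1", "is_retweet": False,
     "created_at": datetime(2014, 7, 18, 6, 28, 1)},
]
stats = summarise(sample)

# Sanity check on the figures quoted above: 23,677 tweets / 5,504 users.
reported_average = 23677 / 5504  # roughly 4.3 tweets per account
```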
In total, the conversations led to 40,039,424 Impressions[^impression] (or, better, Opportunities to See), reaching a total[^total] (not de-duplicated) of 7,046,241 followers.
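To make the two numbers less mysterious, here is a minimal sketch of how Impressions and the non-de-duplicated follower total can be computed, following the definitions given in the footnotes. The keys `user` and `followers` (the follower count recorded with each tweet) are assumptions, and so is the choice of using the latest follower count seen for each author.

```python
from collections import defaultdict


def impressions_by_user(tweets):
    """Sum, per author, the follower count recorded with each tweet."""
    totals = defaultdict(int)
    for t in tweets:
        totals[t["user"]] += t["followers"]
    return dict(totals)


def total_followers(tweets):
    """Add up the followers of every distinct author once.

    Audiences are not de-duplicated: someone following two authors is
    counted twice. The latest follower count seen per author is used.
    """
    latest = {}
    for t in tweets:
        latest[t["user"]] = t["followers"]
    return sum(latest.values())


sample = [
    {"user": "u1", "followers": 100},
    {"user": "u1", "followers": 110},
    {"user": "u2", "followers": 50},
]
per_user = impressions_by_user(sample)
all_impressions = sum(per_user.values())
followers = total_followers(sample)
```

In the sample, `u1` tweets twice, so its audience is counted twice in the Impressions but only once in the follower total, which is why the first number is always at least as large as the second.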
Here is a little graph showing the minute-by-minute tweet flow *(sorry for the Italian UI ;])*:
## A little bit more about users
It is easy to see that the **most popular users** (users who **received** the most mentions or retweets) are led by **@ahlam_alnasr**, mentioned in 7,083 tweets, immediately followed by **@diyala1435_w** with a total of 3,621 tweets mentioning it, by **@nynwa_news** with 3,344 mentions and by **@homs_isis** with 3,154.
Here is a more complete listing:
|#|Twitter User|Number of Mentions|
Among the **most active users**, ranked by the number of tweets/retweets containing a mention of one of the 19 users in the examined time frame, we find **@isis_daash** with 1,121 tweets, **@triumph_isis** with a total of 414 tweets, along with **@isis_time** and **@fajeralislam90** with 326 and 296 tweets sent.
Here is a list of the Top 20:
|#|Twitter User|Number of Tweets|
Looking at the users who generated the most Impressions[^impression], we find in leading position **@isis_daash** with a total of 3,861,769 generated Impressions, then **@fc3o** with 2,800,006, and **@7ob_3** and **@triumph_isis** with 1,832,692 and 1,688,959 respectively.
An interesting fact: the Impressions generated **by these 4 accounts alone** amount to 25.4% of the total Impressions.
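The percentage can be checked directly from the numbers quoted in this post; nothing in the snippet below is new data.

```python
# Impressions of the four leading accounts, as quoted above.
top4 = {
    "isis_daash": 3_861_769,
    "fc3o": 2_800_006,
    "7ob_3": 1_832_692,
    "triumph_isis": 1_688_959,
}
TOTAL_IMPRESSIONS = 40_039_424  # overall figure quoted earlier

top4_sum = sum(top4.values())        # 10,183,426
share = top4_sum / TOTAL_IMPRESSIONS
print(f"{share:.1%}")                # prints 25.4%
```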
Here is a list of the Top 20:
|#|Twitter User|Number of Impressions|
One last bit of information about users: the **Twitter Superstars** (users with the largest follower counts) among those mentioning the 19 accounts are **@7ob_3** with 458,173 followers, **@fc3o** with 311,148 followers, **@al3r_b** with 220,499 followers and **@sas201416** with 171,362 followers.
## HashTags analysis
Being unable to read and understand Arabic, the only meaningful analysis I can present is based not on text but on HashTags, which are used in English only (or, at least, without Arabic letters). In the following bubble chart you can find the most used (with repetition):
The most used HashTags (counting repetitions by a user) were **#isis** (3,004 uses), **#syria** (1,381 uses), **#is** (739) and **#iraq** (708). But a complete listing of the Top 20 is far more meaningful:
|#|HashTag|Number of Uses|
Pay attention to the **bold** ones, which are the most important, since they convey intimidation messages that were widely spread, after release, by many of the different sub-communities!
You can, obviously, see them as a tag-cloud if you prefer:
Another interesting way of looking at this data is to analyse which pairs of HashTags appear together in the tweets and map them out in a ribbon graph:
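The counting behind both the Top 20 and the pair analysis is simple enough to sketch. The version below is only an illustration under a few assumptions: tweets are plain strings, HashTags are matched with a simple regular expression and lower-cased, uses are counted with repetition, and each pair is counted at most once per tweet.

```python
import re
from collections import Counter
from itertools import combinations

# "#" followed by letters, digits or underscores.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_counts(texts):
    """Count HashTag uses (with repetition) and co-occurring pairs."""
    uses = Counter()
    pairs = Counter()
    for text in texts:
        tags = [t.lower() for t in HASHTAG_RE.findall(text)]
        uses.update(tags)
        # Sorting gives each unordered pair one canonical form.
        pairs.update(combinations(sorted(set(tags)), 2))
    return uses, pairs


sample = [
    "#ISIS in #Syria",
    "#isis #iraq #syria",
    "#isis #isis",
]
uses, pairs = hashtag_counts(sample)
top_pair = pairs.most_common(1)[0]
```

The `pairs` counter is exactly the kind of weighted pair list that a ribbon (chord) diagram needs as input: each key is a ribbon and each count is its width.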
## The Global Picture
Aside from the statistical part, which in my opinion provides a lot of information for finding the **most active**, **most mentioned** and **most influential** accounts, it’s time to picture the entire community using a Graph visualisation of the whole space: using a customised version of [Gephi][gephi] and a pre-compiled graph file generated from the platform, we can map the interactions between different accounts in a single picture *(which is, BTW, the most interesting part of this long post)*…
In the following image, a **Vertex** (the *point* or *ball*) with a name is a Twitter account, while an **Edge** (the *line* connecting vertices) is a mention or a retweet.
The size of a node is the result of **Eigenvector Centrality**[^eigen]: the largest node represents the **most influential account** within its micro-community.
The color of a Vertex and Edge is calculated using **Modularity**[^modularity] (see below).
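For readers who want to see what Eigenvector Centrality does under the hood, here is a minimal power-iteration sketch. It is not Gephi's implementation and it makes simplifying assumptions: the direction and the number of repeated mentions are ignored, the graph is assumed to be connected, and the iteration uses the adjacency matrix plus the identity so that star-shaped graphs converge too. The sample edges are invented.

```python
from collections import defaultdict
from math import sqrt


def eigenvector_centrality(edges, iterations=200, tol=1e-9):
    """Power iteration on an undirected, unweighted view of `edges`."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)
    nodes = sorted(neighbours)
    if not nodes:
        return {}
    score = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # A node's new score is its own score plus its neighbours' scores.
        new = {
            n: score[n] + sum(score[m] for m in neighbours[n])
            for n in nodes
        }
        norm = sqrt(sum(v * v for v in new.values()))
        new = {n: v / norm for n, v in new.items()}
        delta = sum(abs(new[n] - score[n]) for n in nodes)
        score = new
        if delta < tol:
            break
    return score


# Invented example: three users mention "hub", and u3 also talks to u4.
sample_edges = [("u1", "hub"), ("u2", "hub"), ("u3", "hub"), ("u3", "u4")]
scores = eigenvector_centrality(sample_edges)
most_influential = max(scores, key=scores.get)
```

The point of the measure shows up even in this tiny example: `u3` ends up with a higher score than `u1` or `u2`, not because it is connected to the hub (they all are) but because it has one more connection of its own.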
## Modularity and Sub-communities
Using **Modularity**[^modularity] we tried to spot *groups*, *clusters* or *communities* within the original community. We retrieved a bunch of them (20+), some of which seemed numerically more important than others.
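To give an idea of what the Modularity score measures, the sketch below computes it for a given split of a small undirected graph: for each community, the fraction of edges falling inside it minus the fraction expected if edges were placed at random with the same degrees. This only scores a partition; finding a good partition (which is what community-detection tools do for you) is a separate search problem. The graph and the community labels are invented.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of a partition of an undirected simple graph.

    `edges` is a list of (a, b) pairs, each edge listed once;
    `community` maps every node to a community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    internal = defaultdict(int)  # edges with both ends in the community
    degree = defaultdict(int)    # sum of node degrees per community
    for a, b in edges:
        ca, cb = community[a], community[b]
        degree[ca] += 1
        degree[cb] += 1
        if ca == cb:
            internal[ca] += 1
    return sum(
        internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree
    )


# Two triangles joined by a single edge (c-d).
sample_edges = [
    ("a", "b"), ("b", "c"), ("a", "c"),
    ("d", "e"), ("e", "f"), ("d", "f"),
    ("c", "d"),
]
split = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
lumped = {node: 0 for node in split}

q_split = modularity(sample_edges, split)    # two communities
q_lumped = modularity(sample_edges, lumped)  # everything in one group
```

Splitting the two triangles scores higher than lumping everything together, which is the behaviour one expects from a measure of "strength of division".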
You’ll find the most representative sub-communities below; please note that each of the following images is, in fact, a sort of filter on the original above:
## Some information on shared content
Parsing the content of the tweets, we searched for the most important content shared through them (in terms of popularity). This led to interesting content on the platform itself (**warning: some content contains strong images not pictured here but available following the links**).
First of all let’s look at the most retweeted tweets:
Being unable to read Arabic content or to understand link contents behind the t.co link shortener, we expanded all the links found in the tweets and catalogued each of them into different typologies, based on their domains.
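The cataloguing step can be sketched as follows, assuming the t.co links have already been expanded to their final URLs (expansion needs network access and is left out here). The domain-to-typology table is a hypothetical example, not the mapping used by the platform.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical mapping from host name to content typology.
CATEGORIES = {
    "youtube.com": "YouTube video",
    "youtu.be": "YouTube video",
    "pbs.twimg.com": "Twitter image",
    "twitter.com": "Twitter",
}


def categorise(url):
    """Return the typology of an expanded URL, based on its domain."""
    host = (urlparse(url).hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return CATEGORIES.get(host, "other")


def count_by_category(expanded_urls):
    """Count how many shared links fall into each typology."""
    return Counter(categorise(u) for u in expanded_urls)


sample_urls = [
    "https://www.youtube.com/watch?v=abc",
    "http://youtu.be/abc",
    "https://pbs.twimg.com/media/x.jpg",
    "http://example.org/page",
]
by_category = count_by_category(sample_urls)
```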
## Top Overall Content Shared
The top content shared, not categorised, is:
|#|URL|Number of Uses||
## Top Twitter Images
Most Shared **Twitter Images** are, instead:
|#|URL|Number of Uses||
## Top YouTube Videos
Here is a list of the Most Shared **YouTube Videos**:
|#|URL|Number of Uses||
Not much to be said here: it’s **quite fun to see** how software designed and created **for Marketing and Communication** purposes **fits quite nicely into the Intelligence world**. It really makes sense, since Marketing itself stole methodologies and visualisations from Intelligence software and its community.
Twitter (and social media in general), as demonstrated by the Arab Spring and the Occupy Movement, is an incredible source of information about **uprisings**, but also about **terrorism** and people’s behaviour in general. The use of this kind of tech must be considered **standard** in any kind of Electronic Warfare scenario, be it CyberAttacks, CyberIntelligence or plain old PsyOp.
Feel free to [contact][contact] me (mf @ matteoflora.com) for specific information, to point out errors in the analysis or to point out the grammar monsters I created (there will be plenty, as I’m not a native English speaker). Or you can talk about this on the [Reddit Page][reddit].
[^modularity]: Modularity is one measure of the structure of networks or graphs. It was designed to measure the strength of division of a network into modules (also called groups, clusters or communities).
[^impression]: While an industry standard, the measurement of Impressions is of little use in practice: it is calculated by multiplying the number of tweets (and retweets) one account has made by the number of followers it had at that particular moment in time. A better-suited definition could be “Opportunities to See”.
[^total]: Calculated summing up, without de-duplication, the number of the followers of all the users spotted out in the conversations.
[^tango]: A word used to describe a terrorist that has been eliminated. It is mainly used by the Special Forces to describe an eliminated enemy during a firefight.
[^isisbo]: The call for ‘IS’ media blackout after reported execution of the US journalist.
[^engagement]: Measuring how much (and by whom) the information written by a single user is carried across the web.
[^eigen]: The most connected people in a social network (those with the highest number of incoming and outgoing connections, especially to other well-connected people) have high eigenvector centrality scores. These scores can be calculated, much as in Google’s PageRank algorithm, by weighting the value of each connection by the score of its originator.
[^data]: We do have a paid firehose to get unmediated access to the Twitter stream. Most of what we did, though, can be replicated using free data from the API.