Least Squares

just trying to minimize error

Predicting information diffusion in Twitter

Posted by Scott on May 18, 2010

Jiang Yang, rockin’ grad student at Michigan and former MSR intern, and I will be showing a pair of short papers on information diffusion in Twitter at the upcoming ICWSM conference. The first paper examines person and tweet characteristics to see what predicts aspects of information diffusion. Predictors included things like the number of posts and number of mentions for a user, and whether a tweet contained a link. The outcome variables were the speed (how quickly information travels through the network), scale (how many nodes at the first degree are affected), and range (how many hops in the network the information travels) of information diffusion, nicely visualized courtesy of Jiang’s design skills:

Speed, Scale, and Range
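To make the three outcome variables concrete, here is a minimal sketch of how they might be computed for a single tweet's cascade. The edge format, the function name `diffusion_measures`, and the use of "delay until the first direct response" as a stand-in for speed are all assumptions for illustration; the papers' exact operational definitions may differ, and the data is made up.

```python
from collections import defaultdict


def diffusion_measures(root, root_time, edges):
    """Compute toy speed/scale/range measures for one cascade.

    edges is a list of (parent, child, time) tuples (hypothetical format),
    assumed to form a tree rooted at `root`.
    """
    children = defaultdict(list)
    for parent, child, t in edges:
        children[parent].append((child, t))

    first_degree = children.get(root, [])

    # Scale: number of nodes directly (first degree) reached from the root.
    scale = len(first_degree)

    # Speed proxy (assumption): delay until the first direct response.
    earliest = min((t for _, t in first_degree), default=None)
    first_delay = None if earliest is None else earliest - root_time

    # Range: the longest chain of hops starting at the root.
    max_hops = 0
    stack = [(root, 0)]
    while stack:
        node, depth = stack.pop()
        max_hops = max(max_hops, depth)
        for child, _ in children.get(node, []):
            stack.append((child, depth + 1))

    return {"first_delay": first_delay, "scale": scale, "range": max_hops}


# Made-up cascade: a -> b, a -> c, b -> d, d -> e (times in minutes).
edges = [("a", "b", 5), ("a", "c", 12), ("b", "d", 20), ("d", "e", 31)]
measures = diffusion_measures("a", 0, edges)
```

In this toy cascade, two users respond directly to the original tweet, and the longest chain (a, b, d, e) is three hops long.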

Using about a month’s worth of “spritzer” feed content, Jiang built regression models to describe the percent of variance that the different characteristics account for in each aspect of information diffusion across a number of sample topics. For speed and range, Jiang suggested using a Cox proportional hazards regression model, which is often used for survival analysis. In the case of speed, for example, this allowed us to describe how the different characteristics predict how information diffusion will die off over time. For the scale of diffusion (number of child nodes affected), we used a standard regression model. Here are the results from that model, showing correlation coefficients for each predictor:
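As an aside for readers new to survival models, below is a rough sketch of what fitting a Cox proportional hazards model involves, reduced to a single predictor and written in plain Python. It maximizes the Cox partial likelihood with Newton's method. This is not the code used for the papers: the function names, the single `has_link` predictor, and the toy data are all assumptions, and a real analysis would normally use a statistics package with several predictors.

```python
import math


def cox_score_and_info(beta, times, events, x):
    """Score (gradient) and information of the Cox partial log-likelihood
    for one covariate. events[i] is 1 if the event was observed, 0 if censored."""
    score, info = 0.0, 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored observations only appear in risk sets
        s0 = s1 = s2 = 0.0
        for t_j, x_j in zip(times, x):
            if t_j >= t_i:  # subject j is still at risk at time t_i
                w = math.exp(beta * x_j)
                s0 += w
                s1 += w * x_j
                s2 += w * x_j * x_j
        mean = s1 / s0
        score += x[i] - mean
        info += s2 / s0 - mean * mean
    return score, info


def fit_cox_1d(times, events, x, iters=25, tol=1e-10):
    """Newton's method on the partial likelihood, starting from beta = 0."""
    beta = 0.0
    for _ in range(iters):
        score, info = cox_score_and_info(beta, times, events, x)
        if abs(score) < tol or info == 0.0:
            break
        beta += score / info
    return beta


# Made-up data: minutes until a tweet was passed along, whether that was
# observed (1) or the tweet was censored (0), and a hypothetical has_link flag.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 1, 1, 1, 1, 0]
has_link = [1, 1, 0, 1, 0, 1, 0, 0]

beta = fit_cox_1d(times, events, has_link)
hazard_ratio = math.exp(beta)
```

A positive `beta` (hazard ratio above 1) means that, in this toy data, tweets with the flag set tend to be passed along sooner than tweets without it.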

One take-away is that the historical rate at which a user is mentioned (Log(nMentioned) in the above table) is generally the best predictor across all three measures. In fact, the correlations with scale of diffusion reach the .5 and .6 range. More nuanced findings include a discussion of how the topic stage during which the tweet happens (i.e., early or late in the life cycle of a topic) can have a significant impact, but not always in the same direction (that is, tweets earlier in a topic do not always have the greatest diffusion).
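For a sense of what a correlation between a log-transformed mention count and scale looks like in practice, here is a small sketch in plain Python. The numbers are invented for illustration and are not from the papers, and `pearson_r` is just a hand-rolled helper. Note that the natural log requires positive counts, so users with zero mentions would need separate handling.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    sxx = sum((a - mean_x) ** 2 for a in xs)
    syy = sum((b - mean_y) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up data: historical mention counts and first-degree scale per user.
n_mentioned = [1, 3, 10, 30, 100, 300]
scale = [0, 2, 1, 4, 3, 6]

log_mentions = [math.log(v) for v in n_mentioned]
r = pearson_r(log_mentions, scale)

# With a single predictor and an intercept, r squared is the share of
# variance in scale accounted for by the predictor.
r_squared = r * r
```

By the same arithmetic, a correlation of .5 to .6 from a single predictor would correspond to roughly 25% to 36% of the variance.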

The second paper, btw, compares diffusion and network structure in Twitter to that of a blogging network. Here is a graphic comparing the two:

