Steve Green, PhD

Data Scientist and Director of Research and Evaluation

July 30th 2017

Illustrations by Mitch Green.

Can Donald Control The News Cycle?

It might be an understatement to say that our President can tweet some pretty colorful things:

In addition to winning the Electoral College in a landslide, I won the popular vote if you deduct the millions of people who voted illegally
— Donald J. Trump (@realDonaldTrump) November 27, 2016

Failing @NYTimes will always take a good story about me and make it bad. Every article is unfair and biased. Very sad!
— Donald J. Trump (@realDonaldTrump) May 20, 2016

I refuse to call Megyn Kelly a bimbo, because that would not be politically correct. Instead I will only call her a lightweight reporter!
— Donald J. Trump (@realDonaldTrump) January 27, 2016

As a result, there has been a great deal of interest in understanding the President’s tweets. Just this week, Kevin Quealy at the New York Times used predictive analytics to estimate how many people Trump will insult during his presidency. At Slate, Matthew Dessem, using some colorful language himself, argues that he can divine Trump policy by taking the inverse of Donald’s historical tweets. Taking a more analytical approach, David Robinson at Variance Explained conducted an excellent analysis showing that tweets likely written by the Donald himself tend to be angrier than tweets written by his campaign staff. Finally, working for Quartz Media, Michael Tabb hypothesizes that Donald uses his tweets strategically to drive national media attention away from negative news about himself.

We were interested in testing the idea that Trump’s negative tweets attract more attention than his positive tweets. Specifically, we wanted to test whether negative tweets drive more attention in the form of retweets. If there is a basis for the argument that emotionally negative tweets drive media attention, then we expect these kinds of tweets to produce more retweets. This would at least establish that negative tweets could be used strategically to draw attention away from negative news stories.

Scoring Tweets for Emotional Level

We used a straightforward approach to quantify the emotional content of Trump’s tweets. Using a list of positive and negative words developed by these researchers, we classified each word in a tweet as negative, positive, or neutral. Next, we added up the positive words, subtracted the negative words, and ignored the neutral words to get the tweet’s emotional score. Finally, we divided that score by the number of words in the tweet. This final step keeps longer tweets from receiving more extreme scores simply because they contain more words.
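The scoring steps above can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the article: the two word sets are tiny hypothetical stand-ins for the published word list, and the tokenizer (lowercase, letters and apostrophes only) is our own assumption.

```python
import re

# Hypothetical mini-lexicons for illustration only; the real analysis
# used a much larger published list of positive and negative words.
POSITIVE = {"great", "terrific", "wonderful", "good"}
NEGATIVE = {"bad", "sad", "unfair", "biased", "failing", "terrible"}


def emotional_score(tweet: str) -> float:
    """Return (positive words - negative words) / total words."""
    # Assumed tokenization: lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", tweet.lower())
    if not words:
        return 0.0  # an empty tweet is treated as neutral
    positives = sum(word in POSITIVE for word in words)
    negatives = sum(word in NEGATIVE for word in words)
    # Dividing by the word count normalizes for tweet length.
    return (positives - negatives) / len(words)


print(emotional_score("Every article is unfair and biased. Very sad!"))
```

With these stand-in lists, a tweet made only of neutral words scores 0, and the score can never fall below -1 or rise above 1.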

To better illustrate our analysis, look at the figure below. There you will see a Trump tweet in the top box and in the middle box one of the words from that tweet. Similar to our algorithm, you will classify each word as either positive, negative, or neutral using the three buttons below. Notice that as you classify the word, the orange point in the plot to the right will move around. Go ahead and determine the emotional scores of several of Trump’s tweets.

Testing the relationship between emotional score and retweets

Remember, we are interested in seeing whether emotionally negative tweets generate more retweets than positive tweets. Using linear regression, we modeled the relationship between the emotional score and retweets as a straight line. Since we predict that more negative tweets will drive more retweets, we expect this line to start high on the negative side and drop as tweets become more positive.
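For readers who want to see what fitting such a line involves, here is a minimal sketch of ordinary least squares in plain Python. The function name `fit_line` and the five data points are made up for illustration; they are not Trump's tweets and not the article's actual code.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x).
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data: emotional scores and retweet counts that fall on a line.
scores = [-0.2, -0.1, 0.0, 0.1, 0.2]
retweets = [9000, 8000, 7000, 6000, 5000]

slope, intercept = fit_line(scores, retweets)
# For these invented points the slope is about -10000 and the
# intercept about 7000: retweets drop as the score becomes more positive.
print(slope, intercept)
```

A negative slope is what our hypothesis predicts: the line starts high for negative scores and falls as scores become positive.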

Hit the "Draw!" button to see our prediction.

Now hit the "Test!" button to see if our fake data is aligned with our predicted line.

Notice in the above figure how the points, each of which represents a tweet, line up closely with our predicted line. When the data lines up closely with the line, our prediction is supported. In this case, of course, our fake data confirmed our hypothesis, since we made up the numbers to make sure they would match the line.

However, what about the real data from Donald Trump’s tweets? Let's take a look…

Results

Hit the "Predict!" button to see our predicted line.

Hit the "Test!" button to see how well the actual data lines up with our predicted line.

Hmm... That didn't go as planned. It doesn't look like the data matched our predicted line at all.

What if we fit a line that best fits the actual data? What would that look like?

Interesting: although the data didn't fit our predicted line, we can see that the line that best fits our data has the same downward pattern as our predicted line.

You can hover over individual data points to see the text of that tweet.

Separate Positive and Negative Tweets

So the results seem to indicate that the more negative Trump’s tweets are, the more retweets they receive. However, negative tweets don’t drive as many retweets as we originally thought.

But hold the phone. Given the way the data looks, it might not be that the more negative a tweet is, the more retweets it will receive. Instead, it may be that including both positive and negative tweets in the same analysis only makes it seem this way. There might simply be a general increase in retweets when a tweet is negative and a general decrease when a tweet is positive. This is a technical difference but an important one for our analysis. Our model seems to be saying that as a tweet becomes more negative, it will receive more retweets. However, the data might actually be saying that regardless of how negative a tweet is, so long as it’s negative, it will receive more retweets than a positive tweet.

One way to test this is to take the average number of retweets that negative tweets received and compare it to the average number of retweets that positive tweets received. If you examine the figure to the right, you can see a bar chart of this analysis. The two bars represent the average number of retweets that negative and positive tweets receive. The results clearly show that, on average, negative tweets receive more retweets than positive tweets, regardless of how negatively valenced the tweet is.
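The comparison of averages can be sketched as follows. The function name, the choice to leave out tweets with a score of exactly zero, and the sample numbers are all assumptions made for illustration; the source does not say how neutral tweets were handled.

```python
def mean_retweets_by_valence(tweets):
    """Average retweets for negative and positive tweets.

    `tweets` is an iterable of (emotional_score, retweet_count) pairs.
    """
    groups = {"negative": [], "positive": []}
    for score, retweet_count in tweets:
        if score < 0:
            groups["negative"].append(retweet_count)
        elif score > 0:
            groups["positive"].append(retweet_count)
        # Assumption: a score of exactly 0 is neutral and is skipped.
    return {
        name: sum(counts) / len(counts)
        for name, counts in groups.items()
        if counts  # avoid dividing by zero for an empty group
    }


# Made-up (score, retweets) pairs, not real tweet data.
sample = [(-0.3, 9000), (-0.1, 8800), (0.1, 6000), (0.3, 4000), (0.0, 5000)]
means = mean_retweets_by_valence(sample)
print(means)
```

Note that this comparison only looks at the sign of the score, not its size, which is exactly why it can separate "being negative at all" from "being more negative".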

These results may contradict our model showing that more negative tweets are associated with more retweets, as they suggest that the downward slope is driven simply by positive tweets receiving fewer retweets on average than negative tweets. Fortunately, we can resolve the contradiction by modeling the relationship between the emotional score and retweets separately for positive and negative tweets.

Separating positive and negative tweets and running individual models will allow us to see if a downward-sloping line shows up within the negative tweets. Such a pattern would let us confirm our original hypothesis, that the more negative a tweet is, the more retweets it will receive. In addition, we can observe how increases in positivity affect retweets without interference from negative tweets.

Let’s separate those tweets!

First, we split the data into positive and negative tweets.
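As a rough sketch of this step, the snippet below splits tweets by the sign of their emotional score and fits a separate least-squares line to each group. The helper names and the six data points are invented for illustration, chosen so that the negative group is flat and the positive group slopes downward; they are not the article's data or code.

```python
def fit_line(points):
    """Least-squares slope and intercept for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_by_valence(tweets):
    """Fit one line to negative tweets and another to positive tweets."""
    negative = [(s, r) for s, r in tweets if s < 0]
    positive = [(s, r) for s, r in tweets if s > 0]
    return {"negative": fit_line(negative), "positive": fit_line(positive)}


# Made-up (score, retweets) pairs for illustration only.
tweets = [
    (-0.3, 9000), (-0.2, 9000), (-0.1, 9000),  # flat on the negative side
    (0.1, 7000), (0.2, 6000), (0.3, 5000),     # falling on the positive side
]

models = fit_by_valence(tweets)
neg_slope, _ = models["negative"]
pos_slope, _ = models["positive"]
print(neg_slope, pos_slope)
```

In this invented example a single line through all six points would still slope downward, even though nothing changes within the negative tweets. That is the situation the separate models are designed to detect.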

Alright, let’s split the line that best fit the data when we examined negative and positive tweets together. Each half of that line will serve as our prediction for its side: one for negative tweets and one for positive tweets.

Now let’s fit a line that best fits the positive tweets (on the right side). According to our predicted line, the more positive the tweet, the fewer retweets it will receive.

Interesting, the new line almost perfectly matches our prediction. The more positive a tweet is, the less attention it gets from the twitterverse.

But what about negative tweets?

After separating tweets according to valence, we found no relationship between how negative a tweet is and how many retweets it receives. Positive tweets, on the other hand, did show the predicted pattern, with more positive tweets receiving fewer retweets.

Originally, we sought to test the hypothesis that negative tweets would drive greater attention in the form of retweets. When we modeled all tweets together, we found support for this hypothesis: a linear trend indicating that the more negative a tweet was, the more retweets it would receive. This linear trend disappeared when we separated tweets by valence, calling into question the relationship between the negativity of a tweet and its ability to produce retweets. Based on this analysis, we must conclude that making tweets more negative to draw attention would likely be fairly ineffective.

Well, maybe not entirely ineffective. When we compared the average retweets of negative and positive tweets, we found that negative tweets produced more retweets. Based on this result, if Trump is looking to drive attention away from a damaging story, he is better off publishing negative tweets than positive ones, although he wouldn’t need to make a tweet very negative to achieve the desired effect. More negative tweets do not produce more retweets than less negative tweets.

Become one with Trump

To see this more clearly, in the figure below you can make your own tweet. The figure below has negative, positive, and neutral words that you can drag to the box that says ‘Drop Words Here’. Make your own negative Donald Trump tweet by dropping words into the box and see how many retweets you can get. The figure has three circles. The red circle predicts retweets using the negative tweet model, while the blue circle predicts with the positive tweet model. The purple circle predicts retweets from our model that combined both positive and negative tweets (our first model).

Go on, no one is looking, channel your inner Donald Trump and make a really nasty tweet. See how many retweets it gets.

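[Interactive figure: word bank for building a tweet. Axis label: Emotional Tweet Score.]

Negative words: disaster, terrible, crooked, liar, horrible, bad, dishonest

Neutral words: it, Obama, Clinton, America, the, will, was, jobs, get, does, doesn't, be, President, has, been, it, for, our, is, country, a, for, and

Positive words: Yuge, great, terrific, wonderful, tough, massive, terrific, special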

Notice that as the tweet becomes more or less negative, the red circle mostly moves right and left but not up and down. In contrast, the blue and purple circles move up as the tweet becomes more negative and move down as it becomes more positive. This is because our negative tweet model did not show a relationship between the negativity of a tweet and the number of retweets it receives. Knowing this, it would seem silly to try to drive attention in the form of retweets by making tweets strongly negative, since doing so would not affect retweet counts.

Conclusions

Results from our analysis do not support the idea that the more negative a tweet is, the more retweets it will receive. This result suggests that crafting increasingly negative tweets as a way to distract the media from negative stories would not be very effective. Thus, we believe that Trump is unlikely to be posting negative tweets for the explicit purpose of driving national media attention away from unflattering stories. Or, if he is, it’s not a very effective strategy.

Limitations

The most important limitation is that we used retweet counts as a measure of the national media’s attention. Retweets likely capture only a small part of that attention, and there is no reason to think that retweet counts direct the kind of stories that the press covers. Relatedly, this analysis is only associative in nature. Although we suspect there is a causal relationship between the emotional score and retweets, we would need to perform an experiment to test for a cause-and-effect relationship.

We also need to mention that we only used tweets from June 15th, 2015 to November 8th, 2016 (all tweets that occurred during Trump’s presidential campaign). Although I wouldn’t expect it, our failure to show a relationship between negativity and increasing retweets may be specific to the time period we analyzed.