Michelle Qiu 12/16/2022
Twitter’s mission statement is to “give everyone the power to create and share ideas and information instantly.” Since the inception of Twitter and similar social media giants, these platforms have undoubtedly been instrumental for individuals wanting to contribute to public conversations. However, in pursuing this goal of facilitating global connections, modern social media has also generated side effects with more far-reaching implications than its founders may have intended. In particular, these platforms have gained momentum in American politics and have established themselves as an easily accessible method for delivering political platforms to constituents, so much so that nearly 48% of all Americans sometimes or often received their news from social media sources in 2021, as compared to 20% of all Americans who regularly got news from social network sites in 2012 (Matsa & Walker, 2021; Pew Research Center, 2012).
When looking for a successful political bid that utilized Twitter to its advantage, one does not need to look any further than former President Donald Trump’s 2016 campaign. Trump explicitly stated that his daily barrage of tweets (which grew to nearly 35 tweets/day by the end of his presidency before he was infamously banned from the platform altogether) helped him ascend to the presidency, declaring “if I didn’t have social media, I probably wouldn’t be here” (McCarthy, 2021; National Archives and Records Administration, 2020). Compared to his Democratic opponent Hillary Clinton, who relied heavily on traditional forms of news media/journalism and who had considerably fewer followers and tweets on Twitter by election day of 2016, Trump garnered substantial public support through his constant Twitter messaging (Enli, 2017). Whereas social media, and especially Twitter, is certainly normalized as a mechanism for political campaigning today, prior to the 2016 election the general public largely believed that a traditional news-based campaign carried more legitimacy and influence than burgeoning social network sites (Johnson & Kaye, 2014). However, the 2016 election demonstrated that campaign messages on social media like Twitter, “regardless of which candidate the message promoted,” are regarded just as strongly as communications through traditional media sources (Morris, 2018).
More than just the sheer mass of tweets that Trump emitted during his presidency, his incorporation of “unprofessional,” insult-laden rhetoric contrasted greatly with the sanitized, markedly formal tweets that Clinton and former President Barack Obama employed in their campaigns (Enli, 2017; Ross & Rivers, 2020). Not only did his deliberately inflammatory and negative statements delegitimize his opponents and emphasize his status as a celebrity politician, but his informal and simplistic language style also personalized him to constituents and ensured that he was comprehensible to the average American (Ross & Rivers, 2020). In other words, several scholars argue that Trump’s overall “amateur” style ensured that he was viewed as authentic in publishing sincere opinions on Twitter, especially if they were vulgar or offensive (Enli, 2017; Theye & Melling, 2018).
Because Trump’s unconventional use of Twitter as a campaign strategy contributed to his unlikely election in 2016, it seems as though more politicians have adopted similar tactics in hopes of replicating his success. In particular, of the qualities that Trump continually exhibited on Twitter, his brashness in the form of insults, name-calling, and toxicity is a key aspect of his rhetoric that is easily imitable. Therefore, this paper seeks to analyze what kinds of candidates choose to use similar controversial or offensive messaging tactics to engage with constituents on Twitter post-Trump. The 2022 midterm election was instrumental for both political parties in determining which party would gain control of Congress, as it was the first national election since President Joe Biden took office (Epstein, 2022). Thus, in such a high-stakes election, I was interested in seeing which candidates believed they would benefit electorally from utilizing a controversial campaign messaging style. After analyzing literature linking controversy, social media, and politics, I propose three hypotheses involving controversy by political candidates on Twitter:
1. higher controversiality scores lead to higher engagement (likes, retweets, and overall follower count for a given candidate),
2. candidates in politically heterogeneous states have more controversial tweets, and
3. non-incumbent candidates have more controversial tweets than incumbent candidates.
First, it is prudent to define what “controversial” really means in the context of social media. Because it is difficult to delineate any one corpus of words that could explicitly encompass all possible insults or remarks perceived to be “controversial,” this method of scrutinizing controversial tweets utilizes the Perspective API, which has a variety of scoring categories covering various forms of controversial statements:
TOXICITY: A rude, disrespectful, or unreasonable comment that is likely to make people leave a discussion.
SEVERE TOXICITY: A very hateful, aggressive, disrespectful comment or otherwise very likely to make a user leave a discussion or give up on sharing their perspective. This attribute is much less sensitive to more mild forms of toxicity, such as comments that include positive uses of curse words.
IDENTITY ATTACK: Negative or hateful comments targeting someone because of their identity.
INSULT: Insulting, inflammatory, or negative comment towards a person or a group of people.
PROFANITY: Swear words, curse words, or other obscene or profane language.
THREAT: Describes an intention to inflict pain, injury, or violence against an individual or group.
Thus, any further description of a tweet as “controversial” denotes that it scores highly in one or multiple of the above categories.
In this analysis, it is important to understand the link between controversy and engagement in an online setting like Twitter. Controversial statements intuitively seem like they would induce more conversations about a subject, especially polarizing topics that most individuals tend to have an opinion on (Chen & Berger, 2013). However, discussing controversial topics can also increase discomfort and cause social rejection, which reduces the likelihood of people continuing to engage in these subjects (Buss, 1990). Yet, in the case of anonymized interactions, where users seemingly face fewer consequences for provocation, users are more likely to engage in response to controversial statements (Wang et al., 2014; Chen & Berger, 2013). Thus, because Twitter is a highly anonymized online platform with a relatively low barrier to entry for engaging with other users, I predict that higher controversiality scores will promote higher engagement (Peddinti et al., 2017). Specifically, users are more likely to retweet about topics that they themselves typically do not tweet about (such as topics they find controversial), and they tend to favorite tweets that they agree with (Macskassy & Michelson, 2011). Because of this, I hypothesize that both retweets and favorites will be higher for more controversial tweets. Given this connection between engagement and controversy, I also hypothesize that candidates who are more popular on Twitter will have higher controversy scores.
As visible with Trump’s campaign, extreme polarization of a candidate is no longer political suicide, and can even contribute to public perception of a candidate’s personability/authenticity (Theye & Melling, 2018). However, in the context of a “swing state,” or a state whose voters are closely politically divided, there is conflicting literature that demonstrates moderate candidates do perform better electorally among swing voters than extremely polarizing candidates because extremists tend to galvanize voters of the opposing party more than they inspire members of their own party to vote (What swing states are and why they’re important, 2020; Hall & Thompson, 2018). Despite this evidence, I was skeptical that recent political candidates would actually implement more moderate, less controversial messaging tactics simply due to the general rise in polarization in recent decades and the desire to replicate the tantalizing success that Trump achieved using his unconventional methods (DeSilver, 2022). Thus, in my analysis of controversial tweets, I ultimately selected the latter line of reasoning to hypothesize that candidates in more politically heterogeneous states would be more likely to employ more controversial statements on Twitter, as they would want to differentiate themselves more from the other candidate.
Because incumbent political candidates already enjoy a well-known advantage from increased name recognition and more financial resources than non-incumbent candidates, non-incumbent candidates must adopt new tactics to stand out as viable contenders against the incumbent (Banks & Kiewiet, 1989). Additionally, as argued above, controversy on online platforms like Twitter propels higher levels of engagement with other Twitter users, which can increase a candidate’s name recognition. For these reasons, I hypothesize that non-incumbent candidates will choose to be more controversial than incumbent candidates in order to enter the mainstream public consciousness and gain popularity.
In order to test my hypotheses, I gathered a large amount of Twitter data by first manually compiling a list of all senatorial and gubernatorial candidates (127 total) in the 2022 U.S. general election, omitting those who received less than 5% of the vote to ensure that the tweets I collected were from at least relatively popular candidates. Within this list, I also noted each candidate’s state, political party, gender, incumbency status, and whether they won their election, and then recorded their Twitter handle. I then used the rtweet package’s get_timeline() function to obtain tweets, favorite counts, and retweet counts from every handle, intending to collect data from May 1, 2022 to November 8, 2022 (from the first primaries to election day). Because the Twitter API does not allow users to specify a particular time period that they are interested in analyzing, I initially selected the most recent 1500 tweets for every candidate, then extracted more tweets (up to 3200) for candidates whose most recent 1500 tweets did not extend all the way back to May. Combining all of these tweets, I obtained a final dataset of over 163,000 tweets. Due to rate limits of the Perspective API, I randomly selected a sample of 16,391 tweets from this final dataset to analyze.
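The windowing and subsampling step described above can be sketched in a few lines. This is only an illustration in Python, not the rtweet workflow used in the study: the `created_at` field name, the helper names, the toy timeline, and the choice to filter by date before sampling are all assumptions made for the example.

```python
import random
from datetime import date

# Study window taken from the text: May 1 to November 8, 2022.
WINDOW_START = date(2022, 5, 1)
WINDOW_END = date(2022, 11, 8)


def in_window(tweet):
    """Return True if the tweet's creation date falls inside the study window."""
    return WINDOW_START <= tweet["created_at"] <= WINDOW_END


def sample_tweets(tweets, k, seed=0):
    """Keep only in-window tweets, then draw up to k of them without replacement."""
    eligible = [t for t in tweets if in_window(t)]
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(eligible, min(k, len(eligible)))


# Toy timeline (illustrative only): one tweet per day for all of 2022.
first_day = date(2022, 1, 1).toordinal()
timeline = [
    {"id": i, "created_at": date.fromordinal(first_day + i)} for i in range(365)
]
subset = sample_tweets(timeline, k=20)
```

Fixing the random seed is a design choice that makes the subsample reproducible, which matters when the scored subset cannot cheaply be regenerated because of API rate limits.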
For my second hypothesis, I computed a metric for the political heterogeneity of each state as the ratio between the smaller and the larger of the number of Democratic and Republican voters (so that the most politically heterogeneous states have a score closer to 1 than homogeneous states). I used data from the 2020 presidential election to compute these ratios for each state, and then joined this ratio data to my data subset. Finally, in order to conduct standardized regression modeling for each of my hypotheses, I recoded non-numeric values as numeric values. For example, I recoded political party as 0 for Democrats, 1 for Republicans, and 0.5 for all other candidates, and I recoded incumbency status as 1 for non-incumbents and 0 for incumbents.
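As a minimal sketch of the heterogeneity ratio and the recoding described above (in Python rather than the R used for the analysis; the function names and the vote counts are hypothetical and chosen only for illustration):

```python
def heterogeneity_ratio(dem_votes, rep_votes):
    """Smaller party vote count divided by the larger; 1.0 means an even split."""
    larger = max(dem_votes, rep_votes)
    if larger == 0:
        raise ValueError("at least one party must have votes")
    return min(dem_votes, rep_votes) / larger


# Party coding from the text: 0 for Democrats, 1 for Republicans, 0.5 otherwise.
PARTY_CODES = {"Democrat": 0.0, "Republican": 1.0}


def encode_party(party):
    """Map a party label to its numeric code, defaulting to 0.5 for other parties."""
    return PARTY_CODES.get(party, 0.5)


def encode_incumbent(is_incumbent):
    """Incumbency coding from the text: 1 for non-incumbents, 0 for incumbents."""
    return 0 if is_incumbent else 1


# Hypothetical vote counts: a closely divided state and a lopsided one.
close_state = heterogeneity_ratio(480_000, 520_000)
lopsided_state = heterogeneity_ratio(300_000, 700_000)
```

Because the ratio uses the minimum over the maximum, it is symmetric in the two parties: a state scores the same whether it leans Democratic or Republican by a given margin.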
As previously mentioned, I decided to use the Perspective API, an external API that analyzes the content of a selection of text and assigns a score for categories specifically related to controversiality (toxicity, severe toxicity, identity attack, insult, profanity, and threat). Each category is defined in more detail in the Background section. Each score ranges from 0 to 1, with larger values being stronger indicators of the category in the tweet. For simplicity’s sake, I analyzed only the sum of these six scores for each tweet.
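The aggregation step can be illustrated with a short sketch. It assumes the JSON layout of a Perspective API `comments:analyze` request and response (a `requestedAttributes` map in the request and `attributeScores` with a `summaryScore.value` per attribute in the response); those field names should be checked against the current API documentation. No network call is made, the helper names are hypothetical, and the mock response values are invented for the example.

```python
# The six attributes summed into the controversiality score.
ATTRIBUTES = [
    "TOXICITY",
    "SEVERE_TOXICITY",
    "IDENTITY_ATTACK",
    "INSULT",
    "PROFANITY",
    "THREAT",
]


def build_request(text):
    """Build the JSON body for one analyze request (assumed field layout)."""
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {name: {} for name in ATTRIBUTES},
    }


def total_score(response):
    """Sum the summary score of each attribute into one controversiality score."""
    scores = response["attributeScores"]
    return sum(scores[name]["summaryScore"]["value"] for name in ATTRIBUTES)


# Mock response with made-up values, shaped like the assumed API output.
mock_values = [0.5, 0.25, 0.125, 0.0625, 0.03125, 0.03125]
mock_response = {
    "attributeScores": {
        name: {"summaryScore": {"value": value, "type": "PROBABILITY"}}
        for name, value in zip(ATTRIBUTES, mock_values)
    }
}
score = total_score(mock_response)
```

Since each attribute lies between 0 and 1, a sum over six attributes lies between 0 and 6, so the combined score is not itself a probability.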
A potential concern that could impact this analysis’s validity is failing to account for confounding variables that could also correlate with controversy scores. For instance, the number of followers that a candidate has would obviously impact the number of retweets and likes that a single tweet receives (since a candidate with only a few followers has fewer people to view that candidate’s tweet on their timeline than a more highly followed candidate). Additionally, the political party that a candidate belongs to can also impact that candidate’s influence due to unintended side effects of the Twitter algorithm. Thus, in order to control for these factors, I utilized standardized regression models for each of the three hypotheses. To visualize these models, I used added-variable plots, which show each standardized predictor’s relationship with the outcome after adjusting for the other predictors, and I omitted statistically non-significant control variables from the visualizations.
Variables:
score = the sum of each of the controversiality scores (TOXICITY, SEVERE_TOXICITY, IDENTITY_ATTACK, INSULT, PROFANITY, and THREAT) that a particular tweet received
retweet_count = the number of retweets that a particular tweet received
favorite_count = the number of favorites that a particular tweet received
followers_count = the number of followers that the author of a particular tweet has
ratio = the ratio between the minimum of the number of democratic and republican voters and the maximum of the number of democratic and republican voters in the 2020 U.S. presidential election
incumbent = the incumbency status of a candidate, where 1 represents non-incumbent candidates, and 0 represents incumbent candidates.
Hypothesis 1:
\(\text{retweet\_count} = a_0 + a_1\,\text{score} + a_2\,\text{party} + a_3\,\text{followers\_count}\)
\(\text{favorite\_count} = a_0 + a_1\,\text{score} + a_2\,\text{party} + a_3\,\text{followers\_count}\)
\(\text{followers\_count} = a_0 + a_1\,\text{score} + a_2\,\text{party}\)
Hypothesis 2:
\(\text{score} = a_0 + a_1\,\text{ratio} + a_2\,\text{party} + a_3\,\text{followers\_count}\)
Hypothesis 3:
\(\text{score} = a_0 + a_1\,\text{incumbent} + a_2\,\text{party} + a_3\,\text{followers\_count}\)
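To make the idea of a standardized regression concrete, here is a minimal sketch that standardizes every variable (mean 0, sample standard deviation 1, as R's `scale()` does) and solves the least-squares normal equations. It is written in plain Python for illustration and is not the R code used in the study; the helper names and the toy data are assumptions.

```python
from statistics import mean, stdev


def standardize(values):
    """Center to mean 0 and scale to sample standard deviation 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def standardized_ols(y, predictors):
    """Return [a_0, a_1, ...] for standardized y on standardized predictors."""
    zy = standardize(y)
    cols = [[1.0] * len(y)] + [standardize(p) for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, zy)) for ci in cols]
    return solve(xtx, xty)


# Toy data (invented for illustration), one entry per tweet.
score = [0.02, 0.10, 0.05, 0.30, 0.01, 0.20, 0.08, 0.15]
party = [0, 1, 0, 1, 0.5, 1, 0, 1]
followers_count = [1000, 5000, 20000, 800, 300, 150000, 42000, 9000]
retweet_count = [3, 20, 45, 12, 1, 400, 90, 30]

coefs = standardized_ols(retweet_count, [score, party, followers_count])
```

Because every variable is on the same unitless scale, the fitted coefficients can be compared directly with one another, and the intercept is zero up to floating-point error.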
After taking a preliminary look at this data, it is quite apparent that the vast majority of tweets are largely non-controversial. Table 1 and Figure 1 demonstrate that the highest average controversiality score across all of the categories is for toxicity, and that is still at a mere 0.0454492 on a scale ranging from 0 to 1, with 1 being more controversial. While this analysis continues to look at all of this data, future analyses may focus solely on tweets that had controversiality scores above a certain threshold.
Figure 1. The distribution of controversiality scores visualized with boxplots
Table 1. The overall mean scores for each controversiality category, with each category ranging from 0 to 1
| Controversiality Category | Average Score | 
|---|---|
| Identity Attack | 0.0134040 | 
| Insult | 0.0239396 | 
| Profanity | 0.0199635 | 
| Severe Toxicity | 0.0020456 | 
| Threat | 0.0123512 | 
| Toxicity | 0.0454492 | 
Note: * denotes a p-value of <0.05, ** denotes a p-value of <0.01, and *** denotes a p-value of <0.001
For each of the models, I created an added-variable plot that isolates each factor inputted into the regression model to better visualize the effect of each factor on the outcome. The added-variable plot for a predictor \(x_i\) can be interpreted as showing the effect of \(x_i\) on the outcome \(y\), given the other factors \(x_j, x_k, \ldots\). The confounding variables of political party and follower count were omitted from the plots whenever their coefficients were not significant (a p-value of >0.05).
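For readers unfamiliar with added-variable plots, the construction can be sketched as follows: regress both the outcome and the predictor of interest on the remaining predictors, then plot the two sets of residuals against each other. The slope through those points matches the predictor's coefficient in the full model. The sketch below is plain Python with a single "other" predictor to keep it short; the helper names and toy data are assumptions, and this is not the plotting code used in the study.

```python
from statistics import mean


def simple_fit(x, y):
    """Least-squares intercept and slope for y regressed on one predictor x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals of y after removing the linear effect of x."""
    intercept, slope = simple_fit(x, y)
    return [b - (intercept + slope * a) for a, b in zip(x, y)]


def added_variable_points(y, x_focus, x_other):
    """Coordinates of the added-variable plot for x_focus, adjusting for x_other."""
    ex = residuals(x_other, x_focus)  # horizontal axis: x_focus given x_other
    ey = residuals(x_other, y)        # vertical axis: y given x_other
    return ex, ey


# Toy data built so that y = 2 * score + 3 * followers exactly.
score = [0.1, 0.4, 0.2, 0.9, 0.5, 0.3]
followers = [1.0, 2.0, 4.0, 3.0, 6.0, 5.0]
y = [2 * s + 3 * f for s, f in zip(score, followers)]

ex, ey = added_variable_points(y, score, followers)
_, av_slope = simple_fit(ex, ey)
```

In this toy case the slope through the added-variable points recovers the coefficient of 2 on `score`, even though `followers` also drives `y`.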
Figure 2 displays the regression visualization for the first part of Hypothesis 1, that higher controversiality scores will result in higher retweet counts. Because the p-value for the score coefficient is less than 0.001, we can reject the null hypothesis for this part of the hypothesis and conclude that there is a statistically significant positive correlation between controversiality score and retweet count. The followers_count coefficient is also significant at the 0.001 level, and its standardized estimate (0.080) is larger than that of score (0.060). Thus, we can also conclude that there is a statistically significant correlation between the number of followers one has and the retweet count, and that this relationship is stronger than the one between controversiality and retweet count. The party coefficient is not statistically significant here (p = 0.339), meaning that a candidate’s political party does not have a significant bearing on the number of retweets that their tweets receive.
Figure 2: Regression Visualization for Hypothesis 1a
Table 2: Simplified Regression data for Hypothesis 1a
| Standardized Coefficients | Estimate | t-value | p-value | 
|---|---|---|---|
| score | 5.972e-02 | 7.609 | 2.91e-14 *** | 
| party | 7.498e-03 | 0.956 | 0.339 | 
| followers_count | 8.015e-02 | 10.214 | < 2e-16 *** | 
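As a quick check on how the columns of this table relate, each t-value is the coefficient estimate divided by its standard error. The standard error used here (7.849e-03 for score) comes from the detailed regression output at the end of this paper, and the null hypothesis is that the coefficient equals zero:

\[
t = \frac{\hat{a}_1}{\mathrm{SE}(\hat{a}_1)} = \frac{5.972 \times 10^{-2}}{7.849 \times 10^{-3}} \approx 7.609
\]

With roughly 16,000 residual degrees of freedom, a t-value of this size corresponds to the very small p-value reported in the table.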
Figure 3 displays the regression visualization for the second part of Hypothesis 1, that higher controversiality scores will result in higher favorite counts. Because the p-value for the score coefficient is less than 0.001, we can reject the null hypothesis and conclude that there is a statistically significant positive correlation between controversiality score and favorite count. The followers_count coefficient is also significant at the 0.001 level, and its standardized estimate (0.224) is larger than that of score (0.053). Thus, we can also conclude that there is a statistically significant correlation between the number of followers one has and the favorite count, and that this relationship is stronger than the one between controversiality and favorite count. There is also a weaker, but still statistically significant, correlation between political party and favorite count, with a p-value of 0.0012 < 0.01. Since political party was encoded as 0 for Democrats, 1 for Republicans, and 0.5 for all other candidates, the negative coefficient means that being a Republican correlates with lower favorite counts.
Figure 3: Regression Visualization for Hypothesis 1b
Table 3: Simplified Regression data for Hypothesis 1b
| Standardized Coefficients | Estimate | t-value | p-value | 
|---|---|---|---|
| score | 5.260e-02 | 6.855 | 7.38e-12 *** | 
| party | -2.483e-02 | -3.240 | 0.0012 ** | 
| followers_count | 2.238e-01 | 29.172 | < 2e-16 *** | 
Figure 4 displays the regression visualization for the third part of Hypothesis 1, that higher controversiality scores correlate to higher follower count. Because the p-value for the score coefficient is less than 0.001, we can reject the null hypothesis and conclude that there is a statistically significant and positive correlation between controversiality score and follower count, supporting the idea that higher controversy scores correspond to higher follower count. Here, one’s political party does not seem to have a statistically significant impact on the number of followers that a candidate has.
Figure 4: Regression Visualization for Hypothesis 1c
Table 4: Simplified Regression data for Hypothesis 1c
| Standardized Coefficients | Estimate | t-value | p-value | 
|---|---|---|---|
| score | 4.785e-02 | 6.079 | 1.24e-09 *** | 
| party | 6.934e-03 | 0.881 | 0.378 | 
Figure 5 displays the regression visualization for Hypothesis 2, that higher heterogeneity within a state correlates with higher controversiality scores. Here, because the p-value for the ratio coefficient is quite large (0.5153), we cannot reject the null hypothesis, and so we cannot conclude that there is a statistically significant correlation between political heterogeneity and controversiality. One’s political party, however, does seem to have a statistically significant impact on the controversiality score; with a p-value of 0.0103 and a positive coefficient, this indicates that Republican candidates are more likely to post controversial tweets. Once again, we also see that the p-value for the number of followers that a candidate has is less than 0.001, so we can conclude that there is a statistically significant and positive correlation between the number of followers a candidate has and the controversiality score.
Figure 5: Regression Visualization for Hypothesis 2
Table 5: Simplified Regression data for Hypothesis 2
| Standardized Coefficients | Estimate | t-value | p-value | 
|---|---|---|---|
| ratio | 5.234e-03 | 0.651 | 0.5153 | 
| party | 2.020e-02 | 2.567 | 0.0103 * | 
| followers_count | 4.675e-02 | 5.810 | 6.37e-09 *** | 
For Hypothesis 3, that non-incumbency status correlates with higher controversiality scores, I chose to display this data through a boxplot visualization in Figure 6 because incumbency status is binary. Here, because the p-value for the incumbency coefficient is quite small (< 2e-16), we reject the null hypothesis and conclude that there is a statistically significant and positive correlation between non-incumbency and controversiality. Because incumbency status was encoded as 1 for non-incumbent candidates and 0 for incumbent candidates, the positive coefficient signifies that non-incumbent candidates are likely to be more controversial in their tweets. Additionally, one’s political party also seems to have a statistically significant impact on the controversiality score; with a p-value of 0.000611 < 0.001 and a positive coefficient, this indicates that Republican candidates are more likely to post controversial tweets. Once again, we also see that the p-value for the number of followers that a candidate has is less than 0.001, so we can conclude that there is a statistically significant and positive correlation between the number of followers a candidate has and the controversiality score. Of all of the models above, this is the only one in which the factor being tested (incumbency status, with a standardized estimate of 0.099) has a larger impact on the dependent variable than the number of followers that a candidate has (0.062).
Figure 6: Boxplot Visualization for Hypothesis 3
Table 6: Simplified Regression data for Hypothesis 3
| Standardized Coefficients | Estimate | t-value | p-value | 
|---|---|---|---|
| incumbent | 9.850e-02 | 12.421 | < 2e-16 *** | 
| party | 2.691e-02 | 3.427 | 0.000611 *** | 
| followers_count | 6.174e-02 | 7.804 | 6.37e-15 *** | 
After analyzing this data, this study is able to reject the second hypothesis and accept the first and third hypotheses. For the first hypothesis, we can claim that there is a statistically meaningful correlation between how controversial a tweet is (as measured by the Perspective API) and the number of favorites and retweets that it receives, and that there is a statistically meaningful correlation between the number of followers that a political candidate has and the controversiality score that they receive. For our second hypothesis, the political heterogeneity of a candidate’s state was not significantly correlated with the controversy score that a tweet received. Finally, our third hypothesis demonstrated that the correlation between the incumbency status of a candidate and controversiality score is highly statistically significant.
These results largely correspond to the literature discussed above. Tweets with higher controversiality scores correlated with higher user engagement, corroborating the notion that controversial subjects receive high engagement on anonymized platforms, and candidates with more followers and non-incumbent candidates did tend to use more controversial language in their tweets. For the second hypothesis, my expectation that candidates in more politically heterogeneous states would employ more controversial tactics was not confirmed by this data, which is consistent with the notion espoused by various scholars that candidates in more politically heterogeneous states display less polarizing behavior (Hall & Thompson, 2018). Additionally, incumbency status is strongly correlated with how controversial a candidate will be on Twitter, corroborating my hypothesis that non-incumbent candidates will resort to more inflammatory language in order to gain traction in public discourse.
While there do seem to be significant correlations between the factors mentioned above, there are still some limitations to this research that can be analyzed further in future studies. For instance, while the idea of a controversiality score as a predictor of likes and retweets and as a correlate of incumbency seems to follow logically from this data, in most of the regression models the influence of the number of followers that a candidate has on the predicted outcome was either considerably larger than or at least comparable to the influence of the controversiality measure. This signifies that the size of a candidate’s platform often has a larger impact on outcomes than the controversiality measure, even when the correlation between controversiality and the other measures was statistically significant. Thus, this invites further research on the relationship between a larger platform on Twitter and higher controversiality.
An additional limitation of this study is that the vast majority of these tweets have low controversiality scores to begin with (see Figure 1 and Table 1 for the distribution of controversiality scores). Thus, future analyses that collect a larger corpus of tweets and then focus on those that score higher than the vast majority may reveal more significant insights than an analysis in which most of the data is relatively uncontroversial.
Furthermore, the Perspective API that was utilized for this study is certainly not a foolproof mechanism for measuring controversiality, and there are other methods of measuring controversiality that are potentially more effective. Though this model was fine-tuned on a particular corpus of data, the tweets that were used in this analysis may include different patterns of toxic language that are misinterpreted by or unknown to the model. Certain words may also carry a different connotation than what the model interprets. For instance, when gubernatorial candidate Josh Green tweeted out “61% killing it!” as a positive reaction to his polling percentages, his tweet scored relatively high in the threat category, despite his clearly nonviolent intent in this message. Additionally, other methods of measuring controversiality may be more accurate than exclusively using this method, as the Perspective API was designed with the intent of monitoring hateful behavior for online platforms, and thus may have more stringent requirements for assigning high controversiality scores to text. Therefore, further experimentation using methods like sentiment analysis of responses to a tweet, or another API for measuring controversiality that is not formulated specifically for social media platforms, can provide further insight.
Ultimately, a more in-depth analysis of Twitter data must be conducted to draw more decisive conclusions about the relationship between controversiality and these various factors. Whether through using a different API to quantify controversiality, analyzing previous elections rather than just looking at the most recent election, or introducing additional confounding variables, there are many ways to continue analyzing this data and generate new findings.
Banks, J. S., & Kiewiet, D. R. (1989). Explaining patterns of candidate competition in congressional elections. American Journal of Political Science, 997-1015.
Buss, D. M. (1990). The evolution of anxiety and social exclusion. Journal of Social and Clinical Psychology, 9(2), 196–201.
Chen, Z., & Berger, J. (2013). When, why, and how controversy causes conversation. Journal of Consumer Research, 40(3), 580-593.
DeSilver, D. (2022, April 22). The polarization in today’s Congress has roots that go back decades. Pew Research Center. Retrieved December 16, 2022, from https://www.pewresearch.org/fact-tank/2022/03/10/the-polarization-in-todays-congress-has-roots-that-go-back-decades/
Enli, G. (2017). Twitter as arena for the authentic outsider: exploring the social media campaigns of Trump and Clinton in the 2016 US presidential election. European Journal of Communication, 32(1), 50–61.
Epstein, R. (2022, November 8). What’s at stake in the 2022 midterm elections. Marie Claire Magazine. Retrieved December 16, 2022, from https://www.marieclaire.com/politics/why-2022-midterms-important/
Hall, A. B., & Thompson, D. M. (2018). Who punishes extremist nominees? Candidate ideology and turning out the base in US elections. American Political Science Review, 112(3), 509-524.
Johnson, T. J., & Kaye, B. K. (2014). Credibility of social network sites for political information among politically interested Internet users. Journal of Computer-mediated communication, 19(4), 957-974.
Macskassy, S., & Michelson, M. (2011). Why do people retweet? anti-homophily wins the day!. In Proceedings of the International AAAI Conference on Web and Social Media (Vol. 5, No. 1, pp. 209-216).
Matsa, K. E., & Walker, M. (2021). News Consumption Across Social Media in 2021. Pew Research.
McCarthy, N. (2021, January 11). End Of The Road For Trump’s Twitter Account [Digital image]. Retrieved December 16, 2022, from https://www.statista.com/chart/19561/total-number-of-tweets-from-donald-trump/
Morris, D. S. (2018). Twitter versus the traditional media: A survey experiment comparing public perceptions of campaign messages in the 2016 US presidential election. Social Science Computer Review, 36(4), 456-468.
National Archives and Records Administration. (2020, February 18). Remarks by President Trump Before Air Force One Departure | Joint Base Andrews, MD. National Archives and Records Administration. Retrieved December 16, 2022, from https://trumpwhitehouse.archives.gov/briefings-statements/remarks-president-trump-air-force-one-departure-joint-base-andrews-md-2/
Peddinti, S. T., Ross, K. W., & Cappos, J. (2017). User anonymity on twitter. IEEE Security & Privacy, 15(3), 84-87.
Pew Research Center. (2019). Section 2: Online and Digital News. Pew Research.
Ross, A. S., & Rivers, D. J. (2020). Donald Trump, legitimisation and a new political rhetoric. World Englishes, 39(4), 623-637.
Theye, K., & Melling, S. (2018). Total losers and bad hombres: The political incorrectness and perceived authenticity of Donald J. Trump. Southern Communication Journal, 83(5), 322-337.
Using machine learning to reduce toxicity online. Perspective. (n.d.). Retrieved December 16, 2022, from https://perspectiveapi.com/
Wang, G., Wang, B., Wang, T., Nika, A., Zheng, H., & Zhao, B. Y. (2014, November). Whispers in the dark: analysis of an anonymous social network. In Proceedings of the 2014 conference on internet measurement conference (pp. 137-150).
What swing states are and why they’re important. U.S. Embassy & Consulate in Thailand. (2020, September 16). Retrieved December 16, 2022, from https://th.usembassy.gov/swing-states-importance/
Tables 2-6: Detailed regression output for each hypothesis
## 
## Call:
## lm(formula = scale(retweet_count) ~ scale(score) + scale(party) + 
##     scale(followers_count), data = temp)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -0.893 -0.124 -0.103 -0.086 37.698 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            -5.037e-17  7.838e-03   0.000    1.000    
## scale(score)            5.972e-02  7.849e-03   7.609 2.91e-14 ***
## scale(party)            7.498e-03  7.840e-03   0.956    0.339    
## scale(followers_count)  8.015e-02  7.847e-03  10.214  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9948 on 16106 degrees of freedom
## Multiple R-squared:  0.01054,    Adjusted R-squared:  0.01035 
## F-statistic: 57.16 on 3 and 16106 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = scale(favorite_count) ~ scale(score) + scale(party) + 
##     scale(followers_count), data = temp)
## 
## Residuals:
##    Min     1Q Median     3Q    Max 
## -1.618 -0.107 -0.061 -0.020 48.197 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            -8.761e-16  7.662e-03   0.000   1.0000    
## scale(score)            5.260e-02  7.673e-03   6.855 7.38e-12 ***
## scale(party)           -2.483e-02  7.664e-03  -3.240   0.0012 ** 
## scale(followers_count)  2.238e-01  7.671e-03  29.172  < 2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9725 on 16106 degrees of freedom
## Multiple R-squared:  0.05445,    Adjusted R-squared:  0.05427 
## F-statistic: 309.2 on 3 and 16106 DF,  p-value: < 2.2e-16
## 
## Call:
## lm(formula = scale(followers_count) ~ scale(score) + scale(party), 
##     data = temp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -1.0662 -0.4120 -0.3274 -0.0768  5.3932 
## 
## Coefficients:
##                Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  -5.130e-16  7.870e-03   0.000    1.000    
## scale(score)  4.785e-02  7.872e-03   6.079 1.24e-09 ***
## scale(party)  6.934e-03  7.872e-03   0.881    0.378    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9989 on 16107 degrees of freedom
## Multiple R-squared:  0.002352,   Adjusted R-squared:  0.002228 
## F-statistic: 18.98 on 2 and 16107 DF,  p-value: 5.82e-09
## 
## Call:
## lm(formula = scale(score) ~ scale(ratio) + scale(party) + scale(followers_count), 
##     data = temp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.7430 -0.4020 -0.3286 -0.1056 15.0226 
## 
## Coefficients:
##                          Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            -3.162e-17  7.869e-03   0.000   1.0000    
## scale(ratio)            5.234e-03  8.046e-03   0.651   0.5153    
## scale(party)            2.020e-02  7.869e-03   2.567   0.0103 *  
## scale(followers_count)  4.675e-02  8.046e-03   5.810 6.37e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.9987 on 16106 degrees of freedom
## Multiple R-squared:  0.002739,   Adjusted R-squared:  0.002553 
## F-statistic: 14.75 on 3 and 16106 DF,  p-value: 1.384e-09
## 
## Call:
## lm(formula = scale(score) ~ scale(incumbent) + scale(party) + 
##     scale(followers_count), data = temp)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -0.9039 -0.4150 -0.2898 -0.1049 14.9515 
## 
## Coefficients:
##                         Estimate Std. Error t value Pr(>|t|)    
## (Intercept)            3.194e-16  7.831e-03   0.000 1.000000    
## scale(incumbent)       9.850e-02  7.930e-03  12.421  < 2e-16 ***
## scale(party)           2.691e-02  7.850e-03   3.427 0.000611 ***
## scale(followers_count) 6.174e-02  7.911e-03   7.804 6.37e-15 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.994 on 16106 degrees of freedom
## Multiple R-squared:  0.01218,    Adjusted R-squared:  0.01199 
## F-statistic: 66.17 on 3 and 16106 DF,  p-value: < 2.2e-16