Since 2016, President Donald J. Trump has left a remarkable impression on American life, transforming the relationship between American politics and expression. His cavalier attitude toward the conventions of political professionalism has reverberated through dinner-table discussions across the country and the globe, generating the publicity and noise that carried him to the Presidency. Through Twitter, President Trump capitalized on the breadth of social media and the connectedness, or perhaps sensitivity, of American consumers and journalists, who wove his words into an endless cycle of clamor, anger, excitement, and, importantly, novelty. This project endeavors to explain the Trumpian influence on the American language, focusing on a corpus of New York Times articles (an influenced and influential publisher) to map the similarities between the pre-Trump and post-Trump eras. Specifically, it aims to answer the question: how has the language of the New York Times changed from President Barack Obama to President Donald Trump, as measured by sentiment and context? Using New York Times headlines from President Obama's term (2012-2016) and President Trump's term (2016-2020), I perform a logit-lasso regression and a cosine similarity analysis, and employ pretrained semantic models from Hugging Face to derive comparative results.
I use two datasets: a corpus of headlines mentioning President Barack Obama (2012-2016) and a corpus of headlines mentioning President Donald Trump (2016-2020). Both derive from a Kaggle dataset of New York Times headlines from 2010-2021, which I preprocessed into publication-date and headline columns. The headlines were stripped of stopwords and split into an Obama Presidency headline dataset (2012-2016) and a Trump Presidency headline dataset (2016-2020), based on the publication year and whether the headline contained the president's name. Headlines were extracted by indexing into each article's main headline and appending it to a list, yielding a 1-D array of headlines for each president. The preprocessed data is then ready to be vectorized under a bag-of-words approach or embedded for alternative models.
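A minimal sketch of this preprocessing step follows; the file name and the pub_date/headline column names are assumptions and may differ from the actual Kaggle export.

```python
import pandas as pd
from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS

# Load the Kaggle export; file and column names are assumptions.
df = pd.read_csv("nyt_headlines_2010_2021.csv", usecols=["pub_date", "headline"])
df["year"] = pd.to_datetime(df["pub_date"]).dt.year

def clean(headline: str) -> str:
    """Lowercase a headline and drop common English stopwords."""
    return " ".join(t for t in headline.lower().split() if t not in ENGLISH_STOP_WORDS)

# Split by term and by whether the sitting president is named in the headline.
obama = df[df["year"].between(2012, 2016) & df["headline"].str.contains("Obama", case=False, na=False)]
trump = df[df["year"].between(2016, 2020) & df["headline"].str.contains("Trump", case=False, na=False)]

obama_headlines = obama["headline"].map(clean).tolist()  # 1-D list of strings
trump_headlines = trump["headline"].map(clean).tolist()
```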
There were a few notable observations during preprocessing that may influence the modeling. The Trump headline dataset was almost three times larger than the Obama headline dataset; this disparity in frequency may already indicate an emphasis on Trump's actions. To stay true to the purpose of this endeavor, I also created duplicates of each president's headlines with the presidents' names removed. These duplicates are what I vectorize and pass into the logit-lasso model, since I want to understand the language surrounding these headlines and its semantic meaning, rather than the names themselves.
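Continuing the sketch above, the name-stripped duplicates can be built with a simple regular expression; the exact list of removed names is an assumption.

```python
import re

# Strip the presidents' names so the model learns from surrounding language only.
name_pattern = re.compile(r"\b(obama|barack|trump|donald)\b", flags=re.IGNORECASE)

obama_nameless = [name_pattern.sub("", h).strip() for h in obama_headlines]
trump_nameless = [name_pattern.sub("", h).strip() for h in trump_headlines]
```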
With the cleaned datasets, I merge the Obama and Trump lists into a single corpus and apply a bag-of-words representation by counting common words. The binary labels are Obama (0) and Trump (1). The vectorization deployed is TF-IDF (Term Frequency-Inverse Document Frequency): term frequency counts how often each word appears in a headline, and inverse document frequency down-weights words that appear across many headlines in the aggregated corpus, so that distinctive words carry more weight. Passing this TF-IDF matrix into the logit-lasso model, I search for the optimal hyperparameters to properly fit the model and derive the words most strongly associated with Trump and with Obama, both within their respective periods and across both presidencies. Positive coefficients indicate stronger association with President Trump, while negative coefficients indicate stronger association with President Obama.
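A sketch of this step with scikit-learn, continuing from the preprocessing above; the cross-validation settings are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegressionCV

# Merge the name-stripped corpora; label Obama headlines 0 and Trump headlines 1.
corpus = obama_nameless + trump_nameless
labels = np.array([0] * len(obama_nameless) + [1] * len(trump_nameless))

vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(corpus)

# L1-penalized (lasso) logistic regression with cross-validated regularization strength.
model = LogisticRegressionCV(penalty="l1", solver="liblinear", Cs=10, cv=5)
model.fit(X, labels)

# Positive coefficients -> Trump-associated words; negative -> Obama-associated.
coefs = model.coef_.ravel()
terms = np.array(vectorizer.get_feature_names_out())
print("Trump-associated:", terms[np.argsort(coefs)[-10:]][::-1])
print("Obama-associated:", terms[np.argsort(coefs)[:10]])
```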
Additionally, I can find similar vectors through the cosine similarity of the headlines and return the pairs with the highest scores. The cosine similarity analysis uses a sentence transformer to encode the Obama and Trump datasets respectively, then maps any two vectors (one from the Obama corpus, one from the Trump corpus) to a similarity score based on their direction. I compare the mean similarity across the two periods to gauge the degree of difference in how Trump and Obama were described in American culture. This measure mostly captures differences in writing style from the pre-Trump era to the post-Trump era, examining the syntactic and logical flow of headlines.
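A sketch of the similarity computation with the sentence-transformers library; the checkpoint name here is an assumption (the project's actual encoder was a MobileBERT sentence transformer, described below).

```python
from sentence_transformers import SentenceTransformer, util

# Encode each corpus, then compare every Obama headline against every Trump headline.
encoder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed checkpoint
obama_emb = encoder.encode(obama_headlines, convert_to_tensor=True)
trump_emb = encoder.encode(trump_headlines, convert_to_tensor=True)

sim_matrix = util.cos_sim(obama_emb, trump_emb)  # shape: (n_obama, n_trump)
print("mean cross-period similarity:", sim_matrix.mean().item())
```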
Other pretrained semantic models from sources such as Hugging Face were effective at deriving a semantic categorization of the headlines. I researched the implementation of BERT models (a family of transformers prominent in NLP) to obtain comparable results from more nuanced approaches. Two kinds of transformers were deployed: a MobileBERT sentence transformer and a RoBERTa sentiment transformer. The difference lies in their training: the former is trained for the general classification of sentences, while the latter focuses on the sentiment of tweets. The sentence transformer encoded the Obama and Trump datasets respectively for the cosine similarity component; the TF-IDF vectorization of the aggregate data, by contrast, did not lead to an explainable table. The sentiment transformer offers another way of reading the difference in the language of the New York Times; I passed it each dataset, and Figures 2 and 3 demonstrate the results.
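A sketch of the sentiment step via the transformers pipeline; the checkpoint name is an assumption based on the description of a tweet-trained RoBERTa (the Cardiff NLP model outputs LABEL_0 = negative, LABEL_1 = neutral, LABEL_2 = positive).

```python
from transformers import pipeline

# Tweet-trained RoBERTa sentiment classifier; the checkpoint name is an assumption.
classifier = pipeline("sentiment-analysis",
                      model="cardiffnlp/twitter-roberta-base-sentiment")

for headline in trump_headlines[:10]:
    result = classifier(headline)[0]
    print(f"{result['label']} ({result['score']:.2f})  {headline}")
```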
The logit-lasso regression attained a relatively strong accuracy of 85% in predicting the correct Trump-versus-Obama labels. The model's recall was 95%, meaning it detected 95% of all actual positive cases (i.e., 95% of all Trump headlines), which may be attributable to the fundamental size disparity between the Trump and Obama datasets. The precision on Trump headlines was lower, at 88%, meaning the model naturally over-predicted the number of Trump headlines.
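These figures can be reproduced with a held-out split; the 80/20 split below is an assumption about the evaluation setup.

```python
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

# Hold out 20% of headlines for evaluation (the split ratio is an assumption).
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.2, random_state=0)
model.fit(X_train, y_train)
print(classification_report(y_test, model.predict(X_test), target_names=["Obama", "Trump"]))
```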
As noted, positive coefficients mark the words most strongly associated with the Trump corpus, and negative coefficients mark the words most strongly associated with the Obama corpus. Figures 4, 5, and 6 expand on these findings, noting primarily the significance of the controversial issues that contextualize the respective presidencies. Figure 4 remains the most prominent demonstration of the semantic meaning of the pre-Trump and post-Trump eras. The pre-Trump era comments on President Obama's principal policy of Obamacare (-30.282) and then proceeds to other challenges during his presidency: ISIS (-20.696) and Ebola (-16.063). The post-Trump era, interestingly, comments not so much on the President's actions in office as on the uncertainty surrounding it: words like "Mueller" (12.487), "impeachment" (16.203), and "voters" (11.919) challenge his authority and integrity as President. These findings are not particularly polarizing, but they do shed light on the anxiety the New York Times may project onto President Trump.
The cosine similarity module did not offer significant texture in describing the language of the New York Times. As demonstrated in Figure 1, the matrix mostly captures the contextual significance of headlines, seemingly pairing an Obama policy with a Trump reaction. Unfortunately, the table is not particularly moving evidence and does not delineate any profound insight. In retrospect, the vectorized aggregate dataset should have been further preprocessed, perhaps reduced to the controversial words in each headline, which could have given greater depth to the language of the New York Times from the pre-Trump to the post-Trump era; a sketch of that refinement follows.
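A sketch of the suggested refinement, with a purely illustrative word list.

```python
# Restrict each headline to a hand-picked vocabulary of charged terms before
# encoding; the word list here is purely illustrative.
controversial = {"impeachment", "mueller", "isis", "ebola", "obamacare",
                 "investigation", "scandal", "voters"}

def reduce_to_controversial(headline: str) -> str:
    return " ".join(t for t in headline.lower().split() if t in controversial)

obama_reduced = [reduce_to_controversial(h) for h in obama_headlines]
trump_reduced = [reduce_to_controversial(h) for h in trump_headlines]
```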
The RoBERTa model remarked on the neutrality of the New York Times headlines across both periods. In both the pre-Trump and post-Trump periods, with ten headlines sampled randomly, at least 80% of the headlines received Label 1 (neutral). The model's training on categorizing tweets likely influenced this outcome: headlines are consistently brief and more ordered, and rarely carry the intense political charge of a tweet, so the threshold for assigning a non-neutral label should perhaps be lowered (a sketch of that adjustment follows). Unfortunately, no available sentiment model had been trained on headlines, and we must decide how much to value the neutrality of New York Times headlines across the Obama and Trump administrations.
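One way to lower the threshold, assuming the Cardiff NLP label scheme from the earlier sketch; the 0.7 cutoff is an arbitrary illustration.

```python
def relabel(headline: str, neutral_cutoff: float = 0.7) -> str:
    """Fall back to the strongest non-neutral label when neutrality is weak."""
    scores = classifier(headline, top_k=None)  # scores for all three labels
    best = max(scores, key=lambda s: s["score"])
    if best["label"] == "LABEL_1" and best["score"] < neutral_cutoff:
        non_neutral = [s for s in scores if s["label"] != "LABEL_1"]
        return max(non_neutral, key=lambda s: s["score"])["label"]
    return best["label"]
```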
The logit-lasso model seems to offer the most precision in detecting the differing focus of New York Times headlines, because it points to explicit words. The contextual significance of these words ("Mueller", "impeachment", and even "Biden"), all of which challenge Trump's legitimacy, should be the primary takeaway. The Obama headlines carry no such charged words connoting an attack on his legitimacy or welcoming a changing of the guard.
The weight of the presidential voice remains incredibly significant to the discourse and education of people around the world, and the reaction of media sources matters here, shaping household discussions as well as informing the public about national matters. The strong association of Trump with words that are controversial and subversive to his Presidency should serve as an interesting reflection of the New York Times' standing. The Trump headlines were not more polarizing nor particularly more aggressive than Obama's, but the recurring subjects ("impeachment", "investigation") mark a shift away from reporting the Trump administration's policies toward self-conscious conjecture over whether Trump should remain in office. These events did occur: Trump was impeached and investigated. But the high frequency of these challenges, and even of "Biden", demonstrates something other than a neutral rundown of policy. The New York Times holds an obligation to report significant news, and while the paper rarely errs in its language, the framing of these headlines nonetheless reads as polarizing.
My analysis suggests that the language used in New York Times headlines during the Obama and Trump administrations was not significantly more polarizing or aggressive in one period versus the other. However, the focus on controversial issues and the conjecture about Trump's presidency indicate a shift in reporting style. My findings underscore the importance of responsible journalism and unbiased reporting in shaping public discourse and perceptions, expanding on the role of media in a democratic society and encouraging further research in this area.