Sentiment Analysis

(Approx. 3 papers are still to be added in the coming days.)

Here is a literature note on selected recent advances in sentiment analysis techniques. The papers, published between 2024 and 2025 to date, all involve the application and/or discussion of LLMs in sentiment analysis tasks.

Taxonomy of Sentiment Analysis Techniques

Here I combine the taxonomies adopted in [2] and [6] to cover all of the sentiment analysis techniques mentioned.

Lexicon-based Methods
- Possible to measure gradations in sentiment
- More intrinsically suited to the types of questions social scientists often ask, for example:
  - Trends over time in average sentiment across large quantities of texts
  - Whether one group of texts is more negative than another

Traditional Machine Learning Models (e.g., Naive Bayes, Support Vector Machines (SVMs))

Deep Learning Models (e.g., Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs))
- Can learn hierarchical and sequential patterns in text, and are more suitable for sentiment analysis on informal social media platforms
- Appealing for domain-specific applications
- Performance degrades rapidly when methods are applied to other domains
- Often require large labeled datasets and computational resources

Transformer-based Architectures (e.g., BERT, RoBERTa, and GPT)
- Can address challenges specific to Twitter, including brevity, mixed sentiments, and multilingual content
- Have high computational cost and require domain-specific adaptation
- Have built-in bias even without domain-specific fine-tuning. Even explicitly de-biased LLMs continue to reflect pernicious biases.

Supervised/Fine-tuned LLMs
- Pros: Require fewer labeled samples
- Cons:
  - May suffer from "catastrophic forgetting", in which fine-tuning a general model for the sentiment analysis task risks losing many of the strengths of the pre-trained model
  - There are often unexpected (and unnoticed) pitfalls in trying to keep the training, validation, and application steps of the machine learning model separate, which may invalidate findings

Compared with supervised methods, unsupervised methods (i.e., lexicon-based methods and unsupervised learning methods) are more domain-independent. ...
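
To make the lexicon-based idea above concrete, here is a minimal sketch of how such a method can produce graded scores and compare the average sentiment of two groups of texts. Everything in it is an assumption made for illustration: the tiny `TOY_LEXICON`, its valence values, the helper names `score_text` and `group_average`, and the example sentences are hypothetical and are not taken from the reviewed papers. Real lexicons are much larger and usually handle negation and intensifiers, which this sketch ignores.

```python
from statistics import mean

# Hypothetical toy lexicon (word -> valence in [-1, 1]); illustration only.
TOY_LEXICON = {
    "excellent": 1.0,
    "good": 0.5,
    "fine": 0.25,
    "bad": -0.5,
    "terrible": -1.0,
}


def score_text(text, lexicon=TOY_LEXICON):
    """Return the mean valence of the lexicon words found in `text`.

    Texts with no lexicon words get a neutral score of 0.0.
    """
    # Naive tokenization: split on whitespace, strip common punctuation.
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    hits = [lexicon[t] for t in tokens if t in lexicon]
    return mean(hits) if hits else 0.0


def group_average(texts, lexicon=TOY_LEXICON):
    """Average the per-text scores over a group of texts."""
    return mean(score_text(t, lexicon) for t in texts)


# Two hypothetical groups of texts to compare.
group_a = ["The service was excellent.", "Good food, fine prices."]
group_b = ["Terrible wait times.", "The food was bad, the room was fine."]

avg_a = group_average(group_a)
avg_b = group_average(group_b)

print("group A average:", avg_a)
print("group B average:", avg_b)
print("group B more negative than group A:", avg_b < avg_a)
```

Because the score is a continuous average rather than a class label, the same `group_average` call could be applied to texts bucketed by month or year to trace a trend over time. No training data is involved, which is one way to read the claim that lexicon-based methods are more domain-independent.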