SentiFuse: Deep Multi-model Fusion Framework for Robust Sentiment Extraction

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Sentiment analysis models exhibit complementary strengths, yet existing approaches lack a unified framework for effective integration. We present SentiFuse, a flexible, model-agnostic framework that integrates heterogeneous sentiment models through a standardization layer and multiple fusion strategies. Our approach supports decision-level fusion, feature-level fusion, and adaptive fusion, enabling systematic combination of diverse models. Experiments on three large-scale social-media datasets (Crowdflower, GoEmotions, and Sentiment140) show that SentiFuse consistently outperforms individual models and naive ensembles. Feature-level fusion is the most effective overall, yielding up to 4% absolute improvement in F1 score over both the best individual model and simple averaging, while adaptive fusion improves robustness on challenging cases such as negation, mixed emotions, and complex sentiment expressions. These results demonstrate that systematically leveraging model complementarity yields more accurate and reliable sentiment analysis across diverse datasets and text types.


💡 Research Summary

The paper introduces SentiFuse, a model‑agnostic framework for integrating heterogeneous sentiment analysis systems. Recognizing that existing ensembles are often simplistic and ignore differences in output formats, the authors first propose a standardization layer that converts probabilities, scores (‑1 to 1), and logits into a unified probability distribution over sentiment classes. This enables seamless combination of lexicon‑based methods (e.g., VADER), pattern‑based approaches, traditional machine‑learning classifiers (TF‑IDF + SVM/XGBoost), and deep contextual encoders (BERT, RoBERTa, DistilBERT) without architectural changes.
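The standardization step described above can be sketched as follows. This is a minimal, illustrative implementation (not the paper's exact code): the function names and the piecewise mapping from a [-1, 1] compound score to a three-class distribution are assumptions for illustration.

```python
import numpy as np

def softmax(logits):
    """Convert raw class logits to a probability distribution."""
    z = np.asarray(logits, dtype=float)
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def standardize(output, kind):
    """Map a model's native output to a (neg, neu, pos) distribution.

    kind: 'proba'  -> already a distribution; renormalize defensively
          'score'  -> scalar in [-1, 1] (e.g., a VADER-style compound score)
          'logits' -> raw class logits (e.g., from a transformer head)
    """
    if kind == "proba":
        p = np.asarray(output, dtype=float)
        return p / p.sum()
    if kind == "score":
        s = float(output)
        # Hypothetical piecewise mapping: probability mass shifts from
        # neutral toward the polarity indicated by the sign of the score.
        pos = max(s, 0.0)
        neg = max(-s, 0.0)
        neu = 1.0 - abs(s)
        return np.array([neg, neu, pos])
    if kind == "logits":
        return softmax(output)
    raise ValueError(f"unknown output kind: {kind}")
```

With every model emitting a distribution in the same class order, downstream fusion code never needs to know which family a model came from.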

Three fusion strategies are defined: (1) decision‑level fusion, which computes a weighted average of standardized probabilities; (2) feature‑level fusion, which concatenates model‑specific feature vectors (lexicon scores, engineered n‑gram features, transformer CLS embeddings) and trains a meta‑classifier (logistic regression with L2 regularization); and (3) adaptive fusion, which dynamically re‑weights model contributions based on textual characteristics such as presence of negation, length, and emotional complexity.
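The first and third strategies can be illustrated with a short sketch. The negation rule below is a toy stand-in for the paper's adaptive weighting (the actual rules also consider length and emotional complexity), and all function names are hypothetical:

```python
import numpy as np

def decision_fusion(dists, weights=None):
    """Decision-level fusion: weighted average of standardized
    per-model distributions.

    dists: (n_models, n_classes) array of probability distributions.
    """
    d = np.asarray(dists, dtype=float)
    if weights is None:
        weights = np.ones(len(d))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()          # normalize so the result stays a distribution
    return w @ d             # (n_classes,) fused distribution

def adaptive_weights(text, base_weights, lexicon_idx, transformer_idx):
    """Toy adaptive re-weighting: boost the contextual encoder on
    negated text, where lexicon methods are known to struggle."""
    w = np.array(base_weights, dtype=float)
    if any(tok in text.lower().split() for tok in ("not", "never", "n't")):
        w[transformer_idx] *= 2.0
        w[lexicon_idx] *= 0.5
    return w / w.sum()
```

Feature-level fusion differs only in what is combined: instead of averaging output distributions, the per-model feature vectors are concatenated and passed to an L2-regularized logistic regression meta-classifier.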

Experiments are conducted on three large‑scale social‑media datasets: Crowdflower (14.6 k airline tweets), GoEmotions (211 k Reddit posts mapped to positive/negative/neutral), and Sentiment140 (1.6 M tweets). Baselines include individual models, simple averaging, confidence‑weighted averaging, majority voting, median averaging, and max‑confidence selection.

Results show that systematic fusion consistently outperforms naive ensembles. Feature‑level fusion achieves the highest accuracy and F1 across all datasets (up to 90.71 % accuracy and 77.89 % F1), improving over the best single model by up to 4 % absolute F1. ROC‑AUC and PR‑AUC also peak for feature‑level fusion (0.856 and 0.857 on Sentiment140). Decision‑level fusion provides modest gains over simple averaging, while adaptive fusion excels on challenging cases such as negation, mixed emotions, and short texts, demonstrating increased robustness.

The authors discuss limitations: potential calibration errors in the standardization step, risk of overfitting in the meta‑classifier when training data are scarce, and the reliance of adaptive weighting rules on domain knowledge. Future work is suggested in three areas: (i) applying temperature scaling or other calibration methods; (ii) exploring non‑linear meta‑learners (e.g., gradient boosting, shallow neural nets); and (iii) developing fully learned adaptive weighting networks for multilingual and cross‑domain scenarios.
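As a sense of what the proposed calibration fix might look like, here is a minimal temperature-scaling sketch (not from the paper): a single scalar T is fit on held-out logits by minimizing negative log-likelihood, here via a simple grid search.

```python
import numpy as np

def nll(logits, labels, T):
    """Mean negative log-likelihood of temperature-scaled logits."""
    z = np.asarray(logits, dtype=float) / T
    z = z - z.max(axis=1, keepdims=True)           # numerical stability
    logp = z - np.log(np.exp(z).sum(axis=1, keepdims=True))
    return -logp[np.arange(len(labels)), labels].mean()

def fit_temperature(logits, labels, grid=np.linspace(0.25, 4.0, 151)):
    """Pick the temperature minimizing held-out NLL (grid search)."""
    losses = [nll(logits, labels, T) for T in grid]
    return float(grid[int(np.argmin(losses))])
```

A fitted T > 1 softens overconfident models before standardization, while T < 1 sharpens underconfident ones; either way the argmax prediction is unchanged, only the probabilities fed into fusion.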

Overall, SentiFuse offers a flexible, multi‑level fusion architecture that can be readily applied to sentiment analysis and other text classification tasks, demonstrating that systematic exploitation of model complementarity yields more accurate and reliable predictions.

