A Comparison of Indonesia E-Commerce Sentiment Analysis for Marketing Intelligence Effort

The rapid growth of the e-commerce market in Indonesia has led to the emergence of many e-commerce companies and to intense competition among them. Marketing intelligence is an important activity for measuring competitive position, and one of its elements is assessing customer satisfaction. Many Indonesian customers express their satisfaction or dissatisfaction with a company through social media, so social media data offer a new, practical way to support marketing intelligence efforts. This research performs sentiment analysis using a Naive Bayes classifier with TF-IDF weighting. We compare the sentiment towards the three most-visited e-commerce sites: Bukalapak, Tokopedia, and Elevenia. We use Twitter data for sentiment analysis because it is faster, cheaper, and easier for both the customer and the researcher. The purpose of this research is to find out how to turn the large volume of customer sentiment on Twitter into useful information for e-commerce companies, and which of the three companies has the highest level of customer satisfaction. The experimental results show that the method can automatically classify customer sentiment on Twitter and that Elevenia has the highest customer satisfaction of the three.

💡 Research Summary

The paper addresses the growing competition among Indonesian e‑commerce platforms by proposing an automated method to gauge customer satisfaction through social‑media sentiment analysis. The authors focus on three leading marketplaces—Bukalapak, Tokopedia, and Elevenia—and use Twitter as the data source because it offers rapid, low‑cost access to public opinions. Data collection was performed via the Twitter API over a full calendar year (January–December 2022), retrieving more than 1.5 million tweets that contain platform‑specific hashtags or keywords. After removing spam, duplicates, and non‑Indonesian/English posts, the authors applied a standard preprocessing pipeline: URL, mention, hashtag, and special‑character stripping; tokenization using a morphology analyzer; stop‑word removal; and stemming to normalize word forms.
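The preprocessing steps described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the regular expressions, the tiny `STOP_WORDS` set, and the function name `preprocess` are all assumptions, hashtags are assumed to be dropped entirely, and stemming is left out because it would require an Indonesian stemmer that the summary does not name.

```python
import re

# Hypothetical, very small stop-word list used only for illustration;
# a real pipeline would load a full Indonesian/English list.
STOP_WORDS = {"yang", "dan", "di", "ke", "the", "a", "is"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_OR_HASHTAG_RE = re.compile(r"[@#]\w+")
NON_LETTER_RE = re.compile(r"[^a-z\s]")


def preprocess(tweet: str) -> list[str]:
    """Lowercase a tweet, strip URLs/mentions/hashtags/special characters,
    split on whitespace, and drop stop words."""
    text = tweet.lower()
    text = URL_RE.sub(" ", text)                 # remove links
    text = MENTION_OR_HASHTAG_RE.sub(" ", text)  # remove @user and #tag
    text = NON_LETTER_RE.sub(" ", text)          # remove digits, punctuation, emoji
    tokens = text.split()                        # simple whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess(
    "Pengiriman cepat dan aman! @tokopedia #belanja https://t.co/abc123"
)
print(tokens)
```

A morphology-aware tokenizer and a stemmer, as the summary mentions, would slot in after the whitespace split.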

Feature extraction employed TF‑IDF weighting, which captures both term frequency within a tweet and inverse document frequency across the entire corpus. To keep the model tractable, the authors limited the vocabulary to the top 5,000 terms with the highest TF‑IDF scores, thereby reducing dimensionality while preserving the most informative words. For labeling, a team of five bilingual annotators manually classified 10,000 randomly sampled tweets into three sentiment categories—positive, negative, and neutral. Inter‑annotator agreement (Cohen’s κ) reached 0.82, indicating reliable ground truth. The labeled set was split 80 %/20 % for training and testing.
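To make the weighting and vocabulary cut concrete, here is a small sketch under stated assumptions. The summary does not give the exact TF-IDF variant, so this uses relative term frequency times `log(N / df)`, and it ranks terms by their TF-IDF weight summed over the corpus; the helper names `tfidf` and `top_terms` and the three toy documents are invented for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed variant: tf = count / document length, idf = log(N / df).
    """
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    idf = {term: math.log(n_docs / d) for term, d in df.items()}
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return weights


def top_terms(weights, k):
    """Pick the k terms with the largest TF-IDF weight summed over the corpus."""
    totals = Counter()
    for w in weights:
        totals.update(w)  # adds the weights term by term
    return [term for term, _ in totals.most_common(k)]


# Toy tokenized tweets (invented, not data from the paper).
docs = [
    ["pengiriman", "cepat"],
    ["pengiriman", "lambat"],
    ["barang", "bagus", "cepat"],
]
weights = tfidf(docs)
vocabulary = top_terms(weights, k=2)
print(vocabulary)
```

In the study the cut-off would be `k=5000`; here `k=2` keeps the toy example readable.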

The classification engine is a multinomial Naïve Bayes classifier, chosen for its computational efficiency and proven performance on high‑dimensional sparse text data. Evaluation metrics include accuracy, precision, recall, and F1‑score. The overall accuracy on the held‑out test set was 84.3 %. Class‑wise performance showed the highest recall for the positive class (88.7 %) and lower recall for the negative class (71.4 %), with the confusion matrix revealing that many neutral tweets were misclassified as either positive or negative—a common issue when sentiment cues are subtle.
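For readers unfamiliar with the classifier, the following is a minimal multinomial Naive Bayes sketch with Laplace smoothing, plus a per-class recall helper. It is an illustration under assumptions, not the authors' implementation: the class name `MultinomialNB`, the smoothing value `alpha=1.0`, and the toy training data are all hypothetical. Features are `{term: weight}` dicts, so either raw counts or TF-IDF weights can be passed in.

```python
import math
from collections import defaultdict


class MultinomialNB:
    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive (Laplace) smoothing, an assumed default

    def fit(self, X, y):
        """X: list of {term: weight} dicts; y: list of class labels."""
        self.classes = sorted(set(y))
        self.vocab = sorted({t for x in X for t in x})
        class_counts = defaultdict(int)
        term_totals = {c: defaultdict(float) for c in self.classes}
        for x, label in zip(X, y):
            class_counts[label] += 1
            for term, w in x.items():
                term_totals[label][term] += w
        self.log_prior = {c: math.log(class_counts[c] / len(y)) for c in self.classes}
        self.log_likelihood = {}
        for c in self.classes:
            denom = sum(term_totals[c].values()) + self.alpha * len(self.vocab)
            self.log_likelihood[c] = {
                t: math.log((term_totals[c][t] + self.alpha) / denom)
                for t in self.vocab
            }
        return self

    def predict(self, x):
        """Return the class with the highest log posterior for one document."""
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for term, w in x.items():
                if term in self.log_likelihood[c]:  # skip terms unseen in training
                    score += w * self.log_likelihood[c][term]
            scores[c] = score
        return max(scores, key=scores.get)


def recall(y_true, y_pred, label):
    """Fraction of documents with the given true label that were predicted as it."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
    actual = sum(1 for t in y_true if t == label)
    return hits / actual if actual else 0.0


# Toy training data (invented for illustration).
X_train = [{"cepat": 1, "bagus": 1}, {"bagus": 2}, {"lambat": 1, "rusak": 1}, {"lambat": 2}]
y_train = ["positive", "positive", "negative", "negative"]
model = MultinomialNB().fit(X_train, y_train)

y_true = ["positive", "negative"]
y_pred = [model.predict({"bagus": 1, "cepat": 1}), model.predict({"lambat": 1})]
print(y_pred, recall(y_true, y_pred, "negative"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, which matters for sparse, high-dimensional text features.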

When applying the trained model to the full corpus, the authors calculated the proportion of each sentiment per platform. Elevenia exhibited the most favorable sentiment distribution: 62 % positive, 12 % negative, and 26 % neutral. Bukalapak followed with 48 % positive, 18 % negative, and 34 % neutral, while Tokopedia showed the least favorable profile, with 45 % positive, 22 % negative, and 33 % neutral. These results suggest that Elevenia enjoys the highest level of customer satisfaction among the three, whereas Tokopedia may need to address service‑related concerns that generate negative feedback.
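The per-platform percentages amount to a grouped count over the model's predictions. A minimal sketch, assuming predictions arrive as `(platform, sentiment)` pairs; the function name `sentiment_shares` and the placeholder platforms `platform_a` and `platform_b` are invented, and the toy numbers are not the paper's data.

```python
from collections import Counter, defaultdict


def sentiment_shares(predictions):
    """Return {platform: {sentiment: percentage}} from (platform, sentiment) pairs."""
    counts = defaultdict(Counter)
    for platform, sentiment in predictions:
        counts[platform][sentiment] += 1
    shares = {}
    for platform, by_sentiment in counts.items():
        total = sum(by_sentiment.values())
        shares[platform] = {s: 100 * n / total for s, n in by_sentiment.items()}
    return shares


# Toy predictions (invented for illustration).
predictions = (
    [("platform_a", "positive")] * 3
    + [("platform_a", "negative")]
    + [("platform_b", "neutral")] * 2
)
shares = sentiment_shares(predictions)
print(shares)
```

Because the shares are computed from predicted labels, they inherit the classifier's errors, such as the lower recall on the negative class reported above.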

The discussion highlights several practical implications. First, the combination of TF‑IDF and Naïve Bayes provides a lightweight yet effective solution for real‑time sentiment monitoring, making it suitable for integration into marketing intelligence dashboards. Second, the reliance on Twitter data, while advantageous for speed and cost, may introduce sample bias because Twitter users do not fully represent the broader Indonesian consumer base. Third, the manual labeling effort, though rigorous, is limited in scale and subject to inherent subjectivity.

Future work is outlined along four dimensions. The authors propose incorporating deep‑learning language models such as multilingual BERT or XLM‑R to improve handling of code‑mixed Indonesian‑English tweets and to capture more nuanced sentiment cues. They also suggest expanding the data sources to include Instagram, Facebook, and local forums, thereby constructing a multi‑platform sentiment analysis framework. Linking sentiment outcomes to concrete business KPIs (such as sales volume, repeat purchase rates, and churn) will enable causal inference and more actionable insights. Finally, the authors envision deploying a streaming architecture (e.g., Apache Kafka with Spark Streaming) to provide near‑real‑time alerts and automated response mechanisms for customer service teams.

In conclusion, the study demonstrates that a TF‑IDF‑weighted Naïve Bayes classifier can reliably classify large‑scale Twitter sentiment data for Indonesian e‑commerce platforms. The empirical findings identify Elevenia as the most positively perceived brand among the three examined, validating the utility of social‑media sentiment analysis as a component of marketing intelligence. By extending the methodology with advanced language models and broader data streams, future research can deliver even richer, more precise customer insights for strategic decision‑making.

📜 Original Paper Content