Roman Urdu Opinion Mining System (RUOMiS)

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Convincing a customer is a challenging task in any business, and it becomes even harder online. Online retailers try everything possible to gain customers' trust, and one common solution is to provide an area where existing users can leave comments. This can build trust effectively; however, customers often comment on the product in their native language written in Roman script. When there are hundreds of comments, even native speakers find it difficult to reach a buying decision. This research proposes a system that extracts comments posted in Roman Urdu, translates them, determines their polarity, and then produces a rating for the product. This rating helps both native and non-native customers make buying decisions efficiently from comments posted in Roman Urdu.


💡 Research Summary

The paper presents RUOMiS (Roman Urdu Opinion Mining System), an end‑to‑end framework designed to extract product reviews written in Roman Urdu (Urdu language transcribed with the Latin alphabet), translate them into English, determine their sentiment polarity, and generate a visual rating for the product. The motivation stems from the fact that many e‑commerce sites in Pakistan and India allow customers to post comments in Roman Urdu, which are unintelligible to non‑Urdu speakers and even to Urdu speakers who prefer the Arabic script. By mining these comments, the authors aim to help both native and non‑native customers make informed buying decisions.

System Architecture
RUOMiS consists of four major stages:

  1. Crawling – A web crawler fetches all user comments from product pages on the mobile‑shopping portal whatmobile.com. The comments are stored temporarily in a local database.

  2. Translation – Each comment is sent to the Microsoft Bing Translator API, which returns an English translation. The authors rely entirely on this external service; no quality assessment of the translation is reported.

  3. POS Tagging & Opinion Word Extraction – The translated English sentences are processed with SharpNLP (a C# port of the OpenNLP tools). Tokenization, sentence segmentation, and part‑of‑speech tagging are performed, and all words tagged as adjectives (JJ) are extracted.

  4. Sentiment Classification & Rating – A manually built lexicon containing 200 positive and negative adjectives is used to label each adjective as positive or negative. If a sentence contains at least one adjective that matches the lexicon, the sentence is classified as positive or negative accordingly; otherwise it is marked neutral. The counts of positive, negative, and neutral sentences are then visualized as pie or bar charts to provide a product rating.
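
The last two stages can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation (which uses SharpNLP in C#): the sentences are assumed to arrive already POS-tagged with Penn Treebank tags, the tiny `POSITIVE` and `NEGATIVE` sets are hypothetical stand-ins for the paper's 200-adjective lexicon, and the tie-breaking rule for sentences with both positive and negative adjectives is an assumption, since the summary does not describe that case.

```python
from collections import Counter

# Hypothetical mini-lexicon; the paper's 200-adjective list is not reproduced here.
POSITIVE = {"good", "excellent", "great", "nice"}
NEGATIVE = {"bad", "poor", "slow", "weak"}


def classify_sentence(tagged_tokens):
    """Label one POS-tagged sentence as 'positive', 'negative', or 'neutral'.

    tagged_tokens is a list of (word, tag) pairs; only words tagged JJ are considered.
    """
    adjectives = [word.lower() for word, tag in tagged_tokens if tag == "JJ"]
    pos_hits = sum(1 for word in adjectives if word in POSITIVE)
    neg_hits = sum(1 for word in adjectives if word in NEGATIVE)
    if pos_hits == 0 and neg_hits == 0:
        return "neutral"  # no adjective matched the lexicon
    # Assumption: mixed sentences go to the majority polarity, ties to positive.
    return "positive" if pos_hits >= neg_hits else "negative"


def rate_product(tagged_sentences):
    """Return the share of positive, negative, and neutral sentences."""
    counts = Counter(classify_sentence(s) for s in tagged_sentences)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: counts[label] / total
            for label in ("positive", "negative", "neutral")}


sentences = [
    [("battery", "NN"), ("is", "VBZ"), ("good", "JJ")],
    [("camera", "NN"), ("is", "VBZ"), ("poor", "JJ")],
    [("what", "WP"), ("is", "VBZ"), ("the", "DT"), ("price", "NN")],
]
print(rate_product(sentences))
```

The resulting shares are the numbers a pie or bar chart would display as the product rating.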

Experimental Setup
Three mobile phone models were selected as test cases. For each model, 540 comments were collected, yielding a total of 1,620 comments. Manual annotation was performed to obtain the ground‑truth distribution of sentiment: 120 positive, 71 negative, and 1,429 neutral comments. RUOMiS produced 527 positive, 177 negative, and 916 neutral classifications.

Evaluation Metrics
The authors compute precision, recall, and F‑measure based on a four‑way confusion matrix (TP, FP, FN, TN). They report a precision of 0.271, recall of 1.0, and F‑measure of 0.427. The perfect recall indicates that every comment manually labeled positive or negative was also flagged by the system, but the low precision reveals that a large number of neutral comments were incorrectly labeled as positive or negative. By the counts reported above, the system flagged 704 comments as positive or negative against 191 in the ground truth, which implies about 513 misclassified neutral comments, roughly 32 % of the total.
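
As a quick check, the reported figures can be reproduced from the counts in the experimental setup. The sketch below assumes a binary reading of the evaluation, "opinionated" (positive or negative) versus neutral, and assumes that a recall of 1.0 means all 191 manually labeled opinionated comments were flagged. That reading matches the reported numbers, but the summary does not confirm it is exactly how the authors built their confusion matrix.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F-measure from raw counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts taken from the experimental setup.
actual_opinionated = 120 + 71        # manually labeled positive + negative
predicted_opinionated = 527 + 177    # system-labeled positive + negative

# Assumption: recall of 1.0 means every actual opinionated comment was flagged.
tp = actual_opinionated
fn = 0
fp = predicted_opinionated - tp      # neutral comments flagged as opinionated

p, r, f = precision_recall_f1(tp, fp, fn)
print(round(p, 3), round(r, 3), round(f, 3))  # 0.271 1.0 0.427
```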

Analysis of Errors
The main source of misclassification is the presence of adjectives in condition‑related phrases such as “excellent condition” or “good condition” within comments that are actually advertisements or trade offers and should be neutral. Because these adjectives appear in the manually built lexicon as positive terms, the system mistakenly treats the whole comment as a positive opinion about the product. The authors note that about 80 % of neutral comments are unrelated advertisements or requests, which further inflates the false‑positive rate.

Strengths

  • The paper tackles a low‑resource language problem that has received little attention in sentiment analysis research.
  • The end‑to‑end pipeline (crawling → translation → POS tagging → sentiment → rating) is clearly described and implemented with readily available tools.
  • On the test set, the recall of 100 % indicates that the system did not miss any comment manually labeled positive or negative, which is valuable for ensuring that critical feedback is not ignored.

Weaknesses and Areas for Improvement

  1. Reliance on Machine Translation – Roman Urdu lacks standardized spelling, and automatic translation can introduce substantial noise. No evaluation of translation quality is provided, making it difficult to assess its impact on downstream sentiment analysis.
  2. Small Lexicon – A 200‑word adjective list is insufficient to capture the rich variety of expressions in product reviews, especially domain‑specific slang or idioms. Expanding the lexicon using existing English sentiment resources (e.g., SentiWordNet, VADER) or automatically extracting sentiment terms from a larger corpus would likely improve precision.
  3. Context‑Blind Polarity Assignment – The system classifies a sentence solely based on the presence of a matching adjective, ignoring negation (“not good”), intensifiers, or multi‑word expressions. This rule‑based approach is prone to errors in real‑world text. Incorporating syntactic patterns, dependency parsing, or machine‑learning classifiers (SVM, logistic regression, or transformer‑based models such as BERT) would enable more nuanced sentiment detection.
  4. Handling of Non‑Product Comments – Advertisements, trade offers, and spam constitute a large portion of the data. A pre‑filtering step (e.g., keyword‑based spam detection or a separate classifier) could remove these from the sentiment pipeline, reducing false positives.
  5. Evaluation Scope – The study uses only three products from a single website, limiting the generalizability of the findings. A broader evaluation across multiple domains (e.g., electronics, clothing, services) and larger datasets would provide stronger evidence of robustness.
  6. Metric Diversity – Reporting only precision, recall, and F‑measure for the binary (positive/negative) case overlooks the performance on the neutral class, which is the majority. Including accuracy, macro‑averaged F1, and confusion matrices for all three classes would give a more complete picture.
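
Two of the suggestions above, negation handling (item 3) and pre‑filtering of advertisements (item 4), can be illustrated with a small sketch. Everything in it is hypothetical: the word sets, the advertisement cue phrases in `AD_PATTERN`, and the two-token negation window are illustrative choices, not part of RUOMiS. For brevity it also matches lexicon words directly in the raw text rather than restricting to POS-tagged adjectives.

```python
import re

# Hypothetical word lists, for illustration only.
POSITIVE = {"good", "excellent", "great"}
NEGATIVE = {"bad", "poor", "slow"}
NEGATORS = {"not", "no", "never"}

# Hypothetical cue phrases that suggest a sale or trade post rather than a review.
AD_PATTERN = re.compile(
    r"\b(for sale|want to sell|contact me|call me|exchange)\b", re.IGNORECASE
)


def is_advertisement(comment):
    """Keyword-based pre-filter for sale and trade offers."""
    return AD_PATTERN.search(comment) is not None


def polarity(comment, window=2):
    """Lexicon polarity with a simple negation flip.

    A lexicon word is inverted when a negator appears within `window`
    tokens before it. Advertisements are returned as 'neutral' up front.
    """
    if is_advertisement(comment):
        return "neutral"
    tokens = re.findall(r"[a-z']+", comment.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = 1 if token in POSITIVE else -1 if token in NEGATIVE else 0
        if value and any(t in NEGATORS for t in tokens[max(0, i - window):i]):
            value = -value  # "not good" counts as negative
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for text in (
    "The camera is not good",
    "Battery life is excellent",
    "Phone in excellent condition for sale, contact me",
):
    print(text, "->", polarity(text))
```

With the filter in place, the "excellent condition" advertisement no longer counts as a positive opinion, which is the error pattern the authors identify as the main source of false positives. A fixed keyword list is brittle, so a trained classifier would be the more robust option for real data.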

Future Directions
The authors propose making the sentiment lexicon publicly available and allowing crowd‑sourced updates, but they also acknowledge that the current SQL‑based storage does not capture semantic relationships. Integrating WordNet or other semantic networks could enrich the lexicon with synonymy and antonymy. Moreover, moving from a purely rule‑based system to a hybrid approach that combines lexical resources with supervised learning (trained on a manually annotated Roman Urdu corpus) would likely raise precision while preserving high recall.

Conclusion
RUOMiS represents an important first step toward leveraging user‑generated content in Roman Urdu for e‑commerce decision support. The system extracts and translates reviews and achieves perfect recall on its test set, but its low precision limits practical usefulness. The error analysis attributes this mainly to non‑product comments whose adjectives match the lexicon; unevaluated translation quality and a limited adjective lexicon are further likely contributors. Addressing these shortcomings through better translation handling, larger and semantically aware sentiment resources, context‑sensitive classification, and more comprehensive evaluation would substantially improve the system’s reliability and make it a valuable tool for both native and non‑native customers in the South Asian market.

