Implementing a Sharia Chatbot as a Consultation Medium for Questions About Islam

This research presents a Sharia-compliant chatbot that serves as an interactive consultation medium for questions about Islam. The system combines Reinforcement Learning (Q-Learning) with Sentence-Transformers semantic embeddings to return contextually relevant answers. Following the CRISP-DM methodology, it draws on a curated Islam QA dataset of 25,000 question-answer pairs from authentic sources such as the Qur’an, Hadith, and scholarly fatwas, stored as JSON for flexibility and scalability. The prototype, built with a Flask API backend and a Flutter mobile frontend, achieves 87% semantic accuracy in functional testing across fiqh, aqidah, ibadah, and muamalah, which suggests it can support religious literacy, digital da’wah, and access to verified Islamic knowledge in the Industry 4.0 era. The chatbot is effective for closed-domain queries but is limited by static learning and dependence on its dataset; continuous adaptation and multi-turn conversation support are identified as future enhancements. The work positions the chatbot as a bridge between traditional Islamic scholarship and modern AI-driven consultation.

💡 Research Summary

This paper presents the design, implementation, and evaluation of a Sharia‑compliant chatbot that serves as an interactive consultation medium for Islamic questions. Guided by the CRISP‑DM methodology, the authors first assembled a curated dataset of 25,000 question‑answer pairs drawn from authentic sources such as the Qur’an, canonical Hadith collections, and scholarly fatwas issued by recognized institutions (e.g., Al‑Azhar, Dar al‑Ifta). Each entry is stored in a flexible JSON schema that includes the question text, answer text, source reference, Islamic school of thought, and topical category (fiqh, aqidah, ibadah, muamalah).
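The record layout described above can be sketched as a small Python data class. This is only an illustration: the summary lists the fields but not their key names, so `question`, `answer`, `source_reference`, `school_of_thought`, and `category` are assumed names, and `QAEntry`, `load_entries`, and `dump_entries` are hypothetical helpers.

```python
import json
from dataclasses import dataclass, asdict

# Topical categories named in the summary.
CATEGORIES = {"fiqh", "aqidah", "ibadah", "muamalah"}


@dataclass
class QAEntry:
    """One question-answer record; field names are assumed, not taken from the paper."""
    question: str
    answer: str
    source_reference: str
    school_of_thought: str
    category: str

    def __post_init__(self):
        # Reject records whose category is outside the four listed topics.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category!r}")


def load_entries(json_text):
    """Parse a JSON array of records into validated QAEntry objects."""
    return [QAEntry(**record) for record in json.loads(json_text)]


def dump_entries(entries):
    """Serialize entries back to JSON, keeping non-ASCII text readable."""
    return json.dumps([asdict(e) for e in entries], ensure_ascii=False, indent=2)
```

A record would then be a JSON object with those five keys, for example one whose `category` is `"fiqh"` and whose other fields hold the question, the answer, and its citation.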

For the core AI engine, the system combines semantic embedding with reinforcement learning. A BERT‑based Sentence‑Transformers model generates 768‑dimensional embeddings for both queries and candidate answers, capturing contextual nuances and lexical variations. On top of this embedding space, a Q‑Learning agent learns a policy that selects the answer with the highest expected reward. The state is defined by the query embedding, actions correspond to candidate answer indices, and the reward signal is a weighted blend of expert‑validated correctness scores and end‑user satisfaction ratings collected through the mobile interface. The Q‑Table is initialized with the cosine similarity between query and answer embeddings and refined over 10,000 episodes with an ε‑greedy exploration strategy (learning rate α = 0.1, discount factor γ = 0.9).
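The training loop described above might look roughly like the following tabular sketch. Several details are assumptions because the summary does not specify them: states are indexed by training-query number rather than by the raw embedding, each episode ends after a single answer (so the discounted bootstrap term is zero unless a next state is supplied), `EPSILON = 0.1` and the 0.7 expert weight are placeholder values, and `reward_fn` stands in for the expert and user feedback.

```python
import math
import random

ALPHA, GAMMA = 0.1, 0.9  # learning rate and discount factor reported in the summary
EPSILON = 0.1            # exploration rate: assumed, not given in the summary


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def init_q_table(query_vecs, answer_vecs):
    """Seed Q[s][a] with the cosine similarity of query s and answer a."""
    return [[cosine(q, a) for a in answer_vecs] for q in query_vecs]


def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else pick the best answer."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    return max(range(len(q_row)), key=q_row.__getitem__)


def blended_reward(expert_score, user_rating, w_expert=0.7):
    """Weighted blend of expert correctness and user satisfaction, both in [0, 1]."""
    return w_expert * expert_score + (1.0 - w_expert) * user_rating


def q_update(q, s, a, reward, next_row=None):
    """One tabular Q-learning step; next_row=None marks a terminal transition."""
    bootstrap = GAMMA * max(next_row) if next_row else 0.0
    q[s][a] += ALPHA * (reward + bootstrap - q[s][a])


def train(q, reward_fn, episodes, seed=0):
    """Run single-step episodes: sample a query, pick an answer, update Q."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = rng.randrange(len(q))
        a = choose_action(q[s], EPSILON, rng)
        q_update(q, s, a, reward_fn(s, a))
    return q
```

Seeding the table with cosine similarity means the agent starts out behaving like a plain nearest-answer retriever, and the feedback-driven updates only move it away from that baseline where the reward disagrees with raw similarity.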

The system architecture consists of a Flask‑based RESTful API and a Flutter mobile front‑end. When a user submits a question, the API computes its embedding, queries the Q‑Table, and returns the top‑ranked answer together with provenance metadata. The mobile app presents the response in a card layout, lets users mark a response as “useful”, and optionally records voice input. Communication is secured with HTTPS and JWT authentication, while an encrypted PostgreSQL database stores both the religious content and user interaction logs.
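The request flow (embed, look up the Q‑Table, return the answer with provenance) can be illustrated with a framework-free function that a Flask route could wrap. This is a sketch under assumptions: the summary does not say how an unseen query is mapped to a Q‑Table row, so the nearest stored training query is used here, and `handle_question`, `train_vecs`, and the response keys are hypothetical names. The query is assumed to be embedded already.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def handle_question(query_vec, train_vecs, q_table, answers):
    """Return the top-ranked answer and its provenance for an embedded query."""
    # Assumption: an unseen query reuses the Q-table row of its nearest training query.
    state = max(range(len(train_vecs)),
                key=lambda i: cosine(query_vec, train_vecs[i]))
    # Greedy action: the candidate answer with the highest learned Q-value.
    action = max(range(len(answers)), key=q_table[state].__getitem__)
    entry = answers[action]
    return {
        "answer": entry["answer"],
        "provenance": {
            "source_reference": entry["source_reference"],
            "category": entry["category"],
        },
        "q_value": q_table[state][action],
    }
```

Returning the Q-value alongside the answer is a design choice of this sketch; it would let a client hide or flag low-confidence responses.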

The evaluation used a held‑out test set of 5,000 unseen questions spanning the four topical categories. Metrics included Top‑1 accuracy, text‑overlap scores (F1, BLEU), and a 5‑point user satisfaction rating. The prototype achieved an overall semantic accuracy of 87%, with category‑specific results of 91% for fiqh, 86% for muamalah, 84% for ibadah, and 82% for aqidah. Error analysis showed that ambiguous terms and highly abstract theological queries were the main sources of incorrect answers, pointing to limitations in the current embedding granularity and reward formulation.
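Overall and per-category Top‑1 accuracy could be computed along these lines. The summary does not define how a prediction is judged correct, so this sketch assumes an exact match between the predicted and the expected answer index; `top1_accuracy` is a hypothetical helper.

```python
from collections import defaultdict


def top1_accuracy(predicted, expected, categories):
    """Return (overall accuracy, per-category accuracy) for parallel lists."""
    hits, totals = defaultdict(int), defaultdict(int)
    for pred, gold, cat in zip(predicted, expected, categories):
        totals[cat] += 1
        hits[cat] += int(pred == gold)  # assumed criterion: exact index match
    per_category = {cat: hits[cat] / totals[cat] for cat in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_category
```

Because the overall figure is weighted by category size, it need not equal the plain average of the four category scores. The unweighted mean of 91, 86, 84, and 82 is 85.75, so the reported 87 overall would be consistent with a test set in which fiqh questions are over-represented.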

The authors acknowledge several constraints. First, the learning process is static; after the initial offline training, the model does not continuously adapt to new fatwas or evolving scholarly discourse. Second, the dataset is skewed toward Sunni sources, potentially marginalizing alternative schools of thought such as Shi’a or Sufi perspectives. Third, the chatbot operates in a single‑turn mode, lacking the ability to manage multi‑turn dialogues that are often required for nuanced religious counseling.

Future work is outlined along four dimensions: (1) implementing an online reinforcement‑learning pipeline that ingests real‑time user feedback and expert re‑annotations to update the Q‑Table continuously; (2) integrating a Transformer‑based dialogue manager (e.g., DialoGPT) to enable multi‑turn conversational flows; (3) expanding multilingual support (Arabic, English, Korean, Indonesian) and incorporating a broader spectrum of theological viewpoints; and (4) establishing an ethical oversight layer where a panel of scholars validates generated answers before they reach end users, thereby ensuring doctrinal correctness and accountability.

In conclusion, the study demonstrates that a carefully engineered combination of semantic embeddings and reinforcement learning can deliver a functional, Sharia‑aligned chatbot with respectable accuracy in a closed‑domain setting. While the prototype is limited by static learning, data bias, and single‑turn interaction, it offers a promising proof‑of‑concept for digital da’wah and religious literacy in the Industry 4.0 era. With the proposed enhancements, such a system could become a reliable, scalable, and ethically sound conduit for Islamic knowledge, bridging traditional scholarship and modern AI‑driven consultation.