Interpretable Fuzzy Systems For Forward Osmosis Desalination
Preserving interpretability in fuzzy rule-based systems (FRBS) is vital for water treatment, where decisions impact public health. While structural interpretability has been addressed using multi-objective algorithms, semantic interpretability often suffers due to fuzzy sets with low distinguishability. We propose a human-in-the-loop approach for developing interpretable FRBS to predict forward osmosis desalination productivity. Our method integrates expert-driven grid partitioning for distinguishable membership functions, domain-guided feature engineering to reduce redundancy, and rule pruning based on firing strength. This approach achieved predictive performance comparable to that of cluster-based FRBS while maintaining semantic interpretability and meeting structural complexity constraints, providing an explainable solution for water treatment applications.
💡 Research Summary
The paper addresses the critical need for interpretable fuzzy rule‑based systems (FRBS) in water‑treatment applications, where model decisions can directly affect public health. While prior work has largely focused on structural interpretability—controlling rule count and parameter size—semantic interpretability, i.e., the clarity and distinguishability of the fuzzy sets themselves, often degrades when membership functions are tuned solely for accuracy. To overcome this, the authors propose a human‑in‑the‑loop (HITL) methodology that explicitly preserves semantic interpretability while still delivering competitive predictive performance for forward‑osmosis (FO) desalination productivity.
The approach consists of four tightly coupled stages:
- Domain‑oriented Feature Engineering (DFE). Experts select a compact set of input variables (three in this study) and construct physically meaningful derived features, notably the osmotic pressure difference Δπ derived from the solution‑diffusion model. This reduces dimensionality, respects cognitive limits (the classic “7 ± 2” rule), and ensures that each variable has a clear linguistic interpretation (low, medium, high).
- Fixed Grid Partitioning (FGP). For each selected variable, a uniform grid of three Gaussian membership functions is created. Means and a shared standard deviation are computed analytically (Equations 3‑4), and the overlap factor k is set to 1, providing moderate overlap while guaranteeing three semantic properties: coverage of the entire variable range, normality (each fuzzy set attains a membership of 1 at some point), and distinguishability (limited overlap). Because the grid is defined a priori by experts, the resulting fuzzy sets remain easily interpretable.
- Inactivity Checking (IA). After the initial rule base (3³ = 27 rules) is generated, each rule’s firing strength is evaluated across the whole dataset using a product t‑norm. Rules whose normalized cumulative firing strength falls below 10⁻² are pruned. This step removes redundant rules, cuts the rule count roughly in half, and further improves semantic clarity by eliminating rarely activated rules.
- Global Consequent Estimation (GCE). With antecedent parameters fixed, the linear consequent parameters (weights and bias) are estimated via ridge‑regularized least squares. The design matrix is built from the normalized firing strengths (Equations 9‑11), and the regularization coefficient λ is tuned by grid search to minimize mean absolute error (MAE). The regularization keeps the estimate well conditioned and limits over‑fitting on the reduced rule set.
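The partitioning, pruning, and consequent-estimation stages can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: Equations 3‑4 and 9‑11 are not reproduced in this summary, so the sigma formula (neighbouring sets cross at membership 0.5 when k = 1), the definition of "normalized cumulative firing strength" (each rule's share of the total firing strength), the first-order consequent form, the λ grid, and the holdout split are all hypothetical choices. The data are synthetic and skewed, not the FO dataset.

```python
from itertools import product
import numpy as np

def grid_partition(x_min, x_max, n_sets=3, k=1.0):
    """Uniform grid of Gaussian sets: evenly spaced means, one shared sigma."""
    means = np.linspace(x_min, x_max, n_sets)
    spacing = (x_max - x_min) / (n_sets - 1)
    # Assumption: with k = 1, neighbouring sets cross at membership 0.5.
    sigma = k * spacing / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    return means, sigma

def memberships(X, partitions):
    """One (n_samples, n_sets) membership array per input feature."""
    return [np.exp(-0.5 * ((X[:, [j]] - m) / s) ** 2)
            for j, (m, s) in enumerate(partitions)]

def firing_strengths(mu):
    """Product t-norm over every combination of one fuzzy set per feature."""
    rules = list(product(*[range(m.shape[1]) for m in mu]))
    W = np.stack([np.prod([mu[j][:, r[j]] for j in range(len(mu))], axis=0)
                  for r in rules], axis=1)
    return rules, W

def prune_inactive(rules, W, threshold=1e-2):
    """Drop rules whose share of the total firing strength is below threshold."""
    activity = W.sum(axis=0) / W.sum()  # assumed normalization
    keep = activity >= threshold
    return [r for r, flag in zip(rules, keep) if flag], W[:, keep]

def design_matrix(X, W):
    """Normalized firing strengths times [x, 1] for each retained rule."""
    Wn = W / W.sum(axis=1, keepdims=True)
    X1 = np.hstack([X, np.ones((X.shape[0], 1))])
    return (Wn[:, :, None] * X1[:, None, :]).reshape(X.shape[0], -1)

def fit_ridge(Phi, y, lam):
    """Closed-form ridge solution: (Phi^T Phi + lam I)^-1 Phi^T y."""
    A = Phi.T @ Phi + lam * np.eye(Phi.shape[1])
    return np.linalg.solve(A, Phi.T @ y)

# Synthetic, skewed stand-in data (NOT the paper's dataset).
rng = np.random.default_rng(0)
X = rng.beta(2.0, 5.0, size=(300, 3))
y = X[:, 0] * X[:, 1] + 0.5 * X[:, 2] + rng.normal(0.0, 0.01, size=300)

partitions = [grid_partition(X[:, j].min(), X[:, j].max()) for j in range(3)]
rules, W = firing_strengths(memberships(X, partitions))
kept_rules, W_kept = prune_inactive(rules, W)
Phi = design_matrix(X, W_kept)

# Hypothetical lambda grid; keep the value with the lowest holdout MAE.
train, val = np.arange(0, 200), np.arange(200, 300)
best = min(
    (np.mean(np.abs(Phi[val] @ fit_ridge(Phi[train], y[train], lam) - y[val])), lam)
    for lam in (1e-4, 1e-3, 1e-2, 1e-1)
)
print(f"rules kept: {len(kept_rules)} of {len(rules)}")
print(f"best holdout MAE: {best[0]:.4f} at lambda = {best[1]}")
```

Because the antecedents are fixed before fitting, the only learned quantities are the consequent coefficients, so the whole model reduces to one linear solve per candidate λ.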
The methodology is evaluated on a real, skewed experimental dataset of FO desalination, comprising nine original sensor variables and the derived Δπ. Performance is compared against standard clustering‑based fuzzy models (e.g., fuzzy c‑means, subtractive clustering). Results show that the HITL‑FGP‑IA‑GCE model achieves an MAE virtually identical to that of the best clustering approach while delivering substantially better semantic metrics: Jaccard similarity, Gaussian‑based similarity, and Mencar’s possibility measures all indicate far greater distinguishability of the fuzzy sets. Moreover, the number of parameters is reduced by more than 40 %, and the final rule base is small enough for a plant engineer to inspect and modify manually.
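To make the distinguishability comparison concrete, the sketch below computes the standard fuzzy Jaccard index (area of the pointwise minimum divided by area of the pointwise maximum) for two pairs of Gaussian sets. It is an illustration only: the exact similarity formulas used in the paper are not given in this summary, and the set parameters here are hypothetical. The first pair mimics neighbouring sets on a three-set grid over [0, 1]; the second mimics two nearly coincident sets of the kind an accuracy-driven clustering step can produce.

```python
import numpy as np

def gaussian_mf(x, mean, sigma):
    """Gaussian membership function evaluated on the sampling grid x."""
    return np.exp(-0.5 * ((x - mean) / sigma) ** 2)

def jaccard_similarity(mu_a, mu_b):
    """Fuzzy Jaccard index on a shared uniform grid: sum(min) / sum(max)."""
    return np.minimum(mu_a, mu_b).sum() / np.maximum(mu_a, mu_b).sum()

x = np.linspace(0.0, 1.0, 1001)

# Neighbouring grid sets (means 0.0 and 0.5) that cross at membership 0.5.
sigma = 0.5 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
grid_pair = jaccard_similarity(gaussian_mf(x, 0.0, sigma),
                               gaussian_mf(x, 0.5, sigma))

# Hypothetical, heavily overlapping sets with nearly identical means.
overlapping_pair = jaccard_similarity(gaussian_mf(x, 0.40, 0.3),
                                      gaussian_mf(x, 0.45, 0.3))

print(f"grid pair similarity:        {grid_pair:.3f}")
print(f"overlapping pair similarity: {overlapping_pair:.3f}")
```

Lower similarity means the two linguistic terms are easier to tell apart; a value near 1 indicates that two labels describe almost the same region of the variable.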
The authors acknowledge two main limitations. First, grid partitioning suffers from the curse of dimensionality; the current study sidesteps this by limiting the feature set to three variables, but real‑world plants may require many more sensors. Second, fixing Gaussian shapes may restrict expressive power in highly nonlinear regions. Future work is suggested to explore adaptive grid schemes, alternative membership shapes (e.g., trapezoidal, beta), and interactive visual tools that allow experts to refine partitions and rule thresholds in real time.
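The dimensionality limitation is easy to quantify: a full grid with m fuzzy sets per input and n inputs produces m^n candidate rules before pruning. The short sketch below (the function name is illustrative) shows the growth for three sets per input, from the 27 rules of the three-variable study to 59,049 rules if all nine sensor variables plus Δπ were partitioned.

```python
def grid_rule_count(n_inputs, n_sets=3):
    """Number of candidate rules in a full grid partition: n_sets ** n_inputs."""
    return n_sets ** n_inputs

for n_inputs in (3, 5, 10):
    print(f"{n_inputs} inputs -> {grid_rule_count(n_inputs)} candidate rules")
```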
In summary, the paper demonstrates that a carefully designed human‑in‑the‑loop fuzzy system can simultaneously satisfy structural and semantic interpretability constraints while delivering predictive accuracy comparable to fully data‑driven clustering methods. This balance makes the approach highly attractive for safety‑critical domains such as water treatment, and the proposed framework is readily extensible to other industrial processes where transparent decision‑making is essential.