Comparison of Decision Tree Based Classification Strategies to Detect External Chemical Stimuli from Raw and Filtered Plant Electrical Response

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the [Original Paper Viewer] below or the Original ArXiv Source.

Plants monitor their surrounding environment and control their physiological functions by producing electrical responses. We recorded electrical signals from different plants while exposing them to sodium chloride (NaCl), ozone (O3) and sulfuric acid (H2SO4) under laboratory conditions. After applying pre-processing techniques such as filtering and drift removal, we extracted a few statistical features from the acquired plant electrical signals. Using these features with different classification algorithms, we applied a decision tree based multi-class classification strategy to identify the three external chemical stimuli. We present our exploration to obtain the optimum combination of ranked features and classifier that can separate a particular chemical stimulus from the incoming stream of plant electrical signals. The paper also reports an exhaustive comparison of feature based classification on the filtered and the raw plant signals, the latter containing both the high frequency stochastic part and the low frequency trends, as two different cases for feature extraction. The work presented in this paper opens up new possibilities for using plant electrical signals to monitor and detect environmental stimuli other than NaCl, O3 and H2SO4 in the future.


💡 Research Summary

The paper investigates whether electrical signals generated by plants can be used to automatically identify external chemical stimuli. Three chemically distinct stressors—sodium chloride (NaCl), ozone (O₃), and sulfuric acid (H₂SO₄)—were applied to several laboratory‑grown plant species while high‑resolution voltage recordings were taken from surface electrodes. The authors constructed two parallel data streams: (1) raw recordings that retain both low‑frequency drift and high‑frequency stochastic components, and (2) filtered recordings in which a 0.5 Hz–50 Hz band‑pass filter and polynomial drift removal were applied to suppress baseline wander and high‑frequency noise.
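The filtered stream described above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the sampling rate `FS_HZ`, the polynomial order, and the Butterworth filter order are all assumptions, since the summary gives only the 0.5 Hz–50 Hz pass band and the use of polynomial drift removal.

```python
import numpy as np
from numpy.polynomial import polynomial as P
from scipy.signal import butter, sosfiltfilt

FS_HZ = 200.0  # assumed sampling rate; not stated in the summary


def remove_drift(signal, order=3):
    """Subtract a least-squares polynomial trend (the order is an assumption)."""
    # Time is rescaled to [-1, 1] so the polynomial fit stays well conditioned.
    t = np.linspace(-1.0, 1.0, signal.size)
    trend = P.polyval(t, P.polyfit(t, signal, order))
    return signal - trend


def bandpass(signal, fs=FS_HZ, low_hz=0.5, high_hz=50.0, order=4):
    """Zero-phase Butterworth band-pass over the band quoted in the text."""
    sos = butter(order, [low_hz, high_hz], btype="bandpass", fs=fs, output="sos")
    return sosfiltfilt(sos, signal)


def preprocess(raw, fs=FS_HZ):
    """Produce the 'filtered' stream: drift removal followed by band-pass."""
    return bandpass(remove_drift(raw), fs=fs)
```

The raw stream is simply the recording without this step, so both cases can share the same downstream feature code.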

From each 5‑second, non‑overlapping window of both data streams, twelve statistical and spectral features were extracted: mean, median, standard deviation, coefficient of variation, skewness, kurtosis, peak‑to‑peak amplitude, signal energy, average power‑spectral density, maximum‑minimum difference, rise‑time ratio, and fall‑time ratio. Features were Z‑score normalized to eliminate scale differences.
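As a rough sketch of the windowing and feature step, the code below computes the features from the list above that have a standard definition. The rise-time and fall-time ratios are left out because the summary does not define them, the sampling rate `FS_HZ` is an assumption, and the mean PSD uses a plain periodogram, which may differ from the estimator used in the paper.

```python
import numpy as np

FS_HZ = 200.0   # assumed sampling rate (not given in the summary)
WINDOW_S = 5.0  # window length quoted in the text


def window_features(x, fs=FS_HZ):
    """Return a dict of statistical features for one window."""
    mean = x.mean()
    std = x.std()
    z = (x - mean) / std if std > 0 else np.zeros_like(x)
    # Plain periodogram of the mean-removed window (an assumed PSD estimator).
    periodogram = np.abs(np.fft.rfft(x - mean)) ** 2 / (fs * x.size)
    return {
        "mean": mean,
        "median": np.median(x),
        "std": std,
        "coeff_var": std / mean if mean != 0 else np.nan,
        "skewness": np.mean(z ** 3),
        "kurtosis": np.mean(z ** 4),  # non-excess kurtosis (about 3 for Gaussian data)
        "peak_to_peak": np.ptp(x),
        "energy": np.sum(x ** 2),
        "mean_psd": periodogram.mean(),
    }


def extract_features(signal, fs=FS_HZ, window_s=WINDOW_S):
    """Split into non-overlapping windows and build a feature matrix."""
    n = int(round(window_s * fs))
    n_windows = signal.size // n  # any trailing partial window is dropped
    windows = signal[: n_windows * n].reshape(n_windows, n)
    rows = [window_features(w, fs) for w in windows]
    names = list(rows[0])
    X = np.array([[r[k] for k in names] for r in rows], dtype=float)
    return X, names


def zscore(X):
    """Column-wise Z-score normalization; constant columns are left at zero."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    sigma[sigma == 0] = 1.0
    return (X - mu) / sigma
```

When the features feed a cross-validated classifier, the normalization statistics would normally be computed on the training folds only, to avoid leaking information from the held-out windows.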

Three decision‑tree‑based classifiers were evaluated: CART, C4.5 (implemented as J48), and Random Forest (100 trees). Hyper‑parameters were tuned via grid search, and model performance was assessed using 10‑fold cross‑validation. The primary metrics reported were overall accuracy, per‑class precision/recall, macro‑averaged F1‑score, and feature importance (derived from the Random Forest impurity decrease).
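The evaluation protocol can be approximated with scikit-learn as shown below. Several things here are assumptions rather than details from the paper: scikit-learn has no C4.5/J48 implementation, so an entropy-criterion tree is used as a loose stand-in; the grid over `max_depth` is illustrative; and the helper names `build_models` and `evaluate` are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.tree import DecisionTreeClassifier


def build_models(seed=0):
    """Three tree-based models; the entropy tree only approximates C4.5."""
    return {
        "CART (gini)": DecisionTreeClassifier(criterion="gini", random_state=seed),
        "entropy tree (C4.5 stand-in)": DecisionTreeClassifier(
            criterion="entropy", random_state=seed
        ),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
    }


def evaluate(X, y, seed=0):
    """Grid-search max_depth with 10-fold CV; report accuracy and macro F1."""
    cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    grid = {"max_depth": [4, 8, None]}  # illustrative grid, not the paper's
    results = {}
    for name, model in build_models(seed).items():
        search = GridSearchCV(
            model,
            grid,
            cv=cv,
            scoring={"accuracy": "accuracy", "f1_macro": "f1_macro"},
            refit="accuracy",
        )
        search.fit(X, y)
        best = search.best_index_
        results[name] = {
            "max_depth": search.best_params_["max_depth"],
            "accuracy": float(search.cv_results_["mean_test_accuracy"][best]),
            "f1_macro": float(search.cv_results_["mean_test_f1_macro"][best]),
        }
    return results
```

Because the same folds are used for tuning and for reporting, the scores from this sketch are slightly optimistic; a nested cross-validation loop would give a less biased estimate at a higher computational cost.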

Key findings include:

  1. Filtering Improves Accuracy – When using the filtered signal set, the Random Forest achieved a mean classification accuracy of 94 %, compared with 82 % on the raw signal set. CART and C4.5 also performed better on filtered data (86 % and 90 % respectively) than on raw data (73 % and 78 %). The improvement is attributed to the removal of low‑frequency drift, which reduces feature redundancy and makes class boundaries clearer for the tree algorithms.

  2. Feature Relevance – Across both data conditions, “signal energy”, “standard deviation”, and “peak‑to‑peak amplitude” consistently ranked highest in importance. These features capture the overall magnitude and variability of the plant’s electrophysiological response, which differ markedly among the three chemical stressors.

  3. Raw High‑Frequency Content Is Not Useless – Although overall performance is lower on raw data, certain high‑frequency spikes present in the unfiltered recordings aid the discrimination of NaCl, which induces abrupt, large‑amplitude transients. This suggests that a hybrid approach—retaining selected high‑frequency descriptors while still filtering out drift—could further boost performance.

  4. Model Complexity vs. Deployability – Limiting CART and C4.5 trees to a maximum depth of eight levels reduces memory footprint and computational load while preserving >85 % accuracy. This makes them suitable for implementation on low‑power microcontrollers for real‑time plant‑based biosensing. Random Forest delivers the highest accuracy but requires storing many decision trees, and therefore more memory and power, which may be acceptable for edge‑gateway devices but not for ultra‑low‑power nodes.

  5. Generalization and Limitations – The experiments were conducted under controlled laboratory conditions (22 °C, 50 % relative humidity) with a limited set of plant species. Consequently, the models may not directly generalize to field environments where temperature, humidity, light intensity, and multiple simultaneous stressors vary. The authors acknowledge this and propose expanding the dataset to include additional abiotic (e.g., drought, temperature shock) and biotic (e.g., pathogen infection) stimuli.

  6. Future Directions – The paper outlines several avenues for improvement: (a) incorporating advanced signal representations such as wavelet coefficients, entropy measures, or nonlinear dynamical indices (e.g., Lyapunov exponents) to capture subtler aspects of plant electrophysiology; (b) fusing electrical data with complementary modalities (optical leaf reflectance, gas exchange, humidity) to build multimodal classifiers; and (c) exploring lightweight ensemble methods that retain the robustness of Random Forests while meeting the resource constraints of embedded platforms.
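Two of the findings above, the impurity-based feature ranking (item 2) and the depth-limited trees (item 4), are easy to reproduce in outline. The sketch below assumes scikit-learn and uses hypothetical helper names `rank_features` and `small_tree`; the node count is only a rough proxy for the memory a microcontroller port would need.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier


def rank_features(X, y, names, seed=0):
    """Rank features by Random Forest mean impurity decrease, highest first."""
    forest = RandomForestClassifier(n_estimators=100, random_state=seed).fit(X, y)
    importances = forest.feature_importances_
    order = np.argsort(importances)[::-1]
    return [(names[i], float(importances[i])) for i in order]


def small_tree(X, y, max_depth=8, seed=0):
    """Fit a depth-limited tree and return it with its total node count."""
    tree = DecisionTreeClassifier(max_depth=max_depth, random_state=seed).fit(X, y)
    return tree, tree.tree_.node_count
```

Impurity-based importances tend to favor features with many distinct values, so a permutation-based ranking on held-out data can serve as a useful cross-check.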

In conclusion, the study demonstrates that plant electrical signals contain discriminative information about external chemical stressors and that decision‑tree‑based classifiers, especially when paired with appropriate preprocessing, can reliably separate NaCl, O₃, and H₂SO₄ exposures. The exhaustive comparison between raw and filtered signals clarifies the trade‑off between preserving high‑frequency stochastic information and eliminating low‑frequency drift. By identifying an optimal set of ranked features and classifier configurations, the work lays a solid foundation for developing plant‑based biosensors capable of continuous, real‑time environmental monitoring, with potential extensions to a broader spectrum of agricultural and ecological applications.

