Interval Fisher's Discriminant Analysis and Visualisation

Notice: This research summary and analysis were automatically generated using AI technology. For absolute accuracy, please refer to the original arXiv source.

In Data Science, entities are typically represented by single-valued measurements. Symbolic Data Analysis extends this framework to more complex structures, such as intervals and histograms, that express internal variability. We propose an extension of multiclass Fisher’s Discriminant Analysis to interval-valued data, using Moore’s interval arithmetic and the Mallows’ distance. Fisher’s objective function is generalised to consider simultaneously the contributions of the centres and the ranges of intervals and is numerically maximised. The resulting discriminant directions are then used to classify interval-valued observations. To support visual assessment, we adapt the class map, originally introduced for conventional data, to classifiers that assign labels through minimum distance rules. We also extend the silhouette plot to this setting and use stacked mosaic plots to complement the visual display of class assignments. Together, these graphical tools provide insight into classifier performance and the strength of class membership. Applications to real datasets illustrate the proposed methodology and demonstrate its value in interpreting classification results for interval-valued data.


💡 Research Summary

The paper addresses a notable gap in the analysis of symbolic data: the lack of a principled, multivariate discriminant method that works directly on interval‑valued observations without collapsing them to point estimates or exploding dimensionality. Building on Fisher’s Linear Discriminant Analysis (LDA), the authors propose Interval Fisher’s Discriminant Analysis (IFDA), a framework that simultaneously exploits the centre and the range of each interval. The key technical ingredients are (i) the use of Moore’s interval arithmetic to define linear combinations of intervals, (ii) the adoption of the Mallows (L2‑Wasserstein) distance as a natural metric between intervals, and (iii) a decomposition of total inertia into within‑class and between‑class components that respects the interval structure.
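To make ingredient (i) concrete, the following minimal sketch shows how a linear combination of intervals behaves under Moore's interval arithmetic: the centre of the result is the ordinary linear combination of the centres, while the half-range is the combination of the half-ranges weighted by the absolute values of the coefficients. The function name `interval_linear_combination` and its argument layout are hypothetical choices made for illustration; they are not taken from the paper.

```python
def interval_linear_combination(lower, upper, weights):
    """Return sum_j weights[j] * [lower[j], upper[j]] under Moore's arithmetic.

    Illustrative helper (hypothetical name, not from the paper).
    """
    if not (len(lower) == len(upper) == len(weights)):
        raise ValueError("lower, upper and weights must have the same length")

    # Re-express every interval by its centre and half-range.
    centres = [(lo + up) / 2.0 for lo, up in zip(lower, upper)]
    half_ranges = [(up - lo) / 2.0 for lo, up in zip(lower, upper)]

    # Centres combine linearly; half-ranges combine with |weight|,
    # because multiplying an interval by a negative number flips it.
    new_centre = sum(w * c for w, c in zip(weights, centres))
    new_half = sum(abs(w) * r for w, r in zip(weights, half_ranges))
    return new_centre - new_half, new_centre + new_half


# 2 * [1, 3] + (-1) * [2, 6] = [2, 6] + [-6, -2] = [-4, 4]
combined = interval_linear_combination([1.0, 2.0], [3.0, 6.0], [2.0, -1.0])
print(combined)
```

Working with centres and half-ranges rather than endpoints avoids having to sort the endpoints after multiplying by a negative coefficient.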

The Mallows distance is expressed in closed form under the assumption that the latent micro‑data variables U_i are symmetric and share a common distribution. In this case the squared distance reduces to a weighted Euclidean form: ‖c₁‑c₂‖² + δ‖r₁‑r₂‖², where c denotes the centre vector, r the range vector, and δ = Var
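The weighted Euclidean form quoted above can be sketched in a few lines. In this sketch δ is treated as a user-supplied positive weight, since its value depends on the assumed latent distribution; the value 0.5 used in the example is a placeholder and not a value from the paper. The function name `squared_interval_distance` is likewise hypothetical.

```python
def squared_interval_distance(c1, r1, c2, r2, delta):
    """Squared distance ||c1 - c2||^2 + delta * ||r1 - r2||^2.

    c1, c2: centre vectors; r1, r2: range vectors (same convention for both).
    delta: positive weight on the range term (assumed to be supplied by the caller).
    """
    if not (len(c1) == len(r1) == len(c2) == len(r2)):
        raise ValueError("all vectors must have the same length")
    if delta <= 0:
        raise ValueError("delta must be positive")

    centre_term = sum((a - b) ** 2 for a, b in zip(c1, c2))
    range_term = sum((a - b) ** 2 for a, b in zip(r1, r2))
    return centre_term + delta * range_term


# Two observations described by two interval-valued variables each.
d2 = squared_interval_distance(
    c1=[0.0, 1.0], r1=[1.0, 2.0],
    c2=[3.0, 5.0], r2=[1.0, 4.0],
    delta=0.5,  # placeholder weight, chosen only for illustration
)
print(d2)
```

Two observations with identical centres but different ranges still have a positive distance, which is how the range information enters the discriminant analysis alongside the centres.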

