Method of fractal diversity in data science problems
A signal-to-noise ratio (SNR) parameter is derived that distinguishes the Gaussian function, which describes the distribution of random variables in the absence of cross-correlation, from other functions, making it possible to describe collective states with strong cross-correlation of data. The SNR is defined in one-dimensional space, and a calculation algorithm based on the fractal variety of the Cantor dust in a closed loop is given. The algorithm is invariant under linear transformations of the initial data set, has renormalization-group invariance, and determines the intensity of cross-correlation (the collective effect) in the data. The description of the collective state is universal and does not depend on the nature of the data correlation, just as the distribution of random variables is universal in the absence of data correlation. The method is applicable to large sets of non-Gaussian or strange data obtained in information technology. In support of Koshland's hypothesis, applying the method to the intensity data of digital X-ray diffraction spectra and calculating the collective effect makes it possible to identify a conformer exhibiting biological activity.
💡 Research Summary
The paper introduces a novel methodology for quantifying cross-correlation (collective effects) in data sets that deviate from Gaussian statistics by exploiting the fractal properties of Cantor-dust-like structures. Traditional signal-to-noise ratio (SNR) measures assume independent, Gaussian-distributed noise and therefore fail to capture the intensity of inter-variable correlations that are common in many scientific and engineering data streams. To overcome this limitation, the authors map a one-dimensional data series onto a closed loop that mimics the self-similar geometry of a Cantor dust. Each data point is encoded as a binary string, and the occupancy of intervals at a resolution ε is counted. From this occupancy, the fractal (Hausdorff) dimension D is computed.
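As a rough illustration of counting occupied intervals at resolution ε, the sketch below uses the standard box-counting estimate D ≈ log N(ε) / log(1/ε), where N(ε) is the number of occupied intervals. This estimator is an assumption, not the paper's confirmed formula, and so is the test set: the classical middle-thirds Cantor set on a line rather than the closed-loop construction described in the paper. The helper names `cantor_points`, `box_count`, and `box_dimension` are hypothetical.

```python
import math


def cantor_points(level):
    """Left endpoints of the 2**level intervals of the middle-thirds Cantor set.

    Points are stored as exact integers in units of 3**-level, which avoids
    floating-point errors at interval boundaries.
    """
    points = [0]
    for k in range(1, level + 1):
        step = 2 * 3 ** (level - k)
        points += [p + step for p in points]
    return points


def box_count(points, level, k):
    """Number of intervals of width eps = 3**-k that contain at least one point."""
    width = 3 ** (level - k)  # interval width in the integer units used above
    return len({p // width for p in points})


def box_dimension(points, level):
    """Least-squares slope of log N(eps) against log(1/eps) for eps = 3**-k."""
    xs = [k * math.log(3) for k in range(1, level + 1)]  # log(1/eps)
    ys = [math.log(box_count(points, level, k)) for k in range(1, level + 1)]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


if __name__ == "__main__":
    level = 10
    print(box_dimension(cantor_points(level), level))
```

For the middle-thirds Cantor set the fitted slope should come out close to log 2 / log 3 ≈ 0.6309, the known dimension of that set. Real data would not sit on an exact triadic grid, so the choice of resolutions and the fit range would need more care.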