Discussion of: Treelets--An adaptive multi-scale basis for sparse unordered data

Reading time: 3 minutes

📝 Original Info

  • Title: Discussion of: Treelets--An adaptive multi-scale basis for sparse unordered data
  • ArXiv ID: 0807.4011
  • Date: 2008-07-28
  • Author: Fionn Murtagh

📝 Abstract

Discussion of "Treelets--An adaptive multi-Scale basis for sparse unordered data" [arXiv:0707.0481]

📄 Full Content

  1. Hierarchical clustering using a novel, adaptive, eigenvector-related, agglomerative criterion.
  2. Principal components analysis carried out locally, so that the sample size required for consistency is logarithmic rather than linear, and the computational time is quadratic rather than cubic.
  3. A multiresolution transform with interesting characteristics: data-adaptive at each node of the tree, orthonormal, and with the tree decomposition itself data-adaptive.
  4. Integration of all of the following: hierarchical clustering, dimensionality reduction, and multiresolution transform.
  5. A range of data patterns explored, in particular block patterns in the covariances, and "model" or pattern contexts.
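
To make the merge-and-rotate mechanics of points 1 to 3 concrete, here is a minimal, illustrative sketch in Python/NumPy of a single treelet-style step: select the most correlated pair of variables and apply the Jacobi rotation that decorrelates them. This is a toy reconstruction from the description above, not the authors' code; the pair-selection rule and all names are assumptions.

```python
import numpy as np

def treelet_step(C):
    """One treelet-style merge on a covariance matrix C: pick the
    most correlated pair of variables and apply the Jacobi rotation
    that zeroes their covariance (toy sketch, not the authors' code)."""
    p = C.shape[0]
    d = np.sqrt(np.diag(C))
    R = C / np.outer(d, d)                  # correlation matrix
    np.fill_diagonal(R, 0.0)
    i, j = np.unravel_index(np.argmax(np.abs(R)), R.shape)
    # rotation angle that zeroes C[i, j] in the (i, j) coordinate plane
    theta = 0.5 * np.arctan2(2.0 * C[i, j], C[i, i] - C[j, j])
    J = np.eye(p)
    c, s = np.cos(theta), np.sin(theta)
    J[i, i] = J[j, j] = c
    J[i, j], J[j, i] = -s, s
    return J.T @ C @ J, J, (i, j)           # rotated covariance, rotation, pair
```

Iterating such a step L times, keeping the higher-variance ("sum") variable active and shelving the other as a "difference" coordinate, yields an orthonormal, data-adaptive multiresolution basis of the kind described above.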

While I admire the authors' work, I have a different point of view on key aspects of it:

  1. The highest dimensionality analyzed seems to be 760, in the Internet advertisements case study. In fact, the quadratic computational time requirements (Section 2.1 of Lee et al.) preclude scalability. My approach in Murtagh (2007a) to wavelet transforming a dendrogram is of linear computational complexity (in both the observations and the attributes) in the multiresolution transform; a minimal sketch of this idea follows the list below. The hierarchical clustering, to begin with, is typically quadratic in the n observations, and linear in the p attributes. Such computational requirements are necessary for the “small n, large p” problem which motivates this work (Section 1). In particular, linearity in p is a sine qua non for very high dimensional data exploration.

Since L = O(p) in Section 2.1, the overall requirement becomes cubic; in practice this has to be alleviated by limiting L to a user-specified value.

  2. The local principal components analysis (Section 2.1) inherently helps with data normalization, but it only goes some distance. For qualitative data, mixed quantitative and qualitative data, or other forms of messy data, I would use correspondence analysis to furnish a Euclidean data embedding; a sketch follows this list. This can then be the basis for classification or discrimination, benefiting from the Euclidean framework. See Murtagh (2005).

  3. My final point is in relation to the following (Section 1): “The key property that allows successful inference and prediction in high-dimensional settings is the notion of sparsity.” I disagree: sparsity can of course be exploited, but what is far more rewarding is that high dimensions have a particular topology, and not just a data morphology. This is shown in the work of Hall et al. (2005), Ahn et al. (2007), Donoho and Tanner (2005) and Breuel (2007), as well as Murtagh (2004). What this potentially leads to is the exploitation of the remarkable simplicity that is concomitant with very high dimensionality (Murtagh, 2007b); a small numerical illustration follows this list.
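
On point 1: a minimal sketch of the linear-time idea behind wavelet transforming a dendrogram, in the spirit of Murtagh (2007a); the exact normalization in that paper may differ, and the function and argument names here are assumptions. Each of the n - 1 agglomerations yields one smooth (mean) vector and one detail (difference) vector, so a single pass over the merges costs O(np), linear in both observations and attributes.

```python
import numpy as np

def haar_on_dendrogram(X, merges):
    """Haar-like transform driven by a dendrogram: one smooth and one
    detail vector per merge, i.e. n - 1 merges each costing O(p).

    X: (n, p) data matrix.
    merges: sequence of (a, b) node-index pairs, leaves numbered 0..n-1
    and new internal nodes numbered onwards, as given e.g. by the first
    two columns of scipy.cluster.hierarchy.linkage output."""
    nodes = [row.astype(float) for row in X]        # node index -> smooth vector
    details = []
    for a, b in merges:
        s = 0.5 * (nodes[int(a)] + nodes[int(b)])   # smooth at the parent node
        d = 0.5 * (nodes[int(a)] - nodes[int(b)])   # detail coefficient
        nodes.append(s)
        details.append(d)
    return nodes[-1], details                       # overall smooth + n - 1 details
```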
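
On point 2: a compact sketch of classical correspondence analysis, in its standard SVD formulation (not code from Murtagh, 2005). It embeds the rows of a nonnegative table so that Euclidean distances in the embedding reproduce, up to the rank-k truncation, the chi-squared distances between the row profiles.

```python
import numpy as np

def correspondence_analysis(N, k=2):
    """Row principal coordinates of a nonnegative table N,
    e.g. a document-term or cross-tabulation matrix."""
    P = N / N.sum()                          # correspondence matrix
    r, c = P.sum(axis=1), P.sum(axis=0)      # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, sv, _ = np.linalg.svd(S, full_matrices=False)
    return U[:, :k] * sv[:k] / np.sqrt(r)[:, None]       # principal coordinates
```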
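
On point 3: a tiny numerical illustration of the simplicity that emerges in high dimensions, in the vein of Hall et al. (2005). For i.i.d. points, pairwise distances concentrate around their mean as the dimension p grows, so the point cloud approaches a regular simplex; the relative spread printed below shrinks roughly like 1/sqrt(p).

```python
import numpy as np

rng = np.random.default_rng(0)
for p in (10, 1_000, 100_000):
    X = rng.standard_normal((50, p))                 # 50 i.i.d. points in dimension p
    sq = (X ** 2).sum(axis=1)                        # squared norms
    D2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # squared pairwise distances
    d = np.sqrt(np.maximum(D2[np.triu_indices(50, k=1)], 0.0))
    print(f"p = {p:>6}: std/mean of pairwise distances = {d.std() / d.mean():.3f}")
```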

Applications include text analysis, in many and varied settings, and high-frequency financial and other signal analysis.

In conclusion, I thank the authors for their thought-provoking and motivating work.
