Supervised Machine Learning with a Novel Kernel Density Estimator (arXiv:0709.2760)

In recent years, kernel density estimation has been exploited by computer scientists to model machine learning problems. The kernel density estimation based approaches are of interest due to the low time complexity of either O(n) or O(n*log(n)) for constructing a classifier, where n is the number of sampling instances. Concerning design of kernel density estimators, one essential issue is how fast the pointwise mean square error (MSE) and/or the integrated mean square error (IMSE) diminish as the number of sampling instances increases. In this article, it is shown that with the proposed kernel function it is feasible to make the pointwise MSE of the density estimator converge at O(n^-2/3) regardless of the dimension of the vector space, provided that the probability density function at the point of interest meets certain conditions.
Kernel density estimation is a problem that has been studied by statisticians for decades [1][2][3][4]. In recent years, kernel density estimation has been exploited by computer scientists to model machine learning problems [5][6][7]. The kernel density estimation based approaches are of interest due to the low time complexity of either O(n) or O(n*log(n)) for generating an estimator, where n is the number of sampling instances [4]. Furthermore, in comparison with the support vector machine (SVM) [8], a recent study has shown that the kernel density estimation based classifier is capable of delivering the same level of prediction accuracy while enjoying several distinctive advantages [7]. Therefore, kernel density estimation based machine learning algorithms may become a favored choice for contemporary applications that involve large datasets or databases.
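Since no code is listed at this point, a minimal sketch may help fix ideas: a Bayes-style classifier that picks the class maximizing prior times estimated density, here with an ordinary fixed-bandwidth Gaussian KDE in one dimension. All names and the bandwidth choice are illustrative assumptions, not the classifier studied in [7].

```python
import math

def gaussian_kde(train, h):
    """Return a fixed-bandwidth Gaussian KDE over 1-D training points."""
    n = len(train)
    norm = 1.0 / (n * h * math.sqrt(2.0 * math.pi))
    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - t) / h) ** 2) for t in train)
    return density

def kde_classifier(samples_by_class, h=0.5):
    """Classify by the class whose (prior * estimated density) is largest."""
    total = sum(len(s) for s in samples_by_class.values())
    models = {c: (len(s) / total, gaussian_kde(s, h))
              for c, s in samples_by_class.items()}
    def predict(x):
        return max(models, key=lambda c: models[c][0] * models[c][1](x))
    return predict

predict = kde_classifier({"a": [0.0, 0.2, -0.1], "b": [3.0, 3.2, 2.9]})
print(predict(0.1))   # near the class "a" samples
print(predict(3.1))   # near the class "b" samples
```

Building each class model is a single pass over its samples, which is where the low construction cost of density-based classifiers comes from.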
Concerning the design of kernel density estimators, one essential issue is how fast the pointwise mean square error (MSE) and/or the integrated mean square error (IMSE) diminish as the number of sampling instances increases. In this respect, the main problem with the conventional kernel density estimators is that the convergence rate of the pointwise MSE becomes extremely slow when the dimension of the dataset is large. For example, with Gaussian kernels, the pointwise MSE of the fixed kernel density estimator converges at O(n^-4/(m+4)), where m is the dimension of the dataset. Accordingly, the conventional kernel density estimators suffer a serious deficiency in dealing with high-dimensional datasets. Since high-dimensional datasets are common in modern machine learning applications, the design of a novel kernel density estimator that can handle high-dimensional datasets more effectively is essential for exploiting kernel density estimation in modern machine learning applications.
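To make the dimensionality problem concrete, the exponent 4/(m+4) in the conventional rate can be tabulated against the dimension-independent exponent 2/3 discussed in this article:

```python
# Convergence-rate exponents: the conventional fixed Gaussian-kernel estimator
# has pointwise MSE O(n^(-4/(m+4))), while the rate targeted in this article
# is O(n^(-2/3)) regardless of m.
for m in (1, 2, 5, 10, 50):
    conventional = 4.0 / (m + 4)
    print(f"m={m:3d}  conventional exponent={conventional:.3f}  target exponent=0.667")
```

Already at m = 10 the conventional exponent drops below 0.3, so halving the error requires vastly more samples than in low dimensions.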
In this article, a kernel function of a general form parameterized by m, the dimension of the vector space, is employed for generation of the density estimator in a high-dimensional vector space. With the proposed kernel function, it is then feasible to make the pointwise MSE of the density estimator converge at O(n^-2/3) regardless of the dimension of the vector space, provided that the probability density function at the point of interest meets certain conditions. Just like many conventional kernel density estimators, the proposed kernel density estimator features an average time complexity of O(n*log(n)) for generating the approximate probability density function. Accordingly, the average time complexity for constructing a classifier with the proposed kernel density estimator is also O(n*log(n)). In [9], the effects of applying the proposed kernel density estimator in a bioinformatics application are addressed.
In the remainder of this paper, Section II presents the novel kernel density estimator proposed in this article. Section III reports the experiments conducted to verify the theorems presented in Section II. Finally, concluding remarks are presented in Section IV.
In this section, we will first elaborate the mathematical basis of the novel kernel density estimator proposed in this article. In particular, we will show that the pointwise mean squared error (MSE) of the basic form of the proposed kernel density estimator converges at O(n^-2/3) regardless of the dimension of the vector space, where n is the number of instances in the training dataset. Then, we will discuss how the proposed kernel density estimator can be exploited in data classification applications.
Since we can always conduct a translation operation on the coordinate system, without loss of generality, we assume in the following discussion that it is the pointwise MSE at the origin of the coordinate system that is of concern. Let f_X(x_1, x_2, ..., x_m) denote the probability density function of the distribution of concern in an m-dimensional vector space. Assume that f_X(x_1, x_2, ..., x_m) is continuous at the origin. Let Z be the random variable that maps a sampling instance s_i taken from the distribution governed by f_X to |s_i|^m, where |s_i| is the distance between the origin and s_i. Accordingly, we have the distribution function F_Z(z) = ∫_{|x| ≤ z^(1/m)} f_X(x) dx. Then, we have f_Z(0) = (π^(m/2) / Γ(m/2 + 1)) · f_X(0), where Γ(·) is the gamma function [10].
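As a quick numerical sanity check of this relation (not part of the article), consider the two-dimensional standard Gaussian, for which both sides are available in closed form:

```python
import math

# Check f_Z(0) = pi^(m/2)/Gamma(m/2+1) * f_X(0) for the m = 2 standard
# Gaussian, where everything has a closed form.
m = 2
f_x0 = (2 * math.pi) ** (-m / 2)                                  # N(0, I) density at the origin
predicted_f_z0 = math.pi ** (m / 2) / math.gamma(m / 2 + 1) * f_x0

# Exact CDF of Z = |s|^2 for a 2-D standard Gaussian: F_Z(z) = 1 - exp(-z/2),
# so f_Z(0) can be approximated by F_Z(eps)/eps for small eps.
eps = 1e-6
empirical_f_z0 = -math.expm1(-eps / 2) / eps
print(predicted_f_z0, empirical_f_z0)   # both close to 0.5
```

Both quantities come out to 1/2, confirming the identity for this case.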
By the Taylor expansion of F_Z about z = 0, F_Z(ε) = f_Z(0) · ε + o(ε). Furthermore, in the region where f_X is approximately equal to f_X(0), we have F_Z(ε) ≈ f_X(0) · V(ε^(1/m)), where V(ε^(1/m)) = π^(m/2) · ε / Γ(m/2 + 1) is the volume of a sphere in an m-dimensional vector space with radius ε^(1/m).
Theorem 1 implies that we can obtain an estimate of f_X(0) from an estimate of f_Z(0). Since f_Z is a univariate probability density function, if we employ a fixed kernel density estimator [4] to estimate f_Z(0),
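Although the derivation is truncated here, the reduction it describes can be sketched as follows: map each m-dimensional sample to z_i = |s_i|^m, estimate the univariate density f_Z at 0, and rescale by Γ(m/2+1)/π^(m/2). The univariate step below uses a plain reflected Gaussian kernel as a stand-in for the article's own estimator, and all names and the bandwidth are illustrative assumptions.

```python
import math

def estimate_density_at_origin(samples, h=0.05):
    """Estimate f_X(0) by reducing the problem to one dimension.

    Each m-dimensional sample s_i is mapped to z_i = |s_i|^m; by the relation
    f_Z(0) = pi^(m/2)/Gamma(m/2+1) * f_X(0), an estimate of the univariate
    density f_Z at 0 yields an estimate of f_X at the origin.
    """
    m = len(samples[0])
    z = [sum(c * c for c in s) ** (m / 2) for s in samples]   # z_i = |s_i|^m
    n = len(z)
    # Boundary estimate of f_Z(0): z_i >= 0, so reflect the data about 0
    # (factor 2) and apply a fixed Gaussian kernel of bandwidth h.
    f_z0 = (2.0 / (n * h * math.sqrt(2.0 * math.pi))) * \
           sum(math.exp(-0.5 * (zi / h) ** 2) for zi in z)
    ball = math.pi ** (m / 2) / math.gamma(m / 2 + 1)         # unit m-ball volume
    return f_z0 / ball

# Deterministic grid stand-in for a uniform sample on the unit disk,
# where the true density at the origin is 1/pi ~ 0.3183.
pts = [(x / 20, y / 20) for x in range(-20, 21) for y in range(-20, 21)
       if (x / 20) ** 2 + (y / 20) ** 2 <= 1.0]
print(estimate_density_at_origin(pts))   # close to 1/pi
```

Sorting the z_i (for estimators that need order statistics) is the step that would make the overall construction O(n*log(n)), matching the complexity quoted earlier.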
…(Full text truncated)…