Least Square Error Method Robustness of Computation: What is not usually considered and taught

Reading time: 6 minutes
...

📝 Original Info

  • Title: Least Square Error Method Robustness of Computation: What is not usually considered and taught
  • ArXiv ID: 1802.07591
  • Date: 2018-02-22
  • Authors: Not specified in the paper; could not be verified.

📝 Abstract

There are many practical applications based on the Least Square Error (LSE) approximation, which minimizes the square error 'on a vertical' axis. The LSE method is simple and easy to use, also for analytical purposes. However, if the data span several orders of magnitude or a non-linear LSE is used, severe numerical instability can be expected. The presented contribution describes a simple method for LSE computation over a large span of data. It is especially convenient when data with a large span are to be processed and the 'standard' pseudoinverse matrix is ill-conditioned. It is based on an LSE solution using orthogonal basis vectors instead of orthonormal basis vectors. The presented approach has been used for linear regression as well as for approximation using radial basis functions.

📄 Full Content

A wide range of applications is based on the approximation of acquired data using LSE minimization, also known as linear or polynomial regression. Regression methods have been heavily explored in signal processing, in geometrical problems, and in statistically oriented problems, and they are used across many engineering fields dealing with acquired-data processing. The published studies can be classified as follows:

  • "standard" Least Square Error (LSE) methods, fitting data to a function $y = f(\mathbf{x})$, where $\mathbf{x}$ is an independent variable and $y$ is a measured or given value,
  • "orthogonal" Total Least Square Error (TLSE) methods, fitting data to a function $F(\mathbf{x}) = 0$, i.e. fitting data to some $(d-1)$-dimensional entity in the $d$-dimensional space, e.g. a line in the $E^2$ space or a plane in the $E^3$ space [1][6][8][21][22],
  • "orthogonally mapping" Total Least Square Error (MTLSE) methods, fitting data to a given entity in a subspace of the given space. This problem is much more complicated. As an example, consider data given in $E^d$ for which an optimal line, i.e. a one-dimensional entity in this $d$-dimensional space, is to be found that optimally fits the given data. A typical problem: find a line in the $E^d$ space that has the minimum orthogonal distance from the given points in this space. This algorithm is quite complex and a solution can be found in [18].

It should be noted that all of the methods above share one significant drawback: since the differences are squared, small values do not have as relevant an influence on the final entity as high values. Some methods try to overcome this by assigning a weight to each measured datum [3]. The TLSE was originally derived by Pearson [16] (1901); deep, comprehensive analyses can be found in [8][13][21][22]. The differences between the LSE and TLSE approaches are significant, see Fig.

In the vast majority of cases, Least Square Error (LSE) methods measuring vertical distances are used. This approach is acceptable in the case of explicit functional dependences $f(x,y) = h$, resp. $f(x,y,z) = h$. However, the user should keep in mind that differences smaller than 1.0 will have significantly smaller weight than differences larger than 1.0, as the differences are squared. The result therefore depends on the scaling of the approximated data, i.e. on the physical units used, etc. The main advantage of the LSE method is that it is simple for fitting polynomial curves and easy to implement. The standard LSE method leads to an overdetermined system of linear equations; this approach is also known as polynomial regression.
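To make the difference between the two distance measures concrete, here is a minimal sketch (not from the paper; the data, names, and values are illustrative) that fits a line to synthetic 2D points once with the "standard" vertical-distance LSE and once with the orthogonal TLSE, using the classic SVD solution for the latter:

```python
import numpy as np

# Hypothetical 2D data scattered around a line, for illustration only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.5, size=x.size)

# "Standard" LSE: minimize vertical distances to y = a + b*x
# by solving the overdetermined system in the least-squares sense.
A = np.column_stack([np.ones_like(x), x])
a_lse, b_lse = np.linalg.lstsq(A, y, rcond=None)[0]

# "Orthogonal" TLSE: minimize orthogonal distances to the line F(x) = 0.
# The line normal is the right singular vector of the centred data
# with the smallest singular value (the classic Pearson/SVD solution).
P = np.column_stack([x, y])
P0 = P - P.mean(axis=0)
_, _, Vt = np.linalg.svd(P0, full_matrices=False)
normal = Vt[-1]                                   # line normal (n_x, n_y)
b_tls = -normal[0] / normal[1]                    # slope of the TLSE line
a_tls = P.mean(axis=0) @ np.array([-b_tls, 1.0])  # line passes through centroid

print(f"LSE : y = {a_lse:.3f} + {b_lse:.3f} x")
print(f"TLSE: y = {a_tls:.3f} + {b_tls:.3f} x")
```

With noise only in the $y$-coordinate the two lines nearly coincide; the methods diverge when both coordinates are noisy or when the data are rotated, which is exactly where the vertical-distance measure breaks down.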

Let us consider a data set $\Omega = \{\langle x_i, y_i, f_i \rangle\}_{i=1}^{n}$, i.e. a data set containing a measured functional value $f_i$ for each pair $(x_i, y_i)$, and let us find the parameters $\mathbf{a} = [a, b, c, d]^T$ of an optimal fitting function, as an example the bilinear form

$$f(x, y, \mathbf{a}) = a + bx + cy + dxy .$$

Minimizing the vertical squared distance $D$, i.e.

$$D = \sum_{i=1}^{n} \big( f(x_i, y_i, \mathbf{a}) - f_i \big)^2 ,$$

leads to a system of linear equations (the normal equations) with a symmetric $4 \times 4$ matrix $\mathbf{A}$.

The bilinear form was selected to show the application of the LSE method to a non-linear case; for a linear function, i.e. $f(x, y, \mathbf{a}) = a + bx + cy$, the 4th row and column of $\mathbf{A}$ are to be removed. Note that the matrix $\mathbf{A}$ is symmetric and that the function $f(\mathbf{x})$ might, in general, be more complex.
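As a sketch of this step (the function name and the synthetic data below are assumptions for illustration, not from the paper), the following code assembles the normal equations for the bilinear form above and solves for $\mathbf{a} = [a, b, c, d]^T$:

```python
import numpy as np

def fit_bilinear(x, y, f):
    """Minimal sketch of the 'standard' LSE fit for
    f(x, y) = a + b*x + c*y + d*x*y (drop the last column for the linear case)."""
    # Design matrix of the overdetermined system: one row per sample.
    B = np.column_stack([np.ones_like(x), x, y, x * y])
    # Normal equations A a = b with A = B^T B (symmetric, 4x4 here).
    A = B.T @ B
    b = B.T @ f
    return np.linalg.solve(A, b)   # parameters (a, b, c, d)

# Illustrative usage on synthetic data.
rng = np.random.default_rng(1)
x = rng.uniform(0, 1, 200)
y = rng.uniform(0, 1, 200)
f = 1.0 + 2.0 * x - 0.5 * y + 3.0 * x * y + rng.normal(scale=0.01, size=x.size)
print(fit_bilinear(x, y, f))   # approximately [1.0, 2.0, -0.5, 3.0]
```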

Several methods for the LSE have been derived [4][5][10]; however, they are sensitive to the orientation of the vector $\mathbf{a}$ and not robust in general, as the value of $\sum_{i=1}^{n} x_i^2 y_i^2$ might be too high in comparison with the value $n$, which influences the robustness of a numerical solution. In addition, LSE methods are sensitive to rotation, as they measure vertical distances. It should be noted that rotational and translational invariance are fundamental requirements, especially in geometrically oriented applications.

The LSE method is usually applied when the data set is small and the span of the domain is relatively small. However, in some applications the domain span can easily cover several decades, e.g. in the case of Radial Basis Function (RBF) approximation for GIS applications, etc. In this case, the overdetermined system can be difficult to solve.

Let us explore a simple example in which many points $\mathbf{x}_i \in E^2$, i.e. $\mathbf{x}_i = (x_i, y_i)$, are given with relevant associated values $b_i$, $i = 1, \dots, n$. The expected functional dependency can (for simplicity) be expressed as $b = a_1 + a_2 x + a_3 y$. The LSE then leads to an overdetermined system of equations

$$\mathbf{A}\boldsymbol{\xi} = \mathbf{b} ,$$

where $\mathbf{b} = (b_1, \dots, b_n)$, $\boldsymbol{\xi} = (\xi_1, \dots, \xi_m)$, and $m$ is the number of parameters, $m < n$.
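A minimal sketch of this overdetermined system with illustrative synthetic data (all names and values are assumed), comparing the explicit normal-equation solution with an SVD-based solver:

```python
import numpy as np

# Illustrative points (x_i, y_i) with associated values b_i.
rng = np.random.default_rng(2)
n = 500
x = rng.uniform(0, 1, n)
y = rng.uniform(0, 1, n)
b = 0.5 + 1.5 * x - 2.0 * y + rng.normal(scale=0.05, size=n)

# Overdetermined system A xi = b for b ~ a1 + a2*x + a3*y  (m = 3 < n).
A = np.column_stack([np.ones(n), x, y])

# Two algebraically equivalent LSE solutions (for well-conditioned A):
xi_normal = np.linalg.solve(A.T @ A, A.T @ b)    # normal equations
xi_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]  # SVD-based, numerically safer
print(xi_normal, xi_lstsq, sep="\n")
```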

If the values $x_i, y_i$ span a large range, e.g. $x_i, y_i \in \langle 10^0, 10^5 \rangle$, the matrix $\mathbf{A}^T\mathbf{A}$ is extremely ill-conditioned. This means that the reliability of a solution depends on the distribution of points in the domain. The situation gets worse when a non-linear polynomial regression is to be used and the dimensionality of the domain is higher.
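The conditioning problem is easy to reproduce. The sketch below is an illustration only, not the paper's orthogonal-basis method: it shows how the condition number of $\mathbf{A}^T\mathbf{A}$ for a monomial basis explodes as the data span grows, and how a simple affine rescaling of the domain, in the same spirit as changing the basis, keeps it moderate:

```python
import numpy as np

# Condition number of A^T A for a quadratic fit over an increasing data span.
for span in (1e1, 1e3, 1e5):
    x = np.linspace(1.0, span, 1000)
    A = np.column_stack([np.ones_like(x), x, x**2])   # monomial basis 1, x, x^2
    print(f"span 1..{span:.0e}: cond(A^T A) = {np.linalg.cond(A.T @ A):.2e}")

# A common remedy in the same spirit: evaluate the basis on data mapped
# to [-1, 1] (or use orthogonal polynomials), which keeps A^T A well behaved.
x = np.linspace(1.0, 1e5, 1000)
t = 2.0 * (x - x.min()) / (x.max() - x.min()) - 1.0   # affine map to [-1, 1]
A_scaled = np.column_stack([np.ones_like(t), t, t**2])
print(f"rescaled: cond(A^T A) = {np.linalg.cond(A_scaled.T @ A_scaled):.2e}")
```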

As an example, let us consider a simple case in which the points form a regular orthogonal mesh and the values ...

Reference

This content is AI-processed based on open access ArXiv data.
