Geometric Models with Co-occurrence Groups

Reading time: 5 minutes
...

📝 Original Info

  • Title: Geometric Models with Co-occurrence Groups
  • ArXiv ID: 1101.5766
  • Date: 2011-02-01
  • Authors: Not specified in the provided paper data. (See the original paper for author names and affiliations.)

📝 Abstract

A geometric model of sparse signal representations is introduced for classes of signals. It is computed by optimizing co-occurrence groups with a maximum likelihood estimate calculated with a Bernoulli mixture model. Applications to face image compression and MNIST digit classification illustrate the applicability of this model.

💡 Deep Analysis

📄 Full Content

Finding image representations that reduce dimensionality while maintaining the information relevant for classification remains a major issue. Effective approaches have recently been developed based on locally orderless representations as proposed by Koenderink and Van Doorn [1]. They observed that high frequency structures are important for recognition but do not need to be precisely located. This idea has inspired a family of descriptors such as SIFT [2] or HOG [3], which delocalize the image information over large neighborhoods by recording only histogram information. These histograms are usually computed over wavelet-like coefficients, providing a multiscale image representation with several wavelets having different orientation tunings.

This paper introduces a new geometric image representation obtained by grouping coefficients that have co-occurrence properties across an image class. It provides a locally orderless representation in which sparse descriptors are delocalized over groups that optimize the coefficient co-occurrences, and which can be interpreted as a form of parcellization [4]. Section 2 reviews wavelet image representations and the notion of sparse geometry through significance sets. Section 3 introduces our co-occurrence grouping model, which is optimized with a maximum likelihood approach. Groups are computed from a training sequence in Section 4, using a Bernoulli mixture approximation. Applications to face image compression are shown in Section 5, and the use of this representation for MNIST digit classification is illustrated in Section 6.

Sparse signal representations are obtained by decomposing signals over bases or frames {φ_p}_{p∈ȳ} which take advantage of the signal regularity to produce many zero coefficients. A sparse representation is obtained by keeping the significant coefficients above a threshold T, i.e. the significance set y = {p ∈ ȳ : |⟨f, φ_p⟩| > T}.

The original signal can be reconstructed with a dual family as f = Σ_{p∈ȳ} ⟨f, φ_p⟩ φ̃_p, and the resulting sparse approximation is f_y = Σ_{p∈y} ⟨f, φ_p⟩ φ̃_p.
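To make the thresholding concrete, here is a minimal numpy sketch. It uses a random orthonormal basis as a stand-in for the family {φ_p} (an assumption for illustration only; the paper works with wavelet frames), keeps the significance set y of coefficients above T, and reconstructs the sparse approximation f_y.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64

# Hypothetical orthonormal basis standing in for {φ_p}: the columns of Q.
Q, _ = np.linalg.qr(rng.standard_normal((N, N)))
f = rng.standard_normal(N)

coeffs = Q.T @ f                              # inner products <f, φ_p> for all p in ȳ
T = 1.0
y = np.flatnonzero(np.abs(coeffs) > T)        # significance set y = {p : |<f, φ_p>| > T}

# Sparse approximation f_y; for an orthonormal basis the dual family equals the basis.
f_y = Q[:, y] @ coeffs[y]

rel_err = np.linalg.norm(f - f_y) / np.linalg.norm(f)
print(f"|y| = {y.size} of |ȳ| = {N}, relative error = {rel_err:.3f}")
```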

Wavelet transforms compute signal inner products with several mother wavelets ψ^d having a specific direction tuning, which are dilated by 2^j and translated by 2^j n: φ_p = ψ^d_{j,n}. Separable wavelet bases are obtained with 3 mother wavelets [5], in which case the total number |ȳ| of wavelets is equal to the image size.
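As an illustration of the separable construction, the short sketch below uses PyWavelets (assuming the pywt package is available; the paper does not name a specific wavelet family, so db2 and a 64×64 random image are arbitrary choices) to show the three oriented detail subbands produced at each scale.

```python
import numpy as np
import pywt

image = np.random.rand(64, 64)

# Separable 2D wavelet transform: each detail level contains three oriented
# subbands (horizontal, vertical, diagonal), one per mother wavelet ψ^d.
coeffs = pywt.wavedec2(image, wavelet="db2", level=3)

approx = coeffs[0]                     # coarse approximation at the largest scale
for j, (cH, cV, cD) in enumerate(coeffs[1:], start=1):
    # detail levels are listed from coarsest to finest
    print(f"detail level {j}: subband shapes {cH.shape} {cV.shape} {cD.shape}")
```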

Let |y| be the cardinality of the set y. In the absence of prior information on y, the number of bits needed to code y within ȳ is R_0 = log₂ C(|ȳ|, |y|), where C(|ȳ|, |y|) is the number of subsets of size |y| in ȳ. One can also verify [5] that the number of bits required to encode the values of the coefficients in y is proportional to |y| and is smaller than R_0, so the coding budget is indeed dominated by R_0, which carries most of the image information.
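As a quick numerical illustration of this coding cost, the sketch below evaluates R_0 = log₂ C(|ȳ|, |y|) for hypothetical sizes of |ȳ| and |y| (the numbers are not taken from the paper).

```python
from math import lgamma, log

def log2_binom(n: int, k: int) -> float:
    """log2 of the binomial coefficient C(n, k), computed via log-gamma."""
    return (lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)) / log(2)

n_total = 256 * 256   # |ȳ|: one wavelet coefficient per pixel (hypothetical image size)
n_signif = 3000       # |y|: number of significant coefficients kept (hypothetical)

R0 = log2_binom(n_total, n_signif)
print(f"R0 ≈ {R0:.0f} bits, about {R0 / n_signif:.1f} bits per significant coefficient")
```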

In a supervised classification problem, a geometric model defines a prior model of the probability distribution q(y). There is a huge number, 2^|ȳ|, of subsets y in ȳ. Estimating the probability q(y) from a limited training set thus requires using a simplified prior model.

A signal class is represented by a random vector whose realizations are within the class and whose significance sets y are included in ȳ. A mixture model is introduced with co-occurrence groups θ(k) of constant size s, which define a partition of the overall index set ȳ: ȳ = ⋃_k θ(k) with θ(k) ∩ θ(k′) = ∅ for k ≠ k′.

Co-occurrence groups θ(k) are optimized by enforcing that all coefficients in a group have a similar behavior, and hence that y ∩ θ(k) is, with high probability, either almost empty or almost equal to θ(k). The mixture model assumes that the distributions of the components y ∩ θ(k) are independent across groups. The distribution q(y ∩ θ(k)) is assumed to be uniform among all subsets of θ(k) of a given size |y ∩ θ(k)|.
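Under this model, the only statistic of y that matters within each group is the occupancy z(k) = |y ∩ θ(k)|. The sketch below (variable names are ours, not the paper's) computes these occupancies from a group assignment k(p) over a synthetic significance set.

```python
import numpy as np

rng = np.random.default_rng(0)
N, s = 1024, 16                 # |ȳ| and the common group size s
K = N // s                      # number of groups θ(k)

# Group assignment k(p) for every index p: a partition of ȳ into K groups of size s.
k_of_p = rng.permutation(np.repeat(np.arange(K), s))

# A significance set y, encoded as a boolean mask over ȳ.
y_mask = rng.random(N) < 0.1

# Occupancies z(k) = |y ∩ θ(k)| for every group.
z = np.bincount(k_of_p[y_mask], minlength=K)
print(z[:10])
```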

This co-occurrence model is identified with a maximum log-likelihood approach, which computes the groups θ(k) that maximize the log-likelihood of the observed significance sets.

Given a training sequence of images {f_l}_{l≤L} that belong to a class, we optimize the group co-occurrence by approximating the maximum likelihood with a Bernoulli mixture.

Let y_l be the significance set of f_l, and let z_l(k) = |y_l ∩ θ(k)| be the number of significant coefficients of f_l in the group θ(k). The log-likelihood (1) is obtained by summing the log-probabilities log q(y_l ∩ θ(k)) over all training images l ≤ L and all groups k.

The maximization of this expression is obtained using the Stirling formula, which approximates the first term by the entropy of a Bernoulli distribution. Let us write q_{k,l}(0) = z_l(k)/s and q_{k,l}(1) = 1 − z_l(k)/s for the Bernoulli probability distribution associated with z_l(k)/s. Let us specify the groups θ(k) by the inverse variables k(p), which give, for each index p, the group θ(k(p)) containing it.
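The Stirling approximation referred to here replaces log₂ C(s, z), the log of the number of subsets of size z in a group of size s, by s times the entropy of a Bernoulli variable with parameter z/s. The sketch below checks this numerically for an arbitrarily chosen group size.

```python
from math import comb, log2

def bernoulli_entropy(q: float) -> float:
    """Entropy in bits of a Bernoulli(q) variable."""
    if q in (0.0, 1.0):
        return 0.0
    return -q * log2(q) - (1 - q) * log2(1 - q)

s = 64                                        # group size (arbitrary choice)
for z in (1, 8, 16, 32):
    exact = log2(comb(s, z))                  # log2 C(s, z)
    approx = s * bernoulli_entropy(z / s)     # Stirling-based approximation
    print(f"z={z:2d}: log2 C(s,z) = {exact:6.2f}, s*H(z/s) = {approx:6.2f}")
```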

The distribution q_k is generally unknown and must therefore be estimated. The estimation is regularized by approximating this distribution with a piecewise constant distribution q̃_k over a fixed number of quantization bins, which is small relative to the number of realizations L. The likelihood (1) is thus approximated by a likelihood over the Bernoulli mixture, which is optimized over all parameters; the resulting optimization problem is denoted (2).

The following algorithm minimizes (2) by separately updating the Bernoulli parameters z_l(k), the distributions q̃_k, and the grouping variables k(p).

The minimization algorithm begins with a random initialization of the groups θ(k), all of the same size s.
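The individual update steps are not spelled out in this extract, so the following is only a rough sketch of one plausible alternating scheme, under explicit assumptions that are not confirmed by the paper: the objective (2) is taken to be the negative of the approximated log-likelihood, Σ_{l,k} [ s·H(z_l(k)/s) − log₂ q̃_k(z_l(k)) ]; the groups are kept at equal size s; and the grouping variables k(p) are updated by a greedy, capacity-constrained reassignment heuristic.

```python
import numpy as np

def bernoulli_entropy(q):
    """Entropy in bits of Bernoulli(q), vectorized."""
    q = np.clip(q, 1e-12, 1 - 1e-12)
    return -q * np.log2(q) - (1 - q) * np.log2(1 - q)

def fit_cooccurrence_groups(Y, K, n_bins=8, n_iter=10, seed=0):
    """Rough alternating-minimization sketch (assumptions stated in the text above).

    Y : (L, N) boolean array; row l is the significance set y_l of training image f_l.
    K : number of groups θ(k); the common group size is s = N // K.
    """
    Y = np.asarray(Y, dtype=bool)
    L, N = Y.shape
    s = N // K
    rng = np.random.default_rng(seed)
    # Random initialization of equal-size groups θ(k), encoded by the assignment k(p).
    k_of_p = rng.permutation(np.repeat(np.arange(K), s))

    for it in range(n_iter):
        # 1) Bernoulli parameters z_l(k)/s for the current groups.
        Z = np.stack([np.bincount(k_of_p[y], minlength=K) for y in Y])
        frac = Z / s                                            # (L, K)
        # 2) Piecewise constant distributions q~_k: histograms of z_l(k)/s over bins.
        bins = np.minimum((frac * n_bins).astype(int), n_bins - 1)
        q_tilde = np.stack([np.bincount(bins[:, k], minlength=n_bins) + 1e-3
                            for k in range(K)])
        q_tilde /= q_tilde.sum(axis=1, keepdims=True)           # (K, n_bins)
        # Assumed objective (2): sum over l, k of s*H(z_l(k)/s) - log2 q~_k(z_l(k)).
        obj = (s * bernoulli_entropy(frac)).sum() \
              - np.log2(q_tilde[np.arange(K)[None, :], bins]).sum()
        print(f"iteration {it}: objective ≈ {obj:.1f} bits")
        # 3) Greedy, capacity-constrained update of k(p): each index p joins the group
        #    whose occupancies it co-occurs with most, keeping every |θ(k)| equal to s.
        score = Y.astype(float).T @ frac                        # (N, K) affinities
        capacity = np.full(K, s)
        new_k = np.empty(N, dtype=int)
        for p in np.argsort(-score.max(axis=1)):                # most confident first
            for k in np.argsort(-score[p]):
                if capacity[k] > 0:
                    new_k[p] = k
                    capacity[k] -= 1
                    break
        k_of_p = new_k
    return k_of_p, q_tilde

# Purely illustrative usage with synthetic significance sets.
rng = np.random.default_rng(1)
Y = rng.random((200, 256)) < 0.15
groups, q_tilde = fit_cooccurrence_groups(Y, K=16)
```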


Reference

This content is AI-processed based on open access ArXiv data.
