Remarks on the statistical study of protein-protein interaction in living cells

In this note, we focus on a model selection problem: a mono-exponential model versus a bi-exponential one. This is done in the biological context of living cells, where only small data sets are available. Classical statistics are revisited to improve existing results. Some unavoidable limits are also pointed out.
The measurement of molecular dynamic interactions and their respective proportions in living cells or tissues is a major question in biological and medical research. Förster resonance energy transfer (FRET) is one of the best-known approaches to observe and quantitatively study protein-protein interactions at a subcellular level ([7]). FRET measurements can currently be performed by fluorescence lifetime imaging microscopy (FLIM for short) in living cells and tissues. It can be achieved via the time-correlated single photon counting (TCSPC) method, which provides one lifetime decay curve per site ([8]). To be interpreted, this curve is fitted by selecting the "best" (with respect to a given statistical criterion) multi-exponential model. Contrary to a mono-exponential model, a bi-exponential one witnesses an interaction between two proteins. Our aim is to find, pixel by pixel, which of these models is accurate. One difficulty is that the number of observed photons per pixel is small for any statistical treatment and, in order to preserve the living cell, cannot be increased. An attempt to deal with the problem can be found in [7]. Our aim here is to go further in this direction, pointing out some improvements and limits. Some account of statistical methods in this area can be found in [5] and [6].
1.1. Modelling fluorescence lifetimes. It is not necessary to describe FLIM and TCSPC in detail here. We only need to understand that lifetimes are measured as differences between excitation times (pulses) and emission times of photons. Denote by r the period between two consecutive pulses. Here r is 12 nanoseconds, close to the values used in practice. What is actually measured is a lifetime modulo r, since we cannot be sure from which pulse it originates.
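The wrapping of lifetimes modulo the pulse period can be illustrated by a short simulation. This is a minimal sketch, not code from the paper; the rate value below is illustrative.

```python
import numpy as np

# Sketch (illustrative): what TCSPC effectively records. A photon's true
# lifetime is exponential; only its value modulo the pulse period r
# (12 ns, as in the text) is observed.
rng = np.random.default_rng(0)

r = 12.0      # pulse period in ns
alpha = 0.4   # inverse mean lifetime in 1/ns (illustrative value)

true_lifetimes = rng.exponential(scale=1.0 / alpha, size=100_000)
observed = true_lifetimes % r   # measured lifetimes, all in [0, r)
```

Lifetimes longer than r are folded back into [0, r), which is why the observed density is not simply a truncated exponential.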
It is assumed that lifetimes come from, say, K species and are observed in the interval [0, r) after infinitely many pulses. Under these conditions, each lifetime of species k (1 ≤ k ≤ K) admits the following probability density:
(1)    f_k(t) = α_k e^{−α_k t} / (1 − e^{−α_k r}),    t ∈ [0, r),
where α_k is the inverse mean lifetime of the k-th species. A uniform noise is added, with density

(2)    f_0(t) = 1/r,    t ∈ [0, r).
If π_k denotes the proportion of the k-th species (π_0 refers to that of the noise), we get the probability density of the fluorescence lifetime by writing
(3)    f(t) = ∑_{k=0}^{K} π_k f_k(t) = π_0 / r + ∑_{k=1}^{K} π_k α_k e^{−α_k t} / (1 − e^{−α_k r}).
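A minimal numerical sketch of densities (1)–(3) as written above; the parameter values below are illustrative, not taken from the paper.

```python
import numpy as np

r = 12.0  # pulse period in ns

def species_density(t, alpha):
    """Density (1): an exponential lifetime with rate alpha, observed modulo r."""
    return alpha * np.exp(-alpha * t) / (1.0 - np.exp(-alpha * r))

def mixture_density(t, pis, alphas):
    """Density (3): uniform noise (proportion pis[0]) plus K exponential species."""
    f = np.full_like(t, pis[0] / r)          # uniform noise, density (2)
    for pi_k, a_k in zip(pis[1:], alphas):
        f += pi_k * species_density(t, a_k)
    return f

# Sanity check: the mixture integrates to 1 over [0, r)
# (illustrative proportions and rates, K = 2).
t = np.linspace(0.0, r, 100_001)
f = mixture_density(t, pis=[0.1, 0.5, 0.4], alphas=[0.3, 1.2])
integral = np.sum(0.5 * (f[1:] + f[:-1]) * np.diff(t))
print(integral)   # ≈ 1
```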
1.2. Modelling the photon emission. Let I_k be the mean photon number of species k detected between two pulses. Assume that photon occurrences are independent. Then the total number of detected photons is Poisson distributed with intensity T ∑_{k=0}^{K} I_k if observations take place during T pulses. For later use, it is convenient to set

(4)    π_k = I_k / ∑_{j=0}^{K} I_j,    0 ≤ k ≤ K.
Note that we have ∑_{k=0}^{K} π_k = 1.
Since the noise intensity I_0 will be supposed known, it is convenient to consider the proportions π′_k among all species with k ≥ 1, excluding the noise k = 0. Thus we have, for k ≥ 1,

π′_k = I_k / ∑_{j=1}^{K} I_j = π_k / (1 − π_0).
1.3. Maximum likelihood estimation (MLE) and likelihood ratio test. The aim is first the determination of the most probable parameter θ := (α_1, …, α_K, I_1, …, I_K) from the observed lifetimes modulo r, denoted by t_1, …, t_n. The noise intensity I_0 is supposed known. The related log-likelihood is then

(5)    ℓ(θ) = ∑_{i=1}^{n} log( ∑_{k=0}^{K} I_k f_k(t_i) ) − T ∑_{k=0}^{K} I_k,

up to an additive constant not depending on θ.
For physical reasons, in particular since lifetimes are known to lie between 30 ps and 30 ns, we may and do assume that θ lies in a compact parameter set. Numerical optimisation of the likelihood (5) is made easier by the availability of closed-form derivatives of the log-likelihood with respect to the α_k and the I_k.
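The numerical optimisation over a compact parameter set can be sketched as follows in the simplest case K = 1. This is an illustrative reconstruction, not the paper's code: the noise proportion is treated as known and the likelihood is maximised in α_1 alone, with the bounds corresponding to lifetimes between 30 ps and 30 ns.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Sketch (illustrative): MLE of alpha_1 for K = 1 by numerical maximisation
# of the log-likelihood, with the noise proportion pi0 treated as known.
rng = np.random.default_rng(1)
r, pi0, alpha_true = 12.0, 0.1, 0.5   # toy values, not from the paper

# Simulate n lifetimes from the K = 1 mixture: uniform noise or wrapped exponential.
n = 2000
is_noise = rng.random(n) < pi0
t = np.where(is_noise,
             rng.uniform(0.0, r, n),
             rng.exponential(1.0 / alpha_true, n) % r)

def neg_log_lik(alpha):
    # Mixture density (3) with K = 1 and known noise proportion pi0.
    dens = pi0 / r + (1 - pi0) * alpha * np.exp(-alpha * t) / (1 - np.exp(-alpha * r))
    return -np.sum(np.log(dens))

# Compact parameter set: lifetimes between 30 ps and 30 ns.
res = minimize_scalar(neg_log_lik, bounds=(1 / 30.0, 1 / 0.03), method="bounded")
print(res.x)   # MLE of alpha_1
```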
Denote by θ*_K the most probable parameter if there are K species. To decide next which of the models K = 1 or K = 2 is the more accurate, a classical statistic is the likelihood ratio

D = 2 ( ℓ(θ*_2) − ℓ(θ*_1) ),

where ℓ denotes the log-likelihood of the corresponding K-species model.
From a theoretical point of view, since we are dealing with the number of components of a mixture model, even the asymptotic distribution under the null hypothesis is not the usual χ² one. It can be expressed as the supremum of a Gaussian process over a subset of a four-dimensional unit sphere (in our case) endowed with the “right” covariance function ([1] and references therein). However, this process also depends on the “true” point θ. Since all these calculations are complicated, it is easier simply to simulate if we want to know the level of a test associated with a given threshold. Note, on the other hand, that simulations suggest that the likelihood ratio test is quite efficient at determining the number of components in a mixture with a compact parameter set (see for example [4] or [3]).
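The simulation approach mentioned above can be sketched as follows: simulate data under the null hypothesis K = 1, fit both models, and take an empirical quantile of the resulting statistics as a threshold. All parameter and sample-size values are illustrative, and D is taken as 2(ℓ₂ − ℓ₁), clipped at 0 against numerical optimisation error.

```python
import numpy as np
from scipy.optimize import minimize

# Sketch (illustrative): null distribution of the likelihood ratio
# statistic D for K = 1 vs K = 2, with known noise proportion pi0.
rng = np.random.default_rng(2)
r, pi0, n = 12.0, 0.1, 500    # toy values, not from the paper

def wrapped_exp_density(t, alpha):
    return alpha * np.exp(-alpha * t) / (1 - np.exp(-alpha * r))

def neg_log_lik(params, t, K):
    if K == 1:
        (a1,) = params
        signal = wrapped_exp_density(t, a1)
    else:
        a1, a2, w = params   # w = proportion of species 1 among the signal
        signal = w * wrapped_exp_density(t, a1) + (1 - w) * wrapped_exp_density(t, a2)
    return -np.sum(np.log(pi0 / r + (1 - pi0) * signal))

b = (1 / 30.0, 1 / 0.03)     # compact parameter set from the text
D = []
for _ in range(50):          # Monte Carlo replicates under the null K = 1
    noise = rng.random(n) < pi0
    t = np.where(noise, rng.uniform(0, r, n), rng.exponential(1 / 0.5, n) % r)
    fit1 = minimize(neg_log_lik, x0=[0.4], args=(t, 1), bounds=[b])
    # Start the K = 2 fit near the K = 1 optimum (slightly perturbed,
    # since the exact optimum is a stationary point of the K = 2 model).
    a_hat = fit1.x[0]
    fit2 = minimize(neg_log_lik, x0=[0.8 * a_hat, 1.25 * a_hat, 0.5],
                    args=(t, 2), bounds=[b, b, (0.0, 1.0)])
    D.append(max(0.0, 2 * (fit1.fun - fit2.fun)))

threshold = np.quantile(D, 0.95)
print(threshold)   # simulated threshold for a 5%-level test
```

In practice one would use many more replicates and match the simulated photon counts to the experimental ones.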
2. Selection of the number of exponential species K

2.1. Comparisons. We restricted ourselves to testing K = 1 versus K = 2. This can already be a difficult and interesting question when few observed photons are available. With the help of simulated observations, we first optimised θ by MLE for each K and then tested K = 1 versus K = 2 via the likelihood ratio statistic D.
Compared to the one given in [7], the preceding statistical test is as efficient but requires about 100 times fewer observations. For the reader’s convenience and for comparison, consider the table obtained in [7]. Here, the mean error rate is the average, over the number of simulations, of the percentage
…(Full text truncated)…