Initial guesses generation for fluorescence intensity distribution analysis

15
U M L C Initial Guesses Generation for Fluorescence Intensity Distribution Analysis Victor V. Skakun 1,3 , Eugene G. Novikov 2 ,. Vladimir V. Apanasovich 3 , Hans Tanke 4 , Andre Deelder 1 , Oleg A. Mayboroda 1 1 Department of Parasitology, Leiden University Medical Centre, Leiden, The Netherlands 2 Service Bioinformatique, Institut CURIE, Paris, France 3 Department of Systems Analysis, Belarusian State University, Minsk, Belarus 4 Department of Molecular Cell Biology, Leiden University Medical Centre, Leiden, The Netherlands

Transcript of Initial guesses generation for fluorescence intensity distribution analysis

UML

C

Initial Guesses Generation for Fluorescence Intensity Distribution Analysis

Victor V. Skakun1,3, Eugene G. Novikov2,. Vladimir V. Apanasovich3, Hans Tanke4, Andre Deelder1, Oleg A. Mayboroda1

1 Department of Parasitology, Leiden University Medical Centre, Leiden, The Netherlands

2 Service Bioinformatique, Institut CURIE, Paris, France3 Department of Systems Analysis, Belarusian State University, Minsk, Belarus4 Department of Molecular Cell Biology, Leiden University Medical Centre, Leiden,The Netherlands

2

UML

CWhy IG?

PCD Intensity

×105

IG generationAllows to:

increase efficiency and correctness of the fitavoid trapping into local minimaincrease robustness of the analysisdecrease number of iterationsreduce user participation and make the whole procedure more standardized

Demands:developing of straightforward, noniterative methods

3

UML

C

1 2

1 2

( , , , ) , 1, 2, , , (1)where , , , is a set of unknown parameters

k m k

m

M M k mη η ηη η η

= =… ……

1*

1

*

( ) (2)

(n) is a probability to get photons within a counting time interval

Nk k

kn

M n n P n

P n T

=

= < > =∑

0( ) ( ) (5)n

nG P nξ ξ

=

=∑1

ln ( ) (4)k

k k

d GKd ξ

ξξ

=

=

1*( 1) ( 1) ( 1) ( 1) ( ) (6)

N

kn k

F n n n k n n n k P n−

=

= < − − + > = − − +∑… …

1

1

where1 1 ( 1)!, (7)

!( 1)!

k

k k k i ii

k k kK F K Fi i i k i

−=

− − −⎛ ⎞ ⎛ ⎞= − =⎜ ⎟ ⎜ ⎟ − −⎝ ⎠ ⎝ ⎠∑

1 2( , , , ) (3)k m kK Kη η η =…

in application to factorial cumulants

Method of moments

4

UML

C

1

2 22

3 33

4 2 2 443

i ii

i ii

i ii

i ii

q c

q c

q c

q c

χ

χ

χ

χ

⎧< Φ >=⎪⎪< ΔΦ >=⎪⎪⎨< ΔΦ >=⎪⎪⎪< ΔΦ > − < ΔΦ > =⎪⎩

∑∑∑

Moment analysis of fluorescence fluctuations

(8)

( ) (9)kk B r drχ = ∫ Qian, H., and E.L. Elson. Biophys. J. 57, 1990

Qian, H., and E.L. Elson. PNAS 87, 1990

(10)kk k i i

iK c qχ= ∑

Muller J.D. Biophis. J. 86, 2004

Fluorescence Cumulant Analysis

Historical background

here Φ is the fluorescence intensity, ci is the number of molecules per observation volume, qi is the specific brightness expressed in cpm, B(r) is a spatial brightness function, i is the number of molecular species

5

UML

C

{ }( ) exp ( 1) exp ( 1) ( ) 1 (12)j jj V

G T c q TB r dVξ ξ λ ξ⎛ ⎞

⎡ ⎤= − + − −⎜ ⎟⎣ ⎦⎝ ⎠

∑ ∫

2 3 00 0( ), ln , ( ) (13)( )

xdV BA x ax bx x B r B eB rdx−⎡ ⎤= + + = =⎢ ⎥⎣ ⎦

1

22

( ) 1,

( ) 1.V

V

B r dV

B r dV

χ

χ

= =

= =

∫(14)

Evotec Biosystems AG. 1998. Int. Patent WO 98/16814.

Kask at al. PNAS 96, 1999.

Evotec Biosystems AG. 1998. Int. Patent WO 98/16814. Palo at al. Boiphis. J. 79, 2000

here cj is the mean number of molecules per observation volume, qj is the specific brightness expressed in cpmt, V is the observation volume, T is the counting time interval, B(r) is brightness profile function which is the product excitation intensity and detection efficiency, j is the number of molecular species and λ is the mean background count rate of detector.

where a, b are instrumental parametersand A0, B0 can be calculated from system of normalization equations:

FIDA

1Finally ( ) ( ( ))iP n FFT G e ϕ−=

0

( ) is photon counting distribution (PCD)( ) ( ) (11)n

nP nG P nξ ξ

=

=∑

6

UML

C

1

2 22

( )

, 3, 4, ,

j jj

j jj

k kk k j j

j

K c q T

K c q T

K c q T k

λ

χ

= +

=

= =

∑ …

∫∞ − ++=

0

3200 )()( dxbxaxxAeB kx

0 0 2

8(2 6 1) 2 3 2,2 3 2 8(2 6 1)

a b a bB Aa b a b+ + + +

= =+ + + +

(15)

(16)

(17)

1

ln ( )k

k k

d GKd ξ

ξξ

=

=

General system of equations for IG generation

7

UML

C

22 1

1 2

( ),( )

K K Tq cK T T K

λλ

−= =

5 32 2

4

4 22 2

3

16875(10 6 25)(2 2 3)(4 3 8) 16384

1024729(2 3 2)(4 3 8)(2 2 3) 729

K Ka b a ba b K

K Ka b a ba b K

+ + + +⎧ =⎪ + +⎪⎨ + + + +⎪ =⎪ + +⎩

21

1 22

64 (2 2 3)(2 6 1)27 (2 3 2)

K a b a bKK a b

λ + + + += −

+ +

12 2

2

3 33 2

24 4

4 3

35 5

5 4

( )

64(2 6 1)(2 2 3)27(2 3 2)

4(2 6 1) (4 3 8)(2 3 2)

4096(2 6 1) (10 6 25)625(2 3 2)

K cq TK cq T

a b a bK cq Ta b

a b a bK cq Ta b

a b a bK cq Ta b

λ= +⎧⎪ =⎪⎪ + + + +

=⎪ + +⎪⎨

+ + + +⎪ =⎪ + +⎪

+ + + +⎪ =⎪ + +⎩

Basic system of equations:

(18)

(19)

Solution:

(20)

(21)

IG for one component model

estimated parameters are c, q, λ, a, b

1 33 2 2

22 2

1 44 3 3

2

( ) 64(2 6 1)(2 2 3)27(2 3 2)

( ) 4(2 6 1) (4 3 8)(2 3 2)

K T K a b a bK a b

K T K a b a bK a b

λχ

λχ

− + + + +⎧ = =⎪ + +⎪⎨

− + + + +⎪ = =⎪ + +⎩

22 1

1 2

( ),( )

K K Tq cK T T K

λλ

−= =

Simplifications: background (λ) is known

(22) 21

12

( )kk

k k

K T KKλχ

−=In general (23)

(21)estimated parameters are c, q, a, b

8

UML

C

2 2 18 27 16 27 0 red lineα β αβ α β− + + − > −

5 324

1687516384

K KK

α = 4 223

1024729

K KK

β =

Shape of the fourth order polynomial with respect to parameter a

2 2 18 27 16 27 0 blue lineα β αβ α β− + + − ≤ −

Solution of system 19

Root selection:

setting admissible rangesminimization of χ2 criterion

Fig. 1. Roots of polynomial

4 3 21 2 3 4 5 0H a H a H a H a H+ + + + =

9

UML

CIG for two component model

1 1 1 2 22 2 2

2 1 1 2 2

3 3 33 1 1 2 2 2

24 4 4

4 1 1 2 2 3

35 5 5

5 1 1 2 2 4

66 1 1

( )

( )64(2 2 3)(2 6 1)( )

27(2 3 2)4(4 3 8)(2 6 1)( )

(2 3 2)4096(10 6 25)(2 6 1)( )

625(2 3 2)

(

K c q c q T

K c q c q Ta b a bK c q c q T

a ba b a bK c q c q T

a ba b a bK c q c q T

a b

K c q c

λ= + +

= ++ + + +

= ++ +

+ + + += +

+ +

+ + + += +

+ +

= +4

6 62 2 5

6 57 7 7

7 1 1 2 2 6

4096(2 6)(2 6 1))27(2 3 2)

8 (14 6 49)(2 6 1)( )2401(2 3 2)

a b a bq Ta b

a b a bK c q c q Ta b

+ + + ++ +

+ + + += +

+ +

Initial system of equations

(24)

estimated parameters are c1, c2, q1, q2, λ, a, b

1 1 1 2 22 2 2

2 1 1 2 23 3 3

3 3 1 1 2 24 4 4

4 4 1 1 2 2

( )( )

( )

( ) ,

K T c q c q TK c q c q TK c q c q T

K c q c q T

λ

χ

χ

− = += +

= +

= +

Simplifications:1. background (λ) and instrumental parameters a and b are known

(25)

3 2

2

4 3

64(2 6 1)(2 2 3)27(2 3 2)

4(2 6 1) (4 3 8)(2 3 2)

a b a ba b

a b a ba b

χ

χ

+ + + +=

+ +

+ + + +=

+ +

(26)

2. background (λ) is knownA number of predefined parameters a and b used for solution of the sysmem 25. As result a number of sets of parameters is generated and the set resulting in lowest χ2 criterion is accepted.

10

UML

CTesting of IG for one component model

on simulated data

Table 1. IG for one component model on noisy PCD at different S/N. IG were rejected when either λ was negative or a and b exceed bounds (-2, 0); (0, 2) respectively. T = 5×10-5. 50 simulations in each series.

1. IG for all parameters c, q, λ, a, b

Table 2. IG calculated for one component model on noisy PCD at different S/N. λ fixed to 2000. T = 5×10-5.

2. IG for parameters c, q, a, b (λ is known)

Parameter Used for simulation

Recovered

S/Ni = 1000 S/Ni = 300 S/Ni = 50

c 5 5.000±0.006 4.999±0.025 4.999±0.107

q 20000 20002±24 20003±107 20029±455

a -1 -1.000±0.004 -1.006±0.015 -1.048±0.106

b 0.5 0.500±0.008 0.514±0.033 0.522±0.183

Parameter Used for simulation

Recovered

S/Ni = 7000 S/Ni = 3000 S/Ni = 1000

c 5 4.994±0.126 4.882±0.205 4.327±0.542

q 20000 20016±252 20253±436 21633±1462

λ 2000 2064±1260 3208±2086 9170±5948

a -1 -0.999±0.019 -0.980±0.034 -0.853±0.152

b 0.5 0.500±0.002 0.499±0.003 0.495±0.009

max/ (1 )maxS N mp p= −

max

Value at Maximum

/ initialS N mp= =

=

here max( ( )),max np P n=

m is total number of photons

11

UML

C

Testing of IG for two component modelon simulated data

Table 3. IG calculated for two component model on noisy PCD at different S/N. λ = 1000. a = -1; b = 0.5; T = 2×10-5. 50 simulations in each series.

IG for parameters c1, c2, q1, q2 (λ, a, b are known)

Parameter Used for simulation

Recovered

S/Ni = 1000 S/Ni = 100

c1 10 9.99±0.13 10.11±1.70

q2 20000 19883±635 18282±5013

c1 2 2.04±0.23 2.59±1.64

q2 50000 49847±1404 50931±11676

12

UML

CTesting of IG for one component model

on measured data

Table 4. IG calculated for one component model on measured data (Alexa 488). T = 8×10-6 .

1. IG for all parameters c, q, λ, a, b

Table 5. IG calculated for one component model on measured data (Alexa 488). λ fixed to 1500. T = 8×10-6

2. IG for parameters c, q, a, b (λ is estimated from additional measurement)

Parameters IG Best fit = fit starting from IG

χ2 0.668 0.666

c 24.916 24.923±0.189

q 18500 18500±140.5

a -1.463 -1.472±0.0014

b 0.321 0.324±0.0004

Parameters IG Fit starting from IG

χ2 0.781 0.760

c 23.12 25.08±4.07

q 19192 18426±1495

λ 18424 27±37465

a -4.41 -4.58±0.47

b 3.22 3.47±0.63

Confidential intervals are calculated as Asymptotic Standard Errors.

Confidential intervals are calculated as Asymptotic Standard Errors.

13

UML

CTesting of IG for two component model

on measured dataTable 8. IG calculated for two component model on measured data (mixture of IgG labeled with Alexa488 and pure dye). λ fixed to 1000. T = 2×10-5

Parameters IG Best fit

χ2 1.26 0.78

c1 5.477 5.166±0.089

q1 32099 32768±872

c2 0.387 0.528±0.052

q2 90770 78824±2511

λ 1000 (fixed) 1000 (fixed)

a -0.85 -0.769±0.014

b 0.25 0.296±0.018

Confidential intervals are calculated as Asymptotic Standard Errors.

14

UML

Cminima in χ2 space

Parameter set 1 set 2 set 3

a -1.393 -1.463 -4.643

b -0.861 0.321 3.539

χ2 0.732 0.668 0.711

Parameter Value

c 24.923

q 18500

λ 1500

a varied

b varied

Table 6. Parameters used for calculation (obtained from best fit of Alexa 488). T = 8×10-6.

Table 7. Calculation of χ2 for all three roots of the system 22.

1 332 2

222

1 443 3

2

( )64(2 6 1)(2 2 3)27(2 3 2)

( )4(2 6 1) (4 3 8)(2 3 2)

K T Ka b a ba b K

K T Ka b a ba b K

λ χ

λ χ

−+ + + +⎧ = =⎪ + +⎪⎨

−+ + + +⎪ = =⎪ + +⎩

(22)

Fig. 2. χ2 surface plotted versus a and b.

15

UML

CConclusions

1. In theory, if we have molecular system with n-components, IG can be obtained as solution of system of equations 15.

2. A non iterative, straightforward method of IG calculation for one and two-component system is proposed.

3. Applicability of method was verified with testing on simulated and measured data.

4. A rule of selecting of IG from possible results is suggested.

5. Impact of possible discontinuity points in model domain and consequently in χ2 space is discussed

6. developed IG allows to increase the speed of analysis in most cases at least into 5 times.