Value Measurement in Statistically Uncertain Conditions

I. I. Gorban

SE “UkrNIUTs,” Kyiv, Ukraine

Received in final form March 5, 2007

Abstract—A mathematical model for measuring values has been proposed taking into account the

uncertainty of statistical conditions of forming the measured value and its estimate. This model is based

on presenting measured value and its estimate as hyper-random values. The point and interval methods of

estimating hyper-random values have been developed. The notions of biased, consistent and efficient

estimates were extended to the case of hyper-random estimates. The potential accuracy of measurements

was investigated. It was shown that under statistically uncertain conditions the potential accuracy of value measurement is limited and determined by the parameters of estimate bias.

DOI: 10.3103/S0735272708070017

INTRODUCTION

In constructing physical models of measured values and their estimates it is generally assumed that the

values subjected to measurement are deterministic, while their estimates due to the action of various

hampering factors are random values. That is why deterministic mathematical models are often used for the

mathematical description of measured values, while their estimates are described by using random

(stochastic) models with specific distribution laws. This represents the basis of the modern classical theory

of measurements that forms the theoretical basis of the commonly used applied metrology.

At relatively small space-time intervals of observation these assumptions generally do not provoke any

objections, but often prove to be invalid at large intervals.

Let us consider, for example, the power-supply system of an aircraft. It seems natural to assume that

during several minutes the EMF of such system is constant, while due to the alternating switching on/off of

various loads the mains voltage measured by a voltmeter and used for estimating the EMF varies in

accordance with a random law described by a certain distribution function. However, it is hardly admissible

to use such mathematical models while dealing with large time intervals. During the operation life of aircraft

specifications of its power-supply system undergo major changes due to aging of elements, variation of

adjustment parameters and many other reasons. In addition, even during one flight as the regime of flight

changes, a significant variation of operation modes of the generator and system load occurs resulting in

varying of the EMF and voltages registered. These circumstances do not allow us to describe EMF and the

results of measurements either by random values with certain distribution laws or even more so by

deterministic values. Different mathematical models are required that enable us to take into account the

statistical uncertainty of the distribution laws while conducting measurements. Such models include

nonparametric type models, in particular, hyper-random models described by using hyper-random values

and functions [1–3].

Hyper-random value is a nondeterministic value that can be presented [3] by a set of random values with

specific distribution laws depending on the uncertain conditions from a certain set. It should be noted that the

probability measure is not determined for these conditions.

Hyper-random value $X$ can be most comprehensively presented by set $G$ of conditions $g$ and the family of distribution functions $F_{x/g}(x)$ (or probability densities $f_{x/g}(x)$) corresponding to the specified conditions. A more generalized description is provided by the probability characteristics of the bounds, in particular, the bounds of distribution functions

$$F_{Sx}(x) = \sup_{g \in G} F_{x/g}(x) = F_{x/g_S}(x), \qquad F_{Ix}(x) = \inf_{g \in G} F_{x/g}(x) = F_{x/g_I}(x)$$

(where $g_S$, $g_I$ are the boundary conditions that may or may not belong to set $G$); a still more generalized description is ensured by the moments of distribution bounds (for example, mathematical expectations, root-mean-square deviations of bounds, etc.) and bounds of moments (for example, bounds of the mathematical expectation, bounds of the root-mean-square deviation, etc.).
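The bounds of the distribution function can be illustrated numerically. The following is a minimal sketch, assuming a finite set of three conditions with Gaussian conditional distributions; the condition names, means and deviations are hypothetical and chosen only for illustration.

```python
import math

def gaussian_cdf(x, mean, std):
    """Distribution function of a Gaussian random value."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

# Hypothetical finite set G of conditions: each g fixes (mean, std).
conditions = {"g1": (0.0, 1.0), "g2": (0.5, 1.5), "g3": (1.0, 0.8)}

def upper_bound(x):
    """Upper bound of the distribution function: supremum over g in G."""
    return max(gaussian_cdf(x, m, s) for m, s in conditions.values())

def lower_bound(x):
    """Lower bound of the distribution function: infimum over g in G."""
    return min(gaussian_cdf(x, m, s) for m, s in conditions.values())

for x in (-1.0, 0.0, 1.0, 2.0):
    print(x, round(lower_bound(x), 4), round(upper_bound(x), 4))
```

For every argument the lower bound does not exceed the upper bound; the gap between the two curves corresponds to the uncertainty zone of the hyper-random value.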

Two models of measurements with elements of hyperprobability were investigated in paper [4]. In one of

these models the measured value was described as a deterministic value, while its estimate was described as


ISSN 0735-2727, Radioelectronics and Communications Systems, 2008, Vol. 51, No. 7, pp. 349–363. © Allerton Press, Inc., 2008.

Original Russian Text © I.I. Gorban, 2008, published in Izv. Vyssh. Uchebn. Zaved., Radioelektron., 2008, Vol. 51, No. 7, pp. 3–22.

a hyper-random value; in the second model the measured value and its estimate were described as random

and hyper-random values, respectively. These mathematical models generalize the known models

representing the measured value as a deterministic or random value and its estimate as a random value.

Figures 1–4 schematically display the probability characteristics for a scalar value and its estimate that correspond to the specified models of measurements: deterministic-random (Fig. 1), random-random (Fig. 2), deterministic-hyper-random (Fig. 3) and random-hyper-random (Fig. 4), where $\theta$ is the deterministic measured value, $F_{\theta^*}(\theta)$ is the distribution function of random estimate $\Theta^*$, $F_{\theta}(\theta)$ is the distribution function of the measured random value $\Theta$, $F_{S\theta^*}(\theta)$ and $F_{I\theta^*}(\theta)$ are the upper and lower bounds of the distribution function of hyper-random estimate $\Theta^*$, respectively, $\varepsilon_0$ is the bias of the mathematical expectation $m_{\theta^*}$ of the random estimate (if the measured value is deterministic, then $\varepsilon_0 = m_{\theta^*} - \theta$; if it is random, then $\varepsilon_0 = m_{\theta^*} - m_{\theta}$), $m_{\theta}$ is the mathematical expectation of the measured value, $\sigma_{\theta}$, $\sigma_{\theta^*}$ are the root-mean-square deviations of the measured random value and its random estimate, $\varepsilon_{S0}$ and $\varepsilon_{I0}$ are the biases of the upper and lower bounds of the distribution function of the hyper-random estimate with respect to the measured value (if this value is deterministic, then $\varepsilon_{S0} = m_{S\theta^*} - \theta$, $\varepsilon_{I0} = m_{I\theta^*} - \theta$; if it is random, then $\varepsilon_{S0} = m_{S\theta^*} - m_{\theta}$, $\varepsilon_{I0} = m_{I\theta^*} - m_{\theta}$), $m_{S\theta^*}$, $m_{I\theta^*}$ are the mathematical expectations of the upper and lower bounds of the hyper-random estimate, and $\sigma_{S\theta^*}$, $\sigma_{I\theta^*}$ are the root-mean-square deviations of the appropriate bounds of the hyper-random estimate. The uncertainty zone of the hyper-random value is depicted by the shaded area in Figs. 3, 4.

Fig. 1.

Fig. 2.

Fig. 3.

Fig. 4.

To a certain degree of approximation the deterministic value $\theta$ can be considered as a random value with a $\delta$-shaped distribution density. Therefore, deterministic value $\theta$ is presented by a step distribution function in Figs. 1 and 3.

Of particular interest is the deterministic-interval model of measurements, where the measured value is considered as a deterministic value and its estimate as an interval value.

Interval values [5–7] are generally used for description of inaccurately obtained data. These values are

not random. For an interval value only the magnitude range of this value is determined, but not its probability

measure. To a certain degree of approximation the interval value $\tilde{x} = [x_1, x_2]$ can be considered as hyper-random value $X$ for which $G = [x_1, x_2]$ and the conditional distribution function $F_{x/g}(x)$ is described for all $g \in [x_1, x_2]$ by the following expression

$$F_{x/g}(x) = \begin{cases} 0, & \text{if } x < g, \\ 0.5, & \text{if } x = g, \\ 1, & \text{if } x > g. \end{cases}$$

Figure 5 schematically displays a deterministic-interval model of measurements for scalar value $\theta$ and its interval estimate. The uncertainty zone of the interval estimate is depicted by the shaded area.
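As a worked illustration (not part of the original derivation), the bounds of the distribution function of an interval value treated as a hyper-random value follow from the step-like conditional distribution function above by taking the supremum and infimum over $g \in [x_1, x_2]$:

```latex
F_{Sx}(x) = \sup_{g \in [x_1, x_2]} F_{x/g}(x) =
\begin{cases} 0, & x < x_1, \\ 0.5, & x = x_1, \\ 1, & x > x_1, \end{cases}
\qquad
F_{Ix}(x) = \inf_{g \in [x_1, x_2]} F_{x/g}(x) =
\begin{cases} 0, & x < x_2, \\ 0.5, & x = x_2, \\ 1, & x > x_2. \end{cases}
```

The two bounds are unit steps located at the ends of the interval, so the uncertainty zone between them spans exactly $[x_1, x_2]$.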

The purpose of the present paper is building a hyper-random–hyper-random model of measurements and

investigating the measurement accuracy under statistically uncertain conditions.

The urgency of this problem follows from the hypothesis presented in paper [2] and asserting that

practically all real phenomena (except, possibly, a small number of values considered by the modern science

as world physical constants) have the hyper-random nature. This hypothesis is based on the fact that real

phenomena (events, quantities, processes, and fields) depend on statistically unstable physical conditions

varying in unpredictable manner. It should be noted that the hyper-random nature is intrinsic not only to the

real values and processes under investigation, but also their estimates [3, 8, 9].

HYPER-RANDOM–HYPER-RANDOM MODEL OF MEASUREMENTS

By the hyper-random–hyper-random model of measurements we mean a model in which both the measured value and its estimate are presented as hyper-random values.

Let set $G$ cover the set of all variants of conditions of forming the measured value and its estimate. During the time of taking a sample under the fixed conditions $g \in G$ the magnitude of the measured value does not change. The measured hyper-random value $\Theta$ represents a set of random values $\Theta/g$ describing the measured value under the fixed conditions $g \in G$: $\Theta = \{\Theta/g \in G\}$ (Fig. 6). Random value $\Theta/g$ can take on the set of specific values $\{\theta/g\}$. The set of hyper-random values $\{\Theta\}$ forms space $\Theta_0$.

Hyper-random estimate $\Theta^*$ corresponding to the measured hyper-random value $\Theta$ represents a set of random values $\Theta^*/g$ describing estimates of random values $\Theta/g$ under conditions $g \in G$: $\Theta^* = \{\Theta^*/g \in G\}$. The distribution function of random value $\Theta^*/g$ is determined by the distribution law of random value $\Theta^*/\theta,g$ describing the estimate at a specific value of measured quantity $\theta/g$ and by the distribution law of random value $\Theta/g$. Random value $\Theta^*/\theta,g$ can take on the set of specific values $\{\theta^*/\theta,g\}$. The set of hyper-random estimates $\{\Theta^*\}$ forms the space $\Theta_0^*$.



Fig. 5.


Hyper-random estimate $\Theta^*$ is formed on the basis of multidimensional hyper-random data sampling $\vec{X} = \{\vec{X}/g \in G\}$ from the general set of hyper-random value $X = \{X/g \in G\}$ accessible for direct measurement. Hyper-random estimate $\Theta^*$ is a function (statistic) of hyper-random sampling $\vec{X}$; random estimates $\Theta^*/g$ and $\Theta^*/\theta,g$ are functions of random samplings $\vec{X}/g$ and $\vec{X}/\theta,g$, respectively, while the specific estimate $\theta^*/\theta,g$ is a function of specific sampling $\vec{x}/\theta,g$.

Values $\Theta$, $\Theta^*/\theta,g$, $\Theta^*/g$, and $\Theta^*$ can be described as follows.

Hyper-random value $\Theta$ is characterized by probabilistic characteristics (conditional distribution functions $F_{\theta/g}(\theta)$, probability densities $f_{\theta/g}(\theta)$, bounds of the distribution function $F_{S\theta}(\theta)$, $F_{I\theta}(\theta)$, etc.), conditional parameters (mathematical expectation $m_{\theta/g}$, root-mean-square deviation $\sigma_{\theta/g}$, etc.) and unconditional parameters (mathematical expectations of bounds $m_{S\theta}$, $m_{I\theta}$, root-mean-square deviations of bounds $\sigma_{S\theta}$, $\sigma_{I\theta}$, etc.).

Random estimate $\Theta^*/\theta,g$ is characterized by probabilistic characteristics (distribution function $F_{\theta^*/\theta,g}(\theta^*)$, distribution density $f_{\theta^*/\theta,g}(\theta^*)$, etc.) and numerical parameters (mathematical expectation $m_{\theta^*/\theta,g}$, root-mean-square deviation $\sigma_{\theta^*/\theta,g}$, etc.), while random estimate $\Theta^*/g$ is characterized by distribution function $F_{\theta^*/g}(\theta^*)$, mathematical expectation $m_{\theta^*/g} = M[m_{\theta^*/\theta,g}]$, root-mean-square deviation $\sigma_{\theta^*/g}$, etc., where $M[\cdot]$ is the operator of mathematical expectation.

Hyper-random estimate $\Theta^*$ is described by probabilistic characteristics (conditional distribution functions $F_{\theta^*/g}(\theta^*)$ for all $g \in G$, bounds of the distribution function $F_{S\theta^*}(\theta^*)$, $F_{I\theta^*}(\theta^*)$, etc.), conditional parameters (mathematical expectations $m_{\theta^*/g}$, root-mean-square deviations $\sigma_{\theta^*/g}$ for all $g \in G$, etc.) and unconditional parameters (mathematical expectations of bounds $m_{S\theta^*}$, $m_{I\theta^*}$, root-mean-square deviations of bounds $\sigma_{S\theta^*}$, $\sigma_{I\theta^*}$, etc.).

Hyper-random–hyper-random model of measurements is schematically displayed in Fig. 7.

One of the main tasks of measuring hyper-random value $\Theta$ involves the need to compute the estimate and assess the accuracy of measurement on the basis of the available specific sampling $\vec{x}/\theta,g$ corresponding to unknown value $\theta$ and unknown conditions $g \in G$, and also a priori information about the parameters and characteristics of the estimate and the measured value.

The estimate can be considered as a point or interval one. In the first case a specific estimate $\theta^*/\theta,g$ should be formed on the basis of sampling $\vec{x}/\theta,g$, and the error bounds corresponding to hyper-random value $\Theta$ and hyper-random estimate $\Theta^*$ should be specified for uncertain conditions.

In the second case, with due regard for hyper-random properties of the error, it is necessary to compute the limits of the confidence interval covering the measured hyper-random value $\Theta$. Let us consider both types of estimates.

POINT ESTIMATE OF THE HYPER-RANDOM VALUE

Under the fixed conditions $g$ the proximity of random estimate $\Theta^*/g$ to random value $\Theta/g$ can be judged by the conditional distribution function $F_{z/g}(z)$ of error $Z/g = \Theta^*/g - \Theta/g$. The square root of the average squared error can be used as a metric: $\Delta_{z/g} = \sqrt{M[|\Theta^*/g - \Theta/g|^2]}$.



Fig. 6.

This value $\Delta_{z/g}$ is related to the mathematical expectation $m_{z/g}$ and root-mean-square deviation $\sigma_{z/g}$ of the error in accordance with the following expression: $\Delta_{z/g}^2 = m_{z/g}^2 + \sigma_{z/g}^2$. It can be easily seen that value $m_{z/g}$ represents the estimate bias $\varepsilon_g = m_{\theta^*/g} - m_{\theta/g}$ under conditions $g$, while error dispersion $\sigma_{z/g}^2$ is related to the conditional moments of the measured value and its estimate in accordance with the following formula: $\sigma_{z/g}^2 = \sigma_{\theta^*/g}^2 + \sigma_{\theta/g}^2 - 2R_{\theta^*\theta/g}$, where $R_{\theta^*\theta/g} = M[(\Theta^*/g - m_{\theta^*/g})(\Theta/g - m_{\theta/g})]$ is the conditional covariance moment of the estimate and the measured value.
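The two relations above (average squared error as bias squared plus error dispersion, and error dispersion through the dispersions and covariance of the estimate and the measured value) can be checked on sample moments. The sketch below assumes a single fixed condition $g$ and uses made-up paired values; the function name `error_moments` and the data are hypothetical.

```python
from statistics import fmean

def error_moments(theta, theta_star):
    """Sample moments of the error z = theta_star - theta under one fixed condition g."""
    z = [e - t for t, e in zip(theta, theta_star)]
    m_t, m_e, m_z = fmean(theta), fmean(theta_star), fmean(z)
    var_t = fmean([(t - m_t) ** 2 for t in theta])      # dispersion of the measured value
    var_e = fmean([(e - m_e) ** 2 for e in theta_star])  # dispersion of the estimate
    cov = fmean([(e - m_e) * (t - m_t) for t, e in zip(theta, theta_star)])
    var_z = fmean([(v - m_z) ** 2 for v in z])           # error dispersion
    mse = fmean([v ** 2 for v in z])                     # average squared error
    return {"bias": m_z, "var_z": var_z, "mse": mse,
            "var_theta": var_t, "var_est": var_e, "cov": cov}

# Hypothetical paired values of the measured quantity and its estimate.
theta = [27.0, 27.5, 28.0, 28.5]
theta_star = [27.4, 27.3, 28.6, 28.9]
r = error_moments(theta, theta_star)
print(abs(r["mse"] - (r["bias"] ** 2 + r["var_z"])) < 1e-9)
print(abs(r["var_z"] - (r["var_est"] + r["var_theta"] - 2 * r["cov"])) < 1e-9)
```

Both identities hold for sample moments normalized by the sample size, so the two comparisons agree to within rounding error.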

Under uncertain conditions the accuracy of measurement is characterized by the bounds of the error distribution function $F_{Sz}(z)$, $F_{Iz}(z)$ and also by values $\Delta_{Sz}$, $\Delta_{Iz}$ representing the roots of the squared errors averaged with respect to the bounds (Fig. 8):

$$\Delta_{Sz} = \sqrt{\int_{-\infty}^{\infty} z^2 f_{Sz}(z)\,dz}, \qquad \Delta_{Iz} = \sqrt{\int_{-\infty}^{\infty} z^2 f_{Iz}(z)\,dz},$$

where $f_{Sz}(z)$, $f_{Iz}(z)$ are the distribution densities of bounds corresponding to distribution functions $F_{Sz}(z)$ and $F_{Iz}(z)$.

Values $\Delta_{Sz}$, $\Delta_{Iz}$ are determined by mathematical expectations $m_{Sz}$, $m_{Iz}$ of the bounds of distribution function $F_{Sz}(z)$, $F_{Iz}(z)$ and root-mean-square deviations of bounds $\sigma_{Sz}$, $\sigma_{Iz}$: $\Delta_{Sz}^2 = m_{Sz}^2 + \sigma_{Sz}^2$, $\Delta_{Iz}^2 = m_{Iz}^2 + \sigma_{Iz}^2$. Value $\Delta_{Sz}$ can be either more or less than value $\Delta_{Iz}$. If $m_{Iz} < 0$, then $\Delta_{Iz} \le \Delta_{z/g} \le \Delta_{Sz}$; if $m_{Sz} m_{Iz} < 0$, then $0 \le \Delta_{z/g} \le \max(\Delta_{Sz}, \Delta_{Iz})$; if $m_{Sz} > 0$, then $\Delta_{Sz} \le \Delta_{z/g} \le \Delta_{Iz}$.

The error of a specific measurement $z/g = \theta^*/\theta,g - \theta/g$ in unknown conditions $g$ can be estimated by the inequality

$$m_{Sz} - k\sigma_{Sz} < z/g < m_{Iz} + k\sigma_{Iz}, \tag{1}$$

while the interval of location of the measured value $\theta/g$ in the presence of estimate $\theta^*/\theta,g$ can be estimated by the following inequality

$$\theta^*/\theta,g - m_{Iz} - k\sigma_{Iz} < \theta/g < \theta^*/\theta,g - m_{Sz} + k\sigma_{Sz}, \tag{2}$$

where $k$ is a certain constant (Fig. 8) determined by the degree of confidence in the result of measurement.

It should be noted that inequalities (1) and (2) account for the difference in dispersions of the distribution bounds, which is a major factor if this difference is large.
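Inequalities (1) and (2) translate directly into two small functions. The sketch below is an illustration only: the function names, the numerical parameters of the error distribution bounds and the value of the estimate are hypothetical.

```python
def error_interval(m_S, sigma_S, m_I, sigma_I, k):
    """Inequality (1): interval for the error z/g from the bound parameters."""
    return (m_S - k * sigma_S, m_I + k * sigma_I)

def value_interval(estimate, m_S, sigma_S, m_I, sigma_I, k):
    """Inequality (2): interval for the measured value, since theta = estimate - z."""
    z_low, z_high = error_interval(m_S, sigma_S, m_I, sigma_I, k)
    return (estimate - z_high, estimate - z_low)

# Hypothetical parameters of the upper (S) and lower (I) error distribution bounds.
params = dict(m_S=-0.25, sigma_S=0.125, m_I=0.5, sigma_I=0.25, k=2)
print(error_interval(**params))        # (-0.5, 1.0)
print(value_interval(28.0, **params))  # (27.0, 28.5)
```

The interval for the measured value is the error interval reflected about the estimate, which is why the parameters of the lower bound enter its left end and those of the upper bound its right end.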

Expressions (1) and (2) simplify when the conditional distribution functions of error $Z/g$ for all $g \in G$ do not intersect and, with the rise of conditional mathematical expectations of the error, its conditional dispersions increase (distribution type “a” in accordance with the classification offered in paper [3]) or decrease (distribution type “b”). Then intervals (1) and (2) are characterized by the bounds of the error mathematical expectation $m_{iz}$, $m_{sz}$ and bounds of the error root-mean-square deviation $\sigma_{iz}$, $\sigma_{sz}$, the computation of which does not require computing the bounds of the distribution function. Analytically these values are described by the following expressions:

$$m_{iz} = \inf_{g \in G} m_{z/g} = \inf_{g \in G} \varepsilon_g = \varepsilon_i, \qquad m_{sz} = \sup_{g \in G} m_{z/g} = \sup_{g \in G} \varepsilon_g = \varepsilon_s,$$

$$\sigma_{iz} = \inf_{g \in G} \sigma_{z/g} = \inf_{g \in G} \sqrt{\sigma_{\theta^*/g}^2 + \sigma_{\theta/g}^2 - 2R_{\theta^*\theta/g}}, \qquad \sigma_{sz} = \sup_{g \in G} \sigma_{z/g} = \sup_{g \in G} \sqrt{\sigma_{\theta^*/g}^2 + \sigma_{\theta/g}^2 - 2R_{\theta^*\theta/g}}. \tag{3}$$

Fig. 7.

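Then, inequalities (1) and (2) for the distribution of type “a” assume, correspondingly, the following form:

$$\varepsilon_i - k\sigma_{iz} < z/g < \varepsilon_s + k\sigma_{sz}, \qquad \theta^*/\theta,g - \varepsilon_s - k\sigma_{sz} < \theta/g < \theta^*/\theta,g - \varepsilon_i + k\sigma_{iz}, \tag{4}$$

while for the distribution of type “b”:

$$\varepsilon_i - k\sigma_{sz} < z/g < \varepsilon_s + k\sigma_{iz}, \qquad \theta^*/\theta,g - \varepsilon_s - k\sigma_{iz} < \theta/g < \theta^*/\theta,g - \varepsilon_i + k\sigma_{sz}. \tag{5}$$

The exact bounds of the average squared error $\Delta_{iz}^2$, $\Delta_{sz}^2$ are determined by the difference of mathematical expectations of estimate $m_{\theta^*/g}$ and measured value $m_{\theta/g}$ (estimate bias $\varepsilon_g$), dispersions of the estimate $\sigma_{\theta^*/g}^2$ and measured value $\sigma_{\theta/g}^2$, and also by the covariance moment $R_{\theta^*\theta/g}$ under different conditions $g \in G$:

$$\Delta_{iz}^2 = \inf_{g \in G} \Delta_{z/g}^2 = \inf_{g \in G} \left(\varepsilon_g^2 + \sigma_{\theta^*/g}^2 + \sigma_{\theta/g}^2 - 2R_{\theta^*\theta/g}\right),$$

$$\Delta_{sz}^2 = \sup_{g \in G} \Delta_{z/g}^2 = \sup_{g \in G} \left(\varepsilon_g^2 + \sigma_{\theta^*/g}^2 + \sigma_{\theta/g}^2 - 2R_{\theta^*\theta/g}\right).$$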


Fig. 8.

ADDITIVE INTERFERENCE MODEL

In this case, estimate $\Theta^*$ can be presented as a sum of the measured hyper-random value $\Theta$ and hyper-random interference $W$. At the same time, bias $\varepsilon_g$ is equal to mathematical expectation $m_{w/g}$ of random interference $W/g$, error dispersion $\sigma_{z/g}^2$ is equal to the interference dispersion $\sigma_{w/g}^2$, bias bounds $\varepsilon_i$ and $\varepsilon_s$ are equal to bounds of the interference mathematical expectation $m_{iw}$, $m_{sw}$, respectively, while the bounds of error dispersion $\sigma_{iz}^2$ and $\sigma_{sz}^2$ are equal to the bounds of interference dispersion $\sigma_{iw}^2$ and $\sigma_{sw}^2$, respectively.

Then inequalities (4) for the type “a” distribution assume the form:

$$m_{iw} - k\sigma_{iw} < z/g < m_{sw} + k\sigma_{sw}, \qquad \theta^*/\theta,g - m_{sw} - k\sigma_{sw} < \theta/g < \theta^*/\theta,g - m_{iw} + k\sigma_{iw},$$

while inequalities (5) for the type “b” distribution acquire the form:

$$m_{iw} - k\sigma_{sw} < z/g < m_{sw} + k\sigma_{iw}, \qquad \theta^*/\theta,g - m_{sw} - k\sigma_{iw} < \theta/g < \theta^*/\theta,g - m_{iw} + k\sigma_{sw}.$$

Then the bounds of the average squared error are determined by the following formula:

$$\Delta_{iz}^2 = \inf_{g \in G} \left(m_{w/g}^2 + \sigma_{w/g}^2\right), \qquad \Delta_{sz}^2 = \sup_{g \in G} \left(m_{w/g}^2 + \sigma_{w/g}^2\right). \tag{6}$$

From expression (6) it follows that in the case of the additive model of interference the hyper-random features of the measured value do not affect the accuracy of measurement. Only the mathematical expectation and dispersion of the interference play a significant role. With negligibly small dispersion $\sigma_{w/g}^2$ for all $g \in G$, the limits of the average squared error are equal to the appropriate limits of the squared mathematical expectation of the interference $m_{iw}^2$, $m_{sw}^2$.
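For a finite set of conditions, the bounds in expression (6) reduce to a minimum and a maximum over the conditions. The following sketch assumes three hypothetical conditions, each described by the mathematical expectation and root-mean-square deviation of the interference; the names and numbers are illustrative only.

```python
# Hypothetical finite set G: condition -> (mean, root-mean-square deviation) of W/g.
interference = {"g1": (0.5, 1.0), "g2": (-1.5, 0.5), "g3": (0.0, 2.0)}

def additive_bounds(conditions):
    """Bounds of the interference mean and of the average squared error, as in (6)."""
    means = [m for m, _ in conditions.values()]
    mse = [m ** 2 + s ** 2 for m, s in conditions.values()]
    return {"m_iw": min(means), "m_sw": max(means),
            "mse_i": min(mse), "mse_s": max(mse)}

bounds = additive_bounds(interference)
print(bounds)  # {'m_iw': -1.5, 'm_sw': 0.5, 'mse_i': 1.25, 'mse_s': 4.0}
```

Note that no parameter of the measured value enters the computation, in line with the remark that only the interference determines the accuracy in the additive model.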

MULTIPLICATIVE MODEL OF INTERFERENCE

In this case estimate $\Theta^*$ can be presented in the form $\Theta^* = \Theta(1 + \Xi)$, while error $Z/g = (\Theta/g)(\Xi/g)$, where $\Xi$, $\Xi/g$ are, respectively, hyper-random and random values describing the multiplier of the multiplicative interference. It should be noted that the multiplicative model ensures a good description of numerous estimates, in particular, the estimate of EMF of the aircraft power-supply system discussed earlier.

If values $\Theta/g$, $\Xi/g$ are independent at any $g$, the mathematical expectation of error $m_{z/g}$ (estimate bias) is equal to $m_{\theta/g} m_{\xi/g}$, while the dispersion is determined by the formula:

$$\sigma_{z/g}^2 = \sigma_{\theta/g}^2 \sigma_{\xi/g}^2 + m_{\theta/g}^2 \sigma_{\xi/g}^2 + m_{\xi/g}^2 \sigma_{\theta/g}^2,$$

where $m_{\xi/g}$, $\sigma_{\xi/g}^2$ are, respectively, the mathematical expectation and dispersion of the multiplier $\Xi/g$. In this case, the average square of the error $\Delta_{z/g}^2 = (m_{\theta/g}^2 + \sigma_{\theta/g}^2)(m_{\xi/g}^2 + \sigma_{\xi/g}^2)$ and the limits of the average square of the error are determined by the formulas:

$$\Delta_{iz}^2 = \inf_{g \in G} \left[(m_{\theta/g}^2 + \sigma_{\theta/g}^2)(m_{\xi/g}^2 + \sigma_{\xi/g}^2)\right], \qquad \Delta_{sz}^2 = \sup_{g \in G} \left[(m_{\theta/g}^2 + \sigma_{\theta/g}^2)(m_{\xi/g}^2 + \sigma_{\xi/g}^2)\right],$$

while parameters of the distribution bounds present in inequalities (1) and (2) are described by the following expressions:

$$m_{Sz} = m_{S\theta} m_{S\xi}, \qquad m_{Iz} = m_{I\theta} m_{I\xi},$$

$$\sigma_{Sz}^2 = \sigma_{S\theta}^2 \sigma_{S\xi}^2 + m_{S\theta}^2 \sigma_{S\xi}^2 + m_{S\xi}^2 \sigma_{S\theta}^2,$$

$$\sigma_{Iz}^2 = \sigma_{I\theta}^2 \sigma_{I\xi}^2 + m_{I\theta}^2 \sigma_{I\xi}^2 + m_{I\xi}^2 \sigma_{I\theta}^2,$$

where $m_{S\xi}$, $m_{I\xi}$ are mathematical expectations, while $\sigma_{S\xi}^2$, $\sigma_{I\xi}^2$ are dispersions of the distribution bounds of the multiplicative interference factor.

As can be seen, in this case the measurement error is determined by mathematical expectations and dispersions of two quantities: the factor of the multiplicative interference and the value measured.
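The moment relations of the multiplicative model for one fixed condition (error equal to the product of the measured value and an independent interference factor) can be verified by exact enumeration over two small discrete distributions. The distributions below are hypothetical; the helper names `moments` and `product_error_moments` are introduced only for this sketch.

```python
from itertools import product
from math import isclose

def moments(dist):
    """Mean and dispersion of a discrete distribution {value: probability}."""
    mean = sum(v * p for v, p in dist.items())
    var = sum((v - mean) ** 2 * p for v, p in dist.items())
    return mean, var

def product_error_moments(value_dist, factor_dist):
    """Exact mean, dispersion and mean square of Z = value * factor (independent)."""
    pairs = list(product(value_dist.items(), factor_dist.items()))
    z_mean = sum(v * f * pv * pf for (v, pv), (f, pf) in pairs)
    z_sq = sum((v * f) ** 2 * pv * pf for (v, pv), (f, pf) in pairs)
    return z_mean, z_sq - z_mean ** 2, z_sq

# Hypothetical distributions: measured value (e.g. EMF in volts) and interference factor.
value_dist = {27.0: 0.5, 29.0: 0.5}
factor_dist = {-0.02: 0.25, 0.0: 0.5, 0.04: 0.25}

m_v, var_v = moments(value_dist)
m_f, var_f = moments(factor_dist)
z_mean, z_var, z_sq = product_error_moments(value_dist, factor_dist)

print(isclose(z_mean, m_v * m_f))
print(isclose(z_var, var_v * var_f + m_v ** 2 * var_f + m_f ** 2 * var_v))
print(isclose(z_sq, (m_v ** 2 + var_v) * (m_f ** 2 + var_f)))
```

The three comparisons correspond to the bias, the error dispersion and the average squared error; repeating the computation for each condition and taking the minimum and maximum would give the limits of the average squared error.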

UNBIASED HYPER-RANDOM ESTIMATES


Hyper-random estimate $\Theta^*$ of hyper-random value $\Theta$ shall be called unbiased (unbiased under all conditions $g \in G$) [3], if for all $g \in G$ the mathematical expectation $m_{\theta^*/g}$ of random value $\Theta^*/g$ is equal to mathematical expectation $m_{\theta/g}$ of conditional random value $\Theta/g$, i.e., if $\varepsilon_g = 0\ \forall g \in G$. Otherwise, the estimate shall be called biased. The definition presented for the notion of biased estimate for a hyper-random value and hyper-random estimate is in agreement with the conventional definition of the same notion for a deterministic value and random estimate [10–12], and also for a deterministic value and hyper-random estimate [4].

It should be noted that if the estimate is unbiased, the bounds of mathematical expectation of the measured value and its estimate coincide: $m_{i\theta} = m_{i\theta^*}$, $m_{s\theta} = m_{s\theta^*}$. However, the fact that the estimate is unbiased does not necessarily imply the coincidence of the appropriate mathematical expectations of bounds ($m_{S\theta} = m_{S\theta^*}$, $m_{I\theta} = m_{I\theta^*}$). Coincidence occurs only in certain particular cases, for example, when both distributions of values $\Theta$ and $\Theta^*$ refer to type “a” or type “b”. In these two cases, given the unbiased estimate, mathematical expectations of error distribution bounds are equal to zero: $m_{Sz} = m_{Iz} = 0$.

A particular case of the biased estimate that is biased by a fixed value $\varepsilon_0\ \forall g \in G$ is of special interest. Then the bias bounds satisfy the relationship: $\varepsilon_i = \varepsilon_s = \varepsilon_0$.

CONSISTENT HYPER-RANDOM ESTIMATES

As is known [10, 11], in the case of deterministic value $\theta$ and random estimate $\Theta^*$, the estimate is called consistent if it converges in probability to value $\theta$.

In the case of hyper-random estimate $\Theta^*/\theta$ of deterministic value $\theta$ the estimate is called consistent [3, 4], if it converges in probability [3, 12] to value $\theta$ under all conditions $g \in G$: $\lim_{N \to \infty} P\{|\Theta^*/\theta,g - \theta| > \varepsilon\} = 0\ \forall g \in G$, where $N$ is the volume of sampling for each condition $g$, and $\varepsilon > 0$ is an arbitrarily small number. The necessary condition of consistency of such an estimate for a fixed value of $\theta$ is its degeneration into a random value at $N \to \infty$. Estimates maintaining the hyper-random nature at $N \to \infty$ are not consistent [3, 12].

Hyper-random estimate $\Theta^*$ of hyper-random value $\Theta$ can be called consistent [3], if it converges in probability to this value for all conditions $g \in G$:

$$\lim_{N \to \infty} P\{|\Theta^*/g - \Theta/g| > \varepsilon\} = 0, \quad \forall g \in G,$$



where $N$ is the sampling volume for each condition $g$.

A particular case of the notion introduced is the consistent random estimate $\Theta^*$ of random value $\Theta$ determined under stationary and unique observation conditions in the following way:

$$\lim_{N \to \infty} P\{|\Theta^* - \Theta| > \varepsilon\} = 0.$$

The meaning of this expression is sufficiently transparent. Random error $Z = \Theta^* - \Theta$ at the sampling volume $N$ can be considered as a parametric family of random values described by the distribution function $F_{zN}(z)$ depending on parameter $N$. If $N$ tends to infinity, the distribution function $F_{zN}(z)$ tends towards the unit step function $F_{z\infty}(z)$ at point 0.

It can be easily seen that a hyper-random estimate of a random value that maintains its hyper-random nature at $N \to \infty$ is not consistent. If the hyper-random estimate $\Theta^*$ of hyper-random value $\Theta$ maintains the hyper-random nature at $N \to \infty$, the estimate may be either consistent or inconsistent.

Convergence in distribution is weaker than convergence in probability [11]. Therefore, the necessary condition of convergence of hyper-random estimate $\Theta^*$ to the hyper-random value $\Theta$ is the convergence of distribution function $F_{\theta^*/g}(\theta)$ towards the distribution function $F_{\theta/g}(\theta)$.

EFFICIENT HYPER-RANDOM ESTIMATE OF THE HYPER-RANDOM VALUE

Hyper-random estimate $\Theta_e^*$ of hyper-random value $\Theta$ shall be called efficient under all conditions $g \in G$ [3], if for all $g \in G$ the mathematical expectation of the squared deviation of estimate $\Theta_e^*/g$ from value $\Theta/g$ over the aggregate of samplings of the specified volume $N$ (i.e., the average squared error $\Delta_{z_e/g}^2$) is no more than that for any other estimates $\Theta_i^*/g$:

$$\Delta_{z_e/g}^2 \le \Delta_{z_i/g}^2, \quad i = 1, 2, \ldots, \quad \forall g \in G, \tag{7}$$

where $\Delta_{z_e/g}^2 = M[(\Theta_e^*/g - \Theta/g)^2]$, $\Delta_{z_i/g}^2 = M[(\Theta_i^*/g - \Theta/g)^2]$.

It should be noted that in the case of the unbiased estimate of a deterministic value, the average squared error $\Delta_{z/g}^2$ is equal to the estimate dispersion $\sigma_{\theta^*/g}^2$. Then the efficiency condition can be written in the form $\sigma_{\theta_e^*/g}^2 \le \sigma_{\theta_i^*/g}^2$, $i = 1, 2, \ldots$, $\forall g \in G$.

The bounds of the relative efficiency of estimate $l_i$, $l_s$ may serve as a measure of efficiency. These bounds are determined as limits of the ratio of the mathematical expectation of the squared deviation of the efficient estimate $\Theta_e^*/g$ from $\Theta/g$ to the mathematical expectation of the squared deviation of the estimate under consideration $\Theta^*/g$ from $\Theta/g$:

$$l_i = \inf_{g \in G} \frac{M[(\Theta_e^*/g - \Theta/g)^2]}{M[(\Theta^*/g - \Theta/g)^2]}, \qquad l_s = \sup_{g \in G} \frac{M[(\Theta_e^*/g - \Theta/g)^2]}{M[(\Theta^*/g - \Theta/g)^2]}.$$

The bounds of relative efficiency are found in the interval [0, 1]. When the estimate is efficient, $l_i = l_s = 1$.

The bounds of measurement accuracy are determined by the following theorems.

THEOREM 1

Let hyper-random value $\Theta$ described by the conditional probability density $f_{\theta/g}(\theta)$ be estimated in terms of hyper-random sampling $\vec{X} = \{\vec{X}/g \in G\}$ of volume $N$ for each condition $g \in G$. In this case the $(N+1)$-dimensional conditional probability density $f_{(\vec{x},\theta)/g}(\vec{x}, \theta)$ is doubly differentiable with respect to $\theta$, while derivatives $\dfrac{\partial f_{(\vec{x},\theta)/g}(\vec{x}, \theta)}{\partial \theta}$ and $\dfrac{\partial^2 f_{(\vec{x},\theta)/g}(\vec{x}, \theta)}{\partial \theta^2}$ are absolutely integrable with respect to $\vec{x}$ and $\theta$, and

$$\lim_{\theta \to \pm\infty} \int (\theta^* - \theta)\, f_{(\vec{x},\theta)/g}(\vec{x}, \theta)\, d\vec{x} = 0.$$

Then the limits of the average squared error satisfy

$$\Delta_{iz}^2 \ge \inf_{g \in G} J_g^{-1}, \qquad \Delta_{sz}^2 \ge \sup_{g \in G} J_g^{-1}, \tag{8}$$

where $J_g$ is the conditional Fisher information determined by the following expression

$$J_g = M\left[\left(\frac{\partial \ln f_{(\vec{x},\theta)/g}(\vec{X}, \Theta)}{\partial \theta}\right)^2\right] = -M\left[\frac{\partial^2 \ln f_{(\vec{x},\theta)/g}(\vec{X}, \Theta)}{\partial \theta^2}\right].$$

The proof of this theorem is based on the known Cramer–Rao inequality for random estimates [11, 13, 14]. If the conditions specified in the theorem are satisfied, the following inequalities are valid: $\Delta_{z/g}^2 \ge J_g^{-1}\ \forall g \in G$. Hence, we obtain inequalities (8).

For the homogeneous (identically distributed) independent sampling

$$f_{(\vec{x},\theta)/g}(\vec{x}, \theta) = f_{\theta/g}(\theta) \prod_{n=1}^{N} f_{x/\theta,g}(x_n),$$

and

$$J_g = -M\left[\frac{\partial^2}{\partial \theta^2}\left[\ln f_{\theta/g}(\Theta) + N \ln f_{x/\theta,g}(X)\right]\right].$$

If the sampling elements $X_n/g$ represent an additive mixture of random value $\Theta/g$ having dispersion $\sigma_{\theta/g}^2$ and independent homogeneous random interference $V/g$ having mathematical expectation $m_{v/g}$ and dispersion $\sigma_{v/g}^2$, then in the case of the Gaussian distribution of values $\Theta/g$ and $X_n/g$ the conditional Fisher information assumes the form $J_g = \dfrac{1}{\sigma_{\theta/g}^2} + \dfrac{N}{\sigma_{v/g}^2}$.

Then, at $N \to \infty$ the following equality is valid: $\Delta_{iz}^2 = \Delta_{sz}^2 = 0$.
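For the Gaussian additive case just described, the right-hand sides of inequalities (8) are easy to tabulate against the sampling volume. The sketch below assumes two hypothetical conditions with made-up dispersions of the measured value and of the interference; it only evaluates the lower bounds and does not construct any estimate.

```python
def fisher_information(n, var_value, var_noise):
    """Conditional Fisher information for the Gaussian additive case: 1/var_value + n/var_noise."""
    return 1.0 / var_value + n / var_noise

# Hypothetical set G: condition -> (dispersion of measured value, dispersion of interference).
conditions = {"g1": (4.0, 1.0), "g2": (1.0, 2.0)}

def accuracy_bounds(n):
    """Infimum and supremum over G of the inverse Fisher information, as in (8)."""
    inverse = [1.0 / fisher_information(n, vv, vn) for vv, vn in conditions.values()]
    return min(inverse), max(inverse)

for n in (1, 10, 100, 1000):
    print(n, accuracy_bounds(n))
```

Both bounds decrease roughly as $1/N$, which illustrates why bounds (8) alone do not reveal any limit of accuracy.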

It should be noted that bounds (8) are rather coarse. More precise bounds of the measurement accuracy that take into account the estimate bias are determined by the following two theorems.

THEOREM 2

Let hyper-random value $\Theta$ described by the conditional probability densities $f_{\theta/g}(\theta)$ be estimated in terms of hyper-random sampling $\vec{X} = \{\vec{X}/g \in G\}$ of volume $N$ for each condition $g \in G$. The domain bounds of the $N$-dimensional conditional probability density $f_{x/\theta,g}(\vec{x})$ do not depend on $\theta$; this probability density is absolutely integrable with respect to $\vec{x}$ and doubly differentiable with respect to $\theta$. In addition, the two first moments exist for the conditional random estimate $\Theta^*/\theta,g$. Then, bounds $\sigma_{i\theta^*}^2$, $\sigma_{s\theta^*}^2$ of the average dispersion $\sigma_{\theta^*/g}^2 = M[\sigma_{\theta^*/\theta,g}^2]$ of hyper-random estimate $\Theta^*/\theta,g$ are determined by the following inequalities:

$$\sigma_{i\theta^*}^2 \ge \inf_{g \in G} M\left[\left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right], \qquad \sigma_{s\theta^*}^2 \ge \sup_{g \in G} M\left[\left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right], \tag{9}$$

while the bounds of the average squared error are determined by the following ones:

$$\Delta_{iz}^2 \ge \inf_{g \in G} M\left[\varepsilon_{\theta/g}^2 + \left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right], \qquad \Delta_{sz}^2 \ge \sup_{g \in G} M\left[\varepsilon_{\theta/g}^2 + \left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right], \tag{10}$$

where $\varepsilon_{\theta/g} = m_{\theta^*/\theta,g} - \theta/g$ is the bias of estimate $\Theta^*/\theta,g$ under conditions $g$ with respect to $\theta/g$, and $J_g(\theta)$ is the Fisher information for random estimate $\Theta^*/\theta,g$:

$$J_g(\theta) = M\left[\left(\frac{\partial \ln f_{x/\theta,g}(\vec{X})}{\partial \theta}\right)^2\right] = -M\left[\frac{\partial^2 \ln f_{x/\theta,g}(\vec{X})}{\partial \theta^2}\right].$$

The proof of this theorem is based on the Cramer–Rao inequality for a random estimate of the deterministic value [11, 13, 14]. For the random value $\Theta^*/\theta,g$, given fixed values of $\theta$, $g$ and fulfillment of the conditions specified in the theorem, the following relationships are valid for the dispersion $\sigma_{\theta^*/\theta,g}^2$ and the average squared error $\Delta_{z/\theta,g}^2 = M[(\Theta^*/\theta,g - \theta/g)^2]$:

$$\sigma_{\theta^*/\theta,g}^2 \ge \left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta), \qquad \Delta_{z/\theta,g}^2 = \varepsilon_{\theta/g}^2 + \sigma_{\theta^*/\theta,g}^2.$$

Then for the average dispersion $\sigma_{\theta^*/g}^2$ and average squared error $\Delta_{z/g}^2 = M[\Delta_{z/\theta,g}^2]$ we get

$$\sigma_{\theta^*/g}^2 \ge M\left[\left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right], \qquad \Delta_{z/g}^2 \ge M\left[\varepsilon_{\theta/g}^2 + \left(1 + \frac{\partial \varepsilon_{\theta/g}}{\partial \theta}\right)^2 J_g^{-1}(\theta)\right]. \tag{11}$$

Hence, inequalities (9) and (10) follow from inequalities (11).

As can be seen from expressions (9), the value of $\dfrac{\partial \varepsilon_{\theta/g}}{\partial \theta}$ must be equal to –1 for all $\theta$ in order to ensure zero value of the average dispersion. This means that, similar to the case of estimating random values, it is impossible to simultaneously ensure zero bias and zero dispersion of the estimate.



For the homogeneous (identically distributed) independent sampling $J_g(\theta) = -N\, M\left[\dfrac{\partial^2 \ln f_{x/\theta,g}(X)}{\partial \theta^2}\right]$.

Then, the following inequalities are valid at $N \to \infty$: $\Delta_{iz}^2 \ge \inf_{g \in G} M[\varepsilon_{\theta/g}^2]$, $\Delta_{sz}^2 \ge \sup_{g \in G} M[\varepsilon_{\theta/g}^2]$.

Hence, it follows that the potential accuracy is determined by the limits (bounds) of the average squared bias, while infinitely high accuracy of measurement is ensured only at an infinitely large sampling volume and in the absence of bias for all $\theta$ and conditions $g \in G$.

Statistical conditions of forming the measured value and its estimate generally change independently.

That is why it is not feasible to ensure the absence of bias for all conditions and, consequently, infinitely high

accuracy of measurements. This can explain a well-known empirical fact that the accuracy of any real

physical measurements has a limit that cannot be overcome even at a very large volume of data.
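The accuracy floor can be illustrated with a short worked case. The derivation below is only a sketch under assumptions that are not part of the theorem: the conditions $g$ are fixed, each sample is $x_n = \theta + v_n$ with independent Gaussian interference $v_n$ of mean $m_v$ and dispersion $\sigma_v^2$, and the estimate is the sample mean. Under these assumptions the Cramer–Rao term vanishes as $N$ grows, while the squared bias remains.

```latex
% Assumed model (illustrative): x_n = \theta + v_n, v_n ~ N(m_v, \sigma_v^2) i.i.d., g fixed
\begin{align*}
\ln f_{x/\theta,g}(x) &= -\tfrac{1}{2}\ln\!\left(2\pi\sigma_v^2\right) - \frac{(x-\theta-m_v)^2}{2\sigma_v^2},\\
J_g(\theta) &= N\,M\!\left[\left(\frac{\partial \ln f_{x/\theta,g}(X)}{\partial\theta}\right)^{2}\right]
  = N\,M\!\left[\frac{(X-\theta-m_v)^2}{\sigma_v^4}\right] = \frac{N}{\sigma_v^2},\\
\varepsilon_{\theta/g} &= M\!\left[\tfrac{1}{N}\textstyle\sum_{n=1}^{N} X_n\right] - \theta = m_v,
  \qquad \frac{\partial\varepsilon_{\theta/g}}{\partial\theta} = 0,\\
\frac{\left(1+\partial\varepsilon_{\theta/g}/\partial\theta\right)^2}{J_g(\theta)} + \varepsilon_{\theta/g}^2
  &= \frac{\sigma_v^2}{N} + m_v^2 \;\xrightarrow[N\to\infty]{}\; m_v^2 .
\end{align*}
```

In this assumed case the bound on the squared error cannot fall below $m_v^2$ however large the sample is, which is the kind of limit described above.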

In place of the definition of the efficient estimate specified above, one can introduce another definition based on inequalities (10): an estimate $\Theta^*$ of the specified volume $N$ can be called an efficient hyper-random estimate $\Theta^*_e$ if the bounds of its average squared error are determined by the following relationships:

$$\Delta^2_{iz} = \inf_{g \in G} M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g}/\partial\theta\right)^2}{J_g(\theta)} + \varepsilon^2_{\theta/g}\right], \qquad \Delta^2_{sz} = \sup_{g \in G} M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g}/\partial\theta\right)^2}{J_g(\theta)} + \varepsilon^2_{\theta/g}\right]. \tag{12}$$

In the general case definitions (7) and (12) are not equivalent.

THEOREM 3

Let the hyper-random value $\Theta$ be described by the conditional probability densities $f_{\theta/g}(\theta)$ and be estimated from the hyper-random sample $\vec X = \{\vec X/g \in G\}$ of volume $N$ for each condition $g \in G$. The domain bounds of the $N$-dimensional conditional probability densities of the bounds $f_{\vec x/\theta,g_S}(\vec x)$, $f_{\vec x/\theta,g_I}(\vec x)$ do not depend on $\theta$; these probability densities are absolutely integrable with respect to $\vec x$ and doubly differentiable with respect to $\theta$. In addition, the first two moments exist for the random estimates $\Theta^*/\theta,g_S$ and $\Theta^*/\theta,g_I$. Then, the average dispersions of the error distribution bounds $\bar\sigma^2_{S\Theta^*} = M[\sigma^2_{\Theta^*/\theta,g_S}]$, $\bar\sigma^2_{I\Theta^*} = M[\sigma^2_{\Theta^*/\theta,g_I}]$ are described by the following inequalities:

$$\bar\sigma^2_{S\Theta^*} \ge M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g_S}/\partial\theta\right)^2}{J_{g_S}(\theta)}\right], \qquad \bar\sigma^2_{I\Theta^*} \ge M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g_I}/\partial\theta\right)^2}{J_{g_I}(\theta)}\right], \tag{13}$$

while the average (with respect to the bounds) squares of the absolute error $\bar\Delta^2_{Sz}$ and $\bar\Delta^2_{Iz}$ are determined by the following relationships:

$$\bar\Delta^2_{Sz} \ge M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g_S}/\partial\theta\right)^2}{J_{g_S}(\theta)} + \varepsilon^2_{\theta/g_S}\right]; \qquad \bar\Delta^2_{Iz} \ge M\!\left[\frac{\left(1 + \partial\varepsilon_{\theta/g_I}/\partial\theta\right)^2}{J_{g_I}(\theta)} + \varepsilon^2_{\theta/g_I}\right], \tag{14}$$



where $\varepsilon_{\theta/g_S}$, $\varepsilon_{\theta/g_I}$ are the estimate biases for the upper and lower bounds of the distribution, and $J_{g_S}(\theta)$, $J_{g_I}(\theta)$ are the Fisher information for the upper and lower bounds of the distribution, respectively:

$$J_{g_S}(\theta) = M\!\left[\left(\frac{\partial \ln f_{\vec x/\theta,g_S}(\vec X)}{\partial\theta}\right)^{2}\right], \qquad J_{g_I}(\theta) = M\!\left[\left(\frac{\partial \ln f_{\vec x/\theta,g_I}(\vec X)}{\partial\theta}\right)^{2}\right].$$

The proof of this theorem is similar to that of theorem 2. The bounds of the distribution function of the hyper-random sampling $\vec X$ can be viewed as distribution functions of the random vectors $\vec X/g_S$, $\vec X/g_I$ corresponding to conditions $g_S$, $g_I$ that may or may not belong to set $G$. With due regard for inequalities (11), we have inequalities (13) and (14).

For a uniform independent sampling of volume $N$ we get

$$J_{g_S}(\theta) = N\,M\!\left[\left(\frac{\partial \ln f_{x/\theta,g_S}(X)}{\partial\theta}\right)^{2}\right], \qquad J_{g_I}(\theta) = N\,M\!\left[\left(\frac{\partial \ln f_{x/\theta,g_I}(X)}{\partial\theta}\right)^{2}\right].$$

Then the average (with respect to the bounds) squares of the errors $\bar\Delta^2_{Sz}$ and $\bar\Delta^2_{Iz}$ at $N \to \infty$ tend, respectively, towards $M[\varepsilon^2_{\theta/g_S}]$ and $M[\varepsilon^2_{\theta/g_I}]$.

INTERVAL ESTIMATE OF THE HYPER-RANDOM VALUE

The confidence interval $[z_1, z_2]$ specifying the value of the hyper-random error $Z$ and the putative interval $[\theta^*/g - z_2,\ \theta^*/g - z_1]$ of the measured value $\theta/g$ can be calculated on the basis of the confidence probability bounds.

Let

$$\alpha_S = \int_{-\infty}^{z_1} f_{Sz}(z)\,dz, \quad \alpha_I = \int_{-\infty}^{z_1} f_{Iz}(z)\,dz, \quad \beta_S = \int_{z_2}^{\infty} f_{Sz}(z)\,dz, \quad \beta_I = \int_{z_2}^{\infty} f_{Iz}(z)\,dz.$$

Then $\alpha_I \le P(z < z_1/g) \le \alpha_S$ and $\beta_S \le P(z > z_2/g) \le \beta_I$. Hence, $\gamma_i \le P(z_1 \le z \le z_2/g) \le \gamma_s$ or $\gamma_i \le P(\theta^* - z_2 \le \theta \le \theta^* - z_1/g) \le \gamma_s$, where $\gamma_i$, $\gamma_s$ are the bounds of the confidence probability:

$$\gamma_i = 1 - (\alpha_S + \beta_I), \qquad \gamma_s = 1 - (\alpha_I + \beta_S). \tag{15}$$

With the known distribution functions of the error bounds $F_{Sz}(z)$ and $F_{Iz}(z)$, the bounds of the confidence probability $\gamma_i$, $\gamma_s$ determine the bounds of the confidence interval $z_1$, $z_2$.

For the Gaussian distribution of bounds with parameters $(m_{Sz}, \sigma_{Sz})$, $(m_{Iz}, \sigma_{Iz})$ the calculation of the bounds of the confidence interval is reduced to the following.

Let us take into account that

$$\alpha_S = \Phi\!\left(\frac{z_1 - m_{Sz}}{\sigma_{Sz}}\right), \quad \alpha_I = \Phi\!\left(\frac{z_1 - m_{Iz}}{\sigma_{Iz}}\right), \quad \beta_S = 1 - \Phi\!\left(\frac{z_2 - m_{Sz}}{\sigma_{Sz}}\right), \quad \beta_I = 1 - \Phi\!\left(\frac{z_2 - m_{Iz}}{\sigma_{Iz}}\right),$$

where $\Phi(x)$ is the Gaussian distribution function with zero mathematical expectation and unit dispersion.

Then from expression (15) we get the system of equations

$$\begin{cases}
\Phi\!\left(\dfrac{z_2 - m_{Iz}}{\sigma_{Iz}}\right) - \Phi\!\left(\dfrac{z_1 - m_{Sz}}{\sigma_{Sz}}\right) = \gamma_i, \\[2ex]
\Phi\!\left(\dfrac{z_2 - m_{Sz}}{\sigma_{Sz}}\right) - \Phi\!\left(\dfrac{z_1 - m_{Iz}}{\sigma_{Iz}}\right) = \gamma_s.
\end{cases}$$

The unknown bounds z1 and z2 are a solution of this system.
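The forward direction of expression (15) for Gaussian bounds can be sketched numerically. The Python fragment below is only an illustration: the function name and the parameter values are assumptions, not taken from the paper. Equal standard deviations with the mean of the upper-bound distribution below that of the lower-bound distribution are chosen so that the upper-bound distribution function lies above the lower-bound one for every z.

```python
from statistics import NormalDist


def confidence_probability_bounds(z1, z2, m_s, sigma_s, m_i, sigma_i):
    """Return (gamma_i, gamma_s) from expression (15) for Gaussian error bounds.

    (m_s, sigma_s) and (m_i, sigma_i) are the parameters of the upper-bound
    and lower-bound distribution functions of the error (assumed Gaussian).
    """
    phi = NormalDist().cdf  # standard Gaussian distribution function

    # Tail probabilities below z1 and above z2 for each bound
    alpha_s = phi((z1 - m_s) / sigma_s)
    alpha_i = phi((z1 - m_i) / sigma_i)
    beta_s = 1.0 - phi((z2 - m_s) / sigma_s)
    beta_i = 1.0 - phi((z2 - m_i) / sigma_i)

    gamma_i = 1.0 - (alpha_s + beta_i)  # lower bound of confidence probability
    gamma_s = 1.0 - (alpha_i + beta_s)  # upper bound of confidence probability
    return gamma_i, gamma_s


# Hypothetical parameters chosen only for illustration
gamma_i, gamma_s = confidence_probability_bounds(
    z1=-2.0, z2=2.0, m_s=-0.5, sigma_s=1.0, m_i=0.5, sigma_i=1.0
)
print(f"gamma_i = {gamma_i:.4f}, gamma_s = {gamma_s:.4f}")
```

Finding the limits for prescribed confidence probability bounds then amounts to a two-dimensional root search over this function; a solution need not exist for every pair of prescribed bounds.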

RATIONAL VOLUME OF SAMPLING

It is reasonable to increase the sampling volume until it yields a perceptible rise in the measurement

accuracy. For real hyper-random estimates there is a limit above which it is not expedient to increase the

volume of data subjected to processing. In this connection the hyper-random estimates behave similarly to

the interval estimates [15].

Let us consider a simple example. The measured value $\Theta$, its estimate $\Theta^*/\Theta$, and the sampling $\vec X$ are hyper-random. The random sampling $\vec X/\theta,g = \{X_n/\theta,g\}$, $n = \overline{1, N}$, corresponding to conditions $g \in G$ represents an additive mixture of the random value $\Theta/g$ having dispersion $\sigma^2_{\theta/g}$ and a random uniform interference described by the vector $\vec V/g$, the components of which are independent and have the same mathematical expectations $m_v$ and dispersions $\sigma^2_{v/g}$. The interference does not depend on the measured value, and the interference dispersion lies in the range $[\sigma^2_{iv}, \sigma^2_{sv}]$. Statistical conditions change so slowly that we can consider the conditions of the sampling formation to be practically invariable.

It is necessary to estimate the measured value $\theta/g$ in unknown conditions $g \in G$ and the accuracy of measurements.

Having $N$ successive samples $x_n/\theta,g$, we can form the estimate

$$\Theta^*/\theta,g = \frac{1}{N}\sum_{n=1}^{N} x_n$$

for unknown conditions $g \in G$. The parameters of this estimate are $\sigma^2_{\Theta^*/g} = \sigma^2_{\theta/g} + \sigma^2_{v/g}/N$ and $R_{\Theta^*\Theta/g} = \sigma^2_{\theta/g}$, while the average square of the error is determined by the formula $\Delta^2_{z/g} = m_v^2 + \sigma^2_{v/g}/N$.

Then, in accordance with expression (6), we get

$$m_v^2 + \frac{\sigma^2_{iv}}{N} \le \Delta^2_{z/g} \le m_v^2 + \frac{\sigma^2_{sv}}{N}. \tag{16}$$

At $N \to \infty$ the value $\Delta_{z/g} \to \varepsilon_0$, where $\varepsilon_0 = m_v$ is the bias. Using the right side of inequality (16), we can estimate the rational volume of sampling as $N_0 = 10\,\sigma^2_{sv}/m_v^2$, at which the interference term $\sigma^2_{sv}/N$ in the upper bound falls to one tenth of the squared bias. Hence, the requirements on the sampling volume become stricter as the bias $\varepsilon_0 = m_v$ is reduced and the upper bound of the interference dispersion $\sigma^2_{sv}$ is increased. If the estimate bias is comparable with the root-mean-square deviation of the interference ($\varepsilon_0 \approx \sigma_{sv}$), the rational sampling volume $N_0$ proves to be only about ten samples.
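The arithmetic of this example can be checked with a few lines of Python. This is a minimal sketch: the function names and the numerical values are hypothetical, and the factor of ten is exposed as a parameter only because the text uses it as a rule of thumb.

```python
def mse_bounds(n, m_v, var_iv, var_sv):
    """Lower and upper bounds of the average squared error from inequality (16).

    n      : sampling volume
    m_v    : mathematical expectation of the interference (the estimate bias)
    var_iv : lower bound of the interference dispersion
    var_sv : upper bound of the interference dispersion
    """
    return m_v**2 + var_iv / n, m_v**2 + var_sv / n


def rational_sample_volume(m_v, var_sv, factor=10.0):
    """Sampling volume at which var_sv / N equals m_v**2 / factor."""
    return factor * var_sv / m_v**2


# Illustrative values: bias equal to the largest RMS deviation of the interference
m_v, var_iv, var_sv = 1.0, 0.25, 1.0
n0 = rational_sample_volume(m_v, var_sv)
print("N0 =", n0)

# Beyond N0 the upper bound barely moves, because the squared bias dominates
for n in (1, 10, 100, 1000):
    low, high = mse_bounds(n, m_v, var_iv, var_sv)
    print(f"N = {n:5d}: {low:.4f} <= squared error <= {high:.4f}")
```

With these values the upper bound at the rational volume exceeds the squared bias by only ten percent, and a hundredfold increase in the sampling volume removes almost none of the remaining error.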

CONCLUSIONS

1. A mathematical model for measuring values has been proposed, taking into account the uncertainty of

statistical conditions of forming the measured value and its estimate. This model is based on presenting the

measured value and its estimate as hyper-random values.

2. Point and interval methods for estimating hyper-random values have been developed. The notions of

biased, consistent, and efficient estimates are extended to the case of hyper-random estimates.

3. The potential accuracy of measurements was investigated. The theorems determining the potential

accuracy of measuring hyper-random values have been proved. It was shown that in statistically uncertain



conditions the potential accuracy of measurements was limited and determined by parameters of the

estimate bias.

4. It was shown that in estimating hyper-random values the sampling volume can be expediently

increased only to a specific limit. In a typical situation of exposure to the additive interference, the rational

sampling volume was in the region of only ten samples, when the estimate bias was comparable to the

root-mean-square deviation of the interference.

REFERENCES

1. I. I. Gorban, “Hyper-Random Phenomena and Their Description,” Akustychnyi Visnyk 8, Nos. 1–2, 16 (2005).

2. I. I. Gorban, “The Hyperrandom Functions and Their Description,” Radioelectron. Commun. Syst. 49(1), 1

(2006).

3. I. I. Gorban, The Theory of Hyper-Random Phenomena (IPMMS NAN Ukrainy, Kyiv, 2007) [in Russian].

4. I. I. Gorban, “Point and Interval Methods of Estimating Parameters of Hyper-Random Values,” Matematicheskie

Mashiny i Sistemy, No. 2, 3 (2006).

5. Voshchinin, A. F. Bochkov, and G. R. Sotirov, “Data Analysis Method in the Case of Interval Nonstatistical

Error,” Zavodskaya Laboratoriya 56, No. 7, 76 (1990).

6. G. Alefel’d and Yu. Hertsberger, Introduction to Interval Computations (Mir, Moscow, 1987) [in Russian].

7. V. I. Levin, “Interval Mathematics and the Study of Uncertain Systems,” Informatsionnye Tekhnologii, No. 6

(1998) (www.techno.edu.ru).

8. I. I. Gorban, “Representation of Physical Phenomena with the Help of Hyper-Random Models,”

Matematicheskie Mashiny i Sistemy, No. 1 (2007).

9. I. I. Gorban, “Mathematical Description of Physical Phenomena in Statistically Unstable Conditions,”

Standartyzatsiya, Sertyfikatsiya, Yakist’, No. 6, 26 (2006).

10. V. S. Korolyuk, et al., Reference Book on the Probability Theory and Mathematical Statistics (Nauka, Moscow,

1985) [in Russian].

11. I. I. Gorban, Probability Theory and Mathematical Statistics for Researchers and Engineers (IPMMS NAN

Ukrainy, Kyiv, 2003) [in Ukrainian].

12. I. I. Gorban, “Estimates of Characteristics of Hyper-Random Values,” Matematicheskie Mashiny i Sistemy,

No. 1, 41 (2006).

13. B. R. Levin, Theoretical Principles of Statistical Radio Engineering (Radio i Svyaz’, Moscow, 1989) [in

Russian].

14. Van Tris, Theory of Detection, Estimation and Modulation, Vol. 1 (Sov. Radio, Moscow, 1972) [in Russian].

15. A. I. Orlov, Econometrics. Textbook (Ekzamen, Moscow, 2002) [in Russian].


