Estimation of generalized entropies with sample spacing



THEORETICAL ADVANCES

Mark P. Wachowiak · Renata Smolíková

Georgia D. Tourassi · Adel S. Elmaghraby

Estimation of generalized entropies with sample spacing

Received: 28 November 2003 / Accepted: 24 March 2005 / Published online: 11 August 2005
© Springer-Verlag London Limited 2005

Abstract In addition to the well-known Shannon entropy, generalized entropies, such as the Renyi and Tsallis entropies, are increasingly used in many applications. Entropies are computed by means of nonparametric kernel methods that are commonly used to estimate the density function of empirical data. Generalized entropy estimation techniques for one-dimensional data using sample spacings are proposed. By means of computational experiments, it is shown that these techniques are robust and accurate, compare favorably to the popular Parzen window method for estimating entropies, and, in many cases, require fewer computations than Parzen methods.

Keywords Generalized entropy · Renyi entropy · Parzen windows · Sample spacings · Order statistics · Nonparametric estimation

1 Introduction

Entropy is a useful measure of uncertainty and dispersion, and has been widely employed in many pattern analysis applications. Shannon's differential entropy, H(f), is computed from the probability density function (pdf) f_X of a continuous random variable X as [1]:

$$H(f) = -\int_{-\infty}^{\infty} f_X(x)\,\ln f_X(x)\,dx = -E\!\left(\ln f_X(x)\right). \qquad (1)$$

Recently, interest has also increased in the applications of generalized entropies, including the measures of Renyi [2] and Tsallis [3]. For instance, the Renyi entropy has been used in ultrasound backscatter characterization [4], in the study of DNA binding sites [5], and in the analysis of EEG following brain injury [6]. The recent Tsallis measure (a similar form was introduced earlier in [7]) is useful in explaining many phenomena in thermodynamics and statistical mechanics [3], in brain electrical activity analysis [8], and as an image similarity metric [9].

The increasing importance of these measures underscores the need for accurate and robust nonparametric estimation techniques for continuous-valued data. In this paper, several generalized entropy estimators for one-dimensional/time series data are proposed. These methods are based on the distances between data samples (sample spacings), and utilize easily computable order statistics. A sample spacings approach was previously introduced for Shannon entropy [10]. Subsequently, numerous modifications and bias corrections to this basic approach have been reported [11–15].

Different methods for computing the Shannon [21] and generalized entropies are found in the literature [16–20]. However, as Parzen windows and other kernel-based methods are popular ways for estimating both Shannon and generalized entropies, in the current study, comparative techniques are limited to Parzen window estimators. By means of extensive experiments, it is demonstrated that the proposed methods reduce the computational load for estimating generalized entropies, and are also accurate, robust, and consistent.

2 Generalized entropies

The differential Renyi entropy of order q ∈ ℝ, denoted as R_q, is given as [2]:

$$R_q(f) = \frac{1}{1-q}\,\ln \int_{-\infty}^{\infty} f_X(x)^{q}\,dx. \qquad (2)$$

M. P. Wachowiak (✉) · R. Smolíková
Imaging Research Laboratories, Robarts Research Institute, London, ON, N6A 5K8, Canada
E-mail: [email protected], [email protected]

G. D. Tourassi · A. S. Elmaghraby
Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY 40292, USA
E-mail: [email protected], [email protected]

Pattern Anal Applic (2005) 8: 95–101
DOI 10.1007/s10044-005-0247-4

The differential Tsallis entropy of order q ∈ ℝ, denoted as T_q, is given as [3]:

$$T_q(f) = \frac{1}{q-1}\left(1 - \int_{-\infty}^{\infty} f_X(x)^{q}\,dx\right). \qquad (3)$$

The Shannon entropy is a special limiting case of both the Renyi and Tsallis measures as q → 1. The Renyi and Tsallis measures are related by:

$$T_q(f) = \frac{\exp\left[R_q(f)\,(1-q)\right] - 1}{1-q}. \qquad (4)$$
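As a quick numerical illustration of (4), the following minimal sketch (not part of the original paper; the function name is ours) converts a Renyi entropy of order q into the corresponding Tsallis entropy:

```python
import math

def tsallis_from_renyi(renyi: float, q: float) -> float:
    """Convert a Renyi entropy of order q (q != 1) to the Tsallis entropy
    of the same order, using Eq. (4)."""
    return (math.exp(renyi * (1.0 - q)) - 1.0) / (1.0 - q)

# As q approaches 1 both measures approach the Shannon entropy; here the
# Renyi entropy of a standard Gaussian (about 1.4189 nats) is nearly
# unchanged by the conversion for q = 1.001.
print(tsallis_from_renyi(1.4189, 1.001))
```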

3 Estimation based on sample spacings

For the Shannon entropy, an alternative estimation approach [10] employs the distribution function, F(x), where dF(x)/dx = f_X(x). Equation 1 can be written in the following form:

$$H(f) = \int_{0}^{1} \ln\!\left(\frac{d}{dp}F^{-1}(p)\right)dp. \qquad (5)$$

Approximating the derivative as the slope of the curve F^{-1}(p) from n samples yields [10]:

$$H_V(m,n) = \frac{1}{n}\sum_{i=1}^{n} \ln\!\left[\frac{n}{2m}\left(x_{(i+m)} - x_{(i-m)}\right)\right], \qquad (6)$$

where the smoothing parameter m < n/2 is a positive integer, and x_{(i)} = x_{(1)} for i < 1, and x_{(i)} = x_{(n)} for i > n.

This estimator has the property that H_V(m,n) → H(f) as n, m → ∞ and m/n → 0 [11]. For finite samples, the quality of the estimate is heavily dependent on the choice of m. In general, the smoother the distribution function, the larger m should be. A bias correction term for this estimator was also proposed [10, 12].
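A minimal NumPy sketch of the Vasicek estimator (6), using the boundary convention stated above (order statistics below x_{(1)} or above x_{(n)} are clamped to the extremes); the function name is ours:

```python
import numpy as np

def vasicek_shannon(x, m):
    """Vasicek sample-spacing estimator of Shannon entropy, Eq. (6).
    x: 1-D array of samples; m: positive integer with m < n/2."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(n)
    upper = x[np.minimum(i + m, n - 1)]   # x_(i+m), clamped to x_(n)
    lower = x[np.maximum(i - m, 0)]       # x_(i-m), clamped to x_(1)
    return np.mean(np.log(n / (2.0 * m) * (upper - lower)))

# Standard Gaussian samples; the true Shannon entropy is 0.5*ln(2*pi*e) ~ 1.419.
rng = np.random.default_rng(0)
print(vasicek_shannon(rng.standard_normal(200), m=10))
```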

The focus now turns to generalized entropies, where the derivation of (6) is applied to the Renyi entropy (2). Let F(x) = p ∈ [0, 1], so that F^{-1}(p) = x. Then dx = (d/dp)(F^{-1}(p)) dp. Making this substitution yields:

$$\int_{-\infty}^{\infty} f_X(x)^{q}\,dx = \int_{0}^{1} f\!\left(F^{-1}(p)\right)^{q}\,\frac{d}{dp}F^{-1}(p)\,dp. \qquad (7)$$

By observing that (d/dp)(F^{-1}(p)) = 1/f(F^{-1}(p)), (7) can be rewritten as

$$\int_{0}^{1} \left(\frac{d}{dp}F^{-1}(p)\right)^{-q}\frac{d}{dp}F^{-1}(p)\,dp = \int_{0}^{1} \left(\frac{d}{dp}F^{-1}(p)\right)^{1-q} dp. \qquad (8)$$

Thus, the Renyi entropy can be written as

$$R_q(f) = \frac{1}{1-q}\,\ln \int_{0}^{1} \left(\frac{dF^{-1}(p)}{dp}\right)^{1-q} dp, \qquad (9)$$

and the estimator becomes

$$R_V(q,m,n) = \frac{1}{1-q}\,\ln\!\left[\frac{1}{n}\sum_{i=1}^{n}\left(\frac{n}{2m}\left(x_{(i+m)} - x_{(i-m)}\right)\right)^{1-q}\right]. \qquad (10)$$
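The same construction carries over directly to (10). A NumPy sketch of the Vasicek-type Renyi estimator R_V (function name ours; q > 0 and q ≠ 1 assumed):

```python
import numpy as np

def vasicek_renyi(x, q, m):
    """Sample-spacing estimator of the Renyi entropy of order q, Eq. (10)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    i = np.arange(n)
    # Clamped spacings x_(i+m) - x_(i-m), as in Eq. (6).
    spacings = x[np.minimum(i + m, n - 1)] - x[np.maximum(i - m, 0)]
    return np.log(np.mean((n / (2.0 * m) * spacings) ** (1.0 - q))) / (1.0 - q)

# Standard Gaussian, q = 2: Eq. (15) gives R_2 = ln(2*sqrt(pi)) ~ 1.266.
rng = np.random.default_rng(0)
print(vasicek_renyi(rng.standard_normal(200), q=2.0, m=10))
```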

Other methods have been proposed for estimating entropy using d/dp(F^{-1}(p)). An empiric entropy estimator of order m, proposed for Shannon entropy [11], can also be used for estimating Renyi entropy. Using the definition for empiric entropy of order m, the Renyi entropy can be estimated as:

$$\begin{aligned}
R_E(q,m,n) = \frac{1}{1-q}\,\ln\Bigg[\;
& \frac{1}{n^{q}}\sum_{i=2-m}^{0}\frac{x_{(i+m)} - x_{(i+m-1)}}{2}\left(\sum_{j=1}^{i+m-1}\frac{2}{x_{(j+m)} - x_{(j-m)}}\right)^{\!q} \\
{}+{} & \frac{1}{n^{q}}\sum_{i=1}^{n+1-m}\frac{x_{(i)} + x_{(i+m)} - x_{(i-1)} - x_{(i+m-1)}}{2}\left(\sum_{j=i}^{i+m-1}\frac{2}{x_{(j+m)} - x_{(j-m)}}\right)^{\!q} \\
{}+{} & \frac{1}{n^{q}}\sum_{i=n+2-m}^{n}\frac{x_{(i)} - x_{(i-1)}}{2}\left(\sum_{j=i}^{n}\frac{2}{x_{(j+m)} - x_{(j-m)}}\right)^{\!q}\Bigg]. \qquad (11)
\end{aligned}$$

An approach based on linear regression was also described for Shannon entropy [13]. Extending the result to the Renyi entropy yields the Correa estimator R_C:

$$R_C(q,m,n) = \frac{1}{1-q}\,\ln\!\left(\frac{1}{n}\sum_{i=1}^{n} b_i^{\,1-q}\right), \qquad (12)$$

where

$$b_i = \frac{\sum_{j=i-m}^{i+m}\left(x_{(j)} - \bar{x}_{(i)}\right)(j-i)}{n\sum_{j=i-m}^{i+m}\left(x_{(j)} - \bar{x}_{(i)}\right)^{2}}, \quad\text{and}\quad \bar{x}_{(i)} = \frac{1}{2m+1}\sum_{j=i-m}^{i+m} x_{(j)}. \qquad (13)$$


4 Methods

To test the accuracy and precision of the proposed estimators, experiments were performed on data with known distributions providing "ground truth". As many processes are assumed to have Gaussian statistics, entropy was estimated from data generated from Gaussian distributions with zero mean and standard deviation σ.

The exponential power (EP) distribution with shape parameter a is a symmetric distribution with varying tail behavior, and thus assumes a wide range of kurtosis values [15, 22]. Its pdf is given as:

$$f_X(x) = \frac{1}{2\,\Gamma(1 + 1/a)}\,\exp\!\left(-|x|^{a}\right), \quad -\infty < x < \infty,\ a \geq 1. \qquad (14)$$

As a → 1, heavy tails are seen, and the pdf becomes the double exponential. For a = 2, the pdf is Gaussian with mean zero and variance 1/2. The EP distribution converges to the uniform distribution as a → ∞.

The Renyi entropy for the Gaussian distribution is given as:

$$R_q(f) = \frac{1}{1-q}\,\ln\!\left[\left(\sqrt{2\pi}\,\sigma\right)^{1-q} q^{-\frac{1}{2}}\right], \qquad (15)$$

and for the EP distributions as:

$$R_q(f) = \frac{1}{1-q}\,\ln\!\left[2^{1-q}\,q^{-1/a}\,\Gamma^{1-q}\!\left(1 + \frac{1}{a}\right)\right]. \qquad (16)$$

Exponential power data is generated as Y(a) = ±(G(1/a, 1))^{1/a}, where G(θ, κ) is Gamma-distributed, and ± indicates a random sign [22].
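A sketch of the EP sampler just described together with the closed-form entropies (15) and (16) used as ground truth. It assumes that G(1/a, 1) denotes a Gamma variate with shape 1/a and unit scale; all function names are ours:

```python
import numpy as np
from scipy.special import gammaln

def renyi_gaussian(q, sigma):
    """Closed-form Renyi entropy of N(0, sigma^2), Eq. (15)."""
    return np.log((np.sqrt(2.0 * np.pi) * sigma) ** (1.0 - q) * q ** -0.5) / (1.0 - q)

def renyi_exp_power(q, a):
    """Closed-form Renyi entropy of the EP distribution with shape a, Eq. (16)."""
    return ((1.0 - q) * np.log(2.0) - np.log(q) / a
            + (1.0 - q) * gammaln(1.0 + 1.0 / a)) / (1.0 - q)

def sample_exp_power(n, a, rng):
    """Draw n EP variates as +/- G(1/a, 1)^(1/a), following [22]."""
    g = rng.gamma(shape=1.0 / a, scale=1.0, size=n)
    sign = rng.choice([-1.0, 1.0], size=n)
    return sign * g ** (1.0 / a)

# For a = 2 the EP pdf is a Gaussian with variance 1/2, so (15) and (16) agree.
print(renyi_exp_power(2.0, 2.0), renyi_gaussian(2.0, np.sqrt(0.5)))
rng = np.random.default_rng(0)
x = sample_exp_power(500, a=2.0, rng=rng)
```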

Although the parameter q in (2) can assume any real number, positive q are most commonly used [23]. Hence, experiments consisted of Renyi estimation for positive q values: q = 0.25, 0.5, 1.5, 2, 3, and 4. For the Gaussian data, 20 values of σ, linearly spaced from 0.01 to 40, were used. For EP data, 20 values of a were log-spaced from 1 to 20. Experiments were performed on sample sizes of n = 50, 100, and 200. For the proposed estimators, 20 smoothing parameters, linearly spaced from 1 to n/2 − 1, were used.

The entropies were also estimated using Parzen windows, with general form [1]:

$$f_n(x) = \frac{1}{nV_n}\sum_{i=1}^{n}\phi\!\left(\frac{x - x_i}{a_n}\right), \qquad (17)$$

where V_n is a normalization factor, a_n is a smoothing parameter dependent on n, and φ(·) is a kernel (smoothing) function. The density estimates are then substituted into the following equation to compute entropies [18]:

$$R_P(q,n,f) = \frac{1}{1-q}\,\ln\!\left(\frac{1}{n}\sum_{i=1}^{n} f_n(x_i)^{\,q-1}\right), \qquad (18)$$

where n is the sample size and f_n(x) is a nonparametric density estimate.

Three different kinds of kernels were used. The Gaussian kernel, φ_G, is given as:

$$\phi_G(x) = \frac{1}{\sqrt{2\pi}\,a_n}\exp\!\left(-\frac{x^{2}}{2a_n^{2}}\right), \qquad (19)$$

where, for the current experiments, a_n ∈ {n^{-1/1.1}, n^{-1/2}, n^{-1/4}, n^{-1/10}, 1} (a_n = n^{-1/4} is recommended in some literature [11]). The uniform kernel, φ_U, is given as:

$$\phi_U(x) = \frac{1}{a_n}, \quad |x| < \frac{1}{2}, \qquad (20)$$

where a_n ∈ {n^{-1/1.1}, n^{-1/2}, n^{-1/4}, n^{-1/10}, 1} as above. The double exponential kernel, φ_DE, which has been shown to be an accurate kernel entropy estimator [24], is given as:

$$\phi_{DE}(x) = \frac{1}{2a_n}\exp\!\left(-\frac{|x|}{a_n}\right). \qquad (21)$$

For this kernel, a_n ∈ {n^{-1/1.1}, n^{-1/2}, n^{-1/3}, n^{-1/4}, 1}, as the estimator was found to perform best for a_n = n^{-b}, where 1/4 < b < 1/2 [24].
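A minimal sketch of the Parzen-window plug-in estimator (17)-(18) with the Gaussian kernel (19), interpreting the kernel as already incorporating the bandwidth a_n (so it is applied directly to x_i − x_j), taking V_n = 1, and applying no leave-one-out correction; the function name is ours:

```python
import numpy as np

def parzen_renyi_gaussian(x, q, a_n):
    """Plug-in Renyi estimator, Eq. (18), using a Gaussian-kernel Parzen density."""
    x = np.asarray(x, dtype=float)
    diffs = x[:, None] - x[None, :]
    # f_n(x_i) = (1/n) * sum_j phi_G(x_i - x_j), with phi_G from Eq. (19)
    phi = np.exp(-diffs ** 2 / (2.0 * a_n ** 2)) / (np.sqrt(2.0 * np.pi) * a_n)
    f_hat = phi.mean(axis=1)
    return np.log(np.mean(f_hat ** (q - 1.0))) / (1.0 - q)

# 200 standard Gaussian samples with the a_n = n^(-1/4) bandwidth; true R_2 ~ 1.27.
rng = np.random.default_rng(0)
x = rng.standard_normal(200)
print(parzen_renyi_gaussian(x, q=2.0, a_n=200 ** -0.25))
```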

Fifty trials were performed for each different experiment. The root-mean-squared error, E_RMS, for each parameter set (σ for Gaussian data and a for the EP pdf) was computed, along with the bias for each distribution for each set of parameters. Analysis was based on the E_RMS and bias for increasing distribution parameter values, and the overall E_RMS and bias over all parameters for a specific distribution.
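A sketch of this evaluation protocol (50 trials, with E_RMS and bias measured against the closed-form value), assuming the estimator and ground-truth helpers from the earlier sketches are in scope:

```python
import numpy as np

def evaluate(estimator, sampler, true_value, n, trials=50, seed=0):
    """Return (E_RMS, bias) of `estimator` over `trials` independent samples of size n."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(sampler(n, rng)) for _ in range(trials)])
    errors = estimates - true_value
    return np.sqrt(np.mean(errors ** 2)), np.mean(errors)

# Example: Vasicek-type Renyi estimator on Gaussian data with q = 2, sigma = 1, n = 200.
q, sigma, n = 2.0, 1.0, 200
e_rms, bias = evaluate(
    estimator=lambda data: vasicek_renyi(data, q=q, m=10),
    sampler=lambda size, rng: sigma * rng.standard_normal(size),
    true_value=renyi_gaussian(q, sigma),
    n=n,
)
print(e_rms, bias)
```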

5 Results

The overall E_RMS values (over all parameters) were computed for each distribution, sample size, and estimation technique. The Renyi estimation results for the Gaussian and EP distributions are shown in Table 1. The Gaussian kernel was used for the Parzen window estimators. For all tested methods, entropy estimation was most accurate when q ≥ 1.5 for both tested distributions.

As q increases, the deviations from ideal distributions are expected to grow. Hence, m should increase with q. The m values yielding the lowest overall E_RMS were determined for the estimators for the three sample sizes. For all sample sizes, methods, and distributions, there is an increasing linear trend for m as q increases, from approximately 0.05n for q = 0.25 to 0.25n for q = 4. These trends, especially strong for the Vasicek and Correa estimators, are also seen in Table 1. However, the empiric estimator does not follow this linear trend.
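Read as a rule of thumb, the trend above can be coded as a simple linear interpolation between roughly 0.05n at q = 0.25 and 0.25n at q = 4; the interpolation itself is our reading of the reported trend, not a formula given in the paper:

```python
def suggest_m(n: int, q: float) -> int:
    """Heuristic smoothing parameter for the Vasicek and Correa estimators:
    a fraction of n rising linearly with q, clipped to 1 <= m < n/2."""
    q_clipped = min(max(q, 0.25), 4.0)
    frac = 0.05 + (0.25 - 0.05) * (q_clipped - 0.25) / (4.0 - 0.25)
    return max(1, min(int(round(frac * n)), n // 2 - 1))

print(suggest_m(200, 0.25), suggest_m(200, 2.0), suggest_m(200, 4.0))  # 10, 29, 50
```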

Plots for E_RMS and bias for the Renyi metrics, as functions of increasing distribution parameters, are shown in Figs. 1, 2, 3, 4 for q = 0.25. Plots from the three overall best Parzen window kernels are also shown. The best Parzen estimators were found to be negatively biased for the EP distribution, and positively biased for the Gaussian distribution, with increasing values of the distribution parameters. For the sample spacing estimators, a positive bias is observed for the EP data, but this bias decreases with increasing a. It is seen from these figures that the sample spacing methods are generally more accurate than the Parzen methods as the distribution parameter increases. All sample spacing estimators followed the expected trend of decreasing errors with increasing sample size. Overall, the Correa estimator had the lowest E_RMS and bias values for most experiments, and performed well for m chosen as linearly related to q for all distributions.

Density estimation requires that the probability at each of the n data samples be a sum of kernel functions for all data samples. From (17), there are n operations (kernel function evaluations) and n − 1 additions per data sample. As there are n data samples, and ignoring multiplication by constants and taking logarithms, computing the Renyi entropy from Parzen windows requires approximately 2(n² + n) operations.

The sample spacing estimators require data sorted in increasing order. Sorting can be accomplished with O(n log n) operations. From (10), there are n exponentiations, and about 2n additions. Computing Renyi entropy with the Vasicek estimator requires about 3n + 1.4n log₂ n operations. Thus, the complexity is in Θ(n) for the Vasicek estimator, and in Θ(n²) for the Parzen window estimator. For the empiric estimator, the number of operations is dependent on m, with the maximum occurring when m = n/2 − 1. From (11), it is seen that there are only n exponentiation operations. The complexity for the empiric estimator is in Θ(n + n²) operations (with Θ(n) exponentiations). The Correa estimator requires n additions to compute each mean (assuming that m is chosen to be the maximum n/2 − 1), and 5n operations are required to compute (13). To compute (12) requires about 17n² + n + 1.4n log₂ n operations. From this analysis, the proposed estimators generally have a lower computational complexity than Parzen window estimators. Among the sample spacing estimators, the Correa method is the most complex.

Fig. 1 E_RMS for Renyi entropy estimation for Gaussian data, q = 0.25

Table 1 E_RMS for Renyi entropy estimators for the Gaussian and exponential power distributions, listed by q (best values)

Gaussian distribution:

| n   | Estimator | q=0.25 | q=0.5 | q=1.5 | q=2   | q=3   | q=4   |
|-----|-----------|--------|-------|-------|-------|-------|-------|
| 50  | R_V       | 0.461  | 0.298 | 0.146 | 0.130 | 0.117 | 0.117 |
| 50  | R_E       | 0.213  | 0.189 | 0.179 | 0.152 | 0.126 | 0.117 |
| 50  | R_C       | 0.292  | 0.159 | 0.111 | 0.117 | 0.124 | 0.135 |
| 50  | R_P       | 0.951  | 0.777 | 0.639 | 0.636 | 0.652 | 0.677 |
| 100 | R_V       | 0.375  | 0.222 | 0.086 | 0.076 | 0.082 | 0.086 |
| 100 | R_E       | 0.178  | 0.127 | 0.128 | 0.107 | 0.088 | 0.083 |
| 100 | R_C       | 0.235  | 0.112 | 0.075 | 0.084 | 0.092 | 0.099 |
| 100 | R_P       | 0.731  | 0.565 | 0.455 | 0.456 | 0.476 | 0.498 |
| 200 | R_V       | 0.305  | 0.156 | 0.054 | 0.056 | 0.062 | 0.067 |
| 200 | R_E       | 0.146  | 0.139 | 0.095 | 0.078 | 0.063 | 0.061 |
| 200 | R_C       | 0.183  | 0.074 | 0.056 | 0.058 | 0.064 | 0.069 |
| 200 | R_P       | 0.585  | 0.436 | 0.346 | 0.353 | 0.367 | 0.388 |

Exponential power distribution:

| n   | Estimator | q=0.25 | q=0.5 | q=1.5 | q=2   | q=3   | q=4   |
|-----|-----------|--------|-------|-------|-------|-------|-------|
| 50  | R_V       | 0.353  | 0.247 | 0.173 | 0.172 | 0.194 | 0.211 |
| 50  | R_E       | 0.168  | 0.140 | 0.192 | 0.181 | 0.177 | 0.182 |
| 50  | R_C       | 0.204  | 0.118 | 0.092 | 0.097 | 0.112 | 0.118 |
| 50  | R_P       | 0.493  | 0.392 | 0.302 | 0.287 | 0.280 | 0.274 |
| 100 | R_V       | 0.283  | 0.176 | 0.105 | 0.107 | 0.121 | 0.132 |
| 100 | R_E       | 0.133  | 0.093 | 0.136 | 0.128 | 0.122 | 0.126 |
| 100 | R_C       | 0.156  | 0.074 | 0.062 | 0.066 | 0.074 | 0.082 |
| 100 | R_P       | 0.492  | 0.399 | 0.314 | 0.305 | 0.297 | 0.292 |
| 200 | R_V       | 0.229  | 0.129 | 0.068 | 0.065 | 0.071 | 0.081 |
| 200 | R_E       | 0.104  | 0.100 | 0.106 | 0.092 | 0.086 | 0.091 |
| 200 | R_C       | 0.120  | 0.053 | 0.042 | 0.046 | 0.052 | 0.057 |
| 200 | R_P       | 0.498  | 0.412 | 0.339 | 0.331 | 0.321 | 0.312 |

Just as the choice of a smoothing parameter and kernel function for Parzen window methods affects the accuracy of the estimate, the smoothing parameter m in the proposed sample spacing methods is also an important consideration. General heuristics can be considered, given that they are reasonable for many distributions. Firstly, the value of m, as a fractional multiple of n, is not dependent on the specific value of n. Secondly, for the Vasicek and Correa estimators, with increasing q, the value of m should correspondingly increase. Thirdly, from the experiments performed in this study, m should not be chosen to be above 0.25n for q ≤ 4. Selection of m for the empiric estimator is more problematic, as the m values that have the best overall E_RMS and bias do not follow this trend. Optimal selection of smoothing parameters is still an open problem, as in the case of estimating Shannon entropy [15].

Fig. 2 Bias for Renyi entropy estimation for Gaussian data, q = 0.25

Fig. 3 E_RMS for Renyi entropy estimation for exponential power data, q = 0.25

6 Conclusion

Generalized entropy estimation methods based on sample spacings were proposed in this study. Experimental results demonstrate that these estimators compare favorably to Parzen window methods in terms of accuracy and computation time. The Correa estimator had the best overall performance. The empiric estimator also performed well, but the optimal m is not linearly related to q. A promising future direction may be an adaptive approach in which m is selected on the basis of first-order statistics, in addition to sample size and entropy order (corresponding to kernel and smoothing parameter selection for Parzen window estimators). The empiric and Correa estimators are also conducive to parallelization. Future work also includes an analysis of the effect of the smoothing parameter, especially for nonsmooth and multimodal density functions, and comparative studies with efficient "plug-in" estimators.

The speed and accuracy of the sample spacing estimators make these techniques attractive, indicating that they are complementary estimation methods, as the use of generalized entropies increases in pattern recognition applications.

7 Originality and contributions

Generalized entropies, including the Renyi and Tsallis definitions, are gaining popularity in a number of signal/image processing and pattern recognition applications. These measures are parameterized by a real number, and become the Shannon entropy in some limit of this parameter. Kernel-based estimators, such as Parzen windows, are widely used to estimate generalized entropies for continuous-valued data. The contribution of this paper is to adapt three new techniques for nonparametric Renyi entropy estimation of one-dimensional or time series data. Estimators based on sample spacings, including the Vasicek, empiric, and Correa methods, originally developed for Shannon entropy, are adapted for generalized entropy estimation. These methods require a positive integer spacing, which acts as a smoothing parameter. The accuracy and computation time of these new estimators are compared with Parzen window approaches. Generalized entropy estimators based on sample spacings have been shown to be fast, robust, and accurate, and may prove useful in many pattern recognition applications.

Fig. 4 Bias for Renyi entropy estimation for exponential power data, q = 0.25

Mark P. Wachowiak received the Ph.D. degree from the University of Louisville (Louisville KY, USA) in 2002. He is currently a researcher in the Imaging Laboratories at the Robarts Research Institute in London, Canada. His interests include mathematical modeling of biological systems, signal processing, medical imaging, and computational science and engineering.

Renata Smolíková received doctorate degrees from Palacky University (Olomouc, Czech Republic) and the University of Louisville (Louisville KY, USA). She was an assistant professor in the Department of Mathematics at the University of Ostrava (Ostrava, Czech Republic), and a researcher at the Institute for Fuzzy Modeling and Applications at the University of Ostrava. Her research interests include applied mathematics, signal processing, machine learning, and fuzzy logic.

Georgia D. Tourassi is a professor at the Department of Radiology at Duke University (Durham NC, USA) and an adjunct professor at the University of Louisville (Louisville KY, USA). Her research interests include biomedical engineering, computer-aided diagnostics, and proteomics.

Adel S. Elmaghraby is Chair and Professor in the Department of Computer Engineering and Computer Science at the University of Louisville (Louisville KY, USA). He received the Ph.D. degree from the University of Wisconsin, Madison. Dr. Elmaghraby is a Senior Member of the IEEE. His research interests include biomedical computing, multimedia, security, simulation, and performance evaluation.

Acknowledgments The authors thank the anonymous reviewers for helpful criticisms and suggestions.

References

1. Silverman BW (1986) Density estimation for statistics and data analysis. Chapman and Hall, London

2. Renyi A (1970) Probability theory. North-Holland, Amsterdam

3. Tsallis C (1988) Possible generalization of Boltzmann-Gibbs statistics. J Stat Phys 52:479–487

4. Smolíková R, Wachowiak MP, Zurada JM (2004) An information-theoretic approach to estimating ultrasound backscatter characteristics. Comput Biol Med 34:355–370

5. Krishnamachari A, Mandal VM, Karmeshu (2004) Study of DNA binding sites using the Renyi parametric entropy measure. J Theor Biol 227:429–436

6. Tong S, Bezerianos A, Malhotra A, Zhu Y, Thakor N (2003) Parameterized entropy analysis of EEG following hypoxic-ischemic brain injury. Phys Lett A 314:354–361

7. Havrda J, Charvat F (1967) Quantification method of classification processes: concept of structural α-entropy. Kybernetika 3:30–35

8. Rosso OA, Martin MT, Plastino A (2003) Brain electrical activity analysis using wavelet-based informational tools (II): Tsallis non-extensivity and complexity measures. Physica A 320:497–511

9. Wachowiak MP, Smolíková R, Peters TM (2003) Multiresolution biomedical image registration using generalized information measures. Lecture Notes in Computer Science 2899 (MICCAI 2003), pp 846–853

10. Vasicek O (1976) A test for normality based on sample entropy. J Roy Stat Soc B 38:54–59

11. Dudewicz E, van der Meulen EC (1987) The empiric entropy, a new approach to nonparametric entropy estimation. In: Puri ML, Vilaplana JP, Wertz W (eds) New perspectives in theoretical and applied statistics. Wiley, NY

12. van Es B (1992) Estimating functionals related to a density by a class of statistics based on spacings. Scand J Stat 19:61–72

13. Correa JC (1995) A new estimator of entropy. Commun Stat Theor 24:2439–2449

14. Beirlant J, Dudewicz E, Gyorfi L, van der Meulen EC (1997) Nonparametric entropy estimation: an overview. Int J Math Stat Sci 6(1):17–39

15. Wieczorkowski R, Grzegorzewski P (1999) Entropy estimators-improvements and comparisons. Commun Stat Simul 28(2):541–567

16. Grassberger P (2003) Entropy estimates from insufficient samplings. ArXiv:physics/0307138

17. Hero A, Ma B, Michel O, Gorman J (2002) Applications of entropic spanning graphs. IEEE Signal Proc Mag 19(5):85–95

18. Erdogmus D, Principe JC (2001) Entropy minimization algorithm for multilayer perceptrons. In: Proceedings of INNS-IEEE conference on neural networks (IJCNN), Washington, DC, pp 3003–3008

19. Holste D, Grosse I, Herzel H (1998) Bayes' estimators of generalized entropies. J Phys A 31:2551–2566

20. Wolpert DH, Wolf DR (1995) Estimating functions of probability distributions from a finite set of samples. Phys Rev E 52(6):6841–6854

21. Paninski L (2003) Estimation of entropy and mutual information. Neural Comput 15:1191–1253

22. Tadikamalla PR (1980) Random sampling from the exponential power distribution. J Am Stat Assoc 75(371):683–686

23. Golan A, Perloff JM (2002) Comparison of maximum entropy and higher-order entropy estimators. J Econom 107(1–2):195–211

24. Eggermont PB, LaRiccia VN (1999) Best asymptotic normality of the kernel density entropy estimator for smooth densities. IEEE T Inform Theory 45(4):1321–1325
