Differential geometry-based techniques for characterization of boundary roughness of pulmonary...


Int J CARS
DOI 10.1007/s11548-015-1284-0

ORIGINAL ARTICLE

Differential geometry-based techniques for characterization of boundary roughness of pulmonary nodules in CT images

Ashis Kumar Dhara 1 · Sudipta Mukhopadhyay 1 · Pramit Saha 2 · Mandeep Garg 3 · Niranjan Khandelwal 3

Received: 13 March 2015 / Accepted: 13 August 2015
© CARS 2015

Abstract

Purpose Boundary roughness of a pulmonary nodule is an important indication of its malignancy. The irregularity of the shape of a nodule is represented in terms of a few diagnostic characteristics such as spiculation, lobulation, and sphericity. Quantitative characterization of these diagnostic characteristics is essential for designing a content-based image retrieval system and a computer-aided system for diagnosis of lung cancer.

Methods This paper presents differential geometry-based techniques for computation of spiculation, lobulation, and sphericity using the binary mask of the segmented nodule. These shape features are computed in 3D considering the complete nodule.

Results The performance of the proposed and competing methods is evaluated in terms of precision, mean similarity, and normalized discounted cumulative gain on 891 nodules of the Lung Image Database Consortium and Image Database Resource Initiative. The proposed methods are comparable to or better than the gold standard technique. The reproducibility of the proposed feature extraction techniques is evaluated using the RIDER coffee break data set. The mean and standard deviation of the percent change of spiculation, lobulation, and sphericity are 1.66 ± 2.36, 10.57 ± 11.63, and 6.27 ± 7.99 %, respectively.

Corresponding author: Sudipta Mukhopadhyay

1 Department of Electronics and Electrical Communication Engineering, Indian Institute of Technology Kharagpur, Kharagpur 721302, India

2 Department of Electrical Engineering, Jadavpur University, Kolkata 700032, India

3 Department of Radiodiagnosis, Postgraduate Institute of Medical Education and Research, Chandigarh 160023, India

Conclusion The prior works on computation of spiculation, lobulation, and sphericity require a set of four ground truths from radiologists and, hence, cannot be used in practice. The proposed methods do not require ground truth information of nodules from radiologists and, hence, can be used in a real-life computer-aided diagnosis system for lung cancer.

Keywords Lung cancer · CT images · Pulmonary nodule · Differential geometry · Spiculation · Lobulation · Sphericity · Content-based image retrieval · Precision · Mean similarity · Normalized discounted cumulative gain

Introduction

In the USA, lung cancer is the leading cause of cancer-related death [1]. Early diagnosis of lung cancer could increase the five-year survival rate from 10 to 80% [2]. Most pulmonary nodules are an early indication of lung cancer [3]. The nodules are blob-like structures with a maximum diameter ranging from 3 to 30 mm. The diagnostic characteristics of a nodule (spiculation, lobulation, and sphericity) assist radiologists in planning a biopsy or follow-up scan. Spiculation is associated with the radial extension of malignant cells along interlobular septa, lymphatics, small airways, or blood vessels and appears as the spokes of a wheel (Fig. 1a). A lobulated margin is associated with uneven rates of growth of a nodule (Fig. 1b). The irregularity of the shape of a nodule is represented by sphericity. The value of sphericity is close to 1 for nodules with a spherical shape (Fig. 1c). The likelihood of malignancy is 93% for a densely spiculated margin and 82% for a lobulated margin [4]. Most benign pulmonary nodules have a well-defined and smooth boundary, although 21% of nodules with a well-defined and smooth border are malignant.

It should be noted that there is no computational technique for quantification of spiculation, lobulation, and sphericity in the literature, though commercial CAD systems are available for diagnosis of lung cancer. The reported 2D techniques make use of the boundary information provided by radiologists to compute spiculation, lobulation, and sphericity [5,6]. In the proposed method, spiculation, lobulation, and sphericity are computed in 3D using the binary mask of the segmented nodule, obtained by the method of Dhara et al. [7]. Out of the three features, only the computation of spiculation is an improvement of the prior work of Dhara et al. [7]. The methods of computation of lobulation and sphericity are proposed for the first time. The proposed feature extraction techniques can be used in a real-life content-based image retrieval (CBIR) system and computer-aided diagnosis (CAD) system, because these techniques do not require the involvement of radiologists for nodule segmentation. The characterization of diagnostic shape features could lead to elimination of the interobserver variability during estimation of malignancy. To the best of our knowledge, there is no existing work on the computation of spiculation, lobulation, and sphericity considering the complete nodule. Therefore, the proposed work can be considered a pioneering research effort that could quantify the shape features of nodules to improve the performance of retrieval and classification of pulmonary nodules.

Fig. 1 Examples of pulmonary nodules observed in lung CT: a presence of spicules, b presence of lobules, and c spherical in structure

The paper is organized into several sections: state-of-the-art methods are described in the “Reported works on characterization of spiculation, lobulation, and sphericity” section, the proposed methods of computing spiculation, lobulation, and sphericity are described in the “Methods for characterization of spiculation, lobulation, and sphericity” section, the data sets are described in the “Database used in evaluation of proposed techniques” section, performance evaluation metrics are provided in the “Performance metrics” section, and the results are provided in the “Results and discussion” section. In the “Conclusion” section, the conclusion and future scope of improvement are stated.

Reported works on characterization of spiculation, lobulation, and sphericity

Raicu et al. [6] developed a framework for predicting the rank of nine diagnostic characteristics: subtlety, texture, calcification, internal structure, lobulation, margin, sphericity, spiculation, and malignancy. A total of 1106 images were collected from 73 nodules of the Lung Image Database Consortium (LIDC) public database [9]. To quantify the image content, a total of fifty-nine features were extracted from the biggest representative slice of the nodule. The boundary of each nodule is annotated by radiologists. These features include shape, size, gray-level intensity, and texture information of the nodule. Logistic regression was applied to generate the prediction model. The reported correlations between semantic characteristics and image features of spiculation, lobulation, and sphericity are 0.32, 0.28, and 0.24, respectively.

Horsthemke et al. [5] developed a CAD system for predicting the rank of spiculation, lobulation, and sphericity using the boundary region-based features of the biggest representative slice of the nodule. The boundary region is formed as the pixel set difference between the union and intersection of all radiologist-drawn outlines. The radial gradient index (RGI) is computed for all the pixels of the boundary region. The standard deviation and the full width at half maximum (FWHM) are calculated from the gradient magnitude-weighted RGI histogram. Two image entropies are computed for the boundary pixels using the intensity of the pixels and the gradient magnitude, respectively. A set of five lower-order Zernike moments [8] is computed from the intensity of pixels within the boundary region. These image features are fed to decision trees for predicting the rank of spiculation, lobulation, and sphericity. The method was validated on 147 nodules of the LIDC public database [9]. The accuracy of prediction of the median rating for spiculation, lobulation, and sphericity is 53, 44, and 58%, respectively.

Dhara et al. [7] reported a differential geometry-based technique for predicting the rank of spiculation of a solitary pulmonary nodule (SPN). In the first step, the marching cubes algorithm (MCA) [12] is applied to the binary mask of the segmented nodule to generate an equivalent triangular mesh. In the next step, the peak of a spicule is identified using the properties of Gaussian and mean curvature computed at each vertex of the triangular mesh. Once the peak of a particular spicule is identified, the base is determined by considering the sudden change of the base area of that spicule. The spiculation value for an individual spicule is determined using the solid angle subtended at the peak of the spicule by its base and the height of the spicule. The computed spiculation of an SPN is the weighted sum over all the spicules. The method was validated on a small data set of 95 SPNs of the Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) public database. The reported accuracy of predicting composite spiculation is 87.4%. The method was not validated for part-solid and non-solid nodules.


Methods for characterization of spiculation, lobulation, and sphericity

Segmentation of pulmonary nodules

Segmentation of pulmonary nodules is a prerequisite step for acceptable performance of a CAD system designed for diagnosis of lung cancer from CT images. The pulmonary nodules are segmented using the method of Dhara et al. [10]. In this segmentation framework, pulmonary nodules are extracted using a threshold-based segmentation followed by removal of the attached pleural surface and blood vessels. The pleural attachment and vascular attachment are efficiently removed by limiting the VOI to an ellipsoid. The pleural surface removal technique is applied in the ellipsoid VOI, considering the fact that the lungs are mostly convex. The convex hull of the lung parenchyma within the bounded ellipsoid is used to mask out the chest wall part of the VOI. The blood vessels are removed by means of a vasculature pruning technique.

Computation of spiculation

A differential geometry-based technique is introduced for computation of spiculation of pulmonary nodules. In the proposed method, spicules are modeled as conical objects. The solid angle subtended at the vertex and the height of the conical object are used for computation of spiculation. The change in spiculation for different values of solid angle and height of spicules is illustrated in Fig. 2. The procedure of computation of spiculation of a nodule is described in Fig. 3. Two steps, (i) identification of the base of a spicule and (ii) measurement of the solid angle subtended at the spicule tip, are borrowed from the prior work of Dhara et al. [7], and the remaining steps are introduced for the first time.

Extraction of surface voxels

The surface voxels of the binary mask of the segmented nodule (Fig. 5b) are identified based on the fact that each surface voxel must have at least one exposed face touching the background. The list of surface voxels is used in the subsequent steps.
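The exposed-face criterion can be illustrated with a minimal sketch. The representation of the mask as a set of integer (x, y, z) tuples and the function name `surface_voxels` are assumptions made for illustration and are not taken from the paper.

```python
def surface_voxels(mask):
    """Return foreground voxels that have at least one exposed face.

    `mask` is a set of (x, y, z) integer tuples marking foreground voxels
    (an assumed representation). A face is exposed when the 6-connected
    neighbour across that face is background.
    """
    faces = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    surface = set()
    for (x, y, z) in mask:
        if any((x + dx, y + dy, z + dz) not in mask for dx, dy, dz in faces):
            surface.add((x, y, z))
    return surface


# Toy example: in a solid 3 x 3 x 3 cube only the central voxel is interior.
cube = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)}
cube_surface = surface_voxels(cube)
```

For this toy cube, every voxel except the central one is reported as a surface voxel.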

Computation of geodesic distance

Fig. 2 Spiculation for different types of conical objects (using Eq. ): a–c represent increment in spiculation with increase in height of cone for a constant area of spicule base; d–f represent decrement in spiculation with increment in area of cone for a constant height of spicule

The values of geodesic distance (GD) at spicule tips are relatively higher than at the remaining surface voxels of the nodule. GD helps to segregate the spicule tips from the list of surface voxels. The schematic diagram for computing GD in 2D is given in Fig. 4. GD is computed with respect to the centroid of the binary mask, with the centroid assigned zero geodesic distance. For a 3D object, the 26-neighborhood is considered for computation of GD. Thus, the GD of all voxels of the binary mask can be computed by incrementally growing the region from the centroid (Fig. 5c).
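The region-growing computation of GD can be sketched as a breadth-first search over the 26-neighborhood, which yields step counts in the chessboard (L∞) sense. This is only a sketch under stated assumptions: the mask is a set of (x, y, z) tuples, and when the centroid does not coincide with a foreground voxel, the nearest foreground voxel is used as the seed. Both choices, and the name `geodesic_distance`, are hypothetical.

```python
from collections import deque
from itertools import product


def geodesic_distance(mask):
    """Breadth-first geodesic distance (in voxel steps) from the mask centroid.

    `mask` is a set of (x, y, z) tuples. The seed is the foreground voxel
    nearest to the centroid (an assumption for centroids that fall off-grid).
    """
    n = len(mask)
    cx = sum(v[0] for v in mask) / n
    cy = sum(v[1] for v in mask) / n
    cz = sum(v[2] for v in mask) / n
    seed = min(mask, key=lambda v: (v[0] - cx) ** 2 + (v[1] - cy) ** 2 + (v[2] - cz) ** 2)

    # All 26 neighbours of a voxel in 3D.
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]
    gd = {seed: 0}
    queue = deque([seed])
    while queue:
        x, y, z = queue.popleft()
        for dx, dy, dz in offsets:
            nb = (x + dx, y + dy, z + dz)
            if nb in mask and nb not in gd:
                gd[nb] = gd[(x, y, z)] + 1
                queue.append(nb)
    return gd


# Toy nodule: a 3 x 3 x 3 core with a three-voxel protrusion along +x.
nodule = {(x, y, z) for x in range(3) for y in range(3) for z in range(3)}
nodule |= {(3, 1, 1), (4, 1, 1), (5, 1, 1)}
gd = geodesic_distance(nodule)
```

In this toy nodule, the end of the protrusion receives the largest GD, which is the property the method exploits to separate spicule tips from other surface voxels.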

Creation of voxel transformation map

The purpose of this step is to boost the value at spicule tips, which helps in their detection. In order to create the voxel transformation map, a parameter named voxel density (VD) is introduced, which is defined for a voxel as the ratio of the number of foreground voxels in its (9 × 9 × 9) neighborhood to the total number of voxels contained in a cube of size (9 × 9 × 9). The voxel transformation map is shown in Fig. 5d and defined as

$$F_1(v) = \frac{GD}{(VD)^2}, \qquad (1)$$

where v represents the location of a voxel.
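A minimal sketch of Eq. 1 follows, assuming the mask is a set of (x, y, z) tuples and that GD is already available as a dictionary keyed by voxel. The function names `voxel_density` and `transformation_map` are hypothetical.

```python
def voxel_density(mask, v, size=9):
    """Fraction of foreground voxels in the size^3 cube centred on voxel v."""
    r = size // 2
    x, y, z = v
    count = sum(
        (x + dx, y + dy, z + dz) in mask
        for dx in range(-r, r + 1)
        for dy in range(-r, r + 1)
        for dz in range(-r, r + 1)
    )
    return count / size ** 3


def transformation_map(mask, gd):
    """Evaluate F1(v) = GD / VD^2 for every voxel present in `gd`.

    VD is never zero for a foreground voxel because the neighbourhood
    contains the voxel itself.
    """
    return {v: gd[v] / voxel_density(mask, v) ** 2 for v in gd}


# Toy example: a solid 9 x 9 x 9 block with hand-assigned GD values.
block = {(x, y, z) for x in range(9) for y in range(9) for z in range(9)}
f1 = transformation_map(block, {(4, 4, 4): 0, (0, 0, 0): 4})
```

The corner voxel has both a larger GD and a lower VD than the centre, so its F1 value is boosted, which mirrors the behaviour described for spicule tips.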

Identification of tip of a spicule

Fig. 3 Block diagram of the steps of computing spiculation: binary mask of segmented nodule → extraction of surface voxels → computation of geodesic distance → creation of voxel transformation map → identification of tip of a spicule → identification of base of a spicule → measurement of solid angle subtended at spicule tip → computation of spiculation index → spiculation index

Fig. 4 Geodesic distance of pixels from the center using the L∞ norm

Dhara et al. [7] identified the tip of a spicule based on the properties of Gaussian and mean curvatures at each vertex of the triangular mesh of a nodule. Gaussian and mean curvatures are sensitive to undulation of the triangular mesh and produce false positives in the detection of spicule tips. In the proposed method, the spicule tip is identified using the voxel transformation map given in Eq. 1. In the voxel transformation map F1, the spicule tips have higher values than the remaining surface voxels, as GD is larger and VD is smaller at a spicule tip. The surface voxels are sorted in ascending order of their voxel transformation map values. The surface voxels above a threshold value T_st are considered spicule tips (Fig. 5e). The procedure for determining T_st is described in the “Determination of parameters associated with the proposed methods” section.
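The sorting and thresholding of surface voxels can be sketched as below. The dictionary `f1` of voxel transformation values, the function name, and the numeric threshold in the toy example are illustrative assumptions; the paper determines T_st by a separate procedure.

```python
def spicule_tip_candidates(f1, surface, t_st):
    """Return surface voxels whose F1 value exceeds the threshold t_st.

    `f1` maps voxel -> voxel transformation value and `surface` is an
    iterable of surface voxels. Candidates are returned in ascending
    order of F1, following the sorting step described in the text.
    """
    ranked = sorted(surface, key=lambda v: f1[v])
    return [v for v in ranked if f1[v] > t_st]


# Toy values only; t_st = 4.0 is an arbitrary illustrative threshold.
toy_f1 = {(0, 0, 0): 1.0, (5, 1, 1): 9.0, (2, 2, 0): 5.0}
tips = spicule_tip_candidates(toy_f1, toy_f1.keys(), t_st=4.0)
```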

Computation of the area of the base of a spicule

The base of a spicule is identified by tracing the sudden change in the area of intersection between the mask of the nodule and a plane moving from the spicule tip toward the centroid of the nodule. The area of the region of intersection (A) is determined from the eigenvalues of the covariance matrix of the boundary points of the region [11] and is computed as

$$A = \pi \sqrt{\lambda_1 \lambda_2}, \qquad (2)$$

where A is the area and λ1 and λ2 are the eigenvalues of the covariance matrix. This step is similar to the prior work of Dhara et al. [7].

Measurement of solid angle subtended at spicule tip

This step is similar to the prior work of Dhara et al. [7]. The perpendicular distance of the spicule tip from the base of the spicule is considered as the height of the spicule (h). The sharpness of a spicule is measured in terms of the solid angle subtended at the tip of the spicule by its base. The solid angle for the i-th spicule (ωi) is computed as

$$\omega_i = \frac{A}{h^2}. \qquad (3)$$

Computation of spiculation

Fig. 5 The steps of computing spiculation: a original CT image slice, b binary mask of the segmented nodule, c geodesic distance transformation of (b), d voxel transformation map of (b), and e tip of a spicule marked with a red dot. The protrusion on the right side is an attachment to a vessel and does not qualify as a spicule

In the present work, the way of computing the net spiculation of a nodule differs from the prior work of Dhara et al. [7]. In order to emphasize the sharp spicules, ωi and hi can be combined in several ways, and we have computed the net spiculation of a nodule in two different ways:

$$\text{Spiculation} = \sum_{i=1}^{N} e^{-\omega_i} h_i, \qquad (4)$$

$$\text{Spiculation} = \frac{\sum_{i=1}^{N} h_i \cos\omega_i}{\sum_{i=1}^{N} h_i}, \qquad (5)$$

where ωi is the solid angle subtended at the peak point of the i-th spicule, hi is the height of the i-th spicule, and N is the total number of spicules. False positives are eliminated by limiting the solid angle to within 0.7 steradian [7]. The computed spiculation is a non-negative quantity. A zero value of spiculation indicates the absence of spicules, and a higher value of spiculation indicates prominence of spicules in the nodule. The efficacy of these two equations (Eqs. 4, 5) is evaluated in the “Reproducibility of computed features using proposed feature extraction techniques” section.
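The two ways of combining ωi and hi can be sketched as follows. This is a sketch under stated assumptions: Eq. 4 is read as a sum of exp(−ωi)·hi, spicules are given as (solid angle, height) pairs, candidates with a solid angle above 0.7 steradian are discarded, and Eq. 5 is taken to return zero when no spicule remains. The function names are hypothetical.

```python
import math


def solid_angle(base_area, height):
    """Solid angle of a spicule as in Eq. 3: omega = A / h^2."""
    return base_area / height ** 2


def _keep(spicules, max_angle):
    # Discard candidate spicules whose solid angle exceeds the limit.
    return [(w, h) for w, h in spicules if w <= max_angle]


def spiculation_exp(spicules, max_angle=0.7):
    """Eq. 4, read here as the sum of exp(-omega_i) * h_i."""
    return sum(math.exp(-w) * h for w, h in _keep(spicules, max_angle))


def spiculation_cos(spicules, max_angle=0.7):
    """Eq. 5: height-weighted mean of cos(omega_i); 0.0 if no spicule is kept."""
    kept = _keep(spicules, max_angle)
    total_height = sum(h for _, h in kept)
    if total_height == 0:
        return 0.0
    return sum(h * math.cos(w) for w, h in kept) / total_height


# Toy spicules: the second has omega = 1.0 sr and is rejected as too blunt.
toy_spicules = [(solid_angle(2.0, 2.0), 2.0), (solid_angle(9.0, 3.0), 3.0)]
value_eq4 = spiculation_exp(toy_spicules)
value_eq5 = spiculation_cos(toy_spicules)
```

Under these assumptions, Eq. 5 is bounded above by 1 because it is a weighted mean of cosines, whereas Eq. 4 grows with the number and height of spicules.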

Computation of lobulation

Uneven growth of pulmonary nodules creates lobulation. For spherical objects, the ratio of concave surface area to convex surface area is zero (Fig. 6a), whereas the ratio increases in the presence of lobulations (Fig. 6b). This fact motivates us to characterize lobulation as the ratio of the total area of the concave surface to the total area of the convex surface of a nodule. The procedure of computation of lobulation of a nodule is depicted in Fig. 7.

Fig. 6 The value of lobulation index for a a spherical object (lobulation index = 0) and b a lobulated object (lobulation index = 0.23). The objects are represented using triangular meshes

Fig. 7 Block diagram of the steps of computing lobulation: binary mask of segmented nodule → triangular mesh representation of nodule → retriangulation of mesh → detection of seed points in convex and concave regions → growing of concave and convex surfaces → computation of lobulation → lobulation index

Triangular mesh representation of nodule

The triangular mesh is obtained from the binary mask of the segmented nodule using MCA [12] (Fig. 8b). In MCA, each voxel is treated as a cube, and the isosurface is constructed in such a way that the intersected cube edges have one or more voxel intensity values above and below the predefined intensity value. On traversal of all the surface voxels, a complete triangular mesh is generated.

Smoothing of triangular mesh

The triangular mesh contains sharp or skinny angles. Hence, mesh smoothing is applied to improve the quality of the mesh. In the first step of mesh smoothing, Laplacian smoothing is used. In the second step, quadratic surface fitting is performed. The mesh smoothing algorithm considers the 2nd-ring neighborhood. The result of triangular mesh smoothing is shown in Fig. 8c.

Finding of seed points in convex and concave regions on triangular mesh

The seed points in the concave and convex parts of the triangular mesh are determined using the shape index (SI) [13]. Every distinct shape corresponds to a unique SI value (see Table 1). With k1 and k2 denoting the maximum and minimum principal curvatures, respectively, at a particular vertex of the triangular mesh of the nodule, SI is defined as

$$SI = \frac{2}{\pi} \arctan\left(\frac{k_1 + k_2}{k_1 - k_2}\right). \qquad (6)$$

The principal curvatures at each vertex of the triangular mesh are computed using the method of Dong et al. [14]. In the proposed method, the seed points in the convex part and the concave part of the triangular mesh are determined using SI > 0.63 and SI < −0.30, respectively.
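Eq. 6 and the seed thresholds can be sketched as below. The use of `atan2` to handle umbilic points (k1 = k2), the value returned for a flat point (k1 = k2 = 0), and the function names are assumptions for illustration. The sign of SI also depends on the orientation chosen for the surface normals, which is not specified here.

```python
import math


def shape_index(k1, k2):
    """Shape index of Eq. 6 from the two principal curvatures.

    atan2 is used so that k1 == k2 does not divide by zero; for a flat
    point (k1 == k2 == 0) SI is undefined and this sketch returns 0.0.
    """
    if k1 < k2:
        k1, k2 = k2, k1  # enforce k1 >= k2
    return (2.0 / math.pi) * math.atan2(k1 + k2, k1 - k2)


def seed_label(si):
    """Label a vertex as a seed using the thresholds quoted in the text."""
    if si > 0.63:
        return "convex"
    if si < -0.30:
        return "concave"
    return None


# Sphere-like cap, saddle, and sphere-like cup, respectively.
examples = [shape_index(1.0, 1.0), shape_index(1.0, -1.0), shape_index(-1.0, -1.0)]
labels = [seed_label(si) for si in examples]
```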

Growing of convex and concave surface

Once the seed points are identified, the convex part and the concave part of the triangular mesh are obtained by expanding the regions around the seed points based on ranges of SI (Table 2). Along with SI, curvedness ($\sqrt{k_1^2 + k_2^2}$) is also used to control the growing of the region around a seed point. The curvedness is a measure of the intensity of curvature [13]. In the proposed method, the value of curvedness is kept > 0.5 for growing of both convex and concave regions. The result of region growing for the convex parts and the concave parts of the triangular mesh is shown in Fig. 8d. The total convex area S_convex and the total concave area S_concave are computed by adding all convex and concave patches, respectively.
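The growing of a region around seed vertices can be sketched as a breadth-first traversal of the mesh connectivity. The vertex adjacency dictionary, the per-vertex SI and curvedness dictionaries, and the function name `grow_region` are assumptions made for illustration; the actual data structures of the method are not described in the text.

```python
from collections import deque


def grow_region(seeds, adjacency, si, curvedness, si_range, min_curvedness=0.5):
    """Grow a surface region from seed vertices.

    A neighbouring vertex joins the region when its shape index lies in
    `si_range` (inclusive) and its curvedness exceeds `min_curvedness`.
    """
    lo, hi = si_range
    region = set(seeds)
    queue = deque(region)
    while queue:
        v = queue.popleft()
        for nb in adjacency[v]:
            if nb not in region and lo <= si[nb] <= hi and curvedness[nb] > min_curvedness:
                region.add(nb)
                queue.append(nb)
    return region


# Toy mesh: five vertices connected in a chain 0-1-2-3-4.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
si = {0: 0.9, 1: 0.5, 2: 0.4, 3: 0.2, 4: 0.8}
curvedness = {v: 1.0 for v in adjacency}
convex_region = grow_region({0}, adjacency, si, curvedness, si_range=(0.37, 1.0))
```

In this toy chain the region stops at vertex 3, whose SI falls outside the permissible convex range.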


Fig. 8 Steps of computing lobulation: a original nodule, b triangular mesh representation of (a), c smoothing of triangular mesh, and d identified concave surfaces (in green) and convex surfaces (in cyan)

Table 1 Range of SI and corresponding surface

Range of SI       Surface
[−1, −0.63]       Concave ellipsoid
[−0.63, −0.37]    Concave cylinder
[−0.37, 0.37]     Hyperboloid
[0.37, 0.63]      Convex cylinder
[0.63, 1]         Convex ellipsoid

Table 2 Permissible range of SI for region growing of convex and concave parts of triangular mesh

SI for growing of convex part    SI for growing of concave part
[0.37, 1]                        [−1, 0]

Computation of lobulation

Lobulation of a nodule is defined as

$$\text{Lobulation} = \frac{S_{concave}}{S_{convex}}. \qquad (7)$$

The computed lobulation is a non-negative quantity. A zero value of lobulation indicates the absence of lobes, and a higher value of lobulation indicates prominence of lobe-like structures in the nodule.

Computation of sphericity

The compactness of a 2D binary object is defined as

$$\text{Compactness} = \frac{P^2}{A}, \qquad (8)$$

where A is the area of the object and P is the perimeter of the object. Sladoje et al. [15] extended the idea of 2D compactness to 3D for computing sphericity. They defined sphericity as

$$\text{Sphericity} = 6\sqrt{\pi}\, V A^{-3/2}, \qquad (9)$$

where V is the volume of the nodule and A is its surface area. The range of sphericity is [0, 1], where a sphericity of 1 indicates a sphere. The volume of a nodule is computed by multiplying the total number of voxels by the volume of each voxel. The surface area of a nodule is computed by summing the areas of all triangles in the triangular mesh of the nodule. The triangular mesh is obtained using MCA [12]. The block diagram of computation of sphericity of a nodule is provided in Fig. 9.
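Eq. 9 and its two inputs can be sketched as follows. The mesh is assumed to be given as a list of vertex coordinates and a list of index triples, and the function names are hypothetical.

```python
import math


def triangle_area(a, b, c):
    """Area of a 3D triangle as half the norm of the edge cross product."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def mesh_area(vertices, triangles):
    """Surface area as the sum of the areas of all mesh triangles."""
    return sum(triangle_area(vertices[i], vertices[j], vertices[k]) for i, j, k in triangles)


def nodule_volume(n_voxels, voxel_volume):
    """Volume as voxel count times the volume of one voxel."""
    return n_voxels * voxel_volume


def sphericity(volume, area):
    """Eq. 9: 6 * sqrt(pi) * V * A^(-3/2)."""
    return 6.0 * math.sqrt(math.pi) * volume / area ** 1.5


# Sanity checks on analytic shapes: a sphere of radius 2 and a unit cube.
r = 2.0
sphere_value = sphericity(4.0 / 3.0 * math.pi * r ** 3, 4.0 * math.pi * r ** 2)
cube_value = sphericity(1.0, 6.0)
```

The sphere evaluates to 1 (up to rounding), and the unit cube to roughly 0.72, which illustrates how departure from a spherical shape lowers the index.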

Database used in evaluation of proposed techniques

Reference image database to evaluate response to therapy (RIDER) coffee break data set

This data set consists of 32 patients with the pathology of non-small cell lung cancer [16]. The patients were scanned twice (time point 1 and time point 2) on the same CT scanner at an interval of 15 minutes. The image acquisition was performed without the use of any iodinated contrast agent. For this data set, only one lesion per patient was selected for measurement and analysis. The approximate lesion diameters range from 8 to 40 mm. Out of the 32 lesions, 12 have a maximum diameter within 30 mm and are considered nodules. These 12 nodules are used for the evaluation of reproducibility of the proposed feature extraction techniques.

Lung Image Database Consortium and Image Database Resource Initiative (LIDC/IDRI) data set

Fig. 9 Block diagram of the steps of computing sphericity: binary mask of segmented nodule → computation of volume of nodule; binary mask of segmented nodule → triangular mesh representation of nodule → computation of surface area of nodule; volume and surface area → computation of sphericity → sphericity index

Table 3 Composition of 891 nodules considering composite rating

Diagnostic features   Nodules of rank-1   Nodules of rank-2   Nodules of rank-3   Nodules of rank-4   Nodules of rank-5
Spiculation           695                 98                  44                  23                  31
Lobulation            598                 147                 87                  43                  16
Sphericity            0                   44                  268                 333                 346

In the LIDC/IDRI database [17], 928 lesions are annotated by four radiologists. Out of the 928 nodules, boundary annotation is available for 891 nodules. The proposed methods of computation of shape-based features are evaluated on those 891 nodules. In the LIDC/IDRI database, each CT image slice consists of a matrix of 512 × 512 pixels. The pixel size ranges from 0.5 to 0.8 mm with a 12-bit gray scale in HU. The slice thickness varies from 0.5 to 3 mm. Four radiologists rated each nodule on nine diagnostic characteristics: texture, subtlety, spiculation, lobulation, sphericity, margin, malignancy, and internal structure on a scale of 1–5, and calcification on a scale of 1–6. In categorical ranking, rank-5 and rank-1 represent the maximum and minimum values of a diagnostic characteristic, respectively. The composite rank of a particular diagnostic feature is the mode of the ratings of the four radiologists. When there are multiple modes, the floor of the median of the four ratings is taken as the composite rating. The composition of nodules based on the composite rank of spiculation, lobulation, and sphericity is provided in Table 3.
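The composite rating rule (mode of the four ratings, with the floor of the median when the mode is not unique) can be sketched as below. The function name `composite_rating` is hypothetical.

```python
import math
from collections import Counter
from statistics import median


def composite_rating(ratings):
    """Mode of the ratings; floor of the median when there are multiple modes."""
    counts = Counter(ratings)
    top = max(counts.values())
    modes = [r for r, c in counts.items() if c == top]
    if len(modes) == 1:
        return modes[0]
    return math.floor(median(ratings))


# Toy ratings from four readers.
unique_mode = composite_rating([1, 1, 2, 5])   # single mode
tied_modes = composite_rating([1, 1, 2, 2])    # two modes, median is 1.5
all_distinct = composite_rating([2, 3, 4, 5])  # four modes, median is 3.5
```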

Performance metrics

Metrics for evaluation of reproducibility of results

The reproducibility of the proposed feature extraction techniques is evaluated using percent change and proportional change [16]. Lower values of percent change and proportional change between the nodules of the two time points represent better reproducibility of results.

Percent change

The percent change between the computed values of a diagnostic characteristic is defined as

$$\text{Percent change} = \frac{\text{value in second scan} - \text{value in first scan}}{\text{value in first scan}} \times 100\,\%. \qquad (10)$$

Proportional change

The proportional change between the computed values of a diagnostic characteristic is defined as

$$\text{Proportional change} = \frac{\text{value in second scan} - \text{value in first scan}}{\text{mean of the values in the two scans}} \times 100\,\%. \qquad (11)$$
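Eqs. 10 and 11 can be written directly as two small functions. The function names and the toy values are illustrative assumptions; the signed form of the equations is used as written.

```python
def percent_change(first, second):
    """Eq. 10: change relative to the value in the first scan, in percent."""
    return (second - first) / first * 100.0


def proportional_change(first, second):
    """Eq. 11: change relative to the mean of the two scans, in percent."""
    return (second - first) / ((first + second) / 2.0) * 100.0


# Toy feature values measured at two time points.
pc = percent_change(2.0, 2.2)
prc = proportional_change(2.0, 2.2)
```

The proportional change is symmetric in magnitude with respect to the order of the two scans, whereas the percent change depends on which scan is taken as the reference.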

Metrics for evaluation of performance of CBIR experiments

Separate CBIR experiments are performed for each computed feature (viz. spiculation, lobulation, and sphericity) to evaluate the performance of the computed shape features. The performance of the CBIR experiments is evaluated in terms of the following metrics.

Precision

The precision for a CBIR experiment is defined as

$$\text{Precision} = \frac{\text{Number of relevant images}}{\text{Number of retrieved images}}. \qquad (12)$$

The range of precision is [0, 1]. Precision is a good measure of retrieval efficiency for a database containing images of different classes. In the present study, the database consists of nodules of different ranks (1 to 5) of spiculation, lobulation, and sphericity. One rank of margin sharpness differs from another in degree, but not in kind. Moreover, there is a large amount of interpersonal variation in the ranks of margin sharpness assigned by radiologists. Hence, precision provides a very pessimistic result, as only the retrieved images with the same composite rank as the query image are considered relevant.
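A minimal sketch of Eq. 12 under the relevance rule stated above, where a retrieved image counts as relevant only when its composite rank equals that of the query. The function name and the toy ranks are assumptions.

```python
def precision(query_rank, retrieved_ranks):
    """Fraction of retrieved images whose composite rank equals the query's."""
    if not retrieved_ranks:
        return 0.0
    relevant = sum(1 for r in retrieved_ranks if r == query_rank)
    return relevant / len(retrieved_ranks)


# Toy retrieval: query of rank 3 and the ranks of the top five results.
p = precision(3, [3, 3, 2, 4, 3])
```

Here the two near misses (ranks 2 and 4) count as fully irrelevant, which illustrates why precision is described as pessimistic for ordinal ratings.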

Mean similarity (MS)

The MS is introduced to overcome the pitfall of precision. Xu et al. [18] defined a reference standard for measuring the similarity in margin characteristic between two images. In the present work, the reference standard of similarity is redefined to accommodate the composite rating and to normalize the result. Based on the composite rating of the shape features, the similarity measure S_ij between two images i and j can be defined as

$$S_{ij} = \frac{1}{4}\left[4 - \left|R_i - R_j\right|\right], \qquad (13)$$

where R_i and R_j are the composite ratings of a diagnostic feature for the query image and the j-th retrieved image, respectively. Considering the top K retrieved images, MS is defined as

$$MS = \frac{1}{K}\sum_{j=1}^{K} S_{ij}. \qquad (14)$$

The range of MS is [0, 1], with 0 indicating the least similarity and 1 indicating the best similarity.
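Eqs. 13 and 14 can be sketched as follows, with composite ratings on the 1 to 5 scale. The function names and toy ranks are illustrative assumptions.

```python
def similarity(r_query, r_retrieved):
    """Eq. 13: similarity between two composite ratings on a 1-5 scale."""
    return (4 - abs(r_query - r_retrieved)) / 4.0


def mean_similarity(query_rank, retrieved_ranks):
    """Eq. 14: average similarity over the top-K retrieved images."""
    if not retrieved_ranks:
        return 0.0
    return sum(similarity(query_rank, r) for r in retrieved_ranks) / len(retrieved_ranks)


# Same toy retrieval as a rank-3 query with five results.
ms = mean_similarity(3, [3, 3, 2, 4, 3])
```

For this toy retrieval, the near misses each contribute 0.75 rather than 0, so MS is higher than the exact-match precision of the same result list.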

Normalized discounted cumulative gain (NDCG)

The NDCG is a measure of ranking quality in information retrieval [19]. Xu et al. [18] used the NDCG score for quantitative evaluation of margin sharpness with respect to ground truth provided by five radiologists on a scale of 1 to 5. Considering K retrieved images, the discounted cumulative gain (DCG) is defined as

$$DCG_K = \sum_{i=1}^{K} \frac{2^{rel_i} - 1}{\log_2(i + 1)}, \qquad (15)$$

where rel_i is the graded relevance of the i-th retrieved image. Assuming IDCG_K to be the maximum possible DCG for a given set of queries, NDCG is defined as

$$NDCG = \frac{DCG_K}{IDCG_K}. \qquad (16)$$
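Eqs. 15 and 16 can be sketched as below. The relevance grades are supplied by the caller, and IDCG is taken here as the DCG of the same relevance list sorted in descending order; both are assumptions, since the text does not specify how relevance grades or the ideal ordering are obtained.

```python
import math


def dcg(relevances):
    """Eq. 15: discounted cumulative gain of a ranked list of relevance grades."""
    return sum((2 ** rel - 1) / math.log2(i + 1) for i, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """Eq. 16, with IDCG taken as the DCG of the ideally re-ordered list."""
    ideal = dcg(sorted(relevances, reverse=True))
    if ideal == 0:
        return 0.0
    return dcg(relevances) / ideal


# Toy relevance grades for three retrieved images.
best_order = ndcg([3, 2, 1])
worst_order = ndcg([1, 2, 3])
```

A result list that is already sorted by relevance scores 1, and any other ordering of the same images scores lower.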

Results and discussion

Reproducibility of computed features using proposed feature extraction techniques

To investigate the reproducibility of proposed techniques,the features (viz. spiculation, lobulation, and sphericity) arecomputed for the pulmonary nodules obtained from the CTimages of two time points in RIDER coffee break data set.The variability of computed features for a particular noduleat two time points is measured in terms of percent changeand proportional change [16]. The mean and standard devia-tion of percent change and proportional change of computedfeatures at two time points are given in Table 4. The resultsdepict the fact that computation of spiculation using Eq. 5 hasbetter reproducibility than usingEq. 4.Hence, Eq. 5 is used tocompute the spiculation of nodules in rest of the work. Smallchanges are noted in segmented results of pulmonary nodulesacquired at different time instances. The small variation ofcomputed values of spiculation, lobulation, and sphericity attwo instances of coffee break data set is due to small changesin segmentation results. McNitt-Gray et al. [16] applied asemiautomated tool for 3D measurements of nodule char-acteristics in coffee break data set. The reported values of

Table 4 Percent change and proportional change of values of shape features computed from two instances of RIDER coffee break data set

| Diagnostic characteristic | Percent change (mean ± SD) % | Proportional change (mean ± SD) % |
|---|---|---|
| Spiculation using Eq. 4 | 31.68 ± 41.16 | 28.08 ± 25.66 |
| Spiculation using Eq. 5 | 1.66 ± 2.36 | 1.66 ± 2.36 |
| Lobulation | 10.57 ± 11.63 | 11.06 ± 13.11 |
| Sphericity | 6.27 ± 7.99 | 5.88 ± 6.92 |

mean and standard deviation of percent change and proportional change are 7.4 ± 44.2 and 2.2 ± 25.5, respectively. From the results, it is clear that for the prominent feature (spiculation), the variation is smaller than for the small features. Among the small features, sphericity is measured by compactness of the nodule and, hence, its variation across time instances is smaller than that of lobulation. The variation of spiculation, lobulation, and sphericity seems to be logical and comparable with the measures of McNitt-Gray et al. [16]. The evaluation of reproducibility of the method of Horsthemke et al. could not be done for the RIDER coffee break data set, as boundary annotation of nodules by radiologists is not available.
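The summary statistics of this kind can be sketched as follows. The exact definitions of percent change and proportional change are those of [16] and are not reproduced in this section, so the two formulas below are assumptions chosen only for illustration: percent change is taken relative to the first scan, and proportional change relative to the mean of the two scans. All function names are hypothetical.

```python
from statistics import mean, stdev
from typing import Callable, Sequence, Tuple


def percent_change(first: float, second: float) -> float:
    """Absolute change relative to the first measurement, in percent (assumed definition)."""
    if first == 0.0:
        raise ValueError("percent change is undefined when the first value is 0")
    return 100.0 * abs(second - first) / abs(first)


def proportional_change(first: float, second: float) -> float:
    """Absolute change relative to the mean of both measurements, in percent (assumed definition)."""
    baseline = (abs(first) + abs(second)) / 2.0
    if baseline == 0.0:
        raise ValueError("proportional change is undefined when both values are 0")
    return 100.0 * abs(second - first) / baseline


def summarize(
    pairs: Sequence[Tuple[float, float]],
    change_fn: Callable[[float, float], float],
) -> Tuple[float, float]:
    """Return (mean, sample SD) of the change over all nodules; needs at least two pairs."""
    changes = [change_fn(first, second) for first, second in pairs]
    return mean(changes), stdev(changes)
```

Each pair holds the value of one feature for the same nodule at the two time points. Features that can be exactly zero, such as a lobulation value of 0.00, need separate handling under these assumed definitions.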

Qualitative results of the proposed methods

The computed values of spiculation, lobulation, and sphericity for a few example nodules and the corresponding composite ranks of spiculation, lobulation, and sphericity are illustrated in Figs. 10, 11, and 12, respectively. The qualitative results show that the values of computed spiculation, lobulation, and sphericity increase with the composite rank of spiculation, lobulation, and sphericity, respectively. However, there is a certain overlap between the ranges of computed spiculation of nodules of two consecutive ranks of spiculation, as shown in Fig. 10. This anomaly could be due to the subjective nature of the spiculation rank. It may be noted that there are significant variations in the ratings of spiculation provided by the radiologists. Hence, the pulmonary nodules are grouped according to the composite rank of their spiculation. This is also true for lobulation, as shown in Fig. 11. The computed sphericity and composite rank of sphericity are more consistent, as shown in the example in Fig. 12.

Quantitative results of the proposed methods and gold standard method

It should be noted that there is no computable technique in the literature for lobulation and sphericity, and only one computable technique for spiculation [7]. The reported 2D technique of Horsthemke et al. makes use of the boundary


Fig. 10 Example of nodules with values of computed spiculation (SPI) using proposed algorithm and composite rank of spiculation (1:5) as obtained from LIDC/IDRI. The five columns are marked by the respective composite rank of spiculation. The computed values of spiculation are provided below the respective images

| | Rank-1 | Rank-2 | Rank-3 | Rank-4 | Rank-5 |
|---|---|---|---|---|---|
| SPI | 0.70 | 0.86 | 0.91 | 0.95 | 0.96 |
| SPI | 0.82 | 0.94 | 0.93 | 0.97 | 0.98 |
| SPI | 0.85 | 0.95 | 0.96 | 0.98 | 0.99 |

Fig. 11 Example of nodules with values of computed lobulation (LOB) using proposed algorithm and composite rank of lobulation (1:5) as obtained from LIDC/IDRI. The five columns are marked by the respective composite rank of lobulation. The computed values of lobulation are provided below the respective images

| | Rank-1 | Rank-2 | Rank-3 | Rank-4 | Rank-5 |
|---|---|---|---|---|---|
| LOB | 0.00 | 0.11 | 0.45 | 0.63 | 0.64 |
| LOB | 0.01 | 0.24 | 0.51 | 0.99 | 1.08 |
| LOB | 0.02 | 0.26 | 0.71 | 1.00 | 1.85 |

information of radiologists to compute spiculation, lobulation, and sphericity. Hence, the method of Horsthemke et al. is considered the gold standard. The performance of the proposed methods and the gold standard method is evaluated using three separate CBIR experiments on the complete data set of 891 nodules of LIDC/IDRI. The CBIR experiments are performed to rank order the retrieved images for a given query image. The CBIR experiment is performed using computed spiculation to determine the efficacy of the technique for computing spiculation. The procedure is the same for lobulation and sphericity. The performance of the proposed methods is compared with the gold standard method. The results of spiculation of the proposed method are also compared with the prior work of Dhara et al. [7].

Fig. 12 Example of nodules with values of computed sphericity (SPH) using proposed algorithm and composite rank of sphericity (2:5) as obtained from LIDC/IDRI. The four columns are marked by the respective composite rank of sphericity. The computed values of sphericity are provided below the respective images

| | Rank-2 | Rank-3 | Rank-4 | Rank-5 |
|---|---|---|---|---|
| SPH | 0.50 | 0.64 | 0.69 | 0.90 |
| SPH | 0.57 | 0.67 | 0.72 | 0.90 |
| SPH | 0.61 | 0.67 | 0.73 | 0.93 |

Considering K retrieved images, the precision, MS, and NDCG are computed for each query image. The mean of precision, MS, and NDCG of the proposed methods and the gold standard method is given in Table 5. The p values of precision, MS, and NDCG of the proposed methods and the gold standard method are given in Table 6. The performance of the proposed method of computation of spiculation is better than that of the prior work of Dhara et al. and a little lower than that of the gold standard method with respect to precision and MS. For spiculation, the differences between the method of Dhara et al. and the proposed method are statistically significant with respect to precision and MS (Table 6). The differences between the method of Horsthemke et al. and the proposed method are statistically significant with respect to MS and NDCG, though the p value is >0.05 for precision. In the gold standard technique, the boundary information of each nodule provided by four radiologists is used. Though it is a 2D technique, perhaps the information provided by multiple radiologists helps it to perform better than the proposed 3D technique, which uses a single automated nodule segmentation algorithm. The error of the nodule segmentation algorithm also plays a role in it.

The performance of the proposed method of computing lobulation and sphericity is better than that of the gold standard method in terms of precision and MS. The p value analysis shows that the differences between the proposed methods and the gold standard technique are statistically significant with respect to precision and MS. In the proposed methods, the shape features are computed in 3D, and hence, the proposed methods produce comparable or better results than the gold standard technique. The values of NDCG of the proposed methods are a little lower than those of the gold standard technique for the computation of spiculation, lobulation, and sphericity. The lower value of NDCG is due to the fact that the procedure of computation of NDCG does not penalize the technique for fewer relevant images in the retrieval result. The procedure of computation of spiculation, lobulation, and sphericity requires 0.7 ± 0.9, 2.79 ± 4.27, and 0.59 ± 0.43 s, respectively, using a 2.66 GHz Intel quad-core processor with 8 GB RAM and MATLAB (2013b).


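Table 5 Precision, MS, and NDCG for computing spiculation, lobulation, and sphericity using the proposed method, the method of Dhara et al., and gold standard method considering K retrieved images

| Metric | K | Spiculation: Horsthemke et al. | Spiculation: Dhara et al. | Spiculation: Proposed | Lobulation: Horsthemke et al. | Lobulation: Proposed | Sphericity: Horsthemke et al. | Sphericity: Proposed |
|---|---|---|---|---|---|---|---|---|
| Precision | 1 | 0.26 | 0.25 | 0.27 | 0.28 | 0.33 | 0.26 | 0.32 |
| | 2 | 0.25 | 0.24 | 0.25 | 0.27 | 0.30 | 0.28 | 0.34 |
| | 3 | 0.26 | 0.23 | 0.25 | 0.25 | 0.30 | 0.28 | 0.33 |
| | 4 | 0.25 | 0.22 | 0.24 | 0.24 | 0.29 | 0.27 | 0.33 |
| | 5 | 0.25 | 0.22 | 0.24 | 0.25 | 0.28 | 0.26 | 0.33 |
| | 6 | 0.25 | 0.23 | 0.24 | 0.25 | 0.28 | 0.26 | 0.33 |
| | 7 | 0.25 | 0.23 | 0.24 | 0.25 | 0.27 | 0.26 | 0.33 |
| | 8 | 0.22 | 0.26 | 0.24 | 0.24 | 0.27 | 0.26 | 0.32 |
| | 9 | 0.24 | 0.22 | 0.24 | 0.25 | 0.27 | 0.26 | 0.32 |
| | 10 | 0.24 | 0.22 | 0.24 | 0.25 | 0.26 | 0.26 | 0.32 |
| | 11 | 0.24 | 0.22 | 0.24 | 0.25 | 0.26 | 0.26 | 0.33 |
| | 12 | 0.24 | 0.22 | 0.23 | 0.25 | 0.26 | 0.26 | 0.32 |
| | 13 | 0.23 | 0.23 | 0.23 | 0.25 | 0.26 | 0.26 | 0.32 |
| | 14 | 0.23 | 0.23 | 0.23 | 0.25 | 0.26 | 0.26 | 0.32 |
| | 15 | 0.23 | 0.23 | 0.23 | 0.25 | 0.26 | 0.26 | 0.32 |
| MS | 1 | 0.62 | 0.57 | 0.61 | 0.63 | 0.64 | 0.72 | 0.76 |
| | 2 | 0.61 | 0.58 | 0.58 | 0.62 | 0.65 | 0.73 | 0.76 |
| | 3 | 0.61 | 0.56 | 0.58 | 0.61 | 0.64 | 0.73 | 0.76 |
| | 4 | 0.60 | 0.55 | 0.57 | 0.61 | 0.63 | 0.73 | 0.76 |
| | 5 | 0.60 | 0.55 | 0.56 | 0.61 | 0.63 | 0.72 | 0.76 |
| | 6 | 0.59 | 0.55 | 0.56 | 0.61 | 0.63 | 0.72 | 0.76 |
| | 7 | 0.59 | 0.55 | 0.56 | 0.61 | 0.63 | 0.72 | 0.76 |
| | 8 | 0.59 | 0.55 | 0.56 | 0.61 | 0.62 | 0.72 | 0.76 |
| | 9 | 0.59 | 0.55 | 0.57 | 0.61 | 0.62 | 0.72 | 0.76 |
| | 10 | 0.59 | 0.55 | 0.56 | 0.61 | 0.62 | 0.72 | 0.76 |
| | 11 | 0.59 | 0.55 | 0.56 | 0.60 | 0.62 | 0.72 | 0.76 |
| | 12 | 0.59 | 0.55 | 0.56 | 0.61 | 0.61 | 0.72 | 0.76 |
| | 13 | 0.58 | 0.56 | 0.56 | 0.61 | 0.61 | 0.72 | 0.76 |
| | 14 | 0.58 | 0.56 | 0.56 | 0.61 | 0.61 | 0.72 | 0.76 |
| | 15 | 0.58 | 0.56 | 0.56 | 0.61 | 0.61 | 0.72 | 0.76 |
| NDCG | 1 | 1.00 | 0.91 | 0.84 | 1.00 | 0.88 | 1.00 | 1.00 |
| | 2 | 0.99 | 0.94 | 0.97 | 1.00 | 0.93 | 1.00 | 1.00 |
| | 3 | 0.99 | 0.93 | 0.94 | 0.99 | 0.91 | 0.99 | 0.98 |
| | 4 | 0.99 | 0.93 | 0.93 | 0.98 | 0.89 | 0.98 | 0.96 |
| | 5 | 0.97 | 0.93 | 0.92 | 0.97 | 0.88 | 0.97 | 0.95 |
| | 6 | 0.96 | 0.92 | 0.92 | 0.96 | 0.90 | 0.97 | 0.95 |
| | 7 | 0.95 | 0.91 | 0.90 | 0.95 | 0.89 | 0.96 | 0.94 |
| | 8 | 0.95 | 0.90 | 0.91 | 0.94 | 0.88 | 0.96 | 0.94 |
| | 9 | 0.95 | 0.90 | 0.89 | 0.94 | 0.88 | 0.96 | 0.93 |
| | 10 | 0.94 | 0.90 | 0.88 | 0.94 | 0.87 | 0.95 | 0.93 |
| | 11 | 0.94 | 0.90 | 0.88 | 0.94 | 0.87 | 0.95 | 0.93 |
| | 12 | 0.94 | 0.88 | 0.88 | 0.93 | 0.87 | 0.95 | 0.93 |
| | 13 | 0.93 | 0.88 | 0.88 | 0.93 | 0.87 | 0.95 | 0.93 |
| | 14 | 0.93 | 0.87 | 0.88 | 0.93 | 0.87 | 0.95 | 0.93 |
| | 15 | 0.93 | 0.86 | 0.88 | 0.92 | 0.87 | 0.95 | 0.93 |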


Table 6 Statistical analysis of proposed techniques and competing techniques

| Diagnostic characteristic | Competing methods | Metric | p value |
|---|---|---|---|
| Spiculation | Horsthemke et al. and proposed | Precision | 0.37 |
| | | MS | 8.73E−06 |
| | | NDCG | 6.19E−06 |
| | Dhara et al. and proposed | Precision | 1.6E−03 |
| | | MS | 0.02 |
| | | NDCG | 0.57 |
| Lobulation | Horsthemke et al. and proposed | Precision | 2.71E−04 |
| | | MS | 9.67E−04 |
| | | NDCG | 3.10E−09 |
| Sphericity | Horsthemke et al. and proposed | Precision | 1.27E−20 |
| | | MS | 7.65E−25 |
| | | NDCG | 0.04 |

Several partial volume effect (PVE) reduction techniques such as supersampling [22,23] and iterative deconvolution [24] are applied to improve the precision and MS. However, there is no substantial improvement of precision and MS even after application of a PVE reduction technique. Moreover, PVE reduction techniques are time-consuming. Hence, we have not incorporated any PVE reduction technique in this article.

Determination of parameters associated with the proposed methods

The threshold value $T_{st}$ for identifying the spicule tip in computation of spiculation (see “Identification of tip of a spicule” section) is determined using a data-driven approach. The data set of 891 nodules is divided into D1 and D2 to study the dependency of the computed spiculation index on $T_{st}$ and on the data sets. The plot of mean NDCG for the top ten retrievals versus the threshold value for identification of the spicule tip is shown in Fig. 13. The values of NDCG remain almost constant for both data sets D1 and D2 for $T_{st}$ above the 92nd percentile. The values of NDCG are maximum for data sets D1 and D2 at $T_{st}$ = 0.96. Hence, this value of $T_{st}$ is used for the entire experiment. The parameters used for computing lobulation are fixed as discussed in the “Finding of seed points in convex and concave region on triangular mesh” and “Growing of convex and concave surface” sections. In the case of computing sphericity, no parameter is required.
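The data-driven choice of the threshold can be sketched as a simple sweep over candidate values. The function name, the input layout (one mapping of threshold to mean NDCG per data set), and the selection rule (maximize the worst mean NDCG across data sets) are assumptions of this sketch; the text above only states that the mean NDCG peaks at the same threshold for D1 and D2.

```python
from typing import Mapping


def select_threshold(ndcg_by_dataset: Mapping[str, Mapping[float, float]]) -> float:
    """Pick the threshold whose worst mean NDCG over all data sets is highest.

    `ndcg_by_dataset` maps a data set name (e.g. "D1") to a mapping from
    candidate threshold to mean NDCG for the top ten retrievals.
    Only thresholds evaluated on every data set are considered.
    """
    curves = list(ndcg_by_dataset.values())
    if not curves:
        raise ValueError("at least one data set is required")
    shared = set(curves[0])
    for curve in curves[1:]:
        shared &= set(curve)
    if not shared:
        raise ValueError("no threshold was evaluated on every data set")
    # Sorting makes tie-breaking deterministic: the lowest threshold wins ties.
    return max(sorted(shared), key=lambda t: min(curve[t] for curve in curves))
```

A call would pass the two measured curves, for example `select_threshold({"D1": curve_d1, "D2": curve_d2})`, where `curve_d1` and `curve_d2` are hypothetical names for the per-threshold mean NDCG values of each data set.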

Potential clinical application

Feature extraction is an important step for the development of a CAD system for diagnosis of lung cancer or development of a self-learning tool of radiology. Commercial CAD systems are available for screening of lung cancer, but there is no computable technique in the literature for spiculation,

Fig. 13 Mean NDCG for top ten retrieved images considering data sets D1 and D2. [Plot: x-axis, threshold (percentile) for identification of spicule tip, 90 to 100; y-axis, NDCG (considering top 10 retrieved images), 0.8 to 1; curves: NDCG using data set D1, NDCG using data set D2]

lobulation, and sphericity. Seitz et al. [20] designed a CBIR system using 63 image features from the largest representative slice of the nodule. In this technique, boundary information of the nodule is taken from radiologists rather than from a segmentation technique. Han et al. [21] also used boundary information of the nodule in their CAD system instead of segmenting the pulmonary nodules. The proposed methods compute shape features of a pulmonary nodule without using the boundary information of the nodule from radiologists. Hence, these feature extraction techniques could find application in the development of a CAD system for diagnosis of lung cancer or in designing a self-learning tool of radiology.

Conclusion

In this paper, the shape features of pulmonary nodules are computed in 3D considering the complete nodule. The proposed work can be considered a pioneering research effort that could quantify the shape features of nodules to improve the performance of retrieval and classification. The improvement in the computation of shape features could improve the performance of a CAD system for diagnosis of lung cancer. The characterization of diagnostic shape features could also reduce the interobserver variability during estimation of malignancy. These shape features could be used in the development of a general purpose CBIR system in radiology.

Compliance with ethical standards

Conflicts of interest This study was funded by Department of Electronics and Information Technology, Govt. of India, Grant Number 1(3)2009-ME&TMD and 1(2)/2013-ME &TMD/ESDA, respectively. The authors declare that they have no conflict of interest. This work is done using a public lung CT image data set, and for this type of study, formal consent is not required. This article does not contain any studies with human participants or animals performed by any of the authors. Informed consent was obtained from all individual participants included in the study.

References

1. Siegel R, Naishadham D, Jemal A (2013) Cancer statistics. CA Cancer J Clin 63(1):11–30

2. Diederich S, Wormanns D, Semik M, Thomas M, Lenzen H, Roos N, Heindel W (2002) Screening for early lung cancer with low-dose spiral CT: prevalence in 817 asymptomatic smokers. Radiology 222(3):773–781

3. Ko JP, Naidich DP (2004) Computer-aided diagnosis and the evaluation of lung disease. J Thorac Imaging 19(3):136–155

4. Ost D, Fein AM, Feinsilver SH (2003) The solitary pulmonary nodule. N Engl J Med 348(25):2535–2542

5. Horsthemke WH, Raicu DS, Furst JD (2009) Characterizing pulmonary nodule shape using a boundary-region approach. In: Proceedings of SPIE medical imaging 2009, vol 7260. Florida, pp 72602Y–72602Y-9

6. Raicu DS, Varutbangkul E, Cisneros JG, Furst JD, Channin DS, Armato SG III (2007) Semantics and image content integration for pulmonary nodule interpretation in thoracic computed tomography. In: Proceedings of SPIE medical imaging 2007, pp 65120S–65120S-12

7. Dhara AK, Mukhopadhyay S, Alam N, Khandelwal N (2013) Measurement of spiculation index in 3D for solitary pulmonary nodules in volumetric lung CT images. In: SPIE medical imaging 2013: computer aided diagnosis, vol 8670. Florida, pp 86700K–86700K-6

8. Teague MR (1980) Image analysis via the general theory of moments. J Opt Soc Am 70(8):920–930

9. McNitt-Gray MF, Armato SG III, Meyer CR, Reeves AP, McLennan G, Pais RC, Freymann J, Brown MS, Engelmann RM, Bland PH, Laderach GE, Piker C, Guo J, Towfic Z, Qing PYD, Yankelevitz DF, Aberle DR, Beek EJR, MacMahon H, Kazerooni EA, Croft BY, Clarke LP (2007) The lung image database consortium (LIDC) data collection process for nodule detection and annotation. Acad Radiol 14(12):1464–1474

10. Dhara AK, Mukhopadhyay S, Das Gupta R, Garg M, Khandelwal N (2015) A segmentation framework of pulmonary nodules in lung CT images. J Digit Imaging. doi:10.1007/s10278-015-9812-6

11. Tsai DM, Hou HT, Su HJ (1999) Boundary-based corner detection using eigenvalues of covariance matrices. Pattern Recogn Lett 20(1):31–40

12. Lorensen WE, Cline HE (1987) Marching cubes: a high resolution 3D surface construction algorithm. ACM Siggr Comput Graph 21:163–169

13. Koenderink JJ, van Doorn AJ (1992) Surface shape and curvature scales. Image Vis Comput 10(8):557–564

14. Dong C, Wang G (2005) Curvatures estimation on triangular mesh. J Zhejiang Univ Sci 6(1):128–136

15. Sladoje N, Nyström I, Saha PK (2005) Measurements of digitized objects with fuzzy borders in 2D and 3D. Image Vis Comput 23(2):123–132

16. McNitt-Gray MF, Kim GH, Zhao B, Schwartz LH, Clunie D, Cohen K, Petrick N, Fenimore C, Lu ZJ, Buckler AJ (2015) Determining the variability of lesion size measurements from CT patient data sets acquired under “no change” conditions. Transl Oncol 8(1):55–64

17. Armato SG III, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, Kazerooni EA, MacMahon H, Beek EJR, Yankelevitz D, Biancardi AM, Bland PH, Brown MS, Engelmann RM, Laderach GE, Max D, Pais RC, Qing DPY, Roberts RY, Smith AR, Starkey A, Batra P, Caligiuri P, Farooqi A, Gladish GW, Jude CM, Munden RF, Petkovska I, Quint LE, Schwartz LH, Sundaram B, Dodd LE, Fenimore C, Gur D, Petrick N, Freymann J, Kirby J, Hughes B, Casteele AV, Gupte S, Sallam M, Heath MD, Kuhn MH, Dharaiya E, Burns R, Fryd DS, Salganicoff M, Anand V, Shreter U, Vastagh S, Croft BY, Clarke LP (2011) The lung image database consortium (LIDC) and image database resource initiative (IDRI): a completed reference database of lung nodules on CT scans. Med Phys 38(2):915–931

18. Xu J, Napel S, Greenspan H, Beaulieu CF, Agrawal N, Rubin D (2012) Quantifying the margin sharpness of lesions on radiological images for content-based image retrieval. Med Phys 39(9):5405–5418

19. Järvelin K, Kekäläinen J (2002) Cumulated gain-based evaluation of IR techniques. ACM Trans Inf Syst 20(4):422–446

20. Seitz KA Jr, Giuca AM, Furst J, Raicu D (2012) Learning lung nodule similarity using a genetic algorithm. In: Proceedings of SPIE medical imaging 2012, vol 8315. San Diego, USA, pp 831537–831537-7

21. Han F, Wang H, Zhang G, Han H, Song B, Li L, Moore W, Lu H, Zhao H, Liang Z (2014) Texture feature analysis for computer-aided diagnosis on pulmonary nodules. J Digit Imaging 28(1):99–115

22. Kostis WJ, Reeves AP, Yankelevitz DF, Henschke CI (2003) Three-dimensional segmentation and growth-rate estimation of small pulmonary nodules in helical CT images. IEEE Trans Med Imaging 22(10):1259–1274

23. Reeves AP, Chan AB, Yankelevitz DF, Henschke CI, Kressler B, Kostis WJ (2006) On measuring the change in size of pulmonary nodules. IEEE Trans Med Imaging 25(4):435–450

24. Teo BK, Seo Y, Bacharach SL, Carrasquillo JA, Libutti SK, Shukla H, Hasegawa BH, Hawkins RA, Franc BL (2007) Partial-volume correction in PET: validation of an iterative postreconstruction method with phantom and patient data. J Nucl Med 48(5):802–810
