An intelligent decision support system for quantitative assessment of gastric atrophy



Faruq A Al-Omari,1 Ismail I Matalka,2 Mohammad A Al-Jarrah,1 Fatima N Obeidat,3 Faisal M Kanaan2

ABSTRACT

Aims To build an automated decision support system to assist pathologists in grading gastric atrophy according to the updated Sydney system.

Methods A database of 143 biopsies was used to train and examine the proposed system. A panel of three experienced pathologists reached a consensus regarding the grading of the studied biopsies using the visual scale of the updated Sydney system. Digital imaging techniques were utilised to extract a set of discriminating morphological features that describe each atrophy grade sufficiently and uniquely. A probabilistic neural network structure was used to build a grading system. To evaluate the performance of the proposed system, 66% of the biopsies (94 biopsy images) were used for training purposes and 34% (49 biopsy images) were used for testing and validation purposes.

Results During the training phase, a 98.9% precision was achieved, whereas during testing, a precision of 95.9% was achieved. The overall precision achieved was 97.9%.

Conclusions A fully automated decision support system to grade gastric atrophy according to the updated Sydney system is proposed. The system utilises advanced image processing techniques and probabilistic neural networks in conducting the assessment. The proposed system eliminates inter- and intra-observer variations with high reproducibility.

INTRODUCTION

In the last two decades, several studies have tackled the issue of utilising computer systems in medical applications.1–7 In 2000, Duncan and Ayache presented a comprehensive survey of the progress made in the broad area of medical image analysis over the preceding two decades.8 The rationale behind the great interest among researchers in this field is the nature of the problems presented within this area. Examples include the types of image information that are acquired, the fully three-dimensional (3D) image data, the non-rigid nature of object motion and deformation, and the statistical variation of both the underlying and abnormal ground truth. Several image analysis and computer vision techniques have been utilised in processing these problems, chief among which are image segmentation, image registration and matching, motion analysis, and 3D image analysis and modelling. The authors divided the last 20 years into four time frames focusing on the methodology issues. The period from before 1980 to 1984 was the era of 2D image analysis. In 1985–91, knowledge-based strategies came to the forefront and the advent of MRI changed the landscape. In 1992–98, analysis of fully 3D images became a key goal and more mathematical, model-driven approaches became computationally feasible. Finally, from 1999, advanced imaging and computing technology has facilitated work in image-guided procedures and more realistic visualisation. The authors concluded with a look at the remaining challenges researchers face in this field.

However, little attention was given to utilising computer-based approaches in pathological assessment issues.9–12 The aim of these studies was to improve uniformity in histopathological reporting and to provide a flexible matrix of rules for grading the histological features.9–12 Computer use was restricted to statistical analyses performed on visually or manually collected data. To address this gap, this study utilises advanced image processing techniques and artificial intelligence approaches to build a decision support system for the automatic assessment of gastric atrophy.

According to epidemiological and biological evidence, atrophic gastritis represents an important risk factor for gastric adenocarcinoma of the intestinal type.13 Proper identification and assessment of atrophy helps in estimating the risk of gastric carcinoma. Nevertheless, pathologists have a low level of agreement on a precise definition of gastric atrophy and atrophic gastritis.14

The Sydney system for grading gastritis, which was introduced in 1990, aimed at standardising the interpretation of gastric biopsy.15 In 1992, this system was updated and a four-grade system assisted by visual analogue scales was proposed. The proposed grades were: no atrophy (normal), mild, moderate, and severe atrophy.16 Although some consensus studies have achieved a degree of improvement, there is still significant inter-observer and intra-observer variability between pathologists.17

El-Zimaity et al tested the degree of agreement among the findings of four gastrointestinal pathologists in the semi-quantitative evaluation of Helicobacter pylori infection and gastritis. There was essentially no agreement among pathologists on atrophy assessment.18 However, several studies attempted to redefine atrophy as the ‘loss of appropriate glands’ and to draw a line between non-atrophic and atrophic gastritis, with atrophy in turn split into metaplastic and non-metaplastic categories. This has led to a significant increase in inter-observer agreement.1 2 19

In 1998, Zaitoun et al introduced a quantitative technique for the assessment of gastric atrophy using syntactic structure analysis.3 The topographical relation between gastric glands was derived using the minimum spanning tree (MST) to assess the characteristic features of gastric atrophy based on the Sydney system. The proposed technique depended on pathologist interaction with the system to identify the centres of the glands, which is a crucial step in building the MST. This interaction preserves inter- and intra-observer differences. More importantly, as reported in their results, the technique was not able to distinguish between mild and moderate atrophy.

1 Computer Engineering Department, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid, Jordan
2 Department of Pathology and Laboratory Medicine, Jordan University of Science and Technology, Irbid, Jordan
3 Department of Pathology and Laboratory Medicine, Jordan University Hospital, University of Jordan, Amman, Jordan

Correspondence to Faruq A Al-Omari, Computer Engineering Department, Hijjawi Faculty for Engineering Technology, Yarmouk University, Irbid 21163, Jordan; [email protected]

Accepted 26 January 2011
Published Online First 23 February 2011

J Clin Pathol 2011;64:330–337. doi:10.1136/jcp.2010.088252

Original article

Later, Grieken et al designed a rapid, reproducible and quantitative method for the assessment of gastric atrophy in tissue sections of corpus biopsies according to the updated Sydney system.4 For this purpose, a stereology module of an interactive video overlay microscopic measuring system (QPRODIT) equipped with an automated motorised scanning stage was used. The results obtained indicated that the differences between non-atrophy and mild atrophy were negligible. Therefore, the authors suggested a three-grade scale instead of the standard four-grade Sydney scale. Furthermore, data collection was performed visually by pathologists and only the collected data were statistically analysed on a computer system. Hence, the process is still subject to human interpretation and interaction.

In 2001, Ruiz et al applied morphometric techniques to a set of antral biopsy specimens that were examined visually by a group of experienced gastrointestinal pathologists.5 Discriminant function analyses of morphometric measurements were conducted to grade atrophy. Statistical analyses were then used to compare the performance of the introduced morphometric measurements and discriminant functions against the performance of pathologists. While the method proved accurate in identifying no atrophy, moderate atrophy and severe atrophy, the mild cases were not identified at all and were instead graded as either moderate or severe. Therefore, a three-grade scale was used instead of the standard four-grade scale of the updated Sydney system.

In a previous study, we established a basis for automated assessment of gastric atrophy according to the updated Sydney system. In that study, interactive image processing techniques were used to derive a set of morphological features to characterise each atrophy grade. Accordingly, gland-shape and gland-density related features were extracted. The K-means clustering technique was used to validate the proposed discriminating features. The results obtained indicate an overall accuracy of 95.6% when contrasted with pathologists’ consensus.20

In this study we aim to automate the process of grading gastric atrophy, based on the findings of our previous study20 and in accordance with the updated Sydney system. The proposed system is meant to be a decision support system that helps practitioners and researchers in this field to standardise the assessment process and to eliminate inter- and intra-observer variability.

MATERIALS AND METHODS

A total of 175 antral gastric biopsies were collected for examination by three pathologists who are well experienced with the updated Sydney system for grading atrophy. All biopsies had at least one well oriented 10× microscopic field representing the entire thickness of the mucosa wherever possible.

Building the database

A three-stage process was followed by pathologists to examine the biopsies and to exclude those where a final consensus was not possible. In the first stage, each pathologist visually and independently examined the same set of tissue sections and graded the cases according to the updated Sydney system. Each pathologist was required to submit 175 reports for all examined biopsies without any interaction between the group members. To avoid inter-observer variations, a special conference meeting was held by the group of pathologists to discuss their results. There was agreement between the pathologists on 72 biopsies. A consensus was reached after discussion on 83 other biopsies where different opinions were initially reported. The pathologists were unable to reach a consensus on the remaining 20 biopsies. Therefore, only 155 biopsies were considered in the remainder of the study.

Six months later, the same group of pathologists was asked to study and grade the 155 biopsies independently and blindly. Previous consensus and individual opinions were not shown to any of the pathologists in an attempt to eliminate any intra-observer variations. The same procedure was followed. The pathologists agreed on 91 biopsies and disagreed on 64. The pathologists were then asked to reach a consensus on their current results. After group discussion, they managed to reach a consensus on 57 biopsies; 7 biopsies were excluded because a consensus was not possible. Therefore, only 148 biopsies proceeded to the final stage of the process.

In the last stage, the previous and current consensuses on the 148 biopsies were presented to the group of pathologists for further evaluation. The first and second consensus agreed on 129 biopsies. Of the remaining 19 biopsies, a final consensus was reached on 14; the pathologists failed to reach a final consensus on the other 5. These 5 biopsies were eliminated from the study to avoid any misclassifications. Only 143 biopsies were used in the rest of the study, as inter- and intra-observer variations between pathologists were greatly reduced through this elaborate three-stage process. Figure 1 shows the stages of the visual analogue scale grading process, and table 1 summarises the pathologists’ consensus.
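The bookkeeping of the three-stage process can be checked with a few lines of arithmetic. The sketch below uses only the counts reported in the text; the names `STAGES` and `tally` are illustrative and not part of the study.

```python
# Counts reported in the text for each stage:
# (stage name, immediate agreement, consensus after discussion, excluded).
STAGES = [
    ("Stage I", 72, 83, 20),
    ("Stage II", 91, 57, 7),
    ("Stage III", 129, 14, 5),
]


def tally(start, stages):
    """Return the number of biopsies retained after each stage."""
    retained = []
    remaining = start
    for name, agreed, consensus, excluded in stages:
        # Every biopsy entering a stage is agreed on, resolved, or excluded.
        assert agreed + consensus + excluded == remaining, name
        remaining = agreed + consensus
        retained.append(remaining)
    return retained


retained = tally(175, STAGES)
print(retained)                   # biopsies retained after stages I, II and III
print(175 - retained[-1])         # total number of excluded biopsies
```

The retained counts match the 155, 148 and 143 biopsies quoted above, and the total excluded matches the 32 listed in table 1.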

Image acquisition

Microscopic digital colour images were captured for all 143 studied biopsies at the Department of Pathology in King Abdullah University Hospital (KAUH), Irbid, Jordan. Images were obtained using the Leica imaging system (Leica Imaging Systems, Cambridge, UK and Dynamic Data Links). In order to capture the widest possible field from the tissue, images were captured at 10×. The size of acquired images was set to 574×736 pixels. To compensate for illumination irregularities, a single background image was captured for each slide under the same microscope setup options. Compensation was performed by image subtraction between the captured image and the background image.
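A minimal sketch of background subtraction is given below. The text does not state the order of subtraction or how the result was rescaled, so the choices here (image minus background, re-centred on the mean background level and clipped to 0–255) are assumptions, and `compensate` is a hypothetical helper operating on greyscale images stored as lists of rows.

```python
def compensate(image, background):
    """Subtract the background frame and re-centre on its mean grey level.

    Assumption: both inputs are 8-bit greyscale images of identical size,
    stored as lists of rows. The offset and clipping are illustrative choices.
    """
    flat = [value for row in background for value in row]
    offset = sum(flat) / len(flat)
    corrected = []
    for image_row, background_row in zip(image, background):
        corrected.append([
            min(255, max(0, round(i - b + offset)))
            for i, b in zip(image_row, background_row)
        ])
    return corrected


# Toy 2x3 frames: the right-hand side of the field is lit more strongly.
background = [[200, 210, 220], [200, 210, 220]]
image = [[100, 110, 120], [190, 200, 210]]
print(compensate(image, background))
```

In the toy example the left-to-right illumination ramp present in both frames is removed, leaving each row at a uniform grey level.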

Gland localisation and feature extraction

Normal antral morphology and histology, in terms of gland density, vary depending on biopsy location. Currently, there is no gold standard to adjust for these differences. As far as we are aware, no study of antral atrophy is fully reproducible. However, H pylori infection tends to localise more to the antrum, which is why gastroenterologists target it for biopsy. It is believed that all cases in a given atrophy grade share a common set of features that lie close together and sufficiently enclose all cases of that class, that is, keep them distant from the cases of any other class.

In digital image terms, glands are polygons characterised by a low-contrast boundary and a relatively brighter body. Their shape tends to be almost circular, but might deform or even disappear due to the presence of atrophy, as shown in figure 2.

As indicated earlier, glands are a key factor in grading gastric atrophy. Therefore, the first and crucial step towards the assessment of atrophic gastritis is to locate these polygons in an image. ‘Snakes’ is a well known active contour model that has been used extensively in recent years for localising polygons in a greyscale image in many computer vision and image processing applications.21–23 This model generates an elastic curve that is propagated by image forces towards the minimum energy generated by an image. To ensure regularity of the curve and to limit the bending effect, ‘Snakes’ introduces some internal regularisation constraints. The internal forces keep the shape and ensure the spatial and temporal continuity. The external forces, which represent image forces as well as constraint forces, pull and guide the snake in an interactive and dynamic iterative process. The energy function is defined as21:

$$E_{\text{snake}} = E_{\text{internal}} + E_{\text{external}} \tag{1}$$

$E_{\text{internal}}$ elastically holds the curve together (elasticity forces) and keeps it from bending too much (bending forces). It is defined as:

$$E_{\text{internal}} = \frac{1}{2}\int_s \alpha\,|C_s|^2\,ds + \frac{1}{2}\int_s \beta\,|C_{ss}|^2\,ds \tag{2}$$

where $C_s$ and $C_{ss}$ represent the first and second derivatives of the curve, respectively; $\alpha$ and $\beta$ are parameters that control the snake’s tension and rigidity.

On the other hand, $E_{\text{external}}$ pulls or pushes the curve towards the edges. Typically, the external forces consist of image forces and constraints. This energy is defined as:

Figure 1 Stages of the visual analogue scale grading process followed to build the biopsy database.
Stage I: 175 biopsies; agreement on 72, disagreement on 103 (consensus on 83, 20 excluded); 155 biopsies retained.
Stage II: 155 biopsies; agreement on 91, disagreement on 64 (consensus on 57, 7 excluded); 148 biopsies retained.
Stage III: 148 biopsies; agreement on 129, disagreement on 19 (consensus on 14, 5 excluded); 143 biopsies retained.

Table 1 Pathologists’ consensus using the visual analogue scale based on the updated Sydney system

| Studied biopsies | Normal | Mild | Moderate | Severe | Total (consensus reached) | Excluded |
|---|---|---|---|---|---|---|
| 175 | 47 | 49 | 31 | 16 | 143 | 32 |


$$E_{\text{external}} = E_{\text{image}} + E_{\text{con}} \tag{3}$$

where $E_{\text{image}}$ is the image energy, a potential function whose negative gradient gives the image force. It is given as:

$$E_{\text{image}}(x) = -|\nabla I(x)|^2 \tag{4}$$

where $I(x)$ is the image intensity at position $x$. Finally, $E_{\text{con}}$ gives rise to external constraint forces.

Classical ‘Snakes’ suffers from problems associated with initialisation and poor convergence to boundary concavities. Gradient vector flow (GVF) provides a new external force for active contours that largely solves both problems. It is computed as a diffusion of the gradient vectors of a greyscale or binary edge map derived from the image. It differs fundamentally from traditional snake external forces in that it cannot be written as the negative gradient of a potential function, and the corresponding snake is formulated directly from a force balance condition rather than a variational formulation.
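To make equations 1 to 4 concrete, the sketch below evaluates a discretised version of the classical snake energy on a closed contour. It is an illustration only, not the software used in the study: it covers the classical energy rather than GVF, omits $E_{\text{con}}$, replaces the derivatives with finite differences, and the function names and parameter values are assumptions.

```python
def internal_energy(points, alpha, beta):
    """Discrete form of equation 2 for a closed contour of (x, y) points."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x_prev, y_prev = points[i - 1]          # index -1 wraps to the last point
        x, y = points[i]
        x_next, y_next = points[(i + 1) % n]
        # Forward difference approximates C_s; second difference approximates C_ss.
        first_sq = (x_next - x) ** 2 + (y_next - y) ** 2
        second_sq = (x_next - 2 * x + x_prev) ** 2 + (y_next - 2 * y + y_prev) ** 2
        total += 0.5 * (alpha * first_sq + beta * second_sq)
    return total


def image_energy(points, image):
    """Equation 4 summed over the contour: minus the squared gradient magnitude."""
    total = 0.0
    for x, y in points:
        # Central differences; contour points must not lie on the image border.
        gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
        gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
        total -= gx * gx + gy * gy
    return total


def snake_energy(points, image, alpha=1.0, beta=0.5):
    """Equation 1 with the constraint term E_con left out."""
    return internal_energy(points, alpha, beta) + image_energy(points, image)


# Toy 7x7 image: a bright 3x3 block (value 10) on a dark background.
image = [[10 if 2 <= r <= 4 and 2 <= c <= 4 else 0 for c in range(7)]
         for r in range(7)]
on_edges = [(2, 2), (4, 2), (4, 4), (2, 4)]    # corners of the bright block
off_edges = [(1, 1), (5, 1), (5, 5), (1, 5)]   # one pixel outside the block
print(snake_energy(on_edges, image), snake_energy(off_edges, image))
```

The contour lying on the block boundary has the lower energy, which is the behaviour the minimisation relies on: the image term rewards strong gradients while the internal term penalises long or sharply bent contours.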

This approach was used to localise glands in our study and was implemented via special software. All localised polygons were further investigated to guarantee that they represent glandular regions. Besides the target glandular regions, the polygons encountered were mainly inflammatory cells and other image artefacts. To remove these, the area of every polygon was calculated, and polygons with an area below 50% of the average area were rejected. Any unenclosed contour was also rejected, as it might correspond to the mucosa boundary. Further cleaning steps, mainly erosion and dilation, were carried out to remove any unwanted spots.24 As a result, only glandular areas remain in the processed image, represented by closed contours that trace the boundaries of these regions (figure 3).
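The area-based rejection rule can be sketched as follows. The text does not say whether the average is taken before or after rejection; this example assumes it is the mean over all closed polygons found in the image. The helpers `polygon_area`, `keep_gland_candidates` and `square` are illustrative names.

```python
def polygon_area(vertices):
    """Shoelace formula for a simple closed polygon given as (x, y) vertices."""
    n = len(vertices)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def keep_gland_candidates(polygons, fraction=0.5):
    """Reject polygons whose area is below `fraction` of the mean polygon area."""
    areas = [polygon_area(p) for p in polygons]
    mean_area = sum(areas) / len(areas)
    return [p for p, a in zip(polygons, areas) if a >= fraction * mean_area]


def square(side, x0=0.0, y0=0.0):
    """Hypothetical helper: an axis-aligned square used as a stand-in contour."""
    return [(x0, y0), (x0 + side, y0), (x0 + side, y0 + side), (x0, y0 + side)]


# Two gland-sized contours and one small spot (for example an inflammatory cell).
polygons = [square(10), square(12, 20, 0), square(2, 40, 0)]
kept = keep_gland_candidates(polygons)
print(len(kept), [polygon_area(p) for p in kept])
```

Here the mean area is about 82.7, so the threshold is about 41.3 and only the small spot is discarded.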

The next stage of the process was to perform a number of required measurements to extract a set of discriminating features peculiar to every atrophy grade. The set of features proposed in Matalka et al20 was adopted, as these features have proved to be efficient and sufficient in the grading process. The measurements performed were the centroid of each gland, the gland area, the gland perimeter, the mucosa area, and the number of glands located in the mucosa region. Based on these measurements, two types of features were extracted: shape related features and density related features.

Shape related features

The primary shape related feature, as defined in Matalka et al,20 was the circularity of the glands, defined as:

$$C_G = \frac{\text{Perm}^2}{\text{Area}} \tag{5}$$

where Perm and Area are the gland’s perimeter and area, respectively. When the shape is exactly circular, the circularity is $4\pi$ ($\approx 12.57$) regardless of the size of the circle, whereas it is 16 when the shape is square. To normalise this feature, as required by the neural network utilised, it is divided by 20, as all circularity values computed were below this value.
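The two reference values quoted for equation 5 can be verified directly. The short sketch below computes the circularity of a circle and a square and applies the division by 20 described in the text; the function names are illustrative.

```python
import math


def circularity(perimeter, area):
    """Equation 5: squared perimeter divided by area."""
    return perimeter ** 2 / area


def normalised_circularity(perimeter, area, scale=20.0):
    """Circularity divided by the constant 20 used in the text."""
    return circularity(perimeter, area) / scale


radius = 3.0
circle = circularity(2 * math.pi * radius, math.pi * radius ** 2)
square = circularity(4 * 5.0, 5.0 ** 2)
print(circle, square, normalised_circularity(4 * 5.0, 5.0 ** 2))
```

Because the perimeter scales linearly and the area quadratically with size, the ratio depends only on shape, which is why the radius and side length chosen here do not affect the result.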

Density related features

Three density related features were extracted as defined in Matalka et al.20 The first feature is the ratio of the area occupied by the glands to the total mucosa area. This feature is normalised by definition, as required. The second feature, the average gland spacing, is derived from the minimal spanning tree (MST) algorithm and measures how close the encountered glands are to each other. To normalise this feature, the average mucosa area is calculated over all training samples and the ‘diameter’ of this area is approximated assuming a circular shape. The average distance computed using the MST algorithm is then divided by this approximated ‘diameter’. The third feature is the ratio of the mucosa area to the total number of located glands.20 To normalise this feature, the ratio is further divided by the average gland area computed earlier. That is,

$$\frac{A_m}{N \cdot A_g} \tag{6}$$

where $A_m$ is the total mucosa area, $N$ is the number of encountered glands and $A_g$ is the average gland area.

Figure 2 Sample biopsy images graded visually as: (A) normal, (B) mild, (C) moderate, and (D) severe.

Based on the above discussion, a digital image was captured for each biopsy. The images were processed as described, extracting four features for each biopsy. The features were then integrated in the database as tagging attributes for each image.
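The three density related features can be sketched as below, using Prim's algorithm to build the MST over the gland centroids. This is an illustration under stated assumptions rather than the authors' implementation: the text does not specify over which images the average gland area in equation 6 is taken, so both averages are passed in as parameters, and all names and numbers are hypothetical.

```python
import math


def mst_edge_lengths(points):
    """Prim's algorithm on the complete Euclidean graph over gland centroids."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known link from the tree to each point
    best[0] = 0.0
    lengths = []
    for step in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if step > 0:           # the starting point contributes no edge
            lengths.append(best[u])
        for v in range(n):
            if not in_tree[v]:
                best[v] = min(best[v], math.dist(points[u], points[v]))
    return lengths


def density_features(centroids, gland_areas, mucosa_area,
                     mean_mucosa_area, mean_gland_area):
    """Return (gland area ratio, normalised MST spacing, equation 6)."""
    area_ratio = sum(gland_areas) / mucosa_area
    edges = mst_edge_lengths(centroids)
    # 'Diameter' of a circle whose area equals the average mucosa area.
    diameter = 2.0 * math.sqrt(mean_mucosa_area / math.pi)
    spacing = (sum(edges) / len(edges)) / diameter
    area_per_gland = mucosa_area / (len(centroids) * mean_gland_area)
    return area_ratio, spacing, area_per_gland


# Toy biopsy: four glands at the corners of a 3 x 4 rectangle.
centroids = [(0, 0), (3, 0), (3, 4), (0, 4)]
features = density_features(centroids, [10, 10, 10, 10], mucosa_area=100,
                            mean_mucosa_area=100, mean_gland_area=10)
print(features)
```

Note that if $A_g$ were the mean gland area of the same image, equation 6 would reduce to the reciprocal of the first feature; passing the average in separately keeps the two features distinct, which is one possible reading of the text.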

Probabilistic neural networks

The probabilistic neural network (PNN) was introduced by Specht.25 It is specialised for use mainly with classification problems. The architecture of the PNN is an implementation of the Bayesian classifier, in which a feature vector $\vec{V}$ is assigned to a class $C_i$, $i = 1, 2, \ldots, k$, if and only if:

$$p_i L_i f_i(\vec{V}) > p_j L_j f_j(\vec{V}), \quad i \neq j \tag{7}$$

where $p_i$ is the probability that $\vec{V}$ belongs to class $C_i$, $L_i$ is the loss associated with misclassifying a vector from $C_i$, and $f_i(\vec{V})$ is the probability density function (pdf) for class $C_i$.

Figure 3 Processing steps: original images (left), intermediate image after applying the Snakes algorithm to localise glands (middle), and the final image after further cleaning (right). (A) normal, (B) mild, (C) moderate, and (D) severe.


The main problem with the Bayesian classifier is that the probability density function $f_i(\vec{V})$ is usually unknown. One way to estimate the pdf is to use the Parzen pdf estimator.26 With a Gaussian weighting kernel, the pdf is estimated by27:

$$f_i(\vec{V}) = \frac{1}{(2\pi)^{n/2}\sigma^n k_i} \sum_{j=1}^{k_i} \exp\left(-\frac{(\vec{V}-\vec{V}_{ij})^T(\vec{V}-\vec{V}_{ij})}{2\sigma^2}\right) \tag{8}$$

where $n$ is the dimension of the feature vector, $k_i$ is the number of pattern sample points in class $C_i$ and $\vec{V}_{ij}$ is the $j$th sample belonging to class $C_i$. The choice of the parameter $\sigma$ is crucial for the success of the classification procedure; a large $\sigma$ results in a flat curved surface, while a small $\sigma$ results in narrow peaked curves.27
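The effect of $\sigma$ in equation 8 can be seen with a direct transcription of the estimator. The sample points and $\sigma$ values below are made up for illustration, and `parzen_pdf` is a hypothetical name.

```python
import math


def parzen_pdf(v, samples, sigma):
    """Equation 8: Gaussian-kernel Parzen estimate of a class pdf at point v."""
    n = len(v)                                   # dimension of the feature vector
    norm = (2 * math.pi) ** (n / 2) * sigma ** n * len(samples)
    total = 0.0
    for sample in samples:
        squared_distance = sum((a - b) ** 2 for a, b in zip(v, sample))
        total += math.exp(-squared_distance / (2 * sigma ** 2))
    return total / norm


# Two one-dimensional training samples of a single class.
samples = [(0.0,), (1.0,)]
for sigma in (0.1, 2.0):
    at_sample = parzen_pdf((0.0,), samples, sigma)
    between = parzen_pdf((0.5,), samples, sigma)
    print(sigma, at_sample, between)
```

With the small $\sigma$ the estimate is sharply peaked at the samples and nearly zero between them; with the large $\sigma$ the two evaluations are almost equal, giving the flat surface described in the text.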

The architecture of a PNN is similar to a four-layer feed forward neural network (FFNN), as shown in figure 4. The first layer is the input layer, which passes the inputs to the next layer. The inputs must be normalised to a unit length before being processed by the network.27 The second layer is called the hidden layer, which is fully connected to the input layer. The number of nodes in this layer is equal to the number of training samples in the training set. With an exponential activation function, the output of a node in this layer is of the form:

$$f(\vec{V}, \vec{W}_i) = \exp\left(-\frac{(\vec{V}-\vec{W}_i)^T(\vec{V}-\vec{W}_i)}{2\sigma^2}\right) \tag{9}$$

where $\vec{V}$ is the input vector and $\vec{W}_i$ is the node’s weight vector, which is set equal to the training sample that the node represents. The third layer is the class layer: each of its nodes accumulates the outputs of the hidden layer nodes that belong to the same class, so that the computation of equation 8 is completed. The final output is computed in the last layer according to equation 7.

This architecture was adopted and implemented via special software to perform the classification and recognition of atrophy grading. The inputs to the PNN were the four extracted features described in the previous subsection. The number of hidden layer nodes was set to the number of training patterns, which was 94 in our case. Four class nodes in the class layer represented the four grades of atrophy according to the updated Sydney system. Finally, there was one decision node in the last layer, which evaluates the outputs of the third layer according to equation 7.
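The four layers can be sketched compactly as follows. This is a minimal illustration, not the authors' software: the class name `PNN`, the value of $\sigma$, and the toy feature vectors are assumptions, and equal priors and losses are used unless supplied. The constant factor $(2\pi)^{n/2}\sigma^n$ of equation 8 is the same for every class, so it is omitted without changing the decision.

```python
import math


class PNN:
    """Minimal probabilistic neural network following equations 7 to 9."""

    def __init__(self, sigma):
        self.sigma = sigma
        self.patterns = []     # hidden layer: one weight vector per training sample
        self.labels = []       # class to which each hidden node belongs

    def fit(self, vectors, labels):
        # "Training" only stores the samples as hidden-node weight vectors.
        self.patterns = [tuple(v) for v in vectors]
        self.labels = list(labels)
        return self

    def class_scores(self, v):
        """Class layer: average of the hidden-node outputs for each class."""
        sums, counts = {}, {}
        for weights, label in zip(self.patterns, self.labels):
            squared_distance = sum((a - b) ** 2 for a, b in zip(v, weights))
            output = math.exp(-squared_distance / (2 * self.sigma ** 2))  # equation 9
            sums[label] = sums.get(label, 0.0) + output
            counts[label] = counts.get(label, 0) + 1
        return {label: sums[label] / counts[label] for label in sums}

    def predict(self, v, priors=None, losses=None):
        """Decision node: pick the class with the largest p_i * L_i * f_i(V)."""
        scores = self.class_scores(v)

        def weighted(label):
            prior = priors[label] if priors else 1.0
            loss = losses[label] if losses else 1.0
            return prior * loss * scores[label]

        return max(scores, key=weighted)


# Hypothetical normalised feature vectors (circularity and three density features).
train_vectors = [(0.60, 0.70, 0.20, 0.30), (0.65, 0.72, 0.22, 0.28),
                 (0.90, 0.20, 0.60, 0.80), (0.85, 0.25, 0.55, 0.75)]
train_labels = ["normal", "normal", "severe", "severe"]
net = PNN(sigma=0.1).fit(train_vectors, train_labels)
print(net.predict((0.62, 0.69, 0.21, 0.31)), net.predict((0.88, 0.22, 0.58, 0.79)))
```

Because the hidden layer is simply the stored training set, adding or removing training cases needs no retraining, only a suitable choice of $\sigma$.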

RESULTS AND DISCUSSION

The constructed database, comprising the 143 images with their associated feature tags, was used in the recognition process. The final experts’ consensus was used as the baseline for training and testing the implemented classifier. Sixty-six per cent of the cases (94 biopsies) were used for training purposes and the remaining 34% (49 biopsies) were used for testing purposes.

Figure 4 Probabilistic neural network architecture, comprising input nodes, hidden nodes, class nodes and a decision node.

Table 2 Classification results obtained using the probabilistic neural network (PNN) classifier against experts’ consensus during the training phase

| Atrophy grade (experts’ consensus) | Cases (n) | PNN: Normal | PNN: Mild | PNN: Moderate | PNN: Severe |
|---|---|---|---|---|---|
| Normal | 31 | 31 | 0 | 0 | 0 |
| Mild | 32 | 0 | 31 | 1 | 0 |
| Moderate | 20 | 0 | 0 | 20 | 0 |
| Severe | 11 | 0 | 0 | 0 | 11 |

Table 3 Classification results obtained using the probabilistic neural network (PNN) classifier against experts’ consensus during the testing phase

| Atrophy grade (experts’ consensus) | Cases (n) | PNN: Normal | PNN: Mild | PNN: Moderate | PNN: Severe |
|---|---|---|---|---|---|
| Normal | 16 | 16 | 0 | 0 | 0 |
| Mild | 17 | 0 | 16 | 1 | 0 |
| Moderate | 11 | 0 | 1 | 10 | 0 |
| Severe | 5 | 0 | 0 | 0 | 5 |

Table 4 Computed performance metrics during training and testing of the proposed classification technique

| Atrophy grade | Precision: Training | Precision: Testing | Precision: Overall | Recall: Training | Recall: Testing | Recall: Overall |
|---|---|---|---|---|---|---|
| Normal | 100% | 100% | 100% | 100% | 100% | 100% |
| Mild | 100% | 94.12% | 97.92% | 96.88% | 94.12% | 95.92% |
| Moderate | 95.24% | 90.91% | 93.75% | 100% | 90.91% | 96.77% |
| Severe | 100% | 100% | 100% | 100% | 100% | 100% |
| Overall | | | 97.90% | | | 97.90% |


During the training phase of the classifier, the system was fed with the desired atrophy grade according to the experts’ consensus. In contrast, in the testing phase the classifier worked blindly and independently of the consensus in order to evaluate the performance of the proposed system.

Two assessment measures were used to evaluate the performance of the proposed system: recall and precision. Recall for a given class is defined as the ratio of the number of correctly classified cases to the total number of cases that actually belong to that class.28 Precision, on the other hand, is defined as the ratio of the number of correctly classified cases to the total number of cases, correct and incorrect, that the classifier assigned to that class.28

The pathologists’ consensus was used as the reference to validate the proposed technique. Tables 2 and 3 illustrate the results obtained after classification during the training and testing phases, respectively, in comparison with pathologists’ consensus.

As can be seen from tables 2 and 3, the results show high agreement with the pathologists’ consensus. In addition, every misclassified case was assigned to a grade adjacent to the actual grade, a situation that pathologists can accept to a certain extent. Therefore, considerable confidence can be placed in the proposed system as a decision support system for pathologists working in this particular field.

Table 4 shows the precision and recall metrics calculated from the results reported in tables 2 and 3. The overall precision achieved was very high, as only three cases were misclassified. In fact, when investigating those misclassified cases, we found that they were among the biopsies on which the pathologists did not agree in the first and second rounds of studying the biopsies independently. It was only after panel discussion that the pathologists reached consensus on those cases.

The proposed decision support system was able to discern four grades of gastric atrophy in accordance with the updated Sydney system. Furthermore, the system correctly graded 140 of the 143 biopsies considered in this study, an overall precision of 97.9%. In this respect, the proposed system outperforms existing techniques reported in the literature,3–5 in which the authors were able to distinguish only three grades of atrophy. The recognition rate reported in these studies was fairly low, and did not exceed 60% in one study.5 In comparison with our previous study,20 the proposed system performs better in terms of precision and recall. We attribute this to the use of a probabilistic neural network as a classifier, which gives the proposed system greater flexibility than the classical k-means clustering technique. We believe that the proposed system is a real attempt to standardise the assessment of gastric atrophy, eliminating inter- and intra-observer variability. In future, practitioners can rely confidently on this system for grading atrophic gastritis quantitatively.
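For readers unfamiliar with the classifier type, the following Python sketch shows the general form of a probabilistic neural network in the sense of Specht (reference 25), built on Parzen window density estimates (reference 26). It is a generic illustration under stated assumptions, not the authors' implementation: the two-component feature vectors, the grade labels used as dictionary keys, the shared smoothing parameter `sigma` and the equal class priors are all hypothetical.

```python
import math


def pnn_classify(x, training, sigma=0.1):
    """Classify feature vector x with a basic probabilistic neural network.

    training maps each class label to a list of example feature vectors.
    Equal class priors and one shared smoothing parameter are ASSUMED.
    """
    scores = {}
    for label, patterns in training.items():
        # Pattern layer: one Gaussian kernel per stored training example.
        kernels = [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2 * sigma ** 2))
            for p in patterns
        ]
        # Summation layer: the mean kernel response is a Parzen estimate of
        # the class density. The Gaussian normalising constant is omitted
        # because it is identical for every class and cannot change the winner.
        scores[label] = sum(kernels) / len(kernels)
    # Decision layer: choose the class with the largest estimated density.
    return max(scores, key=scores.get), scores


# HYPOTHETICAL feature vectors, invented purely for illustration.
TRAINING = {
    "normal": [(0.90, 0.10), (0.80, 0.20)],
    "mild": [(0.60, 0.40), (0.65, 0.35)],
    "moderate": [(0.40, 0.60), (0.35, 0.65)],
    "severe": [(0.10, 0.90), (0.20, 0.80)],
}

if __name__ == "__main__":
    label, scores = pnn_classify((0.85, 0.15), TRAINING)
    print(label)
```

Unlike k-means, which summarises each cluster by a single centroid, this structure keeps every training example as a kernel, so adding newly graded biopsies only requires appending them to the stored patterns.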

CONCLUSIONS

In this study, a decision support system for grading gastric atrophy according to the updated Sydney system was proposed. In this system, advanced image processing and artificial intelligence techniques were incorporated to grade atrophic gastritis quantitatively and automatically. The proposed system is robust and reliable, and accomplishes the grading process with significant reduction in variability and subjectivity and with high reproducibility.

The neural network classifier incorporated was able to classify the studied biopsies into four distinct grades in accordance with the updated Sydney system, with a precision of 97.9%. In conclusion, we have great confidence that the proposed system is a reliable decision support system that can be used by researchers and practitioners in this field with minimal subjectivity and user interaction.

Funding The authors are grateful to the Deanships of Scientific Research at Jordan University of Science and Technology and Yarmouk University for funding this project.

Competing interests None.

Provenance and peer review Not commissioned; externally peer reviewed.

REFERENCES

1. Rugge M, Russo VM, Guido M. Review article: what we have learnt from gastric biopsy? Aliment Pharmacol Ther 2003;17(Suppl 2):68–74.
2. Genta RM, Rugge M. Assessing risks for gastric cancer: new tools for pathologists. World J Gastroenterol 2006;12:5622–7.
3. Zaitoun AM, al Mardini H, Record CO. Quantitative assessment of gastric atrophy using the syntactic structure analysis. J Clin Pathol 1998;51:895–900.
4. Grieken NCT, Weiss MM, Meijer GA, et al. Rapid quantitative assessment of gastric corpus atrophy in tissue sections. J Clin Pathol 2001;54:63–9.
5. Ruiz B, Garay J, Johnson W, et al. Morphometric assessment of gastric antral atrophy: comparison with visual evaluation. Histopathol 2001;39:235–42.
6. Zaitoun AM, Record CO. Application of quantitative techniques for the assessment of gastric atrophy. J Clin Pathol 2001;54:161–2.
7. Grieken NCT, Meijer GA, Weiss MM, et al. Quantitative assessment of gastric corpus atrophy in subjects using omeprazole: a randomized follow-up study. Am J Gastroenterol 2001;96:2882–6.
8. Duncan JS, Ayache N. Medical image analysis: progress over two decades and the challenges ahead. IEEE Trans on PAMI 2000;22:85–106.
9. Chen XY, Hulst RW, Bruno MJ, et al. Interobserver variation in the histopathological scoring of helicobacter pylori related gastritis. J Clin Pathol 1999;52:612–15.
10. Andrew A, Wyatt JI, Dixon MF. Observer variation in the assessment of chronic gastritis according to the Sydney system. Histopathol 1994;25:317–22.
11. Ruiz B, Garay J, Correa P, et al. Morphometric evaluation of gastric antral atrophy: improvement after cure of helicobacter pylori. Am J Gastroenterol 2001;96:3281–7.
12. Rugge M, Genta RM. Staging and grading of chronic gastritis. Hum Pathol 2005;36:228–33.
13. Genta RM. Gastric atrophy and atrophic gastritis-nebulous concepts in search of a definition. Aliment Pharmacol Ther 1998;12(Suppl 1):17–23.
14. Genta RM. Atrophy, acid suppression and Helicobacter pylori infection: a tale of two studies. Eur J Gastroenterol Hepatol 1999;11(Suppl 2):S29–33; discussion S43–5.
15. Misiewicz JJ, Tytgat GNJ, Goodwin CS, et al. The Sydney system: a new classification of gastritis. J Hepatol Gastroenterol 1991;6:209–22.
16. Dixon MF, Genta RM, Yardley JH, et al. Classification and grading of gastritis: the updated Sydney system. Am J Surgical Pathology 1996;20:1161–81.
17. Offerhaus GJ, Price AB, Haot J, et al. Observer agreement on the grading of gastric atrophy. Histopathology 1999;34:320–5.
18. El-Zimaity HM, Graham DY, Al-Assi MT, et al. Interobserver variation in the histopathological assessment of helicobacter pylori gastritis. Hum Path 1996;27:36–41.
19. Rugge M, Correa P, Dixon MF, et al. Gastric mucosal atrophy: interobserver consistency using new criteria for classification and grading. Aliment Pharmacol Ther 2002;16:1249–59.

Take-home messages

• Atrophic gastritis is an important risk factor for gastric adenocarcinoma of the intestinal type. Proper identification and assessment of atrophy helps in estimating the risk of gastric carcinoma.

• The updated Sydney system for grading gastritis aimed to standardise the interpretation of gastric biopsy.

• Although some consensus studies have achieved a degree of improvement, there is still significant inter-observer and intra-observer variability between pathologists.

• The proposed system intends to serve as a fully automated decision support system to grade gastric atrophy according to the updated Sydney system. It utilises advanced image processing techniques and probabilistic neural networks in conducting the assessment.


20. Matalka II, Al-Omari FA, Al-Jarrah MA, et al. Image-based discriminating morphological features for gastric atrophy assessment: a step to go further. Pathol Res Pract 2008;204:235–40.
21. Kass M, Witkin A, Terzopoulos D. Snakes: active contour models. In Proc Int Conf Computer Vision, 1987;261–8.
22. Kass M, Witkin A, Terzopoulos D. Snakes: active contour models. Int J of Computer Vision 1988;1:321–31.
23. Xu C, Prince J. Snakes, shapes, and gradient vector flow. IEEE Trans on PAMI 1998;7:359–69.
24. Castleman K. Digital Image Processing. Upper Saddle River, NJ: Prentice Hall, 1996.
25. Specht DF. Probabilistic neural networks. Neural Networks 1990;3:109–18.
26. Parzen E. On estimation of a probability density function and mode. Ann Mathematical Statistics 1962;33:1065–76.
27. Patterson DW. Artificial Neural Networks, Theory and Applications. Upper Saddle River, NJ: Prentice Hall, 1996.
28. Muller H, Muller W, Marchand-Maillet S, et al. Performance evaluation in content-based image retrieval: overview and proposals. Pattern Recog Let 2000;22:593–601.


Faruq A Al-Omari, Ismail I Matalka, Mohammad A Al-Jarrah, et al. An intelligent decision support system for quantitative assessment of gastric atrophy. J Clin Pathol 2011;64:330–337. Originally published online February 23, 2011. doi: 10.1136/jcp.2010.088252

Updated information and services can be found at: http://jcp.bmj.com/content/64/4/330.full.html
