Haplotype structure of five SNPs within the ACE gene in the Tunisian population

Annals of Human Biology, May–June 2006; 33(3): 319–329

ORIGINAL ARTICLE

Haplotype structure of five SNPs within the ACE gene in the Tunisian population

MAHA REBAÏ, NAJLA KHARRAT, IMEN AYADI, & AHMED REBAÏ

Bioinformatics Unit, Centre of Biotechnology of Sfax, Sfax, Tunisia

(Received 24 November 2005; revised 2 January 2006; accepted 23 January 2006)

Abstract

Background: The Angiotensin-Converting Enzyme (ACE) is a candidate gene in the aetiology of several common diseases. The study of the haplotype structure of this gene is of interest in diagnosis and in pharmacogenomics.

Aim: The study investigated the haplotype profile of single nucleotide polymorphisms (SNPs) within the ACE gene in the Tunisian population and compared it with other populations.

Subjects and methods: Five SNPs (rs1800764, rs4291, rs4309, rs4331, rs4340) covering a region of 15.6 kb of the ACE gene were typed by PCR-digestion in a sample of 100 healthy subjects.

Results: All SNPs were polymorphic and in Hardy–Weinberg equilibrium. A total of 21 haplotypes were identified but only eight had a frequency of more than 1%. The four most common haplotypes had a cumulative frequency of 87.4%. The 'Yin–Yang' phenomenon (the two major haplotypes are complementary at all sites) was found. Linkage disequilibrium between all pairs of loci was highly significant (p < 10⁻⁵). A simple and efficient statistical procedure was used to identify three important SNPs.

Conclusion: The Tunisian population showed a different haplotype structure from the European one for the ACE gene and three important SNPs were identified. These will be very helpful in future association studies in the Tunisian and North African populations.

Keywords: ACE, SNP, Tunisian, haplotype, linkage disequilibrium

Introduction

The analysis of haplotypes for multiple single nucleotide polymorphisms (SNPs) in a gene or

a region of interest is a fundamental step in the process of identification of DNA variants

underlying complex inherited diseases. With the availability of the whole sequence of the

human genome, with more than 10 million SNPs, and the near-completion of the HapMap

project (International HapMap Consortium 2003, 2005a), it is now possible to target

association studies by choosing the most appropriate SNPs based on their polymorphism

features, position within the gene or region of interest and linkage disequilibrium

among them.

Correspondence: Ahmed Rebaï, Centre of Biotechnology of Sfax, PO Box 'K', 3038 Sfax, Tunisia.

E-mail: [email protected]

ISSN 0301–4460 print/ISSN 1464–5033 online © 2006 Informa UK Ltd.

DOI: 10.1080/03014460600621977

The HapMap project, started in 2002, aims to provide by 2006 a full

haplotype map of the human genome including 3.5 million SNPs in four populations

of African, European and Asian origin (Schmidt 2005). This will provide useful

information for geneticists and medical geneticists who are interested in identifying

haplotypes associated with particular forms of common diseases or drug response and will

open the door to a personalized approach to medicine. One of the major objectives of

haplotype studies is to find haplotype tagging SNPs (Johnson et al. 2001), which are SNPs

that best represent a gene or region of interest. Finding such SNPs (denoted tagSNP or

tSNP and sometimes htSNP) will result in a considerable saving of time and money in

genotyping for association and pharmacogenomics studies (Allen-Brady and Camp 2005;

Camp et al. 2005).

The Angiotensin-Converting Enzyme (ACE) or kininase II (EC 3.4.15.1, MIM 106180)

catalyses the conversion of angiotensin I to the physiologically active peptide angiotensin

II, which controls fluid–electrolyte balance and systemic blood pressure. A large number of

physiological, pharmacological and genetic studies have shown the importance of ACE and

its inhibition in the pathogenesis and treatment of a variety of cardiovascular and associated

diseases (see Niu et al. 2002 and Scharplatz et al. 2004 for good reviews). Particularly, DNA

variants within the ACE gene were shown to be involved in the aetiology of several common

diseases and in the therapeutic response to several drugs (see the Genetic Association

Database at http://geneticassociationdb.nih.gov for a large list of references; National

Center for Biotechnology Information 2005b).

The ACE gene maps to the 17q23 region and extends over 45 kb of genomic sequence. The two most

abundant RNA variants of this gene encode two isozymes, the somatic form and the

testicular form, that are equally active. The somatic variant is a transcript of 25 exons

(exons 1–25, corresponding to 20 kb of genomic sequence) while the testicular variant includes

an alternate in-frame exon in the 5′ coding region and results in a different N-terminal part

of the protein.

The ACE gene has 243 SNPs in the human SNP database (dbSNP, http://www.ncbi.nlm.nih.gov/SNP;

National Center for Biotechnology Information 2005a), build 124, of which 127 are

validated. Among these, 50 SNPs within the coding region or in

close proximity to the ACE gene were typed in the HapMap project; most of them (37

SNPs) are located in the 5′ part of the gene (corresponding to the part coding for the somatic

variant). At the current status of the HapMap project (HapMap web site, http://www.hapmap.org;

International HapMap Consortium 2005b), 28 of these 50 SNPs

showed no polymorphism in any of the studied populations. Keavney et al. (1998) studied the

haplotype structure of 10 SNPs within the ACE gene in the European (British) population.

Zhu et al. (2000) genotyped seven SNPs in the 3′ end of ACE in Afro-Caribbean subjects

and evaluated the linkage disequilibrium between them, while Zhu et al. (2001) studied the

association of 13 SNPs within the ACE gene with the plasma ACE concentration and

blood pressure in a Nigerian population sample.

However, several recent studies on linkage disequilibrium structure and haplotype

distribution of SNPs in different populations have shown that results in this field

have limited transferability from one population to another (Nejentsev et al. 2004;

Mueller et al. 2005).

Here we report the results of haplotype analysis with five SNPs in the ACE gene in

a sample of 100 controls from the Tunisian population. Comparison with other populations

revealed different features in haplotype structure. Based on linkage disequilibrium among

markers, three SNPs were retained as being the most important.

Materials and methods

Subjects

A sample of 100 healthy unrelated subjects from southern Tunisia (Sfax region) was

recruited for this study. These were chosen randomly from a larger sample of volunteers

having given informed consent and blood samples. There were 50 males and 50 females with

a mean age of 39 years.

Isolation of genomic DNA from peripheral blood

Genomic DNA from the blood sample was extracted using a standard phenol–chloroform

protocol and was stored at −20 °C for SNP genotyping.

Genotyping of SNPs

The five SNPs typed on controls were chosen from the dbSNP build 121 (2004) according

to the following conditions: the average distance between SNPs is 5 kb, and the frequency of

the minor allele is more than 5%. We looked preferably for SNPs located in coding and

regulatory regions of the somatic isoform. To allow comparison with other populations,

we used a subset of SNPs from the study of Keavney et al. (1998) and Zhu et al. (2001).

Names, features and positions of the SNPs relative to the gene are given in Table I and

Figure 1.

Table I. Features of the five ACE SNPs and primer sequences and enzymes used for their genotyping.

| SNP | Accession number | Type | Position (bp) | Primer sequences | Restriction enzyme |
|---|---|---|---|---|---|
| ACE3 | rs1800764 | T/C | −3905 | F: 5′ATAGTGTATATAGGGCTTGGTAC3′; R: 5′AGAAGATATTTGCAAAGTATGTACTG3′ | PstI |
| ACE4 | rs4291 | A/T | −240 | F: 5′ACCATGGCCTGGTGAAGAAGC3′; R: 5′CGGCTCTGCCCCTTCTCCTGCGC3′; Fint: 5′TGTCACTCCGGAGGCGGGAGGCT3′; Rint: 5′GAGAAAGGGCCTCCTCTCTCT3′ | XbaI |
| ACE6 | rs4309 | C/T | 5489 | F: 5′AGTGCACACGGGTCACGATG3′; R: 5′CCCCCCGACGCAGGGAGCC3′ | MspI |
| ACE7 | rs4331 | A/G | 9618 | F: 5′CACACCCTGAAGTACGGCAC3′; R: 5′TCCTCCAGCTCCTGGGCAG3′ | HaeII |
| ACE ID | rs4340 | I/D | 11698 | F: 5′CTGGAGACCACTCCCATCCTTTCT3′; R: 5′GATGTGGCCATCACATTGGTCAGAT3′; Fv: 5′TGGGACCACAGCGCCCGCCACTAC3′; Rv: 5′TCGCCAGCCCTCCCATGCCCATAA3′ | none |

Position relative to the first base position of the first exon of the gene. For ACE4, Fint and Rint are internal primers used in nested PCR and for ACE ID Fv and Rv are primers used in homozygotes D/D validation.

The ACE ID (insertion/deletion) polymorphism and four biallelic SNPs were typed by

PCR amplification followed by restriction-enzyme digestion. The PCR reactions were

performed using a GeneAmp PCR System 9600 thermocycler (Perkin–Elmer). PCR products

were digested with the appropriate enzyme (Table I) and were run on acrylamide gels (10%)

stained with ethidium bromide, and scored by UV visualization. The SNPs ACE3, ACE4,

ACE7 were amplified using the conditions described by Zhu et al. (2001) while the SNP

ACE6 was amplified following the conditions of Keavney et al. (1998). For the ACE ID, no

digestion is needed but the PCR was carried out in two steps; first PCR was performed with

the primers from Keavney et al. (1998). Since preferential amplification of the D allele

(190 bp) over the I allele (490 bp) has been reported with these primers (Lindpaintner et al.

1995), a second PCR was performed on putative homozygous individuals, i.e. those having

a single D band in first PCR (which may actually be either D/D homozygotes or I/D

heterozygotes). This second PCR uses the primers from Yoshida et al. (1995) and gives

a PCR product of 335 bp only when the I allele is present, thus allowing us to distinguish

I/D from D/D individuals.
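The two-step scoring logic described above can be sketched as a small decision function. This is only an illustration of the reasoning in the text: the band sizes (190, 490 and 335 bp) come from the paragraph above, while the function name `call_ace_id` and its input format (lists of observed band sizes) are hypothetical.

```python
# Band sizes (bp) taken from the text; everything else is illustrative.
D_BAND = 190        # D allele product in the first PCR
I_BAND = 490        # I allele product in the first PCR
I_SPECIFIC = 335    # product of the second PCR, present only if an I allele exists


def call_ace_id(first_pcr_bands, second_pcr_bands=None):
    """Return an ACE I/D genotype call from observed band sizes.

    first_pcr_bands: band sizes seen after the first PCR.
    second_pcr_bands: band sizes seen after the confirmatory PCR, or None
    if that PCR has not been run.
    """
    bands = set(first_pcr_bands)
    has_d = D_BAND in bands
    has_i = I_BAND in bands
    if has_d and has_i:
        return "I/D"
    if has_i:
        return "I/I"
    if has_d:
        # A single D band may hide an I allele that amplified poorly.
        if second_pcr_bands is None:
            return "D/D or I/D (unconfirmed)"
        return "I/D" if I_SPECIFIC in set(second_pcr_bands) else "D/D"
    return "no call"


print(call_ace_id([190, 490]))     # heterozygote seen directly
print(call_ace_id([190], [335]))   # apparent D/D rescued as I/D
print(call_ace_id([190], []))      # confirmed D/D
```

Only samples showing a single D band need the second reaction, which is why the function treats a missing second result as an unconfirmed call rather than as D/D.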

Positive and negative controls were used in all PCR and digestion reactions in order to

check for the correct amplification and restriction of the products.

Statistical analysis

The estimation of allele frequencies and the exact test for Hardy–Weinberg equilibrium were

performed using the Genetic Data Analysis (GDA) program (version 1.1) (Weir 1996).

Inference of haplotypes and their frequencies from genotype data was performed using the

PHASE program (version 2.0.2) (Stephens et al. 2001; Stephens and Donnelly 2003).

Three measures of linkage disequilibrium (LD) between SNPs were then computed from

controls in order to study the LD structure of the gene: gametic disequilibrium from inferred

haplotypes, composite LD from genotype data (Weir 1996) and three-locus LD from

estimated haplotype frequencies based on the Bennett (1954) coefficient. These measures

were standardized as correlation coefficients (r²) so that Nr² provides a χ² test of LD. The

Haploxt (Abecasis and Cookson 2000) program was used to estimate the LD between pairs

of loci from inferred haplotypes. We calculated composite LD using the GDA program

(Weir 1996). Three-locus composite LD was calculated using an Excel worksheet

implementing the formula of Bennett’s coefficient provided by Weir (1996).
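As a minimal sketch of the gametic measure described above, the function below computes r² between two biallelic loci from a table of haplotype frequencies and converts it to the Nr² statistic. The function names and the toy frequencies are hypothetical, and taking N as the number of haplotypes is an assumption; this is not the Haploxt or GDA implementation.

```python
def pairwise_r2(hap_freqs, locus_i, locus_j):
    """Gametic r^2 between two biallelic loci.

    hap_freqs maps haplotype strings (one character per locus) to frequencies.
    locus_i and locus_j are character positions within those strings.
    """
    total = sum(hap_freqs.values())
    first = next(iter(hap_freqs))
    ref_i, ref_j = first[locus_i], first[locus_j]   # arbitrary reference alleles
    p_i = sum(f for h, f in hap_freqs.items() if h[locus_i] == ref_i) / total
    p_j = sum(f for h, f in hap_freqs.items() if h[locus_j] == ref_j) / total
    p_ij = sum(f for h, f in hap_freqs.items()
               if h[locus_i] == ref_i and h[locus_j] == ref_j) / total
    d = p_ij - p_i * p_j                            # disequilibrium coefficient D
    denom = p_i * (1 - p_i) * p_j * (1 - p_j)
    if denom == 0:
        raise ValueError("at least one locus is monomorphic")
    return d * d / denom


def ld_chi_square(r2, n):
    """N * r^2, compared with a chi-square distribution with 1 d.f."""
    return n * r2


# Toy two-locus examples (hypothetical frequencies, not study data).
complete_ld = {"AB": 0.5, "ab": 0.5}
no_ld = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}
print(pairwise_r2(complete_ld, 0, 1), pairwise_r2(no_ld, 0, 1))
print(ld_chi_square(pairwise_r2(complete_ld, 0, 1), 200))
```

Because r² is symmetric in the choice of reference allele, picking the alleles of the first listed haplotype as references does not affect the result.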

We defined a block of LD as a subset of SNPs where three to five haplotypes represent

75–90% of the observed haplotypes in the population (Patil et al. 2001). The determination

of key SNPs was performed using an implementation in R language of the method of

Lin and Altman (2004).

Haplotype frequencies between samples were compared using the exact test of population

differentiation in GENEPOP (Raymond and Rousset 1995).

Figure 1. Positions of the studied SNPs within the ACE gene (the colour version of this figure is included in the online version of the journal). The schematic shows exons 1, 8, 15 and 16 and marks ACE3 (−3900), ACE4 (−240), ACE6 (5489), ACE7 (9618) and ACE ID (11689); the legend distinguishes SNPs within the 5′ UTR, exonic SNPs and intronic SNPs.

Results and discussion

Distribution of allelic frequencies

The allele frequencies of the five SNPs are given in Table II. No significant deviation from

Hardy–Weinberg equilibrium was found for any of the SNPs and all of them have a good

information content (observed heterozygosities close to 0.5, the maximal value).

The markers ACE3, ACE4 and ACE ID were the most informative.
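The expected heterozygosity reported for each SNP can be checked from the allele frequencies alone. The sketch below computes one minus the sum of squared allele frequencies and, optionally, multiplies by 2n/(2n − 1). That small-sample factor is an assumption on our part (the text does not state it), but with n = 100 it brings the recomputed values into agreement with the tabulated ones to three decimals. The function name is hypothetical.

```python
def expected_heterozygosity(allele_freqs, n_individuals=None):
    """Gene diversity 1 - sum(p_i^2).

    If n_individuals is given, apply the 2n/(2n - 1) small-sample factor
    (an assumption here, not something stated in the text).
    """
    h = 1.0 - sum(p * p for p in allele_freqs)
    if n_individuals is not None:
        two_n = 2 * n_individuals
        h *= two_n / (two_n - 1)
    return h


# Allele frequencies as reported for the 100 Tunisian controls (Table II).
table_ii = {
    "ACE3": (0.57, 0.43),
    "ACE4": (0.66, 0.34),
    "ACE6": (0.75, 0.25),
    "ACE7": (0.695, 0.305),
    "ACE ID": (0.325, 0.675),
}

for snp, freqs in table_ii.items():
    plain = expected_heterozygosity(freqs)
    corrected = expected_heterozygosity(freqs, n_individuals=100)
    print(f"{snp}: plain={plain:.3f} corrected={corrected:.3f}")
```

For ACE3, for example, the plain value is 0.490 and the corrected value is 0.493, the latter matching the reported expected heterozygosity.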

In dbSNP, only the allelic frequencies of the SNPs ACE6 and ACE7 were available (0.607

and 0.499 for C and A alleles, respectively). These values are significantly different from

those estimated in our sample (p<0.005). In the HapMap project, only data for the SNP

ACE6 were available for European, Chinese, Japanese and Nigerian population samples.

Frequencies of C allele in these four populations are given in Table III. We see that

the frequency in our population is significantly different from all other populations

(all p<0.001), being midway between the frequencies in European and African populations.

Note also that the heterozygosity in our population is midway between those of African and

European samples of the HapMap project.
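The allele-frequency comparisons above can be illustrated with a Pearson chi-square on a 2 × 2 table of allele counts. This is a sketch under stated assumptions: allele counts are reconstructed by rounding frequency × 2n, no continuity correction is applied, and the function names are hypothetical. Under those assumptions the Chinese, Japanese and European statistics of Table III are recovered to two decimals.

```python
def allele_counts(n_individuals, freq):
    """Reconstruct (allele, other-allele) counts from a frequency; rounding is an assumption."""
    total = 2 * n_individuals
    first = round(total * freq)
    return first, total - first


def allele_chi_square(pop_a, pop_b):
    """Pearson chi-square (1 d.f., no continuity correction) for two (count1, count2) pairs."""
    a1, a2 = pop_a
    b1, b2 = pop_b
    n = a1 + a2 + b1 + b2
    numerator = n * (a1 * b2 - a2 * b1) ** 2
    denominator = (a1 + a2) * (b1 + b2) * (a1 + b1) * (a2 + b2)
    return numerator / denominator


tunisian = allele_counts(100, 0.75)          # present study, C allele of ACE6
hapmap = {
    "Chinese": allele_counts(45, 0.244),
    "Japanese": allele_counts(43, 0.488),
    "European": allele_counts(60, 0.542),
}

for name, counts in hapmap.items():
    print(name, round(allele_chi_square(tunisian, counts), 2))
```

The test is carried out on chromosomes (2n per sample) rather than individuals, which is why a sample of 45 subjects contributes 90 observations.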

Table II. Allelic frequencies of the five SNPs in a sample of 100 controls from the Tunisian population.

| SNP | Frequency | Observed H | Expected H | p-value of HWE |
|---|---|---|---|---|
| ACE3 | C: 0.57; T: 0.43 | 0.440 | 0.493 | 0.21 |
| ACE4 | A: 0.66; T: 0.34 | 0.460 | 0.451 | 0.97 |
| ACE6 | C: 0.75; T: 0.25 | 0.340 | 0.377 | 0.21 |
| ACE7 | A: 0.695; G: 0.305 | 0.390 | 0.426 | 0.27 |
| ACE ID | I: 0.325; D: 0.675 | 0.450 | 0.441 | 0.96 |

H, Heterozygosity. Observed H is the ratio of number of heterozygotes to total number of individuals. Expected H is calculated as one minus sum of squares of allele frequencies. HWE, Hardy–Weinberg equilibrium. Exact p-values for HWE were calculated using the permutation procedure in GDA (Weir 1996).

Table III. Frequency of the C allele of the SNP ACE6 according to the HapMap project and comparison with the present study.

| Population | Chinese | Japanese | European | Present study | Nigerian |
|---|---|---|---|---|---|
| n | 45 | 43 | 60 | 100 | 60 |
| Frequency | 0.244 | 0.488 | 0.542 | 0.75 | 0.917 |
| H (%) | 44.4 | 55.8 | 41.7 | 34.0 | 13.3 |
| χ² test | 65.74 | 18.66 | 14.77 | – | 18.56 |
| p-value | <10⁻⁶ | 1.6 × 10⁻⁵ | 1.2 × 10⁻⁴ | – | 2.2 × 10⁻⁴ |

n is the number of individuals genotyped, H is the observed heterozygosity in % and χ² test is the Chi-square test for comparison of allele frequency in that population to allele frequency in the population in the present study.

Haplotypes analysis

Twenty-one haplotypes were inferred but only eight have a frequency greater than 1%

(Table IV). Cumulative frequency of these eight haplotypes exceeded 90% (0.959 from the values in Table IV). The first four

haplotypes (CTCAD, TATGI, CACAD, TACAD) are the most frequent and represent

87.4% of the haplotypes, suggesting that the five SNPs belong to a single block of LD.

The two most frequent haplotypes, CTCAD and TATGI, complement each other at all

sites. This phenomenon, known as 'Yin–Yang haplotypes', has been reported by Zhang

et al. (2003) for many genes in the human genome and seems to be a general characteristic

of eukaryotic genomes. According to the estimation of these authors, this phenomenon

would be present in 75–85% of the human genomic regions. The Yin–Yang phenomenon

was also observed in the ACE gene in the study of Keavney et al. (1998) for the 10 SNPs

studied.
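The two observations made above, the cumulative frequency of the four major haplotypes and the complementarity of the two most frequent ones, are easy to verify programmatically. In the sketch below the haplotype strings and frequencies are those reported in the text and Table IV; the helper names are hypothetical.

```python
SITES = ("ACE3", "ACE4", "ACE6", "ACE7", "ACE ID")

# Four most frequent haplotypes in the present study (frequencies from Table IV).
top_haplotypes = {"CTCAD": 0.295, "TATGI": 0.227, "CACAD": 0.201, "TACAD": 0.151}


def differing_sites(hap_a, hap_b, site_names=SITES):
    """Names of the sites at which two haplotypes carry different alleles."""
    if not (len(hap_a) == len(hap_b) == len(site_names)):
        raise ValueError("haplotypes and site list must have the same length")
    return [name for name, a, b in zip(site_names, hap_a, hap_b) if a != b]


def are_yin_yang(hap_a, hap_b):
    """True when the two haplotypes mismatch at every site."""
    return len(differing_sites(hap_a, hap_b)) == len(hap_a)


cumulative = sum(top_haplotypes.values())
print(f"cumulative frequency of top four: {cumulative:.3f}")
print("CTCAD vs TATGI Yin-Yang:", are_yin_yang("CTCAD", "TATGI"))
print("CTCAD vs CACAD differ at:", differing_sites("CTCAD", "CACAD"))
```

The first pair mismatches at all five sites, whereas CTCAD and CACAD differ only at ACE4, so only the former qualifies as a Yin–Yang pair.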

We have recomputed haplotype frequencies for the five SNPs studied here, from the data

of Keavney et al. (1998) on 10 SNPs. We noticed that the number of the observed

haplotypes in our population (21 haplotypes) is larger than that found in the study

of Keavney et al. (19 haplotypes). Fourteen haplotypes were common to both studies. It can

be seen from Table IV that the order of the four most frequent haplotypes is different in the

two studies. The most discordant haplotypes among those that are common are TATGI

(14% more frequent in the British population) and CACAD (12.5% more frequent in our

population), which are complementary for all SNPs except ACE4. Exact test of population

differentiation based on haplotype counts revealed a significant difference in haplotype

structure of the two populations (p = 0.006). Haplotype diversities (calculated as one minus

the sum of squares of haplotype frequencies) were, respectively, 0.7992 ± 0.0085 (± standard

error) and 0.7672 ± 0.0057, indicating that diversity is significantly larger in our population.

Many studies have, in fact, reported a greater genetic diversity among the old African

population than among Europeans (Gabriel et al. 2002). Due to its special geographic

location in the Mediterranean basin, Tunisia, whose indigenous inhabitants were African

Berbers, has been subject, during its recent history (from 814 BC to 1830), to many

conquests by several populations, including Phoenicians, Romans, Arabs, Vandals and

Ottomans (Julien 2003). This may have resulted in a particular genetic composition,

in between African and Caucasian populations.
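The text does not say which test underlies the statement that haplotype diversity is significantly larger in the Tunisian sample. One plausible reading, offered here purely as an assumption, is a normal-approximation comparison of the two diversity estimates using their standard errors. The function name is hypothetical.

```python
import math


def diversity_z_test(h1, se1, h2, se2):
    """Two-sided z-test for the difference between two independent estimates."""
    z = (h1 - h2) / math.sqrt(se1 ** 2 + se2 ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal tail area
    return z, p_value


# Haplotype diversities and standard errors quoted in the text.
z, p = diversity_z_test(0.7992, 0.0085, 0.7672, 0.0057)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

With the quoted values the difference is a little over three standard errors, which is consistent with the authors' conclusion under this approximation.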

Table IV. Haplotype frequencies of five SNPs in Tunisian and British population samples.

| Haplotype | Present study (n = 100) | Keavney et al. (1998) study (n = 555) | χ² test | p-value |
|---|---|---|---|---|
| CTCAD | 0.295 | 0.298 | 0.01 | 0.93 |
| TATGI | 0.227 | 0.360 | 13.84 | 0.0002 |
| CACAD | 0.201 | 0.076 | 30.57 | <10⁻⁶ |
| TACAD | 0.151 | 0.095 | 5.63 | 0.017 |
| CTCGI | 0.033 | 0.008 | 10.16 | 0.0014 |
| TACAI | 0.022 | 0.005 | 7.81 | 0.0051 |
| TACGI | 0.019 | 0.050 | 3.60 | 0.057 |
| CACAI | 0.011 | 0.003 | 2.37 | 0.123 |

χ² test is the Chi-square test for comparison of haplotype frequency between the present study and that of Keavney et al. (1998).

Linkage disequilibrium analysis

Table V gives the r² measures of pairwise LD (haplotypic and composite). Significant LD

was found for all SNP pairs. All p-values of the χ² test were <0.0001. The strongest LD was

observed between SNPs ACE7 and ACE ID (r² = 0.783). The correlation between

haplotypic and composite r² values is very high (r = 0.99, p < 0.001). This correlation

provides a measure of the precision of haplotype inference.

In Figure 2 we plotted the haplotypic LD measure r² against physical distance. The relation

between these two measures is non-linear. Spearman rank correlation was r_s = −0.50,

indicating that physical distance accounts for only part of the LD variation between loci

(a rank correlation of −0.50 corresponds to about 25% of the variance in ranks).

In particular, a strong LD is observed between two distant loci (ACE6 and ACE ID)

whereas a weak LD is observed between two close loci (ACE3 and ACE4). This relation

is similar to that found in genome-wide studies; for example, Abecasis et al. (2001) reported

a correlation of r_s = −0.49 from LD data between 127 SNPs in three genomic regions.

Fitting a polynomial quadratic model to our data allows us to predict that useful LD

(r² > 0.1) will exist on average between loci that are 8 kb apart in the ACE gene region.

We also calculated the LD between three adjacent loci based on the Bennett (1954)

coefficient. Among the combinations tested, only the SNPs ACE6–ACE7–ACE ID showed

a significant LD (p = 0.011), suggesting that those markers constitute a block of LD.

In fact, LD between three loci is rarely found in the human genome; a 5 kb region on

chromosome 12 is the only genome region that has been reported to harbour loci with high

three-point LD (Meng et al. 2003).

Figure 2. Relation between linkage disequilibrium and physical distance for the SNPs studied. The plot shows haplotypic LD (r², 0 to 1) against physical distance (0 to 16000 bases), with r_s = −0.5.

Table V. Gametic and composite linkage disequilibrium r² coefficients between five SNPs.

| SNP pair | Gametic r² | Composite r² |
|---|---|---|
| ACE3–ACE4 | 0.389 | 0.417 |
| ACE3–ACE6 | 0.354 | 0.379 |
| ACE3–ACE7 | 0.320 | 0.303 |
| ACE3–ACE ID | 0.366 | 0.315 |
| ACE4–ACE6 | 0.134 | 0.201 |
| ACE4–ACE7 | 0.114 | 0.134 |
| ACE4–ACE ID | 0.132 | 0.140 |
| ACE6–ACE7 | 0.675 | 0.706 |
| ACE6–ACE ID | 0.575 | 0.599 |
| ACE7–ACE ID | 0.783 | 0.785 |

See text for definition of r² coefficients.
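To make the distance analysis concrete, the sketch below recomputes a Spearman rank correlation between inter-SNP distance and gametic r², using the positions listed in Table I and the gametic coefficients of Table V. It is an illustration only: the authors' exact inputs are not stated, and with these published values the coefficient comes out near −0.61 rather than the reported −0.50. The helper names are hypothetical and the ranking routine assumes no ties.

```python
# Positions (bp, relative to the first base of exon 1) from Table I.
positions = {"ACE3": -3905, "ACE4": -240, "ACE6": 5489, "ACE7": 9618, "ACE ID": 11698}

# Gametic r^2 values from Table V.
gametic_r2 = {
    ("ACE3", "ACE4"): 0.389, ("ACE3", "ACE6"): 0.354, ("ACE3", "ACE7"): 0.320,
    ("ACE3", "ACE ID"): 0.366, ("ACE4", "ACE6"): 0.134, ("ACE4", "ACE7"): 0.114,
    ("ACE4", "ACE ID"): 0.132, ("ACE6", "ACE7"): 0.675, ("ACE6", "ACE ID"): 0.575,
    ("ACE7", "ACE ID"): 0.783,
}


def ranks(values):
    """Ranks starting at 1; assumes all values are distinct."""
    order = sorted(range(len(values)), key=values.__getitem__)
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman(x, y):
    """Spearman rank correlation for untied data."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


pairs = list(gametic_r2)
distances = [abs(positions[a] - positions[b]) for a, b in pairs]
r2_values = [gametic_r2[pair] for pair in pairs]

print("largest distance (bp):", max(distances))
print("Spearman r_s:", round(spearman(distances, r2_values), 3))
```

The largest pairwise distance, between ACE3 and ACE ID, is 15603 bp, which matches the 15.6 kb region quoted in the abstract.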

Identification of key SNPs

In order to find the most important SNPs among the five studied, we first calculated the

number of representative SNPs based on the formula of Nyholt (2004):

$$N_{\mathrm{eff}} = 1 + (L - 1)\left(1 - \frac{\mathrm{Var}(\lambda)}{L}\right)$$

where L is the total number of SNPs and Var(λ) is the variance of the eigenvalues of the matrix

of pairwise LD coefficients. We found N_eff = 3.4, indicating that three SNPs among the five

studied might be enough to represent the gene region studied. In order to identify the three

most representative SNPs we used the method of Lin and Altman (2004), which eliminates

one by one the least informative SNPs, i.e. those SNPs that are associated (after varimax

rotation) with the eigenvectors having the smallest eigenvalues of the LD matrix. The two

least important eigenvectors have eigenvalues of 0.01 and 0.22 and are, respectively,

associated with ACE7 and ACE6 SNPs. The key SNPs that are identified by this approach

are thus ACE3, ACE4 and ACE ID. It is appealing that the two SNPs removed are those

located in coding exons (but are non-synonymous) of the gene while two of the three kept

are located in the 5′ regulatory region. ACE ID polymorphism, located in the middle of the

gene (intron 16), is known as being one of the most important markers of the ACE gene.

Many studies have associated this polymorphism with the levels of circulating enzyme

or cardiovascular pathophysiologies (see the Genetic Association Database web site,

http://geneticassociationdb.nih.gov). The most important marker (associated with the

highest eigenvalue, 3.46) identified in this study is ACE3 located in the 5′ regulatory

region of the gene, at about 4 kb upstream of the first codon of the gene. This SNP may

become a key marker for future association studies.
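The effective number of SNPs defined above can be evaluated without an explicit eigendecomposition. For an L × L correlation matrix, the eigenvalues sum to L and their squares sum to the sum of all squared entries, so the sample variance of the eigenvalues equals 2 Σ r²_ij / (L − 1), where the sum runs over the pairs i < j. The sketch below applies this identity to the pairwise r² values of Table V. It assumes the LD matrix is a correlation matrix whose squared off-diagonal entries are those r² values, which the text does not state explicitly, and the function name is hypothetical.

```python
def nyholt_neff(pairwise_r2, n_snps):
    """Effective number of SNPs from the squared pairwise correlations.

    pairwise_r2: r^2 for every unordered SNP pair (i < j).
    Uses Var(lambda) = 2 * sum(r^2) / (L - 1), valid for a correlation matrix.
    """
    values = list(pairwise_r2)
    if len(values) != n_snps * (n_snps - 1) // 2:
        raise ValueError("expected one r^2 value per unordered SNP pair")
    var_lambda = 2.0 * sum(values) / (n_snps - 1)
    return 1 + (n_snps - 1) * (1 - var_lambda / n_snps)


# Pairwise r^2 values from Table V, in the order listed there.
gametic = [0.389, 0.354, 0.320, 0.366, 0.134, 0.114, 0.132, 0.675, 0.575, 0.783]
composite = [0.417, 0.379, 0.303, 0.315, 0.201, 0.134, 0.140, 0.706, 0.599, 0.785]

print("N_eff (gametic r^2):  ", round(nyholt_neff(gametic, 5), 2))
print("N_eff (composite r^2):", round(nyholt_neff(composite, 5), 2))
```

Under these assumptions the composite coefficients give about 3.41 and the gametic ones about 3.46, both in line with the value of 3.4 reported in the text. The limiting cases behave as expected: uncorrelated SNPs give N_eff = L and perfectly correlated SNPs give N_eff = 1.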

Haplotype diversity calculated with these three SNPs is close to its value calculated with

all five SNPs (0.762 vs. 0.799), indicating that the loss of information is minor.

Conclusion

We have studied the haplotype structure of the ACE gene in a sample of the Tunisian

population using five SNPs within this gene: ACE3 (rs1800764), ACE4 (rs4291), ACE6

(rs4309), ACE7 (rs4331) and ACE ID (rs4340).

The statistical analysis of the genotypes of these SNPs allowed us to show that there is

a significant difference in haplotype distribution between our population and the British

population studied by Keavney et al. (1998). The Yin–Yang phenomenon was found in both

studies.

Strong LD was found among all SNPs studied, covering a region of about 16 kb within the

ACE gene. Among these, three SNPs were identified as key SNPs based on a simple

statistical procedure: ACE3 (rs1800764), ACE4 (rs4291) and ACE ID (rs4340). These key

SNPs will be very valuable for future effective association studies of the ACE gene

polymorphisms with diseases or pharmacogenetic studies in the Tunisian and similar North

African populations.

Acknowledgements

This work was supported by the Ministry of Scientific Research, Technology and Human

Resources, Tunisia.

References

Abecasis G, Cookson WO. 2000. GOLD—graphical overview of linkage disequilibrium. Bioinformatics

16:182–183.

Abecasis GR, Noguchi E, Heinzmann A, Traherne JA, Bhattacharyya S, Leaves NI, Anderson GG, Zhang Y,

Lench NJ, Carey A, et al. 2001. Extent and distribution of linkage disequilibrium in three genomic regions.

Am J Hum Genet 68:191–197.

Allen-Brady K, Camp NJ. 2005. Characterization of the linkage disequilibrium structure and identification of

tagging-SNPs in five DNA repair genes. BMC Cancer 5:1–10.

Bennett JH. 1954. On the theory of random mating. Ann Eugen 18:311–317.

Camp NJ, Swensen J, Horne BD, Farnham JM, Thomas A, Cannon-Albright LA, Tavtigian SV. 2005.

Characterization of linkage disequilibrium structure, mutation history, and tagging SNPs, and their use in

association analyses: ELAC2 and familial early-onset prostate cancer. Genet Epidemiol 28:232–243.

Gabriel SB, Schaffner SF, Nguyen H, Moore JM, Roy J, Blumenstiel B, Higgins J, DeFelice M, Lochner A,

Faggart M, et al. 2002. The structure of haplotype blocks in the human genome. Science 296:2225–2229.

International HapMap Consortium. 2003. The International HapMap Project. Nature 426:789–796.

International HapMap Consortium. 2005a. A haplotype map of the human genome. Nature 437:1299–1320.

International HapMap Consortium. 2005b. International HapMap Project [Internet]. International HapMap

Consortium. Available online at: http://www.hapmap.org/, accessed 23 November 2005.

Johnson GC, Esposito L, Barratt BJ, Smith AN, Heward J, Di Genova G, Ueda H, Cordell HJ, Eaves IA,

Dudbridge F, et al. 2001. Haplotype tagging for the identification of common disease genes. Nat Genet

29:233–237.

Julien CA. 2003. Histoire de l'Afrique du Nord: des origines à 1830 (livre 1). Ceres Eds, p 483.

Keavney B, McKenzie CA, Connell JMC, Julier C, Ratcliffe PJ, Eric S, Lathrop M, Farrall M. 1998. Measured

haplotype analysis of the angiotensin-I-converting enzyme gene. Hum Mol Genet 7:1745–1751.

Lin Z, Altman RB. 2004. Finding haplotype tagging SNPs by use of principal components analysis. Am J Hum

Genet 75:850–861.

Lindpaintner K, Pfeffer MA, Kreutz R, Stampfer MJ, Frodstein F, LaMotte F, Buring J, Hennekens H. 1995.

A prospective evaluation of an angiotensin-converting-enzyme gene polymorphism and the risk of ischemic

heart disease. N Engl J Med 332:706–711.

Meng Z, Zaykin DV, Xu CF, Wagner M, Ehm MG. 2003. Selection of genetic markers for association analyses,

using linkage disequilibrium and haplotypes. Am J Hum Genet 73:115–130.

Mueller JC, Lohmussaar E, Magi R, Remm M, Bettecken T, Lichtner P, Biskup S, Illig T, Pfeufer A,

Luedemann J, et al. 2005. Linkage disequilibrium patterns and tagSNP transferability among European

populations. Am J Hum Genet 76:387–388.

National Center for Biotechnology Information. 2005a. dbSNP, Single Nucleotide Polymorphism Database

[Internet]. National Center for Biotechnology Information, Bethesda, USA. Available online at:

http://www.ncbi.nlm.nih.gov/SNP/, accessed 23 November 2005.

National Center for Biotechnology Information. 2005b. Genetic Association Database [Internet]. National Center

for Biotechnology Information, Bethesda, USA. Available online at: http://geneticassociationdb.nih.gov/, accessed

23 November 2005.

Nejentsev S, Godfrey L, Snook H, Rance H, Nutland S, Walker NM, Lam AC, Guja C, Ionescu-Tirgoviste C,

Undlien DE, et al. 2004. Comparative high-resolution analysis of linkage disequilibrium and tag single

nucleotide polymorphism between populations in the vitamin D receptor gene. Hum Mol Genet

13:1633–1639.

Niu T, Chen X, Xu X. 2002. Angiotensin converting enzyme gene insertion/deletion polymorphism and

cardiovascular disease. Drugs 62:977–993.

Nyholt DR. 2004. A simple correction for multiple testing for single-nucleotide polymorphisms in linkage

disequilibrium with each other. Am J Hum Genet 74:765–769.

Patil N, Berno AJ, Hinds DA, Barrett WA, Doshi JM, Hacker CR, Kautzer CR, Lee DH, Marjoribanks C,

McDonough DP, et al. 2001. Blocks of limited haplotype diversity revealed by high-resolution scanning of

human chromosome 21. Science 294:1719–1723.

Raymond M, Rousset F. 1995. Population genetics software for exact tests and ecumenicism. J Hered 86:248–249.

Scharplatz M, Puhan MA, Steurer J, Bachmann LM. 2004. What is the impact of the ACE gene insertion/deletion

(I/D) polymorphism on the clinical effectiveness and adverse events of ACE inhibitors?—Protocol of

a systematic review. BMC Med Genet 5:1–6.

Schmidt C. 2005. Latest HapMap update aims to direct researchers to genetic basis of disease. J Natl Cancer Inst

97:1638–1640.

Stephens M, Donnelly P. 2003. A comparison of bayesian methods for haplotype reconstruction from population

genotype data. Am J Hum Genet 73:1162–1169.

Stephens M, Smith NJ, Donnelly P. 2001. A new statistical method for haplotype reconstruction from population

data. Am J Hum Genet 68:978–989.

Weir BS. 1996. Genetic data analysis. Sunderland, MA: Sinauer Associates.

Yoshida H, Mitarai T, Kawamura T, Kitajima T, Miyazaki Y, Nagasawa R, Kawaguchi Y, Kubo H, Ichikawa I,

Sakai O. 1995. Role of the deletion of polymorphism of the angiotensin converting enzyme gene in the

progression and therapeutic responsiveness of IgA nephropathy. J Clin Invest 96:2162–2169.

Zhang J, Rowe WL, Clark AG, Buetow KH. 2003. Genomewide distribution of high-frequency, completely

mismatching SNP haplotype pairs observed to be common across human populations. Am J Hum Genet

73:1073–1081.

Zhu X, Bouzekri N, Southam L, Cooper RS, Adeyemo A, McKenzie CA, Luke A, Chen G, Elston RC, Ward R.

2001. Linkage and association analysis of angiotensin I converting enzyme (ACE) gene polymorphisms with

ACE concentration and blood pressure. Am J Hum Genet 68:1139–1148.

Zhu X, McKenzie CA, Forrester T, Nickerson DA, Broeckel U, Schunkert H, Doering A, Jacob HJ, Cooper RS,

Rieder MJ. 2000. Localization of a small genomic region associated with elevated ACE. Am J Hum Genet

67:1144–1153.

Resume. Arriere plan: Le gene de l’enzyme de conversion de l’angiotensine (ECA) peut etrepotentiellement implique dans l’etiologie de plusieurs maladies courantes. L’etude de la structure de sonhaplotype est donc utile pour la diagnose comme pour les recherches pharmacogenomiques.Objectif: Etudier le profile haplotypique des polymorphismes de nucleotides uniques (PNU) dans le geneECA de la population tunisienne et le comparer a d’autres populations. Trois PNU importants pourleur emploi dans les futures etudes d’association ont ete identifies.Sujets et methodes: Cinq PNU (rs1800764, rs4291, rs4309, rs4331, rs4340) couvrant une region de 15,6kb du gene ECA ont ete types par PCR-RFLP dans un echantillon de 100 personnes en bonne sante.Resultats: Tous les PNU etaient polymorphiques et en equilibre de Hardy-Weinberg. Un total de 21haplotypes ont ete identifies, mais seulement huit avaient une frequence superieure a 1%. Les quatrehaplotypes les plus communs avaient une frequence cumulee de 87,4%. On a rencontre le phenomene‘‘Yin-Yang ’’ (les deux haplotypes majeurs sont complementaires a tous les sites). Le desequilibre delinkage entre toutes les paires de loci etait hautement significatif (p<10–5). Une methode statistiquesimple et efficace a ete utilisee pour identifier trois PNU importants.Conclusion: La population tunisienne presente une structure haplotypique differente de celle deseuropeens pour le gene ECA et trois PNU importants ont ete identifies. Ils seront tres utiles pour lesfutures etudes d’association dans les populations tunisienne et nord-africaines.

Zusammenfassung. Background: The angiotensin-converting enzyme (ACE) is a candidate gene in the aetiology of various common diseases. The study of the haplotype structure of this gene is of interest in diagnosis and in pharmacogenomics. Aim: The study investigated the haplotype profile of single nucleotide polymorphisms (SNPs) within the ACE gene in the Tunisian population and compared it with other populations. Three SNPs of importance for later use in association studies were identified. Subjects and methods: Five SNPs (rs1800764, rs4291, rs4309, rs4331, rs4340), spanning a 15.6 kb region of the ACE gene, were typed by PCR-RFLP in a sample of 100 healthy subjects. Results: All SNPs were polymorphic and in Hardy–Weinberg equilibrium. A total of 21 haplotypes were identified, but only eight occurred at a frequency above 1%. The four most common haplotypes had a cumulative frequency of 87.4%. The 'Yin–Yang' phenomenon was found (the two major haplotypes are complementary at all sites). Linkage disequilibrium between each pair of loci was highly significant (p < 10⁻⁵). A simple and efficient statistical procedure was used to identify three important SNPs. Conclusion: The Tunisian population showed a haplotype structure of the ACE gene different from the European one, and three important SNPs were identified. These will be helpful for later association studies in Tunisian and other North African populations.

Resumen. Background: The angiotensin-converting enzyme (ACE) is a candidate gene in the aetiology of several common diseases. The study of the haplotype structure of this gene is of interest in diagnosis and pharmacogenomics. Aim: The study investigated the haplotype profile of single nucleotide polymorphisms (SNPs) within the ACE gene in the Tunisian population and compared it with other populations. Three SNPs important for future use in association studies were identified. Subjects and methods: Five SNPs (rs1800764, rs4291, rs4309, rs4331, rs4340), covering a 15.6 kb region of the ACE gene, were typed by PCR-RFLP in a sample of 100 healthy subjects. Results: All SNPs were polymorphic and in Hardy–Weinberg equilibrium. A total of 21 haplotypes were identified, but only eight had a frequency above 1%. The four most common haplotypes had a cumulative frequency of 87.4%. The 'Yin–Yang' phenomenon (the two major haplotypes are complementary at all sites) was observed. Linkage disequilibrium between all pairs of loci was highly significant (p < 10⁻⁵). A simple and efficient statistical procedure was used to identify three important SNPs. Conclusion: The Tunisian population showed a haplotype structure different from the European one for the ACE gene, and three important SNPs were identified, which will be of great help in future association studies in Tunisian and North African populations.
