
Send Orders for Reprints to [email protected]

Combinatorial Chemistry & High Throughput Screening, 2013, 16, 000-000 1

1386-2073/13 $58.00+.00 © 2013 Bentham Science Publishers

Pharmacophore Design, Virtual Screening, Molecular Docking and Optimization Approaches to Discover Potent Thrombin Inhibitors

Chandrasekaran Loganathan¹, Sugunadevi Sakkiah², Keun Woo Lee², Senthamaraikannan Kabilan¹ and Chandrasekaran Meganathan*,³

¹Department of Chemistry, Annamalai University, Annamalainagar, Chidambaram, Tamilnadu, India

²Division of Applied Life Science (BK21 Program), Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Centre (PMBBRC), Research Institute of Natural Science (RINS), Gyeongsang National University (GNU), 501 Jinju-daero, Gazha-dong, Jinju 660 701, Republic of Korea

³Department of Physics, GKM College of Engineering and Technology, Perungalathur, Chennai, Tamilnadu, India

Abstract: Thrombin plays a key role in the regulation of hemostasis and thrombosis. Inhibition of thrombin is therefore an effective therapeutic strategy to prevent the formation of blood clots and related thromboembolic disorders. Hence, we have developed chemical feature based pharmacophore models of thrombin inhibitors. The best hypothesis, Hypo1, is characterized by two hydrogen bond acceptors (A), one hydrophobic (H) and one ring aromatic (R) feature. Hypo1 was cross validated using several techniques to assess its validity and statistical significance. The validated model Hypo1 was used as a 3D query to perform virtual screening. The hits obtained from virtual screening were filtered by applying drug-likeness filters and molecular docking studies. Finally, 4 compounds were obtained as drug-like leads based on scoring functions, binding modes and molecular interactions at the active site. These 4 molecules were further optimized by adding different substituents to their side chains. Compared with the original database hits, the optimized molecules showed higher scores, good binding modes and favorable molecular interactions. Hence, we suggest that, upon optimization, these four database hits can act as potential virtual leads for designing novel thrombin inhibitors. Our model could also be useful for retrieving structurally diverse compounds from various databases.

Keywords: Common feature and ligand based pharmacophore, molecular docking, thrombin, thrombosis, virtual screening.

1. INTRODUCTION

In the human circulatory system, vascular injury causes a small leakage of blood. Fortunately, the blood itself has a repair mechanism, blood coagulation, to recover from this damage. Blood coagulation is the ability of the body to control the flow of blood after vascular injury. The process comprising blood clotting, subsequent dissolution of the clot and repair of the injured tissue is termed hemostasis. A number of serine proteases, such as factor Xa and thrombin, are involved in the coagulation cascade that leads to clot formation. Thrombin is a multifunctional serine protease (EC 3.4.21.5) and a key enzyme in the blood coagulation cascade [1] as well as a promising target in medicinal chemistry [2]. It is the final common mediator of both the intrinsic and extrinsic coagulation pathways: it triggers platelet activation and the production of factors V, VIII and IX, and it mediates the proteolytic cleavage of fibrinogen to fibrin, which forms the fibrin gel of a hemostatic plug or a pathological thrombus [2, 3]. It thus plays a central role in blood clotting and wound healing [4]. Upon initiation of the coagulation process, prothrombin is converted into active human α-thrombin by factor Xa in the presence of factor Va, phospholipid and calcium.

*Address correspondence to this author at the Department of Physics, GKM College of Engineering and Technology, Perungalathur, Chennai, Tamilnadu, India; Tel: +919443613379; E-mail: [email protected]

Human α-thrombin consists of two chains, an NH2-terminal chain of 36 amino acids (6 kDa) and a COOH-terminal chain of 259 amino acids (31 kDa), which are covalently joined through a single disulfide bond [2]. Thrombin also activates factors XI and XIII [5]. This positive feedback stimulates the production of thrombin as well as the formation of covalent bonds between lysine and glutamine residues in fibrin, which could increase the stability of the fibrin clot [5].

Under normal physiological conditions, blood balances the coagulation and fibrinolysis mechanisms. Nevertheless, a pathogenic imbalance in the coagulation cascade can lead to thrombosis in humans. Since thrombus formation underlies many inflammatory responses in the microvascular endothelium, thrombin has become a key factor in thromboembolic disorders such as ischemic stroke and pulmonary infarction [6]. Inhibition of thrombin is therefore an effective therapeutic means to prevent the formation of blood clots and related thromboembolic disorders. For the past seven decades, various agents that attenuate the activity of thrombin, such as warfarin and heparin, have been used as antithrombotic drugs, including indirect and direct thrombin inhibitors. Warfarin is involved in a number of drug-drug interactions, which can be problematic in many cases [7]. Heparin and its low molecular weight derivatives, on the other hand, act by activating antithrombin, which then indirectly inhibits the trypsin-like serine protease thrombin [8]. However, the therapeutic indices of these drugs are low, and heparin is associated with many side effects, including an enhanced risk of bleeding [9]. Since heparin originates from animals, it might carry viral or other potentially infectious agents over to patients. Moreover, incidents such as oversulfated chondroitin sulfate contamination of unfractionated heparin preparations have highlighted the need to develop a new kind of inhibitor without such side effects [10].

2 Combinatorial Chemistry & High Throughput Screening, 2013, Vol. 16, No. 9 Loganathan et al.

Hence, our current goal is to generate chemical feature based pharmacophore models from known thrombin inhibitors, which could help medicinal chemists design new classes of potent and selective thrombin inhibitors. Common feature and ligand based pharmacophore models have been developed using Discovery Studio v2.5 (DS). The key chemical features of known thrombin inhibitors, collected from the binding database, were given as input and used to develop the ligand based pharmacophore model. The developed models were evaluated using a cost function to select the best model. The best model was then cross validated by Fischer’s randomization technique and evaluated for its predictive ability using test and decoy sets. The validated pharmacophore model was further used as a 3D query for virtual screening. Subsequently, the hit compounds were refined by applying a maximum fit value, Lipinski’s rule of five, ADME properties and molecular docking. Furthermore, de novo evolution was carried out on the database hit compounds by adding different substituents to their side chains. Our model is expected to reflect the structure-activity relationship of thrombin inhibitors correctly and to help identify novel potential antagonists of thrombin activity.

2. MATERIALS AND METHODS

2.1. Data Set Preparation

The choice of training set compounds is an important factor in 3D pharmacophore model generation for obtaining a model with high predictability and statistical significance. The activity data of the training set must be widely populated and structurally diverse, and the activity range should span at least four orders of magnitude. A minimum of sixteen structurally diverse compounds should be present to avoid correlation difficulties. The most active molecules were included in the training set, and all biologically relevant data were obtained by equivalent inhibitory assays. A total of 40 compounds tested for thrombin activity (IC50, μM) [11] were selected from the binding database using PubChem BioAssay (2008). Among these, 21 structurally diverse compounds spanning an activity range of four orders of magnitude were carefully selected for the training set to achieve a reasonable pharmacophore model in terms of predictive ability and statistical significance. Moderately active and inactive compounds were also included to make the activity range as broad as possible. The remaining 19 compounds were used as a test set to validate the generated pharmacophore model. 2D structures of all compounds were built with MDL-ISIS Draw v2.5, converted into 3D form using DS, and minimized to the closest local minimum using the CHARMm-like force field implemented in DS [12]. Conformations were generated for each molecule by applying the Best conformation analysis method with the poling algorithm [13] and CHARMm force field parameters [12]. The maximum number of conformations was limited to 255 with an energy constraint of 20 kcal/mol.
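The two conformer limits stated above (at most 255 conformers, all within 20 kcal/mol) can be illustrated with a minimal Python sketch. It is not the DS poling algorithm: the function name `select_conformers` and the input energies are hypothetical, and the 20 kcal/mol constraint is assumed to be measured from the lowest-energy conformer.

```python
def select_conformers(energies, window=20.0, max_count=255):
    """Pick conformers inside an energy window above the lowest-energy one.

    energies  : conformer energies in kcal/mol (hypothetical input)
    window    : allowed energy above the minimum (20 kcal/mol in the text)
    max_count : cap on the number of conformers kept (255 in the text)
    Returns indices into `energies`, lowest energy first.
    """
    if not energies:
        return []
    e_min = min(energies)
    # Keep every conformer whose energy lies within the window.
    kept = [i for i, e in enumerate(energies) if e - e_min <= window]
    # Prefer low-energy conformers when the cap is reached.
    kept.sort(key=lambda i: energies[i])
    return kept[:max_count]


if __name__ == "__main__":
    demo = [12.5, 3.0, 40.0, 22.5, 25.0]  # made-up energies, kcal/mol
    print(select_conformers(demo))
```

A real conformer generator would also enforce structural diversity, which this filter does not attempt.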

2.2. Pharmacophore Generation

To generate a pharmacophore model, one should first recognize the important chemical features of the potent compounds, which provide valuable information for quantitative pharmacophore modelling [14]. To identify the critical chemical features, a common feature pharmacophore model (Hip-Hop) was first generated from the 5 most active compounds collected from the literature [15]. While generating the HypoGen model, the minimum and maximum counts of the selected features were set to 0 and 5, respectively. During the HypoGen computing process, hypotheses were created in three phases: a constructive phase, a subtractive phase and an optimization phase. In the constructive phase, pharmacophore models common to the active molecules were identified and kept. In the subtractive phase, models that also fit the inactive molecules were removed. In the optimization phase, small perturbations were applied to the models: random translations of features, rotations of vectored features, and the removal or addition of features. Each perturbation was evaluated on the basis of cost values, which are discussed later, and the top ten unique pharmacophore models were finally exported. The scheme of this study is given in Fig. (1).

2.3. Evaluation of Pharmacophore Model

During the evaluation process, DS exported the ten top-scoring hypotheses from the training set; we carefully examined all of them and chose the best hypothesis for further processing. The ‘Debnath method’ [16] was used to assess the quality of the generated pharmacophore models in terms of the cost function, and other statistical parameters were computed by the HypoGen algorithm during hypothesis generation. The best hypothesis should be characterized by the highest cost difference, a high correlation coefficient and the lowest RMS, and its total cost should be close to the fixed cost and far from the null cost. Based on these parameters, the best hypothesis was selected and validated by Fischer’s randomization method [17] and by test and decoy sets. The main objective of validating a pharmacophore model is to demonstrate whether the selected model is able to identify active compounds and estimate their activity values accurately. Fischer’s randomization test was used to validate the statistical relevance of the pharmacophore model. In this technique, the activity data of the training set molecules are randomized and pharmacophore hypotheses are generated using the same features and parameters as for the original hypothesis. For the 95%, 98% and 99% confidence levels, 19, 49 and 99 random spreadsheets are generated, respectively. In our study, 19 spreadsheets were generated to achieve the 95% confidence level. The statistical significance can be calculated using the equation


$$\text{Significance} = \left[1 - \frac{1 + x}{y}\right] \times 100$$

where ‘x’ is the total number of hypotheses having a total cost lower than the best significant hypothesis and ‘y’ is the total number of initial HypoGen runs plus random runs. Two other methods were used to validate the predictability of the best pharmacophore model (Hypo1). In the first method, Hypo1 was validated using a test set of 19 known thrombin inhibitors that were not included in the training set but were prepared in the same way as the training set. In the second method, the model was validated using a decoy set containing known thrombin inhibitors together with compounds of unknown activity. The enrichment factor (EF), goodness of hit (GH), false positives and false negatives were calculated in order to ascertain the robustness of Hypo1 [18].
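The decoy-set metrics named above are not defined in this section. The Python sketch below uses the definitions commonly given for the enrichment factor and the Güner-Henry goodness of hit score, which is an assumption about what the authors computed. The symbols `d` (database size), `a` (actives in the database), `ht` (total hits) and `ha` (active hits) are illustrative names, and the example numbers are hypothetical, not results from this study.

```python
def enrichment_factor(ha, ht, a, d):
    """EF = (Ha / Ht) / (A / D): hit-list active rate over database active rate."""
    return (ha / ht) / (a / d)


def goodness_of_hit(ha, ht, a, d):
    """GH = [Ha (3A + Ht) / (4 Ht A)] * [1 - (Ht - Ha) / (D - A)]."""
    yield_term = ha * (3 * a + ht) / (4 * ht * a)
    penalty = 1 - (ht - ha) / (d - a)
    return yield_term * penalty


def false_negatives(ha, a):
    """Actives present in the database but missed by the query."""
    return a - ha


def false_positives(ha, ht):
    """Retrieved hits that are not known actives."""
    return ht - ha


if __name__ == "__main__":
    # Hypothetical decoy-set counts, for illustration only.
    d, a, ht, ha = 1000, 20, 25, 18
    print("EF =", round(enrichment_factor(ha, ht, a, d), 2))
    print("GH =", round(goodness_of_hit(ha, ht, a, d), 3))
    print("false negatives =", false_negatives(ha, a))
    print("false positives =", false_positives(ha, ht))
```

On this definition, a GH score close to 1 indicates a query that retrieves most actives with few decoys.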

2.4. Virtual Screening

The validated pharmacophore model was then used for virtual screening, a computer based method for retrieving potent molecules from databases on the basis of their structures [19]. The primary objective of virtual screening is to separate out compounds on the basis of their chemical features and thereby reduce the amount of work needed to identify new lead molecules [20]. In our study, two databases, NCI [21] and Maybridge [22], were employed. For each molecule, 100 conformers were generated using the Fast Flexible module implemented in DS with a maximum energy threshold of 20 kcal/mol. The best pharmacophore was used as a query to search the databases with the Fast Flexible search in DS. The retrieved compounds were then filtered by applying a maximum fit value, Lipinski’s rule of five and ADME properties to retain drug-like compounds.
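The Lipinski filter mentioned above can be sketched in a few lines of Python. The sketch assumes the usual statement of the rule (molecular weight at most 500, logP at most 5, at most 5 hydrogen bond donors and at most 10 acceptors) and takes precomputed descriptors as input. The `Descriptors` record, the `max_violations` parameter and the example molecules are hypothetical and are not taken from the DS workflow used in the study.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one screening hit (illustrative record)."""
    name: str
    mol_weight: float   # g/mol
    logp: float         # octanol/water partition coefficient
    h_donors: int       # hydrogen bond donors
    h_acceptors: int    # hydrogen bond acceptors


def lipinski_violations(d):
    """Count how many of the four rule-of-five criteria are violated."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d, max_violations=0):
    """True if the compound has at most `max_violations` violations."""
    return lipinski_violations(d) <= max_violations


if __name__ == "__main__":
    hits = [
        Descriptors("hit_A", 412.5, 3.2, 2, 6),    # made-up values
        Descriptors("hit_B", 655.8, 6.1, 4, 12),   # made-up values
    ]
    kept = [h.name for h in hits if passes_rule_of_five(h)]
    print(kept)
```

Some workflows tolerate one violation; setting `max_violations=1` reproduces that looser variant.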

2.5. Molecular Docking Protocol

Molecular docking is a computational tool to investigate the proper binding modes between protein and ligand.

LigandFit, a docking program implemented in DS, was used in our docking study. The crystal structure of thrombin in complex with -phenyl-D-phenylalanyl-N-propyl-L-prolinamide (PDB ID: 3DA9) was taken directly from the Protein Data Bank (http://www.pdb.org). During the preparation of the protein for docking, all water molecules were removed and hydrogen atoms were added by applying the CHARMm force field. The Fast Flexible method was selected to generate conformations by Monte Carlo simulation. The grid resolution, RMS threshold and score threshold were set to 0.5 Å, 2.0 Å and 20 kcal/mol, respectively, to avoid identical conformations. Scoring functions implemented in DS, including LigScore1, LigScore2, Piecewise Linear Potential (-PLP1, -PLP2), JAIN, Potential of Mean Force (-PMF, -PMF04) and LUDI, were calculated to evaluate and rank each of the saved poses.

2.6. Optimization

The database hit compounds were further optimized by adding different substituents to their side chains. The optimized compounds were docked into the thrombin active site using the LigandFit module with the same protocol as for the original database hits. The final hits were chosen based on scoring, binding mode and molecular interactions with crucial active site residues. The synthetic accessibility of the optimized compounds was calculated using the SYLVIA 1.0 program [23]. SYLVIA calculates the synthetic accessibility of a target structure as a weighted sum of five components: 1) the size, symmetry, branching, rings, multiple bonds and heteroatoms of the target molecule, 2) the ring complexity, 3) the stereochemical complexity, 4) the similarity to available starting materials and 5) the retrosynthetic reaction fitness. SYLVIA scores range from 1 for easily synthesized compounds to 10 for compounds that are not easily synthesized. Finally, the novelty of the compounds was checked using the SciFinder Scholar [24] and PubChem [25] search tools.

Fig. (1). Scheme of this study.


3. RESULTS AND DISCUSSION

3.1. Common Feature and Ligand Based Pharmacophore Models

Before generating the common feature and ligand based pharmacophore models, a dataset of structurally diverse thrombin inhibitors was prepared. For common feature hypothesis generation (Hip-Hop), 5 highly active compounds are selected from the literature as a training set [15]. For the ligand based pharmacophore models, 40 structurally diverse inhibitors are collected from the binding database. These 40 molecules are divided into two sets: 1) a training set (21 molecules) to generate the pharmacophore models (Fig. 2) and 2) a test set (19 molecules) to validate the generated pharmacophore model (Fig. 3). In the common feature pharmacophore generation phase, the most active molecule is considered as the reference compound. Thus, among the 5 training set molecules, the reference compound is specified with a ‘principal’ value of 2 and a ‘MaxOmitFeat’ value of 0. The ‘principal’ and ‘MaxOmitFeat’ values for the remaining 4 compounds are set to 1. The best Hip-Hop model (Fig. 4a), which has the highest ranking score of 70.03 (Table 1), contains seven features: two ring aromatic (R), two hydrophobic (H), two hydrogen bond donors (D), and one hydrogen bond acceptor (A). This suggests that these features are essential for potent thrombin inhibitors. The best Hip-Hop model aligned with the most active compound is depicted in Fig. (4b).

Fig. (2). 2D chemical structures of the thrombin inhibitors in the training set together with their biological activities (IC50 values, μM) for the HypoGen run. [Structure drawings not reproduced.] Compounds and IC50 values: 1 (0.001 μM), 2 (0.001 μM), 3 (0.004 μM), 4 (0.005 μM), 5 (0.006 μM), 6 (0.013 μM), 7 (0.016 μM), 8 (0.019 μM), 9 (0.023 μM), 10 (0.075 μM), 11 (0.089 μM), 12 (0.29 μM), 13 (0.54 μM), 14 (0.69 μM), 15 (0.85 μM), 16 (0.94 μM), 17 (1.1 μM), 18 (1.2 μM), 19 (1.2 μM), 20 (1.3 μM), 21 (2.3 μM).

Fig. (3). 2D chemical structures of the thrombin inhibitors in the test set together with their biological activities (IC50 values, μM) used to validate Hypo1. [Structure drawings not reproduced.] Compounds and IC50 values: 22 (0.004 μM), 23 (0.005 μM), 24 (0.083 μM), 25 (0.082 μM), 26 (0.023 μM), 27 (0.002 μM), 28 (0.003 μM), 29 (0.268 μM), 30 (0.158 μM), 31 (0.1 μM), 32 (0.915 μM), 33 (0.72 μM), 34 (0.59 μM), 35 (0.3 μM), 36 (9.56 μM), 37 (2.767 μM), 38 (2.14 μM), 39 (2.95 μM), 40 (3.04 μM).


Based on the information from the Hip-Hop model, the A, D, R and H features are considered important chemical features for developing the ligand based pharmacophore models. In the HypoGen run, the top ten hypotheses are exported on the basis of the activities of the 21 structurally diverse training set molecules. The statistical parameters of the ten best hypotheses are given in Table 2. The quality of a hypothesis is determined using the Debnath method [16] in terms of cost functions: fixed cost, null cost, and total cost. The total cost of a hypothesis is the sum of three cost components (error (E), weight (W), and configuration (C)), each multiplied by a coefficient (the default coefficient is 1.0 for each). The fixed cost represents the simplest model that fits the data perfectly. The null cost represents the cost of a hypothesis with no features that estimates every activity as the average activity [26]. A difference of 40-60 bits between the fixed cost and the null cost could imply a 75-90% probability of correlation between the experimental and predicted activity values [27]. In our study, the first hypothesis, Hypo1, comprises four chemical features, two A, one H and one R (Fig. 5), and it has the highest cost difference (86.19), the lowest root mean square (RMS) value (1.21 Å) and the highest correlation coefficient (0.93). The fixed cost, total cost and null cost values are 79.03, 96.90 and 183.09, respectively. Thus, the total cost of the pharmacophore is far from the null cost and very close to the fixed cost. This demonstrates that the error, weight and configuration cost components are very low and not deterministic to the model. From the above criteria, it is shown that the model (Hypo1) describes all the pharmacophore features and has good predictive power. Hypo1 is aligned with the most active compound (compound 1, IC50 = 0.001 μM) and the least active compound (compound 21, IC50 = 2 μM) of the training set (Fig. 6a, b).

Fig. (4). Qualitative pharmacophore model of thrombin inhibitors. (a) The best ranking Hip-Hop model. (b) Hip-Hop model aligned with the most active compound 1. Color coding: green, hydrogen bond acceptor (A); magenta, hydrogen bond donor (D); blue, hydrophobic (H); brown, ring aromatic (R).

Table 1. Characteristics of the Common Feature Hypotheses (Hip-Hop)

| Hypothesis | Features (a) | Rank | Direct hit mask (b) | Partial hit mask (c) |
|---|---|---|---|---|
| 1 | RRHHDDA | 70.03 | 1111 | 0000 |
| 2 | RRHHDAA | 69.23 | 1111 | 0000 |
| 3 | RRHHDDA | 67.68 | 1111 | 0000 |
| 4 | RHHHDDA | 67.61 | 1111 | 0000 |
| 5 | RHHHDDA | 67.24 | 1111 | 0000 |
| 6 | RHHHDDA | 67.24 | 1111 | 0000 |
| 7 | RHHHDAA | 66.81 | 1111 | 0000 |
| 8 | RHHHDDA | 66.68 | 1111 | 0000 |
| 9 | RHHHDAA | 66.66 | 1111 | 0000 |
| 10 | RRHHDAA | 66.57 | 1111 | 0000 |

(a) R, ring aromatic; H, hydrophobic; A, hydrogen bond acceptor; D, hydrogen bond donor. (b) The direct hit mask indicates whether (1) or not (0) a training set molecule mapped every feature. (c) The partial hit mask indicates whether (1) or not (0) a molecule mapped all but one feature.
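The cost arithmetic quoted for Hypo1 can be checked with a short Python sketch that uses the null cost, fixed cost and total costs reported in Table 2. The helper names `cost_difference` and `gap_to_fixed` are illustrative only, and the sketch simply re-derives the tabulated differences; it does not reproduce the HypoGen cost calculation itself.

```python
NULL_COST = 183.09   # bits, from the Table 2 footnote
FIXED_COST = 79.03   # bits, from the Table 2 footnote

# Total cost of each exported hypothesis (Table 2).
TOTAL_COST = {
    1: 96.90, 2: 98.58, 3: 104.43, 4: 109.01, 5: 109.44,
    6: 110.78, 7: 111.12, 8: 111.50, 9: 111.60, 10: 112.15,
}


def cost_difference(total_cost):
    """Null cost minus total cost; larger values indicate a better model."""
    return round(NULL_COST - total_cost, 2)


def gap_to_fixed(total_cost):
    """Total cost minus fixed cost; smaller values indicate a better model."""
    return round(total_cost - FIXED_COST, 2)


# The best hypothesis is the one with the largest cost difference.
best = max(TOTAL_COST, key=lambda h: cost_difference(TOTAL_COST[h]))

if __name__ == "__main__":
    for hypo, total in TOTAL_COST.items():
        print(hypo, cost_difference(total), gap_to_fixed(total))
    print("best hypothesis:", best)
```

Hypo1 comes out on top under both criteria, consistent with the selection described in the text.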

To estimate the inhibitory activity predicted by the best model, Hypo1, all the training set compounds are classified into three categories: highly active (+++, IC50 < 0.1 μM), moderately active (++, 0.1 ≤ IC50 < 1 μM) and inactive (+, IC50 ≥ 1 μM). The experimental and predicted activities of the thrombin inhibitors in the training set, together with the corresponding errors between estimated and experimental activity measured on the basis of Hypo1, are summarized in Table 3. Out of the 21 training set compounds, 14 are predicted in the correct activity class, and their predicted values are close to the experimental values. All the above parameters, along with the chemical features, support the validity of the pharmacophore model Hypo1 to some extent.
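The activity classes and the count of correctly classified compounds can be re-derived from the Table 3 values with a minimal Python sketch. The function names are illustrative. The error is computed as a signed ratio, which is how the Table 3 footnote describes it; the printed table values were presumably computed from unrounded predictions, so ratios recomputed from the rounded IC50 values may differ slightly for some compounds.

```python
def activity_class(ic50_um):
    """Map an IC50 in micromolar to the +++ / ++ / + scale used in the text."""
    if ic50_um < 0.1:
        return "+++"
    if ic50_um < 1.0:
        return "++"
    return "+"


def error_factor(experimental, predicted):
    """Signed ratio: positive if predicted >= experimental, negative otherwise."""
    if predicted >= experimental:
        return predicted / experimental
    return -experimental / predicted


# (experimental IC50, predicted IC50) in micromolar for compounds 1-21, Table 3.
TRAINING_SET = [
    (0.001, 0.002), (0.001, 0.002), (0.004, 0.002), (0.005, 0.014),
    (0.006, 0.003), (0.013, 0.044), (0.016, 0.014), (0.019, 0.015),
    (0.023, 0.025), (0.075, 0.12), (0.089, 0.39), (0.29, 0.9),
    (0.54, 0.46), (0.69, 1.5), (0.85, 0.48), (0.94, 0.39),
    (1.1, 0.36), (1.2, 0.22), (1.2, 2.0), (1.3, 0.47), (2.0, 0.5),
]

# Number of compounds whose predicted class matches the experimental class.
correct = sum(
    activity_class(exp) == activity_class(pred) for exp, pred in TRAINING_SET
)

if __name__ == "__main__":
    print("correctly classified:", correct, "of", len(TRAINING_SET))
```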

3.2. Pharmacophore Validation

The main aim of pharmacophore model generation is to identify new potent lead compounds and to predict their activity values accurately. Hence, the predictability of Hypo1 has been validated using Fischer’s randomization method and test and decoy sets.

3.2.1. Fischer’s Randomization Method

Fischer’s method [17] has been applied to evaluate the statistical relevance of Hypo1 and to check the correlation between the chemical structures and their biological activity values. In Fischer’s method the experimental activity values of training sets are distributed randomly using CatScramble program. The resulting training set is used to generate hypothesis by applying the same protocol and parameters

Table 2. Results of Hypotheses (HypoGen) Generated Using Training Set Against Thrombin Inhibitors

Hypotheses Total Cost Cost (Null Costa -Total Cost) RMS

b Correlation Features

b

1 96.90 86.19 1.21 0.93 AAHR

2 98.58 84.51 1.29 0.92 AHRR

3 104.43 78.66 1.49 0.89 AHRR

4 109.01 74.08 1.67 0.86 AHRR

5 109.44 73.65 1.68 0.86 AHRR

6 110.78 72.31 1.73 0.85 AAHR

7 111.12 71.97 1.74 0.85 AHRR

8 111.50 71.59 1.75 0.85 AHRR

9 111.60 71.49 1.75 0.85 AAHR

10 112.15 70.94 1.75 0.85 AHRR aNull cost = 183.09; Fixed cost = 79.03; Configuration cost = 15.5. All cost units are in bits. bRMS, root mean square; A, hydrogen bond acceptor; H, hydrophobic; R, ring aromatic.

Fig. (5). Quantitative model of thrombin inhibitors developed by HypoGen. (a) The best quantitative model, Hypo1. (b) 3D spatial relationship and geometrical parameters of Hypo1. Color coded for these features: green – hydrogen bond acceptor (A), blue – hydrophobic (H), brown – ring aromatic (R) feature.

8 Combinatorial Chemistry & High Throughput Screening, 2013, Vol. 16, No. 9 Loganathan et al.

which are used in the initial hypothesis generation. There are 19 random spreadsheets generated to achieve the 95% confidence level. After randomization run, none of the hypotheses has a total cost and correlation values better than

that of corresponding best pharmacophore model, Hypo1 (Fig. 7). Out of 19 random hypotheses, only four spreadsheets 3, 9, 13 and 18 have correlation values more than 0.85 (Table 4), and somewhat closer to our model

Fig. (6). (a) Hypo1 aligned with the most active compound1 (IC50: 0.001 M). (b) Hypo1 aligned with least active compound21 (IC50: 2 M). Color coded for these features: green – hydrogen bond acceptor (A), blue – hydrophobic (H), brown – ring aromatic (R) feature.

Table 3. Experimental and Predicted Activities of Training Set Compounds Measured on the Basis of Hypo1

| Compound No. | Fit Value^a | Experimental IC50 (µM) | Predicted IC50 (µM) | Error^b | Experimental Scale^c | Predicted Scale^c |
|---|---|---|---|---|---|---|
| 1 | 9.76 | 0.001 | 0.002 | +2.6 | +++ | +++ |
| 2 | 9.87 | 0.001 | 0.002 | +2 | +++ | +++ |
| 3 | 9.78 | 0.004 | 0.002 | -1.6 | +++ | +++ |
| 4 | 9.03 | 0.005 | 0.014 | +2.7 | +++ | +++ |
| 5 | 9.70 | 0.006 | 0.003 | -2 | +++ | +++ |
| 6 | 8.53 | 0.013 | 0.044 | +3.4 | +++ | +++ |
| 7 | 9.03 | 0.016 | 0.014 | -1.2 | +++ | +++ |
| 8 | 9.00 | 0.019 | 0.015 | -1.3 | +++ | +++ |
| 9 | 8.77 | 0.023 | 0.025 | +1.1 | +++ | +++ |
| 10 | 8.08 | 0.075 | 0.12 | +1.6 | +++ | ++ |
| 11 | 7.59 | 0.089 | 0.39 | +4.3 | +++ | ++ |
| 12 | 7.22 | 0.29 | 0.9 | +3.1 | ++ | ++ |
| 13 | 7.51 | 0.54 | 0.46 | -1.2 | ++ | ++ |
| 14 | 7.00 | 0.69 | 1.5 | +2.2 | ++ | + |
| 15 | 7.49 | 0.85 | 0.48 | -1.8 | ++ | ++ |
| 16 | 7.58 | 0.94 | 0.39 | -2.4 | ++ | ++ |
| 17 | 7.62 | 1.1 | 0.36 | -3.1 | + | ++ |
| 18 | 7.83 | 1.2 | 0.22 | -5.6 | + | ++ |
| 19 | 6.88 | 1.2 | 2 | +1.6 | + | + |
| 20 | 7.50 | 1.3 | 0.47 | -2.8 | + | ++ |
| 21 | 7.40 | 2 | 0.5 | -3.4 | + | ++ |

^a Fit value indicates how well the features in the pharmacophore overlap the chemical features in the molecule. Fit = weight × [max(0, 1 − SSE)], where SSE = (D/T)^2, D = displacement of the feature from the center of the location constraint and T = the radius of the location constraint sphere for the feature (tolerance).
^b Ratio between the predicted and experimental values. A positive value indicates that the predicted IC50 is higher than the experimental IC50, a negative value indicates that the predicted IC50 is lower than the experimental IC50, and a value of 1 indicates that the predicted IC50 is equal to the experimental IC50.
^c Activity scale: IC50 < 0.1 µM = +++ (highly active); 0.1 ≤ IC50 < 1 µM = ++ (moderately active); IC50 ≥ 1 µM = + (inactive).


(Hypo1) value of 0.93, but their RMS values are high and their total costs are far from the fixed cost and close to the null cost, which is not desirable for a good hypothesis. The results of Fischer’s randomization test clearly show that Hypo1 was not generated by chance: its values are far superior to those of the 19 randomly produced hypotheses, which provides 95% confidence in our pharmacophore hypothesis.
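The 95% figure can be reproduced with the relation commonly used for this test, significance = [1 − (1 + x)/y] × 100, where x is the number of scrambled runs with a total cost lower than that of the original hypothesis and y is the total number of HypoGen runs (scrambled plus original). This relation is not stated in the text and is assumed here; the function name is illustrative, and the total costs are taken from Table 4.

```python
from typing import Sequence


def fischer_significance(original_cost: float, scrambled_costs: Sequence[float]) -> float:
    """Confidence level (%) of a randomization test.

    Assumed relation: [1 - (1 + x) / y] * 100, where x is the number of
    scrambled runs with a lower total cost than the original hypothesis
    and y is the number of scrambled runs plus the original run.
    """
    x = sum(1 for cost in scrambled_costs if cost < original_cost)
    y = len(scrambled_costs) + 1
    return (1.0 - (1.0 + x) / y) * 100.0


# Total costs of Hypo1 and of Random1..Random19 (Table 4).
HYPO1_COST = 96.9
RANDOM_COSTS = [
    145.8, 163.0, 111.1, 115.9, 139.5, 121.1, 155.8, 135.9, 112.3, 130.0,
    143.7, 135.1, 114.3, 151.4, 136.6, 133.8, 124.6, 106.6, 124.2,
]

confidence = fischer_significance(HYPO1_COST, RANDOM_COSTS)
print(f"confidence level: {confidence:.1f}%")
```

With 19 scrambled spreadsheets and no scrambled run cheaper than Hypo1, the relation gives the 95% level quoted above; 49 or 99 spreadsheets would be needed for the 98% or 99% levels.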

Fig. (7). The difference in correlations of hypotheses between the Hypo1 and 19 random spreadsheets after randomization run.

Table 4. Results from Cross Validation Using CatScramble Implemented in DS^a,b

| Validation No. | Total Cost | RMS^c | Correlation | Cost Difference |
|---|---|---|---|---|
| Hypo1 | 96.9 | 1.21 | 0.93 | 86.1 |
| Results for scrambled | | | | |
| Random1 | 145.8 | 2.55 | 0.65 | 37.2 |
| Random2 | 163.0 | 2.80 | 0.54 | 20.0 |
| Random3 | 111.1 | 1.74 | 0.85 | 71.9 |
| Random4 | 115.9 | 1.80 | 0.84 | 67.1 |
| Random5 | 139.5 | 2.34 | 0.72 | 43.5 |
| Random6 | 121.1 | 1.97 | 0.81 | 61.9 |
| Random7 | 155.8 | 2.68 | 0.61 | 27.2 |
| Random8 | 135.9 | 2.27 | 0.74 | 47.1 |
| Random9 | 112.3 | 1.73 | 0.85 | 70.7 |
| Random10 | 130.0 | 2.19 | 0.76 | 53.0 |
| Random11 | 143.7 | 2.46 | 0.68 | 39.3 |
| Random12 | 135.1 | 2.32 | 0.73 | 47.9 |
| Random13 | 114.3 | 1.76 | 0.85 | 68.7 |
| Random14 | 151.4 | 2.64 | 0.62 | 31.6 |
| Random15 | 136.6 | 2.40 | 0.70 | 46.4 |
| Random16 | 133.8 | 2.34 | 0.72 | 49.2 |
| Random17 | 124.6 | 2.05 | 0.79 | 58.4 |
| Random18 | 106.6 | 1.62 | 0.87 | 76.4 |
| Random19 | 124.2 | 2.15 | 0.77 | 58.8 |

^a Null cost = 183.09.
^b Fixed cost = 79.03.
^c RMS, root mean square.


3.2.2. Test and Decoy Sets

The main aim of pharmacophore model generation is not only to predict the activity of the training set compounds accurately but also to predict the activities of external compounds and classify them correctly as active or inactive. In our study, we have prepared an independent test set, which contains 19 compounds structurally different from the training set molecules, to validate Hypo1. Regression analysis of Hypo1 against the test set compounds gives a correlation coefficient of 0.92 between experimental and predicted activity values (Fig. 8). The experimental activities, predicted activities and error values of the test set obtained with the best model, Hypo1, are summarized in Table 5. Like the training set, the test set is divided into three categories: highly active (+++, IC50 < 0.1 µM), moderately active (++, 0.1 ≤ IC50 < 1 µM) and inactive molecules (+, IC50 ≥ 1 µM). All 7 highly active molecules are correctly predicted as highly active. Of the 7 moderately active molecules, 5 are predicted correctly, while compounds 30 and 31 are predicted as inactive and highly active, respectively. Of the 5 inactive compounds, 3 are predicted correctly and the remaining two (37 and 38) are overestimated as moderately active. Overall, 15 out of 19 (79%) test molecules are predicted accurately. Most of the misclassification occurs between the moderately active and inactive classes.
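The class assignment described above can be checked with a short script. This is a minimal sketch, assuming the IC50 values of Table 5 are in µM and using the three thresholds given in the text; the function and variable names are illustrative only.

```python
from typing import List, Tuple


def activity_class(ic50_um: float) -> str:
    """Map an IC50 value (assumed to be in µM) to the three-level activity scale."""
    if ic50_um < 0.1:
        return "+++"  # highly active
    if ic50_um < 1.0:
        return "++"   # moderately active
    return "+"        # inactive


# (compound number, experimental IC50, predicted IC50), copied from Table 5.
TEST_SET: List[Tuple[int, float, float]] = [
    (22, 0.004, 0.001), (23, 0.005, 0.005), (24, 0.083, 0.038),
    (25, 0.082, 0.086), (26, 0.023, 0.015), (27, 0.002, 0.001),
    (28, 0.003, 0.002), (29, 0.268, 0.247), (30, 0.158, 1.517),
    (31, 0.1, 0.083), (32, 0.915, 0.371), (33, 0.72, 0.11),
    (34, 0.59, 0.36), (35, 0.30, 0.46), (36, 9.56, 8.908),
    (37, 2.767, 0.348), (38, 2.142, 0.591), (39, 2.95, 1.123),
    (40, 3.04, 6.56),
]

# Compounds whose predicted class differs from the experimental class.
misclassified = [
    number
    for number, experimental, predicted in TEST_SET
    if activity_class(experimental) != activity_class(predicted)
]
n_correct = len(TEST_SET) - len(misclassified)

print(f"correct: {n_correct}/{len(TEST_SET)}")
print(f"misclassified compounds: {misclassified}")
```

Running the script on the Table 5 values flags compounds 30, 31, 37 and 38, which matches the four misclassifications discussed in the text.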

Fig. (8). Correlation (r) graph between the experimental activity and the predicted activity by Hypo1 for the test set molecules.

Screening a decoy set is another way to validate the statistical significance of Hypo1. The decoy set contains both known thrombin inhibitors and molecules of unknown activity. Hypo1 has been used to screen the decoy set with the Best Flexible search technique to segregate the active molecules. A total of 1200 molecules, comprising 1190 molecules of unknown activity and 10 known inhibitors of thrombin, have been used in the decoy set. The following parameters are calculated and given in Table 6: total hit molecules (Ht), percentage yield of actives, percentage ratio of actives in the hit list, EF, GH score, false negatives and false positives. Hypo1 retrieved 13 molecules from the database, of which 9 are known thrombin inhibitors. It retrieved 4 inactive compounds as actives (false positives) and predicted one active compound as inactive (false negative). Further, EF and GH have been calculated using the formulas given below:

EF = (Ha/Ht) / (A/D)

GH = ((Ha (3A + Ht)) / (4HtA)) (1- ((Ht – Ha) / (D – A)))

where Ht = number of hits retrieved, Ha = number of active molecules in the hit list, A = number of active molecules present in the database, and D = total number of molecules in the database. In our case, the EF and GH score values are 83.1 and 0.74, respectively. GH scores of 0 and 1 indicate the null and the ideal model, respectively [28], whereas a score above 0.60 indicates a very good model [29]. Hence, Hypo1 is considered able to discriminate active thrombin inhibitors from inactive molecules while retrieving few false positives. Thus, the above validation results reflect the robustness of Hypo1, and it is taken forward for further studies such as virtual screening.
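As a worked check, the two formulas can be evaluated directly from the counts in Table 6. The following is a minimal sketch; the function names are illustrative, and the inputs are D = 1200, A = 10, Ht = 13 and Ha = 9.

```python
def enrichment_factor(ha: int, ht: int, a: int, d: int) -> float:
    """EF = (Ha / Ht) / (A / D)."""
    return (ha / ht) / (a / d)


def goodness_of_hit(ha: int, ht: int, a: int, d: int) -> float:
    """GH = [Ha (3A + Ht) / (4 Ht A)] * [1 - (Ht - Ha) / (D - A)]."""
    return (ha * (3 * a + ht)) / (4 * ht * a) * (1 - (ht - ha) / (d - a))


# Counts from the decoy-set screening (Table 6).
D, A, HT, HA = 1200, 10, 13, 9

false_negatives = A - HA    # actives that were not retrieved
false_positives = HT - HA   # retrieved molecules that are not known actives
yield_of_actives = HA / HT * 100
ratio_of_actives = HA / A * 100

ef = enrichment_factor(HA, HT, A, D)
gh = goodness_of_hit(HA, HT, A, D)

print(f"% yield of actives = {yield_of_actives:.2f}")
print(f"% ratio of actives = {ratio_of_actives:.0f}")
print(f"false negatives = {false_negatives}, false positives = {false_positives}")
print(f"EF = {ef:.2f}, GH = {gh:.2f}")
```

The second factor of GH penalizes false positives relative to the number of non-actives in the database, so with only 4 false positives among 1190 decoys it stays close to 1 and the score is dominated by the yield and recall terms.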

3.3. Virtual Screening

Virtual screening is a widely accepted, inexpensive and fast alternative tool to identify new potential lead candidates from large databases, which can subsequently be synthesized and tested for their biological activities [30]. The best predictive model, Hypo1, has been used as a query to search the NCI and Maybridge databases, which contain 200,000 and 60,000 compounds, respectively. The Fast Flexible search method in DS is used to explore the databases. A total of 48,227 hits from the NCI (9,316) and Maybridge (38,911) databases have been retrieved in the initial screening. Of these, 957 (NCI) and 5050 (Maybridge) hit compounds are sorted out by keeping the fit value as a maximum of 10.5. These hits are further screened using drug-like filters such as Lipinski’s rule of five [31] and ADMET properties to retain the more drug-like compounds. The hit compounds having molecular weight > 500, hydrogen bond donors (OH + NH groups) > 5, or hydrogen bond acceptors (O + N atoms) > 10 are removed by applying Lipinski’s rule of five. The number of rotatable bonds also affects the properties (e.g., oral bioavailability or oral absorption) of compounds [32]. The acceptable number of rotatable bonds for the selection of hits in virtual screening studies is 7 [33]. Consequently, compounds having more than seven rotatable bonds are eliminated from the database hit compounds. The ADMET descriptor module of DS has been used to screen the database hit molecules. Blood Brain Barrier (BBB) penetration, solubility and Human Intestinal Absorption (HIA) are the main focus of our study. Levels of 3, 3 and 0 are chosen for BBB, solubility and HIA, respectively. Eventually, a total of 591 molecules from the NCI (82) and Maybridge (509) databases are selected for further studies such as molecular docking.
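The property-based part of this filtering can be sketched as follows. The thresholds are those stated above (molecular weight, hydrogen bond donors and acceptors, rotatable bonds); the `Hit` record and the three example compounds are hypothetical and serve only to illustrate the logic. The sketch assumes the descriptors have already been computed, and it does not cover the ADMET levels, which are obtained from the DS module.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """Precomputed descriptors of one database hit (hypothetical record layout)."""
    name: str
    mol_weight: float
    h_donors: int        # OH + NH groups
    h_acceptors: int     # O + N atoms
    rotatable_bonds: int


def is_drug_like(hit: Hit) -> bool:
    """Apply the cut-offs stated in the text; a hit failing any one is removed."""
    return (
        hit.mol_weight <= 500
        and hit.h_donors <= 5
        and hit.h_acceptors <= 10
        and hit.rotatable_bonds <= 7
    )


def filter_hits(hits: List[Hit]) -> List[Hit]:
    """Keep only the hits that satisfy every cut-off."""
    return [hit for hit in hits if is_drug_like(hit)]


# Hypothetical example compounds, not taken from NCI or Maybridge.
EXAMPLE_HITS = [
    Hit("hypothetical_A", 342.4, 2, 5, 4),   # passes every cut-off
    Hit("hypothetical_B", 612.7, 3, 8, 6),   # fails: molecular weight > 500
    Hit("hypothetical_C", 410.5, 2, 6, 9),   # fails: more than 7 rotatable bonds
]

kept = filter_hits(EXAMPLE_HITS)
print([hit.name for hit in kept])
```

Treating the cut-offs as a strict conjunction mirrors the text, where any single violation removes a compound; some implementations of the rule of five tolerate one violation instead.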

3.4. Molecular Docking

A molecular docking study has been carried out to identify the binding mode of a ligand in the active site of a protein and to predict the binding affinity between the ligand and the protein [30a]. The active site of the protein is identified from the volume occupied by the co-crystallized ligand, and the critical amino acids are chosen for our docking study from the existing crystal structure [34]. Scoring functions, namely LigScore1, LigScore2, -PLP1, -PLP2, -PMF, -PMF04, Dock score, JAIN and Ludi scores, together with the binding modes and molecular interactions of the compounds retrieved from the databases, are compared with those of one of the most active


thrombin inhibitors, Melagatran [35]. Hence, Melagatran (control group) is initially docked into the protein active site, giving LigScore1 and LigScore2 of 5.49 and 5.00, -PLP1 and -PLP2 of 103.40 and 94.50, a JAIN score of 2.81, -PMF and -PMF04 of 125.11 and 110.7, a Dock score of 100.1, and Ludi1, Ludi2 and Ludi3 scores of 665, 528 and 613, respectively (Table 7). The database hit compounds are then selected on the basis of the Dock score (above 85) and LigScore (above 5), and their interactions with the crucial amino acids present in the protein active site are analyzed. A total of 4 database hit compounds (3 from Maybridge and 1 from NCI) have Dock scores above 85 and LigScore values above those of the control group (Table 7). LigScore1 and LigScore2 are calculated from descriptors of the polar surface of the receptor-ligand interaction [36]. In our study, the hit compounds DP_00512, BTB_08814, BTB_08770 and NCI0005473 have higher LigScore1 and LigScore2 values than the control group. This indicates that these compounds have a larger polar interaction area than the control group. Polar surfaces are important elements since polar hydrogen atoms are explicit hydrogen donors [37]. The three Maybridge hit compounds have higher -PLP1 and -PLP2 scores, which are calculated based on hydrogen bond formation [38], than the control group. Higher -PLP scores indicate that the hit compounds form strong hydrogen bonds with catalytically important residues in the protein active site. The -PMF and Dock scores of BTB_08814 and BTB_08770 are close to the values of the control group.
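The selection step can be expressed as a simple filter over the docking scores. The sketch below uses the LigScore1, LigScore2 and Dock score values of Table 7; applying the LigScore cut-off to both LigScore1 and LigScore2 is an assumption, since the text gives a single LigScore threshold, and the function names are illustrative.

```python
from typing import Dict, List

LIGSCORE_CUTOFF = 5.0   # cut-off stated in the text
DOCK_CUTOFF = 85.0      # cut-off stated in the text
REFERENCE = "Melagatran"

# LigScore1, LigScore2 and Dock score copied from Table 7.
SCORES: Dict[str, Dict[str, float]] = {
    "Melagatran": {"ligscore1": 5.49, "ligscore2": 5.00, "dock": 100.1},
    "DP_00512":   {"ligscore1": 5.51, "ligscore2": 5.67, "dock": 96.50},
    "BTB_08814":  {"ligscore1": 6.13, "ligscore2": 5.50, "dock": 93.41},
    "BTB_08770":  {"ligscore1": 5.84, "ligscore2": 5.72, "dock": 99.04},
    "NCI0005473": {"ligscore1": 5.88, "ligscore2": 5.74, "dock": 85.19},
}


def passes_cutoffs(scores: Dict[str, float]) -> bool:
    """Assumed rule: both LigScores above 5 and the Dock score above 85."""
    return (
        scores["ligscore1"] > LIGSCORE_CUTOFF
        and scores["ligscore2"] > LIGSCORE_CUTOFF
        and scores["dock"] > DOCK_CUTOFF
    )


def better_than_reference(name: str, key: str) -> bool:
    """True if the compound scores higher than the control on the given function."""
    return SCORES[name][key] > SCORES[REFERENCE][key]


candidates: List[str] = [name for name in SCORES if name != REFERENCE]
selected = [name for name in candidates if passes_cutoffs(SCORES[name])]
higher_ligscore = [
    name for name in selected
    if better_than_reference(name, "ligscore1") and better_than_reference(name, "ligscore2")
]
higher_dock = [name for name in selected if better_than_reference(name, "dock")]

print("selected:", selected)
print("higher LigScore1 and LigScore2 than control:", higher_ligscore)
print("higher Dock score than control:", higher_dock)
```

Comparing each score against the control separately makes it explicit which scoring functions favor a hit and which do not, rather than relying on a single combined ranking.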

JAIN [39] and Ludi scores [40] are calculated based on the hydrophobic interaction, entropy and degrees of freedom. In our case, all the hit compounds have higher JAIN scores than the control group, and the three Maybridge hits also have higher Ludi scores. This suggests that the binding stabilities of the receptor-ligand complexes of the hit compounds are higher than that of the control group. Moreover, we have visually analyzed these compounds once again and compared their molecular interactions with those of the control group. The binding mode and molecular interactions of the control group with the protein active site are shown in Fig. (9a). The control group forms very good hydrogen bonds with the catalytically important residues S235 and S256 and one water molecule (X177), but it fails to form a hydrogen bond with G258. However, three hydrogen bonds are essential for significant binding affinity [34]. Furthermore, it shows hydrophobic interactions with Y83 and L132 while stacking against W86, which fills the binding pocket of the protein very well. The binding modes of the database hit compounds from Maybridge and NCI (DP_00512, BTB_08814, BTB_08770 and NCI0005473) are compared with that of the control group and depicted in Fig. (9c). The database hit compounds DP_00512, BTB_08814 and BTB_08770 from Maybridge have shown higher LigScore1, LigScore2, -PLP1, -PLP2 and JAIN scores than the control group. Besides, these three database hit compounds form very good hydrogen bonds with catalytically important residues (S235, S256, and G258) and

Table 5. Experimental and Predicted Activities of Test Set Compounds Calculated by Hypo1

| Compound No. | Fit Value^a | Experimental IC50 (µM) | Predicted IC50 (µM) | Error^b | Experimental Scale^c | Predicted Scale^c |
|---|---|---|---|---|---|---|
| 22 | 10.5 | 0.004 | 0.001 | -4 | +++ | +++ |
| 23 | 9.5 | 0.005 | 0.005 | 1 | +++ | +++ |
| 24 | 8.5 | 0.083 | 0.038 | -2.18 | +++ | +++ |
| 25 | 8.2 | 0.082 | 0.086 | +1.05 | +++ | +++ |
| 26 | 9.3 | 0.023 | 0.015 | -1.53 | +++ | +++ |
| 27 | 10.4 | 0.002 | 0.001 | -2 | +++ | +++ |
| 28 | 10.2 | 0.003 | 0.002 | -1.5 | +++ | +++ |
| 29 | 7.7 | 0.268 | 0.247 | -1.08 | ++ | ++ |
| 30 | 7.3 | 0.158 | 1.517 | +9.6 | ++ | + |
| 31 | 8.6 | 0.1 | 0.083 | -1.2 | ++ | +++ |
| 32 | 7.6 | 0.915 | 0.371 | -2.47 | ++ | ++ |
| 33 | 8.1 | 0.72 | 0.11 | -6.52 | ++ | ++ |
| 34 | 7.6 | 0.59 | 0.36 | -1.63 | ++ | ++ |
| 35 | 7.5 | 0.30 | 0.46 | +1.54 | ++ | ++ |
| 36 | 6.2 | 9.56 | 8.908 | -1.07 | + | + |
| 37 | 8.0 | 2.767 | 0.348 | -7.95 | + | ++ |
| 38 | 7.7 | 2.142 | 0.591 | -3.62 | + | ++ |
| 39 | 7.1 | 2.95 | 1.123 | -2.63 | + | + |
| 40 | 0.3 | 3.04 | 6.56 | +2.15 | + | + |

^a Fit value indicates how well the features in the pharmacophore overlap the chemical features in the molecule. Fit = weight × [max(0, 1 − SSE)], where SSE = (D/T)^2, D = displacement of the feature from the center of the location constraint and T = the radius of the location constraint sphere for the feature (tolerance).
^b Ratio between the predicted and experimental values. A positive value indicates that the predicted IC50 is higher than the experimental IC50, a negative value indicates that the predicted IC50 is lower than the experimental IC50, and a value of 1 indicates that the predicted IC50 is equal to the experimental IC50.
^c Activity scale: IC50 < 0.1 µM = +++ (highly active); 0.1 ≤ IC50 < 1 µM = ++ (moderately active); IC50 ≥ 1 µM = + (inactive).


one water molecule (X177) at the active site (Figs. 10, 11). The NCI compound (NCI0005473) also has very good LigScore1, LigScore2 and JAIN scores when compared to the control group and forms strong hydrogen bonds with the active site residues S235, S256 and G258 and one water molecule (X177) (Fig. 11). The same kind of hydrogen-bonding network is found in other thrombin-ligand complexes [34, 41]. The hydrophobic contacts of the identified hits (Y83, L132, W86, W257, V255, N131, I209, H79) are very similar to those of the control group [42]. This confirms once again that the binding modes of the identified hits coincide well with that of the control group. The 2D structures of the database hit compounds are depicted in Fig. (12). Since all these hit compounds show good interactions with the catalytically important active site residues, these 4 hits are considered for an optimization study to improve their binding affinity at the protein active site.

3.5. Optimization

All 4 database hit compounds fit well into the protein active site when compared to the control group, but the azetidine and cyclohexyl groups of the control group extend further and occupy the aryl binding site of thrombin [41]. This prompted us to improve the binding affinity of the database hit compounds by adding different substituents to their side chains. Forty different substitutions have been made and docked with the

Table 6. Statistical Parameters of Hypo1 from Screening the Decoy Set

| Number | Parameter | Value |
|---|---|---|
| 1 | Total number of molecules in database (D) | 1200 |
| 2 | Total number of actives in database (A) | 10 |
| 3 | Total number of hit molecules from the database (Ht) | 13 |
| 4 | Total number of active molecules in hit list (Ha) | 9 |
| 5 | % Yield of actives [(Ha/Ht) × 100] | 69.23 |
| 6 | % Ratio of actives [(Ha/A) × 100] | 90 |
| 7 | Enrichment Factor (EF) | 83.1 |
| 8 | False negatives [A − Ha] | 1 |
| 9 | False positives [Ht − Ha] | 4 |
| 10 | Goodness of Hit Score^a (GH) | 0.74 |

^a GH = [(Ha/4HtA)(3A + Ht)] × [1 − ((Ht − Ha)/(D − A))]; a GH score of 0.7-0.8 indicates a very good model [28].

thrombin active site using the LigandFit module with the same parameters used for the original database hits. Among the 40 substituted compounds, 18 have shown very good score values (Table 8) when compared to the control group. A comparison between the data presented in Tables 7 and 8 shows that the optimized compounds, Opt4_DP

Table 7. Scoring Function Comparison of Database Hits with Melagatran

| Molecule | LigS1 | LigS2 | -PLP1 | -PLP2 | JAIN | -PMF | -PMF04 | Dock | Ludi1 | Ludi2 | Ludi3 | BA (kcal/mol) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Melagatran | 5.49 | 5.00 | 103.40 | 94.50 | 2.81 | 125.11 | 110.7 | 100.1 | 665 | 528 | 613 | -8.7 |
| DP_00512 | 5.51 | 5.67 | 107.80 | 99.66 | 6.24 | 70.08 | 58.17 | 96.50 | 774 | 626 | 719 | -7.6 |
| BTB_08814 | 6.13 | 5.50 | 115.15 | 118.06 | 8.33 | 133.55 | 105.9 | 93.41 | 834 | 696 | 886 | -7.8 |
| BTB_08770 | 5.84 | 5.72 | 111.64 | 105.81 | 5.33 | 132.11 | 106.00 | 99.04 | 749 | 600 | 799 | -7.9 |
| NCI0005473 | 5.88 | 5.74 | 84.16 | 81.89 | 5.71 | 88.95 | 64.98 | 85.19 | 642 | 518 | 479 | -6.1 |

BA, binding affinity.

Fig. (9). (a) Molecular interaction of the Melagatran with catalytically important residues at protein active site. (b) Structure of Melagatran. (c) Binding modes of Melagatran and database hits. Color code: residues are in elemental color, Melagatran- cyan, DP_00512 – Violet, BTB_08814 –pink, BTB_08770 – blue, and NCI0005473 - green and dotted lines denoted as hydrogen bond


Fig. (10). Pharmacophore mapping and molecular interaction (LIGPLOT) of the database hits (a) DP_00512, (b) BTB_08814, from Maybridge have shown interaction with Gly258, Ser235, Ser256 and water molecule (X177). Color code: residues are in elemental color, Ligand – violet, and red, dotted lines denoted as hydrogen bond.

Fig. (11). Pharmacophore mapping and molecular interaction (LIGPLOT) of the database hits (a) BTB_08770 (Maybridge) and (b) NCI0005473 (NCI), which show interactions with Gly258, Ser235, Ser256 and a water molecule (X177). Color code: residues are in elemental color, ligands in gold and green; dotted lines denote hydrogen bonds.


(DP_00512), Opt2_08814 (BTB_08814), Opt2_08770, Opt4_08770 (BTB_08770) and Opt1_NCI (NCI0005473), have higher score values than the control group and the original database hits. These compounds have a larger polar interaction area, stronger hydrogen bonds and more hydrophobic contacts with catalytically important residues in the protein active site than the control group and the original database hits. The binding modes and molecular interactions of these optimized compounds, compared with the control group, are shown in Fig. (13). The binding modes indicate that the best hits from the databases and their derivatives (optimized compounds) are consistent with our pharmacophore analysis. 2D representations of the optimized compounds are shown in Figs. (14, 15). The synthetic accessibility of the selected optimized compounds has been checked using the SYLVIA 1.0 program [23]. Based on the SYLVIA score (Table 8), the selected optimized compounds are found to be easy to synthesize. Further, the novelty of these selected compounds is confirmed using the SciFinder Scholar [24] and PubChem structure search [25] tools.


Fig. (12). 2D structure of database hit compounds (a) DP_00512, (b) BTB_08814 (c) BTB_08770, and (d) NCI0005473.

Fig. (13). Binding modes of Melagatran and optimized database hits. Color code: residues are in elemental color, Melagatran- cyan, Opt1_DP – orange, Opt2_08814 –green, Opt2_08770 – violet, and Opt1_NCI - pink and dotted lines denoted as hydrogen bond


4. CONCLUSIONS

In the present study, qualitative and quantitative pharmacophore models were generated using the HipHop and HypoGen modules of DS. The primary intent of generating the qualitative hypothesis was to identify the key features necessary for the quantitative method. Hence, a qualitative model based on the five most active thrombin inhibitors was developed. The best qualitative model, containing seven features (2R, 2H, 2D and 1A), was chosen based on the highest ranking score of 70.03, and these

Fig. (14). 2D representation of optimized database hit compounds: Opt1_DP to Opt7_DP, Opt1_08814 to Opt7_08814, Opt1_08770 and Opt2_08770.


features were considered as one of the inputs to generate the quantitative model. The top ten quantitative hypotheses were generated using 21 structurally diverse thrombin inhibitors. Among all hypotheses, the best predictive model, Hypo1, was chosen based on its cost function, correlation coefficient and RMS deviation. Hypo1 was subsequently cross-validated by Fischer’s randomization test, and its statistical significance was confirmed with external test and decoy sets. After all three validations, Hypo1 showed good predictability for active and inactive molecules and was taken forward for virtual screening to retrieve potential leads against thrombin. The NCI and Maybridge databases were used for virtual screening. Hypo1 was used as a query, and a total of 48,227 molecules were retrieved from the NCI (9,316) and Maybridge (38,911) databases. To refine the retrieved hits, the molecules were further filtered by applying the maximum fit value, Lipinski’s rule of five and ADMET descriptors to retain the more drug-like compounds. In total, 591 molecules from the NCI (82) and Maybridge (509) databases passed these filters. All the molecules that passed the above filters were taken for molecular docking studies to identify suitable orientations and critical interactions with essential amino acids in the thrombin active site. Thus, all compounds, including the training set and the 591 hits, were docked into the active site of thrombin. Finally, one compound from the NCI database and three compounds from the Maybridge database were selected based on the scoring functions, binding modes and molecular interactions. These compounds showed both good binding affinity with critical amino acids and drug-likeness, similar to the control group. Further, the 4 database hit compounds were optimized by adding different substituents to their side chains. Based on the optimization, these 4 database hits and their derivatives (optimized compounds) are suggested as very good potential leads for designing novel thrombin inhibitors. SYLVIA 1.0 program scores suggested that these compounds can be easily synthesized. The novelty of these hit compounds was further confirmed with the SciFinder Scholar and PubChem structure search tools. Hence, we conclude that our pharmacophore model Hypo1 is able to identify new hits from chemical databases and yield potent molecules that may act as good leads against thrombin.

Fig. (15). 2D representation of optimized database hit compounds: Opt3_08770 to Opt7_08770 and Opt1_NCI to Opt7_NCI.


ABBREVIATIONS

DS = Discovery Studio v2.5

A = Hydrogen bond acceptor

H = Hydrophobic

R = Ring aromatic

D = Hydrogen bond donor

RMS = Root mean square

EF = Enrichment factor

GH = Goodness of hit

ADMET = Absorption, Distribution, Metabolism, Excretion and Toxicity

BBB = Blood Brain Barrier

HIA = Human Intestinal Absorption

PLP = Piecewise linear potential

PMF = Potential of mean force

Table 8. Scoring Function Comparison of Optimized Database Hits with Melagatran

| Molecule | LigS1 | LigS2 | -PLP1 | -PLP2 | JAIN | -PMF | -PMF04 | Dock | Ludi1 | Ludi2 | Ludi3 | BA (kcal/mol) | SA |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Melagatran | 5.49 | 5.00 | 103.40 | 94.50 | 2.81 | 125.11 | 110.7 | 100.1 | 665 | 528 | 613 | -8.7 | |
| DP_00512 | | | | | | | | | | | | | |
| Opt1_DP | 5.51 | 5.00 | 105.48 | 107.45 | 6.99 | 120.74 | 80.07 | 99.40 | 821 | 653 | 767 | -8.3 | 4.72 |
| Opt2_DP | 5.45 | 4.97 | 100.49 | 100.99 | 6.86 | 110.80 | 85.44 | 96.94 | 866 | 676 | 723 | -7.6 | 4.65 |
| Opt3_DP | 5.03 | 4.67 | 102.74 | 104.38 | 7.64 | 110.04 | 75.42 | 97.65 | 817 | 669 | 760 | -8.2 | 4.75 |
| Opt4_DP | 4.89 | 4.82 | 115.63 | 108.33 | 6.53 | 130.61 | 96.11 | 105.24 | 714 | 690 | 862 | -7.9 | 5.21 |
| Opt5_DP | 4.58 | 4.53 | 101.98 | 96.55 | 8.77 | 103.86 | 76.63 | 94.4 | 728 | 644 | 685 | -8.7 | 4.88 |
| Opt6_DP | 5.65 | 4.56 | 109.63 | 111.83 | 9.71 | 119.23 | 92.68 | 91.39 | 916 | 771 | 757 | -8.5 | 4.54 |
| Opt7_DP | 5.75 | 5.02 | 119.33 | 118.75 | 9.46 | 121.14 | 89.92 | 96.51 | 852 | 687 | 800 | -8.4 | 5.16 |
| BTB_08814 | | | | | | | | | | | | | |
| Opt1_08814 | 4.53 | 3.78 | 109.54 | 112.96 | 7.77 | 124.53 | 93.51 | 93.96 | 757 | 657 | 896 | -8.5 | 4.21 |
| Opt2_08814 | 6.01 | 5.65 | 114.94 | 112.66 | 5.33 | 107.46 | 102.85 | 95.79 | 623 | 547 | 738 | -8.0 | 3.95 |
| Opt3_08814 | 5.43 | 5.28 | 108.86 | 109.66 | 6.91 | 111.54 | 83.72 | 99.93 | 644 | 538 | 666 | -8.7 | 4.06 |
| Opt4_08814 | 6.35 | 5.57 | 112.55 | 114.05 | 8.24 | 128.49 | 116.03 | 90.95 | 787 | 623 | 807 | -8.0 | 4.53 |
| Opt5_08814 | 4.91 | 3.93 | 110.31 | 115.31 | 7.04 | 110.11 | 81.37 | 98.4 | 615 | 557 | 740 | -8.6 | 4.01 |
| Opt6_08814 | 4.79 | 4.83 | 116.23 | 119.97 | 9.34 | 128.94 | 108.04 | 102.65 | 735 | 645 | 820 | -8.7 | 3.77 |
| Opt7_08814 | 6.18 | 5.14 | 121.55 | 118.44 | 6.39 | 121.11 | 105.17 | 97.57 | 698 | 570 | 751 | -7.9 | 4.44 |
| BTB_08770 | | | | | | | | | | | | | |
| Opt1_08770 | 6.61 | 5.6 | 115.54 | 121.65 | 7.38 | 118.25 | 100.75 | 105.39 | 787 | 692 | 877 | -8.8 | 3.98 |
| Opt2_08770 | 6.48 | 5.86 | 115.80 | 114.04 | 6.16 | 98.05 | 102.22 | 98.66 | 742 | 597 | 799 | -8.2 | 3.95 |
| Opt3_08770 | 6.03 | 5.59 | 112.53 | 115.76 | 5.75 | 123.85 | 103.89 | 100.05 | 691 | 560 | 775 | -8.7 | 3.91 |
| Opt4_08770 | 6.29 | 4.69 | 107.44 | 114.87 | 8.37 | 115.71 | 101.19 | 96.38 | 877 | 711 | 953 | -8.0 | 4.57 |
| Opt5_08770 | 6.78 | 5.77 | 116.03 | 115.41 | 8.06 | 150.99 | 112.53 | 102.58 | 793 | 638 | 867 | -9.0 | 3.90 |
| Opt6_08770 | 6.69 | 6.13 | 119.78 | 112.44 | 6.64 | 145.33 | 114.79 | 101.11 | 859 | 696 | 826 | -8.6 | 3.67 |
| Opt7_08770 | 6.72 | 5.9 | 112.79 | 112.40 | 7.62 | 128.39 | 119.21 | 96.82 | 808 | 657 | 958 | -8.6 | 4.32 |
| NCI0005473 | | | | | | | | | | | | | |
| Opt1_NCI | 5.61 | 4.96 | 98.27 | 106.31 | 7.35 | 117.77 | 96.21 | 87.61 | 606 | 531 | 527 | -7.4 | 4.43 |
| Opt2_NCI | 5.44 | 5.14 | 107.74 | 110.35 | 6.78 | 108.62 | 92.22 | 96.45 | 549 | 599 | 557 | -6.6 | 4.34 |
| Opt3_NCI | 6.05 | 5.25 | 103.21 | 109.28 | 8.96 | 115.48 | 82.38 | 99.41 | 706 | 570 | 538 | -6.8 | 4.63 |
| Opt4_NCI | 4.91 | 4.59 | 99.59 | 108.26 | 5.38 | 132.47 | 107.58 | 100.56 | 536 | 480 | 531 | -7.1 | 4.62 |
| Opt5_NCI | 5.24 | 4.05 | 108.11 | 118.03 | 9.09 | 118.57 | 78.79 | 90.41 | 595 | 534 | 592 | -6.7 | 4.46 |
| Opt6_NCI | 5.46 | 5.08 | 111.29 | 117.02 | 10.02 | 90.41 | 87.45 | 100.91 | 635 | 546 | 576 | -6.9 | 4.17 |
| Opt7_NCI | 5.81 | 5.71 | 113.95 | 122.48 | 8.38 | 139.08 | 111.27 | 95.41 | 727 | 594 | 655 | -6.9 | 4.17 |

BA, binding affinity; SA, synthetic accessibility.


CONFLICT OF INTEREST

The authors confirm that they have no conflicts of interest.

ACKNOWLEDGMENTS

This research was supported by the Basic Science Research Program (2009-0073267), the Pioneer Research Centre Program (2009-0081539), and the Management of Climate Change Program (2010-0029084) through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (MEST) of the Republic of Korea. This work was also supported by the Next-Generation BioGreen 21 Program (PJ008038) from the Rural Development Administration (RDA) of the Republic of Korea. The authors would like to acknowledge financial support from the Department of Biotechnology (BT/PR14062/MED/30/357/2010), New Delhi, India.

REFERENCES

[1] Marinko, P.; Krbavcic, A.; Mlinsek, G.; Solmajer, T.; Bakija, A. T.; Stegnar, M.; Stojan, J.; Kikelj, D. Novel non-covalent thrombin inhibitors incorporating P1 4,5,6,7-tetrahydrobenzothiazole arginine side chain mimetics. Eur. J. Med. Chem., 2004, 39 (3), 257-265.

[2] Salvagnini, C.; Gharbi, S.; Boxus, T.; Marchand-Brynaert, J. Synthesis and evaluation of a small library of graftable thrombin inhibitors derived from (l)-arginine. Eur. J. Med. Chem., 2007, 42 (1), 37-53.

[3] Kettner, C.; Mersinger, L.; Knabb, R. The selective inhibition of thrombin by peptides of boroarginine. J. Biol. Chem., 1990, 265 (30), 18289-18297.

[4] Xia, Y.; Chackalamannil, S.; Chan, T.-M.; Czarniecki, M.; Doller, D.; Eagen, K.; Greenlee, W. J.; Tsai, H.; Wang, Y.; Ahn, H.-S.; Boykow, G. C.; McPhail, A. T. Himbacine derived thrombin receptor (PAR-1) antagonists: Structure-activity relationship of the lactone ring. Bioorg. Med. Chem. Lett., 2006, 16 (18), 4969-4972.

[5] Di Nisio, M.; Middeldorp, S.; Büller, H. R. Direct Thrombin Inhibitors. N. Engl. J. Med., 2005, 353 (10), 1028-1040.

[6] (a) Thalassitis, A.; Hadjipavlou-Litina, D. J.; Litinas, K. E.; Miltiadou, P. Synthesis of modified homo-N-nucleosides from the reactions of mesityl nitrile oxide with 9-allylpurines and their influence on lipid peroxidation and thrombin inhibition. Bioorg. Med. Chem. Lett., 2009, 19 (22), 6433-6436; (b) Friedrich, R.; Riester, D.; Göttig, P.; Thürk, M.; Schwienhorst, A.; Bode, W. Structure of a novel thrombin inhibitor with an uncharged d-amino acid as P1 residue. Eur. J. Med. Chem., 2008, 43 (6), 1330-1335.

[7] Deswal, S.; Roy, N. Quantitative structure activity relationship studies of aryl heterocycle-based thrombin inhibitors. Eur. J. Med. Chem., 2006, 41 (11), 1339-1346.

[8] Howard, N.; Abell, C.; Blakemore, W.; Chessari, G.; Congreve, M.; Howard, S.; Jhoti, H.; Murray, C. W.; Seavers, L. C. A.; van Montfort, R. L. M. Application of Fragment Screening and Fragment Linking to the Discovery of Novel Thrombin Inhibitors. J. Med. Chem., 2006, 49 (4), 1346-1355.

[9] (a) Dinwoodey, D. L.; Ansell, J. E. Heparins, low-molecular-weight heparins, and pentasaccharides. Clin. Geriatr. Med., 2006, 22, 1-15; (b) Agnelli, G. Current issues in anticoagulation. Pathophysiol. Haem. Thromb., 2005, 34 (Suppl 1), 2-9.

[10] Verghese, J.; Liang, A.; Sidhu, P. P. S.; Hindle, M.; Zhou, Q.; Desai, U. R. First steps in the direction of synthetic, allosteric, direct inhibitors of thrombin and factor Xa. Bioorg. Med. Chem. Lett., 2009, 19 (15), 4126-4129.

[11] www.bindingdb.org; http://pubchem.ncbi.nlm.nih.gov/assay/assay.cgi?q=r&version=1.5&reqid=8361908697071728948.

[12] Brooks, B. R.; Bruccoleri, R. E.; Olafson, B. D.; States, D. J.; Swaminathan, S.; Karplus, M. CHARMM: A program for macromolecular energy, minimization, and dynamics calculations. J. Comput. Chem., 1983, 4 (2), 187-217.

[13] (a) Smellie, A.; Kahn, S. D.; Teig, S. L. Analysis of conformational coverage. 1. validation and estimation of coverage. J. Chem. Info. Comput. Sci., 1995, 35 (2), 285-294; (b) Smellie, A.; Kahn, S. D.; Teig, S. L. Analysis of conformational coverage. 2. applications of conformational models. J. Chem. Inf. Comput. Sci., 1995, 35 (2), 295-304.

[14] Wang, H.-Y.; Cao, Z.-X.; Li, L.-L.; Jiang, P.-D.; Zhao, Y.-L.; Luo, S.-D.; Yang, L.; Wei, Y.-Q.; Yang, S.-Y. Pharmacophore modeling and virtual screening for designing potential PLK1 inhibitors. Bioorg. Med. Chem. Lett., 2008, 18 (18), 4972-4977.

[15] (a) Bode, W.; Turk, D.; Karshikov, A. The refined 1.9-Å X-ray crystal structure of d-Phe-Pro-Arg chloromethylketone-inhibited human α-thrombin: Structure analysis, overall structure, electrostatic properties, detailed active-site geometry, and structure-function relationships. Protein Sci., 1992, 1 (4), 426-471; (b) Krishnan, R.; Mochalkin, I.; Arni, R.; Tulinsky, A. Structure of thrombin complexed with selective non-electrophilic inhibitors having cyclohexyl moieties at P1. Acta Crystallogr. D, 2000, 56 (3), 294-303; (c) Skordalakes, E.; Dodson, G. G.; Green, D. S. C.; Goodwin, C. A.; Scully, M. F.; Hudson, H. R.; Kakkar, V. V.; Deadman, J. J. Inhibition of human α-thrombin by a phosphonate tripeptide proceeds via a metastable pentacoordinated phosphorus intermediate. J. Mol. Biol., 2001, 311 (3), 549-555; (d) Tucker, T. J.; Brady, S. F.; Lumma, W. C.; Lewis, S. D.; Gardell, S. J.; Naylor-Olsen, A. M.; Yan, Y.; Sisko, J. T.; Stauffer, K. J.; Lucas, B. J.; Lynch, J. J.; Cook, J. J.; Stranieri, M. T.; Holahan, M. A.; Lyle, E. A.; Baskin, E. P.; Chen, I. W.; Dancheck, K. B.; Krueger, J. A.; Cooper, C. M.; Vacca, J. P. Design and synthesis of a series of potent and orally bioavailable noncovalent thrombin inhibitors that utilize nonbasic groups in the P1 position. J. Med. Chem., 1998, 41 (17), 3210-3219.

[16] Debnath, A. K. Pharmacophore mapping of a series of 2,4-diamino-5-deazapteridine inhibitors of Mycobacterium avium complex dihydrofolate reductase. J. Med. Chem., 2001, 45 (1), 41-53.

[17] Fischer, R. The principle of experimentation illustrated by a psycho-physical experiment. Hafner Publishing: New York, 1966; Chapter II.

[18] Guner, O. F. Pharmacophore Perception, Development, and Use in Drug Design. International University Line: 2000.

[19] Shoichet, B. K. Virtual screening of chemical libraries. Nature, 2004, 432 (7019), 862-865.

[20] Tondi, D.; Slomczynska, U.; Costi, M. P.; Watterson, D. M.; Ghelli, S.; Shoichet, B. K. Structure-based discovery and in-parallel optimization of novel competitive inhibitors of thymidylate synthase. Chem. Biol., 1999, 6 (5), 319-331.

[21] Developmental Therapeutics Program, NCI/NIH, 2000, http://dtp.nci.nih.gov.

[22] Maybridge Chemical company (England); http://www.chem.ac.ru/Chemistry/Databases/MAYBRIDGE.en.html.

[23] (a) Boda, K.; Seidel, T.; Gasteiger, J. Structure and reaction based evaluation of synthetic accessibility. J. Comput. Aided Mol. Des., 2007, 21 (6), 311-325; (b) Zaliani, A.; Boda, K.; Seidel, T.; Herwig, A.; Schwab, C.; Gasteiger, J.; Claußen, H.; Lemmen, C.; Degen, J.; Pärn, J.; Rarey, M. Second-generation de novo design: a view from a medicinal chemist perspective. J. Comput. Aided Mol. Des., 2009, 23 (8), 593-602.

[24] Wagner, A. B. SciFinder Scholar 2006: an empirical analysis of research topic query processing. J. Chem. Inf. Model., 2006, 46 (2), 767-774.

[25] Wang, Y.; Bolton, E.; Dracheva, S.; Karapetyan, K.; Shoemaker, B. A.; Suzek, T. O.; Wang, J.; Xiao, J.; Zhang, J.; Bryant, S. H. An overview of the PubChem BioAssay resource. Nucleic Acids Res., 2010, 38 (suppl 1), D255-D266.

[26] Yang, Q.; Du, L.; Tsai, K.-C.; Wang, X.; Li, M.; You, Q. Pharmacophore mapping for Kv1.5 potassium channel blockers. QSAR Comb. Sci., 2009, 28 (1), 59-71.

[27] Joseph, T. B.; Suneel Kumar, B. V. S.; Santhosh, B.; Kriti, S.; Pramod, A. B.; Ravikumar, M.; Kishore, M. Quantitative structure activity relationship and pharmacophore studies of adenosine receptor A2B inhibitors. Chem. Biol. Drug Des., 2008, 72 (5), 395-408.

[28] Ravikumar, M.; Pavan, S.; Bairy, S.; Pramod, A. B.; Sumakanth, M.; Kishore, M.; Sumithra, T. Virtual screening of cathepsin K inhibitors using docking and pharmacophore models. Chem. Biol. Drug Des., 2008, 72 (1), 79-90.

[29] Vadivelan, S.; Sinha, B. N.; Rambabu, G.; Boppana, K.; Jagarlapudi, S. A. R. P. Pharmacophore modeling and virtual screening studies to design some potential histone deacetylase inhibitors as new leads. J. Mol. Graph. Model., 2008, 26 (6), 935-946.

[30] (a) Klebe, G. Virtual ligand screening: strategies, perspectives and limitations. Drug Discov. Today, 2006, 11 (13-14), 580-594; (b) Muegge, I.; Oloff, S. Advances in virtual screening. Drug Discov. Today: Technol., 2006, 3 (4), 405-411.

[31] Lipinski, C. A.; Lombardo, F.; Dominy, B. W.; Feeney, P. J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Del. Rev., 1997, 23 (1-3), 3-25.

[32] (a) Leach, A. R.; Hann, M. M.; Burrows, J. N.; Griffen, E. J. Fragment screening: an introduction. Mol. BioSyst., 2006, 2 (9), 429-446; (b) Vistoli, G.; Pedretti, A.; Testa, B. Assessing drug-likeness - what are we missing? Drug Discov. Today, 2008, 13 (7-8), 285-294.

[33] Halgren, T. A.; Murphy, R. B.; Friesner, R. A.; Beard, H. S.; Frye, L. L.; Pollard, W. T.; Banks, J. L. Glide: A new approach for rapid, accurate docking and scoring. 2. enrichment factors in database screening. J. Med. Chem., 2004, 47 (7), 1750-1759.

[34] Nilsson, M.; Hämäläinen, M.; Ivarsson, M.; Gottfries, J.; Xue, Y.; Hansson, S.; Isaksson, R.; Fex, T. Compounds binding to the S2-S3 pockets of thrombin. J. Med. Chem., 2009, 52 (9), 2708-2715.

[35] Brighton, T. A. The direct thrombin inhibitor melagatran/ximelagatran. Med. J. Aust., 2005, 182 (5), 254-255.

[36] Krammer, A.; Kirchhoff, P. D.; Jiang, X.; Venkatachalam, C. M.; Waldman, M. LigScore: a novel scoring function for predicting binding affinities. J. Mol. Graph. Model., 2005, 23 (5), 395-407.

[37] Totrov, M. Atomic property fields: generalized 3D pharmacophoric potential for automated ligand superposition, pharmacophore elucidation and 3D QSAR. Chem. Biol. Drug Des., 2008, 71 (1), 15-27.

[38] Gehlhaar, D. K.; Verkhivker, G. M.; Rejto, P. A.; Sherman, C. J.; Fogel, D. R.; Fogel, L. J.; Freer, S. T. Molecular recognition of the inhibitor AG-1343 by HIV-1 protease: conformationally flexible docking by evolutionary programming. Chem. Biol., 1995, 2 (5), 317-324.

[39] Jain, A. N. Scoring noncovalent protein-ligand interactions: A continuous differentiable function tuned to compute binding affinities. J. Comput. Aided Mol. Des., 1996, 10 (5), 427-440.

[40] (a) Böhm, H.-J. The development of a simple empirical scoring function to estimate the binding constant for a protein-ligand complex of known three-dimensional structure. J. Comput. Aided Mol. Des., 1994, 8 (3), 243-256; (b) Böhm, H.-J. Prediction of binding constants of protein ligands: A fast method for the prioritization of hits obtained from de novo design or 3D database search programs. J. Comput. Aided Mol. Des., 1998, 12 (4), 309-323.

[41] Dullweber, F.; Stubbs, M. T.; Musil, D.; Stürzebecher, J.; Klebe, G. Factorising ligand affinity: a combined thermodynamic and crystallographic study of trypsin and thrombin inhibition. J. Mol. Biol., 2001, 313 (3), 593-614.

[42] Wallace, A. C.; Laskowski, R. A.; Thornton, J. M. LIGPLOT: a program to generate schematic diagrams of protein-ligand interactions. Protein Eng., 1995, 8 (2), 127-134.

Received: February 3, 2013 Revised: May 18, 2013 Accepted: May 22, 2013