Recognition of docking sites on a protein using β-shape based on Voronoi diagram of atoms
Deok-Soo Kim a,b,*, Cheol-Hyung Cho b, Donguk Kim b, Youngsong Cho b
a Department of Industrial Engineering, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul 133-791, South Korea
b Voronoi Diagram Research Center, Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul 133-791, South Korea
Received 14 June 2005; accepted 22 November 2005
Abstract
A protein consists of atoms. Given a protein, the automatic recognition of depressed regions on the surface of the protein, often called docking
sites or pockets, is important for the analysis of interaction between a protein and a ligand and facilitates fast development of new drugs.
Presented in this paper is a geometric approach for the detection of docking sites using a β-shape, which is based on the Voronoi diagram of
atoms in the Euclidean distance metric. We first propose a geometric construct called a β-shape which represents the proximity among atoms on the
surface of a protein. Then, using the β-shape, which takes the size differences among different atoms into account, we present an algorithm to
extract the pockets for the possible docking site on the surface of a protein.
© 2005 Elsevier Ltd. All rights reserved.
Keywords: Pocket; Binding sites; Docking; Voronoi diagram of spheres; β-shape; Protein interaction; Drug design
1. Introduction
Molecules such as proteins, DNA, and RNA consist of
atoms. Given the atomic complexes of these molecules,
analyzing interactions between them is important for understanding
their biological functions. The interaction between a
protein and a small molecule is also one of the most important
issues in designing new drugs.
The study of molecular interactions, such as the docking of a
protein with a ligand or protein folding, can be approached
from a physicochemical and/or a geometrical point of view
[52]. While the physicochemical approach is to find regions on
the surface of a protein which minimize the potential energy
between two molecules, the geometric approach is to
determine whether two molecules have geometrically meaningful
features for the interaction.
0010-4485/$ - see front matter © 2005 Elsevier Ltd. All rights reserved.
doi:10.1016/j.cad.2005.11.008
* Corresponding author. Address: Department of Industrial Engineering,
Hanyang University, 17 Haengdang-dong, Seongdong-gu, Seoul 133-791,
South Korea. Tel.: +82 2 2220 0472; fax: +82 2 2292 0472.
E-mail addresses: [email protected] (D.-S. Kim), [email protected] (C.-H. Cho), [email protected] (D. Kim), [email protected] (Y. Cho).
A docking between a protein, called a receptor, and a small
molecule, called a ligand, usually occurs around depressed
regions, called docking sites or pockets, on the surface of a
receptor. Since designing a new drug requires finding a small
chemical which can dock or bind at pockets on a protein, the
recognition of pockets on proteins is one of the most
fundamental processes in drug design. Considering that
chemical databases usually contain millions of chemical data
entries, manually identifying pockets on the surface of a
protein is time-consuming and error-prone. Therefore, the
automatic recognition of pockets and the evaluation of
the binding of a chemical to a pocket are important in
the study of protein-ligand docking for the development of new
drugs [37].
While efforts on the physicochemical approach to this
issue date back to the early days of science, efforts to
understand the geometric perspective of biological systems
have started only very recently [1,18,27,43,48,53,57]. Since
geometry is a critical consideration for biological systems
in various important aspects, just as in any other discipline,
research on geometry in biological systems will provide
new challenges as well as opportunities for the CAD and CAGD
community.
In this paper, we will present the definition of a docking site,
also referred to as a pocket, on the surface of a protein from a
geometric point of view and present an effective and efficient
algorithm to automatically recognize pockets. Most proteins
consist of at most six different types of atoms: H, C, N, O, P,
and S, which have the corresponding van der Waals radii of 1.2,
1.7, 1.55, 1.52, 1.8, and 1.8 Å, respectively [63]. These atoms
with van der Waals radii are usually called van der Waals
atoms. The number of atoms in a protein varies from hundreds
to hundreds of thousands.
Computer-Aided Design 38 (2006) 431–443
www.elsevier.com/locate/cad
Given a protein, the proposed algorithm first computes a
Voronoi diagram of van der Waals atoms. Then, a construct
called a β-shape is computed from the Voronoi diagram using a
spherical probe. The Voronoi diagram of atoms presented in
this paper is similar to the ordinary Voronoi diagram for points
in the sense that the Euclidean distance metric is used.
However, it differs from the ordinary Voronoi diagram since
the distance is measured from the surface of atoms, not from
the centers of atoms.
The first step in defining a docking site is to define the
spatial proximity among the atoms on the surface of the
protein. This is done by using a mesh-like construct called a
β-shape, which is similar in some ways to the well-known
α-shape [19]. Then, pocket primitives are defined on the
β-shape, where a pocket primitive is a unit of depressed region
on the β-shape. Lastly, the validity of the boundaries between
neighboring pocket primitives is evaluated to test whether two
neighbors should be considered as belonging to a single pocket
or not. Eventually, there will be a few pockets left on the
surface of a receptor where each pocket corresponds to an
appropriately depressed region.
We want to emphasize here that a β-shape takes the size
variation of atoms into account for the computation of
proximity among the atoms on the surface of a protein. Recall
that the radius difference between atoms, H and P for example,
is quite significant.
In Section 2 of this paper, we review the previous work
related to the automatic recognition of docking sites on
proteins. After introducing the geometric model of a protein as
an atomic complex in Section 3, we discuss the representation
of topology for a whole protein in Section 4. Section 5 presents
the issues related to the topology among the atoms on the
surface of proteins and provides a definition of the β-shape.
Section 6 discusses how to extract pocket primitives from a
β-shape, and Section 7 shows the automatic recognition of
pockets via the merging of pocket primitives. Section 8
concludes the paper by showing some experimental results.
2. Literature review
Since it is generally agreed that the functions of a protein are
largely determined by its geometric structure, the study of
the geometric characteristics of proteins has recently been getting
more attention. In addition, mature technologies in geometry,
such as CAD and computational geometry, are providing and will
continue to provide a strong driving force for this trend.
The first formal treatment of geometry for a biological
atomic complex that we are aware of is the study of Bernal and
Finney in 1967. They examined the packing characteristics of
the complex [7]. Lee and Richards, in 1971, presented the
definition of solvent accessible surface which provided a
theoretical foundation for analyzing the mass properties of
proteins [43]. In 1974, Richards defined a molecular surface
using the concept of a Voronoi diagram of atom centers, which
became the basis for most structural analyses of proteins,
including the extraction of pockets [54]. Connolly later
reported how to compute the molecular surface analytically
and beautifully visualized the rendered molecular surface
[13,14]. Thereafter, the molecular surface has also been
referred to as a Connolly surface.
Compared to the above studies, research on algorithmically
extracting cavities and/or pockets on protein began much later.
Geometric approaches to extracting pockets on a protein can be
broadly categorized into three types: a grid-based approach, a
sphere-coating approach, and an approach based on some
representation of surface atoms on the protein.
Being both conceptually and computationally easier than
the other two approaches, the grid-based approach was the first
method of choice for extracting pockets. The grid-based
approach primarily defines a 3D spatial lattice of the space
occupied by the protein and uses simple techniques to reason
about the relations among the grid points in the lattice. The
grid points, associated with some attributes, are then used to
extract the exterior boundary of the protein and recognize the
depressed regions on the surface. After some efforts to
fill small spheres around a protein and
separate meaningful chunks of spheres, the
mainstream of research moved on to mathematically
rigorous, computationally efficient, and robust representations of
the atomic complex. The following is a literature review on this
topic, which is summarized in Fig. 1.
The first research that we are aware of in the geometric
approach is by Voorintholt et al., in 1989, employing the
grid-based approach [59]. They created a grid over the bounding box
of a protein where each grid point was associated with a
distance value to the nearest atom. While the focus of this
research was on a fast visualization of proteins, the density map
thus obtained was also effective in discriminating the regions of
low density where cavities exist. The distance metric used in
this work was the squared Euclidean distance, which saves
the computation time of the square-root operation. In addition, the
concept of a digital differential analyzer was used to speed
up the distance computation for each voxel.
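As a rough illustration of the grid-based idea described above, the following Python sketch assigns each grid point the squared Euclidean distance to the nearest atom center and then lists the low-density points. The function names, the cubic box, the grid spacing, and the threshold are assumptions made for illustration; the sketch does not reproduce the digital differential analyzer speed-up of [59].

```python
from itertools import product

def squared_distance_grid(centers, lo, hi, step):
    """Map each grid point of the cubic box [lo, hi]^3 to the squared
    Euclidean distance to the nearest atom center (no square root)."""
    n = int(round((hi - lo) / step)) + 1
    axis = [lo + i * step for i in range(n)]
    grid = {}
    for p in product(axis, repeat=3):
        grid[p] = min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers)
    return grid

def low_density_points(grid, threshold):
    """Grid points farther than `threshold` from every atom center.
    The comparison is done on squared values, so no root is needed."""
    t2 = threshold * threshold
    return [p for p, d2 in grid.items() if d2 > t2]

# Hypothetical toy input: two atom centers at opposite corners of a box.
centers = [(0.0, 0.0, 0.0), (4.0, 4.0, 4.0)]
grid = squared_distance_grid(centers, 0.0, 4.0, 2.0)
sparse = low_density_points(grid, 3.0)
```

Comparing squared distances against a squared threshold keeps the ordering of the true distances, which is why the square root can be skipped.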
In 1990, Ho and Marshall proposed another grid-based
algorithm consisting of two steps [28]: first, they created a
bounding box of a protein to define a uniform grid, and then
sliced the bounding box. Then, after filling the cavity of the
protein with filler atoms using a flood-filling algorithm, they
isolated cavities using a boolean complement operation.
In 1991, Alard and Wodak reported on an elegant approach
to detect internal voids of a protein using the concept of topology
[3]. Suppose that the intersection among all atoms is computed;
then the surfaces of all atoms are subdivided into a set of
spherical polygons. Some of the polygons are interior to
atoms and the others are not. Note that those not interior to
atoms can be easily separated from the others. Reinventing the
idea of the B-rep and the related concepts in geometric
modeling [39], such as orientations, topology operations, etc.,
they separately constructed an outer shell and inner shells, if
they exist.
Fig. 1. Summary of research on detecting docking sites on a protein.
A year later, in 1992, Levitt and Banaszak reported on a
rather simple yet practical algorithm to detect both
internal and external cavities [44]. They defined a fine-grained
grid in the bounding box of a protein and isolated the grid cells
intersecting the atoms. By scanning the whole grid for each Y
and Z value of the grid system in the +X direction from the
smallest X value, they isolated the cells contained between cells
which intersect the atoms. These cells together define a cavity.
To visualize the surface of cavities, they used marching cubes
for the approximation.
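The scanning step described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: grid cells are integer triples, `occupied` is a hypothetical set of cells that intersect atoms, and only the +X scan mentioned in the text is performed; it is not the implementation of [44].

```python
def cavity_cells_along_x(occupied, nx, ny, nz):
    """For each (y, z) row, scan in the +X direction and collect the empty
    cells that lie between two cells intersecting atoms."""
    cavity = set()
    for y in range(ny):
        for z in range(nz):
            xs = [x for x in range(nx) if (x, y, z) in occupied]
            if len(xs) < 2:
                continue  # a row needs two occupied cells to enclose anything
            for x in range(xs[0] + 1, xs[-1]):
                if (x, y, z) not in occupied:
                    cavity.add((x, y, z))
    return cavity

# Hypothetical 4 x 2 x 1 grid: row y = 0 has occupied cells at x = 0 and x = 3.
occupied = {(0, 0, 0), (3, 0, 0), (1, 1, 0)}
cells = cavity_cells_along_x(occupied, 4, 2, 1)
```

In this toy grid the two empty cells between x = 0 and x = 3 in the row y = 0 are collected, while the row y = 1, with a single occupied cell, contributes nothing.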
Kleywegt and Jones presented an algorithm to detect
internal voids and invaginations, i.e. depressions on a protein
whose mouths are relatively narrow [36]. Suppose that atoms in
a protein are fattened, or offset, by a specified scale. Then,
these features are isolated from outside and therefore can be
recognized by collecting the grid points lying interior to the
fattened protein by applying the technique studied previously.
In 1995, Laskowski presented an algorithm consisting of
two steps [38]. He first computed some representative tangent
spheres from surface atoms of a protein. Hence, many of the
tangent spheres intersect each other. Then, by collecting the
tangent spheres intersecting each other in an appropriate
density, he constructed the boundary of a cavity.
After computing exterior spherical polygons as Alard and
Wodak did [3], Seidl and Kriegel, in 1995, presented a topology
data structure, similar to the well-known winged-edge data
structure, among the spherical polygons [55]. They also
classified the spherical polygons into three categories: convex,
concave, and saddle patches. Using the idea of region growing,
they segmented the molecular surface while the neighboring
patches were approximated by a paraboloid which was
considered a cavity.
Recently, researchers have started to use more rigorously
defined mathematical and computational tools related to the
geometry among the atoms in a protein. The α-shape, reported
in 1994, is one of the most powerful tools. Since the α-shape can
construct the surface of a protein quite efficiently for a fixed-size
probe, it has often been used in the extraction of pockets [19].
In 1996, Peters et al. published a novel algorithm using an
α-shape to construct the topology among atoms on the surface
of a protein [53]. They defined two constructions of surface
forms: global and detailed forms corresponding to larger and
smaller values of α, respectively. By investigating the
discrepancy between the two forms, they came up with a
truly automatic recognition scheme for the cavities. We want to
mention here that the concept used in our algorithm is similar
to this in that we also use two forms.
Starting from the mathematical definition of a pocket by
Edelsbrunner et al. [18] based on the well-known α-shape, Liang et al. in 1998 presented a decent algorithm and system
for the extraction of pockets from a protein [48]. The algorithm
is based on the discrete-flow method, which can be explained in
2D as follows. After an α-shape is constructed, each triangle
outside the α-shape is tested to see if it is obtuse or acute. Then,
all obtuse triangles are merged into the neighboring acute
triangle, which results in a pocket with a relatively narrow
mouth. Note that the pocket recognized in this approach is
equivalent to the invagination in [36]. The implementation was
later successfully packaged into the popular software CAST.
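The obtuse-or-acute test at the heart of the 2D explanation above can be sketched in Python as follows. The function name is hypothetical, and the rule that an obtuse triangle merges across the side opposite its obtuse angle (its longest side) is an assumption taken from the usual formulation of the discrete flow; the text above does not spell it out.

```python
def classify_triangle(a, b, c):
    """Return ('obtuse', edge) or ('acute', None) for a 2D triangle.
    `edge` is the longest side, i.e. the side opposite the obtuse angle.
    Right triangles are reported as 'acute' (non-obtuse) here."""
    def d2(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
    sides = sorted([(d2(a, b), (a, b)), (d2(b, c), (b, c)), (d2(c, a), (c, a))])
    (s1, _), (s2, _), (s3, longest) = sides
    # Law of cosines: the largest angle is obtuse iff s3 > s1 + s2
    # (all values are squared side lengths).
    if s3 > s1 + s2:
        return "obtuse", longest
    return "acute", None

flat = classify_triangle((0, 0), (4, 0), (2, 0.5))
tall = classify_triangle((0, 0), (2, 0), (1, 2))
```

A full discrete-flow implementation would apply this test to every Delaunay triangle outside the α-shape and follow the merges until an acute triangle is reached.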
In 2000, Brady and Stouten improved the sphere coating
approach to repeat coating spheres layer by layer. After a layer
of spheres is coated, some irrelevant spheres are removed from
the coated layer. Then, another layer of spheres is coated [8].
After some iterations, each chunk of spheres deposited around
cavities is identified as a pocket.
It is worth mentioning that there have been other approaches
as well. Delaney reported an approach based on a pattern
recognition technique using cellular logic operations from
image processing, where a logic value is assigned to each grid point
[17]. In [49], Masuya and Doi described the definition of
pockets using the concept of set operations.
3. Geometric models of protein and related terminologies
In order to analyze the geometric characteristics of proteins,
it is necessary to have an appropriate geometric model for the
proteins. Depending on the application, various models such as
a hard sphere model, a ball-and-stick model, a ribbon model, or
a combination of the above have been used. In this research, we
have adopted the most popular hard-sphere model with van der
Waals atoms, which is sometimes called a CPK-model. A
protein represented by the CPK-model is shown in Fig. 2. The
balls in the figure denote van der Waals atoms constituting a
protein.
In most studies analyzing the geometric characteristics of a
protein with respect to another molecule, which is usually
relatively small, the analysis is done using the concept
of a spherical probe which encloses the small molecule. While
a probe is an approximation of the small molecule, the probe
best represents the molecule by incorporating its shape,
conformation changes, and all possible orientations of the
ligand with respect to the protein. Hence, it is considered that
the behavior of a probe best represents the geometric behavior
of the molecule with respect to a protein. In the case of a water
molecule, the corresponding probe is a sphere with a radius
of 1.4 Å.
The points on the boundary of van der Waals atoms
constitute a boundary surface of a protein which is
conveniently referred to as the van der Waals surface of the
protein. In addition, there are two more important types of
surfaces associated with a protein: the solvent accessible
surface and the molecular surface. The solvent accessible
surface consists of the points in space where the center of the
probe is located when the probe is in contact with the protein.
The inner-most possible trajectories of points on the probe
surface, then, define a molecular surface. A solvent accessible
surface usually defines a free-space in which a small molecule can
move around without interfering with the protein and therefore
plays a fundamental role for folding and/or docking [43]. On
the other hand, the molecular surface, often referred to as the
Connolly surface after the name of the researcher who first
analytically computed the surface, conveniently defines the
boundary between the interior and exterior volume of a protein
so that the volume or the density of the protein can be
calculated [13].
Fig. 2. The geometric model of a protein consisting of five atoms. Shown in the
figure are the van der Waals surface (VWS), the solvent accessible surface
(SAS), the contact surface (CS), and the reentrant surface (RS) corresponding
to a probe p.
A molecular surface consists of two parts: the contact
surface and the reentrant surface. A contact surface consists of
points on the van der Waals surface of atoms which can be
contacted by the probe surface, and a reentrant surface consists
of points in the free-space touched by the probe when the probe
is in contact with nearby atoms in the protein. Note that atoms
contributing to the contact surface define the boundary of the
protein. In this paper, we will refer to such atoms as surface
atoms. Points on the molecular surface are always accessible
by the probe as it rolls over the protein. In the geometric
modelling community, the reentrant surface is called the
blending surface and its computation has been studied quite
extensively in a rather general setting [6,9,26]. The solvent
accessible surface is also known in the geometric modelling
community as the offset surface of a protein using the probe
radius as an offset distance [29]. Note that the definitions of all
of the above-mentioned surfaces depend on a probe.
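Since the solvent accessible surface is the offset of the protein by the probe radius, a point can be tested against it with a few lines of Python. The following is a minimal sketch: the function names, the tolerance, and the two-atom toy protein are assumptions for illustration, and atoms are represented as hypothetical (center, radius) pairs.

```python
import math

def clearance(point, atoms, probe_radius):
    """Smallest value of dist(point, c_i) - (r_i + probe_radius) over all atoms.
    Zero: a probe centered at `point` touches the protein, so the point is on
    the solvent accessible surface; positive: free space; negative: the probe
    would intersect an atom."""
    return min(math.dist(point, c) - (r + probe_radius) for c, r in atoms)

def on_solvent_accessible_surface(point, atoms, probe_radius, tol=1e-9):
    """True when `point` lies on the offset surface, up to the tolerance."""
    return abs(clearance(point, atoms, probe_radius)) <= tol

# Hypothetical toy protein: a C atom (1.7) and an H atom (1.2), water probe 1.4.
atoms = [((0.0, 0.0, 0.0), 1.7), ((3.0, 0.0, 0.0), 1.2)]
water = 1.4
touching = on_solvent_accessible_surface((-3.1, 0.0, 0.0), atoms, water)
blocked = clearance((1.5, 0.0, 0.0), atoms, water)
```

The same clearance value with a probe radius of zero gives the signed distance to the van der Waals surface, which shows how both surfaces depend on the probe only through the offset distance.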
Fig. 3 shows two molecules, a receptor R and a ligand L,
interacting with each other via a pocket defined on the surface
of molecule R. R and L interact with each other since the
protruding region of L is geometrically inserted into the pocket
on the surface of R.
Let $A = \{a_1, a_2, \ldots, a_n\}$ be a protein consisting of atoms $a_i = (c_i, r_i)$ where $c_i = (x_i, y_i, z_i)$ and $r_i$ are the center and the radius of
the atom $a_i$, respectively. In addition, suppose that $L = \{l_1, l_2, \ldots, l_m\}$ is a ligand which also consists of a number of
atoms $l_j$, defined similarly to $a_i$, and $L$ will be docking with $A$.
Let $C = \{c_1, c_2, \ldots, c_n\}$ be the set of centers of atoms. Note that in
general $m \ll n$. Let $p = (c_p, r_p)$, called a probe, be the minimum
sphere enclosing all atoms in the ligand $L$.
Let $p_j$ be a pocket where $p_j = \{a_{j1}, a_{j2}, \ldots, a_{jk}\}$ and these
atoms together define a depressed region on the surface $S$ of
protein $A$. The surface $S = \{s_1, s_2, \ldots, s_l\}$ is defined as a set of
atoms of the protein where some points on the surface of the
atoms contribute to the molecular surface. Hence, $S$ is a set of
atoms $a_i \in A$ which may be touched by the ligand. Since
$p_j \subseteq S \subseteq A$ and $p_1 \cup p_2 \cup \cdots \cup p_p \subseteq S \subseteq A$, there may be some
atoms not included in any pocket. Let $P = \{p_1, p_2, \ldots, p_p\}$ be
the set of all possible pockets on $S$.
Fig. 3. A docking configuration between a receptor and a ligand.
4. Topology representation for a whole protein
To efficiently respond to queries about the spatial structure
of a protein, it is necessary to have a convenient
representation of the spatial structure among atoms constitut-
ing the protein.
In the study of protein structures, the ordinary Voronoi
diagram VD(C) for the set C of atom centers and
its dual structure, the Delaunay triangulation, have been frequently
used since Bernal and Finney first introduced them in 1967 [7,54].
VD(C) forms a tessellation of space where each region in the
tessellation consists of the locations in space closer to the
corresponding input point than to any other. Since VD(C) is mathematically
well-defined and efficient as well as robust codes are available,
it has been widely used in most previous studies
[4,16,20,23,24,50,56,58,62].
Recognizing that VD(C) does not account for
the size differences among different atoms, Richards also
proposed a scheme, in 1974, to translate bisector edges toward the
smaller atoms according to the ratio between the radii of two
neighboring atoms [54]. However, this transformation does not
necessarily produce a valid tessellation because the vertices
are not well-defined. Richards used the term vertex error to
describe this situation.
Noting the vertex error, Gellatly and Finney proposed, in
1982, a method using radical planes instead of the translated
Voronoi edges to make sure that no vertex error occurs [22].
This radical plane approach is in fact equivalent to the power
diagram PD(A) for an atom set A, as named by Aurenhammer
in 1987 [5]. Since then, the power diagram has been frequently
used in biology problems since it reflects the size differences
among atoms at a certain level [6,21,22]. Note that the theory
of PD(A) is also well-established and efficient and robust codes
are available [65]. However, PD(A) does not fully reflect the
size difference in the sense that the distance from a location in
space to an atom is the tangential distance rather than the
Euclidean minimum distance.
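The difference between the tangential (power) distance and the Euclidean minimum distance can be made concrete with a small Python sketch. The function names and the two-atom configuration are assumptions for illustration; the radii 1.2 and 1.8 are the H and P radii quoted earlier.

```python
import math

def power_distance(p, atom):
    """Power of point p with respect to the atom sphere: |p - c|^2 - r^2
    (the squared tangential distance when p is outside the atom)."""
    c, r = atom
    return math.dist(p, c) ** 2 - r ** 2

def surface_distance(p, atom):
    """Euclidean minimum distance from p to the atom surface: |p - c| - r."""
    c, r = atom
    return math.dist(p, c) - r

def nearest(p, atoms, metric):
    """Index of the atom closest to p under the given metric."""
    return min(range(len(atoms)), key=lambda i: metric(p, atoms[i]))

# Hypothetical pair: H (r = 1.2) at the origin and P (r = 1.8) at x = 6.
atoms = [((0.0, 0.0, 0.0), 1.2), ((6.0, 0.0, 0.0), 1.8)]
q = (2.8, 0.0, 0.0)
by_power = nearest(q, atoms, power_distance)
by_surface = nearest(q, atoms, surface_distance)
```

On the line through the two centers, the power bisector sits at x = 2.85 and the surface-distance bisector at x = 2.7, so the query point at x = 2.8 is assigned to the H atom by the power diagram but is actually closer to the surface of the P atom.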
Hence, in our research, we propose to use the Voronoi
diagram of atoms where the distance is defined as the
Euclidean minimum distance, instead of the tangential
distance, from the surfaces of atoms. While the ordinary
Voronoi diagram of points and the power diagram have been
studied quite extensively and efficient computational codes are
available, their counterpart, the Voronoi diagram of spheres,
has not been studied as much. In many applications for
proteins, the ordinary and power-metric Voronoi diagrams are
approximations of what is actually needed. It is only very
recently that the fast construction of the Voronoi diagram for
circles and spheres with different radii became practical
[30–34]. With the Voronoi diagram for spheres
available, many studies of proteins from a geometric perspective
can be done quite efficiently. The constructed Voronoi
diagram is then stored in a radial data structure for the efficient
processing of various queries [10].
A Voronoi diagram VD(A) for an atom set A is defined as
follows. Associated with each atom $a_i = (c_i, r_i) \in A$, where $c_i$ and
$r_i$ are the center and radius of $a_i$, there is a corresponding
Voronoi region
$$VR_i = \{\, p \mid \mathrm{dist}(p, c_i) - r_i < \mathrm{dist}(p, c_j) - r_j,\ i \neq j \,\}.$$
Note that dist(p, q) denotes the ordinary Euclidean distance,
i.e. $\mathrm{dist}(p, q) = \sqrt{(x_p - x_q)^2 + (y_p - y_q)^2 + (z_p - z_q)^2}$. Then,
$VD(A) = \{VR_1, VR_2, \ldots, VR_n\}$ is the Voronoi diagram for the
given atoms and is represented as $G^V = (V^V, E^V, F^V)$ where
$V^V = \{v^V_1, v^V_2, \ldots\}$, $E^V = \{e^V_1, e^V_2, \ldots\}$ and $F^V = \{f^V_1, f^V_2, \ldots\}$ are
sets of Voronoi vertices, edges and faces, respectively. From
the definition of a Voronoi diagram, a Voronoi vertex $v^V$ is the
center of an empty sphere tangent to four nearby atoms, while a
Voronoi edge $e^V$ is defined as a locus of points equi-distant
from the surfaces of three surrounding atoms. In addition, a
Voronoi face $f^V$ is the surface defined by two neighboring
atoms. Note that the face is always a hyperbolic surface and
any point on the face is equi-distant from the surfaces of both
atoms. For more details, readers are recommended to refer to
[33,34].
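The hyperbolic shape of a Voronoi face follows directly from the definition of the Voronoi region. As a short worked step, using only the symbols defined above, a point p on the face between atoms $a_i$ and $a_j$ satisfies:

```latex
\begin{align*}
\mathrm{dist}(p, c_i) - r_i &= \mathrm{dist}(p, c_j) - r_j \\
\Longleftrightarrow\quad
\mathrm{dist}(p, c_i) - \mathrm{dist}(p, c_j) &= r_i - r_j .
\end{align*}
```

The difference of the distances from p to the two fixed points $c_i$ and $c_j$ is therefore the constant $r_i - r_j$, which is the defining property of one sheet of a hyperboloid of two sheets with foci $c_i$ and $c_j$. In the special case $r_i = r_j$ the constant is zero and the sheet flattens into the perpendicular bisector plane of the ordinary Voronoi diagram.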
It is important to mention the combinatorial complexity of
the Voronoi diagram of spheres. While the numbers of vertices,
edges, and faces of the Voronoi diagram of general spheres are
all $O(n^2)$ in the worst case, the average numbers for those are
all $O(n)$. Halperin found that the upper bound of the
combinatorial complexity for all of the vertices, edges, and
faces of the Voronoi diagram for atoms in a protein is $O(n)$ in
the worst case [25]. This property is due largely to two
characteristics of atom distributions in a protein. According to
Pauli’s exclusion principle, two atoms cannot be located at the
same position meaning that an atom cannot be contained by
another atom [51]. In addition, the differences in the atom radii
are within a constant since most proteins consist of six different
types of atoms, such as H, C, N, O, P, and S, with the
corresponding van der Waals radii as discussed earlier. Under
these conditions, Halperin showed that the number of
neighboring atoms, which define Voronoi faces, for a given
atom is constant in the worst-case.
5. Topology among surface atoms of protein
Since $p_j \subseteq S$, extracting pockets requires queries on the
surface shape of the protein. Hence, an appropriate definition of
the surface of a protein and an efficient representation of the
topological structure among atoms on the surface are necessary.
5.1. α-shape
In 1994, Edelsbrunner proposed a 3D α-shape from a set of
3D points [19]. The α-shape is defined by carving out the space
with an omnipresent open sphere with a radius α when
the sphere does not contain any input point. When the
sphere touches two points, they are connected by an edge. When the
sphere touches three points, a triangular face is defined by the
points. When the sphere touches only one point, the point is left
as a singleton. Then, these points, edges and triangular faces
together define the α-shape for the given point set. When $\alpha = \infty$, the α-shape of a point set is the boundary of the convex hull
of the set. If $\alpha = 0$, the α-shape is the point set itself. A robust
algorithm with $O(n^2)$ worst-case time complexity was also
given by the same authors to construct an α-shape from the
Delaunay triangulation of the point set.
Since an α-shape defines the concept of shape without any
ambiguity and since efficient and robust code is available,
there have been several studies based on the α-shape in biology,
such as the automatic recognition of pockets [18,48,53], the detection of internal
voids of a protein [47], and the calculation of the area and volume of a
protein [2,45,46]. There has also been research based on the
α-shape in computer graphics.
However, an α-shape suffers from the fact that it does not
incorporate the size differences among atoms since an α-shape
is computed from the Delaunay triangulation of atom centers.
Fig. 4, for example, illustrates a possible problematic situation
that can be encountered due to the size differences. Shown in
Fig. 4(a) is a protein with a pocket-like depression on the
surface and the mouth of the depression is located between two
relatively large atoms $a_l$ and $a_r$. Since the probe p, with a radius
$r_p$, cannot freely enter into the depression without colliding
with $a_l$ or $a_r$ or both, the depression should not be considered as
a pocket. Fig. 4(b) shows the ordinary point set Voronoi
diagram for the centers of atoms and its corresponding
Delaunay triangulation. Fig. 4(c) shows the corresponding
α-shape computed from the Delaunay triangulation with the
value of α as $r_p$ plus the average radius of all atoms in the
protein. As shown in the figure, the probe passes the mouth,
represented as a dotted Delaunay edge between $c_l$ and $c_r$, freely
and therefore a false pocket will be concluded as shown in
Fig. 4(d).
Fig. 4. A false pocket recognized from the α-shape of a protein. (a) A protein
and a probe, (b) the ordinary Voronoi diagram for the atom centers and the
corresponding Delaunay triangulation, (c) the α-shape corresponding to the
inflated probe $\tilde{p}$, and (d) the false pocket recognized from the α-shape.
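The false-pocket situation can be reproduced numerically with a simplified one-dimensional gap test in Python. This is only a sketch of the mouth condition, not an α-shape computation: the function names and all numbers are assumptions, and the center-based test uses the fact that a ball of radius α cannot touch two points that are more than 2α apart, so no α-shape edge closes the mouth in that case.

```python
def alpha_mouth_open(d_centers, alpha):
    """Center-based test: no alpha-ball touches both centers when they are
    more than 2 * alpha apart, so the alpha-shape leaves the mouth open."""
    return d_centers > 2.0 * alpha

def true_mouth_open(d_centers, r_left, r_right, probe_radius):
    """Surface-based test: the probe passes between two atoms only when the
    gap between their surfaces is at least the probe diameter."""
    return d_centers - r_left - r_right >= 2.0 * probe_radius

# Hypothetical numbers: two large atoms (r = 1.8) at the mouth, a probe of
# radius 1.4, and an average atom radius of 1.5 for the whole protein.
r_l = r_r = 1.8
r_p = 1.4
alpha = r_p + 1.5          # alpha chosen as in the text: r_p + average radius
d = 6.0                    # distance between the two atom centers
says_open = alpha_mouth_open(d, alpha)
really_open = true_mouth_open(d, r_l, r_r, r_p)
```

With these numbers the center-based test reports an open mouth, although the gap between the two atom surfaces (2.4) is smaller than the probe diameter (2.8), which is exactly the kind of false pocket illustrated in Fig. 4.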
5.2. β-shape
Despite its many virtues, an α-shape is unable to account for
the size differences among different types of atoms. Hence, we
have recently proposed a geometric construct called a β-shape
based on the Voronoi diagram of spheres [35]. We first
introduce the concept of β-hulls and then extend it to β-shapes.
Conceptually, a β-hull is a generalization of an α-hull and can
be similarly described. The point set from which an α-hull is
defined is now replaced by a set of three-dimensional spherical
balls.
Consider $\mathbb{R}^3$ filled with Styrofoam and some spherical
rocks scattered around inside the Styrofoam. The radii of the
spherical rocks vary. Now imagine a spherical eraser with
radius β. Then, carving out the Styrofoam with an
omnipresent and empty spherical eraser with the radius of
β will result in a β-hull. Since the eraser is omnipresent, there
can be interior voids as well. Recall that an α-shape is
obtained by straightening the curved geometry in the
corresponding α-hull. A β-shape can be similarly explained
with a slight, yet fundamental, difference. In the β-family,
therefore, the relationship between a β-hull and the
corresponding β-shape is slightly different from their
counterparts of the α-family.
Suppose that we are given a β-hull for an atomic
structure A. Then, by connecting the centers of the appropriate
atoms with edges and triangles when a β-ball at a particular
position in the space touches two or three nearby atoms
simultaneously, respectively, the β-shape for the set A
corresponding to the β-hull can be obtained. The details of
the definition, the properties and the algorithms for the β-shape
are presented in [35].
6. Extraction of pocket primitives from β-shape
Let $p_L$ and $p_\infty$ be a probe for a ligand L and a hypothetical
probe with infinite radius, respectively. Let $B_L$ and $B_\infty$ be the
β-shapes of a protein corresponding to $p_L$ and $p_\infty$, respectively.
Then, $B_\infty$ is a β-shape bounded by faces defined by the centers
of atoms with unbounded Voronoi regions. Unlike an α-shape,
however, a β-shape $B_\infty$ may contain some isolated vertices.
Let $B_I$ and $B_O$ denote $B_L$ and $B_\infty$ to mean the inner and outer
β-shapes of a given model, respectively. Suppose that $B_I = (V^B_I, E^B_I, F^B_I)$ and $B_O = (V^B_O, E^B_O, F^B_O)$. Let $V^B_O = \{v^O_1, v^O_2, \ldots\}$. $E^B_O$,
$F^B_O$, $V^B_I$, $E^B_I$, and $F^B_I$ are similarly defined.
Fig. 5 shows a 2D analogy for the inner and outer β-shapes
for a protein. From this figure, we can make a simple
observation as follows: for each edge of $B_O$, there is zero
or one depression on $B_I$ of the protein. For example, in
Fig. 5(b), an edge $e^O_1$ of $B_O$ corresponds to a depressed region
formed by edges $e^I_1$, $e^I_2$, $e^I_3$ and $e^I_4$. When an edge of $B_O$
corresponds to a depression, the depression can be regarded as
a pocket. Obviously, no pocket is defined when an edge on $B_O$
coincides with one of the inner β-shape, shown as $e^I_5$ and $e^O_2$ in the
figure.
Fig. 5. Inner and outer β-shapes in 2D: (a) the solid line and dotted line are the inner
and outer molecular surfaces, respectively, and (b) the corresponding inner and
outer β-shapes.
Fig. 6. The example of pocket primitives: (a) two faces (dashed line) of the outer
mesh on the inner mesh and (b) the corresponding pocket primitives.
A similar observation can be made for the 3D counterpart. For
a face $f^O \in F^{B_O}$ of a 3D protein, there is a corresponding
depressed region on $B_I$ unless $f^O$ coincides with a face
$f^I \in F^{B_I}$.
However, a pocket on $B_I$ may or may not correspond to a single
face $f^O \in F^{B_O}$. A large pocket, for example, may correspond to
two faces of $B_O$ in $F^{B_O}$ when the depressed regions from the two
faces do not have a clear boundary between them. In
such a case, the depressed region on $B_I$ corresponding to one face
$f^O \in F^{B_O}$ cannot be defined to form a complete pocket. Instead,
both depressed regions may together define a single pocket.
Hence, we first introduce the concept of a pocket primitive $\phi$ as a
unit depressed region on $B_I$ corresponding to each face
$f^O \in F^{B_O}$.
A face $f^O_i \in F^{B_O}$ has three associated vertices $v^O_{i_1}$,
$v^O_{i_2}$, and $v^O_{i_3}$ in $V^{B_O}$, and there are always three
vertices $v^I_{i_1}$, $v^I_{i_2}$, and $v^I_{i_3}$ in $V^{B_I}$
which coincide with $v^O_{i_1}$, $v^O_{i_2}$, and $v^O_{i_3}$, respectively.
Let $g_{(i_1,i_2)}$ be the
geodesic, i.e. the shortest path, on the inner β-shape $B_I$ between
$v^I_{i_1}$ and $v^I_{i_2}$. The path from a vertex follows an incident edge and
the distance between two neighboring vertices is defined as the
edge length between the two vertices. Hence, the distance
between two arbitrary vertices is the sum of the edge lengths
along the shortest path connecting the two vertices. We call the
geodesic $g_{(i_1,i_2)}$ a ridge between two pocket primitives.
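The edge-restricted geodesic described above can be sketched with a standard Dijkstra search over the mesh edges. This is a minimal illustration, not the authors' implementation: the function name `edge_geodesic`, the dict-based vertex coordinates, and the edge-pair list are hypothetical, and a binary heap is used here in place of a Fibonacci heap.

```python
import heapq
import math


def edge_geodesic(coords, edges, source, target):
    """Shortest path along mesh edges from source to target (Dijkstra).

    coords: dict mapping an integer vertex id to an (x, y, z) tuple.
    edges:  iterable of (u, v) vertex-id pairs of the mesh.
    Returns (list of vertex ids on the path, path length),
    or (None, inf) if target cannot be reached.
    """
    # Build an adjacency list weighted by Euclidean edge length.
    adjacency = {v: [] for v in coords}
    for u, v in edges:
        w = math.dist(coords[u], coords[v])
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))

    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        if u == target:
            break
        for v, w in adjacency[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))

    if target not in dist:
        return None, math.inf

    # Walk the predecessor links back from target to source.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, dist[target]
```

Calling the function once for each pair of corner vertices of an outer face would give the three ridges of that face. With a binary heap the search costs O((|E| + |V|) log |V|) rather than the Fibonacci-heap bound.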
The geometric meaning of $g_{(i_1,i_2)}$ is as follows. While the
extreme vertices $v^I_{i_1}$ and $v^I_{i_2}$ are on both $B_O$ and $B_I$, the other
vertices on the path define depressions on $B_I$ from the
corresponding face of $B_O$. Hence, the geodesic $g_{(i_1,i_2)}$ can be
interpreted as the most upward wall separating two relatively
deep depressions on $B_I$. The other geodesics $g_{(i_2,i_3)}$ and
$g_{(i_3,i_1)}$ can
be similarly interpreted.
Let $\tilde{F}^I_i$ be the set of faces $f^I_h$, where $f^I_h \in F^{B_I}$ is
interior to the three geodesics $g_{(i_1,i_2)}$, $g_{(i_2,i_3)}$, and
$g_{(i_3,i_1)}$. Then, $\tilde{F}^I_i$ forms a
topologically triangular depression on $B_I$ from the
corresponding face of $B_O$. This depression is called a pocket
primitive $\phi_i$ corresponding to a face $f^O_i \in F^{B_O}$ and is also
represented by another graph
$\phi_i = (\tilde{V}^I_i, \tilde{E}^I_i, \tilde{F}^I_i)$. Hence, the
following properties should hold.
Property 1. $|F^{B_O}| \le |F^{B_I}|$, where $|X|$ is the cardinality of set X.
Property 2. $\tilde{F}^I_i \subset F^{B_I}$.
Algorithm Extraction of pocket primitives
Input: $B_I$, $F^{B_O}$
Output: the set of pocket primitives Q
Step 1. For each $f^O_i \in F^{B_O}$:
Step 1.1. Identify the three vertices $v_{i_1}$, $v_{i_2}$, and $v_{i_3}$ in $V^{B_I}$ corresponding to the vertices of $f^O_i$.
Step 1.2. Find the geodesics $g_{(i_1,i_2)}$, $g_{(i_2,i_3)}$, and $g_{(i_3,i_1)}$ corresponding to $v_{i_1}$, $v_{i_2}$, $v_{i_3}$.
Step 1.3. Find $\phi_i = (\tilde{V}^I_i, \tilde{E}^I_i, \tilde{F}^I_i)$ surrounded by $g_{(i_1,i_2)}$, $g_{(i_2,i_3)}$, and $g_{(i_3,i_1)}$.
Step 1.4. Add $\phi_i$ to Q.
End-for
Step 2. Terminate.
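Step 1.3 can be sketched as a flood fill over the faces of the inner mesh that never crosses a ridge edge. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `faces_inside_ridges`, the dict-based triangle layout, and in particular the `seed_face` argument (one face assumed to be known to lie inside the three geodesics) are hypothetical choices made for illustration.

```python
from collections import deque


def faces_inside_ridges(faces, ridge_edges, seed_face):
    """Collect the faces reachable from seed_face without crossing a ridge.

    faces:       dict mapping a face id to a tuple of three vertex ids.
    ridge_edges: iterable of (u, v) vertex pairs forming the geodesics.
    seed_face:   id of a face assumed to lie inside the enclosed region.
    Returns the set of face ids in the enclosed region.
    """
    def key(u, v):
        # Canonical key for an undirected edge.
        return (u, v) if u <= v else (v, u)

    blocked = {key(u, v) for u, v in ridge_edges}

    # Map each undirected edge to all faces that use it. A list is used
    # because a non-manifold mesh may have more than two faces per edge.
    edge_to_faces = {}
    for fid, (a, b, c) in faces.items():
        for u, v in ((a, b), (b, c), (c, a)):
            edge_to_faces.setdefault(key(u, v), []).append(fid)

    region = {seed_face}
    queue = deque([seed_face])
    while queue:
        fid = queue.popleft()
        a, b, c = faces[fid]
        for u, v in ((a, b), (b, c), (c, a)):
            e = key(u, v)
            if e in blocked:
                continue  # never step over a ridge edge
            for other in edge_to_faces[e]:
                if other not in region:
                    region.add(other)
                    queue.append(other)
    return region
```

How a seed face is chosen is left open here. When the three geodesics enclose no face at all, there is no valid seed and the sketch does not apply.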
In Fig. 6(a), $B_I$ is shown as a mesh of solid lines and $B_O$ is
shown as two large triangles $f^O_A$ and $f^O_B$ bounded by broken
lines. Four geodesics on $B_I$ corresponding to the four edges
(shown as broken lines) on $B_O$ are shown in Fig. 6(b) as thick
lines. Inside the four paths, the two corresponding pocket
primitives $\phi_A$ and $\phi_B$ are shown.
When a face $f^O_i \in F^{B_O}$ coincides with a face $f^I_i \in F^{B_I}$,
$\tilde{F}^I_i$ consists of the single face $f^I_i \in F^{B_I}$ and it is not
considered a pocket primitive. Note that $\tilde{F}^I_i$ can even be a null
set in the case when the three geodesics degenerate to three curve segments
without containing any face inside. In such a case, no pocket
primitive corresponds to the face. From this, we can draw a few
properties.
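Property 3. If $\phi_i$ and $\phi_j$ are not topological neighbors, then
$\tilde{F}^I_i \cap \tilde{F}^I_j = \emptyset$, where $i \ne j$.
Property 4. If $\phi_i$ and $\phi_j$ are topological neighbors,
$\tilde{E}^I_i \cap \tilde{E}^I_j = g_{(ij)} \ne \emptyset$. The geodesic
$g_{(ij)}$ is called a ridge between $\phi_i$ and $\phi_j$.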
Next we present an algorithm for extracting pocket
primitives from $B_O$.
Suppose that a β-shape is stored in an appropriate data
structure supporting non-manifold models. In the CAD
community, such data structures have been studied quite
extensively [11,12,39–42,60,61]. In particular, we refer
readers to [11,39] for a thorough explanation of
non-manifold data structures.
The loop in Step 1 iterates $|F^{B_O}|$ times in the worst case.
While Step 1.1 takes O(1), Step 1.2 takes
$O(|E^{B_I}| + |V^{B_I}| \log |V^{B_I}|)$ if the Dijkstra algorithm based on a
Fibonacci heap is used [15]. Step 1.3 requires $O(|\tilde{F}^I_i|)$ for each
face of $F^{B_O}$.
It can be shown that the worst-case time complexity for the
whole algorithm is bounded by either
$O(|E^{B_I}| + |V^{B_I}| \log |V^{B_I}|)$ when $|F^{B_O}| = O(1)$, or
$O(|F^{B_I}|)$ when $|F^{B_O}| = O(|F^{B_I}|)$.
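As a reading aid, the per-step costs quoted above can be summed over the loop. The expression below only restates those costs, assuming that Step 1.2 runs three shortest-path searches per outer face; it is not an additional result of the paper.

```latex
\[
T \;=\; \sum_{i=1}^{|F^{B_O}|}
  \Bigl[\, O(1)
  \;+\; 3\,O\bigl(|E^{B_I}| + |V^{B_I}|\log|V^{B_I}|\bigr)
  \;+\; O\bigl(|\tilde{F}^I_i|\bigr) \Bigr]
\]
```

With $|F^{B_O}| = O(1)$ the sum has a constant number of terms, so the shortest-path term yields the first bound quoted above, provided the face-collection term does not dominate.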
7. Merging pocket primitives to form pockets
Given pocket primitives, we consider that one or more
neighboring pocket primitives may form a pocket. Hence,
we check if two neighboring pocket primitives can be
merged together to form a more meaningful depression
based on an appropriate criterion. Recall that a ridge $g_{(i,j)}$
exists between two incident pocket primitives $\phi_i$ and $\phi_j$. It
is also an edge chain on $B_I$ corresponding to the geodesic
between two extreme vertices of a pocket primitive. Hence,
a ridge plays the role of a boundary between two incident
pocket primitives.
Let a mountain be an edge chain on $B_I$ separating two
pockets. If a ridge is sufficiently high, it can be regarded as a
mountain. Note that a pocket primitive always has three ridges,
whereas a pocket is surrounded by three or more mountains. Therefore,
the boundary of a pocket primitive may or may not be the
boundary of a pocket.
Suppose that a path $g_k$, which is indeed a ridge,
exists between two incident pocket primitives $\phi_i$ and $\phi_j$
corresponding to an edge $e^O_k$ of $E^{B_O}$. Note that there
always exists a geodesic on $B_I$ for an edge of $B_O$.
Then, we can define a certain measure to determine the
discrepancy between the two chains, $e^O_k$ and $g_k$. Depending on
the measure and its prescribed threshold value, two pocket
primitives sharing the chain can be considered to form one
larger pocket.
Even though there are various ways to define such a
measure for the merge, we use the concept of average
distance between two chains. Let $d_k$ be the average distance
between $e^O_k$ and $g_k$. If $d_k$ is larger than a prescribed value, we
merge the two neighboring pocket primitives sharing the chain.
Otherwise, we regard $g_k$ as a mountain chain. As the
threshold value, in this paper, we have chosen the average of
all d values. Finally, the atoms for the vertices in the merged pocket
primitives define the pocket $p_k$. Note that
various other measures can be used for the merge of
pocket primitives, and these measures can be easily computed
once they are well-defined. Examples are the internal angles at
the edges of a ridge and the intrinsic shape of a pocket
primitive.
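The merge rule above can be sketched with a union-find structure over the pocket primitives. This is a minimal illustration under stated assumptions, not the authors' implementation: the text does not spell out how the average distance between the two chains is evaluated, so `average_ridge_distance` assumes it is the mean distance from the ridge vertices to the straight outer edge. All function names and the `(i, j, d)` ridge tuples are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment ab (3D tuples)."""
    ab = [b[i] - a[i] for i in range(3)]
    ap = [p[i] - a[i] for i in range(3)]
    denom = sum(c * c for c in ab)
    if denom == 0.0:
        t = 0.0  # degenerate segment: a and b coincide
    else:
        t = sum(ap[i] * ab[i] for i in range(3)) / denom
        t = max(0.0, min(1.0, t))  # clamp the projection to the segment
    closest = [a[i] + t * ab[i] for i in range(3)]
    return math.dist(p, closest)


def average_ridge_distance(ridge_points, a, b):
    """Assumed measure: mean distance of ridge vertices to outer edge ab."""
    total = sum(point_segment_distance(p, a, b) for p in ridge_points)
    return total / len(ridge_points)


def merge_primitives(num_primitives, ridges):
    """Merge primitives across ridges whose distance exceeds the mean.

    ridges: list of (i, j, d), where i and j index the two primitives
            sharing a ridge and d is that ridge's average distance.
    Returns the pockets as a sorted list of lists of primitive indices.
    """
    parent = list(range(num_primitives))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Threshold chosen as the average of all ridge distances.
    threshold = sum(d for _, _, d in ridges) / len(ridges) if ridges else 0.0
    for i, j, d in ridges:
        if d > threshold:  # deep ridge: not a mountain, so merge
            parent[find(i)] = find(j)

    groups = {}
    for x in range(num_primitives):
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())
```

Ridges at or below the threshold are left alone and act as mountains, so the groups that come out correspond to the pockets.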
Once pockets are recognized, it is often necessary to
evaluate their significance, since some recognized pockets
might not be regarded as significant. There are different
criteria for such an evaluation, for example the average or
maximum depth of a pocket from its entrance, or the volume of the
pocket from the entrance. Note that these can also be easily
computed.
Fig. 7. Atomic structure of the protein 1BH8 [67] downloaded from the PDB: (a) chains A (darker atoms) and B (lighter atoms) and (b) atoms in chain A.
We note that $B_O$ can also be defined by a probe p
with a finite radius $r_p < \infty$. Then, the resulting pockets from such a
$B_O$ are of a finer grain than those from $B_\infty$. Often, the pockets
extracted from such a finer-grained $B_O$ can be more meaningful.
We also point out that the proposed β-shape can be
easily used to accommodate the concept proposed by Liang
et al. [48] more effectively.
Fig. 8. Group A of the protein 1BH8 and its use for the pocket extraction: (a) the convex hull, (b) the molecular surface, (c) the β-shape, (d) the pocket primitives, (e)
the boundary of pocket after merges, and (f) the largest pocket on the molecular surface.
8. Experiments and discussions
Shown in Fig. 7(a) is a dimer, a protein consisting of two
separate groups of atoms, a transcription regulation complex,
downloaded from PDB [66,67] with the entry code 1bh8. The
darker and lighter atoms in Fig. 7(a) denote groups A and B,
respectively. Fig. 7(b) illustrates the atoms in group A only.
From the figure, it can be easily seen that group B binds with
group A in a large depressed region of group A.
Fig. 9. β-shape and the largest pocket (blue surface) on the molecular surface for 1BH8 group A according to the probe size of the outer β-shape: (a) 60 Å, (b) 40 Å, (c)
30 Å, and (d) 20 Å. The probe radius of all inner β-shapes is 8 Å.
Fig. 10. Example of pocket for a protein used in CAPRI: (a) A and C chains of Lactobacillus HPr kinase, (b) Bacillus subtilis HPr binds to pocket of Lactobacillus
HPr, (c) the pocket on the β-shape, and (d) the pocket on the molecular surface.
Fig. 8 visualizes the process of pocket recognition for
the model in Fig. 7(b). After computing the Euclidean
Voronoi diagram of the protein, $B_O$ of the protein is computed
as shown in Fig. 8(a). Shown in Fig. 8(b) is the molecular
surface of the protein after blending using a predefined probe,
and Fig. 8(c) shows the β-shape. Then, the pocket primitive
corresponding to each face of $B_O$ is shown in Fig. 8(d). In this
example, the outer β-shape is $B_O = B_\infty$, and the inner β-shape $B_I$ is
obtained by blending the protein with a probe of radius 8 Å. In
this figure, the yellow faces on $B_I$ denote faces coinciding
with faces on $B_O$; therefore, the yellow faces do not
contribute to any pocket primitive.
After pocket primitives are properly extracted, we evaluate
the ridges around all pocket primitives and merge the
appropriate pocket primitive pairs if necessary to form
pockets. Fig. 8(e) and (f) show the β-shape $B_I$ and the
corresponding blended molecular surface of the largest pocket
recognized on the protein.
While Fig. 8 shows the process of pocket recognition where
$B_O = B_\infty$, Fig. 9 illustrates the same process for four different
$B_O$'s with different sizes of probes. Fig. 9(a) is the case when $B_O$
is produced from a probe with a radius of 60 Å rather than $\infty$.
The first and second columns show the β-shape and molecular
surface in the same orientation. The third and fourth columns
show the same models from a different view. The pocket shown
in blue is the largest one among the recognized pockets.
Fig. 9(b)–(d) are the cases where the radii of the probe for $B_O$ are
40, 30, and 20 Å, respectively. When the radius of the probe is
50 Å, the pocket is identical to the case of 40 Å.
From these figures, one can easily observe that a different
$B_O$, while the other conditions remain identical, produces a
different set of recognized pockets. For example, compare the
pockets in Fig. 9 with the one in Fig. 8(f).
From this experiment, it can be observed that the number
of pockets increases as the radius of probe decreases. In
addition, there is a very strong tendency that increasing the
probe radius causes an increase of the area of a pocket on the
molecular surface. Fig. 9(c) and (d) show that even small
changes in probe size can cause a drastic change in pocket
area. In Fig. 9(c) and (d), the largest pockets are even located
at completely different places on the protein.
Fig. 10(a) shows a trimer, a receptor protein consisting of
three disconnected groups of atoms, called Lactobacillus HPr
kinase. This protein was downloaded from CAPRI (Critical
Assessment of PRediction of Interactions) [64]. The dark
portion in Fig. 10(b) is B. subtilis HPr which plays the role of
a ligand to bind with the receptor. Fig. 10(c) and (d) illustrate
the pocket recognized by the proposed algorithm.
9. Conclusions
The recognition of docking sites, called pockets, on the
molecular surface of a protein is one of the most important
starting points for structure-based rational drug design. In this
paper, we have provided the definition of pockets on a protein
from a geometric point of view. We have also presented an
algorithm to automatically recognize pockets on the surface of
proteins.
In the proposed algorithm, we first compute a Euclidean
Voronoi diagram of atoms and construct β-shapes corresponding
to given probes from the Voronoi diagram. We compute
two β-shapes: one for the inner and the other for the outer
definition of surface atom sets. Then, we compute pocket primitives on
the inner β-shape corresponding to each face of the outer
β-shape. After extracting pocket primitives, we evaluate the
quality of boundaries between neighboring pocket primitives to
test whether two neighbors should be merged into a single
pocket. Eventually, a few pockets remain on the surface of a
receptor where each pocket corresponds to an appropriately
depressed region. The algorithm in this paper has been fully
implemented using Microsoft C++ on Windows XP and has
been tested on various protein models.
Opening a new research area for the CAD community, this
research creates more challenges than solutions in the process
of rational drug design. For example, a better definition of a
pocket and more appropriate criteria for the merge of pocket
primitives, using other meaningful measures such as the
volume, depth, or morphological shape of pocket
primitives, are left for future research.
Acknowledgements
This research was supported by Creative Research
Initiatives from the Ministry of Science and Technology,
Korea. The authors thank Dr Jong Bhak for the helpful discussions.
References
[1] Agarwal PK, Edelsbrunner H, Harer NJ, Wang NY. Extreme elevation on
a 2-manifold. Proceedings of the twentieth annual symposium on
Computational geometry. New York, USA: Brooklyn; 2004. p. 357–65.
[2] Akkiraju N, Edelsbrunner H, Fu P, Qian J. Viewing geometric protein
structures from inside a CAVE. IEEE Comput Graph Appl 1996;16:
58–61.
[3] Alrad P, Wodak SJ. Detection of cavities in a set of interpenetrating
spheres. J Comput Chem 1991;12(8):918–22.
[4] Angelov B, Sadoc J-F, Jullien R, Soyer A, Mornon J-P, Chomilier J.
Nonatomic solvent-driven Voronoi tessellation of proteins: an open tool
to analyze protein folds. Proteins Struct Funct Genet 2002;49:446–56.
[5] Aurenhammer F. Power diagrams: properties, algorithms and appli-
cations. SIAM J Comput 1987;16:78–96.
[6] Bajaj CL, Pascucci V, Shamir A, Holt RJ, Netravali AN. Dynamic
maintenance and visualization of molecular surfaces. Discrete Appl Math
2003;127:23–51.
[7] Bernal JD, Finney JL. Random close-packed hard-sphere model II. Geometry
of random packing of hard spheres. Discuss Faraday Soc 1967;43:62–9.
[8] Brady Jr GP, Stouten PFW. Fast prediction and visualization of protein
binding pockets with PASS. J Comput Aided Mol Des 2000;14:383–401.
[9] Chen C, Chen F, Feng Y. Blending quadric surfaces with piecewise
algebraic surfaces. Graph Mod 2001;63:212–27.
[10] Cho Y, Kim D, Kim D-S. Topology representation for euclidean voronoi
diagram of spheres in 3D. Digital engineering workshop and fifth Japan–
Korea CAD/CAM workshop. Tokyo, Japan: RCAST, University of
Tokyo; 2005. p. 121–6.
[11] Choi Y. Vertex-based boundary representation of non-manifold geo-
metric models. PhD Thesis, Carnegie-Mellon University, USA; 1989.
[12] Choi G-H, Han S-H, Lee H-C. Optional storage of non-manifold information
for solid models. Trans Soc CAD CAM Eng 1997;2(3):150–60.
[13] Connolly ML. Solvent-accessible surfaces of proteins and nucleic acids.
Science 1983;221:709–13.
[14] Connolly ML. Analytical molecular surface calculation. J Appl Crystal-
logr 1983;16:548–58.
[15] Cormen TH, Leiserson CE, Rivest RL, Stein C. Introduction to
algorithms. 2nd ed. Cambridge: MIT press; 2001.
[16] David C. Voronoi polyhedra as structure probes in large molecular
systems. Biopolymers 1988;27:339–44.
[17] Delaney JS. Finding and filling protein cavities using cellular logic
operations. J Mol Graph 1992;10:174–7.
[18] Edelsbrunner H, Facello M, Liang J. On the definition and the construction of
pockets in macromolecules. Discrete Appl Math 1998;88:83–102.
[19] Edelsbrunner H, Mucke EP. Three-dimensional alpha shapes. ACM Trans
Graph 1994;13(1):43–72.
[20] Finney J. Volume occupation, environment and accessibility in proteins.
The problem of the protein surface. J Mol Biol 1975;96:721–32.
[21] Fischer W, Koch E. Geometrical packing analysis of molecular
compounds. Z fur Kristallographie 1979;150:245–60.
[22] Gellatly BJ, Finney JL. Calculation of protein volumes: an alternative to
the voronoi procedure. J Mol Biol 1982;161(2):305–22.
[23] Gerstein M, Tsai J, Levitt M. The volume of atoms on the protein surface:
calculated from simulation, using Voronoi polyhedra. J Mol Biol 1995;
249:955–66.
[24] Goede A, Preissner R, Frommel C. Voronoi cell: new method for allocation of
space among atoms: elimination of avoidable errors in calculation of atomic
volume and density. J Comput Chem 1997;18:1113–23.
[25] Halperin D, Overmars MH. Spheres, molecules, and hidden surface removal.
Proceedings of 10th ACM symposium on computational geometry; 1994. p.
113–22.
[26] Hartmann E. Parametric Gn blending of curves and surfaces. Vis Comput
2001;17:1–13.
[27] Heifets A, Eisenstein M. Effect of local shape modifications of molecular
surfaces on rigid-body protein–protein docking. Protein Eng 2003;16(3):
179–85.
[28] Ho CMW, Marshall GR. Cavity search: an algorithm for the isolation and
display of cavity-like binding regions. J Comput Aided Mol Des 1990;4:
337–54.
[29] Kim D-S. Polygon offsetting using a voronoi diagram and two stacks.
Comput Aided Des 1998;30(14):1069–76.
[30] Kim D-S, Kim D, Sugihara K. Voronoi diagram of a circle set from
Voronoi diagram of a point set: I. Topology. Comput Aided Geom Des
2001;18:541–62.
[31] Kim D-S, Kim D, Sugihara K. Voronoi diagram of a circle set from voronoi
diagram of a point set: II. Geometry. Comput Aided Geom Des 2001;18:
563–85.
[32] Kim D-S, Cho Y, Kim D, Kim S, Bhak J. Euclidean voronoi diagram of
3D spheres and applications to protein structure analysis International
symposium on voronoi diagrams in science and engineering. Tokyo,
Japan: University of Tokyo; 2004 p. 13–5.
[33] Kim D-S, Cho Y, Kim D. Edge-tracing algorithm for Euclidean Voronoi
diagram of 3D spheres. Proceedings of 16th Canadian conference on
computational geometry; 2004. p. 176–9.
[34] Kim D-S, Cho Y, Kim D. Euclidean voronoi diagram of 3D balls and its
computation via tracing edges. Comput Aided Des, 2005; 13:1412–24.
[35] Kim D-S, Cho C-H, Ryu J-H, Kim D. Three dimensional beta shapes.
Comput Aided Des, submitted.
[36] Kleywegt GJ, Jones TA. Detection, delineation, measurement and display of
cavities in macromolecular structures. Acta Crystallogr Sect D 1994;D50:
178–85.
[37] Kuntz ID. Structure-based strategies for drug design and discovery.
Science 1992;257:1078–82.
[38] Laskowski RA. SURFNET: a program for visualizing molecular surfaces,
cavities, and intermolecular interactions. J Mol Graph 1995;13:323–30.
[39] Lee K. Principles of CAD/CAM/CAE systems. Reading, MA: Addison
Wesley; 1999.
[40] Lee SH, Lee K. Compact boundary representation and generalized euler
operators for non-manifold geometric modeling. Trans Soc CAD CAM
Eng 1996;1(1):1–19.
[41] Lee SH, Lee K. Partial entity structure: a compact boundary
representation for non-manifold geometric modeling. ASME J Comput
Inf Sci Eng 2001;1(4):356–65.
[42] Lee SH, Lee K. Partial entity structure: a compact non-manifold boundary
representation based on partial topological entities. Proceedings of the
sixth ACM symposium on solid modeling and applications June 6–8.
Michigan, USA: Sheraton Inn, Ann Arbor; 2001.
[43] Lee B, Richards FM. The interpretation of protein structures: estimation
of static accessibility. J Mol Biol 1971;55:379–400.
[44] Levitt DG, Banaszak LJ. POCKET: a computer graphics method for
identifying and displaying protein cavities and their surrounding amino
acids. J Mol Graph 1992;10:229–34.
[45] Liang J, Dill KA. Are proteins well-packed? Biophys J 2001;81:751–66.
[46] Liang J, Edelsbrunner H, Fu P, Sudharkar PV, Subramaniam S. Analytic
shape computation of macromolecules I: molecular area and volume
through alpha shape. Proteins Struct Funct Genet 1998;33:1–17.
[47] Liang J, Edelsbrunner H, Fu P, Sudharkar PV, Subramaniam S. Analytic
shape computation of macromolecules II: inaccessible cavities in
proteins. Proteins Struct Funct Genet 1998;33:18–29.
[48] Liang J, Edelsbrunner H, Woodward C. Anatomy of protein pockets and
cavities: measurement of binding site geometry and implications for
ligand design. Protein Sci 1998;7:1884–97.
[49] Masuya M, Doi J. Detection and geometric modeling of molecular
surfaces and cavities using digital mathematical morphological oper-
ations. J Mol Graph 1995;13:331–6.
[50] Montoro JCG, Abascal JLF. The voronoi polyhedra as tools for structure
determination in simple disordered systems. J Phys Chem 1993;97(16):
4211–5.
[51] Noggle JH. Physical chemistry. 3rd ed.: Freedom Academy; 1996.
[52] Parsons D, Canny J. Geometric problems in molecular biology and
robotics. Second international conference on intelligent systems for
molecular biology, Palo Alto, CA; 1994. p. 322–30.
[53] Peters KP, Fauck J, Frommel C. The automatic search for ligand binding
sites in protein of known three-dimensional structure using only
geometric criteria. J Mol Biol 1996;256:201–13.
[54] Richards FM. The interpretation of protein structures: total volume, group
volume distributions and packing density. J Mol Biol 1974;82:1–14.
[55] Seidl T, Kriegel H-P. Solvent accessible surface representation in a
database system for protein docking. Third international conference on
intelligent systems for molecular biology, vol. 3, Cambridge, UK; 1995. p.
350–8.
[56] Shih J-P, Sheu S-Y, Mou C-Y. A voronoi polyhedra analysis of structures
of liquid water. J Chem Phys 1994;100:2202–12.
[57] Shoichet BK, Kuntz ID. Protein docking and complementarity. J Mol Biol
1991;221:327–46.
[58] Voloshin VP, Beaufils S, Medvedev NN. Void space analysis of the
structure of liquids. J Mol Liq 2002;96–97:101–12.
[59] Voorintholt R, Kosters MT, Vegter G, Vriend G, Hol WGJ. A very fast
program for visualizing protein surfaces, channels and cavities. J Mol
Graph 1989;7:243–5.
[60] Weiler K. The radial edge structure: a topological representation for non-
manifold geometric boundary modeling. In: Wonzy MJ,
McLaughlin HW, Encarnacao JL, editors. Geometric modeling for
CAD applications. New York: North Holland/Elsevier; 1988. p. 3–36.
[61] Weiler K. Boundary graph operators for nonmanifold geometric modeling
topology representations. In: Wonzy MJ, McLaughlin HW,
Encarnacao JL, editors. Geometric modeling for CAD applications.
New York: North Holland/Elsevier; 1988. p. 37–66.
[62] Zimmer R, Wohler M, Thiele R. New scoring schemes for protein fold
recognition based on voronoi contacts. Bioinformatics 1998;14:295–308.
[63] Cambridge crystallographic data centre, 2005; http://www.ccdc.cam.ac.uk/;
2005.
[64] Critical assessment of PRediction of interactions (CAPRI), 2005; home-
page. http://capri.ebi.ac.uk/.
[65] Computational geometry algorithms library (CGAL), 2005; homepage.
http://www.cgal.org/.
[66] PDB Sum. 2005; http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/.
[67] RCSB protein data bank. 2005; http://www.rcsb.org/pdb/.
Deok-Soo Kim Deok-Soo Kim is a professor in
Department of Industrial Engineering, Hanyang
University, Korea. Before he joined the university
in 1995, he worked at Applicon, USA, and
Samsung Advanced Institute of Technology,
Korea. He received a B.S. from Hanyang Univer-
sity, Korea, an M.S. from the New Jersey Institute
of Technology, USA, and a Ph.D. from the
University of Michigan, USA, in 1982, 1985 and
1990, respectively. His current research interests
mainly lie in the theory and applications of
Voronoi diagram while he has been interested in various geometric problems.
He is currently the director of Voronoi Diagram Research Center supported by
the Ministry of Science and Technology, Korea.
Cheol-Hyung Cho Cheol-Hyung Cho is a senior
researcher in Voronoi Diagram Research Center at
Hanyang University, Seoul, Korea. He received his
B.S., M.S. and Ph.D. degrees from Hanyang
University in 1996, 1998 and 2005, respectively.
His main research interests lie in the area of
computer graphics, geometric algorithms and their
applications in molecular biology.
Donguk Kim Donguk Kim is a senior researcher in
Voronoi Diagram Research Center at Hanyang
University, Seoul, Korea. He received his B.S.,
M.S. and Ph.D. degrees from Hanyang University
in 1999, 2001 and 2004, respectively. His research
interests include computational geometry, geometric
modeling and their applications in
molecular biology.
Youngsong Cho Youngsong Cho is a senior
researcher in Voronoi Diagram Research Center at
Hanyang University, Seoul, Korea. He received his
B.S., M.S. and Ph.D. degrees from Hanyang
University in 1995, 1997 and 2003, respectively.
His research interests include computational geometry,
geometric modeling and their applications in
molecular biology.