Is optimal solution of every NP-complete or NP-hard problem determined from its characteristic for...


BioSystems xxx (2004) xxx–xxx

Is optimal solution of every NP-complete or NP-hard problem determined from its characteristic for DNA-based computing☆

Minyi Guo a,b,∗, Weng-Long Chang c, Machael Ho c, Jian Lu b, Jiannong Cao d

a Department of Computer Software, The University of Aizu, Aizu-Wakamatsu City, Fukushima 965-8580, Japan
b State Key Laboratory for Novel Software Technology, Nanjing University, Nanjing, PR China
c Department of Information Management, Southern Taiwan University of Technology, Tainan County, Taiwan
d Department of Computing, Hong Kong Polytechnic University, Hong Kong

Received 1 July 2004; received in revised form 15 October 2004; accepted 16 October 2004

☆ This work is supported in part by China National 973 Program under Grant 2002CB312002 and China NSFC (60273034, 60233010).
∗ Corresponding author. Tel.: +81 2423 72557; fax: +81 2423 72744. E-mail address: [email protected] (M. Guo).

Abstract

Cook’s Theorem [Cormen, T.H., Leiserson, C.E., Rivest, R.L., 2001. Introduction to Algorithms, second ed., The MIT Press; Garey, M.R., Johnson, D.S., 1979. Computers and Intractability, Freeman, San Francisco, CA] is that if an algorithm is developed for one NP-complete or NP-hard problem, then other problems can be solved by means of reduction to that problem. Cook’s Theorem has been demonstrated to be correct on a general digital electronic computer. In this paper, we first propose a DNA algorithm for solving the vertex-cover problem. Then, we demonstrate that if the size of a reduced NP-complete or NP-hard problem is equal to or less than that of the vertex-cover problem, then the proposed algorithm can be directly used for solving the reduced NP-complete or NP-hard problem, and Cook’s Theorem is correct on DNA-based computing. Otherwise, a new DNA algorithm for the optimal solution of a reduced NP-complete or NP-hard problem should be developed from the characteristic of NP-complete or NP-hard problems.

© 2004 Published by Elsevier Ireland Ltd.

Keywords: Molecular computing; DNA-based computing; NP-complete problems; NP-hard problems; Vertex-cover problem; Cook’s Theorem.

0303-2647/$ – see front matter © 2004 Published by Elsevier Ireland Ltd.
doi:10.1016/j.biosystems.2004.10.003

1. Introduction

Adleman wrote the first paper demonstrating that deoxyribonucleic acid (DNA) strands could be applied to figure out solutions to an instance of the NP-complete Hamiltonian path problem (HPP) (Adleman, 1994). Lipton wrote the second paper, which showed that the Adleman techniques could also be used to solve the NP-complete satisfiability (SAT) problem (the first NP-complete problem) (Lipton, 1995). Adleman and co-workers proposed sticker for enhancing the Adleman–Lipton model (Roweis et al., 1999).

In this paper, we use sticker to construct the solution space of DNA library sequences for the vertex-cover problem. We also apply DNA operations in the Adleman–Lipton model to develop a DNA algorithm. The main result is that the vertex-cover problem is solved with biological operations in the Adleman–Lipton model from the solution space of stickers. Furthermore, if the size of a reduced NP-complete or NP-hard problem is equal to or less than that of the vertex-cover problem, then the proposed algorithm can be directly used for solving the reduced NP-complete or NP-hard problem.

The rest of this paper is organized as follows. In Section 2, the Adleman–Lipton model is introduced and compared with other models. In Section 3, the first DNA algorithm is proposed for solving the vertex-cover problem from the solution space of sticker in the Adleman–Lipton model. In Section 4, the experimental result of simulated DNA computing is given. Conclusions are drawn in Section 5.

2. DNA model of computation

In Section 2.1, a summary of DNA structure and the Adleman–Lipton model is given in detail. In Section 2.2, the Adleman–Lipton model is compared with other models.

2.1. The Adleman–Lipton model

DNA is the molecule that plays the main role in DNA based computing (Paun et al., 1998). In the biochemical world of large and small molecules, polymers and monomers, DNA is a polymer, which is strung together from monomers called deoxyribonucleotides. Each deoxyribonucleotide contains three components: a sugar, a phosphate group and a nitrogenous base. The sugar has five carbon atoms, and for the sake of reference there is a fixed numbering of them. Because the base also has carbons, to avoid confusion the carbons of the sugar are numbered from 1′ to 5′ (rather than from 1 to 5). The phosphate group is attached to the 5′ carbon, and the base is attached to the 1′ carbon. Within the sugar structure there is a hydroxyl group attached to the 3′ carbon.

Distinct nucleotides are distinguished only by their bases, which come in two sorts: purines and pyrimidines (Sinden, 1994; Paun et al., 1998). Purines include adenine and guanine, abbreviated A and G. Pyrimidines include cytosine and thymine, abbreviated C and T. Because nucleotides are distinguished only by their bases, they are simply represented as A, G, C, or T nucleotides, depending upon the sort of base that they have. The structure of a nucleotide is illustrated (in a very simplified way) in Fig. 1. In Fig. 1, B is one of the four possible bases (A, G, C, or T), P is the phosphate group and the rest (the “stick”) is the sugar base (with its carbons enumerated 1′ through 5′).

Fig. 1. A schematic representation of a nucleotide.

Nucleotides can link together in two different ways (Sinden, 1994; Boneh et al., 1996; Paun et al., 1998). In the first, the 5′-phosphate group of one nucleotide is joined with the 3′-hydroxyl group of the other, forming a phosphodiester bond. The resulting molecule has the 5′-phosphate group of one nucleotide, denoted as the 5′ end, and the 3′-OH group of the other nucleotide, denoted as the 3′ end, available for bonding. This gives the molecule its directionality, and we can talk about the direction of 5′ end to 3′ end or 3′ end to 5′ end. In the second, the base of one nucleotide interacts with the base of the other to form a hydrogen bond. This bonding is subject to the following restriction on the base pairing: A and T can pair together, and C and G can pair together; no other pairings are possible. This pairing principle is called the Watson–Crick complementarity (named after James D. Watson and Francis H.C. Crick, who deduced the famous double helix structure of DNA in 1953 and won the Nobel Prize for the discovery).

A DNA strand is essentially a sequence (polymer) of four types of nucleotides, distinguished by one of the four bases they contain. Two strands of DNA can form (under appropriate conditions) a double strand if the respective bases are the Watson–Crick complements of each other: A matches T and C matches G; also, the 3′ end matches the 5′ end. The length of a single stranded DNA is the number of nucleotides comprising the single strand. Thus, if a single stranded DNA includes 20 nucleotides, then we say that it is a 20 mer (it is a polymer containing 20 monomers). The length of a double stranded DNA (where each nucleotide is base paired) is counted in the number of base pairs. Thus, if we make a double stranded DNA from a single stranded 20 mer, then the length of the double stranded DNA is 20 base pairs, also written 20 bp (for more discussion of the relevant biological background refer to Sinden, 1994; Boneh et al., 1996; Paun et al., 1998).
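The pairing rule and strand directionality described above can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names `reverse_complement` and `can_form_double_strand` are hypothetical, the example strand `ATTCG` is arbitrary, and real hybridization also tolerates partial matches, which this sketch ignores.

```python
# Watson-Crick complementarity: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5'->3', for a strand written 5'->3'.

    The partner runs in the opposite direction (3' end matches 5' end),
    so the input is reversed before each base is complemented.
    """
    return "".join(PAIRS[base] for base in reversed(strand))


def can_form_double_strand(s1: str, s2: str) -> bool:
    """True if two 5'->3' strands are perfect Watson-Crick complements."""
    return len(s1) == len(s2) and s2 == reverse_complement(s1)


if __name__ == "__main__":
    strand = "ATTCG"                       # an arbitrary 5 mer
    partner = reverse_complement(strand)
    print(f"{len(strand)} mer {strand} pairs with {partner}")
    print("double strand length:", len(strand), "bp")
```

Reversing before complementing is the design choice that encodes directionality: both strands are written 5′ to 3′, so a perfect duplex requires the second strand to be the reverse complement of the first, not merely its base-by-base complement.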

In the Adleman–Lipton model (Adleman, 1994; Lipton, 1995), splints were used to correspond to the edges of a particular graph, the paths of which represented all possible binary numbers. As it stands, their construction indiscriminately builds all splints that lead to a complete graph. This is to say that hybridization has higher probabilities of errors. Hence, Adleman et al. (Roweis et al., 1999) proposed the sticker-based model, which was an abstract model of molecular computing based on DNAs with a random access memory and a new form of encoding the information, to enhance the Adleman–Lipton model.

The DNA operations in the Adleman–Lipton model are described below (Adleman, 1994; Lipton, 1995; Boneh et al., 1996; Adleman, 1996). These operations will be used for figuring out solutions of the vertex-cover problem.

The Adleman–Lipton model

A (test) tube is a set of molecules of DNA (i.e., a multi-set of finite strings over the alphabet {A, C, G, T}). Given a tube, one can perform the following operations:

(1) Extract. Given a tube P and a short single strand of DNA, S, produce two tubes +(P, S) and −(P, S), where +(P, S) is all of the molecules of DNA in P which contain the strand S as a sub-strand and −(P, S) is all of the molecules of DNA in P which do not contain the short strand S.
(2) Merge. Given tubes P1 and P2, yield ∪(P1, P2), where ∪(P1, P2) = P1 ∪ P2. This operation pours two tubes into one, with no change of the individual strands.
(3) Detect. Given a tube P, say ‘yes’ if P includes at least one DNA molecule, and say ‘no’ if it contains none.
(4) Discard. Given a tube P, the operation will discard the tube P.
(5) Read. Given a tube P, the operation is used to describe a single molecule, which is contained in the tube P. Even if P contains many different molecules each encoding a different set of bases, the operation can give an explicit description of exactly one of them.

2.2. Other related work and comparison with the Adleman–Lipton model

Based on the solution space of splint in the Adleman–Lipton model, the methods of (Narayanan and Zorbala, 1998; Chang and Guo, 2002a, 2002b, 2002c, 2002d) could be applied towards solving the traveling salesman problem, the dominating-set problem, the vertex-cover problem, the clique problem, the independent-set problem, the 3-dimensional matching problem and the set-packing problem. Those methods require exponentially increasing volumes of DNA and linearly increasing time. LaBean et al. (2000) proposed an n1.89^n volume, O(n^2 + m^2) time molecular algorithm for the 3-coloring problem and a 1.51^n volume, O(n^2 m^2) time molecular algorithm for the independent set problem, where n and m are, respectively, the number of vertices and the number of edges in the problems resolved. Fu (1997) presented a polynomial-time algorithm with a 1.497^n volume for the 3-SAT problem, a polynomial-time algorithm with a 1.345^n volume for the 3-coloring problem and a polynomial-time algorithm with a 1.229^n volume for the independent set. Though those volumes (Fu, 1997; LaBean et al., 2000) are smaller, constructing them is more difficult and the time complexity of the methods is much higher.

Quyang et al. (1997) showed that restriction enzymes could be used to solve the NP-complete clique problem. The maximum number of vertices that they can process is limited to 27 because the size of the pool increases exponentially with the size of the problem (Quyang et al., 1997). Shin et al. (1999) presented an encoding scheme for decreasing the error rate in hybridization. The method (Shin et al., 1999) could be employed for the traveling salesman problem, representing integer and real values with fixed-length codes. Arita et al. (1997) and Morimoto et al. (1999), respectively, proposed new molecular experimental techniques and a solid-phase method to find a Hamiltonian path. Amos (1997) proposed a parallel filtering model for resolving the Hamiltonian path problem, the sub-graph isomorphism problem, the 3-vertex-colorability problem, the clique problem and the independent-set problem. Those methods (Arita et al., 1997; Morimoto et al., 1999; Amos, 1997) have a lower error rate in real molecular experiments.

In the literature (Reif et al., 2000; LaBean et al., 2000; LaBean and Reif, 2001), the methods for DNA-based computing by self-assembly require DNA nanostructures, called tiles, that have efficient chemistries, expressive computational power, and convenient input and output (I/O) mechanisms. DNA tiles have a much lower error rate in self-assembly. Garzon and Deaton (1999) reviewed the most important advances in molecular computing.

Adleman and co-workers (Roweis et al., 1999) proposed the sticker-based model to reduce the error rate of hybridization in the Adleman–Lipton model. Their model could be used for determining solutions to an instance of the set cover problem. Perez-Jimenez and Sancho-Caparrini (2001) employed the sticker-based model (Roweis et al., 1999) to solve knapsack problems. In our previous work, Chang and Guo (2004) and Chang et al. (2003) also employed the sticker-based model and the Adleman–Lipton model to deal with the dominating-set problem and the set-splitting problem while decreasing the error rate of hybridization.
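Before turning to the vertex-cover algorithm, the five tube operations of the Adleman–Lipton model listed in Section 2.1 (extract, merge, detect, discard, read) can be mimicked in software. The following is a minimal sketch, assuming a tube is modelled as a Python `Counter` (a multiset of strings) and that extraction is an exact substring test. The function names are hypothetical, and the sketch ignores all laboratory error.

```python
from collections import Counter

# A tube is modelled as a multiset of strands (strings over A, C, G, T).


def extract(tube: Counter, s: str) -> tuple[Counter, Counter]:
    """Return (+(P, S), -(P, S)): strands containing s, and those that do not."""
    plus = Counter({x: c for x, c in tube.items() if s in x})
    minus = Counter({x: c for x, c in tube.items() if s not in x})
    return plus, minus


def merge(p1: Counter, p2: Counter) -> Counter:
    """Pour two tubes into one without changing the individual strands."""
    return p1 + p2


def detect(tube: Counter) -> str:
    """Say 'yes' if the tube holds at least one molecule, otherwise 'no'."""
    return "yes" if sum(tube.values()) > 0 else "no"


def discard(tube: Counter) -> None:
    """Throw away the contents of the tube."""
    tube.clear()


def read(tube: Counter) -> str:
    """Describe exactly one molecule from a non-empty tube."""
    return next(iter(tube))


if __name__ == "__main__":
    tube = Counter(["ACGT", "TTGA", "ACGT", "CCCC"])
    plus, minus = extract(tube, "CG")
    print(dict(plus), dict(minus))
    print(detect(plus), read(plus))
```

A multiset is used rather than a set because a physical tube holds many copies of the same strand; `merge` therefore adds multiplicities instead of taking a set union.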

3. Using sticker for solving the vertex-cover problem in the Adleman–Lipton model

In Section 3.1, the vertex-cover problem is described. Applying sticker to construct the solution space of DNA sequences for the vertex-cover problem is introduced in Section 3.2. In Section 3.3, a DNA algorithm is proposed to solve the vertex-cover problem. In Section 3.4, the complexity of the proposed algorithm is given. In Section 3.5, the range of application of the famous Cook’s Theorem in molecular computing is described.

3.1. Definition of the vertex-cover problem

Assume that a graph G can be represented as G = (V, E), where V = {v1, ..., vn} is the set of vertices in G and E = {(va, vb) | va and vb are, respectively, vertices in V} is the set of edges in G. |V| = n is the number of vertices in V and |E| = m is the number of edges in E.

Mathematically, a vertex cover of a graph G is a subset V1 ⊆ V of vertices such that for each edge (va, vb) in E, at least one of va and vb belongs to V1 (Cormen et al., 2001; Garey and Johnson, 1979). The vertex-cover problem is to find a minimum-size vertex cover of G. The problem has been shown to be NP-complete (Garey and Johnson, 1979).

The graph in Fig. 2 denotes such a problem. In Fig. 2, the graph G contains three vertices and two edges. The minimum-size vertex cover for G is {v1}. Hence, the size of the vertex-cover problem in Fig. 2 is one. It is indicated in (Garey and Johnson, 1979) that finding a minimum-size vertex cover is an NP-complete problem, and it can be formulated as a search problem.

Fig. 2. The graph G of our problem.

3.2. Using sticker for constructing solution space of DNA sequence for the vertex-cover problem

The first step in the Adleman–Lipton model is to generate the solution space of DNA sequences for the problem to be solved. Next, basic biological operations are used to remove illegal solutions and find legal solutions from the solution space. Thus, the first step of solving the vertex-cover problem is to generate a test tube which includes all of the possible vertex covers. Assume that an n-digit binary number corresponds to each possible vertex cover of any n-vertex graph, G. Also suppose that V1 is a vertex cover for G. If the ith bit in an n-digit binary number is set to 1, then the corresponding vertex is in V1. If the ith bit in an n-digit binary number is set to 0, then the corresponding vertex is out of V1.

In this way, all of the possible vertex covers in G are transformed into an ensemble of all n-digit binary numbers. Hence, Table 1 denotes the solution space for the graph in Fig. 2. The binary number 000 in Table 1 represents the empty vertex cover. The binary numbers 001, 010 and 011 in Table 1 represent the vertex covers {v1}, {v2} and {v2, v1}, respectively. The binary numbers 100, 101 and 110 in Table 1 represent, respectively, the vertex covers {v3}, {v3, v1} and {v3, v2}. The binary number 111 in Table 1 represents the vertex cover {v3, v2, v1}. Though there are eight 3-digit binary numbers representing eight possible vertex covers in Table 1, not every 3-digit binary number corresponds to a legal vertex cover. Hence, in the next section, basic biological operations are used to develop an algorithm for removing illegal vertex covers and finding legal vertex covers.

Table 1
The solution space for the graph in Fig. 2

3-Digit binary number    The corresponding vertex-cover
000                      Ø
001                      {v1}
010                      {v2}
011                      {v2, v1}
100                      {v3}
101                      {v3, v1}
110                      {v3, v2}
111                      {v3, v2, v1}
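The binary encoding of the solution space can be reproduced with a short Python sketch. It enumerates all n-digit binary numbers, decodes each into a vertex set in the bit order x_n ... x_1 used for Table 1, and marks which ones are legal covers. The edge list (v1, v2), (v1, v3) is the one the paper gives for its example graph in the worked example of Section 3.3; the function names are hypothetical.

```python
from itertools import product


def decode(bits: str) -> set[int]:
    """Map a bit string x_n ... x_1 to the vertex set {i : x_i = 1}."""
    n = len(bits)
    # bits[0] is x_n and bits[-1] is x_1, matching the order used in Table 1.
    return {n - pos for pos, b in enumerate(bits) if b == "1"}


def is_vertex_cover(subset: set[int], edges: list[tuple[int, int]]) -> bool:
    """A subset is a legal cover if every edge has at least one endpoint in it."""
    return all(a in subset or b in subset for a, b in edges)


def legal_codes(n: int, edges: list[tuple[int, int]]) -> list[str]:
    """Return the n-digit binary numbers that encode legal vertex covers."""
    codes = ("".join(p) for p in product("01", repeat=n))
    return [c for c in codes if is_vertex_cover(decode(c), edges)]


if __name__ == "__main__":
    n = 3
    edges = [(1, 2), (1, 3)]  # edges (v1, v2) and (v1, v3) of the example graph
    for bits in ("".join(p) for p in product("01", repeat=n)):
        subset = sorted(decode(bits), reverse=True)
        status = "legal" if is_vertex_cover(decode(bits), edges) else "illegal"
        print(bits, subset, status)
```

Five of the eight codes (001, 011, 101, 110 and 111) pass the check, which matches the list of legal covers the paper arrives at for this graph.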

To implement this, assume that an unsigned integer X is represented by a binary number xn, xn−1, ..., x1, where the value of xi is 1 or 0 for 1 ≤ i ≤ n. The integer X takes 2^n possible values. Each possible value represents a vertex cover for any n-vertex graph, G. Hence, an unsigned integer X forms 2^n possible vertex covers. A bit xi in an unsigned integer X represents the ith vertex in G. If the ith vertex is in a vertex cover, then the value of xi is set to 1. If the ith vertex is out of a vertex cover, then the value of xi is set to 0.

To represent all possible vertex covers for the vertex-cover problem, sticker (Roweis et al., 1999; Braich et al., 1999) is used to construct the solution space for the problem. For every bit, xi, two distinct 15 base value sequences are designed. One represents the value 1 and the other represents the value 0 for xi. For convenience of presentation, assume that x_i^1 denotes the value of xi to be 1 and x_i^0 denotes the value of xi to be 0. Each of the 2^n possible vertex covers is represented by a library sequence of 15 × n bases consisting of the concatenation of one value sequence for each bit. DNA molecules with library sequences are termed library strands and a combinatorial pool containing library strands is termed a library. The probes used for separating the library strands have sequences complementary to the value sequences.

It is pointed out in Roweis et al. (1999) and Braich et al. (1999) that errors in the separation of the library strands are errors in the computation. Sequences must be designed to ensure that library strands have little secondary structure that might inhibit intended probe-library hybridization. The design must also exclude sequences that might encourage unintended probe-library hybridization. To help achieve these goals, sequences were computer-generated to satisfy the following constraints (Braich et al., 1999):

(1) Library sequences contain only As, Ts and Cs.
(2) All library and probe sequences have no occurrence of 5 or more consecutive identical nucleotides; i.e., no runs of more than 4 A’s, 4 T’s, 4 C’s or 4 G’s occur in any library or probe sequences.
(3) Every probe sequence has at least four mismatches with all 15 base alignment of any library sequence (except for with its matching value sequence).
(4) Every 15 base subsequence of a library sequence has at least four mismatches with all 15 base alignment of itself or any other library sequence.
(5) No probe sequence has a run of more than seven matches with any eight base alignment of any library sequence (except for with its matching value sequence).
(6) No library sequence has a run of more than seven matches with any eight base alignment of itself or any other library sequence.
(7) Every probe sequence has 4, 5 or 6 Gs in its sequence.

Constraint (1) is motivated by the assumption that library strands composed only of As, Ts and Cs will have less secondary structure than those composed of As, Ts, Cs and Gs (Kalim, 1998). Constraint (2) is motivated by two assumptions: first, that long homopolymer tracts may have unusual secondary structure and second, that the melting temperatures of probe-library hybrids will be more uniform if none of the probe-library hybrids involve long homopolymer tracts. Constraints (3) and (5) are intended to ensure that probes bind only weakly where they are not intended to bind. Constraints (4) and (6) are intended to ensure that library strands have a low affinity for themselves. Constraint (7) is intended to ensure that intended probe-library pairings have uniform melting temperatures.

The Adleman program (Braich et al., 1999) is modified for generating those DNA sequences to satisfy the constraints above. For example, for representing the three vertices in the graph in Fig. 2, the DNA sequences generated are:

• x_1^0 = AAAACTCACCCTCCT;
• x_2^0 = TCTAATATAATTACT;
• x_3^0 = ATTCTAACTCTACCT;
• x_1^1 = TTTCAATAACACCTC;
• x_2^1 = ATTCACTTCTTTAAT; and
• x_3^1 = AACATACCCCTAATC.

Therefore, for every possible vertex cover of the graph in Fig. 2, the corresponding library strand is synthesized by employing a mix-and-split combinatorial synthesis technique (Cukras et al., 1998). Similarly, for any n-vertex graph, all of the library strands representing every possible vertex cover could also be synthesized with the same technique.
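To make the construction of library strands concrete, the sketch below concatenates one 15 base value sequence per bit and checks the two simplest design constraints from Section 3.2, (1) and (2), on the six value sequences listed above. This is an illustration only. Two things are assumptions of the sketch rather than statements from the paper: the value sequences are concatenated in the order x_n ... x_1, and the helper names (`library_strand`, `only_atc`, `no_long_runs`) are hypothetical. Constraints (3) to (7), which involve probes and alignments, are not checked.

```python
from itertools import groupby, product

# VALUE[i][b] is the 15 base value sequence encoding x_i = b (from the list above).
VALUE = {
    1: {0: "AAAACTCACCCTCCT", 1: "TTTCAATAACACCTC"},
    2: {0: "TCTAATATAATTACT", 1: "ATTCACTTCTTTAAT"},
    3: {0: "ATTCTAACTCTACCT", 1: "AACATACCCCTAATC"},
}


def library_strand(bits: str) -> str:
    """Concatenate one value sequence per bit; bits is written x_n ... x_1.

    The concatenation order is an assumption made for this sketch.
    """
    n = len(bits)
    return "".join(VALUE[n - pos][int(b)] for pos, b in enumerate(bits))


def only_atc(seq: str) -> bool:
    """Constraint (1): library sequences contain only A, T and C."""
    return set(seq) <= set("ATC")


def longest_run(seq: str) -> int:
    """Length of the longest run of consecutive identical nucleotides."""
    return max(len(list(group)) for _, group in groupby(seq))


def no_long_runs(seq: str, limit: int = 4) -> bool:
    """Constraint (2): no run of more than `limit` identical nucleotides."""
    return longest_run(seq) <= limit


# The library for n = 3: one 45 base strand per 3-digit binary number.
library = {
    "".join(p): library_strand("".join(p)) for p in product("01", repeat=3)
}

if __name__ == "__main__":
    for i, pair in VALUE.items():
        for b, seq in pair.items():
            print(f"x_{i}^{b}", seq, only_atc(seq), no_long_runs(seq))
    for bits, strand in library.items():
        print(bits, len(strand), strand)
```

Each library strand has 15 × n = 45 bases for n = 3, as the text states. In a laboratory setting the strands are produced by mix-and-split synthesis rather than enumerated one by one; the enumeration here only mirrors the resulting pool.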

3.3. The DNA algorithm for solving the vertex-cover problem

The following DNA algorithm is proposed to solve the vertex-cover problem:

Algorithm 1. Solving the vertex-cover problem.

(1) Input (T0), where the tube T0 includes the solution space of DNA sequences encoding all of the possible vertex covers for any n-vertex graph, G, built with the techniques mentioned in Section 3.2.
(2) For k = 1 to m, where m is the number of edges in G.
      Assume that ek is (vi, vj), where ek is one edge in G and vi and vj are vertices in G. Also suppose that bits xi and xj, respectively, represent vi and vj.
      (a) θ1 = +(T0, x_i^1) and θ = −(T0, x_i^1).
      (b) θ2 = +(θ, x_j^1) and θ3 = −(θ, x_j^1).
      (c) T0 = ∪(θ1, θ2)
    EndFor
(3) For i = 0 to n − 1
      For j = i down to 0
        (a) T_{j+1}^{ON} = +(Tj, x_{i+1}^1) and Tj = −(Tj, x_{i+1}^1).
        (b) T_{j+1} = ∪(T_{j+1}, T_{j+1}^{ON}).
      EndFor
    EndFor
(4) For k = 1 to n
      (a) If (detect (Tk) = ‘yes’) then
      (b) Read (Tk) and terminate the algorithm.
          EndIf
    EndFor

Theorem 3.1. From those steps in Algorithm 1, the vertex-cover problem for any n-vertex graph G can be solved.

Proof 1. In Step 1, a test tube of DNA strands that encode all 2^n possible input bit sequences xn, ..., x1 is generated. It is clear that the test tube includes all 2^n possible vertex covers for any n-vertex graph, G.

From the definition of vertex cover (Cormen et al., 2001; Garey and Johnson, 1979), Step 2(a) applies the “extraction” operation to the tube T0 to form two test tubes: θ1 and θ. The first tube θ1 contains all of the strands that have xi = 1. The second tube θ consists of all of the strands that have xi = 0. It is clear from the definition of vertex cover that the tube θ represents those sets which do not include the vertex vi. Next, Step 2(b) applies the “extraction” operation to the tube θ to form two new test tubes: θ2 and θ3. The test tube θ2 includes all of the strands that have xi = 0 and xj = 1. The test tube θ3 consists of all of the strands that have xi = 0 and xj = 0. It follows from the definition of vertex cover that the tubes θ1 and θ2 contain the strands which cover the edge ek, whereas the strands in θ3 do not. Therefore, Step 2(c) applies the “merge” operation to pour the tubes θ1 and θ2 into the tube T0. After Steps 2(a)–2(c) have been executed m times, the tube T0 includes the strands which represent the legal vertex covers.

Each time the outer loop in Step 3 is executed, the inner loop is executed (i + 1) times. The first time the outer loop is executed, the inner loop is executed only once. Therefore, Steps 3(a) and 3(b) are also executed once. Step 3(a) uses the “extraction” operation to form two tubes: T_1^{ON} and T0. The first tube T_1^{ON} contains all of the strands that have x1 = 1. The second tube T0 consists of all of the strands that have x1 = 0. That is to say that the first tube encodes every vertex cover including the first vertex and the second tube represents every vertex cover not including the first vertex. Hence, Step 3(b) applies the “merge” operation to pour the tube T_1^{ON} into the tube T1. After Steps 3(a) and 3(b) have been repeated, n new tubes are finally produced. The tube Tk for n ≥ k ≥ 1 encodes those vertex covers that contain k vertices.

Because the vertex-cover problem is to find a minimum-size vertex cover, the tube T1 is tested first with the “detection” operation in Step 4(a). If it returns “yes”, then the tube T1 contains the vertex covers of minimum size. Therefore, Step 4(b) uses the “read” operation to describe the sequence of a molecule in the tube T1 and the algorithm terminates. Otherwise, Step 4(a) is repeated on the next tube until a minimum-size vertex cover is found in the tube detected.

The graph in Fig. 2 is used to show the power of Algorithm 1. From Step 1 in Algorithm 1, the tube T0 is filled with eight library strands, built with the techniques mentioned in Section 3.2, representing eight possible vertex covers for the graph in Fig. 2. The edges in the graph in Fig. 2 are (v1, v2) and (v1, v3). So, Steps 2(a)–2(c) in Algorithm 1 are executed two times. From the first execution of Step 2(a) in Algorithm 1, two tubes are generated. The first tube, θ1, includes the vertex covers {v1}, {v2, v1}, {v3, v1} and {v3, v2, v1}, and the second tube, θ, contains the vertex covers Ø, {v2}, {v3} and {v3, v2}. Next, from the first execution of Step 2(b) in Algorithm 1, two tubes are yielded. The first tube, θ2, includes the vertex covers {v2} and {v3, v2}, and the second tube, θ3, contains the vertex covers Ø and {v3}. The tubes θ1 and θ2 consist of the strands which cover the edge (v1, v2). Hence, Step 2(c) of Algorithm 1 pours the tubes θ1 and θ2 into the tube T0. Therefore, the tube T0 now includes the vertex covers {v1}, {v2, v1}, {v3, v1}, {v3, v2, v1}, {v2} and {v3, v2}. The same processing is applied to the other edge (v1, v3). After every edge is processed, the remaining strands in the tube T0 represent the legal vertex covers. That is to say that the tube T0 contains the vertex covers {v1}, {v2, v1}, {v3, v1}, {v3, v2} and {v3, v2, v1}.

Because the number of vertices in the graph in Fig. 2 is three, the outer loop in Step 3 in Algorithm 1 is executed three times. The number of executions of the inner loop in Step 3 in Algorithm 1 depends on the value of the loop variable in the outer loop. After the first execution of Step 3(a) and Step 3(b) is finished, the tube T1 contains the vertex covers {v1}, {v2, v1}, {v3, v1} and {v3, v2, v1}, and the tube T0 includes only the vertex cover {v3, v2}. After repeating Steps 3(a) and 3(b), three new tubes are finally produced. The three tubes T1, T2 and T3, respectively, include {{v1}}, {{v2, v1}, {v3, v1}, {v3, v2}} and {{v3, v2, v1}}.

Because the tube T1 is not empty, the “detection” operation on the tube T1 in Step 4(a) in Algorithm 1 returns “yes”. Therefore, Step 4(b) in Algorithm 1 reads the answer from the tube T1. Thus, a minimum-size vertex cover for the graph in Fig. 2 is {v1}. □

3.4. The complexity of the proposed DNA algorithm

The following theorems describe the time complexity of Algorithm 1, the volume complexity of the solution space in Algorithm 1, the number of tubes used in Algorithm 1 and the longest library strand in the solution space in Algorithm 1.

Theorem 3.2. The vertex-cover problem for any undirected n-vertex graph G with m edges can be solved with O(n^2) biological operations in the Adleman–Lipton model, where n is the number of vertices in G and m is at most equal to (n(n − 1)/2).

roof 2. Algorithm 1 can be applied for solving tertex-cover problem for any undirectedn-vertex graph. Algorithm 1 includes three main steps. Stepainly used to determine legal vertex covers and tove illegal vertex covers from all of the 2n possible

ibrary strands. From Algorithm 1, it is very obviohat Steps 2(a) and 2(b) take2×m “extraction” operaions and Step 2(c) takesm “merge” operations. Steps mainly applied to figure out the number of elemn every legal vertex cover. It is indicated from Algithm 1 that Step 3(a) takes (n(n− 1)/2) “extraction”perations and Step 3(b) takes (n(n− 1)/2) “merge”perations. Step 4 is used to find a minimum-size

ex cover from legal vertex cover. It is pointed out frlgorithm 1 that Step 4(a) at most takes n “detectiperations and Step 4(b) takes one “read” operaence, from the statements mentioned above, itnce inferred that the time complexity of Algorithm

T P

RO

OF

8 M. Guo et al. / BioSystems xxx (2004) xxx–xxx

is O(n2) biological operations in the Adleman–Lipton529

model. �530
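As a worked check of the bound in Theorem 3.2, the operation counts stated in Proof 2 can simply be added up. The sum below uses only those stated counts together with the bound $m \le n(n-1)/2$; it is a sketch of the arithmetic rather than part of the original proof.

```latex
\begin{align*}
\text{operations}
  &= \underbrace{2m}_{\text{Steps 2(a), 2(b)}}
   + \underbrace{m}_{\text{Step 2(c)}}
   + \underbrace{\tfrac{n(n-1)}{2}}_{\text{Step 3(a)}}
   + \underbrace{\tfrac{n(n-1)}{2}}_{\text{Step 3(b)}}
   + \underbrace{n}_{\text{Step 4(a)}}
   + \underbrace{1}_{\text{Step 4(b)}} \\
  &= 3m + n^{2} + 1 \\
  &\le \tfrac{3n(n-1)}{2} + n^{2} + 1
   = \tfrac{5n^{2} - 3n}{2} + 1
   = O(n^{2}).
\end{align*}
```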

Theorem 3.3. The vertex-cover problem for any undirected n-vertex graph G with m edges can be solved with sticker to construct $O(2^n)$ strands in the Adleman–Lipton model, where n is the number of vertices in G.

Proof 3. Refer to Theorem 3.2. □

Theorem 3.4. The vertex-cover problem for any undirected n-vertex graph G with m edges can be solved with $O(n)$ tubes in the Adleman–Lipton model, where n is the number of vertices in G.

Proof 4. Refer to Theorem 3.2. □

Theorem 3.5. The vertex-cover problem for any undirected n-vertex graph G with m edges can be solved with the longest library strand, $O(15 \times n)$, in the Adleman–Lipton model, where n is the number of vertices in G.

Proof 5. Refer to Theorem 3.2. □
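For a concrete sense of scale, the bounds in Theorems 3.3 and 3.5 can be instantiated for a three-vertex graph such as the one simulated in Section 4, assuming one 15-base sequence per bit as described there:

```latex
\begin{align*}
\text{library strands} &= 2^{n} = 2^{3} = 8, \\
\text{bases per library strand} &= 15 \times n = 15 \times 3 = 45.
\end{align*}
```

These values agree with the eight 45-base library strands listed in Table 4.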

3.5. Range of application to Cook’s Theorem in DNA computing

Cook’s Theorem (Cormen et al., 2001; Garey and Johnson, 1979) is that if an algorithm for one NP-complete problem is developed, then other problems can be solved by means of reduction to that problem. Cook’s Theorem has been demonstrated to be correct in a general digital electronic computer. Assume that $C = \{c_1, c_2, \ldots, c_m\}$ is a collection of clauses on a finite set $U = \{u_1, u_2, \ldots, u_n\}$ of variables, such that $|c_x|$ is equal to 3 for $1 \le x \le m$. The 3-satisfiability problem (3-SAT) is to find whether there is a truth assignment for $U$ that satisfies all of the clauses in $C$. The simple structure of the 3-SAT problem makes it one of the most widely used problems for other NP-completeness results (Cormen et al., 2001). The following theorem describes the range of application for Cook’s Theorem in molecular computing.

Theorem 3.6. Assume that any other NP-complete problems or any other NP-hard problems can be reduced to the vertex-cover problem with a polynomial time algorithm in a general electronic computer. If the size of a reduced NP-complete problem or a reduced NP-hard problem is not equal to or less than that of the vertex-cover problem, then a new DNA algorithm for optimal solution of the reduced NP-complete problem or the reduced NP-hard problem should be developed from the characteristic of NP-complete problems or NP-hard problems.

Proof 6. We transform the 3-SAT problem to the vertex-cover problem with a polynomial time algorithm (Cormen et al., 2001). Suppose that $U$ is $\{u_1, u_2, \ldots, u_n\}$ and $C$ is $\{c_1, c_2, \ldots, c_m\}$, where $U$ and $C$ are any instance of the 3-SAT problem. We construct a graph $G = (V, E)$ and a positive integer $K \le |V|$ such that $G$ has a vertex cover of size $K$ or less if and only if $C$ is satisfiable.

For each variable $u_i$ in $U$, there is a truth-setting component $T_i = (V_i, E_i)$, with $V_i = \{u_i, u_i^1\}$ and $E_i = \{\{u_i, u_i^1\}\}$, that is, two vertices joined by a single edge, where $u_i^1$ stands for the complement of $u_i$. Note that any vertex cover will have to contain at least one of $u_i$ and $u_i^1$ in order to cover the single edge in $E_i$. For each clause $c_j$ in $C$, there is a satisfaction testing component $S_j = (V_j^1, E_j^1)$, consisting of three vertices and three edges joining them to form a triangle:

$$V_j^1 = \{a_1[j], a_2[j], a_3[j]\}$$

$$E_j^1 = \{\{a_1[j], a_2[j]\}, \{a_1[j], a_3[j]\}, \{a_2[j], a_3[j]\}\}$$

Note that any vertex cover will have to contain at least two vertices from $V_j^1$ in order to cover the edges in $E_j^1$.

The only part of the construction that depends on which literals occur in which clauses is the collection of communication edges. These are best viewed from the vantage point of the satisfaction testing components. For each clause $c_j$ in $C$, assume that the three literals in $c_j$ are denoted as $x_j$, $y_j$ and $z_j$. Then the communication edges emanating from $S_j$ are given by:

$$E_j^2 = \{\{a_1[j], x_j\}, \{a_2[j], y_j\}, \{a_3[j], z_j\}\}.$$

The construction of our instance of the vertex-cover problem is completed by setting $K = n + (2 \times m)$ and $G = (V, E)$, where

$$V = \left(\bigcup_{i=1}^{n} V_i\right) \cup \left(\bigcup_{j=1}^{m} V_j^1\right)$$

and

$$E = \left(\bigcup_{i=1}^{n} E_i\right) \cup \left(\bigcup_{j=1}^{m} E_j^1\right) \cup \left(\bigcup_{j=1}^{m} E_j^2\right).$$

Therefore, the number of vertices and the number of edges in $G$ are, respectively, $(2 \times n) + (3 \times m)$ and $n + (6 \times m)$. This implies that if Algorithm 1 is used to solve the reduced 3-SAT problem, then it will take $O(4 \times n^2 + 12 \times m \times n + 9 \times m^2)$ biological operations with $O(2^{2n+3m})$ DNA strands. However, from the characteristic of the 3-SAT problem, Lipton’s algorithm will only take $O(m)$ biological operations with $O(2^n)$ DNA strands. Therefore, it is inferred that if the size of a reduced NP-complete problem or a reduced NP-hard problem is not equal to or less than that of the vertex-cover problem, then a new DNA algorithm for optimal solution of the reduced NP-complete problem or the reduced NP-hard problem should be developed from the characteristic of NP-complete problems or NP-hard problems.

From Theorems 3.3–3.6, if the size of a reduced NP-complete problem or a reduced NP-hard problem is equal to or less than that of the vertex-cover problem, then Algorithm 1 can be directly used for solving the reduced NP-complete or the reduced NP-hard problem. Otherwise, a new DNA algorithm for optimal solution of a reduced NP-complete problem or a reduced NP-hard problem should be developed according to the characteristic of NP-complete or NP-hard problems. □
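The size blow-up discussed in Proof 6 can be checked on a small instance. The sketch below builds the graph of the 3-SAT to vertex-cover transformation as it is described in the proof. The function name `reduce_3sat_to_vertex_cover`, the vertex labels and the signed-integer encoding of literals (`+i` for variable i, `-i` for its complement) are illustrative assumptions, not notation from the paper.

```python
def reduce_3sat_to_vertex_cover(n, clauses):
    """Build (V, E, K) from a 3-SAT instance with n variables.

    clauses: list of 3-tuples of nonzero ints; +i means u_i, -i its complement.
    """
    vertices = set()
    edges = set()
    # Truth-setting components: one edge per variable.
    for i in range(1, n + 1):
        pos, neg = ("u", i), ("not_u", i)
        vertices |= {pos, neg}
        edges.add(frozenset({pos, neg}))
    # Satisfaction-testing components: one triangle per clause,
    # plus three communication edges to the clause's literals.
    for j, clause in enumerate(clauses, start=1):
        a1, a2, a3 = (("a", k, j) for k in (1, 2, 3))
        vertices |= {a1, a2, a3}
        edges |= {frozenset({a1, a2}), frozenset({a1, a3}), frozenset({a2, a3})}
        for corner, literal in zip((a1, a2, a3), clause):
            target = ("u", literal) if literal > 0 else ("not_u", -literal)
            edges.add(frozenset({corner, target}))
    k = n + 2 * len(clauses)
    return vertices, edges, k


# Hypothetical instance with n = 3 variables and m = 2 clauses.
n, clauses = 3, [(1, -2, 3), (-1, 2, -3)]
V, E, K = reduce_3sat_to_vertex_cover(n, clauses)
m = len(clauses)
print(len(V) == 2 * n + 3 * m, len(E) == n + 6 * m, K == n + 2 * m)
```

For this instance the reduced graph has 12 vertices, so a sticker library for Algorithm 1 would range over $2^{12}$ candidate covers, whereas a direct 3-SAT encoding over the three variables needs only $2^{3}$ assignments.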

4. Experimental results of simulated DNA computing

We finished the modification of the Adleman program (Braich et al., 1999). This modified program is applied to generate DNA sequences for solving the vertex-cover problem. Because the source code of the two functions srand48() and drand48() was not found in the original Adleman program, we use the standard function srand() in C++ Builder 6.0 to replace the function srand48() and added the source code for the function drand48(). We also added subroutines to the Adleman program for simulating the biological operations of the Adleman–Lipton model in Section 2, and subroutines to simulate Algorithm 1 in Section 3.3.

The Adleman program is used for constructing each 15-base DNA sequence for each bit of the library. For each bit, the program is applied for generating two 15-base random sequences (for 1 and 0) and checking to see if the library strands satisfy the seven constraints in Section 3.2 with the new DNA sequences added. If the constraints are satisfied, the new DNA sequences are greedily accepted. If the constraints are not satisfied, then mutations are introduced one by one into the new block until either (A) the constraints are satisfied and the new DNA sequences are then accepted or (B) a threshold for the number of mutations is exceeded, in which case the program has failed and exits, printing the sequence found so far. If n bits that satisfy the constraints are found, then the program has succeeded and it outputs these sequences.

Consider the graph in Fig. 2. The graph includes three vertices: $v_1$, $v_2$ and $v_3$. DNA sequences generated by the modified Adleman program are shown in Table 2. This program took one mutation, one mutation and 10 mutations, respectively, to make new DNA sequences for $v_1$, $v_2$ and $v_3$. With the nearest neighbor parameters, the Adleman program was used to calculate the enthalpy, entropy and free energy for the binding of each probe to its corresponding region on a library strand. The energy is shown in Table 3. Only ΔG really matters to the energy of each bit. For example, the ΔG for the probe binding a ‘1’ in the first bit is, thus, estimated to be 24.3 kcal/mol and the ΔG for the probe binding a ‘0’ is estimated to be 27.5 kcal/mol.

Table 2
Sequences chosen to represent the vertices in the graph in Fig. 2

Vertex     DNA sequence
$x_3^0$    5′-ATTCTAACTCTACCT-3′
$x_2^0$    5′-TCTAATATAATTACT-3′
$x_1^0$    5′-AAAACTCACCCTCCT-3′
$x_3^1$    5′-AACATACCCCTAATC-3′
$x_2^1$    5′-ATTCACTTCTTTAAT-3′
$x_1^1$    5′-TTTCAATAACACCTC-3′

Table 3
The energy for the binding of each probe to its corresponding region on a library strand

Vertex     Enthalpy energy (ΔH)    Entropy energy (ΔS)    Free energy (ΔG)
$x_3^0$    105.2                   277.1                  22.4
$x_2^0$    104.8                   283.7                  19.9
$x_1^0$    113.7                   288.7                  27.5
$x_3^1$    112.6                   291.2                  25.6
$x_2^1$    107.8                   283.5                  23
$x_1^1$    105.6                   271.6                  24.3

Table 4
DNA sequences chosen to represent all possible vertex covers

5′-ATTCTAACTCTACCTTCTAATATAATTACTAAAACTCACCCTCCT-3′
3′-TAAGATTGAGATGGAAGATTATATTAATGATTTTGAGTGGGAGGA-5′

5′-ATTCTAACTCTACCTTCTAATATAATTACTTTTCAATAACACCTC-3′
3′-TAAGATTGAGATGGAAGATTATATTAATGAAAAGTTATTGTGGAG-5′

5′-ATTCTAACTCTACCTATTCACTTCTTTAATAAAACTCACCCTCCT-3′
3′-TAAGATTGAGATGGATAAGTGAAGAAATTATTTTGAGTGGGAGGA-5′

5′-ATTCTAACTCTACCTATTCACTTCTTTAATTTTCAATAACACCTC-3′
3′-TAAGATTGAGATGGATAAGTGAAGAAATTAAAAGTTATTGTGGAG-5′

5′-AACATACCCCTAATCTCTAATATAATTACTAAAACTCACCCTCCT-3′
3′-TTGTATGGGGATTAGAGATTATATTAATGATTTTGAGTGGGAGGA-5′

5′-AACATACCCCTAATCTCTAATATAATTACTTTTCAATAACACCTC-3′
3′-TTGTATGGGGATTAGAGATTATATTAATGAAAAGTTATTGTGGAG-5′

5′-AACATACCCCTAATCATTCACTTCTTTAATAAAACTCACCCTCCT-3′
3′-TTGTATGGGGATTAGTAAGTGAAGAAATTATTTTGAGTGGGAGGA-5′

5′-AACATACCCCTAATCATTCACTTCTTTAATTTTCAATAACACCTC-3′
3′-TTGTATGGGGATTAGTAAGTGAAGAAATTAAAAGTTATTGTGGAG-5′
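The library strands of Table 4 can be reproduced from the per-bit sequences of Table 2 by concatenating one 15-base block per vertex, in the order $v_3$, $v_2$, $v_1$, and taking the base-wise Watson–Crick complement for the lower strand. The sketch below illustrates this; the names `BIT_SEQ`, `library_strand` and `complement` are hypothetical and are not taken from the Adleman program.

```python
from itertools import product

# Table 2: 15-base sequence for (vertex index, bit value).
BIT_SEQ = {
    (3, 0): "ATTCTAACTCTACCT",
    (2, 0): "TCTAATATAATTACT",
    (1, 0): "AAAACTCACCCTCCT",
    (3, 1): "AACATACCCCTAATC",
    (2, 1): "ATTCACTTCTTTAAT",
    (1, 1): "TTTCAATAACACCTC",
}

# Base-wise Watson-Crick pairing: A<->T, C<->G.
PAIRING = str.maketrans("ACGT", "TGCA")


def library_strand(bits):
    """Concatenate the blocks for (b3, b2, b1), written 5' to 3'."""
    return "".join(BIT_SEQ[(v, b)] for v, b in zip((3, 2, 1), bits))


def complement(strand):
    """Complementary strand aligned base by base, i.e. written 3' to 5'."""
    return strand.translate(PAIRING)


# Enumerate (b3, b2, b1) from 000 to 111: the empty set, {v1}, {v2}, {v2, v1}, ...
library = [library_strand(bits) for bits in product((0, 1), repeat=3)]
for strand in library:
    print("5'-" + strand + "-3'")
    print("3'-" + complement(strand) + "-5'")
```

Enumerating the bit patterns in increasing binary order gives the strands in the same order as the vertex covers listed for Table 4.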

The program simulated a mix-and-split combinatorial synthesis technique (Cukras et al., 1998) to synthesize the library strand for every possible vertex cover. Those library strands are shown in Table 4, and represent the eight possible vertex covers: $\emptyset$, $\{v_1\}$, $\{v_2\}$, $\{v_2, v_1\}$, $\{v_3\}$, $\{v_3, v_1\}$, $\{v_3, v_2\}$ and $\{v_3, v_2, v_1\}$, respectively. The program is also applied to figure out the average and standard deviation for the enthalpy, entropy and free energy over all probe/library strand interactions. The energy is shown in Table 5. The standard deviation for ΔG is small because this is partially enforced by the constraint that there are 4, 5 or 6 Gs (the seventh constraint in Section 3.2) in the probe sequences.

Table 5
The energy over all probe/library strand interactions

                      Enthalpy energy (ΔH)    Entropy energy (ΔS)    Free energy (ΔG)
Average               108.283                 282.633                23.7833
Standard deviation    3.58365                 6.63867                2.41481

The Adleman program is employed for computing the distribution of the types of potential mishybridizations. The distribution of the types of potential mishybridizations is the absolute frequency of a probe-strand match of length k, for k from 0 to the bit length 15 (for DNA sequences), where probes are not supposed to match the strands. The distribution is, respectively, 106, 152, 183, 215, 216, 225, 137, 94, 46, 13, 4, 1, 0, 0, 0 and 0. The last four zeros indicate that there are 0 occurrences where a probe matches a strand at 12, 13, 14 or 15 places. This shows that the third constraint in Section 3.2 has been satisfied. Clearly, the number of matches peaks at 5 (225). That is to say, there are 225 occurrences where a probe matches a strand at 5 places.

The results for the simulation of Steps 2–4 are shown in Tables 6–10, respectively. From tube $T_1$, the answer was found to be $\{v_1\}$.

Table 6
DNA sequences generated by Step 2 represent legal vertex covers

5′-ATTCTAACTCTACCTTCTAATATAATTACTTTTCAATAACACCTC-3′
5′-ATTCTAACTCTACCTATTCACTTCTTTAATTTTCAATAACACCTC-3′
5′-AACATACCCCTAATCTCTAATATAATTACTTTTCAATAACACCTC-3′
5′-AACATACCCCTAATCATTCACTTCTTTAATAAAACTCACCCTCCT-3′
5′-AACATACCCCTAATCATTCACTTCTTTAATTTTCAATAACACCTC-3′

Table 7
DNA sequence generated by Step 3 represents that vertex cover only containing one vertex

5′-ATTCTAACTCTACCTTCTAATATAATTACTTTTCAATAACACCTC-3′

Table 8
DNA sequences generated by Step 3 represent those vertex covers including two vertices

5′-ATTCTAACTCTACCTATTCACTTCTTTAATTTTCAATAACACCTC-3′
5′-AACATACCCCTAATCTCTAATATAATTACTTTTCAATAACACCTC-3′
5′-AACATACCCCTAATCATTCACTTCTTTAATAAAACTCACCCTCCT-3′

Table 9
DNA sequence generated by Step 3 represents that vertex cover containing three vertices

5′-AACATACCCCTAATCATTCACTTCTTTAATTTTCAATAACACCTC-3′

Table 10
DNA sequence generated by Step 4 represents the minimum-size vertex cover

5′-ATTCTAACTCTACCTTCTAATATAATTACTTTTCAATAACACCTC-3′
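The averages and standard deviations reported in Table 5 can be cross-checked against the six per-probe rows of Table 3. The sketch below assumes that the population standard deviation (dividing by the number of probes) is the intended statistic; under that assumption the results agree with Table 5 to within about 0.001. The names `TABLE_3` and `summarize` are hypothetical.

```python
from statistics import mean, pstdev

# Table 3 rows: (enthalpy, entropy, free energy) per probe.
TABLE_3 = {
    "x3_0": (105.2, 277.1, 22.4),
    "x2_0": (104.8, 283.7, 19.9),
    "x1_0": (113.7, 288.7, 27.5),
    "x3_1": (112.6, 291.2, 25.6),
    "x2_1": (107.8, 283.5, 23.0),
    "x1_1": (105.6, 271.6, 24.3),
}


def summarize(rows):
    """Return [(average, population standard deviation)] for each column."""
    columns = zip(*rows.values())
    return [(mean(col), pstdev(col)) for col in columns]


(h_avg, h_sd), (s_avg, s_sd), (g_avg, g_sd) = summarize(TABLE_3)
print(f"enthalpy    avg={h_avg:.3f} sd={h_sd:.3f}")
print(f"entropy     avg={s_avg:.3f} sd={s_sd:.3f}")
print(f"free energy avg={g_avg:.3f} sd={g_sd:.3f}")
```

Using the sample standard deviation (dividing by one less than the number of probes) gives noticeably larger values than those in Table 5, which is why the population form is assumed here.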

5. Conclusions

Cook’s Theorem is that if an algorithm for an NP-complete or an NP-hard problem is developed, then other problems can be solved by means of reduction to that problem. Cook’s Theorem has been demonstrated to be correct in a general digital electronic computer. In this paper, we showed, from Theorems 3.3–3.6, that if the size of a reduced NP-complete problem is equal to or less than that of the vertex-cover problem, then Cook’s Theorem is correct in molecular computing. Otherwise, a new DNA algorithm for optimal solution of a reduced NP-complete problem or a reduced NP-hard problem should be developed from the characteristic of NP-complete problems or NP-hard problems.

Chang and Guo (2002b, 2002d) applied splints to constructing the solution space of DNA sequences for solving the vertex-cover problem in the Adleman–Lipton model. This causes hybridization to have higher probabilities of errors. Adleman and co-workers (Roweis et al., 1999) proposed sticker to decrease the probabilities of errors in hybridization in the Adleman–Lipton model. The main result of the proposed algorithms shows that the vertex-cover problem is solved with biological operations in the Adleman–Lipton model from a solution space of sticker. Furthermore, this work represents clear evidence for the ability of DNA-based computing to solve NP-complete problems.

Currently, there are still many NP-complete problems that have not been solved, because it is very difficult to use basic biological operations to replace mathematical operations. We are not sure whether molecular computing can be applied for dealing with every NP-complete problem. Therefore, in the future, our main work is to solve other unsolved NP-complete problems with the Adleman–Lipton model and the sticker model, or to develop a new model.

References

Sinden, R.R., 1994. DNA Structure and Function. Academic Press.

Adleman, L., 1994. Molecular computation of solutions to combinatorial problems. Science 266, 1021–1024.

Lipton, R.J., 1995. DNA solution of hard computational problems. Science 268, 542–545.

Ouyang, Q., Kaplan, P.D., Liu, S., Libchaber, A., 1997. DNA solution of the maximal clique problem. Science 278, 446–449.

Arita, M., Suyama, A., Hagiya, M., 1997. A heuristic approach for Hamiltonian path problem with molecules. In: Proceedings of Second Genetic Programming (GP-97), pp. 457–462.

Morimoto, N., Arita, M., Suyama, A., 1999. Solid phase DNA solution to the Hamiltonian path problem. In: Series in Discrete Mathematics and Theoretical Computer Science, vol. 48, pp. 93–206.

Narayanan, A., Zorbala, S., 1998. DNA algorithms for computing shortest paths. In: Koza, J.R., et al. (Eds.), Genetic Programming 1998: Proceedings of the Third Annual Conference, pp. 718–724.

Shin, S.-Y., Zhang, B.-T., Jun, S.-S., 1994. Solving traveling salesman problems using molecular programming. In: Proceedings of the 1999 Congress on Evolutionary Computation (CEC99), vol. 2, pp. 994–1000.

Cormen, T.H., Leiserson, C.E., Rivest, R.L., 2001. Introduction to Algorithms, second ed. The MIT Press.

Garey, M.R., Johnson, D.S., 1979. Computers and Intractability. Freeman, San Francisco, CA.

Boneh, D., Dunworth, C., Lipton, R.J., Sgall, J., 1996. On the computational power of DNA. In: Discrete Applied Mathematics, vol. 71, pp. 79–94 (special issue on computational molecular biology).

Adleman, L.M., 1996. On constructing a molecular computer. DNA based computers. In: Lipton, R., Baum, E. (Eds.), Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society, pp. 1–21.

Amos, M., 1997. DNA computation, Ph.D. Thesis, Department of Computer Science, The University of Warwick.

Roweis, S., Winfree, E., Burgoyne, R., Chelyapov, N.V., Goodman, M.F., Rothemund, P.W.K., Adleman, L.M., 1999. Sticker Based Model for DNA Computation, Princeton University. In: Landweber, L., Baum, E. (Eds.), Proceedings of the Second Annual Workshop on DNA Computing, Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society, pp. 1–29.

Perez-Jimenez, M.J., Sancho-Caparrini, F., 2001. Solving Knapsack Problems in a Sticker Based Model. In: Proceedings of the Seventh Annual Workshop on DNA Computing, DIMACS: Series in Discrete Mathematics and Theoretical Computer Science, American Mathematical Society.

Paun, G., Rozenberg, G., Salomaa, A., 1998. DNA Computing: New Computing Paradigms. Springer–Verlag, New York.

Chang, W.-L., Guo, M., 2002a. Solving the dominating-set problem in Adleman–Lipton’s Model. In: The Third International Conference on Parallel and Distributed Computing, Applications and Technologies, Japan, pp. 167–172.

Chang, W.-L., Guo, M., 2002b. Solving the clique problem and the vertex-cover problem in Adleman–Lipton’s Model. In: IASTED International Conference on Networks, Parallel and Distributed Processing, and Applications, Japan, pp. 431–436.

Chang, W.-L., Guo, M., 2002c. Solving NP-complete problem in the Adleman–Lipton Model. In: The Proceedings of 2002 International Conference on Computer and Information Technology, Japan, pp. 157–162.

Chang, W.-L., Guo, M., 2002d. Solving the 3-dimensional matching problem and the set packing problem in Adleman–Lipton’s Model. In: IASTED International Conference on Networks, Parallel and Distributed Processing, and Applications, Japan, pp. 455–460.

Fu, B., 1997. Volume bounded molecular computation, Ph.D. Thesis, Department of Computer Science, Yale University.

Braich, R.S., Johnson, C., Rothemund, P.W.K., Hwang, D., Chelyapov, N., Adleman, L.M., 1999. Solution of a satisfiability problem on a gel-based DNA computer. In: Proceedings of the Sixth International Conference on DNA Computation in the Springer–Verlag Lecture Notes in Computer Science Series.

Kalim, M., Restricted genetic alphabet for DNA computing. In: Eric, B., Baum, Landweber, L.F., 1998. DNA Based Computers II: DIMACS Workshop, June 10–12, 1996, volume 44 of DIMACS: Series in Discrete Mathematics and Theoretical Computer Science, Providence, RI, pp. 243–246.

Cukras, A.R., Faulhammer, D., Lipton, R.J., Landweber, L.F., 1998. Chess games: a model for RNA-based computation. In: Proceedings of the Fourth DIMACS Meeting on DNA Based Computers, University of Pennsylvania, pp. 27–37.

Chang, W.-L., Guo, M., 2004. Using sticker for solving the dominating-set problem in the Adleman–Lipton Model. IEICE Trans. Inf. Syst. E-87D (7), 1782–1788.

Reif, J.H., LaBean, T.H., Seeman, 2000. Challenges and applications for self-assembled DNA-nanostructures. In: Proceedings of the Sixth DIMACS Workshop on DNA Based Computers, Leiden, Holland.

LaBean, T.H., Winfree, E., Reif, J.H., 2000. Experimental progress in computation by self-assembly of DNA tilings. Theor. Comput. Sci. 54, 123–140.

LaBean, T.H., Reif, J.H., 2001. Logical computation using algorithmic self-assembly of DNA triple-crossover molecules. Nature 407, 493–496.

Garzon, M.H., Deaton, R.J., 1999. Biomolecular Computing and Programming. IEEE Trans. Evolut. Comput. 3, 236–250.

Chang, W.-L., Guo, M., Ho, M., 2003. Solving the set-splitting problem in sticker-based model and the Adleman–Lipton Model. In: The 2003 International Symposium on Parallel and Distributed Processing and Applications, Aizu City, Japan, LNCS 2745, pp. 185–196.