A DNA computer model for solving vertex coloring problem


ARTICLES

www.scichina.com www.springerlink.com 2541

Chinese Science Bulletin 2006 Vol. 51 No. 20 2541—2549 DOI: 10.1007/s11434-006-2145-6

XU Jin1,2, QIANG Xiaoli1, FANG Gang1 & ZHOU Kang1

1. Department of Control Science and Engineering, Huazhong University of Science and Technology, Wuhan 430074, China;

2. Department of Biotechnology, Dalian University, Dalian 116622, China

Correspondence should be addressed to Xu Jin ([email protected]) Received November 9, 2005; accepted December 23, 2005

Abstract A special DNA computer was designed to solve the vertex coloring problem. The main body of this DNA computer is a polyacrylamide gel electrophoresis unit divided into three parts: a melting region, an unsatisfied solution region and a solution region. The polyacrylamide gel is connected to a temperature-control device, and the temperatures of the three regions are $T_{m1}$, $T_{m2}$ and $T_{m3}$, respectively. With emphasis on the encoding method, we successfully performed the experiment on a graph with 5 vertices. In this paper we introduce the basic structure, the principle and the method of forming the library DNA sequences.

Keywords: DNA computer, graph vertex coloring problem, encoding.

Since Adleman[1] demonstrated how standard methods of molecular biology can be applied to a hard computational problem, research on DNA computation and DNA computers has made much headway. So far, many DNA computing models have been proposed based on DNA molecules, enzymes and biotechnology. Some of them are theoretical models, such as the DNA computing model presented by Lipton[2] for solving the 3-SAT problem and the sticker model introduced by Roweis[3]. Some are based on ingenious molecular biology techniques. For example, in 2002 Adleman's research group used a DNA computer to solve an NP-complete (NPC) problem by gel electrophoresis with temperature control; this computational problem may be the largest yet solved by nonelectronic means[4]. Others are based on laboratory techniques. For instance, Sakamoto offered a model that solves the SAT problem through the hairpin structure of DNA[5]. Using POA to generate DNA sequences, Ouyang designed a DNA molecular database for DNA computation and, based on these sequences, set up an algorithm for the maximal clique problem[6]. Surface-based DNA computation was used by Liu to solve the SAT problem[7]. Shapiro's group at the Weizmann Institute proposed an autonomous DNA computer model for diagnosing and treating diseases[8,9].

In this work, we designed a special DNA computer model to solve the vertex coloring problem. The main body of this DNA computer is a polyacrylamide gel electrophoresis unit consisting of three parts, i.e. a melting region, an unsatisfied solution region and a solution region. The polyacrylamide gel is connected to a temperature-control device, and the temperatures of the three parts are denoted by $T_{m1}$, $T_{m2}$ and $T_{m3}$, respectively. The library has two parts: one, called the storage library, is composed of DNA strands representing all possible colorings; the other is composed of probes, i.e. complementary strands that represent the structure of the graph. Taking our own encoding approach for DNA computing as an example, we give a set of codes for a graph with 20 vertices (Fig. 3). We successfully performed the experiment on a graph with 5 vertices.

As is well known, the vertex coloring problem is a difficult combinatorial optimization problem that arises in many fields, such as work scheduling, curriculum scheduling and storage[10–12]. It is an NPC problem. Many algorithms have been developed for it, such as conventional algorithms[13–15], artificial neural network algorithms[16,17], genetic algorithms[18] and simulated annealing algorithms[19]. In recent years our group has proposed many theoretical models for solving this problem. Based on our previous work[20–22], a new DNA computer model is designed in this paper.

Throughout this paper all graphs are finite and undirected. G denotes a graph, and V(G) and E(G) denote its vertex set and edge set, respectively; r, b and y denote the colors red, blue and yellow, which form the color set {r, b, y}. Solving the 3-coloring problem is thus equivalent to finding a mapping

$$f: V(G) \to \{r, b, y\}, \qquad \forall\, uv \in E(G),\ f(u) \neq f(v).$$

The 3-coloring problem of a graph is an NPC problem[23]. The results of this paper can readily be adapted to the case of $k \geq 4$ colors.
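The definition above can be made concrete with a small brute-force sketch. It is only an illustration of the mapping $f$ and the edge condition, not the DNA procedure of this paper; the function names and the triangle example are hypothetical.

```python
from itertools import product

COLORS = ("r", "b", "y")

def is_proper(coloring, edges):
    """Return True if no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[v] for u, v in edges)

def proper_colorings(vertices, edges, colors=COLORS):
    """Enumerate every mapping f: V -> colors and keep the proper ones."""
    for assignment in product(colors, repeat=len(vertices)):
        coloring = dict(zip(vertices, assignment))
        if is_proper(coloring, edges):
            yield coloring

# Hypothetical example graph: a triangle on vertices 1, 2, 3.
triangle_vertices = [1, 2, 3]
triangle_edges = [(1, 2), (2, 3), (1, 3)]
solutions = list(proper_colorings(triangle_vertices, triangle_edges))
print(len(solutions))
```

The enumeration visits all $3^n$ assignments, which is exactly the search space that the storage library represents with one DNA strand per assignment.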

1 Basic structure and fundamental principle

1.1 Basic structure

The computer is composed of two parts: the hardware and the software. The hardware is a polyacrylamide gel electrophoresis unit containing three regions. The first region is called the melting region or melting box, in which dsDNA molecules melt into ssDNA molecules. One of the two strands is fixed on the polyacrylamide, and the other is free to migrate under electrophoresis. The strand is fixed either with Acrydite™ or with magnetic beads. For the theory and applications of Acrydite™, please refer to ref. [24]. We put all the dsDNA representing the possible solutions of the problem on a board and name it board A.

The second region is called the unsatisfied solution sticking region or unsatisfied solution box, in which we put a board carrying many probes. The probes are ssDNA sequences that play the role of the adjacency matrix, encoding all the information of the graph; we name this board B. The purpose of the probes on board B is to eliminate the candidate solutions that do not satisfy the problem.

The third region is called the solution region or solution box. Each ssDNA here represents a solution satisfying the problem.

It should be emphasized that there are two boards, board A and board B. Board A, named the storage library, is a set that contains all the possible solutions of the problem. The storage library is made on a glass plate filled with polyacrylamide gel, and the “data” contained in it are all dsDNA. Board B is composed of many ssDNA probes, which represent the information of a given graph, so different graphs have different information on board B. Board B can be divided into many sub-boards, and the union of the probes on these sub-boards is the full set of probes on board B. To eliminate the unsatisfied solutions repeatedly, in some experiments we can make the intersection of two sub-boards non-empty (i.e. the same probe appears on two or more sub-boards). The dsDNA molecules on board A, however, need not be changed unless the number of vertices changes (Fig. 1).

1.2 Fundamental principle

The main operation of this DNA computer is polyacrylamide gel electrophoresis. The principle is as follows.

The polyacrylamide gel is connected to a temperature-control apparatus that keeps the three regions at $T_{m1}$, $T_{m2}$ and $T_{m3}$, respectively. When the melting region is at $T_{m1}$, each dsDNA molecule melts into two ssDNA molecules. One strand stays fixed on the polyacrylamide (board A), and the other, unfixed strand moves to the unsatisfied solution region under the electric field. The unsatisfied solution region is then set to $T_{m2}$, the temperature at which the probes on board B hybridize with the ssDNA (the length of the unsatisfied solution box can be adjusted according to the experiment). The solution region is held at $T_{m3}$, and the ssDNA that reaches this box is a valid solution to the problem.

2 Design of the library

2.1 The principle of library design

We always assume that the graph treated by the vertex-coloring DNA computer has order $n$. Only the 3-coloring problem is considered in this paper; the principle of the vertex-coloring DNA computer is the same when the number of colors is $k \geq 4$.

In this paper, we take the vertex set of the graph to be $V(G) = \{1, 2, \ldots, n\}$, and each vertex $i$ can take three different colors, $r_i$, $b_i$, $y_i$, $i = 1, 2, \ldots, n$. Here $r_i$ means that vertex $i$ is red, $b_i$ that it is blue, and $y_i$ that it is yellow. Each color is represented by an ssDNA of length $l$ (Fig. 2).

Note that $l$ is determined by such factors as the number of vertices and the number of colors. For problems with few variables, the length $l$ of the ssDNA should satisfy $15 \leq l \leq 30$. Here, using DNA strands with 15 bases to represent each variable (coloring), we get the sequence set

$$T = \{\, x_1 x_2 \cdots x_n \;;\; x_i \in \{r_i, b_i, y_i\};\ i = 1, 2, \ldots, n \,\}. \tag{1}$$

The number of DNA sequences in this set, which represents all the possible colorings, is $3^n$. The length of each sequence is $n \times l$, i.e. each sequence contains $n \times l$ bases.
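As a quick check of these counts, the two instances used in this paper (the 5-vertex experiment and the 20-vertex code set, both taken with $l = 15$ as stated in the text) work out as follows:

```latex
\begin{align*}
n = 5:&\quad |T| = 3^{5} = 243, & n \times l &= 5 \times 15 = 75 \text{ bases},\\
n = 20:&\quad |T| = 3^{20} = 3\,486\,784\,401, & n \times l &= 20 \times 15 = 300 \text{ bases}.
\end{align*}
```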

To construct the storage library of this DNA computer, we first design each DNA sequence in the storage library (i.e. the encoding); the DNA sequences are then synthesized from these codes on a DNA synthesizer such as the ABI 3900. These sequences can be fixed on board A.

Fig. 1. The hardware of the DNA computer.

Fig. 2. The vertices and corresponding colors.

2.2 The encoding for the storage library

Research in DNA computing shows that encoding is a crucial and difficult problem: ① it directly affects the quantity of DNA to be synthesized; ② the quality of the codes determines whether the designed biochemical reactions perform as intended; ③ the encoding is closely related to the exponential explosion of the solution space (i.e. the quantity of DNA molecules needed grows exponentially with the size of the problem), a difficult problem in DNA computing that a good encoding may help to solve; ④ another difficult problem, the detection of solutions, may be solved by combining the encoding with enzymes, hairpins and fluorescence labels.

Many factors, such as ΔG, $T_m$, enzymes, the Hamming distance and the composition of the DNA molecules, are considered in order to reduce nonspecific hybridization. The codes are obtained in three steps: (i) finding all biological constraints; (ii) converting them into mathematical constraints; (iii) giving the encoding algorithm and the corresponding codes.

As mentioned above, board A contains DNA sequences representing all possible colorings of the vertex 3-coloring problem. The length of each sequence is $n \times l$, i.e. each sequence contains $n \times l$ bases; we encode every vertex color with a 15-base sequence. Following ref. [4], we impose the following constraints:

1. Library sequences contain only A's, T's and C's.
2. No library or probe sequence contains a run of 5 or more consecutive identical bases.
3. Each probe sequence has at least 8 mismatches with every 15-base alignment of any library sequence (except for its matching value sequence).
4. Each library sequence has at least 8 mismatches with every 15-base alignment of itself or any other library sequence.
5. No probe sequence has a run of more than 7 matches with any 8-base alignment of any library sequence (except for its matching value sequence).
6. No library sequence has a run of more than 7 matches with any 8-base alignment of itself or any other library sequence.
7. Every probe sequence contains 5 or 6 G's.
8. The Hamming distance of every sequence is 8, with m1-m10 ≥ 5, m6-m15 ≥ 5, m1-m5 ≥ 4 and m11-m15 ≥ 4. Here, m1-m10 denotes the Hamming distance over positions 1 to 10.

In this paper we omit the method of converting the biological constraints into mathematical constraints. The specific algorithm is given in an unpublished paper1).
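Some of these constraints are easy to check mechanically. The sketch below covers only constraints 1 and 2 and an in-register Hamming comparison; it does not implement the sliding-alignment tests of constraints 3 to 6, and it is not the unpublished encoding algorithm. The function names are hypothetical, and the two test strands are the codes $r_1$ and $b_1$ of the 20-vertex set listed in this section.

```python
def uses_only_atc(seq):
    """Constraint 1: a library sequence uses only A, T and C."""
    return set(seq) <= set("ATC")

def longest_run(seq):
    """Length of the longest run of identical consecutive bases."""
    best = run = 1 if seq else 0
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

def hamming(a, b):
    """Number of mismatching positions between two equal-length strands."""
    if len(a) != len(b):
        raise ValueError("strands must have equal length")
    return sum(x != y for x, y in zip(a, b))

# Two of the 20-vertex library codes listed in the paper (r_1 and b_1).
r1 = "CAACACATTTAACTCATTCT"
b1 = "CAATCAATACTTAACCATAT"

for name, seq in (("r1", r1), ("b1", b1)):
    # Constraint 2 holds when the longest run is shorter than 5.
    print(name, uses_only_atc(seq), longest_run(seq) < 5)
print("mismatches r1 vs b1:", hamming(r1, b1))
```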

As a paradigm, the codes for solving the 3-coloring problem of any graph with 20 vertices are given as follows:

r_1 = CAACACATTTAACTCATTCT; b_1 = CAATCAATACTTAACCATAT; y_1 = ATACAACACCTTTAATACT;
r_2 = CTTCTCTAAATTCACTAACA; b_2 = CTTACTTATCAATTCCTATA; y_2 = ACCCTCAACTAATTCCTATA;
r_3 = ACCATAAAACATCCTTCCTT; b_3 = ACACCTATAAACTAATACCA; y_3 = ACTACAACCCTTTACAACTC;
r_4 = AACAAATTTTCCTCCTCTTT; b_4 = AACTTCTTCATTAACCATAT; y_4 = AAACTTTCTCCAAATTATCA;
r_5 = AAAATCTCCACTTTAATACT; b_5 = AAATACTCAAATTCTACTAA; y_5 = AATTAATATTTCATTATCCT;
r_6 = ATCCAACCTAAAATCTTCAC; b_6 = ATCACACCCATAACATCATT; y_6 = ATCTCTCCATTTTACAACTC;
r_7 = ATCTACCCACACCAACAAAT; b_7 = ATACCACACACATATCACAT; y_7 = ATAACTCAATCTCTATTTAC;
r_8 = ATATCCCATCCCACCACAAA; b_8 = ATATAACATATATCACCTCC; y_8 = ATTCCTCTATATACTCCACC;
r_9 = ATTCACCTACCCTCCTCTTT; b_9 = ATTAAACTTACACATAAATC; y_9 = ATTTCACTCAAACTCATTCT;
r_10 = CATACAACACCTTTAATACT; b_10 = CTCCCTTTTCTACCAACCAA; y_10 = CTATCTTCTCCAAATTATCA;
r_11 = ATTTATCTCTCTATACTCTA; b_11 = TCCCCTTTCCCACATAAATC; y_11 = TCCCACTTCATTAACCATAT;
r_12 = TCCACCTTAACTATACTCTA; b_12 = TCCTCATTTTCCTCCTCTTT; y_12 = TCCTATTTTCTACCAACCAA;
r_13 = TCAACATCTTACCAACAAAT; b_13 = TCATCTTCCCAAATCTTCAC; y_13 = TCTCCATATTTCATTATCCT;
r_14 = TCTACTTACCTATCACCTCC; b_14 = TCTAACTACAATCCTTCCTT; y_14 = TACCTTCCATTTTACAACTC;
r_15 = TACACACCTAAAATCTTCAC; b_15 = TACATCCCTCTCCTTCTTTA; y_15 = TACTCTCCCTATTCTACTAA;
r_16 = TAACCACATATATCACCTCC; b_16 = TAAATACACACATATCACAT; y_16 = TAATCCCAACTCATTATCCT;
r_17 = TAATTTCAATCTCTATTTAC; b_17 = TATCCTCTCTCTATACTCTA; y_17 = TATATTCTATATACTCCACC;
r_18 = TATTCACTTACACATAAATC; b_18 = TATTTCCTTCACTAATACCA; y_18 = TTCCTCAATTCATATCACAT;
r_19 = TTCAACAACTAATTCCTATA; b_19 = TTCATAAACCCTCTATTTAC; y_19 = TTCTTTAAAACCACCACAAA;
r_20 = TTACTAATCCATACTCCACC; b_20 = TTAATTATAAACTAATACCA; y_20 = TTTCAAACACCTTTAATACT.

1) Xu Jin. The theory and algorithms of encoding for DNA computing. Submitted.

Fig. 3. An undirected graph with 20 vertices.

2.3 The design of coloring library

A coloring scheme of an n-order graph is denoted by the sequence $x_1 x_2 \cdots x_n$, which shows that the color of vertex $i$ is $x_i$, $x_i \in \{r_i, b_i, y_i\}$, $i = 1, 2, \ldots, n$. With this enumeration, a 20-order graph has $3^{20} = 3486784401$ possible colorings. Each DNA sequence with 300 bases represents a 3-coloring. We take all the DNA sequences given by formula (1) as the single-strand coloring library. Each ssDNA in the single-strand coloring library hybridizes with its Watson-Crick (W-C) complement to form dsDNA, and we take these dsDNA as the double-strand coloring library. The sequences fixed on board A are the double-strand coloring library.

The DNA sequences are synthesized after encoding. DNA synthesis forms a DNA sequence by linking nucleotides in order with 3′-5′ phosphodiester bonds. For the synthesis of DNA sequences, please refer to ref. [25]. The synthesis method is as follows.

The phosphoramidite method is commonly used to synthesize DNA sequences. The first base is attached to the solid support by an ester linkage at its 3′-hydroxyl end and is initially inactive, because its 5′-hydroxyl end is protected by a DMT group and all the other active sites are blocked or protected. To add the next base, the DMT group must be removed. One nucleotide is added per cycle, which consists of four steps: step 1 is deprotection, step 2 is base condensation, step 3 is capping and step 4 is oxidation. Steps 1–4 are repeated until all desired bases have been added to the oligonucleotide. The oligonucleotide must then be cleaved from the solid support and deprotected prior to use, which is done by incubating the chain in concentrated ammonia at high temperature for an extended time. Finally, desalting is carried out to purify the product[25,26].

It is almost impossible to synthesize DNA sequences with a length of 300 mer on an ABI 392 DNA/RNA Synthesizer. As an example, the DNA sequences for the graph in Fig. 3 can be synthesized using the method of ref. [4]. First, the storage library is divided into two half libraries, one for vertices $x_1$ through $x_{10}$ and one for vertices $x_{11}$ through $x_{20}$. These two half libraries are then mixed and incubated with the oligonucleotides $r_{10} r_{11}$, $r_{10} b_{11}$, $r_{10} y_{11}$, $b_{10} b_{11}$, $b_{10} r_{11}$, $b_{10} y_{11}$, $y_{10} y_{11}$, $y_{10} r_{11}$, $y_{10} b_{11}$ and T4 DNA ligase. Finally, the primer pair $\langle x_1,\ \text{Acrydite-modified } x_{20} \rangle$, $x \in \{r, b, y\}$, is used for PCR, and the purified DNA with 300 bases is the full library.

It should be noted that when magnetic beads are used to separate the sequences, $\langle x_1,\ \text{biotin-modified } x_{20} \rangle$, $x \in \{r, b, y\}$, serves as the primer pair for PCR.
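The bookkeeping behind the two half libraries can be sketched as follows. This is a counting illustration only, under the assumption that any strand of the first half can be joined to any strand of the second half; it models neither ligation chemistry nor PCR, and the variable names are hypothetical.

```python
from itertools import product

COLORS = "rby"

# The nine junction oligonucleotides x_10 x_11 named in the text,
# written here as label pairs such as ("r10", "b11").
junctions = [(f"{a}10", f"{b}11") for a, b in product(COLORS, repeat=2)]

# Each half library covers 10 vertices with 3 colors per vertex.
half_library_size = len(COLORS) ** 10

# Any left half can be joined to any right half, so the sizes multiply.
full_library_size = half_library_size * half_library_size

print(len(junctions), half_library_size, full_library_size)
```

Each half library is therefore far smaller than the full library, which is what makes the split practical for synthesis.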

2.4 The design of the probe library and the principle of separation

The probe library is composed of different ssDNA that are complementary to the DNA strands representing unsatisfied solutions. The length of a probe depends on the graph. The basic principles for choosing a probe are as follows: first, the probe should be as short as possible; second, the probe should be the W-C complement of one or more sequences in the set $\{r_i, b_i, y_i;\ i = 1, 2, \ldots, 20\}$. Thus, the length of a probe must be a multiple of 15. The probes are synthesized in the same way as the DNA molecules described above. The probe should be labeled, and two different labeling methods can be used:

If we separate the DNA using magnetic beads, the 5′ end of the probe should be labeled with biotin. The principle of this separation technique is depicted in Fig. 4. If we use the Acrydite™ separation technique, the probe should be fixed on a polyacrylamide gel.
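A probe of this kind can be derived from the color codes by taking the W-C complement. The sketch below assumes that strands are written 5′ to 3′ and that the probe is the reverse complement of its target segment; the text does not state the orientation convention, so this is an assumption, and the helper names are hypothetical. The codes $r_1$ and $r_2$ are those of the 5-vertex experiment in Section 3.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def wc_complement(strand):
    """Reverse complement of a strand written 5'->3' (also returned 5'->3')."""
    return strand.translate(COMPLEMENT)[::-1]

def probe_for(*codes):
    """Probe that hybridizes to the given color codes concatenated in order."""
    return wc_complement("".join(codes))

# Color codes r_1 and r_2 from the 5-vertex experiment in Section 3.
r1 = "CATCACATTCTCAAT"
r2 = "CTTCTTAAACACTTC"

single = probe_for(r1)        # 15-base probe against r_1
double = probe_for(r1, r2)    # 30-base probe against the segment r_1 r_2
print(single, len(single))
print(double, len(double))
```

Because each color code has 15 bases, a probe built this way always has a length that is a multiple of 15, as required above.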

3 Experiment

To assess this DNA computer model, we performed an experiment on the graph shown in Fig. 5. G is a 3-colorable graph. If every k-coloring induces the same partition of V(G), we say that the graph is uniquely k-colorable. For example, the three color classes in Fig. 5 are {1}, {2,4}, {3,5}. Supposing that vertex 1 is red, we get two valid solutions (Fig. 5).

3.1 Main ideas

We separate the unsatisfied solutions with the magnetic-bead technique as follows:

① the storage library, probe library and primers are synthesized;

② the ssDNA representing the unsatisfied solutions are hybridized with probes;

③ the unsatisfied solutions are separated using the magnetic beads;

④ PCR is performed to obtain the solutions;

⑤ the desired solutions are read out by electrophoresis.

3.2 Material and methods

(i) Probes and PCR. The storage library, probes and primers were synthesized by TaKaRa; Streptavidin MagneSphere Particles were purchased from Promega, and the One Shot LA PCR Mix kit from TaKaRa. An iCycler PCR apparatus (Bio-Rad), an Anke TGL-16G centrifuge, a Lab Word gel imaging system, an electrophoresis apparatus and an electrophoresis tank (Liu Yi, Beijing) were used.

Fig. 4. Sketch map of the separation using the magnetic beads.

Fig. 5. A five-order graph and two solutions.

(ii) Encoding. The color $x_i$ of each vertex in the graph in Fig. 5 can be $r_i$, $b_i$ or $y_i$, $i = 1, 2, 3, 4, 5$, and each color is represented by an oligonucleotide with 15 bases. $\bar{x}_i$ denotes the complementary strand of $x_i$. Each coloring of the graph is one of the possible solutions, and each of them (i.e. $x_1 x_2 x_3 x_4 x_5$) can be represented by an ssDNA with 75 bases. The codes are as follows:

r_1 = CATCACATTCTCAAT; b_1 = CAAATCCCCACTTAT; y_1 = CACTAATTCCATCCA;
r_2 = CTTCTTAAACACTTC; b_2 = CCCAATACTTTATCC; y_2 = CCCTCAAACTATTAT;
r_3 = CCTAATCTTTCTACT; b_3 = ACACCAATCATAACA; y_3 = ACTACCCTATCTAAC;
r_4 = ACCAAAATACCACTA; b_4 = ATCAATAACCCATCT; y_4 = TTACATCCTTACTCA;
r_5 = AATCCACAACTCACT; b_5 = CACACATATATCATC; y_5 = TAACCACCTATACCT.

According to these DNA sequences, the storage library contains $3^5 = 243$ DNA strands, each with 75 bases. These sequences were synthesized on an ABI 3900 synthesizer.

The probes carry the information of the graph. 24 probes were designed according to Fig. 5. Some of them have 15 bases, such as $r_1$, and the others have 30 bases, such as $r_1 r_2$. All of the probes were labeled with biotin at the 5′ end.

(iii) Hybridization. Step 1. 0.5 μL of each probe ($r_1$, $b_1$, $y_1$; 0.2 pmol/μL) was put into one of 3 tubes, and 5 μL of ssDNA ($8.35 \times 10^{-4}$ pmol/μL) was added into each tube. These tubes were kept in hot water (65℃) for 10 min and then transferred to water at 27℃ for 1 h. Then 5 μL of Streptavidin MagneSphere Particles was added into each tube, following the protocol supplied with the particles. After the reaction the supernatants were discarded.

Step 2. The particles were washed with 5 μL 0.5×SSC and the supernatants were discarded. 50 μL 0.5×SSC was added into each tube, and each tube was kept in water (65℃) for 10 min. After rapid separation, the supernatants were transferred to three new tubes.

Step 3. 0.15 μL of three probes ($r_i$, $b_i$ and $y_i$, $i = 3, 4, 5$; 40 pmol/μL) was added separately into the three new tubes. These tubes were kept in water (65℃) for 5 min and then in 27℃ water for 2 h. After separation with one-fold particles under the same conditions, the supernatants were put into new tubes.

Step 4. Three probe sets ($r_1 r_2$, $b_j b_{j+1}$, $y_j y_{j+1}$; $b_1 b_2$, $r_j r_{j+1}$, $y_j y_{j+1}$; $y_1 y_2$, $r_j r_{j+1}$, $b_j b_{j+1}$; 35 pmol/μL) were added separately into the supernatants that had been treated with probes $r_i$, $b_i$ and $y_i$ in step 3. These tubes were kept in water (65℃) for 5 min and then in 50℃ water overnight. The reactants were separated four times using three-fold particles. After the particles were washed, the supernatants were transferred to new tubes. In this step, the supernatants were mixed for the PCR reaction.
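The four hybridization steps can be emulated on strings to see which strands they are expected to leave. This is a sketch under stated assumptions, not the wet protocol: steps 1 and 2 are read as selecting, in each tube, the strands whose vertex 1 carries one given color; the index $j$ in step 4 is assumed to run over 2, 3, 4; and a probe is modeled as an exact match of its target segment at the corresponding vertex position, ignoring hybridization kinetics and losses. All helper names are hypothetical. The codes are those listed in the Encoding paragraph.

```python
from itertools import product

CODES = {
    "r": ["CATCACATTCTCAAT", "CTTCTTAAACACTTC", "CCTAATCTTTCTACT",
          "ACCAAAATACCACTA", "AATCCACAACTCACT"],
    "b": ["CAAATCCCCACTTAT", "CCCAATACTTTATCC", "ACACCAATCATAACA",
          "ATCAATAACCCATCT", "CACACATATATCATC"],
    "y": ["CACTAATTCCATCCA", "CCCTCAAACTATTAT", "ACTACCCTATCTAAC",
          "TTACATCCTTACTCA", "TAACCACCTATACCT"],
}
L = 15  # bases per color code
N = 5   # vertices

def strand(coloring):
    """Concatenate the codes x_1 x_2 ... x_5 for a coloring such as 'rbyby'."""
    return "".join(CODES[c][i] for i, c in enumerate(coloring))

def segment(colors, start):
    """Target segment for consecutive vertices beginning at vertex `start` (1-based)."""
    return "".join(CODES[c][start - 1 + k] for k, c in enumerate(colors))

def captured(dna, target, start):
    """True if `dna` carries `target` at the position of vertex `start`."""
    lo = (start - 1) * L
    return dna[lo:lo + len(target)] == target

# Storage library: one strand per coloring, mapped back to its color labels.
library = {strand(c): "".join(c) for c in product("rby", repeat=N)}

survivors = []
for c1 in "rby":
    others = [c for c in "rby" if c != c1]
    # Steps 1-2 (assumed reading): keep the strands whose vertex 1 has color c1.
    tube = [d for d in library if captured(d, segment(c1, 1), 1)]
    # Step 3: remove strands where vertex 3, 4 or 5 repeats the color of vertex 1.
    probes = [(segment(c1, i), i) for i in (3, 4, 5)]
    # Step 4: remove equal colors on vertices 1, 2 and on j, j+1 (j = 2, 3, 4 assumed).
    probes.append((segment(c1 * 2, 1), 1))
    probes += [(segment(c * 2, j), j) for c in others for j in (2, 3, 4)]
    tube = [d for d in tube if not any(captured(d, t, s) for t, s in probes)]
    survivors += [library[d] for d in tube]

print(len(library), sorted(survivors))
```

Under these assumptions the emulation leaves six colorings, two per tube, which can be compared with the lanes reported in Section 3.3.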

(iv) Achievement of the solution. The One Shot LA PCR Mix kit (TaKaRa) was used for PCR, with the supernatant as DNA template. The PCR conditions were as follows: the primers were $x_1$, $x_i$, $x \in \{r, b, y\}$, $i = 2, 3, 4, 5$; initial denaturation at 94℃ for 1 min; 30 cycles of 94℃ for 30 s, 34℃ for 30 s and 62℃ for 30 s; final extension at 65℃ for 5 min.

3.3 Experimental result

(i) Detection of storage library. The primer pairs $x_1$, $r_i$ were used for PCR, where $x \in \{r, b, y\}$ and $i = 2, 3, 4, 5$. The result is shown in Fig. 6.

(ii) Sensitivity test. The concentration of the storage library DNA was $8.35 \times 10^{-4}$ pmol/μL, and this solution was serially diluted to give DNA solutions diluted 100-, 200-, 400-, 600-, 800-, 1000-, 5000- and 10000-fold. The PCR conditions for the sensitivity test were the same as above (Fig. 7).

The results indicate that, under the conditions used in this paper, the template in the PCR system should exceed $16.7 \times 10^{-6}$ pmol.

(iii) Hybridization reaction. Based on the sensitivity test, the probes and the ssDNA were mixed and hybridized. After the reaction, the unsatisfied solutions were eliminated under the same conditions. The hybridization products were used for PCR.

(iv) PCR. The One Shot LA PCR Mix kit (TaKaRa) was used for PCR, with the supernatant as DNA template. The PCR conditions were: primers $x_1$, $x_i$, $x \in \{r, b, y\}$, $i = 2, 3, 4, 5$; initial denaturation at 94℃ for 1 min; 30 cycles of 94℃ for 30 s, 34℃ for 30 s and 62℃ for 30 s; final extension at 65℃ for 5 min.

(v) Agarose electrophoresis. According to $r_1$, $b_1$, $y_1$, the electrophoresis results were divided into three parts (Figs. 8–10).

Fig. 6. Analysis of the storage library. PCR products were analyzed on 4% agarose gel. Lanes 1, 2 and 3 correspond to primer pair $x_1$, $r_2$; lanes 4, 5 and 6 to primer pair $x_1$, $r_3$; lanes 7, 8 and 9 to primer pair $x_1$, $r_4$; lanes 10, 11 and 12 to primer pair $x_1$, $r_5$. M1 and M2 are molecular weight markers.

Fig. 7. The result of the sensitivity test. The products were analyzed on 4% agarose gel. All the lanes correspond to primer pair $r_1$, $r_5$. Lane 1 corresponds to the solution at the original concentration; lanes 2–9 correspond to the 100- to 10000-fold dilutions.

When vertex 1 is red, namely $r_1$, the result for the 5-order graph is shown in Fig. 8. By analyzing Figs. 5 and 8, a pair of solutions (truth values) is obtained, i.e. $r_1 b_2 y_3 b_4 y_5$ and $r_1 y_2 b_3 y_4 b_5$.

When vertex 1 is blue, namely $b_1$, the result for the 5-order graph is shown in Fig. 9. By analyzing Figs. 5 and 9, a pair of solutions is obtained, i.e. $b_1 r_2 y_3 r_4 y_5$ and $b_1 y_2 r_3 y_4 r_5$.

When vertex 1 is yellow, namely $y_1$, the result for the 5-order graph is shown in Fig. 10. By analyzing Figs. 5 and 10, a pair of solutions is obtained, i.e. $y_1 r_2 b_3 r_4 b_5$ and $y_1 b_2 r_3 b_4 r_5$.
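The six reported colorings can be checked against the color classes {1}, {2,4}, {3,5} stated for Fig. 5. The sketch below is a consistency check on the reported results only; it does not use the edge set of the graph, which the text does not list, and its function names are hypothetical.

```python
from itertools import permutations

# The six colorings reported above, written as colors of vertices 1..5.
reported = ["rbyby", "rybyb", "bryry", "byryr", "yrbrb", "ybrbr"]

# Color classes stated for the graph in Fig. 5.
classes = [{1}, {2, 4}, {3, 5}]

def color_classes(coloring):
    """Group vertices 1..n by color and return the groups as a set of frozensets."""
    groups = {}
    for vertex, color in enumerate(coloring, start=1):
        groups.setdefault(color, set()).add(vertex)
    return {frozenset(g) for g in groups.values()}

expected_partition = {frozenset(c) for c in classes}

def coloring_from(assignment):
    """Build a 5-letter coloring from one color per class, in the order of `classes`."""
    letters = [""] * 5
    for cls, color in zip(classes, assignment):
        for vertex in cls:
            letters[vertex - 1] = color
    return "".join(letters)

# Every way of assigning the three colors to the three classes.
all_assignments = {coloring_from(p) for p in permutations("rby")}

print(all(color_classes(c) == expected_partition for c in reported))
print(set(reported) == all_assignments)
```

Both checks hold: each reported coloring induces the same partition, and together the six are exactly the $3! = 6$ ways of assigning three colors to the three classes, as expected for a uniquely 3-colorable graph.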

4 Discussion

In this paper a DNA computer model was proposed for solving the vertex 3-coloring problem, and encoding sequences were given for the vertex 3-coloring problem of a graph with 20 vertices. The number of possible solutions of the problem in Fig. 3 is as large as $3^{20} = 3486784401$. To solve this problem, a DNA computer model for the vertex 3-coloring problem was presented, together with a detailed theoretical explanation and its principle. An experiment successfully solved the vertex 3-coloring problem of an undirected graph with 5 vertices using hybridization, affinity capture and PCR amplification. The vast parallelism of DNA computing and its feasibility for solving NP problems were validated once again. At the same time, a sensitivity test was carried out. In this way the experimental conditions were specified, and the reliability of the experimental results was ensured.

In previous research on encoding strategies, the codes could not really be implemented in biological experiments[27, 28]. Considering the importance of the encoding, we took into account the chemical free energy, temperature, enzymes, the Hamming distance and the composition of the DNA molecules. Eight constraints were proposed and the actual algorithm was presented. Based on this and on previous research, suitable sequences for solving the vertex 3-coloring problem of an undirected graph with 20 vertices were obtained.

The theory of DNA computing was reinforced by successfully solving the vertex 3-coloring problem of a graph with 5 vertices by molecular biological methods. Furthermore, the experimental method can be adapted to study the vertex coloring problem in more depth, and even to solve other NP-complete problems. There are, however, some drawbacks to implementing DNA computing in this way, such as the time-consuming protocol and the heavy loss of DNA molecules in solution. To overcome these shortcomings, suitable advanced biotechnologies should be adopted and more universal coding strategies should be studied.

Fig. 8. The result of PCR. The products were analyzed on 4% agarose gel. All the lanes correspond to primer pair $r_1$, $x_i$, where $x \in \{r, b, y\}$ and $i = 2, 3, 4, 5$. The lanes are: i=2, x=r (lane 1); i=2, x=b (lane 2); i=2, x=y (lane 3); i=3, x=r (lane 4); i=3, x=b (lane 5); i=3, x=y (lane 6); i=4, x=r (lane 7); i=4, x=b (lane 8); i=4, x=y (lane 9); i=5, x=r (lane 10); i=5, x=b (lane 11); i=5, x=y (lane 12).

Fig. 9. The result of PCR. The products were analyzed on 4% agarose gel. All the lanes correspond to primer pair $b_1$, $x_i$, $x \in \{r, b, y\}$, $i = 2, 3, 4, 5$; the rest is the same as in Fig. 8.

Fig. 10. The result of PCR. The products were analyzed on 4% agarose gel. All the lanes correspond to primer pair $y_1$, $x_i$, $x \in \{r, b, y\}$, $i = 2, 3, 4, 5$; the rest is the same as in Fig. 8.

Acknowledgements We thank Liu Xiaohui (TaKaRa) and Prof. Xu Chongbo (Department of Biotechnology, Dalian University) for assistance and advice. This work was supported by the National Natural Science Foundation of China (Grant Nos. 60533010, 60574041, 60373089 and 60274026).

References

1 Adleman L. Molecular computation of solutions to combinatorial problems. Science, 1994, 266(11): 1021―1024

2 Lipton R J. DNA solution of hard computational problems. Science, 1995, 268(28): 542―545

3 Roweis S, Winfree E, Burgoyne R, et al. A sticker based architecture for DNA computation. In: Baum E B, et al., eds. DNA Based Computers, Proc. 2nd Annual Meeting. Princeton, 1999. 1―27

4 Braich R S, Chelyapov N, Johnson C, et al. Solution of a 20-variable 3-SAT problem on a DNA computer. Science, 2002, 296(19): 499―502

5 Sakamoto K, Gouzu H, Komiya K, et al. Molecular computation by DNA hairpin formation. Science, 2000, 288: 1223―1226

6 Ouyang Q, Kaplan P D, Liu S M, et al. DNA solution of the maximal clique problem. Science, 1997, 278(17): 446―449

7 Liu Q H, Wang L M, Frutos A G, et al. DNA computing on surfaces. Nature, 2000, 403(13): 175―178

8 Benenson Y, Gil B, Ben-Dor U, et al. An autonomous molecular computer for logical control of gene expression. Nature, 2004, 429(27): 1―6

9 Benenson Y, Paz-Elizur T, Adar R, et al. Programmable and autonomous computing machine made of biomolecules. Nature, 2001, 414(22): 430―434

10 Bondy J A, Murty U S R. Graph Theory with Applications. London, Basingstoke and New York: The Macmillan Press Ltd., 1976

11 Gibbons A. Algorithmic Graph Theory. Cambridge, London, New York: Cambridge University Press, 1985

12 Jensen T R, Toft B. Graph Coloring Problems. New York/Singapore: A Wiley-Interscience Publication, John Wiley & Sons, 1995

13 Wood D C. A technique for coloring a graph applicable to large-scale optimization problems. Comput J, 1969, 12: 317

14 Karger D, Motwani R, Sudan M. Approximate graph coloring by semidefinite programming. Journal of ACM, 1998, 45(2): 246―265

15 Halldorsson M M. A still better performance guarantee for approximate graph coloring. Inf Proc Lett, 1994, 45: 19―23

16 Xu Jin, Bao Zheng. Neural network and graph theory. Sci China Ser F-Inf Sci, 2002, 45(1): 1―24

17 Blas A D, Jagota A, Hughey R. Energy function-based approaches to graph coloring. IEEE Transactions on Neural Networks, 2002, 13(1): 81―90

18 Goldberg D. Genetic Algorithms in Search, Optimization and Machine Learning. Boston: Addison-Wesley Publishing, 1989

19 Xu Jie, Du Wen, Li Zongping, et al. Study on the plan of using shunting locomotives based on Simulated Annealing Algorithm and Graph Coloring. Journal of The China Railway Society, 2003, 25(3): 24―30

20 Liu Yachun, Xu Jin, Pan Linqiang, et al. DNA solution of a graph coloring problem. J Chem Inf Comput Sci, 2002, 42(3): 524―528

21 Liu Wenbin, Xu Jin. A DNA algorithm for the graph coloring problem. J Chem Inf Comput Sci, 2002, 42: 1176―1178

22 Gao Lin, Xu Jin. A DNA algorithm for graph vertex coloring problem. Acta Electronica Sinica, 2003, 31(4): 494―496

23 Hochbaum D S. Approximation Algorithms for NP-hard Problems. Boston: PWS Publishing Company, 1997

24 Zhang Fengyue. The DNA computing model of 0-1 programming problem and timetable problem. Dissertation for the Degree of Doctor of Philosophy, Huazhong University of Science and Technology, June, 2004

25 Xu Jin, Huang Buyi. DNA Computer Principle, Advances and Difficulties (II): Formation of Data Base - DNA Synthesis. Chin J Comput, 2005, 28(10): 1―9

26 Lu Shengdong. Current Protocols in Molecular Biology. Beijing: Chinese Academy of Medical Science & Peking Union Medical College Press, 1999. 9

27 Tanaka F, Kameda A, Yamamoto M. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach. Nucleic Acids Res, 2005, 33(3): 903―911

28 Garzon M, Deaton R J. Codeword design and information encoding in DNA ensemble. Nat Comput, 2004, 3: 253―292