Post on 13-Mar-2023
Programming Hypothesis on Life Phenomena and the Key Processes Simulation
Jun Ma1, a, Shuyan Li2,b and Yide Ma 3,c 1,3 The School of Information Science and Engineering Lanzhou University, Lanzhou, China
2 Lanzhou University Chemistry and Chemical Engineering Lanzhou University,Lanzhou, China
amajun@lzu.edu.cn, blishuyan@lzu.edu.cn, cydma@lzu.edu.cn
Keywords: Programming hypothesis on life phenomena, Program simulation, Gene expression, Cell division and differentiation, Life evolution, P-code, reflection technology
Abstract. The formula that life process follows is a major scientific mystery during centuries. Some
people put programming thoughts into this field like Gates brought the idea that “Human DNA is like
a computer program but far, far more advanced than any software we’ve ever created”[1]. Here we
proposed a more specific hypothesis on this topic as that DNA is a set of p-code[2] and the enzymes
which control chemical reactions and transport processes in cell metabolism are the basic
instructions. Based on this hypothesis, some program models were developed successfully in this
work to simulate the key processes of life phenomena: gene expression, cell division and
differentiation, and life evolution. The results of these simulations show that there is a high level of
similarity between life phenomena and computer programs; the process of cell differentiation and
evolution of life can be explained in a programming way. These models also suggest that reflection
technology[3, 4] is essential to life process. Besides, C-value paradox, N-value paradox[5] and
pseudogene as well as some other biological problems could be also explained by these programming
models. These conclusions imply that life phenomena are consistent with the concept of “process” in
computer fields.
Introduction
The coding principle of life information challenges scientists for centuries and variety studies were
developed to approaching this goal. Recently, it is noticed that there are similarities between DNA
sequences and computer program code and consequently the programming explanation became
popular in this field. D'Onofrio viewed a cell “as a complete computational machine in terms that are
akin to a multi-core computer cluster, where there is a centralized memory and instruction set, yet
computational tasks are distributed among distinct processing elements”[6]. Flynn considered DNA
as an example of shared memory in a distributed heterogeneous multiprocessor system with multiple
input streams and multiple output streams[7]. In this way, each cell has over 2,000 different enzyme
computers that read the shared memory data in DNA, processing that data according to the individual
programs, many operating independently[8]. In the last 10 years, at least 20 different natural
information codes were discovered in life, each operating according to arbitrary conventions. Such as
proteins address codes[9], acetylation codes[10], RNA codes[11], metabolic codes[12], cytoskeleton
codes[13], histone codes[14], and alternative splicing codes[15]. All these codes suggested that the
complexities of biology are growing by orders of magnitude[16], and the signaling information in
cells is organized through networks of information rather than simple discrete pathways which are
similar to computer program as well. Whole-cell simulation[17] and the game of life evolving
program[18] are performed based on the ideas above.
These theories inspired our group to proposed the hypothesis that a DNA chain is an intermediate
code sequence, which stores the operation codes and operands in a p-code way. Life is macroscopic
showing that is presented by the p-code sequence in running status, as a computer process. The basic
instructions of this program are various enzymes, which control chemical reactions and transport
processes in cell metabolism.
Advanced Materials Research Vol. 647 (2013) pp 258-263© (2013) Trans Tech Publications, Switzerlanddoi:10.4028/www.scientific.net/AMR.647.258
All rights reserved. No part of contents of this paper may be reproduced or transmitted in any form or by any means without the written permission of TTP,www.ttp.net. (ID: 117.136.27.202-05/01/13,02:25:01)
Comparison of life phenomena with computer programs
By comparing several programming languages with the process of cell metabolism, it is discovered
that Java processes the most similarities since it provides a p-code(Byecode) technology which is
more similar to DNA coding methodology than other languages. The comparison details of Jave with
cell metabolism were shown in Fig. 1a. Therefore, Java was utilized here in this paper to simulate the
the life phenomena.
If enzymes were considered as the basic instruction set for implementing cell metabolism by
controlling various chemical reactions or transport forms, then the running mode of life program
would be very similar to Java program, while RNA would be similar to the intermediate codes of
JVM (Java virtual machine) instructions, which need to be transformed to proteins. In these proteins,
the enzymes are similar to CPU instructions and they will be used to control chemical reactions of
metabolism, while the non-enzyme proteins are similar to the data in computer system, as shown in
Fig.1a. In addition, it is easier to understand the central dogma from the programming view, as shown
in Fig.1b and Fig.1c.
Simulation of key life processes
Figure 1.Comparison of cell life procedure and Java program execution procedure. a, Illustration of the
similarities of Java bytecode codes in computer program and DNA codes in cell program. b, Illustration of the
Central Dogma of molecular biology. c, Illustration of Central Dogma by programming model. The Center of
program is the bytecode sequence which is composed by the JVM instructions, A indicates that JVM loads
bytecode sequence and makes it into running state. B indicates that JVM instructions are translated into machine
instructions and then they were executed; C indicates that the bytecode instructions are wrote into the stored
bytecode sequence reversely, similar to the process as RNA reverse transcription to DNA; D indicates that the
bytecode sequence replicates itself which is controlled by machine instructions, similar to DNA replication
process which is controlled by DNA polymerases.
Based on the analysis above, the simulation of gene expression was put forward as follows: Firstly, appropriate gene class was selected from the chromosome and in this case javassist.CtClass (an abstract representation of a Java class in non-running state) object was utilized. Secondly, the objects were loaded into JVM through custom classloader, which made it into running state. Here java.lang.Class (an abstract representation of a Java class in running state) object was used. This procedure simulated the transcription from DNA to RNA. In whole process, the code sequence was not changed, but transformed from stored state into running state. Thirdly, the class object produced an instance object. This procedure simulated the translation from RNA to protein. Finally, the corresponding chemical reactions or transport forms were executed by the instance object. The principle of the specific gene function expression of a cell showed in Fig.2.
Advanced Materials Research Vol. 647 259
Figure 2.Abstracted gene expression process. a, a gene was simulated using a Java class, which contains three
subparts: cell function expression, regulation of cell division and other auxiliary code. b, A chromosome is a
combination of a group of genes. c, Shows a gene expression process.
In addition, combining with Reflection technology[19] and Bytecode engineering technology[20],
another two programming models were constructed in this paper for the other two key life processes:
Cell division and differentiation l and life evolution [21].
In a life cycle of cell division, the first step is related to DNA replication and separation. The
second stage related to the products of gene expression which involves transcription and translation of
DNA segments, and tries to decode and express the new gene. From the perspective of programming,
the growth of life is a process that organism loading and decoding DNA codes and also is the process
of cell division and differentiation which controlled by the DNA codes and the internal and external
environment of the cell. Since one type of cell is determined by a group of genes, k types of cells will
be determined by m genes. Let xi represents the code part of cell function expression and yi represents
the code part of regulation of cell division, then the vector X (xk1, xk2, xk3 ...) and Y (yk1, yk2, yk3 ...) will
determine the expression of cell function and regulation function. The principle of cell division and
differentiation showed in Fig.3.
Figure 3.Abstracted process of cell division. a, Division process of a cell. Decoded x will generate productions
for implementing specific function, and decoded y will generate productions for regulating gene expression.
These productions are used to activate the subsequent genes according to the reflection of current environment
condition. Then, the cell will generate a new type of cell. If the activation conditions are not met, the cell will
make regeneration division. b, Schematic diagram of process that how a set of gene groups determine various
type of cells from a global view, in which, F(X) and G(Y) respectively represent the decoding operation for X
and Y.
260 Biomaterial and Bioengineering
When program starts, it generates an embryonic stem cell object first, then continuously decodes
DNA coding sequences and generates various cells that have different functions. These cells will do
their own works, and then die randomly. If all the cells are dead, the organism will be over. The cell
function expression and regulation of cell differentiation are implemented through the reflection to
environment conditions. This model performed properly to simulate the processes of cell division and
differentiation, function expression and death degradation. It indicated that reflection mechanism is
crucial in life process.
Relative to cell division, Life evolution process is more abstract and complex. From a long time
span, life evolution can be seemed as an evolving process of life-program p-code sequence (DNA
molecular chains). An Evolutionary Programming Model[22] is utilized here to simulate the process.
In order to get a concise model, the gene function expressions in cell level were ignored and the genes
here in this process are only used to simulate macro behavior.
Assume a single-sex breeding dog, having a virtual genome as Fig.4a, is abstracted to simulating
this process. As shown in Fig.4b, firstly, several dog objects are generated by the same genome, and
then some of these dogs generate gene mutations randomly or through reflection. The dogs, which are
adapted to the environment, will generate offspring, and the others will die. Hence, the instance object
of the offspring has new abilities as shown in Fig.4c, which is as same as breeding phenomena in
biological world. This showed our program model can simulate the life evolution process in some
degree.
Figure 4.Abstracted process of evolutionary program. a, Genome of abstracted dog in parent generation. b,
Schematic diagram of the evolutionary program. c, Genome of offspring, in which the code sequence of gene0
has been changed, and a new gene (genex) which can properly adapt the environment has been appended.
Results and discussions
The models performed properly to simulate the processes of gene expression, cell division and
differentiation, function expression, death degradation and life evolution. The result showed that these
program models were able to simulate the life phenomena perfectly in a macro-scale way Thus it
concluded that if DNA coding sequences were regarded as a set of programming p-codes, the life
phenomena would be fully consistent with the concept of computer processes. This implies that life
phenomena are macroscopic presentations of static program-coding sequence in running status and the
codes were stored in DNA chain by a special p-code way. Besides, the simulating model for cell
division and differentiation suggested that reflection mechanism is crucial in life process.
Advanced Materials Research Vol. 647 261
When life program is running, a cell can be seemed as a microprocessor. Its signal channel[23] is
similar to the integrated circuit of computer chip, but the instruction set is more complex than
computer instruction set. Different from computer instructions that merely consume energy, the
instructions of a cell do not only consume energy, but also release or store energy. For life program
and computer program, although the complexity of their instruction sets is different, the essence of
them are the same from a procedural point of view: both are ordered instruction sequence, and both
can continuously run when there is enough amount of energy supply, and then perform the outcomes
as macro life phenomena by the organism or processing results by the computer. Based on this
modeling idea, it is discovered that many other pathways of metabolism, such as carbohydrate
metabolism, lipid metabolism, amino acids, nucleic acid metabolism and the citric acid cycle, are all
consistent with the work principle of sub-program.
Through program modeling and simulation, we are more convinced that the life phenomenon is a
process (which is the instructions continuously executing). The code of life program is stored in the
DNA in a p-code way. Chromosome is the compressed structure for the storage of these codes. And
we believe that the difference between chemical structures of deoxyribonucleotide and ribonucleotide
is just the difference of this p-code in stored status and running status.
In this p-code programming theory about DNA sequence, some exploratory explanation can be
provided for certain biological phenomena:
Exons and introns. Exons may be the codes that can be expressed as protein, while introns may be
the code that can indicate the position mark, regulation mark and temporary carrier.
Pseudogene. Maybe pseudogene is the outdated genes, which are not used at current growing
environment of organisms, but still have biological activities and positioning functions. They cannot
be removed because that cell differentiation process needs them to locate the follow-up gene.
Otherwise, it will cause inaccurate positioning, and then influencing the life process.
C-value paradox and N-value paradox. Since it is well known in the computer world that there are
no necessary connections between the complexity and code size of a computer program, the C-value
paradox and N-value paradox problem could be easily solved by considering a DNA coding sequence
as a program code.
Conclusions and methods summary
Inspired by the previous outstanding works, this paper puts forward a hypothesis: a DNA chain is a set
of intermediate programming codes which stores instructions and data, similar to Java bytecode
program in storing state; an mRNA chain is the programming codes in running state, similar to Java
bytecode program in running state. Life phenomena are the macroscopic performances of life program
when it starts running, which bases on cooperation of two related instruction, as RNA instruction set
and protease instruction set. These two sets are similar to JVM bytecode instructions and CPU
machine code instructions. In addition, based on this hypothesis, some computer program models were
developed in this work to simulate some key procedures of life phenomena: gene expression,
cell–division cycle and life evolution. These simulations show us that there is a high level of similarity
between life phenomena and computer process.
First, a gene is abstracted as a java class. Consequently, a chromosome can be composed of a series
of java classes. Then gene expression process and cell division and differentiation process can be
simulated. In this process we created many cell objects. Each object will activate or restrain the
corresponding genes according to various conditions and decode the activated genes. Decoding
process consists of two sub-processes. The first one is to locate and obtain the activated gene class in
chromosome, and uses a custom classloader to load it into JVM. This process is similar to the
transcription from DNA to RNA and RNA modification process. The second one is that JVM
translates the bytecode instructions in gene class into computer instructions, and executes these
instructions under appointed conditions. This process is similar to the translation from RNA to protein
and protein function expression. Cell division and differentiation is a continuous process of obtaining
and decoding genes.
262 Biomaterial and Bioengineering
Second, Evolutionary Programming Model is utilized here to simulate life evolution process. After
loaded and run the program which is designed by using EPM, the program's code sequence may be
changed. If the design is very exquisite, the program can perceive the environment and make
adaptive changes each time it runs, and then the program will have the ability to evolve. In our
program model only macro behaviors of organism were simulated while the gene function
expressions in cell level were ignored here in order to get a concise model.
References
[1] B. Gates, N. Myhrvold, and P. Rinearson, The road ahead: Penguin Books; Revised edition
(1996), 332.
[2] K.V. Nori, U. Ammann, Jensen, and H. Nageli, The Pascal P Compiler Implementation Notes.
Zurich: Eidgen1975).
[3] B.C. Smith, Reflection and semantics in a procedural language. Technical Report: MIT
Laboratory of Computer Science(1982), 272.
[4] P. Maes, ACM SIGPLAN Notices,Vol. 22(1987), 147-155.
[5] B. Lewin, Genes. Genes, New York: John Wiley and Sons, Inc(1983).
[6] D.J. D'Onofrio and G. An, Theor. Biol. Med. Model,Vol. 7(2010), 3.
[7] M.J. Flynn, IEEE trans. Comput.,Vol. C-21(1972), 948-960.
[8] D.E. Johnson, Programming of Life, Alabama: Big Mac Publishers(2010), 136.
[9] P. Barry, Science News,Vol. 173(2008).
[10] C.D. Knights, J. Catania, et al., J. Cell. Biol.,Vol. 173(2006), 533-544.
[11] M. Faria, RNA As Code Makers: A Biosemiotic View Of RNAi And Cell Immunity, in Introduction
to Biosemiotics, Springer-verlag New York Inc.2007), p. 347-364.
[12] M. Barbieri, Genetics as a communication process involving error-correcting codes, in
Biosemiotics: Information, Codes and Signs in Living Systems, Nova Science Publisher,
Inc.2007), p. 103.
[13] M. Gimona, Biosemiotics,Vol. 1(2008), 189-206.
[14] J.K. Sims, S.I. Houston, T. Magazinnik, and J.C. Rice, J. Biol. Chem.,Vol. 281(2006), 12760-6
[15] Y. Barash, J.A. Calarco, et al., Nature,Vol. 465(2010), 53-59.
[16] E.c. Hayden, Nature,Vol. 464(2010), 664-667.
[17] M. Tomita, Trends. Biotechnol.,Vol. 19(2001), 205-210.
[18] Information on http://www.spore.com/.
[19] I.R. Forman and N. Forman, Java reflection in action: Manning Publications Co.(2004), 273.
[20] Information on http://www.jboss.org/javassist/.
[21] J.H. Postlethwait and J.L. Hopson, Modern Biology: Holt McDougal(2006), 1130.
[22] J. Ma, M. Fan, and Y. Ma. Research on Evolutionary Programming Model based on Reflection
and Bytecode Engineering. in The 3rd International Conference on Information Science and
Engineering Yangzhou, China(2011).
[23] E.H. Davidson, J.P. Rast, et al., Science,Vol. 295(2002), 1669-1678.
Advanced Materials Research Vol. 647 263