UNIVERSITY OF CALGARY
Topological Stability and Dynamic Resilience in Complex Networks
by
Satindranath Mishtu Banerjee
A THESIS
SUBMITTED TO THE FACULTY OF GRADUATE STUDIES
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE
DEGREE OF DOCTOR OF PHILOSOPHY
DEPARTMENT OF COMPUTER SCIENCE
CALGARY, ALBERTA
SEPTEMBER, 2012
© Satindranath Mishtu Banerjee 2012
UNIVERSITY OF CALGARY
FACULTY OF GRADUATE STUDIES
The undersigned certify that they have read, and recommend to the Faculty of Graduate
Studies for acceptance, a thesis entitled “Topological Stability and Dynamic Resilience in
Complex Networks” submitted by Satindranath Mishtu Banerjee in partial fulfillment of the
requirements for the degree of DOCTOR OF PHILOSOPHY.
Supervisor, Dr. Ken Barker, Department of Computer Science
Dr. Peter Høyer, Department of Computer Science
Dr. Carey Williamson, Department of Computer Science
Internal, Dr. Sui Huang, Department of Biological Sciences, Institute for Systems Biology
External, Dr. Robert E. Ulanowicz, Arthur R. Marshall Laboratory, University of Florida
Date
Abstract
Stability is a concern in complex networks as disparate as power grids, ecosystems, financial
networks, the Internet, and metabolisms. I introduce two forms of topological stability that
are relevant to network architectures: cut and connection stability. Cut-stability concerns a
network’s ability to resist being broken into pieces. Connection-stability concerns a network’s
ability to resist the spread of viral processes.
These two forms of stability are antagonistic. Therefore, no network can ever be completely
architecturally stable: changes to network topology that increase one form of stability
compromise the other. This may seem disappointing, but there is good news. Dynamic
processes can stabilize a network and compensate for architectural limitations. Let
us call such stabilizing processes ‘resilient mechanisms’. Such resilient mechanisms can be
abstracted from stabilizing processes in biology, or designed de novo.
Resilient processes have evolved to dynamically stabilize biological networks in the face
of architectural limitations. They have been studied by biologists in several areas from
homeostasis to evolutionary robustness. These processes exist today because they have been
effective over evolutionary time scales. This provides an opportunity for computer scientists
to learn from biology about processes that can stabilize the complex networks characteristic
of distributed systems.
I introduce a multi-agent framework, Probabilistic Network Models (PNMs), within
which we can test different candidate resilient processes under varying network architec-
tures. I focus on a PNM for a viral instability where the resilient process is the simple
immune response of sending a warning message. Counter-intuitively, network architectures
that favour the virus also favour the warning message running ahead. Dynamic resilience
thus allows an architectural weakness in connection-stability to be circumvented by processes
as simple as sending a warning message.
Permutations
unfold and arise
from within and fracture what was simply simple
into many.
Repeat is scattered by rhythm and
released in
multitudes that stand in the plain void.
– from ’Flux’, by S.N. Salthe
Acknowledgements
The ideas presented here have percolated for over twenty years. Enduring questions about
stability in biology ultimately led me to computer science, whose formal methods allowed
me to articulate my intuitions and build the conceptual tools I needed.
The ideas that led to this thesis originated in discussions with a diverse collection of sci-
entists seeking to understand the interplay of physical and informational constraints involved
in originating and elaborating biological systems. They include Jack Maze, Daniel Brooks,
John Collier, Robert Ulanowicz, Stanley Salthe and Koichiro Matsuno. Over the nearly 20
years I worked in industry and outside of academia, they always found the time to answer my
questions. While the resulting theory of topological stability most obviously descends from
Robert Ulanowicz’s ecological theory of ascendency, it owes equally to all these individuals
and the inspiration their work provided me.
A few of my mentors at the University of Calgary (U of C) deserve special mention.
Ken Barker, my thesis supervisor, has been a constant source of encouragement, and gently
led me out of many intellectual dead ends as I developed the hypotheses that ground this
thesis. Peter Høyer both understood my mathematical limitations and guided me to rectify
them. In doing so, he introduced me to the lovely rigour of thinking through proofs. I am
indebted to his patient teaching and his high standards; board sessions with Peter have
been the highlight of my academic career here. Jorg Denzinger was a generous source of
ideas, critique, and insight connecting the multi-agent simulation approach to the biological
problems that drove me. There was no good idea he was not willing to discuss and no bad
idea that he was reticent about pointing out. Sui Huang introduced me to systems biology
over numerous discussions and his work and vision integrating empirical and theoretical
aspects of systems biology motivated much of Chapters 5 and 6. Ken Barker, Jalal Kawash,
Lisa Higham, Philipp Woelfel, and John Aycock, collectively as the ‘virus group’, gamely
took my biologically inspired question about the simplest possible immune reaction and
guided it into the arena of networked systems in the form of a probabilistic network model,
the prototypical PNM.
My time at the U of C was smoothed by our excellent administrative staff, particularly
Susan Lucas, Stacey Chow, and Mary Lim.
Four people enriched my daily life on campus immensely, became close collaborators and
dear friends. They are Craig Schock, Jalal Kawash, Leanne Wu, and Rosa Karimi Adl.
My immediate and extended family both supported me, and lost me during the years of
this thesis. I am sorry I can never return that time lost to us. First and foremost, my wife
Julie Rao encouraged and supported me in all ways possible. She was my best critic and
translator from jargon to plain English. My family – Satyen, Maya and Mita Banerjee – and
my dear friend, Audrey Eastham, were constant sources of encouragement. Lois Garton and
Mavis Wahl kept me physically intact.
I thank my supervisory committee (Ken Barker, Peter Høyer, and Carey Williamson)
and externals (Robert Ulanowicz and Sui Huang) for undertaking to evaluate a complex
multidisciplinary thesis.
This thesis is dedicated to the memory of my father, who would have enjoyed reading it,
and encouraged me to go a little further still.
Table of Contents
Abstract . . . . . ii
Acknowledgements . . . . . iv
Table of Contents . . . . . vi
List of Tables . . . . . ix
List of Figures . . . . . x
List of Symbols . . . . . xi
1 Roadmap and Introduction . . . . . 1
1.1 Roadmap . . . . . 1
1.2 Introduction and Motivation . . . . . 3
1.2.1 Guiding Questions . . . . . 3
1.2.2 Preliminary Concepts . . . . . 6
1.3 Literature Review . . . . . 11
1.3.1 Some Basic Terminology . . . . . 11
1.3.2 Current Network Models . . . . . 12
1.3.3 Cut Stability . . . . . 13
1.3.4 Connection Stability . . . . . 14
1.3.5 Development . . . . . 14
1.4 Contributions . . . . . 16
2 A Brief Survey of Stability . . . . . 18
2.1 Abstract . . . . . 18
2.2 Introduction: Stability Through the Looking Glass . . . . . 18
2.3 Philosophy: Stability, Cohesion, Individuality . . . . . 19
2.4 Dynamical Systems: Poincaré Stability . . . . . 21
2.5 Thermodynamics: Instability and Self-Organization . . . . . 23
2.6 Biology: Homeostasis and Developmental Canalization . . . . . 26
2.7 Computer Science: Byzantine Dilemmas . . . . . 31
2.8 Stable Inferences . . . . . 34
2.9 Commonalities . . . . . 38
3 Mathematical Preliminaries . . . . . 41
3.1 Abstract . . . . . 41
3.2 Introduction . . . . . 41
3.3 Networks . . . . . 42
3.4 Probability . . . . . 43
3.5 Information (Classical) . . . . . 44
3.6 Information (Algorithmic) . . . . . 47
3.7 Derivation of Ascendency . . . . . 48
4 Topological Network Stability . . . . . 54
4.1 Abstract . . . . . 54
4.2 Introduction and Motivation I: A Network Architect’s Perspective . . . . . 55
4.3 Cut-stability and Connection-stability Definitions . . . . . 57
4.3.1 Cut-stability . . . . . 58
4.3.2 Connection-stability . . . . . 58
4.3.3 Extension of Cut and Connection Stability to Disconnected Graphs . . . . . 59
4.3.4 Extension of Cut-Stability and Connection-Stability to Directed Graphs . . . . . 60
4.3.5 Antagonism . . . . . 61
4.4 Cut-Stability and Connection-stability are Antagonistic . . . . . 62
4.5 Introduction and Motivation II: An Ecologist’s Perspective . . . . . 64
4.6 Directed Graphs and Mutual Information . . . . . 71
4.7 Mutual Information and Topological Stability . . . . . 79
4.7.1 Roadmap to Our Argument . . . . . 79
4.7.2 Cut-Stability and Connection-Stability in Strongly Connected Graphs . . . . . 81
4.7.3 Monotonicity Conditions . . . . . 84
4.7.4 A Construction for Monotonic Decrease . . . . . 87
4.8 Balanced Stability . . . . . 90
4.8.1 Visualizing Balanced Stability . . . . . 90
4.8.2 Balanced Stability and Information Hiding . . . . . 94
4.9 Connections to Other Perspectives . . . . . 100
4.9.1 Error and Attack Tolerance for Complex Networks . . . . . 101
4.9.2 Keystone Species, Indirect Effects and Cycling in Ecological Networks . . . . . 104
4.9.3 Social Networks . . . . . 107
4.9.4 Graph Spectra . . . . . 109
4.10 The Story So Far, The Road Ahead . . . . . 110
5 Probabilistic Network Models . . . . . 114
5.1 Abstract . . . . . 114
5.2 Introduction and Motivation . . . . . 114
5.3 The PNM Model . . . . . 117
5.4 Computational and Biological Contexts . . . . . 122
6 Modelling with PNMs . . . . . 126
6.1 Abstract . . . . . 126
6.2 Introduction and Motivation . . . . . 127
6.3 Model 1 – Virus and Immune Response . . . . . 132
6.4 Model 2 – Mutualism and Autocatalysis . . . . . 136
6.5 Model 3 – Gene Regulation . . . . . 139
6.6 Model 4 – Differentiation . . . . . 147
6.7 Model 5 – Semiochemicals . . . . . 149
6.8 Model 6 – Ecosystem Flow Networks . . . . . 154
6.9 Future Directions . . . . . 161
7 Dynamic Resilience . . . . . 166
7.1 Abstract . . . . . 166
7.2 Introduction and Motivation: Viruses in Computer Science and Biology . . . . . 168
7.3 Dynamic Resilience . . . . . 169
7.3.1 Dynamical Resilience in terms of Topological Network Stability and PNMs . . . . . 169
7.3.2 Resilience Concepts In Other Areas of Computer Science . . . . . 171
7.4 Resilience Examples . . . . . 173
7.4.1 Resilience Example 1: Agent Hardening . . . . . 173
7.4.2 Resilience Example 2: Viral Propagation . . . . . 175
7.4.3 Resilience Example 3: Virus Immune Response Under Different Network Connectivities . . . . . 177
7.4.4 Insights from the Examples: A Little Resilience Can Go A Long Way . . . . . 180
7.5 Refining Resilient Mechanisms . . . . . 181
7.5.1 Combining Resilient Mechanisms: Agent Resistance and Immune Response . . . . . 181
7.5.2 Further Refinements to the Virus and Immune Response PNM . . . . . 185
7.6 The Epidemiological and Immune Metaphors in Computer Science . . . . . 187
7.7 An Evolutionary Perspective . . . . . 191
8 The Nascent Moment . . . . . 195
8.1 Abstract . . . . . 195
8.2 Recap of Contributions . . . . . 195
8.3 Future Directions . . . . . 197
8.3.1 Theoretical Next Steps . . . . . 197
8.3.2 Methodological Next Steps . . . . . 198
8.3.3 Empirical Applications . . . . . 199
8.4 On the Origin of Interactions . . . . . 200
Bibliography . . . . . 202
List of Tables
4.1 Adjacency Matrix for MacArthur’s Food Web . . . . . 74
4.2 Mutual Information Calculation for MacArthur’s Food Web . . . . . 75
4.3 Adjacency Matrix for MacArthur’s Modified Food Web . . . . . 76
4.4 Mutual Information Calculation for MacArthur’s Modified Food Web . . . . . 76
List of Figures and Illustrations
1.1 Unconstrained Network . . . . . 7
1.2 Constrained Network . . . . . 7
4.1 MacArthur’s Food Web . . . . . 74
4.2 MacArthur’s Modified Food Web . . . . . 77
4.3 Stability Measures in Terms of Cumulative Probability of Summation Terms . . . . . 93
6.1 MacArthur’s Modified Food Web . . . . . 158
7.1 Simulation Plot Matrix. From left to right, viral level increases. From top to bottom, network connectivity increases. Red diamonds: viral vertices. Blue squares: neutral vertices. Green triangles: immune vertices. For each combination of Virus Level and Network Connectivity the average of 30 trials is summarized by iteration . . . . . 179
Nomenclature
Notation Description
X, Y, Z Sets, events, random variables and statistical summaries.
x, y, z Variables, constants and indices.
|X| Cardinality of the set X.
G = (V,E) A graph G with vertex set V and edge set E.
GD A directed graph.
GU An undirected graph.
V (G) Vertex set of G.
E(G) Edge set of G.
uv An edge from vertex u to vertex v.
p(A) Probability of the event A.
p(X, Y ) Joint probability of events X and Y .
p(Y |X) Conditional probability of event Y given event X.
C(X) Information capacity of a random variable X.
C(X, Y ) Joint information capacity of X and Y .
C(Y |X) Conditional information capacity of Y given X.
I(X, Y ) Average mutual information between X and Y .
Ck(S) Algorithmic information complexity of a sequence S.
Ck(X, Y ) Algorithmic joint information complexity of X and Y .
Ck(Y |X) Algorithmic conditional information complexity of Y given X.
Ik(X, Y ) Algorithmic mutual information of X and Y .
Sk(G) Cut-stability of a graph G.
Sc(G) Connection-stability of a graph G.
MVC(G) Minimum vertex cover of a graph G.
MFS(v∗, G) The set of vertices in G flood-able from v∗, including v∗.
IE(G) Average mutual information calculated on the edges of a graph.
ecuv Edge constraint; the value an edge uv contributes to IE(G).
X tj A random variable X of the jth agent at time t.
Γ(j) Neighbourhood of an agent j.
Chapter 1
Roadmap and Introduction
What is stable?
1.1 Roadmap
This chapter is a conceptual introduction to the major themes of this thesis and a roadmap
through subsequent chapters. I introduce two forms of stability that are relevant to network
architectures: cut-stability and connection-stability. Cut-stability concerns a network’s abil-
ity to resist being broken into pieces. Connection-stability concerns a network’s ability to
resist the spread of mal-information. I examine the literature on complex networks in several
fields to see how these two forms of stability have appeared in various forms. I introduce the
notion that these two forms of stability are antagonistic.
Chapter 2 provides a brief history of stability concepts from various fields. Its goal is
to sharpen our intuition about what stability in a system implies. We all have some basic
intuitive notions of stability, but these basic intuitions take specific forms in each field.
There are different implications to stability as it has been represented in the literature in
philosophy, physics, biology and computer science.
Chapter 3 provides mathematical preliminaries for the rest of the thesis. It introduces
some basic concepts in probability theory, information theory, and graph theory, as well as an
ecological application of these concepts known as ascendency [Ulanowicz 97, Ulanowicz 04,
?]1.
The concepts of cut-stability and connection-stability introduced in Chapter 1 are more
1 Put simply, ascendency is a measure of how information is constrained to certain paths in a network, such that ascendency is zero when there is no constraint on information, and maximal when information is constrained to a single path.
formally developed in Chapter 4, as is the contention that cut-stability and connection-
stability are antagonistic. This leads to the insight that no network is architecturally stable.
The concept of balanced stability is introduced and developed to explore the idea that net-
works able to resist diverse attacks require a modicum of both cut-stability and connection-
stability. Following Chapter 4, we switch to a more process-oriented viewpoint.
Chapter 5 introduces a new multi-agent modelling framework, Probabilistic Network Mod-
els (PNMs), that can be used to explore message passing processes on a network.
Chapter 6 is a survey of PNMs abstracted from processes associated with stability in
biological systems. Our goal is to illustrate the utility and breadth of this multi-agent
modelling approach when applied to different levels in the biological hierarchy from cellular
to ecosystem level phenomena. The PNM approach, when applied to biological processes,
stresses the network structure of biological interactions, and the logical form of biological
information transfers. It allows biological processes and their interactions to be captured in
a compact form amenable to computer simulation.
Chapter 7 brings together tools developed in Chapters 4–6 to introduce the idea that
dynamic processes may stabilize a network, and compensate for architectural limitations.
Such processes are referred to as resilient processes. Resilient processes allow a network
to be dynamically stabilized, even when the network would be otherwise architecturally
unstable. We explore this concept of resilient processes via a virus and immune response
PNM.
A theory of topological stability and dynamic resilience in complex networks is built
across Chapters 1–7. Finally, Chapter 8 looks back on the landscape we have just covered.
We restate the contributions made in this thesis. We briefly examine the scope for future
work via extending ideas developed in the thesis, and new questions arising from those ideas.
We end with some speculations on the origin of interactions that lead to complex networks
in nature, and which may provide further insight to the design, growth and development of
complex networks in technology.
The argument developed across Chapters 1–7 relies heavily on concepts and techniques
originally developed outside of computer science. While our focus throughout remains on the
stability of complex networks, we provide motivation, illustrations, and introduce specific
concepts from a range of biological fields. In Chapter 4 concepts are introduced from ecology
to relate our formal development of cut-stability and connection-stability to an ongoing
debate on the stability of ecological networks. In Chapters 5–6 concepts are introduced
from systems biology to illustrate how stabilizing processes from biological systems can
be abstracted into multi-agent models such as PNMs. Chapter 7 introduces ideas from
epidemiology and immunology concerning viruses and immune responses to explore how
resilient processes may dynamically stabilize a network. Chapter 7 also introduces some ideas
on evolutionary arms races which may apply to computer virus and anti-virus development.
Computer science thus provides a unifying methodological approach to stability concepts
that have been developed in different fields of biology. Biology thus provides a source of
new metaphors and concepts that can be introduced into computer science to guide its
development of increasingly complex systems.
1.2 Introduction and Motivation
1.2.1 Guiding Questions
The Internet is the basis of our email, our daily check of various news sites on Google, a
way to keep up with friends and acquaintances on Facebook, our portable online office, and
a host of other things. These functionalities are ultimately determined by the ability of the
Internet to efficiently move information as bits over space, point to point, router to router.
We are becoming so dependent on the presence of the Internet that, like air, it becomes
invisible. We cease to question the conditions for its continued existence. The Internet
is a network. It is currently the world’s most famous network. However, there are many
other kinds of networks in the world: gene interaction networks, social networks, metabolic
networks, electrical grids, and ecological networks to name a few. Since everyone is familiar
with the Internet, it is a good starting point for us to ask some specific questions about
stability in networks. These questions would apply equally to any other network.
Some questions are so simple, they nearly pass us by:
• Is the Internet stable? How do we characterize stability in a networked system such
as the Internet? The Internet is subject to errors in its components, attacks on its physical
structure, and malware propagating viruses. Is it possible to design an Internet architecture
that will be stable in the face of any of these perturbations?
• How is the Internet like an ecosystem? The Internet is a system of information flows.
Ecosystems are systems of material flows. Is it possible that methods to assess material flows
in ecosystems could also be used to assess information flows in the Internet? How would such
a translation of concepts, analysis, and metrics across disciplines as disparate as ecosystems
and the Internet be realized?
• Can the stability of the Internet be solely secured by architecture, or is specific
software necessary? What capabilities should such software possess?
• How easy will it be to determine the stability of the Internet (or any networked
system)?
These are the questions that drive this work. My thesis seeks to develop a conceptual
framework in which answers as well as further questions can be pursued.
Is the Internet stable? My conjecture is ‘No’. My argument begins by asserting that
two forms of stability occur in any networked system such as the Internet: cut-stability
and connection-stability. These two forms of stability can be defined without assuming
any particular model of network connectivity. Finally, I contend that these two forms of
stability are antagonistic. It is impossible to design an Internet architecture that would be
simultaneously fully attack and error tolerant (cut-stable) and also maximally resistant to
viruses (connection-stable)2.
Is it possible that methods to assess material flows in ecosystems could be used to assess
information flows in the Internet? I believe ‘Yes’, and introduce ‘ascendency’ [Ulanowicz 97,
Ulanowicz 04, ?], a well-known methodology for assessing ecosystem stability that can be
applied to any complex message-passing network, including the Internet3. Seeking methods
for Internet studies from ecosystem studies is part of a growing trend to seek commonali-
ties across systems. Deep similarities have been found across diverse systems such as the
Internet, ecosystems, gene-interaction networks, metabolic pathways and social networks
[Dorogovtsev 03, Kleinberg 08, Junker 09, Newman 03, Newman 06, Sole 01]. This raises
the possibility that insights and methods can be translated across disciplines that differ in
their details, but correspond to a common abstraction. Each of the systems I mentioned
above are for analytical purposes abstracted as networks.
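To make the ascendency footnote concrete, here is a minimal sketch (mine, not from the thesis; the flow matrix values are invented for illustration) of computing average mutual information on a small flow network and multiplying it by total throughput to obtain ascendency:

```python
import math

# Illustrative flow matrix: T[i][j] is the flow from compartment i to j.
# The values are invented for demonstration; they are not from the thesis.
T = [
    [0.0, 5.0, 3.0],
    [0.0, 0.0, 4.0],
    [2.0, 0.0, 0.0],
]

total = sum(sum(row) for row in T)                             # total system throughput
row_sums = [sum(row) for row in T]                             # outflow per compartment
col_sums = [sum(T[i][j] for i in range(3)) for j in range(3)]  # inflow per compartment

# Average mutual information of the flow structure, in bits:
#   I = sum_ij (T_ij / T) * log2(T_ij * T / (T_i. * T_.j))
ami = sum(
    (T[i][j] / total) * math.log2(T[i][j] * total / (row_sums[i] * col_sums[j]))
    for i in range(3) for j in range(3) if T[i][j] > 0
)

ascendency = total * ami  # ascendency = throughput x mutual information
print(round(ami, 4), round(ascendency, 4))
```

The more the flows are funnelled into few paths, the larger the mutual information term, matching the constraint interpretation given above.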
Can the stability of the Internet be solely secured by architecture, or is specific software
necessary? Following from my ‘No’ to stability via architecture, the only route left open is
stability via software. Here, my answer is a tentative ‘Yes’. I introduce the notion of ‘resilient
mechanisms’, shared software that may provide additional stability to a system so it may
circumvent the limitations of its architecture. In this context, ‘resilience’ is the additional
stability in a network due to active mechanisms4, rather than passive architecture. While
the stability due to resilience is dynamically maintained, the stability due to architecture is
2 I suspect the stability of a network may be a prerequisite to it being additionally secure and private – but the proof of that intuition is beyond the scope of my thesis.
3 Ascendency is based on two concepts, mutual information and throughput. Mutual information, when measured on a network of material or information flows, reflects the constraints in the network. The more constrained the paths through which matter/information can flow, the greater the mutual information. Throughput is a measure of the total amount of matter or information flowing through the system. Ascendency is mutual information (the system constraints) multiplied by throughput (total flows, an estimate of system size).
4 These active mechanisms may be due to algorithms in technical systems, or due to processes in biological systems.
static (given a particular architecture). Though it is impossible to design an architecture
that can overcome the antagonism between cut-stability and connection-stability, it is in
principle possible, with appropriate software, to circumvent architectural limitations. I will
provide an example inspired by work in immunology [Cohen 00a] and epidemiology [Daley 99]
of a situation with minimal connection-stability whose resilience can be enhanced by the
presence of software to send warning messages. I demonstrate that if the warning message
can get ahead of the viral payload, a network that is architecturally cut-stable will also be
dynamically connection-stable, via the shared software (which could be considered a simple
immune response). The specific capabilities required by such shared software to recognize
a virus and organize a counter-response are left as post-thesis challenges. I limit myself to
demonstrating the required conditions for successful resilience.
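The race between the warning and the virus can be illustrated with a toy sketch. This is not the PNM of later chapters; the graph, the source vertices, and the assumption that both processes flood at one hop per step are my own. A vertex is counted as saved if the warning reaches it strictly before the virus does:

```python
from collections import deque

def flood_times(adj, start):
    """Hop count at which a flooding message from `start` reaches each vertex."""
    dist = {start: 0}
    q = deque([start])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

# Toy network: a 12-vertex ring with two chords (invented for illustration).
n = 12
adj = {i: {(i - 1) % n, (i + 1) % n} for i in range(n)}
for u, v in [(0, 6), (3, 9)]:
    adj[u].add(v)
    adj[v].add(u)

virus_src, warn_src = 0, 1       # the warning is raised beside the infection
t_virus = flood_times(adj, virus_src)
t_warn = flood_times(adj, warn_src)

# A vertex is saved if the warning arrives strictly before the virus.
saved = sorted(v for v in adj
               if v not in (virus_src, warn_src) and t_warn[v] < t_virus[v])
print("saved vertices:", saved)
```

Even at equal spreading speeds, some vertices are saved purely by position; giving the warning a head start or a faster fan-out enlarges that set, which is the condition the thesis examines.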
How easy will it be to determine the stability of the Internet (or any networked system of
moderate size)? The short answer is, ‘Not easy’. As Chapter 4 demonstrates, cut-stability
can be related to a known NP-complete problem, the minimum vertex cover problem5.
However, under special conditions, cut-stability can be estimated by a measure that is much
simpler to calculate: the mutual information. This argument will also be developed in Chapter
4.
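As a rough illustration of why such exact measures are expensive, the minimum vertex cover problem mentioned in the footnote can be solved by brute force only in exponential time. The graph below and the helper name are my own, not taken from Chapter 4:

```python
from itertools import combinations

def min_vertex_cover(vertices, edges):
    """Smallest vertex set touching every edge, found by exhaustive search.

    Trying all subsets costs O(2^|V|) time, which is why exact measures
    built on vertex cover become impractical for large networks.
    """
    for k in range(len(vertices) + 1):
        for subset in combinations(vertices, k):
            chosen = set(subset)
            if all(u in chosen or v in chosen for u, v in edges):
                return chosen
    return set(vertices)

# A small example graph (invented for illustration).
V = [0, 1, 2, 3, 4]
E = [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4)]
print(min_vertex_cover(V, E))  # a minimum cover; here it has size 3
```

Verifying a candidate cover is cheap (one pass over the edges), but finding a minimum one requires searching the subset lattice, exactly the NP gap the footnote describes.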
1.2.2 Preliminary Concepts
This thesis is grounded in two concepts: cut-stability and connection-stability in networks.
From these two concepts, we derive a third concept, balanced stability. Since these are
new concepts, I want to briefly articulate them through simple examples. The task of the
thesis will be to formalize these three stability concepts. An additional task in the thesis
will be to relate these new concepts to existing stability concepts such as, for example,
Poincaré stability in dynamical systems [Gowers 08, Peterson 03] or homeostasis in biology
5 ‘NP’ represents a class of problems in computer science for which it is currently believed there are no efficient (polynomial time) general algorithms that will handle all cases, but for which a candidate solution is efficiently verifiable [Arora 09, Garey 79].
Figure 1.1: Unconstrained Network
Figure 1.2: Constrained Network
([Ricklefs 79a]:pg.146, [Lehninger 65]:pg. 236, [Salisbury 85]:pg. 456).
Let us begin with a simple network image. Think of a fully connected network, where each
vertex is connected to all other vertices. This is the interconnection structure required for
tier-1 Internet Service Providers (ISPs) – each tier-1 ISP must be directly connected to every
other tier-1 ISP. These ISPs are collectively called the ‘Internet backbone’ ([Kurose 08]:31).
This structure reflects an emphasis in the early days of the Internet on robustness and
survivability, including a network being able to function even if a large portion was lost to
failure ([Leiner 03]:note 5). Figure 1.1 is an example of this kind of connection pattern for
four vertices. Consider each of the vertices as able to either generate messages, or forward
messages received from another vertex, and consider the edges to be channels along which
the messages can be passed. Each of the vertices is connected to the other three. The
‘arrowhead’ on each arrow indicates the direction messages can travel. Thus, the twelve
arrows in Figure 1.1 allow messages to travel back and forth between any two vertices. Since
a message on any vertex can go to any other vertex in a single step, we consider this pattern
‘unconstrained’. If we had a measure of constraint on this network, we would expect it to
have a minimal value. Since messages could take many paths (they do not necessarily have
to take the shortest 1-hop route between two vertices), having received a message at some
vertex Y from some vertex X, we might be surprised by the actual path it took. In this
sense, constraint and surprise are inversely related.
Now, what happens if one of the vertices is cut? The other three vertices still remain
connected to each other. Cut a second vertex and the other two vertices remain connected.
Finally, on the cut of a third vertex, the fourth remaining vertex is isolated, and has no
other vertices to send messages to. This particular pattern, where each vertex is connected
to all other vertices with edges in both directions could be said to be cut-stable, in that it
takes as many cuts to vertices as possible to break the network down to isolated vertices
(that have nowhere to communicate). In general, for a network connected in this way, it will
take V − 1 cuts to break the network down until all that is left is a single island. Figure
1.1 represents a network that is maximally cut-stable. Two other associated properties are
the multiple paths available to a message between any two vertices, and the corresponding
surprise associated with finding a message to have taken a particular path. Our tier-1 ISPs
and their interconnections would be an example of such cut-stability.
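The V − 1 claim can be checked directly. The following is a minimal Python sketch (illustrative only, not part of the thesis) that removes vertices from the fully connected four-vertex network of Figure 1.1 one at a time, verifying that the survivors remain mutually reachable until a single island is left:

```python
def reachable(adj, start, removed):
    """Vertices reachable from start along directed edges, ignoring removed vertices."""
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen

V = 4
# Figure 1.1: every vertex has a directed edge to every other vertex.
adj = {u: [v for v in range(V) if v != u] for u in range(V)}

removed, cuts = set(), 0
survivors = set(range(V))
while len(survivors) > 1:
    victim = survivors.pop()  # cut any vertex
    removed.add(victim)
    cuts += 1
    rest = next(iter(survivors))
    # The remaining vertices are still mutually reachable after the cut.
    assert survivors <= reachable(adj, rest, removed)
print(cuts)  # 3, i.e. V - 1 cuts to reduce the network to a single island
```

No matter which vertex is cut at each step, the survivors stay connected, so the fully connected pattern is maximally cut-stable in the sense described above.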
Figure 1.2, by contrast, is a much simpler pattern. Indeed, all the edges in this figure were
already present in Figure 1.1. Now in Figure 1.2, there is a single path along which messages can move. In
this sense, the network is as constrained as it can be. Conversely, if you received a message
at some vertex Y from some vertex X, you would have absolutely no sense of surprise at
the path it took. If you cut a single vertex, there will be at least one vertex that no longer
receives messages. It is very easy to find two vertices that, if cut, will split this network into
two islands that cannot communicate. No matter how many vertices you arrange in this
pattern, only two well-placed cuts are needed to split this network into islands. Subsequent
cuts of vertices in each island will split this network into even more pieces that cannot
communicate. So, from the perspective of cut-stability, this network design does not appear
particularly stable.
Now, let us take another point of view. Imagine instead of cuts to a vertex, there are
certain ‘bad’ messages that can pass through the network causing harm. While badness (or
goodness) is in the eye of the beholder, let us call these bad messages ‘viruses’. We assume
the bad messages have some kind of harmful effect: they slow down transmission, cause faults
to occur, corrupt other messages, et cetera. Our real concern is how such messages may pass
through a system. In Figure 1.1, every vertex is a single hop away from every other vertex;
if a bad message is generated at a particular vertex, in one step it could be transmitted to
all other vertices. Figure 1.1, while being very cut-stable is not very connection-stable, that
is, able to resist an attack that takes advantage of the pattern of connection in the network.
Figure 1.2, by contrast, is about as connection-stable as a network can be. Any message
starting at some vertex will take V − 1 steps to pass through the whole system. At the time
the Internet was originally designed, viruses were not a major problem. Had prevention of
viral spread been the design focus, the tier-1 backbone might have ended up looking much
more like Figure 1.2 than Figure 1.1.
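The contrast in spreading speed can be simulated. This sketch (an illustration, not from the thesis; a directed ring stands in for the single-path pattern of Figure 1.2) counts the synchronous steps a message needs to reach every vertex:

```python
def spread_steps(adj, source):
    """Synchronous steps for a message starting at source to reach all vertices."""
    infected, steps = {source}, 0
    while len(infected) < len(adj):
        infected |= {v for u in infected for v in adj[u]}
        steps += 1
    return steps

V = 4
complete = {u: [v for v in range(V) if v != u] for u in range(V)}  # Figure 1.1
ring = {u: [(u + 1) % V] for u in range(V)}  # a directed ring, akin to Figure 1.2

print(spread_steps(complete, 0))  # 1: every vertex is one hop from the source
print(spread_steps(ring, 0))      # 3: the message needs V - 1 steps
```

The fully connected pattern saturates in a single step, while the constrained pattern delays full spread for V − 1 steps, which is the sense in which it is connection-stable.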
In summary, the pattern in Figure 1.1 is highly cut-stable, but is not connection-stable.
By contrast, the pattern in Figure 1.2 is highly connection-stable but not cut-stable. One
pattern resists subsets of the network becoming isolated via cuts to vertices. The other
pattern resists viral spread of bad information. Two corollaries follow: we can improve
the cut-stability of a network via the selective addition of edges, and we can improve the
connection-stability of a network by selective removal of edges.
From our simple examples, cut-stability and connection-stability appear to be antagonistic
principles. A pattern that maximizes one sacrifices the other. This leads to the notion
of balanced-stability : the ability to resist, to some degree, both kinds of perturbations. Let
us think about balanced stability in a different way, utilizing some of our insights from
cut-stability and connection-stability. The pattern that is maximally cut-stable has V − 1
incoming and outgoing edges for every vertex. The pattern that is maximally connection-
stable has exactly one incoming and one outgoing edge per vertex. If we had a rough idea of
the number of vertices in a large network, and observed that every vertex we encountered had
edges going in and edges coming out close to the number of vertices in the network, we would
go ‘Aha, cut-stable, therefore connection-unstable’, and if we were malevolent, we might
begin to design a connection-attack. Alternately, if every vertex we encountered had close to
one incoming and one outgoing edge, we might go ‘Aha, connection-stable, therefore cut-unstable’,
and again, if we were malevolent, begin designing a cut attack. If we were beneficent, we
would attempt to develop a way to guard the system from structural perturbations of any
kind. We could reason that if a potential adversary, by traversing part of the network, were
to gain sufficient local information (say, the degree of each vertex) to construct either a cut
or connection attack, they would do so. We can then turn that reasoning on its head and say
that if there existed a network from which no structured local information can be gained
by traversing a part of it, then the potential adversary can gain no information that allows
them to decide upon an attack strategy. Such a network would be an example of balanced
stability6. Since we have previously seen that cut-stability is associated with minimal
constraints and connection-stability is associated with maximal constraints, we can infer that
this idealized network with balanced stability will have intermediate levels of constraint.
6 Such a network would not be maximally resistant to a cut attack. Nor would it be maximally resistant to a connection attack. It would, however, have some degree of stability with respect to both kinds of attacks. Let us assume the information obtained at each point in the network the adversary is traversing is the incoming and outgoing edges at each vertex. For the adversary to be unable to decide between a cut or connection attack, the data obtained by the adversary must appear essentially random, with no underlying pattern to leverage. In this case, the distribution of incoming and outgoing edges would have to appear random as the adversary traversed the network.
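The adversary's reasoning can be sketched as a toy heuristic. Everything here is hypothetical: the function name and the numeric thresholds are arbitrary illustrations of the argument, not quantities defined in the thesis:

```python
def guess_attack(n, sampled_degrees):
    """A hypothetical adversary's decision rule (thresholds are arbitrary,
    purely for illustration): degrees near n - 1 suggest a cut attack,
    degrees near 1 suggest a connection attack, anything else gives no signal."""
    mean = sum(sampled_degrees) / len(sampled_degrees)
    if mean > 0.8 * (n - 1):
        return "cut attack: cut-stable, therefore connection-unstable"
    if mean < 2:
        return "connection attack: connection-stable, therefore cut-unstable"
    return "no decision: degrees give no clear signal (balanced stability)"

print(guess_attack(100, [99, 98, 99]))  # near-complete connectivity observed
print(guess_attack(100, [1, 1, 1]))     # chain-like connectivity observed
print(guess_attack(100, [40, 60, 7]))   # no exploitable pattern
```

A network exhibiting balanced stability is precisely one whose local degree observations leave such a heuristic stuck in the third branch.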
1.3 Literature Review
Ecosystems develop over time, and their constituent species are products of an evolution-
ary process. Technological systems such as the Internet begin with design, but then often
develop in directions that could not be anticipated in the original design. Metaphorical
relationships between the development of the Internet and ecological and evolutionary processes
appear fairly frequently in the literature. Huberman [Huberman 01] looks at the technical
and social interactions facilitated by the Internet as an ‘ecology of information’. Two re-
cent texts on Internet studies take an evolutionary perspective on the Internet and network
structural change in general [Dorogovtsev 03, Pastor-Satorras 04]. In this section we will
briefly examine how studies of ecological networks and the Internet mutually illuminate and
cross-validate each other.
I would like to briefly summarize some of the results across both the ecological and
Internet literature to identify common ground with respect to models, cut-stability and
connection-stability, as well as the development and evolution of networks. Later in the
thesis (Chapters 4, 5, 6, and 7), additional literature from other biological disciplines
relevant to complex networks is introduced.
1.3.1 Some Basic Terminology
Complex networks such as the Internet are vulnerable to various kinds of perturbations.
Error tolerance is the ability of a network to resist random faults (such as a router breaking
down). Attack tolerance is the ability of a network to resist directed attacks (on servers,
websites, routers). Finally, epidemic spread concerns the vulnerability of networks to
malware such as viruses and worms that take advantage of the connection structure of the
network. There are numerous papers on each of these three subjects (see below). The
papers are usually in the context of one of three models: random networks, small-world
networks and scale-free networks. For our purposes, error and attack tolerance are examples
of cut-stability; epidemic spread concerns connection-stability.
1.3.2 Current Network Models
The concept of stability we are developing depends on general properties of networks and in
its development we do not need to reference specific models of network connectivity such as
random graphs, small-worlds, or scale-free. However, we need to cover these models briefly
since much of the initial discussion of stability in the literature has been in terms of these
models.
Over the last decade the relevant models in studies of Internet structure have rapidly
iterated from Erdős–Rényi random graphs, to small-worlds, to scale-free [Newman 06]. Current
critiques [Keller 05] as well as empirical evidence that moves beyond power law
characterizations [Dunne 02a, Li 05, Seshadri 08]7 are leading both to refinements of the
scale-free model [Li 05] that popularized a view of Internet structure in the late 1990s, and
to the development
of newer models that have more complex mathematical structure [Leskovec 08]. The various
models and their refinements have reignited interest in graph theory as it pertains to complex
networks [Chung 09, Chung 06].
Each model focuses on a particular process for creating a network. While networks
tend to be interpreted in terms of one model or another, there is no reason why a network
must correspond to a single model, nor why subsets of a network could not correspond to
different models. Uniformity is a simplifying assumption of the modeller rather than a
necessary property of real networks. The three most relevant models – random graphs,
small-world, and scale-free – are summarized in several reviews [Albert 02, Newman 03,
Newman 06].
The random graphs model assumes a process where one randomly and independently
selects undirected edges for a graph. The only parameters in the model are (a) n, the
7 The limitations of power law models in explaining empirical data have been noted in systems as diverse as ecosystems [Dunne 02a] and mobile-phone-call-based social networks [Seshadri 08].
number of vertices and (b) p, the probability of edge selection, denoted as G(n, p). As p
exceeds 1/n, a giant connected component appears that contains a constant fraction of the vertices.
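The threshold behaviour near p = 1/n can be checked empirically. The following is a minimal Python sketch (illustrative only, not from the thesis; the parameter values are arbitrary) that samples G(n, p) graphs on either side of the threshold and measures the largest connected component:

```python
import random

def largest_component_fraction(n, p, seed=0):
    """Sample one G(n, p) random graph and return the fraction of
    vertices lying in its largest connected component."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    seen, best = set(), 0
    for s in range(n):
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for v in adj[u]:
                if v not in comp:
                    comp.add(v)
                    stack.append(v)
        seen |= comp
        best = max(best, len(comp))
    return best / n

n = 1000
sparse = largest_component_fraction(n, 0.2 / n)  # well below 1/n: tiny fragments
dense = largest_component_fraction(n, 3.0 / n)   # well above 1/n: giant component
print(sparse, dense)
```

Below the threshold the largest component is a vanishing sliver of the graph; above it, a single component swallows most of the vertices.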
The small-world model assumes a process where a regular lattice8 is randomly re-wired.
You can essentially consider it the random graphs model applied to a pre-existing regular
lattice. As the graph is rewired, the average shortest-path length between any
two randomly chosen vertices drops. A small-world network is defined in contrast with a
random graph: if a graph has a clustering coefficient higher than that expected for a random
graph and an average shortest-path distance similar to that in the random graphs model, the
graph is considered to be small-world.
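A Watts–Strogatz-style construction makes the path-length effect visible. This sketch (an illustration under simplified assumptions, not code from the thesis; sizes and the rewiring fraction are arbitrary) builds a ring lattice, rewires a fraction of its edges, and compares average shortest-path lengths:

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring where each vertex links to its k nearest neighbours on each side."""
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for d in range(1, k + 1):
            adj[u].add((u + d) % n)
            adj[(u + d) % n].add(u)
    return adj

def rewire(adj, beta, seed=0):
    """Move each edge to a random new endpoint with probability beta."""
    rng = random.Random(seed)
    n = len(adj)
    for (u, v) in {(u, v) for u in adj for v in adj[u] if u < v}:
        if rng.random() < beta:
            w = rng.randrange(n)
            if w != u and w not in adj[u]:
                adj[u].discard(v)
                adj[v].discard(u)
                adj[u].add(w)
                adj[w].add(u)
    return adj

def avg_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        q = deque([s])
        while q:
            u = q.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    q.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

regular_L = avg_path_length(ring_lattice(200, 2))
small_world_L = avg_path_length(rewire(ring_lattice(200, 2), 0.1))
print(regular_L, small_world_L)  # a few random shortcuts shrink the average path
```

Even a 10% rewiring probability introduces enough long-range shortcuts to collapse the average path length while most of the local lattice structure (and hence clustering) survives.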
The scale-free model assumes a growth process called ‘preferential attachment’ where the
likelihood of a new vertex being attached to an existing vertex is proportional to the degree
of the existing vertex. It is characterized by a power-law degree distribution:
p(k) ∼ k^(−b)
where p(k) is the probability of a node of degree k, and there is a finite probability of very
high degree nodes. The exponent b usually varies between 2 and 3 in empirical studies.
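The growth process itself is simple to sketch. The following Python fragment (illustrative only, not from the thesis; n and m are arbitrary choices) grows a graph by preferential attachment and shows the resulting heavy-tailed degree sequence, with a few hubs towering over a typical vertex:

```python
import random

def preferential_attachment(n, m=2, seed=0):
    """Grow a graph by preferential attachment: each new vertex attaches to m
    existing vertices chosen with probability proportional to their degree.
    The degree_pool list holds each vertex once per unit of degree, so a
    uniform draw from it is a degree-proportional draw."""
    rng = random.Random(seed)
    degree_pool = [0, 1]        # start from a single edge 0-1
    degrees = {0: 1, 1: 1}
    for new in range(2, n):
        chosen = set()
        while len(chosen) < min(m, len(degrees)):
            chosen.add(rng.choice(degree_pool))
        degrees[new] = 0
        for old in chosen:
            degrees[old] += 1
            degrees[new] += 1
            degree_pool += [old, new]
    return degrees

degrees = preferential_attachment(5000)
ranked = sorted(degrees.values(), reverse=True)
print(ranked[:5])                # a few very-high-degree hubs...
print(ranked[len(ranked) // 2])  # ...while the median vertex has small degree
```

The rich-get-richer dynamic concentrates edges on early, already well-connected vertices, producing exactly the hub-dominated structure the power law describes.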
1.3.3 Cut Stability
Studies on ecological systems [Sole 01] and the Internet [Albert 00, Calloway 00, Cohen 00b,
Cohen 01, Crucitti 04] indicate networks in both domains are stable against random errors,
but susceptible to directed attacks. The Internet studies were primarily model-driven
(contrasting the Erdős–Rényi random graph model and the scale-free model) while the ecological
results used topologies of measured ecosystems. In both cases, the results can be explained in
a very common-sense way. In ecosystems, keystone species9 take the role of hubs connecting
various parts of an ecosystem. In the Internet, certain sites or certain parts of the Internet
backbone play the same hub-like role. Attacks (or random errors) that miss the hubs have
8 Imagine a network that is either a grid or a ring.
9 A keystone species is one that has a large number of links to other species [Sole 06] so that any perturbation in the keystone species strongly affects the dynamics of the ecosystem as a whole.
limited effect. Perturbations affecting hubs (whether random or targeted) have a large effect.
As topological analysis has indicated, ecosystems and the Internet both fall into the range
of intermediate constraints.
1.3.4 Connection Stability
The empirical evidence from Internet studies of virus persistence [Pastor-Satorras 01] indicates
that surviving viruses have a low level of persistence and affect only a tiny proportion of
all computers. Analyses of scale-free models developed to explain these data concluded that
there is no necessary minimum viral threshold below which an epidemic cannot occur (such a
threshold is the logic behind inoculation programs). The likelihood of an epidemic is largely
dependent on the transmission probability of a virus, under the assumption of a scale-free
network [Newman 02a], and is dependent on the presence of a giant component10 in the
network [Newman 02b]. Finally, there has been an interesting cross-disciplinary trend in
this literature for models first developed for human epidemiology to be transferred to a net-
work context, and the revised network models to be applied back to human epidemiology
[Meyers 05].
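The dependence of epidemic size on transmission probability can be illustrated with a simple susceptible-infected simulation. This sketch (an illustration under simplified assumptions, not from the thesis; it uses a random contact network rather than a scale-free one, and all parameter values are arbitrary) contrasts a low- and a high-transmissibility outbreak:

```python
import random

def gnp(n, p, seed=1):
    """A G(n, p) random contact network (a simplifying assumption here)."""
    rng = random.Random(seed)
    adj = {u: set() for u in range(n)}
    for u in range(n):
        for v in range(u + 1, n):
            if rng.random() < p:
                adj[u].add(v)
                adj[v].add(u)
    return adj

def si_outbreak_size(adj, p_transmit, source, seed=0):
    """Susceptible-Infected process: each infected vertex gets one chance to
    infect each susceptible neighbour with probability p_transmit."""
    rng = random.Random(seed)
    infected = {source}
    frontier = [source]
    while frontier:
        u = frontier.pop()
        for v in adj[u]:
            if v not in infected and rng.random() < p_transmit:
                infected.add(v)
                frontier.append(v)
    return len(infected)

g = gnp(1000, 6.0 / 1000)              # mean degree ~6
hub = max(g, key=lambda u: len(g[u]))  # seed the outbreak at a hub
small = si_outbreak_size(g, 0.05, hub)  # low transmissibility: outbreak fizzles
large = si_outbreak_size(g, 0.60, hub)  # high transmissibility: epidemic
print(small, large)
```

With low transmissibility the infection dies out after touching a handful of vertices; with high transmissibility it sweeps through most of the giant component, echoing the threshold logic discussed above.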
1.3.5 Development
To a biologist, development and evolution are processes of irreversible change; the former
occurring within individual organisms and the latter occurring across ancestor-descendant
lineages. Development has a characteristic pattern of moving from immaturity, to maturity,
to senescence11. This movement from immaturity to senescence is often associated with
increasing levels of constraint which, as a system matures, allow for efficient functioning, but
as a system becomes senescent, lead to brittleness [Salthe 93, Ulanowicz 97]. A technological
system like the Internet may begin with a particular designed structure. An early local
10 See [Chung 06] for a compact description of the origin of the giant component in random graphs.
11 Senescence refers to the biological process of deterioration with age.
area network structure was a token ring, whose graph structure is a cycle. The Arpanet
that preceded the Internet was initially designed to be a somewhat more complex connected
network. Unlike biological systems that tend to be initially loosely constrained, technological
systems can begin anywhere along the constraint spectrum according to their original design.
The question is, once things grow past the original design, what happens next? From its
origins in the Arpanet, the Internet developed a structure that is much larger but also much
sparser, and which is dominated by a small number of well connected hubs, whether they be
very popular sites, or the routers in the core. The Internet’s explosive growth phase began
soon after the emergence of the World Wide Web in the mid 1990s. It was the structure
resulting from that growth that was captured in studies of Internet topology in the late 1990s.
Change has continued apace over the last few years as peer-to-peer traffic flows became the
dominant source of bytes flowing through the Internet [Crovella 06], residential broadband
matured as a point of Internet connectivity [Maier 09] and organizations with sufficient
resources began to route around the Internet core routers, bypassing Tier-1 ISPs [Gill 08].
This latter finding is particularly intriguing, as it indicates that the Internet as a whole is less
the product of design by any individual, organization, or committee, and is instead part of
an organic human response to the capabilities of the Internet to date. Internet development
parallels human cultural development, in the opportunistic use of existing tools, invention of
new tools, and combinatorial exploration of inter-connected tools. We study the Internet the
way we study any developing system – by seeking to map it at a broad scale and understand
its mechanisms at a fine scale. The low level mechanisms, the protocols on which the Internet
runs, are ultimately products of design. However, the uses of those protocols in a distributed
setting are less a case of a priori design and more reflective of a posteriori exploration. Gill
et al. [Gill 08] question whether these new developments should be considered the natural
evolution of the Internet or unsightly architectural barnacles that weaken the structure of the
system as a whole. They note that the final result is a flattening of Internet topology. From
our constraints oriented viewpoint, the development of these additional wide area networks
that can bypass Tier 1 ISPs is an indicator of alternate pathways being developed via a
combination of technical, market, and social/political forces. Do these barnacle-like alternate
pathways level the playing field in terms of flow heterogeneity? Do they provide a form of
additional cut-stability, in that if the core routers were ever to go down, there are alternate
paths through the system? Is it possible that just as Tier 2 is currently routing around
Tier 1, in the future Tier 3 may route around Tier 2, and possibly ‘Tier 4’ (the end users)
may route around Tier 3? Finally, the notion of technical barnacles, and local tinkering
invoked by Gill et al. is quite familiar to those who view evolution itself as a process of
tinkering over and around architectural constraints. This view was famously invoked by
the evolutionary biologists Gould and Lewontin [Gould 79] as a reply to selectionist purists
who saw evolution as a relentless progress towards optimization. Indeed, a new theory of
technology [Arthur 09] takes the explicit view that technologies are evolutionary phenomena,
where new technologies emerge from existing technologies and known natural effects (such as
heat, light, sound and magnetism) and then diversify through combining with other extant
technologies. We see this in the perpetual transformation of the Internet; new capabilities
are realized, as the opportunistic progress of the Internet constantly works around its original
design. Like living systems, the Internet appears to evolve.
1.4 Contributions
In the course of the thesis I make the following contributions towards developing a theory
of topological stability and dynamic resilience in complex networks:
1. Definitions of cut-stability, connection-stability and balanced-stability are
provided. The ways in which these concepts may be related to information
theory are also developed (Chapters 1, 4).
2. The antagonism between cut-stability and connection-stability is demonstrated
(Chapter 4).
3. A formal model for PNMs is developed, and PNMs are designed that reflect a
range of biological processes associated with stability (Chapters 5,6).
4. Resilient processes and resilient mechanisms are defined (Chapter 7).
5. A PNM representing a virus and immune response is explored to identify
conditions under which a resilient mechanism is effective (Chapter 7).
6. Interdisciplinary contributions are made at various points. Topological
stability concepts are applied to error and attack tolerance in technological
networks, to stability in ecosystems, and are connected to some current
concepts in social networks (Chapter 4). Concepts from computational systems
biology inspire the development of the PNM approach, and the design of specific
PNMs (Chapters 5, 6). Concepts from epidemiology, immunology, and evo-
lutionary biology are incorporated into our development of resilient processes
and resilient mechanisms (Chapter 7).
Chapter 2
A Brief Survey of Stability
Seek
the common echo
binding disciplines.
2.1 Abstract
We survey the concept of stability in different fields of study. Looking back, we want to
connect the notions of network stability introduced in the last chapter to stability in terms of other
kinds of systems and contexts. Looking forward, we want to abstract what is common to the
notion of stability across very different domains.
2.2 Introduction: Stability Through the Looking Glass
Chapter 1 introduced the notion that a network may be stable in one of two ways. It may
be cut-stable, able to resist perturbations that destroy vertices. It may be connection-stable,
able to resist perturbations that move virally along edges. A series of thought experiments
were used to explore these two forms of stability, and we argued that they are antagonistic:
optimizing cut-stability requires sacrificing connection-stability, and vice versa. Both forms of
stability appear in the Internet literature, with cut-stability abstracting the notion of attack
or error tolerance [Albert 00, Crucitti 04] and connection-stability abstracting the notion
of resistance to viral epidemics [Pastor-Satorras 01]. Balanced stability would be dual re-
sistance to both attack/error tolerance and viral epidemics. We will revisit these topics in
Chapter 4. But how do these twinned notions of network stability compare to the
concept of stability in other fields of scholarship? Consider the egg-shaped logician Humpty
Dumpty, for whom a word means ‘just what I choose it to mean – neither more nor less’.
We could define cut-stability and connection-stability, mathematize the definition (Chap-
ter 4) and be done with it. I will argue contrariwise that the notions of cut-stability and
connection-stability are consistent with the notion of stability as it appears in a number of
fields of study. The details of each field differ considerably, but stability in each field has
some common characteristics, and a common form.
Our technique is comparative: to examine examples from several fields, strip away from
the examples the details specific to the field, and ask what remains. What remains will pro-
vide us with a general conception of stability that supplies the context for the network centric
concepts of cut-stability and connection-stability developed in the chapters that follow.
As a starting point, here is a standard dictionary definition [Sykes 82]:
‘stable. a. firmly fixed or established, not easily to be moved or changed or unbalanced
or destroyed or altered in value...; firm, resolute, not wavering or fickle.’
2.3 Philosophy: Stability, Cohesion, Individuality
Stability has a long history in philosophy. Aristotle’s ‘De Anima’ (‘On the Soul’) [Bambrough 63,
McKeon 92] considers the stability of such putative properties of organisms as soul and
mind. Its third book develops an argument relating the stability of sense organs to the
intensity of the sensory stimuli that perturb them; given certain stimuli, the perturbation
may destroy the animal itself. We will examine a contemporary example based on
philosopher John Collier’s development of the notion of ‘cohesion’
[Collier 03, Collier 04, Collier 07, Collier 08, Collier 99].
Collier is interested in the philosophical notion of the identity or essence of a thing. He
contrasts previous notions where identity is associated with ‘essential properties’ of things
(which could be considered to exist a priori) with an account of how properties that identify
something, and individuate it from similar things, may emerge. An example from biology
where such ideas play a role is the nature of biological species. Some scholars argue a species
can be defined via positing a class of essential properties for a given species. Other scholars
believe species to be similar to individuals who can change through time, yet maintain a
core identity1. Ideas on the nature of species, while touching on ancient philosophical issues,
have practical implications in terms of how we may make logical inferences in constructing
a taxonomy. The inferences we would make under the assumption that species are classes
are not those we might make under the assumption that species are individuals.
Collier provides an account of how identity may emerge in a system. He begins by
developing a philosophical account of the nature of cohesion. Cohesion, Collier argues, is
a necessary pre-requisite for a system to have a specific identity, and individuality that is
maintained temporally and spatially. Why am I Mishtu now, and 10 minutes from now?
Why am I not simply the collection of my parts: hand, eye, foot? Cohesion is required to
stabilize a system so it may continue to exist. A consequence of the stabilizing behaviour
of such cohesion is the emergence of new properties of the whole, that cannot be assigned
to properties of the parts. Collier’s standard example is a framed cloth kite in the wind.
It reacts as a whole to the lift in the wind to rise. The cloth integrates the actions of the
individual collisions with air molecules and transfers them to the frame to lift the kite. Parts
of its cloth do not react individually to the wind to scatter in different directions. Contrast
the behavior of a kite whose surface is cohesive with that of, say, a soap bubble in a kite-
shaped frame [Collier 99]. How does a soap bubble react to molecular motions of the air?
It dissipates. It does not maintain its initial form; it eventually becomes indistinguishable
from the air. Though the frame and the soap bubble are shaped like a kite, there is no lift,
because there is no surface cohesion. Cohesion is called ‘the dividing glue’ [Collier 04] in that
the cohesion that holds a system together distinguishes that system from other systems, and
1 See for example [Sober 84] Chapters 28-35 for a range of contemporary views on this issue.
from its surrounding environment. When I die, my dust will not be distinguishable from your
dust, or from the dust in the corner of your room. Cohesion first makes a system insensitive
to local variations in its parts (the framed kite as opposed to the framed soap-bubble) and
secondly affords the emergence of properties of the whole that are not properties of the
parts.
Collier’s account of cohesion is part of a general programme (with philosopher Cliff
Hooker) of viewing all systems, whether natural or man-made, as dynamical [Collier 99].
This leads us to the next example – how the concept of stability appears within dynamical
systems theory.
2.4 Dynamical Systems: Poincaré Stability
In 1887, the physicist-mathematician Henri Poincaré introduced a precise mathematical
description of stability in the context of dynamical systems in a prize-winning paper on
what is called the ‘three-body problem’2. In essence you have three bodies in gravitational
rotation around each other. What is their long term behavior? To make the situation more
concrete, imagine the three bodies to be parts of a miniature solar system. You have a star,
you have a planet orbiting the star, and finally you have a moon orbiting the planet. Will
the planet fall into the sun? Will the moon fly away from the planet? Is each complete
orbit of the planet around the sun, or the moon around the planet, going to resemble the
previous complete cycle? Or, will the cycles themselves change? To ask these questions
about a general three-body system is to ask them about our particular planet and the solar
system we are embedded in.
Poincaré’s essential idea ([Gowers 08]:pg. 495) was to introduce a notion of asymptotic
stability. An orbit is asymptotically stable if all sufficiently close orbits approach it as
time tends towards infinity. You could call this asymptotically stable orbit the ‘attractor’
2 See [Peterson 03] for a popular account and [Gowers 08] for a brief but very clear technical account.
([Arnold 92]:pg. 26) to which slightly perturbed nearby orbits tend. We will call this notion
of asymptotic stability in the orbits of a dynamical system ‘Poincaré stability’, to distinguish
it from some of the other stability concepts introduced later in this chapter.
While the term ‘dynamical system’ was not coined until the mid-twentieth century, one of
its core concepts is this notion of ‘Poincaré stability’. The notion was extended in the mid-
twentieth century to incorporate ideas of ‘robust’ or ‘structurally stable’ systems in which
the perspective telescoped outward from consideration of the orbits of individual bodies to
the notion of systems. A dynamical system is structurally stable if all systems close to it
have the same qualitative behavior (i.e., are topologically equivalent).
Finally, Poincaré stability leads to a pair of late twentieth century developments: chaos
and complexity. These twin research fields could be considered examples of descent with
modification for concepts originating in dynamics.
Consider chaos as the inversion of Poincaré stability: slight perturbations in an orbit lead to
exponential divergence, so that two dynamical objects with slightly different initial
conditions have very different final conditions. Poincaré first observed such a phenomenon
in terms of the three-body problem, and it later became the signature of chaotic systems:
sensitive dependence on initial conditions. In this sense chaos and stability are mirror images
of the same concept.
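The mirror-image relationship can be illustrated with one-dimensional maps (a sketch of the idea, not code from the thesis): a contracting map exhibits Poincaré-style asymptotic stability, while the logistic map at r = 4 exhibits sensitive dependence on initial conditions.

```python
def orbit(f, x, n):
    """Return the first n points of the orbit of x under map f."""
    pts = []
    for _ in range(n):
        pts.append(x)
        x = f(x)
    return pts

def stable(x):
    # A contracting map: all orbits approach the fixed point 0 (the attractor).
    return 0.5 * x

def logistic(x):
    # The logistic map at r = 4, a standard chaotic map.
    return 4.0 * x * (1.0 - x)

# Two starting points separated by 1e-4, iterated under each map.
gap_stable = [abs(a - b) for a, b in zip(orbit(stable, 0.2, 50),
                                         orbit(stable, 0.2001, 50))]
gap_chaos = [abs(a - b) for a, b in zip(orbit(logistic, 0.2, 50),
                                        orbit(logistic, 0.2001, 50))]

print(gap_stable[-1])  # vanishingly small: the nearby orbits have merged
print(max(gap_chaos))  # order one: the tiny perturbation has blown up
```

Under the contracting map the gap between nearby orbits shrinks toward zero (Poincaré stability); under the chaotic map the same tiny gap grows until the two orbits are unrelated.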
While chaos can be neatly defined in a single image of nearby trajectories rapidly diverg-
ing, complexity theory resists such characterization. Complexity can be seen as representing
a change in focus on the kinds of systems to be investigated with tools originally developed
in the study of dynamical and chaotic systems. Complexity theory3 appears to have bifur-
cated from chaos theory in the early 1990s, and entered public perception due to a pair of
popular books published in 1992 that centered on the activities of the Santa Fe Institute
[Lewin 92, Waldrop 92]. The computer scientist Melanie Mitchell provides an excellent re-
3 This form of complexity is distinct from the computer science subfield with a similar name, ‘computa-tional complexity’.
22
cent overview on the scope of complexity theory that identifies a triplet of commonalities for
any system called complex ([Mitchell 09]:pp.12-13): complex collective behavior, signaling
and information processing, and adaptation. Additionally, the components of complex systems
are often embedded in a network. Interestingly, these commonalities in complexity theory
(and the network perspective) are also common to much work in the computer science disci-
plines of multiagent systems [Shoham 09] and evolutionary computation [De Jong 06]. The
progression from dynamical systems and chaos to complexity appears to parallel a shift in
emphasis from systems whose behavior is modeled by equations to those whose behavior
is modeled via algorithms. To the extent that dynamics and chaos are components of any
theory of complexity, their stability concepts rest on a commonality: Poincaré stability. A
contrary view of complexity is offered by the physicist-chemist Philip Ball ([Ball 04]:pp. 5,
126), in which its conceptual core is simply the physics of collective behavior, which leads
us naturally to consider stability in thermodynamics.
2.5 Thermodynamics: Instability and Self-Organization
The physicist Ilya Prigogine makes an interesting distinction between dynamics and thermo-
dynamics: the former refers to situations in which the direction of time does not matter,
the latter to situations in which it does. Heat and temporal irreversibility are intricately
linked in the science of thermodynamics. Prigogine notes ([Prigogine 80]:pg. 5):
‘If we heat part of a macroscopic body and then isolate this body thermally, we observe
that the temperature gradually becomes uniform. In such processes, then, time displays
an obvious “one-sidedness” ’.
Where dynamics offers a characterization of stability in terms of the motions of one to
several bodies, thermodynamics explores the properties of collectives that may be stable.
We distinguish between the macrostate of the system and the microstates of
its parts. For example, the macrostate of a system might be represented by some quantity
such as temperature or pressure, while a microstate is a particular distribution of
position and momentum across each particle in the system. Many microstates (distributions
of positions and momenta across all particles) will correspond to a particular macrostate
(an overall temperature and pressure) for the system. Thus, there is a preliminary notion of
hierarchy in thermodynamics, where system-level properties (temperature, pressure) become
stable because the behaviour of the underlying collective of particles tends towards a uniform
distribution of velocity and position. We have moved from systems that are deterministic
in their characterization to ones which are indeterministic in their characterization.
Thermodynamics offers us further insight into mechanisms by which systems may become
stable. One obvious form of stability is the equilibrium concept. If a system of particles is iso-
lated from its environment, different initial distributions of particles tend towards a uniform
final distribution, which represents maximum disorder. Imagine a box with a particulate gas
where the gas particles are spread out evenly throughout the box. In thermodynamics the
measure of molecular disorder is called entropy. Since there are statistically many more ways
to arrange a uniform distribution of particles than a non-uniform distribution, an isolated
system tends towards maximum disorder. Simply put, there are many more configurations
of particles where they are spread out over the box, than those where all particles end up in
the top left hand corner of the box.
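The counting argument can be sketched with binomial coefficients: treat each of N particles as being in the left or right half of the box, so a macrostate "k particles on the left" is realized by C(N, k) microstates (the numbers below are illustrative, not from the text):

```python
# Counting microstates: N particles, each independently in the left or right
# half of a box. A macrostate "k particles on the left" is realized by
# C(N, k) microstates, so the near-uniform macrostate vastly dominates the
# all-in-one-corner macrostate.
from math import comb

N = 20
all_in_corner = comb(N, 0)          # every particle on one side: 1 microstate
evenly_spread = comb(N, N // 2)     # half on each side

print(all_in_corner, evenly_spread)  # 1 vs 184756
```

Even at N = 20 the uniform macrostate is realized in 184,756 ways against a single corner configuration; for a mole of gas the ratio is astronomically larger, which is why the uniform distribution is the stable equilibrium.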
The focus of Prigogine’s research was on nonequilibrium systems – systems which are not
isolated from their environment, and on dissipative structures – patterns that can emerge
and stabilize in such systems. Dynamics via Poincaré stability gave us a useful definition
of stability in terms of one to several dynamical bodies. Within thermodynamics those con-
cepts are extended to collectives of bodies. Furthermore thermodynamics offers particular
mechanisms towards creating stability. The equilibrium concept above is one. More interesting
from our perspective are stability conditions near equilibrium (a small perturbation
away) and farther from equilibrium, since it is in these realms that most real, as opposed to
theoretical, systems exist. Near equilibrium, Prigogine ([Prigogine 80]:pp. 90-91) cites a trio
of relations that must be met for thermodynamic stability: thermal stability, mechanical
stability, and stability with respect to diffusion. Perturbations that violate these conditions
will lead to instability near equilibrium.
Up till now we have discussed the stability of a particular state in a thermodynamic
system with respect to perturbations. Dissipative structures originate via perturbations,
in such a way that fluctuations lead the system to a new stable state that is maintained
away from equilibrium ([Nicolis 89]:pp.65-71). Two such examples of stability away from
equilibrium via the formation of dissipative structures are Bénard cell formation and the
Belousov-Zhabotinsky (BZ) reaction4. Bénard cells form when a viscous fluid is heated
from below between two plates, so that there is a temperature gradient between the bottom
and top plate. The cells appear when fluctuations in the density of the fluid overcome
the fluid's viscosity faster than they can be dissipated, leading to coherent, rotating
convection cells. As long as the heat differential is maintained, the Bénard cells form a
stable pattern. In the BZ reaction, the oxidation of an organic acid in the presence of
appropriate catalysts creates either a stable spatial pattern or waves of chemical activity
such that the solution oscillates through a range of colours [Prigogine 80, Prigogine 84].
In both these examples, which Prigogine calls ‘dissipative structures’, internal fluctua-
tions (essentially due to an imposed gradient) result in the emergence of relatively complex
patterns that are stable as long as the gradient is maintained. The self-organization in these
systems is likened to the kind of self-organization that might have occurred early in biological
systems. Prigogine notes that in chemical systems the initial instability that allows for the
emergence of a dissipative structure may originate in autocatalytic cycles ([Prigogine 84]:pg.
145) 'where the product of a chemical reaction is involved in its own synthesis'.
4 See [Prigogine 84] for non-technical accounts of these two phenomena, and [Prigogine 80] for a more technical account.
From our stability perspective, the key insight is that while stability was initially defined
in terms of resistance to perturbations; perturbations of particular types may themselves
initiate other forms of stability as long as other factors (such as the temperature gradient in
the Benard cell case, and the availability of appropriate reactants and catalysts in the BZ
reaction case) are held constant. This stability manifests itself in what looks like coordinated
activity in a system that would otherwise be considered random. Once formed, these patterns
are stable to small perturbations below a threshold; this form of conditional stability is
called metastability.
Prigogine saw deep biological significance in these examples of stability in dissipative
structures, likening their self-organization to that required in biology. The jump from heated
plates and organic chemistry to biological metabolism, development, and evolution, while
intuitively plausible, is very difficult to demonstrate conclusively. There is, however, the
truism that biology is far from equilibrium, and to the extent that individuals in biology
retain their identity in the face of small perturbations, we are all metastable.
2.6 Biology: Homeostasis and Developmental Canalization
In biology, our problem is not to define a particular stability concept, but rather to deal
with the abundance of stability concepts existing historically in the field, and currently in
the literature. In bridging empirical results and theory in biology there can be tension
between pre-existing biological notions and intuitions of stability, and the attempt to fit
them to modern concepts (and associated mathematical techniques) from dynamics and
thermodynamics. Additionally, hierarchical thinking is prevalent in biology. Biologists think
of an organ in terms of the whole individual, think of a whole individual in terms of a
population, think of populations in terms of a species, and so on. The stability of any
particular ‘focal’ system is seen as dependent from below on the stability of the components of
that system, and from above on the stability of the larger system that the focal system is itself
part of ([Salthe 85]: Chapter 4). To give a concrete example, consider a bog ecosystem. We
might consider the ecosystem’s stability to be dependent from below on flows of energy and
matter through the particular assemblage of species that it is composed of, and particularly
sensitive to the stability of the sphagnum moss that creates the conditions for the bog. We
might consider the bog’s stability to be dependant from above on the larger system it is part
of, or what is happening at its boundaries. For example, the stability of a bog ecosystem is
highly dependent on the forest around it. Cut down the forest, and the bog will disappear.
This tendency towards hierarchical thinking will show up repeatedly in the examples below
concerning stability concepts drawn from physiology and development.
We will begin with a quick look at the notion of homeostasis, which might be consid-
ered a quintessentially biological notion of stability. We then contrast the kind of stability
associated with homeostasis with notions of stability in organismal development.
Let us begin with standard dictionary definition for homeostasis [Sykes 82]:
‘homeostasis tendency towards relatively stable equilibrium between interdependent
elements, esp. as maintained by physiological processes’
In biological texts, such a capsule definition is easily expanded into whole chapters. We
will look at an example from a standard ecology text ([Ricklefs 79a]:pg.146):
‘Homeostasis refers to the ability of an organism to maintain consistent internal
conditions in the face of a different and usually varying external environment. All
organisms exhibit homeostasis to some degree, although the occurrence and effectiveness
of homeostatic mechanisms varies.’
The text goes on to cite a number of specific mechanisms that maintain homeostasis,
including temperature regulation, salt-content regulation, and water balance. A single
form of homeostasis, such as temperature regulation, can be further subdivided into specific
mechanisms applicable to mammals, reptiles, and plants.
Some features should be noted about this biological definition of homeostasis. First of all,
note that equilibrium is used here in a somewhat different context than in thermodynamics.
In thermodynamics it refers to the most likely macrostate of an isolated system given a
range of possible microstates. In biology it refers to a form of balance between inter-related
parts. Furthermore, there is an explicit hierarchical notion of stability of internal conditions,
with respect to external perturbations. This is almost the converse of stability as viewed in
classical thermodynamics, where the fluctuations are internal. Finally, mechanisms towards
homeostasis, rather than being few and general, are myriad and organism specific.
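The active-maintenance character of homeostasis can be sketched as a negative feedback loop. The thermoregulation model below is entirely illustrative (the set point, gain, and leak parameters are invented, not drawn from the text): internal temperature is pulled towards a set point by regulation while continually leaking heat to a colder environment, and the same steady state is restored after an external shock.

```python
# A minimal negative-feedback sketch of homeostasis (illustrative model):
# internal temperature T is pulled towards a set point by regulation, while
# leaking heat towards a colder environment.

def step(T, setpoint=37.0, gain=0.2, env=20.0, leak=0.05):
    """One time step: regulatory correction plus environmental loss."""
    return T + gain * (setpoint - T) - leak * (T - env)

def settle(T, steps=200):
    """Iterate the feedback loop until the state settles."""
    for _ in range(steps):
        T = step(T)
    return T

steady = settle(37.0)            # the actively maintained internal state
perturbed = settle(steady - 7)   # an external shock, then recovery
print(round(steady, 2), round(perturbed, 2))
```

Note that with purely proportional feedback the regulated steady state sits slightly below the set point; the point of the sketch is not the exact value but that the system actively returns to the same balance after a perturbation, which is the defining feature of homeostasis above.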
In general, homeostasis applies to stability at the level of an individual organism or below,
so we may speak of the homeostasis of a tissue, or a cell, or even an organelle. However, above
the level of individuals, one usually speaks of ‘sustainability’ (of a population, a species, an
ecosystem). The levels in the biological hierarchy to which homeostasis applies have greater
cohesion than the levels to which sustainability applies.
While homeostasis in biology has a wider field of application than stability in thermody-
namics, the biochemist Albert Lehninger provides an eloquent characterization of homeosta-
sis as it occurs at the lowest of hierarchical levels in biology (where energetics dominate),
which harkens back to the notions introduced in the previous section on thermodynamics
([Lehninger 65]:pg. 236):
‘The exquisitely developed self-adjusting mechanisms which are intrinsically present in
many enzyme molecules, programmed into them by their amino-acid sequence, make
possible the continuous self-adjustment of the steady state of the cell to accommodate
changes in the environment. In this way they can keep entropy production always at a
minimum. The dynamic turnover of cell components is thus a thermodynamic necessity
to sustain the low entropy state of living organisms in the most efficient manner.’
To some degree, the notion of stability in thermodynamics intersects with the notion of
homeostasis, but each concept also has non-overlapping implications.
Finally, in its original sense, the stability that homeostasis refers to is one that is actively
maintained by the organism. The physician-writer S.B. Nuland looks at the various ways the
human body maintains homeostasis, and quotes the physiologist W.B. Cannon, who coined
the term in the 1920s ([Nuland 97]:pg. 30):
‘As a rule, whenever conditions are such as to affect the organism harmfully, factors
appear within the organism itself that protect it or restore its disturbed balance.’
There is a second notion of stability prevalent within the biological literature in terms
of development, that is very different from homeostasis. In the development of particular
tissues, the end state (say the mature tissue and associated cell types) can be achieved in
spite of perturbations in earlier stages of development. The developmental biologist C.H.
Waddington put these ideas into their modern form via a pair of related concepts, ‘canal-
ization’ and ‘chreods’. Waddington’s final work, ‘Tools for Thought’ [Waddington 77] is a
unique synthesis of ideas from development and ideas from dynamical systems, cybernetics
and information theory that places developmental notions of stability in dynamical form.
In a chapter on ‘Stabilization in Complex Systems’ Waddington begins by distinguishing
the two forms of stability in biological systems ([Waddington 77] pg. 105):
'While the process of keeping something at a stable, or stationary, value is called
homeostasis, ensuring the continuity of a given type of change is called homeorhesis, a
word which means preserving a flow.'
Waddington continues by introducing two closely related concepts, ‘canalization’ and
‘chreods’. Developmental canalization can be considered those constraints that restrict the
pathways of change in a particular cell lineage. For example, in plants there is a tissue called
the vascular cambium that runs throughout stems and branches – and cells on the inside of
this tissue have very different developmental fates than cells on the outside of this tissue.
The particular pathways of change Waddington calls chreods, which means 'necessary path'.
He notes that over the course of development, different types of cells move along different
paths towards a final morphology, which he likened to a landscape of peaks and valleys he
called the 'epigenetic landscape'. Waddington likened the depth of a valley to the stability
of a particular cell fate: deep valleys in this epigenetic landscape would require greater
perturbations to push a cell lineage to an alternate fate than shallow basins.
He rapidly recalibrates his developmental language to equate the basins in the epigenetic
landscape with attractors, and uses the term 'attractor' as it is applied to dynamical systems.
If a particular immature cell (or cell lineage) moving towards its mature state is viewed in
an abstract space consisting of those attributes that may characterize its shape, then in that
space each cell (or lineage) can be seen as circumscribing an orbit or trajectory. In this sense,
within biology we have recovered something very much like Poincaré stability. However, this
is not quite true, in that Poincaré stability concerns two nearby trajectories regressing to a
common attractor. In development it is often possible to have cells with very different initial
morphologies moving towards the same final state.
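Waddington's picture of valleys as attractors can be sketched with gradient flow on an invented double-well potential (the potential, time step, and thresholds below are all assumptions for illustration, not from the text): distinct starting points in the same basin converge to the same "mature" state, while only a perturbation over the ridge switches fates.

```python
# Waddington's valleys as attractors: gradient flow on an invented double-well
# potential V(x) = x**4/4 - x**2/2, with basins (cell fates) at x = -1 and
# x = +1 separated by a ridge at x = 0. Different initial conditions in the
# same basin reach the same attractor; crossing the ridge switches fates.

def fate(x, dt=0.01, steps=5000):
    """Follow the gradient flow dx/dt = -V'(x) = x - x**3 to its attractor."""
    for _ in range(steps):
        x += dt * (x - x ** 3)
    return round(x, 6)

print(fate(0.2), fate(1.5))   # very different starts, same fate: both 1.0
print(fate(-0.2))             # the other side of the ridge: -1.0
```

The depth of a well corresponds to Waddington's notion of stability of a cell fate: the deeper the well, the larger the perturbation needed to push a lineage over the ridge into the alternate basin.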
We find stability concepts in biology that overlap those in both dynamics and thermo-
dynamics, but which also differ from their complementary concepts in those fields in terms
of mechanisms, scope, and implication. The particular stability concepts used then depend
on those appropriate to the level in the biological hierarchy under investigation. Indeed,
particular investigations usually involve two or more hierarchical levels. A developmental
biologist is likely to study cells in terms of their maturation and incorporation into particular
tissues or organs. A geneticist is likely to study genetic differences between individuals, and
their stability with respect to fitness in multiple environments in the context of populations
or species. An ecologist is likely to examine the sustainability of a particular ecosystem in
terms of the particular species it is composed of and their relationships and external sources
of disturbance to the ecosystem (for example, acid rain, deforestation, immigration of new
species from another ecosystem). A developmental biologist is likely to look at stability in a
very different way than a geneticist, who again is likely to look at stability differently than
an ecologist. Because they are all biologists who may be using similar jargon, but with
different implications in their sub-fields, it is very easy in biology to talk past each other.
Finally, in biology the stability of the whole, is often actively maintained by the parts.
2.7 Computer Science: Byzantine Dilemmas
In computer science, the stability of a computer system depends on its ability to recover
from errors. You would not consider a computer stable if it failed every 15 minutes and you
had to constantly restart it. A particularly difficult case is that of a distributed system, where
there are multiple processors with no central control. Imagine each processor as a vertex,
and the communication paths as edges. The processors must work together to complete a
computation. There are numerous schemes to prevent error states, to detect errors, and to
recover from errors in a distributed computer system [Ozsu 99], many of which are based on
heuristics.
A special case of such a problem is known as the 'Byzantine Generals Problem' [Pease 80,
Lamport 82]. It abstracts the notion of a distributed system reaching agreement in the pres-
ence of errors (‘faults’) so the system as a whole can come to a consensus even when a portion
of the system’s processors convey unreliable information, or do not pass on information at
all. Lamport et al. ([Lamport 82]:pg. 382) introduce this highly abstract problem in terms
of a story:
‘Reliable computer systems must handle malfunctioning components that give conflicting
information to different parts of the system. This situation can be expressed abstractly
in terms of a group of generals of the Byzantine army camped with their troops around
an enemy city. Communicating only by messenger, the generals must agree upon a
common battle plan. However, one or more of them may be traitors who will try to
confuse the others. The problem is to find an algorithm to ensure that the loyal generals
will reach agreement.’
The Byzantine Generals problem asks: how do the generals reach consensus when traitors
exist? By analogy, Byzantine fault tolerance in distributed computer systems concerns
how the system achieves consensus when some parts are not following the same protocol,
and may fail in arbitrary ways. These failing parts, which might be sending erroneous or
corrupted messages, are analogous to the traitors in the original Byzantine Generals problem.
Byzantine failures stand in for the large range of ways parts in a distributed system may
arbitrarily fail from hardware errors, network traffic issues, to take-over by malicious code.
The original papers demonstrated that agreement is possible only if fewer than one-third
of the generals (parts) default, so that more than two-thirds remain loyal. A distributed
system can reach consensus only when fewer than a third of the processors are faulty. Within
that bound, the game is to define an algorithm for consensus with several key features
[Abraham 08]. First, the consensus scheme should be 'optimally resilient', that is, it should
allow defaults up to the limit of one-third. Secondly, the algorithm should 'terminate', so
that all non-faulty processors correctly complete the algorithm. Finally, for the algorithm
to be practically implementable it should be 'polynomially efficient'. While there is a large
literature in computer science trying to meet these three goals, either singly or in combination,
our concern is the structure of this problem from a stability perspective.
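The one-third bound can be exercised with a sketch of the recursive Oral Messages algorithm OM(m) of Lamport et al., under one simplifying assumption of ours: a traitor always inverts any binary value it sends (real traitors may behave arbitrarily). With four generals and one traitor the loyal lieutenants still follow the loyal commander; with only three, the lone traitor talks the loyal lieutenant out of it.

```python
# A sketch of the Oral Messages algorithm OM(m) [Lamport 82], with the
# simplifying assumption that a traitor always inverts the value it sends.

def send(sender, value, traitors):
    """Deliver a binary value, inverted if the sender is a traitor."""
    return 1 - value if sender in traitors else value

def majority(values):
    """Deterministic majority vote with ties broken towards 0."""
    return 1 if sum(values) * 2 > len(values) else 0

def om(m, commander, lieutenants, value, traitors):
    """Return the value each lieutenant decides on after m rounds."""
    received = {l: send(commander, value, traitors) for l in lieutenants}
    if m == 0:
        return received
    # Each lieutenant relays its received value to the others in a sub-round.
    relayed = {
        j: om(m - 1, j, [x for x in lieutenants if x != j], received[j], traitors)
        for j in lieutenants
    }
    return {
        l: majority([received[l]] + [relayed[j][l] for j in lieutenants if j != l])
        for l in lieutenants
    }

# n = 4, loyal commander 0, traitorous lieutenant 3: loyal lieutenants obey.
four = om(1, 0, [1, 2, 3], 1, traitors={3})
# n = 3, loyal commander 0, traitorous lieutenant 2: agreement breaks down.
three = om(1, 0, [1, 2], 1, traitors={2})
print(four[1], four[2], three[1])  # 1 1 0
```

In the three-general run the loyal lieutenant receives the commander's 1 but the traitor relays 0, and the tie-breaking majority decides 0, violating the requirement that loyal lieutenants follow a loyal commander; this is the stability boundary the text describes.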
The Byzantine agreement problem is interesting because it potentially applies to any case
where coordination is required amongst a series of autonomous agents, some of which might
not be following a protocol. When described in such a way, it could apply to any problem of
distributed coordination, and such problems occur in many fields outside computer science.
As a biological example of failures in coordination, consider cancer as certain cells defaulting
from the ‘protocol’ for growth and spread for their cell type. To reflect this generality, let us
now refer to ‘agents’ instead of processors. The agents have the capability to follow rules (an
algorithm) and communicate with each other. They also have the potential capability of not
following the algorithm. This could be due to an internal error, or it could be a choice. The
nature of the problem does not distinguish between reasons why an agent in a distributed
system may default.
The stability of concern is the consensus decision, where all non-faulty agents (those
who follow the protocol) must arrive at exactly the same decision. Consider
each defaulting agent as perturbing the system when it does not follow the protocol. The
system can tolerate perturbations due to default of up to one-third of the agents, and still
reach consensus. After that point, consensus is not possible: the system cannot reach
a stable result. Similar to the biological notion of homeostasis, the system actively seeks
its stability. Unlike homeostasis, perturbations are from sub-components of the system,
rather than external to the system. However, the Byzantine generals problem is neutral on
the causes by which faulty agents may default, and those influences could be internal or
external to the system.
Recently there have been attempts to introduce ideas from the Byzantine generals for-
mulation in computer science to another area that has a distinct notion of stability, namely
game theory [Abraham 06, Halpern 08]. In game theory, every agent has a strategy and
is assigned a utility value, which is its payoff for following the strategy. A Nash equilibrium
is a set of strategies such that no player has an incentive to unilaterally change their strategy
([Shoham 09]:pg. 62):
‘Intuitively, a Nash equilibrium is a stable strategy profile: no agent would want to change
his strategy if he knew what strategies the other agents were following.’
If any player were to unilaterally change their strategy, their utility would not increase. The key concepts
being brought in from the Byzantine generals problem to game theory are first to move from
equilibrium with respect to the default of a single player, to equilibrium with respect to the
default of a coalition of players, and secondly to move from the assumption of rational agents
(who will only act to increase their utility) which is standard in game theory to consider
irrational agents that may be willing to sacrifice their utility or whose utilities are arbitrary.
In computer science, particularly in distributed systems, we have the situation where
stability is defined with respect to a particular protocol (say, for reaching consensus), and
perturbations in terms of the number of local defaults from that protocol that still allow
the protocol to complete correctly at the system level, for all non-defaulting agents.
2.8 Stable Inferences
Thus far we have looked at the notion of stability in various fields, searching for common-
alities. We will conclude with a brief look at the way we make data based inferences. The
ideas here apply in a narrow sense to statistical inference, and in a broader frame to scientific
inference.
With respect to stability and statistical inferences, we have three issues that concern us:
error, sensitivity, and independence.
A short story from the history of science illustrates these three issues. Only in the
mid-1800s was it discovered that the hygiene of physicians is directly related to the health of
their patients ([Hempel 66]:pp.3–8). The physician Ignaz Semmelweis was distressed at the
number of his patients who were dying in childbirth. He noticed that the number of women
dying differed between two maternity wards in the same hospital, and wished to determine
the cause. One ward had three times the death rate of the other. He quickly worked through
several hypotheses. He had multiple lines of evidence at hand. First, women were dying at
a higher rate in the hospital than those who were overcome by labour in transit and gave
birth on the street. Secondly, he wondered if rough medical exams by medical students
could be the cause. Semmelweis rejected this hypothesis, noting that the injuries due to
birth are greater than those that might be caused by partially trained medical students.
He also noted that in the ward with fewer deaths, the midwives were using much the same
examination procedures as the medical students were using in the ward with the higher
death rate. Semmelweis wondered if a priest ringing a bell for the last sacraments could
have terrorized the women to death. He convinced the priest to change his route, but the
deaths did not decrease. Starting
to get desperate to solve this mystery he noted that in one ward women were delivered on
their backs and in another ward women were delivered on their sides. He examined switching
delivery positions, with no effect. Finally he had a critical insight. A colleague of his received
a puncture wound while performing an autopsy, and soon died of an illness that appeared
identical to that of his female childbirth patients. Semmelweis wondered if the contact of
‘cadaverous’ matter and an open wound might be implicated. It then occurred to him (in
a moment of horror we might imagine) that he and his medical students often attended to
one of the maternity wards immediately after conducting dissections in the autopsy room,
and did not thoroughly wash their hands. He immediately ordered all medical students to
wash their hands in a chlorinated lime solution before examining women. Very soon after,
the death rate in the ward with the higher mortality fell to match the other ward.
At that point, several pieces of evidence fell into place. The ward with the lower death
rate was attended by midwives, who did not perform autopsies. Secondly, women who gave birth
on the street were usually not examined on arrival, and hence avoided infection.
Semmelweis was constantly making inferences from data. His data and experimental
methods might not satisfy a modern day researcher, but there is no faulting his process of
inference. With respect to stability, there are three aspects of his (or anyone’s) inference
process that concern us: error, independence, and sensitivity. First Semmelweis developed
hypotheses, and based on these, developed predictions. He noted the error between his
predictions, and the actual results obtained. Based on this error, he refined hypotheses
to make new predictions. For example, he later found that not only cadaverous material,
but also putrid material from living patients could cause the fever. Secondly, he altered
conditions. If he found that the results were independent of all possible conditions for a
single factor he could alter (such as position of delivery), he ruled out that factor. Finally,
he looked at the sensitivity of his results to additional data. In addition to the women who
died of fever at childbirth, he considered their children: only the children of mothers who
contracted the disease during labour also fell sick.
The philosopher Deborah Mayo contends that error is the basis of experimental knowledge
[Mayo 96]. Error in this sense is the difference between a prediction from a hypothesis
and the actual results obtained. Often our predictions are in terms of particular statistics,
such as the mean or variance. In that case, we also need to take into account the range
of variability in the statistics – the confidence limits. In the case where we have multiple
alternate hypotheses, we have strong grounds to choose in favour of a particular hypothesis
if the error for it is much less than the others and falls within the confidence limits for the
data obtained. These concepts lead to the idea of ‘severe tests’, where an experiment is
set up such that a hypothesis either clearly passes or fails. In essence there is a very high
probability the test procedure would not yield a passing result if the hypothesis were false.
The stability of our inferences, then, depends on the ability to provide severe tests amongst
alternate hypotheses.
We have noted repeatedly that stability is often defined in reference to perturbations. To
the degree that a system property (say thermal equilibrium) is constant, given some pertur-
bation, it is considered stable. A pithy informal definition of the notion of independence is
given by philosopher Ian Hacking ([Hacking 01]:pg. 40): ‘Two events are independent when
the occurrence of one does not influence the probability of the occurrence of the other’.
Twisting that definition slightly, one could say that an event is stable with respect to another
event to the degree that it is independent of it.
Finally, we make inferences based on statistics we calculate from data. The degree to
which our inferences are stable derives in part from the stability of the statistics we are
basing our inferences on, given perturbations in data. Put another way, we are interested
in the sensitivity of particular statistics given such factors as outliers in the data, or even
assumptions required about the data. This leads to the search for robust statistics, which
make minimal distributional assumptions, and are insensitive to outliers. Examples of such
statistics can be found in the literature [Hoaglin 83, Hoaglin 85]. One particularly simple
and elegant example of the relationship between stability and perturbation is encompassed
in a statistical procedure known as 'the jackknife' ([Mooney 93]:pp. 22-27). As its name
suggests, the jackknife is an all-purpose tool. A data set is systematically perturbed by
dropping one sample out. The required statistic is calculated for this perturbed data set.
The calculation is repeated for all perturbed data sets, each of which has a different sample
left out. The variation in the statistic calculated this way represents a confidence interval
around the statistic. The stability of the statistic is inversely proportional to its variation in
the jackknife. If the statistic's value is very similar for each jackknifed data set, the statistic
is stable given the data. If the statistic's value is extremely variable across jackknifed data
sets, the statistic is not stable given the data.
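The leave-one-out procedure can be sketched in a few lines (the data set below is invented for illustration). For the mean, the jackknife standard error coincides with the classical standard error s/√n, which makes the sketch easy to check:

```python
# A sketch of the jackknife: perturb the data by leaving one sample out,
# recompute the statistic, and use the spread of the recomputed values to
# judge the statistic's stability. The data set is illustrative.
from math import sqrt
from statistics import mean, stdev

def jackknife_se(data, statistic):
    """Jackknife standard error of a statistic over a data set."""
    n = len(data)
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    centre = sum(leave_one_out) / n
    return sqrt((n - 1) / n * sum((v - centre) ** 2 for v in leave_one_out))

data = [2, 4, 4, 4, 5, 5, 7, 9]
se = jackknife_se(data, mean)
# For the mean, the jackknife recovers the classical standard error s/sqrt(n).
print(round(se, 4), round(stdev(data) / sqrt(len(data)), 4))
```

The same `jackknife_se` call works unchanged for statistics with no simple closed-form error, such as the median or a correlation, which is where the procedure earns its keep: a large jackknife spread flags a statistic that is not stable given the data.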
Ultimately, inferences are chains of 'if-then' reasoning from some premise to a conclusion.
If, from a single premise, there are multiple chains of plausible reasoning to the conclusion,
one could say that the conclusion is 'stabilized' via the alternative pathways available to
reach it. Halpern provides a visual example of chains of reasoning represented as a network
([Halpern 03]:pp. 132-133). In this case, the premise is that ‘one parent smokes’ and the
conclusion is ‘has cancer’. One line of reasoning runs from premise to conclusion via the
inference chain: parent smokes, therefore exposed to second hand smoke, therefore individ-
ual has cancer. A second line of reasoning runs via the inference chain: parent smokes,
therefore individual smokes, therefore individual has cancer. In this case, a cut in one chain
of inference does not rule out the inference from premise to conclusion via another chain. In
that sense, conclusions supported by multiple chains of reasoning from premise to conclusion
could be said to be cut-stable.
2.9 Commonalities
Each of the fields briefly examined above has a perspective unique to the history and specific
problems encountered in that area. Each of these perspectives refracts, providing partial
illumination. From the various angles, what is common to the notion of stability?
In each case, stability seems to be a concept composed of several parts. Every case begins
by demarcating a system: a dynamical system, an ecosystem, a bridge, ..., a network. In
each system there is some property, let us call it ‘S’. This property is related to another
property of the system we will call ‘P ’. Property P is variable. It can be perturbed in some
way. Let us denote a perturbation of P as ΔP. A system is stable to the degree that, as P
is perturbed, S is invariant. At one extreme, a small change in P can lead to a large change
in S. At the other extreme, no change in P affects S, and so S is stable with respect to P.
These ideas could be stated symbolically as follows5:
A system is stable if the probability of the system having stability property, S, is invariant
given perturbations on some other systemic property P. Let p(S) be the probability of
observing the property S in the system. Let p(S|ΔP) be the probability of observing the
property S in the system given perturbation ΔP. So, a system is stable with respect
to P when:
p(S|ΔP) = p(S).
The property S is independent of perturbation on property P, and therefore stable to
any perturbations ΔP to property P.
While the stability concept is usually focussed on relationships between properties in a
system, it could be extended to describe properties between systems. In this case, let us
5 This symbolic summary represents stability as a kind of conditional probability, the notation for which is covered in Chapter 3.
call the problem ‘measurement’. Now let us consider one system to measure another to the
extent that a change in the first system for some property P causes a change in the second
for some property S. For example, while reading this essay, changes in the text should (I
hope!) be causing changes in the state of your mind. If you could not read, simply scanning
the text is unlikely to be causing changes in your mind – the text would appear meaningless.
At the other extreme, we cannot measure neutrinos, because they pass through us; but we can
measure X-rays because our bodies can stop them. At low dosage levels, X-rays perturbing
our bodies allow another device, the X-ray machine, to take measurements. At higher dosage
levels, our measure of X-ray dosage is that our cells respond by becoming cancerous.
As an extreme example where such stability does not hold, consider the famous EPR
paradox in quantum mechanics [Bell 87] where a measurement (a perturbation) on one part
of a quantum system has a non-local effect on another part of the quantum system. Thus,
the state of the latter part is not independent of the state of the former measured part even
when they are far apart.
In the context of a network, we could say three forms of stability apply. The dynamical
systems notion of Poincaré stability applies in terms of values that may be assigned to
vertices. When stability is discussed in model network systems such as Boolean automata
[Kauffman 93], it is this form of stability that is usually in mind. It presumes a network of
a particular structure, and the stability referred to is with respect to the values the vertices
take. If the values are constant, the system is stable. If the values never settle down, the
system is chaotic. When we are concerned with alternate network architectures or network
architectures that are dynamically changing structure, cut-stability and connection-stability
apply. Cut-stability and connection-stability require knowledge only of the structure of a
network. Poincaré stability requires knowledge of both the structure of a network, and
of the specific functions that are assigned to vertices. Additionally, Poincaré stability is
usually evaluated in the context of a fixed architecture. For these reasons, we designate
cut-stability and connection-stability as forms of topological stability, to distinguish them
from the dynamical systems notion of Poincaré stability6.
We return to the topic of topological stability in networks, and a more formal character-
ization of cut-stability and connection-stability in Chapter 4. Chapter 3 provides us with
some mathematical preliminaries.
6 The ultimate elucidation of the relationship between topological and Poincaré stability is beyond the scope of this thesis, though my intuition is that the concepts are orthogonal. Two networks of identical structure (and therefore identical topological stability) could have very different levels of Poincaré stability. Similarly, two networks of identical Poincaré stability could have very different levels of topological stability. For now, we note that topological stability has more modest input information requirements than Poincaré stability, requiring only the network structure.
Chapter 3
Mathematical Preliminaries
The apparition of a network.
Commingled events
scattered and gathered
like petals.
3.1 Abstract
We introduce basic mathematical concepts and definitions that we will build upon as we
develop our argument for topological stability and dynamic resilience in complex networks.
We begin with a trio of mathematical concepts: networks, probability, and information. These
three concepts come together in an ecological application, ascendency, which we briefly derive.
3.2 Introduction
To develop the notion of network stability in the next chapter we must bring together ideas
from several distinct mathematical fields and combine them as we formalize our concep-
tualization of topological stability. Networks form the basic architecture with which we
are concerned. Probability theory provides an avenue into the concept of information from
classical information theory [Shannon 63]. We take a second route into information via algo-
rithmic information theory. Finally, we briefly derive ascendency, an application of classical
information theory applied to ecological networks.
Sources consulted in these areas are: networks [Bondy 08, Chartrand 77, Chung 06,
Easley 10, Newman 03, Newman 10, Newman 06, Trudeau 76], probability [Feller 66, Grinstead 97,
Hacking 01, King 09, von Mises 57], classical information theory [Luenberger 06, MacKay 03,
Renyi 87], algorithmic information theory [Chaitin 99, Li 97, Li 04] and ascendency [Ulanowicz 97,
Ulanowicz 99a, Ulanowicz 04].
In subsequent chapters we build on the mathematical preliminaries introduced in this
chapter. We introduce new notation, concepts, and definitions as required to develop our
arguments for topological stability and dynamic resilience in complex networks.
3.3 Networks
A network is a directed graph with real-valued edges.
A network, N , is an ordered pair (V,E) where V is a set of vertices1, and E is a set of
directed edges (ordered pairs of vertices, where the first member of the pair can be seen as
the tail of the edge and the second member can be seen as the head of the edge). The edge
vi → vj is distinct from the edge vj → vi. |V | is the cardinality of the vertex set.
|E| is the cardinality of the directed edge set.
Each edge is associated with a real number value: f(E) ∈ ℝ.
An empty network is a network with no edges: N = (V, ∅).
A null network is a network with neither nodes nor edges: N = (∅, ∅)
The out-degree of a vertex vi is the number of edges with vi as the tail. The in-degree
of a vertex vi is the number of edges with vi as the head.
The neighbourhood of a vertex is the induced subgraph for the vertex consisting of all
other vertices adjacent to it. In terms of a directed graph the neighbours can be assigned to
two sets: in-neighbours and out-neighbours.
A subnetwork, A, is a subset of the nodes and edges of a particular network, N: A ⊆ N .
Paths in a network are sequences of distinct vertices, such that each vertex is the tail of
the edge to its immediate successor vertex.
A component for a network is a set of vertices such that in the corresponding undirected
1Vertices are also commonly referred to as nodes.
graph, there is (a) a path between any two vertices and (b) it is the largest such set (i.e. there
are no more vertices or edges to add from the network). You could consider components
‘distinct pieces’ of a graph [Trudeau 76].
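These definitions are small enough to exercise directly. The sketch below, in Python, uses an invented five-vertex network, representing the edge set as a mapping from ordered vertex pairs to real values, and recovers in-degrees, out-degrees, and the components of the corresponding undirected graph:

```python
from collections import defaultdict

# An invented network: vertices plus real-valued directed edges.
V = {"a", "b", "c", "d", "e"}
E = {("a", "b"): 1.5, ("b", "c"): 0.7, ("c", "a"): 2.0, ("d", "e"): 1.0}

out_degree = defaultdict(int)  # edges with the vertex as tail
in_degree = defaultdict(int)   # edges with the vertex as head
for (tail, head) in E:
    out_degree[tail] += 1
    in_degree[head] += 1

def components(V, E):
    """Components of the corresponding undirected graph (union-find sketch)."""
    parent = {v: v for v in V}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v
    for (u, w) in E:
        parent[find(u)] = find(w)
    groups = defaultdict(set)
    for v in V:
        groups[find(v)].add(v)
    return list(groups.values())

print(out_degree["a"], in_degree["a"])  # 1 1
print(len(components(V, E)))            # 2 distinct 'pieces' of the graph
```

Here {a, b, c} and {d, e} come out as the two components, matching the intuition of components as distinct pieces.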
We could consider other kinds of graphs as being restrictions on networks. For example,
a directed graph GD is a network where edge values are uniform2. An undirected graph,
GU, is further restricted from GD in that the edge vi → vj is no longer distinct from the
edge vj → vi. In the literature, ‘network’ has been used to refer to networks as we define
them, as well as to directed and undirected graphs.
The complement of a directed graph GD is the directed graph G′D defined on the same
vertices, where one vertex is an in-neighbour or out-neighbour of another in G′D just when
it is not in GD.
Finally, two graphs are isomorphic if there is a one-to-one mapping φ from the vertices of
one graph onto the vertices of the other such that vertices u and v are adjacent in the first
graph exactly when φ(u) and φ(v) are adjacent in the second [Chartrand 77].
3.4 Probability
We will take the frequentist approach to probability [von Mises 57], as opposed to the belief
approach. A reasoned and non-partisan discussion of the dual approaches to probability can
be found in [Hacking 01].
Assume a sample space E that is the set of all possible distinguishable outcomes, ei, of
an experiment. A subset of this sample space is called an event, A. The probability of A
occurring given the sample space E is the ratio of the cardinality of these two sets:
p(A) = |A|/|E|.
The probabilities of an event A and its complement Ā sum to 1: p(A) + p(Ā) = 1.
Let us assume there are two events, X and Y .
2Typically the edge values in a directed graph are set to 1.
If X and Y are mutually exclusive, the probability that either X or Y occurs is:
p(X ∪ Y ) = p(X) + p(Y ).
If X and Y are independent events, the probability that both X and Y occur is:
p(X ∩ Y ) = p(X)× p(Y ).
The joint occurrence of X and Y , p(X ∩ Y ) is more simply denoted as: p(X, Y ).
Conditional probability is the probability of some event Y happening, given some prior
event X has happened. It is denoted p(Y |X) and expressed as,
p(Y |X) = p(X, Y )/p(X).
In the case where X and Y are independent:
p(Y |X) = (p(X)× p(Y ))/p(X) = p(Y ).
The formula p(Y |X) = p(X, Y )/p(X) is often re-expressed in terms of the joint probability:
p(X, Y ) = p(X)× p(Y |X).
This equation can be generalized to account for more events. For example, in the case of
three events, X, Y and Z:
p(X, Y, Z) = p(X|(Y, Z))× p(Y |Z)× p(Z).
Finally, we can represent the outcome of an experiment that depends on chance as a
random variable. Let X now be a random variable that represents flipping a fair coin.
X = 1 if the coin lands heads. X = 0 if the coin lands tails. Then, p(X = 1) = 0.5.
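The rules above can be checked mechanically on a small sample space. A sketch, using the frequentist ratio |A|/|E| on the four outcomes of two fair coin flips (names and events are invented for illustration):

```python
from fractions import Fraction
from itertools import product

# Sample space: all distinguishable outcomes of two fair coin flips.
E = list(product([0, 1], repeat=2))

def p(event):
    """Frequentist probability: |A| / |E| for a subset A of the sample space."""
    return Fraction(len([e for e in E if event(e)]), len(E))

X = lambda e: e[0] == 1          # first flip lands heads
Y = lambda e: e[1] == 1          # second flip lands heads
XY = lambda e: X(e) and Y(e)     # joint event

# Independence: p(X, Y) = p(X) * p(Y).
assert p(XY) == p(X) * p(Y)
# Conditional probability: p(Y|X) = p(X, Y) / p(X), which equals p(Y) here.
assert p(XY) / p(X) == p(Y)
print(p(X), p(XY))  # 1/2 1/4
```

Using `Fraction` keeps the probabilities exact, so the identities hold with equality rather than to floating-point tolerance.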
3.5 Information (Classical)
We will approach information theory in two ways – via classical information theory [Shannon 63],
which is defined on probabilities of events, and via algorithmic information theory which is
defined on strings.
Information is about the uncertainty associated with a particular event. It assumes from
probability theory a sample space, and all the other rules of probability. Multiplicative rules
in probability theory become additive rules in information theory.
The uncertainty around a particular event is inversely proportional to the probability of that
event. Let us call this uncertainty c(x), and express it as,
c(x) = 1/p(x).
The information capacity C(X) (also called the entropy) is the average uncertainty as-
sociated with an outcome:
C(X) = Σi p(xi) log(1/p(xi)).
For a given number of distinguishable outcomes xi in X, C(X) reaches its maximum
value when all outcomes are equiprobable.
When p(x1) = p(x2) = ... = p(xN), Cmax(X) = logN .
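A quick numerical check of this maximum property (the outcome counts and probabilities are invented):

```python
from math import log2

def capacity(probs):
    """C(X) = sum p * log(1/p) over the outcome probabilities."""
    return sum(p * log2(1 / p) for p in probs if p > 0)

N = 4
uniform = [1 / N] * N              # equiprobable outcomes
skewed = [0.7, 0.1, 0.1, 0.1]      # same N outcomes, unequal probabilities

print(capacity(uniform))  # log2(4) = 2.0, the maximum
print(capacity(skewed))   # strictly less than 2.0
```

Any departure from equiprobability lowers the capacity below log N.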
If we are interested in the co-occurrence of two types of events X and Y, the information
capacity equation is modified to reflect the joint probabilities:
C(X, Y ) = Σi Σj p(xi, yj) log(1/p(xi, yj)).
Information capacities are additive for independent random variables:
C(X, Y ) = C(X) + C(Y ) ⇐⇒ p(x, y) = p(x)p(y).
Otherwise they are sub-additive:
C(X, Y ) ≤ C(X) + C(Y ).
Finally, the conditional information capacity reflects the uncertainty associated with some
variable Y, knowing X has already occurred:
C(Y |X) = Σi Σj p(xi, yj) log(1/p(yj |xi)).
These three forms of information capacity can be related together by the chain rule for
information capacities ([MacKay 03]:p. 139):
C(X, Y ) = C(X) + C(Y |X) = C(Y ) + C(X|Y ).
The mutual information is the difference between the information capacity and the con-
ditional information capacity:
I(X, Y ) = C(Y )− C(Y |X) (and by symmetry, I(Y,X) = C(X)− C(X|Y )).
Invoking the chain rule for information capacities, we can re-express C(Y |X) as:
C(Y |X) = C(X, Y )− C(X).
Substituting this into the formula for I(X, Y ) we get:
I(X, Y ) = C(Y )− (C(X, Y )− C(X)) = C(Y ) + C(X)− C(X, Y ).
Thus we can interpret the mutual information as the sum of the information capacities
for X and Y , minus their joint information capacity3. We know that in the case where X
and Y are independent,
C(X, Y ) = C(X) + C(Y ).
Thus, the mutual information increases to the degree X and Y are not independent.
Given that:
C(X) = Σi p(xi) log(1/p(xi)).
C(Y ) = Σi p(yi) log(1/p(yi)).
C(X, Y ) = Σi Σj p(xi, yj) log(1/p(xi, yj)).
Then:
I(X, Y ) = Σi p(xi) log(1/p(xi)) + Σi p(yi) log(1/p(yi)) − Σi Σj p(xi, yj) log(1/p(xi, yj)).
I(X, Y ) = Σi Σj p(xi, yj) log(1/(p(xi)× p(yj))) − Σi Σj p(xi, yj) log(1/p(xi, yj)).
I(X, Y ) = Σi Σj p(xi, yj) log(p(xi, yj)/(p(xi)× p(yj))).
There are certain general properties of mutual information, I, we will want to emphasize
([Renyi 87]:24).
1. I is non-negative: I(X, Y ) ≥ 0. If X and Y are independent, I(X, Y ) = 0.
3 One other way to express the mutual information is as the joint information from which the conditional information has been subtracted: I(X,Y ) = C(X,Y )− C(X|Y )− C(Y |X).
2. I is symmetric: I(X, Y ) = I(Y,X).
3. I is bounded: I(X, Y ) ≤ min(C(X), C(Y )).
3.6 Information (Algorithmic)
While classical information theory begins with a set of events and the probabilities associated
with each event as its basic ingredients, algorithmic information theory begins with a sequence,
S. It defines the information in that sequence as the length (in bits) of the smallest program,
d(S), that could generate the sequence [Chaitin 99, Li 04]. A random sequence is then
defined as one where the length of d(S) approximately equals the length of S. From this starting point,
algorithmic information theory derives a measure that is analogous to the information ca-
pacity, C, in classical information theory. Rather than a capacity, they name this measure a
complexity. We will call this algorithmically defined information measure, Ck, to distinguish
it from our previous classical information theory measures:
Ck(S) = |d(S)|.
Similarly there is an analogue for mutual information, which we will again distinguish by
the subscript, k:
Ik(X, Y ) = Ck(X)− Ck(X|Y ), which can again be looked at as the sum of the complexities
of X and Y minus their joint complexity:
Ik(X, Y ) = Ck(X) + Ck(Y )− Ck(X, Y ).
For Kolmogorov complexity, the bug in the information theory ointment is the small
problem of the existence, and haltingness, of the minimal program d(S), which renders
the actual complexities uncomputable. They can however be estimated, for example by
compression programs [Li 04].
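This estimation strategy is easy to demonstrate: a general-purpose compressor gives a crude upper bound on Ck. A sketch, with invented strings, in which zlib stands in for the uncomputable minimal program:

```python
import random
import zlib

def ck_estimate(s: bytes) -> int:
    """Crude upper bound on algorithmic complexity: compressed length in bytes."""
    return len(zlib.compress(s, 9))

patterned = b"ab" * 500                                    # a short description exists
random.seed(0)
noisy = bytes(random.randrange(256) for _ in range(1000))  # effectively random

# The patterned sequence compresses far below its length; the random one
# does not, matching the definition of randomness given above.
print(ck_estimate(patterned), ck_estimate(noisy), len(noisy))
```

The compressed length only bounds the true complexity from above, but the contrast between the two sequences is already decisive.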
The ultimate relationship between these two complementary theories of information is
still to be determined. Refer to [Muller 07] for an attempt to reconcile the several information
theories currently in existence.
3.7 Derivation of Ascendency
Networks and classical information theory come together in ascendency, an ecological appli-
cation of these concepts4. Ascendency is a quantitative approach to measuring the degree
of constraint (order) and growth in ecological networks. Ascendency is essentially a com-
bination of mutual information, applied to an ecological network, scaled by a measure of
system size, throughput. Throughput represents the amount of material flowing through an
ecosystem. In ascendency, we can consider different species as our vertices, and predator-
prey relationships (who eats whom) as our directed edges. The value assigned to an edge is
proportional to the amount of matter passing through a predator-prey relationship over an
interval of time. Thus ascendency compactly brings together the pattern of material flows in
an ecosystem (via the mutual information) and the size of the ecosystem (via the through-
put). When mutual information and throughput are tracked over extended periods of time,
ascendency characterizes the material growth, and changing constraints as the ecosystem
develops.
Our derivation of ascendency differs slightly from the standard derivation [Ulanowicz 91,
Ulanowicz 99a, Ulanowicz 04, Ulanowicz 09c] in that we first introduce ascendency in terms
of a directed graph (where directed edges all have a uniform value of 1), and then in terms of
a flow network (where directed edges have a flow value proportional to the material transfers
between two species). In many technological networks, it is the structure of the network
that is recorded initially, and only upon more detailed studies, are the finer grained data
4 While we emphasize ascendency as a methodology that brings networks and information theory together in such a way that it gives us an approach to topological stability, in Ulanowicz’s hands ascendency forms the basis of a general theory of ecology. Ulanowicz builds on ascendency to integrate a wide range of ecological phenomena and unify several earlier lines of theory. Adding in other theoretical constructs such as autocatalytic cycles and indirect mutualism, he develops a general explanatory framework for ecosystem origins, growth and development. The scientific and philosophical implications of these ideas are explored by Ulanowicz in two book length treatments [Ulanowicz 97, ?].
as to actual flows of information recorded. Ascendency can apply to both situations. We
emphasize that the network structure, and information calculations now apply specifically
to material transfer (predator-prey) relationships between species in an ecosystem.
We first define some information theoretic measures for the case where we have data on
the topology of flows, but no finer grained data. We then expand our measures to the case
where we can measure both topology and the values for specific flows.
Let us first consider the case where all we know about a flow is that it exists. In that
case, every directed edge has a value 1 (flow exists between an ordered pair of species) or 0
(flows do not exist between an ordered pair of species).
Maximum information capacity, Cmax, for a network of |V | vertices is:
Cmax = log |V |², which represents a complete graph with self-loops (in which every possible
directed edge is realized), so that |E| = |V |². Cmax represents the situation where, given
a source i (a prey species), it is possible to move to any destination j (a predator species)
in a single step, i.e. no constraints.
For a network of |V | vertices and |E| edges, the information capacity, C, will be at most
Cmax: C = log |E|.
We can define a few other simple probabilistic measures. There are i sources (prey
species) and j destinations (predator species). Let si dj be the flow value for a directed edge
from prey i to predator j. Let si refer to an edge for which prey i is the source, and dj to
an edge for which predator j is the destination.
The marginal probability of a source i being part of a flow is:
p(si) = Σj si dj / |E|.
Similarly, the marginal probability of a destination j being part of a flow is:
p(dj) = Σi si dj / |E|.
The joint probability that there is a flow with row i as its source and column j as its
destination is:
p(si, dj) = si dj / |E| = 1/|E|.
These marginal and joint probabilities allow us to define the mutual information, I, for
a set of flows from sources i to destinations j.
I(si, dj) = Σi Σj p(si, dj) log(p(si, dj)/(p(si)× p(dj))).
Mutual information measures the degree to which flows are constrained to certain paths.
I has a few properties worth noting. First, if a flow starting at i has equal likelihood
of going to any destination, j, then I = 0. Second, if a flow starting at i can only go to j,
then I = C. Thus, I is bounded by the information capacity of the system. The first case
represents maximum disorder (the flow i could go to any j) and the second case represents
maximum order (the flow from i can go to only a single j). I is a measure of constraint. Its
mirror image is a measure of disorder, which we will call D:
D = C − I or equivalently C = I +D.
Essentially, the information capacity C of a network can be decomposed into an ordered
component, I and a disordered component, D. Recall, that we have defined these measures
on a network topology. Let us denote this by adding a subscript, t to indicate these metrics
are based on network topology. Then,
C = It +Dt.
We can organize the information measures developed so far:
Cmax ≥ C = It +Dt.
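For the topology-only case, these quantities can be computed directly from an edge list. A sketch on an invented four-edge food web (species names are hypothetical; log base 2, so the capacities are in bits):

```python
from math import log2

# Invented four-edge food web; topology only, so every edge value is 1.
edges = [("plants", "herbivore"), ("plants", "omnivore"),
         ("herbivore", "carnivore"), ("omnivore", "carnivore")]
m = len(edges)

p_s, p_d = {}, {}
for (i, j) in edges:
    p_s[i] = p_s.get(i, 0) + 1 / m   # marginal: prey i is the source of a flow
    p_d[j] = p_d.get(j, 0) + 1 / m   # marginal: predator j is the destination

C = log2(m)                          # information capacity: log |E|
I_t = sum((1 / m) * log2((1 / m) / (p_s[i] * p_d[j])) for (i, j) in edges)
D_t = C - I_t                        # disorder: C = It + Dt

print(round(I_t, 3), round(D_t, 3))  # 1.0 1.0: half constraint, half freedom
```

In this little web each prey has two possible predators, so exactly half of the 2-bit capacity is constraint (It) and half is disorder (Dt).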
Since the information capacity scales with the number of edges, it is often useful to
rescale these measures by dividing by C. This allows us to compare the relative constraints
of systems that could have very different numbers of vertices and edges. We will add the
subscript r to denote these rescaled measurements, which are relative to the information
capacity of the network.
Cmax/C ≥ 1 = Itr +Dtr.
If we can go beyond determining the topology of a set of flows, and measure flow values
for each directed edge, then we can define the same measures, but now based on flows.
To proceed, we note that all edges previously had a value of 1, and that the sum of all
the edge values was |E|, the number of edges. This is no longer the case if flows have positive
real number values. Let us call the sum of the flows the throughput, T .
T = ΣiΣjsidj.
The divisors in the joint and marginal probabilities are now T , rather than |E|, and the
edge values are now based on flow measurements.
p(si) = Σj si dj / T.
p(dj) = Σi si dj / T.
p(si, dj) = si dj / T.
The formulae for all other calculations stay the same, but are now based on measured flow
values. To distinguish these measures from those based on network topology only, let
us denote these new measures by the subscript f , for flows.
Cmax ≥ C = If +Df .
These measures can also be rescaled so they are relative to the information capacity; the
rescaled measures are again denoted by the subscript, r.
Cmax/C ≥ 1 = Ifr +Dfr.
The move from mutual information to ascendency requires one more, seemingly minuscule
shift in perspective, which has proven to have greater than anticipated returns in practical
utility. Ulanowicz [?] realized that If , as a measure of constraint in a system, did not allow
one to distinguish between two systems with similar constraint, but very different amounts of
material throughput, T . That is, we are interested in both constraint on a system measured
by If , and also on the size of a system measured by T .
Ascendency, A, is the mutual information scaled by the throughput, T :
A = T × If .
The units of A are based on the units of measurement for the data used to calculate flows.
In ecosystems development, there is a tendency over time for A to continue to increase, which
led Ulanowicz [Ulanowicz 97] (pg 75) to state:
‘In the absence of overwhelming external disturbances, living systems exhibit a natural
propensity to increase in ascendency’.
This rise in ascendency is an empirical fact of ecosystems: as ecosystems mature, they
become more highly constrained in terms of the structure of flows (greater mutual
information), and those flows lead to increases in ecosystem performance (greater throughput).
Similar to ascendency, we may scale the disordered component of our flow metrics by
throughput, T, to create a measure of overhead, O.
O = T ×Df .
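Putting the pieces together, ascendency and overhead can be computed from a flow matrix in a few lines. The flows below are hypothetical and invented; the sketch follows the derivation in this section, partitioning T × C into A and O:

```python
from math import log2

# Hypothetical measured flows (invented units): material from prey i to predator j.
flows = {("plants", "herbivore"): 80.0,
         ("plants", "omnivore"): 20.0,
         ("herbivore", "carnivore"): 30.0,
         ("omnivore", "carnivore"): 10.0}

T = sum(flows.values())                        # throughput: total material flow
p_joint = {e: f / T for e, f in flows.items()}
p_src, p_dst = {}, {}
for (i, j), p in p_joint.items():
    p_src[i] = p_src.get(i, 0) + p
    p_dst[j] = p_dst.get(j, 0) + p

C = sum(p * log2(1 / p) for p in p_joint.values())     # capacity: C = If + Df
I_f = sum(p * log2(p / (p_src[i] * p_dst[j]))
          for (i, j), p in p_joint.items())            # constraint (order)
D_f = C - I_f                                          # disorder

A = T * I_f    # ascendency: constraint scaled by system size
O = T * D_f    # overhead: disorder scaled by system size

assert abs(T * C - (A + O)) < 1e-9   # the partition T x C = A + O holds
print(round(A, 1), round(O, 1))
```

Concentrating more of the throughput onto fewer paths raises I_f, and hence A, at the expense of O, which is the trade-off discussed next.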
How are ascendency and overhead, as structural measures, related to system perfor-
mance? In the case of a completely ordered system (If = C), all the material flowing
through the system might be organized into a single path, say a cycle. In this case, ascen-
dency is maximal. There are no alternative paths. There is no overhead. However, such a
system is brittle in that the disruption of any single flow will disrupt the system as a whole
(i.e., it is cut-unstable). The existence of alternate paths will add to the disorder of a system,
and lower its ‘efficiency’ in getting materials from point A to point B (and often require, in
technical systems, additional support infrastructure), but increase its cut-stability, in that if
one path is disrupted, other paths exist. Overhead is the measure of those alternate paths.
It is in this sense that the structure of a system is related to its performance.
In summary:
T × C = A+O,
which can be rescaled as,
T = Ar +Or.
This allows us to partition the total flow of materials through a system into an ordered
(ascendency) and disordered (overhead) component. If we consider disorder to be a form
of chaos, we could say these metrics allow us to estimate the mix of order and chaos in
a particular ecological network, or by extension, any technological network upon which
information flow values can be calculated.
Chapter 4
Topological Network Stability
When stabilities oppose
the middle road is balance.
4.1 Abstract
We now formalize the notions of connection-stability and cut-stability from the previous
chapters to build a theory of topological network stability. Graph theoretic definitions of
network stability are used to demonstrate that the cut-stability and connection-stability of
a graph are antagonistic in an undirected connected graph. Cut-stability is related to the
Minimum Vertex Cover problem in graphs. Connection-stability is related to the time to
flood a graph. Changes to a graph that increase one stability property, tend to decrease the
other stability property. Cut-stability and connection-stability are shown to be related to the
average mutual information of a directed graph. From this relationship between topological
stability and mutual information, the concept of balanced stability is developed. Balanced
stability is then extended to develop the notion of perfect information hiding on a graph. The
application of topological stability theory to technological, ecological and social networks is
briefly considered.
In this chapter, we build up a theory of topological network stability by alternating
between two very different perspectives: that of a network architect and that of an ecologist1.
The network architect is concerned with the stability of the systems he constructs to various
kinds of attacks, which are difficult to anticipate. The ecologist is concerned with the stability
1 The main text will contain the through-line of the argument, while footnotes are used to make ancillary points supporting the main argument, to fill in ecological details a computer scientist is not likely to know, and to add mathematical background that may be outside the standard training of ecologists.
of the ecosystems she studies to various kinds of natural and man-made perturbations.
Consider our network architect and ecologist as playing a kind of game of intellectual leap-
frog with each other, where each applies their particular training and perspective to build
on the others’ results. Hopefully, seeing across disciplines in this fashion allows for the
construction of a richer theory, and a wider set of applications, as each has insights and
access to techniques that would be foreign to the other’s perspective.
In this fashion, we will first formalize our definitions of cut-stability and connection-
stability, and examine how they are antagonistic in the context of undirected graphs. We will
then extend our concepts to directed graphs, and show how they can be linked to information
theoretic concepts with a long history in ecology. We also show how our stability concepts
can provide insight into a foundational debate in ecology known as the ‘diversity-stability’
debate. Next, we develop a notion of balanced stability in a network, and show how it can
be related to classic work in information theory on the limits of inference. Finally we will
briefly survey areas of applicability of the theory of topological stability developed here to
other fields such as attack and error tolerance in technological networks, cohesion in social
networks, and critical concepts that have been hypothesized to stabilize ecosystems: keystone
species, indirect effects, cycling.
We begin with our network architect.
4.2 Introduction and Motivation I: A Network Architect’s Perspective
Imagine you are the architect of a large, critical, networked system. You want to evaluate
the stability of your architecture under different kinds of perturbations. Intuitively, you
know a few things. At one extreme, a network where every vertex is connected to every
other vertex is stable to loss of vertices, whether the loss is due to direct attacks or random
breakdowns. As vertices are lost, the system can still function since you can re-route the
system around failures. The system can lose a large proportion of its vertices and still be
able to pass a message between any two vertices. Let us call this idea ‘cut-stability’. At
another extreme, you know that if you construct a network that is very sparsely connected,
a virus beginning at a single vertex will be limited in its rate of spread, and provide human
operators or automated systems with more time to react. Let us call this idea ‘connection-
stability’. Finally, you know that a system which can lose a large portion of its vertices and
still be functional is very susceptible to viral attacks. The very connectivity that makes it
resistant to loss of vertices also makes it susceptible to viral attacks. The converse is also
true, that a sparsely connected system that limits the rate of viral spread is susceptible to
being easily cut into pieces by the removal of a few critical vertices. You sense that there is an
antagonism between these two aspects of network stability. You decide that the system you
will construct will have an architecture somewhere between these two extremes, providing a
modicum of stability in both senses – cut and connection.
Making these decisions is the ‘craft’ part of your job. Your decisions reflect previous
cases, rules of thumb, and your personal experience. Now comes the science: how do you go
about evaluating the exact stability your system has, and the trade-offs you have made between
these intuitive notions of cut-stability and connection-stability? You have to justify your
decisions to your team, who will build on your architecture, to your managers, and to the
users of your network, who will assume it is functional and stable day in and day out. You
need to offer something more than ‘in my expert opinion ...’.
You need in fact, at minimum, two things. First, tight definitions of cut-stability and
connection-stability that move intuition to empirical verifiability. Second, a demonstration
that these definitions are indeed antagonistic. You need to think about the trade-offs you
are making in a way that can be transparent to others. You want to begin the journey
towards providing minimal guarantees about your network such as ‘it will function as long
as less than N vertices are cut’ or ‘it will take T transfers for a virus to propagate through the
system, and so we will design response mechanisms that can detect and block a viral process
in less than T’.
We begin the first steps of that journey by offering initial graph theoretical definitions of
cut-stability and connection-stability, and a demonstration of their antagonism.
4.3 Cut-stability and Connection-stability Definitions
Notation follows [Chartrand 77, Chung 06].
Let G = (V,E) be a graph with a vertex set, V , and an edge set, E.
Denote V (G) as the vertex set of G.
Denote E(G) as the edge set of G.
A pair of vertices u, v are adjacent if uv ∈ E(G).
Let GU designate an undirected graph, where each edge in E(G) is an unordered pair of
vertices.
Let GD designate a directed graph, where each edge in E(G) is an ordered pair of vertices.
Unless otherwise stated assume an undirected graph, so that if uv ∈ E(G) then vu ∈ E(G).
A graph H is a subgraph of G if V (H) ⊆ V (G) and E(H) ⊆ E(G).
A graph G is connected if there is a path between all vertex pairs u and v, where a path
is an alternating sequence of vertices and edges beginning with u and ending with v.
C(G) are the set of components of G, i.e. maximal connected subgraphs of G.
If E(G) = ∅, G is an empty graph (also referred to as an ‘edgeless’ or ‘null’ graph).
If E(G) = V × V = {(v1, v2) : v1, v2 ∈ V }, G is a complete graph with self loops2.
The complement of a graph G is G′; two distinct vertices are adjacent in G′ if and only if
they are not adjacent in G. The complement of the complete graph is the empty graph. The
union of a graph and its complement is a complete graph.
2 While the standard definition for a complete graph is E(G) = {(v1, v2) : v1, v2 ∈ V, v1 ≠ v2}, and excludes self loops, we will later be considering graphs with self loops, as they are used in ecological network analysis.
4.3.1 Cut-stability
Let G be an undirected, connected graph.
Let MVC(G) be a 'least cut set' for G. MVC(G) is a smallest set of vertices which, if
removed from G (along with their associated edges), results in a graph G∗ that is an empty
graph. MVC(G) is also a 'minimum vertex cover' for the graph, which is usually defined
as a smallest set of vertices such that each edge in the graph is incident to at least one vertex
in this set ([Garey 79], pg. 190). The two ideas, of a least cut set and a minimum vertex
cover, are equivalent: it is exactly the smallest set of vertices that cover every edge whose
removal, along with their associated edges, results in an empty graph.
V (G∗) = V (G)−MVC(G) such that E(G∗) = ∅.
Let |MVC(G)| be the cardinality of MVC(G).
Definition 1. Let Sk(G) be the cut-stability of G.
Sk(G) = |MVC(G)|.
For an empty graph G we define Sk(G) = 0.
For a complete graph, Sk(G) ≤ |V | − 1.
Sk(G)/|V | ≤ 1 provides a normalized measure of cut-stability.
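As a concrete check of Definition 1, the sketch below computes Sk(G) by brute force on a small graph. The function names (`min_vertex_cover`, `cut_stability`) and the vertex-list/edge-list representation are illustrative assumptions, not notation from this thesis, and the exhaustive search is only feasible on tiny graphs since Minimum Vertex Cover is NP-complete.

```python
from itertools import combinations

def min_vertex_cover(vertices, edges):
    """Smallest vertex set incident to every edge (exhaustive search)."""
    if not edges:
        return set()  # empty graph: Sk(G) is defined as 0
    for k in range(1, len(vertices) + 1):
        for cand in combinations(vertices, k):
            s = set(cand)
            if all(u in s or v in s for u, v in edges):
                return s
    return set(vertices)

def cut_stability(vertices, edges):
    return len(min_vertex_cover(vertices, edges))

# 4-cycle a-b-c-d-a: two opposite vertices cover every edge, so Sk = 2
# and the normalized measure is 2/4 = 0.5.
V = ['a', 'b', 'c', 'd']
E = [('a', 'b'), ('b', 'c'), ('c', 'd'), ('d', 'a')]
print(cut_stability(V, E), cut_stability(V, E) / len(V))  # 2 0.5
```

Removing those two vertices (and their edges) leaves the 4-cycle edgeless, matching the least-cut-set reading of MVC(G).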
4.3.2 Connection-stability
Let v∗ be a vertex in G, from which G can be maximally flooded in the fewest iterations. A
‘flood’ begins at a vertex vi and in its first iteration includes all nodes adjacent to vi, and
continues to add adjacent nodes in each iteration.
Since G is connected, all vertices are reachable by a flooding process.
Let MFS(v∗, G) be the set of vertices flood-able from v∗ and include v∗.
Let |MFS(v∗, G)| be the cardinality of MFS(v∗, G).
Let T (G) be the number of iterations of flooding required to create MFS(v∗, G) from v∗.
Definition 2. Let Sc(G) be the connection-stability of G.
Sc(G) = T (G) × |V (G)| / |MFS(v∗, G)|.
Since G is undirected and connected, all vertices are reachable:
|MFS(v∗, G)| = |V (G)|, so |V (G)| / |MFS(v∗, G)| = 1.
Therefore Sc(G) = T (G) in an undirected connected graph.
For an empty graph, there is no vertex adjacent to any vi, so |MFS(v∗, G)| = 1, and by
definition T =∞. For a complete graph T = 1.
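Definition 2 can likewise be checked with a small flooding sketch, a breadth first search in disguise. The dict-of-adjacency-lists representation and the helper names (`flood`, `connection_stability`) are illustrative assumptions; a connected graph is assumed, so the thesis's T = ∞ convention for empty graphs is not handled here.

```python
def flood(adj, start):
    """One flood: iteration t adds every vertex adjacent to the current set."""
    seen, frontier, t = {start}, [start], 0
    while frontier:
        nxt = [w for v in frontier for w in adj[v] if w not in seen]
        seen.update(nxt)
        if nxt:
            t += 1
        frontier = nxt
    return seen, t

def connection_stability(adj):
    """Sc(G) = T(G) * |V(G)| / |MFS(v*, G)|, with v* the vertex that floods
    the most vertices in the fewest iterations."""
    mfs, t = min((flood(adj, v) for v in adj), key=lambda r: (-len(r[0]), r[1]))
    return t * len(adj) / len(mfs)

# Path a-b-c-d-e: flooding is fastest from the centre c (T = 2), so Sc = 2.
adj = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c', 'e'], 'e': ['d']}
print(connection_stability(adj))  # 2.0
```

For a connected undirected graph the ratio is 1 and the result is just T(G), as derived above.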
With these definitions, our intuitive notions of cut-stability and connection-stability
developed over the previous chapters have now been formalized3.
4.3.3 Extension of Cut and Connection Stability to Disconnected Graphs
Assume now a disconnected undirected graph. If the graph is not the empty graph, it is
composed of connected components. We transfer our previous definitions to apply to the
connected components.
For cut-stability, since the cut set for each component must be part of the cut set for the
graph as a whole, or else the conditions G∗ = G −MVC(G) and E(G∗) = ∅ are violated,
the definition for cut-stability does not change. However we can refine the definition so it is
in terms of C(G), the components of G. Writing Sk(G) for the cut-stability of the graph,
and Sk(ci) for the cut-stability of the ith component of G, then
Sk(G) = Σi Sk(ci).
For connection-stability, we can assume the stability of the graph as the weighted sum of
the connection-stability of individual components, weighted by the size (in vertices) of the
component.
Sc(G) = Σi wi × Sc(ci), where wi = |V (ci)| / |V (G)| is the weight for each component.
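A minimal sketch of this component-wise extension, under the same illustrative dict-of-adjacency-lists representation; the helper names (`components`, `flood`, `sc_graph`) are assumptions, and isolated vertices (where the T = ∞ convention would apply) are not handled.

```python
def components(adj):
    """Connected components of an undirected graph {v: [neighbours]}."""
    seen, comps = set(), []
    for v in adj:
        if v in seen:
            continue
        comp, stack = set(), [v]
        while stack:
            u = stack.pop()
            if u not in comp:
                comp.add(u)
                stack.extend(adj[u])
        seen |= comp
        comps.append(comp)
    return comps

def flood(adj, start):
    seen, frontier, t = {start}, [start], 0
    while frontier:
        nxt = [w for v in frontier for w in adj[v] if w not in seen]
        seen.update(nxt)
        if nxt:
            t += 1
        frontier = nxt
    return seen, t

def sc_graph(adj):
    """Sc(G) as the size-weighted sum of component connection-stabilities."""
    total = 0.0
    for comp in components(adj):
        sub = {v: [w for w in adj[v] if w in comp] for v in comp}
        mfs, t = min((flood(sub, v) for v in comp), key=lambda r: (-len(r[0]), r[1]))
        total += (len(comp) / len(adj)) * (t * len(comp) / len(mfs))
    return total

# Triangle (Sc = 1) plus a 5-vertex path (Sc = 2), weights 3/8 and 5/8:
adj = {'a': ['b', 'c'], 'b': ['a', 'c'], 'c': ['a', 'b'],
       'p': ['q'], 'q': ['p', 'r'], 'r': ['q', 's'], 's': ['r', 't'], 't': ['s']}
print(sc_graph(adj))  # 3/8 * 1 + 5/8 * 2 = 1.625
```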
3 Interestingly, the definition of cut-stability depends on Minimum Vertex Cover, a known NP-complete problem [Garey 79, Dinur 05], while connection-stability depends on a flooding process akin to a breadth first search ([Kleinberg 06] pp. 79-82, 97-98) whose time complexity is O(|E| + |V |).
4.3.4 Extension of Cut-Stability and Connection-Stability to Directed Graphs
Our first challenge is to deal with notions of connectivity and components in directed graphs.
Let GD be a directed graph. Let GU be the corresponding undirected graph resulting from
replacing each directed edge with an undirected edge.
Definition 3. A strongly connected directed graph, GD, is one in which there is a directed
path between every pair of distinct vertices. A directed graph, GD, is 'weakly connected' if
there is a path between every pair of distinct vertices in its corresponding undirected
graph, GU ([van Steen 10] pg. 61). A minimal strongly connected directed graph, GD, is one
in which the removal of a single edge will result in GD no longer being strongly connected.
Weakly connected components in a directed graph are the components of the corresponding
undirected graph. Strongly connected components in a directed graph are the
more restrictive case: for all vertices u and v in the strongly connected component, there
is both a directed path from u to v and from v to u, so all pairs of nodes are mutually
reachable ([Kleinberg 06]:pp. 98-99).
The notion of strongly connected components is too restrictive, in that we would consider
a directed graph fully flooded if every vertex vi is reachable from v∗ even if v∗ is not reachable
from all vi.
We use the notion of weak connectivity for our definitions at this stage. The formulae
for disconnected graphs above are unchanged other than that summation is now over the
weakly connected components. Later in this chapter we will consider our stability concepts
with respect to strongly connected directed graphs.
Since MVC(GD) depends only on the structure of adjacency in a graph, and not on edge
directions,
Sk(GD) = Sk(GU).
However, with respect to connection-stability and the notion of a flood, directionality of
edges does matter. This has two consequences. First, in a weakly connected component, all
vertices are no longer necessarily reachable due to the directionality of arrows:
|MFS(v∗, Ci)| ≤ |V (Ci)|, so 1 ≤ |V (Ci)| / |MFS(v∗, Ci)| ≤ |V (Ci)|.
Therefore Sc(GD) ≥ Sc(GU).
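The effect of directionality can be seen in a small sketch (illustrative names and a dict-of-out-neighbours representation are assumptions). In the weakly connected graph a → b ← c, the best flood reaches only two of the three vertices, so the ratio |V (G)| / |MFS(v∗, G)| inflates Sc relative to the undirected counterpart:

```python
def flood(adj, start):
    """Directed flood: follow out-edges only, counting iterations."""
    seen, frontier, t = {start}, [start], 0
    while frontier:
        nxt = [w for v in frontier for w in adj[v] if w not in seen]
        seen.update(nxt)
        if nxt:
            t += 1
        frontier = nxt
    return seen, t

def sc(adj):
    """Pick v* that floods the most vertices in the fewest iterations."""
    mfs, t = min((flood(adj, v) for v in adj), key=lambda r: (-len(r[0]), r[1]))
    return t * len(adj) / len(mfs)

# Directed: a -> b <- c; the best flood (from a or c) reaches 2 of 3 vertices.
gd = {'a': ['b'], 'b': [], 'c': ['b']}
# Undirected counterpart: path a - b - c; flooding from b reaches all in 1 step.
gu = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b']}
print(sc(gd), sc(gu))  # 1.5 1.0, illustrating Sc(GD) >= Sc(GU)
```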
Intuitively, cut-stability is related to the amount of effort required to cut a network to
pieces. The greater the cut stability of the network, the more effort required to cut it to
pieces. Intuitively, connection-stability is related to the time it would take for a viral process
to run through a network. The longer the process would take, the greater the network’s
connection stability4.
Our next task is to demonstrate these two stability concepts are antagonistic.
4.3.5 Antagonism
Definition 4. Two properties, A and B of an object, are considered antagonistic (or ‘antag-
onistically related’), if for a repeated operation o (excluding the identity operation) conducted
i times on the object, (a) there exists at least one instance where the value of each property
changes, and (b) one property monotonically increases in value, while the other property
monotonically decreases in value such that:
If: A(o0) ≥ A(o1) ≥ A(o2) ≥ A(o3) ≥ · · · ≥ A(oi)
Then: B(o0) ≤ B(o1) ≤ B(o2) ≤ B(o3) ≤ · · · ≤ B(oi)
or vice-versa5.
A special case is where two properties are inversely related; in this case the properties are
strictly antagonistic:
If: A(o0) > A(o1) > A(o2) > A(o3) > · · · > A(oi)
4 For undirected graphs, T in connection-stability is bounded by the diameter of a component, D(Ci), where D(Ci) is the maximal shortest path between a pair of vertices in the component: D(Ci) ≤ 2T (Ci). Since every reachable vertex in a component (or weakly connected component for a directed graph) is reachable from v∗ in T steps, and assuming at worst that the paths have no vertices in common other than v∗, the maximum possible distance between two vertices vx and vy which only share v∗ would be along the path where they are joined at v∗, which will be at most 2T . For an undirected component, ∴ Sc(Ci) ≥ D(Ci)/2. So in an undirected network, calculating the diameter of the network will provide a quick estimate of the level of connection stability possible, though your actual connection stability may be much less. In the case of a directed network, where Sc depends not only on T but also on the ratio |V (GD)| / |MFS(v∗, GD)|, the relationship between connection-stability and diameter does not hold.
5 An example of such an antagonistic relationship is database transactions, where speed and security are antagonistic. Optimizing for transaction speed sub-optimizes for transaction security. Features that add to transaction speed do not necessarily add to security, and may reduce security. Features that add to transaction security do not necessarily add to speed, and may reduce transaction speeds.
Then: B(o0) < B(o1) < B(o2) < B(o3) < · · · < B(oi)
or vice-versa6.
4.4 Cut-Stability and Connection-stability are Antagonistic
Theorem 1. In a connected undirected graph, G, Sk(G) and Sc(G) are antagonistic under
the operations of adding or deleting edges.
Proof. We assume our initial object is an undirected connected graph, G.
Let G→ G+ denote the addition of an edge to G.
Let G→ G− denote the removal of an edge from G.
Several cases need to be proved for each stability definition. Each definition is tied to a
critical set: MVC(G) for cut-stability, and MFS(v∗, G) for connection-stability. We need
to consider the case where an edge is added or removed between vertices within the respective
critical set, the case where an edge is added or removed between vertices outside the respective
critical set, and the case where an edge crosses from a vertex within the respective critical
set to one outside the critical set.
CASES 1 and 2: Addition of Edges Within or Outside the Critical Set.
The critical set is MVC(G) for cut-stability and MFS(v∗, G) for connection-stability.
Sk increases or stays the same on addition of an edge. If the edge is added within
MVC(G) then Sk(G) = Sk(G+). If the edge is added outside MVC(G), then MVC(G)
together with one vertex of the new edge covers G+, so
Sk(G) ≤ Sk(G+) ≤ Sk(G) + 1.
Sc decreases or stays the same on addition of an edge. Since all vertices are in MFS(v∗, G)
and are already reachable in T iterations for G 7, the addition of an edge can only provide
a shortening in the number of iterations, or no change at all8, so
Sc(G) ≥ Sc(G+).
6 An example of such an inverse relation from quantitative genetics is the heritability of a trait andvariation in the same trait. As variation increases, heritability necessarily decreases.
7 There is a path of length less than or equal to T from v∗.
8 The addition of an edge could lead a new vertex to become a candidate for v∗ only if it leads to a smaller T than already exists.
CASE 3: Addition of Edges that Cross the Critical Set
For a graph G, an edge crosses MVC(G) (or MFS(v∗, G) respectively) if one vertex of
the edge is within the set, and one vertex of the edge is outside the set.
Sk does not change on the addition of an edge crossing MVC(G), as one vertex is in
MVC(G), and so that edge is lost when the least cut set is removed.
Sk(G) = Sk(G+).
For Sc in a connected undirected graph, it is impossible for there to be a crossing edge
by definition.
Summarizing Cases
Combining the sub-cases above,
Sk(G) ≤ Sk(G+).
Sc(G) ≥ Sc(G+).
These arguments hold in reverse for the removal of an edge, G→ G−, so that
Sk(G) ≥ Sk(G−).
Sc(G) ≤ Sc(G−).
Antagonism
By transitivity, a series of i edge addition operations G → G+, where G0 is the original
graph, G1 is the graph after the first edge addition, G2 is the graph after the second edge
addition and Gi is the graph after the ith edge addition results in the antagonistic partial
orders:
Sk(G0) ≤ Sk(G1) ≤ Sk(G2) ≤ Sk(G3) ≤ ... ≤ Sk(Gi).
Sc(G0) ≥ Sc(G1) ≥ Sc(G2) ≥ Sc(G3) ≥ ... ≥ Sc(Gi).
Hence Sk and Sc are antagonistic under either repeated edge additions or repeated edge
deletions on a graph, G, that is connected and undirected.
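Theorem 1 can be exercised numerically. The sketch below (illustrative helper names; brute-force MVC, feasible only on tiny graphs; the graph stays connected throughout, so Sc reduces to T) adds edges one at a time to a path on four vertices until it is complete, recording (Sk, Sc) at each step:

```python
from itertools import combinations

def sk(vertices, edges):
    """Cut-stability: size of a minimum vertex cover, by brute force."""
    if not edges:
        return 0
    for k in range(1, len(vertices) + 1):
        for cand in combinations(vertices, k):
            if all(u in cand or v in cand for u, v in edges):
                return k

def sc(vertices, edges):
    """Connection-stability: fewest flood iterations over start vertices
    (the graph is kept connected, so every flood reaches all vertices)."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = None
    for s in vertices:
        seen, frontier, t = {s}, [s], 0
        while frontier:
            nxt = [w for v in frontier for w in adj[v] if w not in seen]
            seen.update(nxt)
            if nxt:
                t += 1
            frontier = nxt
        best = t if best is None else min(best, t)
    return best

V = ['a', 'b', 'c', 'd']
E = [('a', 'b'), ('b', 'c'), ('c', 'd')]            # start from a path...
history = [(sk(V, E), sc(V, E))]
for extra in [('a', 'c'), ('a', 'd'), ('b', 'd')]:  # ...end at the complete K4
    E = E + [extra]
    history.append((sk(V, E), sc(V, E)))
print(history)  # [(2, 2), (2, 1), (2, 1), (3, 1)]
```

Across the run Sk never decreases while Sc never increases, matching the partial orders above.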
At this point, our network architect has arrived at some basic definitions of cut-stability
and connection-stability, and a demonstration of their antagonism. These are the core build-
ing blocks for a theory of topological stability, which we will begin to expand upon in the
sections that follow.
4.5 Introduction and Motivation II: An Ecologist’s Perspective
In the previous sections we took on the perspective of a network architect to motivate the
concepts of cut-stability, connection-stability, and their antagonistic relationship. We now
switch to the perspective of an ecologist. How would she view the results of the last section?
An ecologist is likely to relate the antagonism between cut-stability and connection-
stability to the kinds of networks she is most likely to work with: food webs and ecological
flow networks9. A food web represents an ecosystem as a directed graph [Dunne 06], in which
the directed edges define who is eating whom. The vertices are species. If species v eats
species u, there is a directed edge, uv between those two species, where u is the edge source,
and v is the edge destination. An ecological flow network is like a food web with additional
detail added. Now, each directed edge has a value that reflects the amount of material flowing
along that edge over some time interval. These values usually represent the flow of matter
(via being eaten) from u to v over some observational period. The summation of all edge
values represents the total flow of matter through the system, called ‘throughput’. While the
edge values in ecological flow networks are usually quantified in terms of carbon transfers,
they could also be in terms of particular limiting nutrients, or even the flow of a toxin through
an ecosystem [Fath 99, Ulanowicz 97, Ulanowicz 04, Ulanowicz 99b]. Suppressing the edge
values in an ecological flow network results in the corresponding food web10.
Within the context of food webs and ecosystem flow networks our ecologist can view the
antagonism between cut-stability and connection-stability in terms of a foundational debate
9 Conversely, a network architect might interpret an ecologist's directed graphs in their own field as messages passing through a network. While reading what follows, it is useful to maintain both perspectives, that of the ecologist studying evolved ecosystems and that of the network architect designing information ecosystems de novo.
10 From the network designer's perspective the directed graph of a food web is analogous to the pattern of message flows in a network, while an ecological flow network is analogous to quantifying the actual message flows over time.
in ecology commonly known as the ‘diversity-stability’ debate [McCann 00, Tilman 99]. The
diversity-stability debate, which links theoretical concepts to empirically testable results, has
been a point of contention for over fifty years in ecological studies, with vocal advocates for
different resolutions to the debate. The essence of this debate concerns whether ecosystems
with more species and densely connected food webs are more stable than ecosystems with
a few species and sparsely connected food webs. These issues have been approached both
theoretically and empirically.
In the 1950s observations by Elton [Elton 58] and Odum [Odum 53] suggested that species
rich communities were more stable than simpler communities with few species, and thus
diversity was positively associated with stability in ecosystems. MacArthur [MacArthur 55]
realized such observations could be used to make precise statements about the structure of
food webs, and used information theory to measure the stability of an ecosystem as reflected
in its food web. MacArthur summarized the idea behind his information theoretic conception
of stability in a food web as follows [MacArthur 55]:pg. 534:
‘The amount of choice which the energy has in following the paths up through the food
web is a measure of the stability of the community.’
MacArthur’s conception of stability (a) focussed on using information metrics to capture
aspects of the topology of a food web, (b) was intimately tied to energetic considerations, and
(c) considered two food webs with different topologies equivalently stable if their structure
incorporated the same amount of choice in energetic paths [MacArthur 55]:pg. 535, Figs. 3
and 4. He explicitly considered changes to stability as the topology of a food web is altered,
for example by adding species and links. His conception of ecosystem stability appears
analogous to cut-stability.
MacArthur’s theoretical conception of the relationship between stability and diversity in
ecosystems, influenced the field for close to twenty years. In the 1970s, theoretical work sum-
marized in May’s landmark book, Stability and Complexity in Model Ecosystems introduced
a different conception of stability in the context of dynamical systems models of ecosystem
interactions11. May found that such models become unstable to perturbations as the number of
species and interactions increases (where species are vertices and interactions are edges), so
that species are lost [May 00, Pimm 79]. These theoretical studies in the 1970s appeared to
contradict the earlier observational and theoretical work from the 1950s. May’s conception
of stability followed from the concept of 'neighbourhood stability' in dynamical systems
theory, which concerns perturbations close to an equilibrium state. Each species' rate of change
in population size is a function of the population sizes of all of the species with which it
interacts. In the equilibrium state, all species in the model ecosystem maintain stable pop-
ulations that do not change in size. A model ecosystem is stable, if when perturbed away
from an equilibrium point, it returns to it12.
May suggested that MacArthur’s earlier information theory based stability conclusions
that diversity and stability increase together in ecosystems had been mistakenly given the
status of a mathematical theorem ([May 00]:pp. 37-38, [May 09]:pg. 1643). While praising
MacArthur’s insight, May suggested MacArthur’s conclusions, while intuitive, were not for-
mally proven. This led to May’s own investigation of stability in terms of dynamical systems
models of ecological communities13. May’s views altered the conception of stability ecologists
11 Additionally, May’s follow up work on discrete time models [Gleick 87, May 74, May 76b, May 76a]introduced the concept of chaos into ecology as well as to other scientific disciplines.
12 May’s conceptualization of stability above is often contrasted with a closely related ecological concept,‘resilience’ developed by Holling [Holling 73]pg. 17, ‘Resilience determines the persistence of relationshipswithin a system and is a measure of the ability of these systems to absorb changes of state variables, drivingvariables,and parameters and still persist.’ He goes on to note that systems can be very resilient, whilefluctuating greatly, and thus having low stability. This suggests an antagonism between stability (relativeto equilbrium) and resilience similar to the antagonism between cut-stability and connection-stability. Hisparticular example was spruce budworm forest communities, and he noted that the large fluctuations inpopulation size between budworms and their predators, which could be seen as instability, were essentialto the persistence of the ecological community consisting of budworms, their predators, and their host treespecies. The example is particularly apropos in that one of the mitigating factors of the current bark beetleepidemic sweeping across B.C. and Alberta was a long term policy of fire suppression to control fluctuationsin the beetles host species, lodgepole pine, which resulted in a corridor of trees all in an age class susceptibleto beetle infestation [Halter 11b].
13 May ([May 00]:pg. 38), while appearing to cite Hutchinson's influential 1959 essay, 'Homage to Santa Rosalia or Why Are There So Many Kinds of Animals', actually disputes Hutchinson's belief ([Hutchinson 59]:pg. 149) that 'Recently MacArthur (1955) using an ingenious but simple application of information theory has generalized the points of view of earlier workers by providing a formal proof of the
were using from one centered around choice of energy flows in alternate topologies to one
around perturbations from an equilibrium point. However, there have been ongoing prob-
lems in resolving stability conclusions from dynamical systems models of ecosystems against
stability findings from empirical studies of ecosystems14. Furthermore, May’s dynamical
systems based conceptions of ecological stability have been critiqued for making unrealistic
assumptions about the nature of ecosystems, particularly the assumption that stability is
tied to equilibria in real ecosystems. McCann notes [McCann 00]:pg 229:
‘... ecological theory has tended traditionally to rely on assumptions that a system is
stable if, and only if, it is governed by stable equilibrium dynamics (that is, equilibrium
stability and equilibrium resilience). As discussed in the previous section, these are strong
assumptions with no a priori justification.’
Through the 1990s to present, studies that have manipulated actual ecosystems [Fagan 97,
Naeem 97, Romanuk 06a, Romanuk 06b, Romanuk 09a, Romanuk 10, Tilman 96, Tilman 94]
or compared food webs from different ecosystems [Aoki 01], have found a positive relation-
ship between diversity and stability, while studies simulating species losses based on real
food webs [Dunne 02b, Dunne 04, Dunne 09] find greater species connectance (more edges)
associated with robustness to species deletions. These empirical results echo MacArthur’s
earlier theoretical work. However the disparity between diversity-stability conclusions of the
growing weight of empirical results versus theoretical predictions from dynamical systems
models has been a source of creative tension in the field which is succinctly stated in Dell et
increase in stability of a community as the number of links in its food web increases.'
14 May's original conclusions were based on randomly connected food webs, and subsequent work on ecological models that more closely matched the connection structure of real food webs [De Angelis 75, Hastings 84, Yodzis 81] had greater stability to perturbations around the equilibrium point. Hastings in particular contrasts two different stability concepts from the dynamical systems perspective, 'Lyapunov stability' (stability to perturbations in the neighbourhood of an equilibrium point, the stability concept May used) against 'structural stability' (stability against perturbations in the system parameters) ([Hastings 84]:pg. 172), which may be antagonistic so that highly connected systems that are Lyapunov stable are structurally unstable, and notes anecdotally that the conflicting requirements between these two forms of stability may be implicated in power blackouts and immune response ([Hastings 84]:pg. 176).
al. [Dell 05]:pp. 425-42615.
‘This disparity between real patterns and those predicted by theory has been one of the
most pressing issues facing ecologists for the past few decades. If the mechanisms driving
trophic dynamics of natural communities are to be understood, this paradox needs to be
resolved and a robust theoretical framework needs to be developed that adequately explains
the persistence of complex food webs in a way that is consistent with high quality empirical
data.’
Information theory applied to ecological flow networks spans the complete history of
this debate. In the 1950s MacArthur [MacArthur 55] associated stability with informa-
tion capacities and the amount of choice in pathways via which energy16 flows through
an ecosystem as represented by its food web. In the 1970s Rutledge et al. [Rutledge 76]
extended MacArthur’s stability measures to ecological flow networks, and incorporated
throughput, the total amount of energy travelling through the system. This paper also
introduced average mutual information as a measure of ecosystem organization. From
the 1980s onwards Ulanowicz extended these earlier ideas to focus on mutual informa-
tion scaled by throughput in ecological flow networks, identifying ecosystem stability as
a balance between constraints on energy pathways, and the existence of alternate pathways
[Ulanowicz 97, Ulanowicz 04, Ulanowicz 09a]17. Constraint of much of the energy throughput
to particular pathways allowed energy to flow efficiently through a system, while alternative
pathways allowed for resilience if the main pathways were somehow blocked. Ulanowicz’s
conception extended MacArthur’s original insights by emphasizing a balance between both
choice in alternative paths (akin to cut-stability), and constraints when long paths or cy-
cles develop in ecosystems (akin to connection-stability)18. Throughout the history of the
15 While Dell et al. [Dell 05] emphasize the empirical evidence, Dunne et al. [Dunne 05] provide a complementary perspective on refining dynamical models to better match data.
16 In ecosystems energy flows are approximated by carbon flows.
17 Ulanowicz's information theoretic metrics are derived in Chapter 3.
18 MacArthur's information based stability measures [MacArthur 55] in turn extended and mathematized
diversity-stability debate, the information theory approach has emphasized the topology of
a network of ecological relationships, as a reflection of ecosystem energetics. This is clear
in MacArthur’s original paper that both originates the information theory approach, and
historically grounds the diversity-stability debate [MacArthur 55]:pg. 534.
‘This stability can arise in two ways. First it can be due to patterns of interaction
between the species forming the community; second it can be intrinsic to the
individual species. While the second is a problem requiring knowledge of the physiology
of the particular species, the first can at least be partially understood in the general case.’
Cut-stability and connection-stability trim the extreme edges of the diversity-stability
debate by showing that there are two antagonistic forms of topological stability that need
to be considered. An ecosystem with very few interactions, say a food chain, can be bro-
ken due to any disruption of the chain (it is not cut-stable). An ecosystem with a large
number of species as alternative food sources, can be disrupted by the rapid dissemination
of a toxin or disease that can be transferred across species (it is not connection-stable).
Any real ecosystem is subject to a wide range of perturbations, which implies that it must
balance between cut-stability and connection-stability, having intermediate levels of both19.
Similarly, the information theoretic perspective on ecosystem stability balances between two
tendencies, for ecosystems to be organized along certain major paths for energy flow (similar
to connection-stability) while retaining sufficient alternate paths to deal with disruptions
Lindeman’s diagrammatic discussion of energy pathways in an ecosystem [Lindeman 42]. Worster, in ‘Na-tures Economy’, a history of ecology, emphasizes Lindeman’s focus on energetics of ecosystems as pivotal inthe birth of the New Ecology (essentially modern ecology) [Worster 77]:pg. 306, leading directly to a moremathematical and abstract theoretical framework for ecology as well as enduring analogies between ecologicalenergetics and economics [Worster 77]pg. 311. Ulanowicz [Ulanowicz 09b]pp. 4-7 and 80-89, provides a briefhistory of the lineage of ideas forward from Lindeman to his own focus on mutual information as an indi-cator of the organization of ecosystems. Ulanowicz’s ideas, in turn, have been extended outside ecosystemsto apply to the analysis of municipal water networks [Bodini 02], supply chain networks [Battini 07], andeconomic sustainability [Goerner 09].
19 How would our network designer view the diversity-stability debate from the perspective of his field? Itis essentially a debate about what constitutes a robust network design, given some reasonable assumptionsabout the agents that will be passing messages through the system and their dynamics.
to ecosystem structure (similar to cut-stability). Recently, Allesina and Tang [Allesina 12]
bring together dynamical and topological perspectives to show from within the dynamical
systems perspective pioneered by May that stability criteria are possible such that diver-
sity does not beget instability. These criteria are tied to applying more realistic network
topologies. These results echo an earlier line of investigation by Yodzis, that sought to tie
the results elucidated in May’s exercise on model ecosystems to the actual topology of real
ecosystems [Yodzis 80, Yodzis 81, Yodzis 82].
Can we relate our topological measures of stability to information theoretic metrics that
have been associated with stability in ecosystems? There are several benefits if we can
do so. First, we place stability concepts developed in ecology in the broader context of
topological stability in networks. Second, if a relationship is found, it allows us to use
information metrics as indicators of topological stability. Finally, it broadens the diversity-
stability debate so it can be applied to networks outside of ecology. Can we mathematically
relate mutual information directly to our measures of topological stability20?
In the sections that follow, we develop the mathematical relationships between mutual
information applied to a network and our measures of topological stability. We begin by
first reviewing some terminology we will be using. We then note that information theoretic
measures have been applied to graphs in several fields and that there is a relationship be-
tween the adjacency matrix representation of a graph, and the data required for a mutual
information calculation, so that every adjacency matrix for a graph can be used to calculate
its mutual information, and every calculation of mutual information for discrete probability
distributions can be represented as a graph. We then proceed to clarify the relationship be-
tween the properties of an adjacency matrix and the associated data table used to calculate
mutual information on a graph, and the properties of our topological stability measures: cut-
stability and connection-stability. Finally, we leverage the properties of mutual information
20 If we are successful in mathematically relating mutual information to our topological stability concepts, we create an additional metric for network stability that can be utilized by our network architect to test the performance of his system under message loads.
to develop a concept of ‘balanced stability’.
4.6 Directed Graphs and Mutual Information
To recap from Chapter 3, I(X, Y ) is the average mutual information between two types of
events, X and Y . It measures the dependency between two kinds of events. If X and Y are
independent, I(X, Y ) = 0. If X and Y are dependent, the average mutual information is
bounded by the information capacities associated with the two event types,
I(X, Y) ≤ min(C(X), C(Y)), where C(X) and C(Y) are the information capacities
associated with the two event types.
In terms of information capacities the mutual information can be expressed as:
I(X, Y ) = C(Y ) + C(X) − C(X, Y ) where C(X, Y ) is the joint information capacity of
X and Y .
In terms of the associated event probabilities, where xi ∈ X and yj ∈ Y , the mutual
information can be expressed as:
I(X, Y) = ΣiΣj p(xi, yj) log [p(xi, yj) / (p(xi) × p(yj))].
In terms of either an adjacency or a flow matrix, p(xi, yj) are calculated from the ma-
trix cell values, while p(xi) and p(yj) are calculated from matrix row and column sums,
respectively.
Let V (G) be the vertex set of a graph, G. Let E(G) be the edge set of a graph, G.
Unless otherwise indicated, we are now referring to directed graphs rather than undirected
graphs and will now refer to directed graphs simply as G, rather than subscripted as GD.
Let X be those events where in a directed graph, G, a vertex xi ∈ X is the source of an
edge. Let Y be those events where in a directed graph G, a vertex yj ∈ Y is the destination
of an edge. Since both types of event are defined upon the edges of G, and there are at most
|V (G)| edge sources (X) and edge destinations (Y ),
X, Y ⊆ V(G) and |X|, |Y| ≤ |V(G)|.
Following Dehmer and Mowshowitz [Dehmer 11] (Section 2.2) we distinguish between
information measures calculated on edges versus those calculated on vertices of a graph21.
Let IE(G) be the average mutual information calculated on the edges of a graph.
IE(G) = I(X, Y ).
The average mutual information calculated on the edges of a graph is bounded by the
information capacities associated with the two event types X, Y . These information ca-
pacities are maximum when any vertex is equally probable as an edge source or as an edge
destination. In this case the information capacity is the logarithm of the number of equally
probable events, i.e. the vertices that can be edge sources or destinations.
IE(G) ≤ min(C(X), C(Y)) and Cmax(X) = Cmax(Y) = log |V(G)|, so
IE(G) ≤ log |V(G)|.
IE(G) is (a) upper bounded by log |V(G)|, (b) never negative, and (c) 0 if X
and Y are independent [Renyi 87]:pg. 24, leading to clear bounds on the mutual information
associated with any directed graph of |V (G)| vertices,
0 ≤ IE(G) ≤ log |V (G)|.
To an ecologist studying food webs, the event types X and Y have specific biological
meanings. X represents those species that are being eaten by other species. Y represents
those species who are eating other species. Imagine a repeated experiment where we ran-
domly select an individual organism, note its species and further note what species it is eaten
21 An example of an information measure calculated on the vertices of the graph would be the information capacity calculated on the degree distribution of a graph. For an undirected graph, let there be k classes of vertices of different degree. Let CV(G) be the information capacity of the vertices. Let pk be the probability of the kth vertex class. Then, CV(G) = Σk pk log(1/pk).
by (or the species it eats). If, selecting an individual of prey species, xi ∈ X (or of predator
species, yj ∈ Y ) fully determines the predator species who eats it (or the prey species who is
eaten), IE(G) = log |V (G)|. If, selecting an individual of prey species, xi ∈ X (or of predator
species, yj ∈ Y ) does not reduce our uncertainty as to the predator species who eats it (or
the prey species who is eaten), IE(G) = 0.
The essential idea is that every directed graph (representing food webs) and every di-
rected graph with positive real valued edges (representing flow networks) can be the basis for
a calculation of mutual information. These graphs are represented respectively by adjacency
matrices and flow matrices, and the probabilities required to determine the average mutual
information are calculated from the cell values; each cell value and its associated row and
column sums defines a term in the corresponding mutual information calculation. It is this
ability to go back and forth between the adjacency or flow matrix representations and the
mutual information calculation that leads to every directed graph having an associated mu-
tual information, and every mutual information calculation on discrete data being associated
with a corresponding directed graph.22
An adjacency matrix representation of a directed graph can be seen as a row by column
data table where rows represent the source vertices for edges and where columns represent
the destination vertices for edges [van Steen 10]:pp. 31. In the case of an adjacency matrix
a cell with a 1 indicates a directed edge, from a source vertex (row) to a destination vertex
(column). In the case of a flow matrix, the edges, rather than being represented by 1s, are
represented by positive real values. An example of a flow matrix in tabular form is in Chapter
3. Introductions to information theory often display a joint distribution table as the basis
for a mutual information calculation where the rows represent data about the probabilities
22 This relationship between graphs and mutual information is leveraged in studies of flow networks [Zorach 03], food webs and other directed graphs occurring in technology [Bersier 02, Sole 04], and has been surveyed across a range of fields from biology, chemistry and sociology [Dehmer 11]. Recent work by Bianconi has taken a statistical mechanics approach to calculate the information capacity of network ensembles satisfying particular structural constraints (usually constraints on the degree sequence of vertices) [Anand 09, Bianconi 07, Bianconi 09a, Bianconi 09b].
for events xi of an event type X and the columns data about events yj of an event type Y .
These tables have one additional row and column which sum the interior row and column
values, respectively.23
Figure 4.1: MacArthur’s Food Web
        y1=A   y2=B   y3=C   y4=D
x1=A                  1      1
x2=B                  1
x3=C                         1
x4=D

Table 4.1: Adjacency Matrix for MacArthur's Food Web
Figure 4.1 illustrates the simple food web used by MacArthur ([MacArthur 55]:Figure 1,
pg. 533). Table 4.1 is the adjacency matrix for this food web, and Table 4.2 illustrates how
it can be used as the basis of a mutual information calculation. In both tables, only entries
with non-zero values are shown.
23 Example: see [MacKay 03]:pgs. 140, 147; though in this case, the xi are columns and the yj are rows.

       y1     y2     y3              y4              p(xi)
x1                   p(x1,y3)=1/4    p(x1,y4)=1/4    p(x1)=2/4
x2                   p(x2,y3)=1/4                    p(x2)=1/4
x3                                   p(x3,y4)=1/4    p(x3)=1/4
x4
p(yj)                p(y3)=2/4       p(y4)=2/4

Table 4.2: Mutual Information Calculation for MacArthur's Food Web

The calculation of average mutual information,

IE(G) = ΣiΣj p(xi, yj) log [p(xi, yj) / (p(xi) × p(yj))],
can be related to the terms in Table 4.2. The joint probability p(xi, yj) is the cell value
from the adjacency matrix (Table 4.1) divided by the number of edges: where the adjacency
matrix cell value is 1, p(xi, yj) = 1/|E(G)|, otherwise 0. Let the row total Ti. be the sum of
the entries in row i; then p(xi) = Ti./|E(G)| is the marginal probability for xi. Similarly, let
the column total T.j be the sum of the entries in column j; then p(yj) = T.j/|E(G)| is the
marginal probability for yj.
Thus,

p(xi, yj) / (p(xi) × p(yj)) = (1/|E(G)|) / [(Ti./|E(G)|) × (T.j/|E(G)|)] = |E(G)| / (Ti. × T.j),

so summing over cells in the adjacency matrix with 1 entries,

IE(G) = ΣiΣj (1/|E(G)|) log [|E(G)| / (Ti. × T.j)].
For MacArthur's food web in Figure 4.1,

IE(G) = (1/4) log [4/(2×2)] + (1/4) log [4/(2×2)] + (1/4) log [4/(1×2)] + (1/4) log [4/(1×2)]
      = 0 + 0 + (1/4) log 2 + (1/4) log 2 = (1/2) log 2 = 1/2.

Since the maximum value for the average mutual information is log |V(G)|, we can express
the relative constraint as

IE(G) / log |V(G)| = (1/2) / log 4 = 1/4.
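As a cross-check of this calculation (base-2 logarithms; the variable names are ours, introduced only for this sketch), the same numbers fall out of the adjacency matrix directly:

```python
import math

# Adjacency matrix for MacArthur's food web (Table 4.1):
# rows = edge sources (A..D), columns = edge destinations (A..D).
web = [
    [0, 0, 1, 1],  # A -> C, A -> D
    [0, 0, 1, 0],  # B -> C
    [0, 0, 0, 1],  # C -> D
    [0, 0, 0, 0],  # D eats nothing
]

m = sum(map(sum, web))              # |E(G)| = 4
rows = [sum(r) for r in web]        # T_i. (row sums)
cols = [sum(c) for c in zip(*web)]  # T_.j (column sums)

ie = sum((1 / m) * math.log2(m / (rows[i] * cols[j]))
         for i in range(4) for j in range(4) if web[i][j])

print(ie)                  # 0.5
print(ie / math.log2(4))   # relative constraint: 0.25
```

The two non-zero terms come from the edges B→C and C→D, whose source vertices have out-degree 1, exactly as in the hand calculation above.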
Figure 4.2 illustrates the modified food web MacArthur derived, while Table 4.3 provides its
adjacency matrix and Table 4.4 illustrates the mutual information calculation. MacArthur's
modified food web is based on the biologically reasonable assumption that the energy leaving
the ecosystem equals that coming into it. This idea is captured by vertex E and the six grey
        y1=A   y2=B   y3=C   y4=D   y5=E
x1=A                  1      1      1
x2=B                  1             1
x3=C                         1      1
x4=D                                1
x5=E    1      1

Table 4.3: Adjacency Matrix for MacArthur's Modified Food Web
       y1             y2             y3             y4             y5             p(xi)
x1                                   p(x1,y3)=1/10  p(x1,y4)=1/10  p(x1,y5)=1/10  p(x1)=3/10
x2                                   p(x2,y3)=1/10                 p(x2,y5)=1/10  p(x2)=2/10
x3                                                  p(x3,y4)=1/10  p(x3,y5)=1/10  p(x3)=2/10
x4                                                                 p(x4,y5)=1/10  p(x4)=1/10
x5     p(x5,y1)=1/10  p(x5,y2)=1/10                                               p(x5)=2/10
p(yj)  p(y1)=1/10     p(y2)=1/10     p(y3)=2/10     p(y4)=2/10     p(y5)=4/10

Table 4.4: Mutual Information Calculation for MacArthur's Modified Food Web
Figure 4.2: MacArthur’s Modified Food Web
directed edges going into and emanating from it. Every terminal vertex is given an edge into
E, and every initial vertex is given an edge from E.
The calculation of average mutual information for MacArthur's modified food web is

IE(G) = (1/10) log [10/(3×2)] + (1/10) log [10/(3×2)] + (1/10) log [10/(3×4)]
      + (1/10) log [10/(2×2)] + (1/10) log [10/(2×4)] + (1/10) log [10/(2×2)]
      + (1/10) log [10/(2×4)] + (1/10) log [10/(1×4)] + (1/10) log [10/(2×1)]
      + (1/10) log [10/(2×1)] ≈ 1.046.

Again the relative constraint can be expressed as

IE(G) / log |V(G)| ≈ 1.046 / log 5 ≈ 0.451.
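The modified web can be checked the same way (base-2 logarithms; variable names are ours):

```python
import math

# Adjacency matrix for MacArthur's modified food web (Table 4.3),
# vertices ordered A, B, C, D, E.
web = [
    [0, 0, 1, 1, 1],  # A -> C, D, E
    [0, 0, 1, 0, 1],  # B -> C, E
    [0, 0, 0, 1, 1],  # C -> D, E
    [0, 0, 0, 0, 1],  # D -> E
    [1, 1, 0, 0, 0],  # E -> A, B
]

m = sum(map(sum, web))              # 10 edges
rows = [sum(r) for r in web]        # out-degrees
cols = [sum(c) for c in zip(*web)]  # in-degrees
ie = sum((1 / m) * math.log2(m / (rows[i] * cols[j]))
         for i in range(5) for j in range(5) if web[i][j])

print(round(ie, 3))                  # 1.046
print(round(ie / math.log2(5), 3))   # relative constraint: 0.451
```

Note the relative constraint roughly doubles (0.25 to 0.451) even though the graph has grown, because the recycling edges through E make sources much more predictive of destinations.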
This modified graph is now strongly connected, every vertex is reachable from every other
vertex by a directed path. In a strongly connected ecological network, energy can recycle.
For example, there is a trail in the graph (B,C,D,E,A,E,B) that forms a directed circuit24
via which energy (from biomass transfers) can recycle.
24 Following [Chartrand 77]:pp. 41-42, a walk is an alternating sequence of vertices and edges, where each edge joins the vertex immediately preceding and following it. A trail is a walk with no repeated edges.
Note that in a strongly connected graph, there is at least one entry for every row and
column in the adjacency matrix or flow matrix. If this property did not exist, there would
be some vertex that either connects to no other vertices, or is not connected to any other
vertex.
Fact 1. In a strongly connected graph, there is at least one entry for every row and column
in the corresponding adjacency matrix.
Proof. Since every vertex is reachable from every other vertex by a direct path, each vertex
must have at least one incoming edge (a column entry in the adjacency matrix), and have
at least one outgoing edge (a row entry in the adjacency matrix).
This property is necessary in the adjacency matrix of any strongly connected graph, but
is not sufficient to prove connectivity. For example, if all edges were self-loops, this property
would hold, though the underlying graph is disconnected. If every row and column of the
adjacency matrix has exactly one entry, the corresponding graph is composed of one or more
components, each of which is a directed cycle. If there is only a single component, then the
whole graph is a directed cycle, and therefore also strongly connected. At the other extreme,
if the number of components equals the number of vertices, each component is a self-loop.
These properties can be captured in the notion of a cycle cover.
Definition 5. For a directed graph, G, a cycle cover is a set of vertex disjoint cycles, where
each vertex belongs to exactly one cycle [Kleinberg 06]:pg. 528.
If we consider a self-loop a kind of trivial cycle, then there are three kinds of cycle covers
we could expect, (a) a single directed cycle, (b) a set of components each of which is a
directed cycle, and (c) a set of self-loops. In the first case, the whole graph is minimally
strongly connected; in the second case, each component is minimally strongly connected; in
the third case, each self loop could be considered minimally strongly connected, again in a
A path is a trail with no repeated vertices. A circuit is a trail that begins and ends at the same vertex. Acycle is a circuit that does not repeat any vertices except the first and last. The trail (B,C,D,E,A,E,B)repeats the vertex E, so it is not a directed cycle.
78
trivial sense, since it is the same vertex on the incoming and outgoing edge, rather than
distinct vertices.
Lemma 1. A directed graph, G, is composed solely of a cycle cover (with no additional
vertices or edges), if and only if, it has exactly one entry for each row and column of its
corresponding adjacency matrix.
Proof. In a directed graph, G composed solely of a cycle cover, every vertex belongs to
one cycle only. Therefore every vertex has exactly one incoming and outgoing edge, and
correspondingly is associated with only a single column and row entry. In a directed graph,
G, where there is only a single entry for each row and column of the adjacency matrix, the
underlying graph must be a cycle cover. If any vertex were the member of more than one
cycle, it would have more than one incoming or outgoing edge (and thus more than one
column or row entry). If a vertex was a root, it would have no incoming edge (and thus one
column would have no entry). If any vertex was terminal, it would have no outgoing edge
(and thus one row would have no entry.)
While the property of exactly one entry in every row and column of the adjacency matrix
is not sufficient to prove a graph is strongly connected, it demonstrates that the components
of the graph are strongly connected, even in the trivial case of a graph consisting only of
self-loops.
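Lemma 1 gives a mechanical test: the graph consists solely of a cycle cover exactly when its adjacency matrix is a permutation matrix. A minimal sketch (the function name is ours):

```python
def is_cycle_cover(adj):
    """True iff the directed graph given by 0/1 matrix `adj` consists solely
    of a cycle cover: exactly one entry per row and per column (Lemma 1)."""
    return (all(sum(row) == 1 for row in adj) and
            all(sum(col) == 1 for col in zip(*adj)))

# A single 3-cycle: A -> B -> C -> A.
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
# The same graph plus a chord A -> C: no longer just a cycle cover.
chord = [[0, 1, 1], [0, 0, 1], [1, 0, 0]]
# Three self-loops: a (trivial) cycle cover.
loops = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

print(is_cycle_cover(cycle), is_cycle_cover(chord), is_cycle_cover(loops))
# True False True
```

As the text notes, the test certifies that every component is strongly connected, but not that the whole graph is: the self-loop example passes while being disconnected.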
4.7 Mutual Information and Topological Stability
4.7.1 Roadmap to Our Argument
We now proceed to examining the relationship between mutual information and topological
stability. We first introduce the idea of a mutualistic property, which complements the
concept of antagonistic properties25 developed earlier in this chapter. We then proceed to
25 Our use of the terms mutualistic and antagonistic is inspired by the ecological concepts of mutualism and antagonism, the first being the case where two entities mutually benefit each other and the latter being the case where benefit to one entity is a detriment to the other.
demonstrate the conditions under which the average mutual information of a graph IE(G)
can be related to our measures of cut-stability, Sk(G), and connection-stability, Sc(G).
Definition 6. Two properties, A and B, of an object are considered mutualistic (or 'mutually
related') if, for a repeated operation o (excluding the identity operation) conducted i times on
the object, (a) there exists at least one instance where the value of each property changes, and
(b) when one property monotonically increases (or decreases) in value, the other property
monotonically increases (or decreases, respectively) in value, such that:

If: A(o0) ≥ A(o1) ≥ A(o2) ≥ A(o3) ≥ · · · ≥ A(oi),
Then: B(o0) ≥ B(o1) ≥ B(o2) ≥ B(o3) ≥ · · · ≥ B(oi).

If: A(o0) ≤ A(o1) ≤ A(o2) ≤ A(o3) ≤ · · · ≤ A(oi),
Then: B(o0) ≤ B(o1) ≤ B(o2) ≤ B(o3) ≤ · · · ≤ B(oi).
Our essential idea is that in a strongly connected directed graph, the average mutual
information of the graph will generally be antagonistic with cut-stability and mutualistic
with connection stability. We will build up to this result in small steps, by establishing:
1. Our original proof of antagonism for cut-stability and connection-stability
based on undirected graphs can be extended to directed graphs. Once that
is established, we will then relate our topological stability definitions to the
average mutual information via the following steps.
2. Strongly connected directed graphs have at least one entry for every row and
column in their adjacency matrix (demonstrated above).
3. Directed graphs with exactly one entry for every row and column in their
adjacency matrix consist solely of a cycle cover (demonstrated above).
4. Directed graphs consisting solely of a cycle cover are at their maximum bound
for the average mutual information.
5. For directed graphs consisting solely of a cycle cover, the first additional edge
must reduce the value of the average mutual information, and thus be antago-
nistic with cut-stability, and conversely mutualistic with connection stability.
6. It is possible to identify conditions under which a strongly connected directed
graph would monotonically decline in average mutual information upon edge
addition.
7. A construction exists, based on layering cycle covers, that represents an upper
bound on the decline of average mutual information as edges are added to a
strongly connected graph.
To build up some intuition around these ideas, we will also go through a graph con-
struction exercise that results in a directed graph with exactly one entry for every row and
column, and analyze the contribution individual terms make.
4.7.2 Cut-Stability and Connection-Stability in Strongly Connected Graphs
First off, let us extend our proof of antagonism from connected undirected graphs, to strongly
connected directed graphs.
Theorem 2. In a strongly connected directed graph, G, Sk(G) and Sc(G) are antagonistic
under the operations of adding or deleting edges.
Proof. Cut-stability, Sk(G) is a property of the size of minimum vertex cover, |MVC(G)|.
Since cutting a vertex removes all its associated edges, the directionality of edges does not
affect |MVC(G)|. In the case of connection-stability, Sc(G), edge directionality does matter.
However, if an edge is added that increases Sk(G), it will either decrease Sc(G) or have no
effect, since the graph is already strongly connected. All that the addition of an edge can
achieve with respect to Sc(G) is to lower the value of T, the time to flood.
This establishes the extension of antagonism from connected undirected graphs to strongly
connected directed graphs. Next, we must establish that a directed graph, consisting solely
of a cycle cover is at its maximum bound for average mutual information.
Lemma 2. A directed graph, G, consisting solely of a cycle cover is at its maximum bound
for mutual information, such that IE(G) = log|V (G)|.
Proof. A directed graph consisting solely of a cycle cover has the same number of edges as
vertices: each vertex is in one cycle only, and thus has exactly one incoming and one outgoing
edge, so the degree of each vertex is 2. Since for any graph the sum of the degrees is twice
the number of edges (2|E|) ([Chartrand 77]:pg. 28), the sum of degrees is 2|V(G)| = 2|E(G)|,
and the number of vertices equals the number of edges,

|V(G)| = |E(G)|.
The adjacency matrix value for each edge, as well as the row and column sums associated
with each edge, are all 1. Each edge contributes

(1/|E(G)|) log [ (1/|E(G)|) / ((1/|E(G)|) × (1/|E(G)|)) ] = (1/|E(G)|) log |E(G)|

to the average mutual information. There are |E(G)| edges, so the average mutual
information summation is

|E(G)| × (1/|E(G)|) log |E(G)| = log |E(G)| = log |V(G)|.
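Lemma 2 can be checked numerically for small graphs (helper name ours; base-2 logarithms, matching the worked examples above). Both kinds of cycle cover reach the bound log |V(G)|:

```python
import math

def ami(adj):
    """Average mutual information I_E(G) of a 0/1 adjacency matrix."""
    m = sum(map(sum, adj))
    rows = [sum(r) for r in adj]
    cols = [sum(c) for c in zip(*adj)]
    n = len(adj)
    return sum((1 / m) * math.log2(m / (rows[i] * cols[j]))
               for i in range(n) for j in range(n) if adj[i][j])

n = 6
# A directed cycle through all n vertices: one kind of cycle cover.
cycle = [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]
# Two disjoint 3-cycles on the same 6 vertices: another cycle cover.
two_cycles = [[0] * n for _ in range(n)]
for i, j in [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3)]:
    two_cycles[i][j] = 1

print(abs(ami(cycle) - math.log2(n)) < 1e-9,
      abs(ami(two_cycles) - math.log2(n)) < 1e-9)
# True True
```

Note the second graph is not strongly connected as a whole, yet still attains the bound, which is why the cycle cover, not strong connectivity, is the operative condition.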
To understand the conditions required for cut-stability and the average mutual infor-
mation to be antagonistic, let us first examine a constructive example where they are not
initially antagonistic. We will examine a construction that results in a directed cycle through
all vertices of a graph (which is both a cycle cover and a minimal strongly connected graph).
Via this construction we can examine the relationship between topological stability and mu-
tual information past that point. Our construction proceeds on the adjacency matrix, as
follows.
1. Begin with an empty graph (no edges).
2. Randomly add a directed edge.
3. Keep adding random directed edges with the following constraints: (a) the
new edge results in a subgraph that is minimal strongly connected, (b) the
new edge uses a column and row that have no previous entries.
4. Stop when (b) is no longer possible. The result is a directed cycle through all
vertices of the graph.
To simplify our notation a little, we will designate the number of edges in a graph, |E(G)|,
by m. So, as we add an edge to a graph, the new count is designated by m + 1. As each edge
is added in the construction of this directed cycle, the contributions of the m previous edges
to the mutual information summation are modified. Since, due to our construction rules, all
edges have the same cell value and the same row and column sums, we can determine their
contributions to the mutual information summation by a formula.
There are m edges whose contribution is modified. The original contribution of each of
these edges, prior to addition of the (m+1)th edge, is (1/m) log(m). The modified contribution
for each of these edges after the addition of the (m+1)th edge is (1/(m+1)) log(m+1). The
contribution of the new edge is (1/(m+1)) log(m+1). Adding new edges up to a directed
cycle on the graph increases mutual information, because the contribution of the new edge
is greater than the accumulated reductions in contribution from the existing edges. This
relationship is captured in the following inequality,

m [(1/m) log(m) − (1/(m+1)) log(m+1)] < (1/(m+1)) log(m+1).

The left hand side of the inequality is the total reduction in contribution for existing edges.
The right hand side is the contribution of the new edge. With some algebraic manipulation
the inequality reduces to

log(m) < log(m+1),

so the net additional contribution of each new edge to the mutual information summation is

log((m+1)/m).
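The construction, and the log((m+1)/m) increments, can be replayed numerically (a sketch reusing our own `ami` helper; base-2 logarithms):

```python
import math

def ami(adj):
    """Average mutual information I_E(G) of a 0/1 adjacency matrix."""
    m = sum(map(sum, adj))
    rows = [sum(r) for r in adj]
    cols = [sum(c) for c in zip(*adj)]
    n = len(adj)
    return sum((1 / m) * math.log2(m / (rows[i] * cols[j]))
               for i in range(n) for j in range(n) if adj[i][j])

n = 8
adj = [[0] * n for _ in range(n)]
prev = 0.0
for m in range(1, n):        # grow a directed path 0 -> 1 -> ... -> n-1
    adj[m - 1][m] = 1        # each new edge uses a fresh row and column
    cur = ami(adj)
    # With m such edges, I_E = log2(m); the new edge added log2(m/(m-1)).
    assert abs(cur - math.log2(m)) < 1e-9
    if m > 1:
        assert abs((cur - prev) - math.log2(m / (m - 1))) < 1e-9
    prev = cur
adj[n - 1][0] = 1            # close the directed cycle: m = n edges
assert abs(ami(adj) - math.log2(n)) < 1e-9
print("I_E grows as log(m) up to the full directed cycle")
```

Each added edge satisfies constraint (b) of the construction, so every row and column sum stays at 1 and the summation collapses to log(m), exactly the progression derived above.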
What if we added one new edge past the creation of the directed cycle through all vertices?
The addition of a new edge would increase cut stability. Would it continue to increase the
mutual information?
It is impossible for this single additional edge (whichever edge we choose to add) to
increase the average mutual information of the graph, because the relationship

I(X, Y) ≤ min(C(X), C(Y))

places a clear upper bound on the average mutual information. In this case, the maximum
possible value for the mutual information is

C(X) = C(Y) = log |V(G)|.

Therefore, the addition of any further edge cannot increase the mutual information.26
Lemma 3. For directed graphs consisting solely of a cycle cover, the addition of a single
edge must reduce the value of the average mutual information, or leave it unchanged.
Proof. By Lemma 2, the average mutual information is already at its upper bound in a
directed graph composed solely of a cycle cover. Any additional edge subsequent to the
formation of the cycle cover must therefore result in a total contribution (the contribution
of the new edge, plus reductions in the contribution of existing edges) that is non-positive,
and thus cannot raise the average mutual information above its upper bound.
4.7.3 Monotonicity Conditions
For the average mutual information to be antagonistic with cut-stability and conversely
mutualistic with connection stability, we need to understand the conditions under which a
26 While our construction was specific to the creation of a directed cycle through all the vertices, the decrease in mutual information past a cycle cover applies without loss of generality, since any graph consisting solely of a cycle cover will be at its upper bound for average mutual information.
sequence of edge additions following a cycle cover will lead to monotonic decrease in the
average mutual information. We first need a formulation that allows us to reason about
changes in the mutual information summation upon addition of edges past a cycle cover27.
We then need to identify conditions in the formulation that are a 'worst case', that is,
that contribute as much to the mutual information summation as possible. If, under this
worst case, the mutual information summation still decreases as edges are added, then a
construction exists that monotonically decreases, and that decreases more slowly (an upper
bound) than other possible edge addition sequences, which may not decrease monotonically.
These considerations lead us to construct a formula that tracks changes in the mutual
information summation. Upon addition of an edge there is an addition to: (a) the total
number of edges, (b) the row sum for the row in which the new edge is placed and (c) the
column sum for the column in which the new edge is placed.
As before, we will delineate the number of edges in a graph |E(G)| by m. We note
that the row sum Ti. is just the out-degree of a vertex vi, and the column sum T.j is just
the in-degree of a vertex vj. We will simplify our notation further to emphasize how our
calculations relate to graphs. Let ri be the outdegree of vertex vi and sj the indegree of
vertex vj.
Upon edge addition the number of edges increases by one, the out-degree of vertex vi
increases by one, and the in-degree of vertex vj increases by one:

1. m → m + 1. Let δ = m/(m+1).
2. ri → ri + 1. Let δi = ri/(ri+1).
3. sj → sj + 1. Let δj = sj/(sj+1).

27 The development of this formulation and identification of monotonicity conditions for decrease in average mutual information upon edge addition is joint work with Peter Hoyer.

Note that 1 ≥ δ ≥ δi, δj.
We can consider δ, δi and δj as multipliers that can be applied to the standard summation
for average mutual information to incorporate how that summation would change after the
addition of an edge. The specifics of how the summation would change, depends on where
the edge is added.
Let Gij be a multiplier based on δ, δi and δj. The specific value Gij takes depends on
whether it is applied to the new summation term created by the added edge, or to previously
existing summation terms for the existing edges. Let inew be the source vertex for the new
edge, and jnew be the destination vertex for the new edge.

Set

Gij = δ               if i ≠ inew and j ≠ jnew,
Gij = δ/δi            if i = inew and j ≠ jnew,
Gij = δ/δj            if i ≠ inew and j = jnew,
Gij = δ/(δi × δj)     if i = inew and j = jnew.
Our existing expression for the average mutual information now becomes

IE(G) = ΣiΣj p(xi, yj) log [p(xi, yj) / (p(xi) × p(yj))] = ΣiΣj (1/|E(G)|) log [|E(G)| / (Ti. × T.j)]
      = ΣiΣj (1/m) log [m / (ri × sj)].
We are now able to develop expressions for the change in mutual information after an
edge addition. Let IE(G)old be the average mutual information summation prior to the
addition of a new edge, and IE(G)new be the modified average mutual information summation
after addition of a new edge. Let ri and sj be the out-degree and in-degree of vertices i and
j prior to edge addition, and r′i and s′j be the out-degree and in-degree of vertices i and j
after edge addition. Let r = ri for i = inew, and s = sj for j = jnew.

IE(G)old = (1/m) ΣiΣj log [m / (ri × sj)], and

IE(G)new = (1/(m+1)) ΣiΣj log [(m+1) / (r′i × s′j)]
         = (δ/m) ΣiΣj log [(1/Gij) × m / (ri × sj)] + (δ/m) log [(m+1) / ((r+1) × (s+1))]
         = δ IE(G)old − (δ/m) ΣiΣj log(Gij) + (δ/m) log [(m+1) / ((r+1) × (s+1))],

where the sums in the second and third lines run over the m previously existing edges, and
the final term is the contribution of the new edge.
For monotonic decrease we would require IE(G)new ≤ IE(G)old, which is equivalent to
the statement:

−ΣiΣj log(Gij) + log [(m+1) / ((r+1)(s+1))] ≤ IE(G)old.

Let us rewrite the LHS of the statement.

LHS = m log((m+1)/m) − r log((r+1)/r) − s log((s+1)/s) + log(m+1) − log(r+1) − log(s+1)
    = [(m+1) log(m+1) − m log(m)] − [(r+1) log(r+1) − r log(r)] − [(s+1) log(s+1) − s log(s)].
Let us rewrite the RHS of the statement.

RHS = IE(G)old = (1/m) ΣiΣj log [m / (ri × sj)] = log(m) − (1/m) ΣiΣj log(ri) − (1/m) ΣiΣj log(sj).

We can now rewrite the monotonicity condition IE(G)new ≤ IE(G)old as:

[(m+1) log(m+1) − (m+1) log(m)] + (1/m) ΣiΣj log(ri) + (1/m) ΣiΣj log(sj) ≤ f(r) + f(s),

where f(x) = (x+1) log(x+1) − x log(x).
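As a numerical sanity check of the condition (variable names ours; base-2 logarithms), consider adding a single chord to a directed 5-cycle. The condition holds, and a direct recomputation of IE(G) confirms it does not increase:

```python
import math

def ami(adj):
    """Average mutual information I_E(G) of a 0/1 adjacency matrix."""
    m = sum(map(sum, adj))
    rows = [sum(r) for r in adj]
    cols = [sum(c) for c in zip(*adj)]
    n = len(adj)
    return sum((1 / m) * math.log2(m / (rows[i] * cols[j]))
               for i in range(n) for j in range(n) if adj[i][j])

def f(x):
    # f(x) = (x+1) log(x+1) - x log(x), from the monotonicity condition.
    return (x + 1) * math.log2(x + 1) - x * math.log2(x)

n = 5
# Start from a directed cycle on n vertices (a cycle cover), m = n edges.
adj = [[1 if j == (i + 1) % n else 0 for j in range(n)] for i in range(n)]
m = n
i_new, j_new = 0, 2                    # candidate chord to add
r = sum(adj[i_new])                    # out-degree of i_new before adding
s = sum(row[j_new] for row in adj)     # in-degree of j_new before adding

edges = [(i, j) for i in range(n) for j in range(n) if adj[i][j]]
rows = [sum(row) for row in adj]
cols = [sum(col) for col in zip(*adj)]
lhs = ((m + 1) * math.log2(m + 1) - (m + 1) * math.log2(m)
       + (1 / m) * sum(math.log2(rows[i]) for i, j in edges)
       + (1 / m) * sum(math.log2(cols[j]) for i, j in edges))

old = ami(adj)
adj[i_new][j_new] = 1
new = ami(adj)
print(lhs <= f(r) + f(s), new <= old)   # True True: condition and decrease
```

Here all ri = sj = 1 before the addition, so the degree sums vanish and the condition reduces to (m+1) log((m+1)/m) ≤ f(1) + f(1), which holds comfortably.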
4.7.4 A Construction for Monotonic Decrease
The average mutual information of a directed graph need not monotonically decrease in all
conditions (all possible sequences of edge addition). However, it must eventually decrease un-
der any extended sequence of edge addition, as the complete graph has a mutual information
of 0.
Recall that the directed graph we are adding edges to already has a cycle cover. So,
having established a cycle cover, what are the worst case conditions for sequential edge
addition? Those conditions are to add edges in such a fashion that the difference between
δ (based on the total number of edges) and both δi (based on out-degree) and δj (based on
in-degree) is as large as possible. Since there is a cycle cover, every row and column sum
already has a value of 1. So, the worst case condition is met if we add edges such that every
row and column now has a sum of 2. That is, we add a second disjoint cycle cover. We can
then add a third cycle cover in this way. Since, after each disjoint cycle cover, every edge has
exactly the same row and column sums, it is easy to calculate how terms are reduced
after each cycle cover is added.
After each cycle cover, each term contributes (1/m) log [m / (ri × sj)]. In the original cycle
cover, m = n. In the second cycle cover, m = 2n; for the third cycle cover, m = 3n. Including
the first cycle cover, there are n cycle covers to create a complete graph with self-loops, whose
average mutual information is 0. After the first cycle cover, ri × sj = 1 × 1 = 1. After the
second cycle cover, ri × sj = 2 × 2 = 4. After the nth cycle cover, ri × sj = n × n = n².

The value for each term after completion of a cycle cover is given by the decreasing
progression,

(1/|V|) log(|V|/1), (1/(2|V|)) log(2|V|/4), (1/(3|V|)) log(3|V|/9), ..., (1/|V|²) log(|V|²/|V|²).
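The progression, and the resulting whole-graph values (summing kn identical terms after the kth cover gives IE(G) = log(n/k), a direct consequence of the per-term value; helper names ours), can be tabulated numerically:

```python
import math

# Term value after layering the k-th disjoint cycle cover on n vertices:
# m = k*n edges, every r_i = s_j = k, so each of the m identical terms is
# (1/(k*n)) * log2(k*n / k**2).
def term(k, n):
    return (1 / (k * n)) * math.log2(k * n / k ** 2)

n = 6
values = [term(k, n) for k in range(1, n + 1)]
assert all(a > b for a, b in zip(values, values[1:]))  # strictly decreasing
assert values[-1] == 0.0     # n-th cover: complete graph with self-loops

# Whole-graph AMI after k covers: (k*n) * term(k, n) = log2(n / k).
amis = [math.log2(n / k) for k in range(1, n + 1)]
print([round(v, 3) for v in amis])
# [2.585, 1.585, 1.0, 0.585, 0.263, 0.0]
```

The tabulated values show the monotone slide from the cycle-cover maximum log n down to 0 at the complete graph with self-loops, the upper-bound trajectory described in the text.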
Lemma 4. The mutual information summation for IE(G) must monotonically decrease after
a cycle cover.
Proof. The worst case scenario for edge addition past a cycle cover is the addition of edges
to construct another cycle cover, since the edges added contribute as much to the mutual
information summation as possible. However, after the addition of another cycle cover, each
term has a smaller value than it had prior to that cycle cover. Therefore, the summation
of average mutual information terms decreases monotonically under any sequence of edge
additions after a cycle cover.
Theorem 3. In a strongly connected directed graph, IE(G) and Sk(G) are antagonistic under
the operations of adding disjoint cycle covers. Conversely, IE(G) and Sc(G) are mutualistic
under the operations of adding disjoint cycle covers.
Proof. By Lemma 1, a directed graph composed solely of a cycle cover has exactly one entry
in every row and column of its adjacency matrix and a directed graph with exactly one entry
for every row and column in its adjacency matrix is a cycle cover.
By Lemma 3, the addition of a single edge in a graph beyond the creation of a cycle
cover will result in a decrease (or no change) in mutual information.
By Lemma 4, the average mutual information must monotonically decrease after each
addition of a cycle cover.
Since, by Theorem 2, additional edges will add to the cut-stability, Sk(G), of a strongly
connected directed graph, cut-stability and the average mutual information must be antag-
onistic in strongly connected directed graphs under the addition of cycle covers.
Since by Theorem 2, additional edges will decrease the connection stability, Sc(G), in a
strongly connected directed graph, connection-stability and the average mutual information
must be mutualistic under the addition of cycle covers28.
The addition of cycle covers can be seen as providing an upper bound for the monotonic
decrease of IE(G), such that any other edge addition sequence, whether it monotonically
declines, or occasionally increases and then declines, will always be below this upper bound.
We conjecture that the majority of edge addition sequences do monotonically decline, so that
with high probability, IE(G) and Sk(G) are antagonistic while IE(G) and Sc(G) are mutualistic
under edge addition or deletion.
The intuition behind the conjecture of the antagonism between the average mutual
information and cut-stability in strongly connected graphs (and the mutualism between
connection-stability and average mutual information) is quite simple. In a strongly connected
graph, there exists a directed path between every pair of vertices. Adding an edge will increase cut-stability
28 In this chapter we have stressed the relationship of cut and connection stability to mutual information, a measure of constraints in a network. The fact that the mutual information is bounded by the information capacity of the network allows us to also express a complementary relationship tied to a measure of the uncertainty that remains in the network, given the constraints. Let UE(G) = log|V(G)| − IE(G) be a measure of uncertainty. It then follows that, for a fixed |V(G)|, as IE(G) increases, UE(G) must necessarily decrease. Thus, UE(G) will have the opposite relationship to cut and connection stability as IE(G): UE(G)
will be mutualistic with cut-stability, Sk(G), and antagonistic with connection-stability, Sc(G). IE(G) and UE(G) are similar to Ulanowicz's ascendency and overhead measures respectively, which were reviewed in Chapter 3.
as another vertex may need to be added to the minimum vertex cover. However, the added
edge will also reduce the mutual information, as it creates a new redundant path between
two vertices. Conversely, as long as the strongly connected graph property is maintained,
losing an edge will increase the average mutual information via removing a redundant path.
It will also increase the connection stability by potentially increasing the time to flood.29
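This intuition can be checked numerically. The following Python sketch is an illustration of my own (the function and variable names are not from the thesis): it computes the average mutual information of a directed graph from its edge list, treating each edge as an equiprobable event with p(u, v) = 1/|E| and using the in- and out-degree marginals, with logarithms taken base 2 (bits). Adding a redundant chord to a directed cycle lowers IE(G), as the argument above predicts.

```python
import math

def avg_mutual_info(edges):
    """Average mutual information I_E(G) of a directed graph, computed
    from the joint edge distribution p(u, v) = 1/|E| and its marginals."""
    m = len(edges)
    out_deg, in_deg = {}, {}
    for u, v in edges:
        out_deg[u] = out_deg.get(u, 0) + 1
        in_deg[v] = in_deg.get(v, 0) + 1
    total = 0.0
    for u, v in edges:
        p_uv = 1.0 / m
        p_u = out_deg[u] / m   # marginal over rows (edge sources)
        p_v = in_deg[v] / m    # marginal over columns (edge targets)
        total += p_uv * math.log2(p_uv / (p_u * p_v))
    return total

# A directed 4-cycle achieves the maximum I_E(G) = log2(4) = 2 bits.
cycle = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(avg_mutual_info(cycle))            # 2.0

# Adding a redundant chord creates an alternate path and lowers I_E(G).
print(avg_mutual_info(cycle + [(0, 2)])) # < 2.0
```

The chord dilutes the constraints: the two endpoints of the new edge acquire larger marginals, so several terms shrink, illustrating the antagonism between edge addition and mutual information.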
Motivated by the diversity-stability debate in ecology, our ecologist has now been able to
extend the notions of cut-stability and connection-stability to mutual information measures
which (with other information measures) have a long history of application as stability indices
in this field. What would our network architect make of these results?
4.8 Balanced Stability
4.8.1 Visualizing Balanced Stability
Imagine our ecologist conveys her results to our network architect. How could he use her
information theory results in the context of his network designs30? Our network architect
does not a priori know what kinds of attacks his network might undergo. His best option
is to provide a balance between moderate levels of cut-stability and connection-stability. To
this end, he knows he can leverage the relationship between the mutual information of a
graph and its cut-stability and connection-stability.
The calculation of average mutual information can be seen as a summation of terms, with
one term for each edge. Let us call the value each edge contributes to the average mutual
information the edge-constraint, ecuv, so that for a given directed edge uv from vertex u to
vertex v,
29 While we focus on a network that is strongly connected, if the network has a cycle cover the same relationships should hold, even if the network is not strongly connected, as long as its components are. This is because the minimum condition beyond which the antagonism of average mutual information and cut-stability holds is the existence of the cycle cover.
30 One immediate gain our network architect receives is the ability to use information metrics which are easy to calculate to estimate topological stability measures which are more difficult to calculate in their graph theoretic form.
ecuv = p(xu, yv) log( p(xu, yv) / (p(xu) × p(yv)) )
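The per-edge terms are mechanical to compute. The short Python sketch below is my own illustration (names are not from the thesis; the log is base 2, so information is measured in bits): it returns the edge-constraint ecuv for every directed edge, with p(xu, yv) = 1/|E| and the degree-based marginals.

```python
import math

def edge_constraints(edges):
    """Per-edge terms ec_uv of the average mutual information, where
    p(x_u, y_v) = 1/|E|, p(x_u) = outdeg(u)/|E|, p(y_v) = indeg(v)/|E|."""
    m = len(edges)
    out_deg, in_deg = {}, {}
    for u, v in edges:
        out_deg[u] = out_deg.get(u, 0) + 1
        in_deg[v] = in_deg.get(v, 0) + 1
    return {(u, v): (1 / m) * math.log2((1 / m) / ((out_deg[u] / m) * (in_deg[v] / m)))
            for u, v in edges}

# In a directed 4-cycle every edge contributes the maximum
# (1/|V|) log2 |V| = 0.5 bits, and the terms sum to I_E(G) = log2(4) = 2.
terms = edge_constraints([(0, 1), (1, 2), (2, 3), (3, 0)])
print(terms[(0, 1)], sum(terms.values()))  # 0.5 2.0
```

Summing the per-edge terms recovers IE(G), so the distribution of ecuv values can be inspected directly, which is the starting point for the visualization developed below.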
Since there are clear bounds on the average mutual information for a network with |V | vertices, 0 ≤ IE(G) ≤ log|V (G)|, we can consider two extreme cases: where each edge-constraint term contributes 0 to the mutual information calculation31, and where each edge-constraint term contributes (1/|V (G)|) log|V (G)| to the summation. The first case represents a maximally cut-stable graph (a fully connected directed graph with self-loops) where ecuv = 0 for each term. The second case represents a maximally connection-stable graph (where all components are directed cycles) where ecuv = (1/|V (G)|) log|V (G)| for each term. In both these cases, the terms contributing to the average mutual information have identical values. We will restrict ourselves to graphs whose ecuv distributions are between the 0 contributions for each edge associated with maximal cut-stability and the (1/|V (G)|) log|V (G)| associated with maximal connection-stability32.
Can we develop some intuition as to what a graph that is exactly between these two extremes might look like in terms of the individual terms' contributions to its average mutual information? We can visualize the situation geometrically by building up the cumulative distribution of ecuv terms. Say there are z terms. Let us sort the ecuv terms by ascending value, and index them in ascending sort order from 0 to z − 1. We now have a series of terms sorted by value from ec0 to ecz−1. Let ecmax = (1/|V (G)|) log|V (G)| designate the maximum
31 While the average mutual information is always a positive value, individual edge-constraint terms can have negative values in special cases, for example, where two hubs are connected by an edge. A negative value occurs whenever the product of the row and column sums associated with an edge is greater than the total number of edges. For a given directed graph of |V | vertices we can calculate the maximum negative value that can occur. For a graph of |V | vertices, a specific configuration results in the maximum negative value. In this configuration, there are two vertices, which we may call the out-vertex and in-vertex, respectively. The out-vertex has directed edges to every other vertex. The in-vertex has directed edges from every other vertex. Both these vertices also have self-loops. For such a configuration |E(G)| = 2|V (G)|. If u is the out-vertex, and v is the in-vertex, ecuv = (1/|E(G)|) log[ (1/|E(G)|) / ( (|E(G)|/2)/|E(G)| × (|E(G)|/2)/|E(G)| ) ] = (1/|E(G)|) log(4/|E(G)|) = (1/(2|V (G)|)) log(2/|V (G)|). Whenever |V (G)| is greater than 2, this configuration will have negative values; when |V (G)| = 2, we have a complete directed graph with self-loops and ecuv = 0. In a graph, certain configurations of connections will preclude other configurations, so it is impossible to have a graph where all term values are negative, and indeed the average must always be positive.
32 Negative values for ecuv indicate situations that are neither cut nor connection stable, which we want to avoid in our network design.
possible value for each term. We can then normalize each sorted term by dividing by the maximum value, so that ec1n = ec1/ecmax and ecz−1n = ecz−1/ecmax. These normalized values can be used to build up a normalized cumulative distribution (where the X axis of values ranges between 0 and 1, as does the Y axis of probabilities). For any threshold value ecn in a series of i values (i ≤ z), where 0 ≤ ecn ≤ 1, the normalized cumulative distribution function is
Yecn = Σ p(ecin), where the sum runs over all ecin ≤ ecn.
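Building the normalized cumulative distribution is straightforward. The Python sketch below is my own illustration (names are not from the thesis): it sorts the edge-constraint terms, divides by ecmax = (1/|V|) log2 |V|, and attaches cumulative probabilities.

```python
import math

def normalized_cdf(ec_values, n_vertices):
    """Sorted edge-constraint terms normalized by the maximum possible
    term ec_max = (1/|V|) log2 |V|, paired with cumulative probabilities."""
    ec_max = math.log2(n_vertices) / n_vertices
    xs = sorted(v / ec_max for v in ec_values)
    z = len(xs)
    return [(x, (i + 1) / z) for i, x in enumerate(xs)]  # (value, cum. prob.)

# All terms at the maximum (a directed cycle on 4 vertices): the normalized
# CDF places all its mass at x = 1, the maximal connection-stability curve.
cdf = normalized_cdf([0.5, 0.5, 0.5, 0.5], 4)
print(cdf)  # [(1.0, 0.25), (1.0, 0.5), (1.0, 0.75), (1.0, 1.0)]
```

Plotting these (value, probability) pairs reproduces the kind of curves discussed next: all mass at 0 for maximal cut-stability, all mass at 1 for maximal connection-stability, and a diagonal for balanced stability.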
Since the cumulative distribution is now normalized, the result is a monotonically increasing function within the unit square. Given our restrictions of ecuv ≥ 0, every possible
normalized cumulative distribution will provide a different shaped curve through this bivariate space, and all of these curves can be considered to fall between our extreme cases of
maximal cut-stability and maximal connection stability. In Figure 4.3, the blue line (dots)
indicates the normalized cumulative distribution for cut-stability while the red line (squares)
demarcates the normalized cumulative distribution for connection-stability. The green line
(triangles) demarcates points that are equidistant from cut-stability and connection-stability.
It represents the case where there is a uniform distribution of term values in the range between 0 and (1/|V (G)|) log|V (G)|. The brown line (Xs) represents mixed-stability. Some parts of the graph
are highly constrained, other parts are weakly constrained. In the special cases of maximal
cut-stability or maximal connection-stability, since all terms are identical, the graph has a homogeneous internal structure and the local topology of one part of the graph would look much like that of another part of the graph. By contrast, a graph corresponding to balanced stability would have a heterogeneous internal structure, and show fine-grained variation
throughout. Some parts of the graph will be sparsely connected; other parts will be highly
connected. Additionally there will be parts with intermediate levels of connection.
The edge-constraint terms of a mutual information calculation provide our network
Figure 4.3: Stability Measures in Terms of Cumulative Probability of Summation Terms
designer with a simple visual tool by which he can examine how closely his network design
approaches an ideal of balanced stability. He can then modify it towards balanced stability
by seeking a more uniform distribution of ecuv terms. While this uniform distribution is an
ideal, and may not be realizable due to other constraints, he can immediately visually test
how closely he comes by simply plotting out the cumulative distribution of terms obtained
from his network design against a uniform distribution. If he further wished to quantify the
difference between his network design and the ideal of balanced stability, he could determine
the area between the two curves via summation (or integration).
For the summation, choose equal sized contiguous intervals on the X axis of normalized
edge-constraint values. Let f(xj) be the cumulative probability for the jth value given
the empirical distribution of edge constraint values for a measured graph G. Let g(xj) be
the cumulative probability for the jth value given a uniform distribution of edge-constraint
values. Let Δx be the size of the contiguous intervals on the X axis. Then the absolute difference in the area of the two curves is:
AbsDiffAreas = Σj |f(xj) − g(xj)| Δx, where |f(xj) − g(xj)| is the absolute value of the difference.
For integration, let the interval sizes become infinitesimals.
Now the absolute difference in the area of the two curves is:
AbsDiffAreas = ∫0^1 |f(x) − g(x)| dx
AbsDiffAreas may be interpreted as a measure of the distance from balanced stability of a measured graph G. Since we are keeping the summation (or integration) bounded between 0 (the network matches the uniform distribution of balanced stability) and the area of a triangle (representing either maximum cut-stability or maximum connection-stability) in the unit square, the maximum value is 1/2, which in the case of maximal cut-stability is the area above the balanced-stability line, and in the case of maximal connection-stability is the area below the balanced-stability line.
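The summation above can be carried out directly. The Python sketch below is my own illustration (names are not from the thesis): it compares the empirical CDF of normalized edge-constraint terms against the uniform CDF g(x) = x over equal-sized intervals on the X axis.

```python
import math

def abs_diff_areas(ec_values, n_vertices, steps=1000):
    """Distance from balanced stability: the area between the empirical
    normalized CDF of edge-constraint terms and the uniform CDF g(x) = x.
    0 means perfectly balanced; the maximum possible value is 1/2."""
    ec_max = math.log2(n_vertices) / n_vertices
    xs = sorted(v / ec_max for v in ec_values)
    z = len(xs)
    dx = 1.0 / steps
    area = 0.0
    for j in range(steps):
        x = (j + 0.5) * dx                      # midpoint of the j-th interval
        f = sum(1 for v in xs if v <= x) / z    # empirical cumulative probability
        area += abs(f - x) * dx                 # |f(x) - g(x)| * Δx
    return area

# Maximal connection-stability: every term equals ec_max, so f(x) = 0 until
# x = 1 and the area approaches the triangle below the diagonal, i.e. 1/2.
print(abs_diff_areas([0.5] * 4, 4))                     # ≈ 0.5
# An evenly spread set of terms sits much closer to balanced stability.
print(abs_diff_areas([0.125, 0.25, 0.375, 0.5], 4))     # ≈ 0.125
```

The second call shows the intended use: a designer plugs in the measured ecuv terms of a candidate network and reads off how far it sits from the balanced-stability ideal.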
This gives our network designer a first cut approximation of balanced stability. But
he could go further. He could reason that the cumulative distribution gives him an idea
of the overall contribution of terms, but no idea of how they are locally organized within
the network. For example, say all the vertices with smaller values are closer to each other
than vertices with larger values. Over the whole network, the cumulative distribution might approximate a uniform distribution – but there might be local regularities that could be leveraged by an attacker.
4.8.2 Balanced Stability and Information Hiding
Networks are subject to various kinds of attacks. Can we extend our concept of balanced stability in order to develop the notion of a network architecture for which an adversary would have a difficult time determining which type of attack to launch?
Our network designer now considers – what might a network look like with no local
regularities that would allow an attacker to decide between a cut-attack and a connection-attack? He imagines an attacker taking a walk on his network, moving from vertex to
vertex via directed edges (and he imagines the attacker does not hit any vertex or edge
more than once). This attacker can count the edges into and out from each vertex he has
sampled, and thus locally calculate ecuv for each edge along the vertices he has traversed.
Balanced stability would be the case where, if he walked t steps and calculated the value of some property at each step (in our case ecuv on each edge), he would have no information by which to
predict the value of that property as he proceeded over the next edge, prior to reaching it,
and actually measuring it. He would simply have to guess the next value randomly from a
uniform distribution.
Let m1...t be the sequence of values measured (or calculated from measurements) for
some local graph property in t steps33 (the local properties could be measured on either the
vertices or the edges). Let mt+1 be the value of the property measured or calculated in the
next step. Let kt be the number of distinct values observed in the sequence m1...t34.
If, knowing the values on our walk m1...t provides no information by which to predict the
value encountered on our next vertex, mt+1, all we can do is guess at the value by assuming
a uniform distribution bounded between the lowest and highest values we have encountered
so far. Without additional prior information, it is impossible to do more. In this case we
have a probability of at most 1/kt of guessing the correct value35. There is no information
gained in the walk that helps us to predict the value of the next step in the sequence.
Definition 7. Perfect Information Hiding (PIH) exists on a graph, G, if and only if for
every t-step walk on the vertices (or edges) of a graph via traversing edges (or vertices) in
which no previously encountered edge (or vertex) is crossed, information obtained on the
33 Consider m1...t to denote the series of values from t sequential measurements, m1,m2,m3...mt.
34 Depending on the nature of the measurement taken, the maximum possible value for kt, which we will denote kmax, might be pre-determined. For example, if we know we are in an undirected graph of |V | vertices without self-loops, and our measured value is the degree of each vertex, then kmax = |V | − 1, which is the largest possible degree for a vertex. In general, as we extend the walk beyond t, it may be possible to measure new distinct values.
35 If our walk so far has not encountered the true upper or lower bounds for our measured values, our probability is less than 1/kt.
previous vertices (or edges) for some property m provides no information on m for the next
vertex encountered.
If m has kt distinct values observed in t steps,
p(mt+1|m1...t) ≤ 1/kt.
Since
p(mt+1|m1...t) = p(m1...t, mt+1) / p(m1...t).
If m1...t provides no information on mt+1, then m1...t and mt+1 are independent. In the case of independence, the joint probability of a pair of events is equal to the product of the probabilities for each event [Hacking 01]: pp. 25, 41-42, 60-62. Thus, p(m1...t, mt+1) = p(m1...t)p(mt+1). Therefore,
p(mt+1|m1...t) = p(m1...t)p(mt+1) / p(m1...t)
p(mt+1|m1...t) = p(mt+1)
Given that we (a) only have information from previous measurements, (b) have no criteria on which to bias our guesses over the kt distinct values encountered in t steps, and (c) allow the possibility that a new distinct value may be encountered in the next step,
p(mt+1|m1...t) = p(mt+1) ≤ 1/kt.
PIH can be seen as an ideal, which may not be fully realized, for both graph theoretic
and practical reasons in the context of designing a network. PIH indicates very fine but
unpredictable sub-structuring of the graph in terms of the local property to be measured.
Two examples of properties we could apply PIH to are the ecuv values for each term
in a mutual information calculation for a sequence of edges, or the degree distribution for
a sequence of vertices. If applied to the values for each term in a mutual information
calculation, PIH extends our concept of balanced stability above. However, PIH can be
applied to any measurable property of a sequence of vertices or edges.
Perfect Balanced Stability (PBS) is a special case of PIH. Perfect Balanced Stability
(PBS) exists on a graph, G, if PIH exists on G for the property, m measured on each edge,
where m is ecuv.
For PBS the cumulative distribution of observations of ecuv on every walk will be that
for a uniform distribution.
The idea of perfect information hiding on a graph, from which we draw our concept of perfect balanced stability, is quite similar to von Mises' definition of randomness [von Mises 81], as well as two other information theoretic concepts: Shannon's information theoretic definition of perfect secrecy [Shannon 49] and the notion of the algorithmic complexity of a sequence, which was independently developed by Kolmogorov [Kolmogorov 68b, Kolmogorov 68a], Solomonoff [Solomonoff 64a, Solomonoff 64b], and Chaitin [Chaitin 66].
In von Mises' classic text on probability, ‘Probability, Statistics and Truth’, randomness is conceptualized in terms of a sequence of observations he calls a ‘collective’ [von Mises 81]. While his examples of collectives are sequences of observations made by casting dice, or observing small and large stones during a walk, the idea applies equally to the observations made during the walk of a graph [von Mises 81]: pp. 24-25.
‘A collective appropriate for the application of the theory of probability must fulfill two conditions. First the relative frequencies of the attributes must possess limiting values. Second, these limiting values must remain the same in all partial sequences which may be selected from the original one in an arbitrary way.’
PIH occurs when the sequence of local property values obtained from any walk on a
graph are effectively random. If PIH holds not only at a walk of length t but between every
subsequence of that walk, and the next vertex encountered, the values obtained in the walk
must be random.
Shannon’s notion of perfect security [Shannon 49] is based on leveraging the relationship
between conditional probability and statistical independence. Shannon states the essential
idea simply in terms of cryptograms E and the a posteriori probabilities PE(M) of the messages they correspond to [Shannon 49]: pg. 680.
‘The cryptanalyst intercepts a particular E and can then calculate, in principle at least,
the a posteriori probabilities for the various messages, PE(M). It is natural to define
perfect secrecy by the condition that, for all E the a posteriori probabilities are equal to
the a priori probabilities independently of the values of these. In this case, intercepting
the message has given the cryptanalyst no information.’
Luenberger [Luenberger 06]: pp. 186-189 summarizes Shannon's key idea that perfect secrecy requires statistical independence between messages and ciphertext. Let M be a plaintext message and C the ciphertext of the encrypted message (Shannon's E). A system is perfectly secure if, for all possible messages M, the probability of the message given the ciphertext, p(M|C), is equal to the probability of the message, p(M).
p(M |C) = p(M).
So, the probability of a particular message, M is unchanged by information about the
ciphertext.
In such a case, the average mutual information across all messages and ciphertexts is 0:
I(M,C) = 0.
By Bayes’ Rule [Tijms 07]:pp. 251-256:
p(M |C) = p(C|M)p(M)p(C)
.
p(M |C) = p(M) only if p(C|M) = p(C), which happens only if M and C are independent
so that,
p(M |C) = p(C)p(M)p(C)
= p(M).
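Shannon's condition can be verified on the smallest possible example. The Python sketch below is my own illustration (names are mine): it computes I(M;C) for a one-bit one-time pad, where XOR with a uniform random key makes the ciphertext statistically independent of the message.

```python
import math
from itertools import product

def mutual_info_bits(joint):
    """I(M; C) in bits from a joint distribution {(m, c): probability}."""
    pm, pc = {}, {}
    for (m, c), p in joint.items():
        pm[m] = pm.get(m, 0) + p
        pc[c] = pc.get(c, 0) + p
    return sum(p * math.log2(p / (pm[m] * pc[c]))
               for (m, c), p in joint.items() if p > 0)

# One-bit one-time pad: message and key each uniform on {0, 1}.
joint = {}
for m, k in product([0, 1], repeat=2):
    c = m ^ k                                # XOR encryption
    joint[(m, c)] = joint.get((m, c), 0) + 0.25
print(mutual_info_bits(joint))               # 0.0 -- perfect secrecy
```

The ciphertext carries zero bits about the message, which is exactly the role the walk observations play under PIH relative to the next value.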
In the case of PIH, the observations made during a walk of length t correspond to the ciphertext, and the value of the next step in the walk constitutes the message.
PIH relates also to the problem of inductive inferences, which leads directly to algorithmic information theory. Solomonoff [Solomonoff 64a, Solomonoff 64b] viewed all inductive
inference problems as essentially concerning whether, given a sequence of symbols (say the data from an experiment, or a walk upon a graph), it is possible to extrapolate the next values in the sequence. [Solomonoff 64a]: pg. 2:
The problem will be the extrapolation of a long sequence of symbols – those symbols being
drawn from some finite alphabet. More specifically given a long sequence, represented by T ,
what is the probability that it will be followed by the subsequence represented by a? .... we
want c(a, T ), the degree of the confirmation of the hypothesis that a will follow, given the
evidence that T has just occurred.
In the context of PIH, the sequence consists of those observations already obtained by walking a graph, and we want to infer a subsequence which is the next observation to be made in the walk. Can the observations in our walk so far help us to predict the next observation?
Solomonoff [Solomonoff 64a, Solomonoff 64b], and independently Kolmogorov [Kolmogorov 68b,
Kolmogorov 68a] and Chaitin [Chaitin 66], all arrived at the same solution, creating the field of algorithmic information theory and its associated measure, Kolmogorov complexity36. The essential idea is that the Kolmogorov complexity Ck(S) of a sequence is the length of the minimal program d(S) that can generate the sequence. From Chapter 3,
Ck(S) = |d(S)|.
This measure can be related to the degree of randomness of the sequence S in that for a
random string, the Kolmogorov complexity is approximately the length of the string. That
is, if a sequence is random, the minimal program to generate the sequence is approximately
the size of the sequence itself. In that sense, random sequences are incompressible [Li 97]: pp. 379.
36 A capsule summary of the different approaches by which algorithmic information theory's three co-founders arrived at their results is given in [Muller 07]. A detailed consideration of Solomonoff's ideas on inductive reasoning is in [Li 97]: Chapter 5. The idea that all approaches towards defining randomness eventually arrive at algorithmic information theory is explored by [Volchan 02] who notes (pg. 48) ‘Interestingly, all these proposals ended up involving two notions apparently foreign to the subject of randomness: algorithms and computability. With hindsight, this is not totally surprising. In a sense to be clarified as we proceed, randomness will be closely associated with “noncomputability.”’
It follows that if PIH holds at t steps and for all smaller walks with fewer than t steps, then
the best we can do to infer the next observation in a walk on a graph given the previous
observations is to simply guess. As we build up the sequence of observations, step by step,
PIH requires the sequence of observed values to be incompressible.
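Incompressibility can be probed, imperfectly, with an off-the-shelf compressor. The Python sketch below is my own illustration; zlib is only a crude, computable stand-in for the uncomputable Kolmogorov complexity, but it separates a random byte sequence from a periodic one cleanly.

```python
import random
import zlib

def compressed_ratio(byte_seq):
    """Compressed length over original length: a practical (and very rough)
    proxy for the Kolmogorov complexity per symbol of a sequence."""
    data = bytes(byte_seq)
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(0)
random_seq = [rng.randrange(256) for _ in range(10000)]   # effectively random
periodic_seq = [i % 4 for i in range(10000)]              # short generating rule

print(compressed_ratio(random_seq))    # close to 1: incompressible
print(compressed_ratio(periodic_seq))  # close to 0: highly compressible
```

A walk whose observed values compress well, like the periodic sequence, has a short generating rule and therefore leaks structure; PIH demands walks that behave like the first sequence.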
PIH may not hold for some graph topologies, due to other graph properties providing the basis for an informed guess37. In that case, PIH represents the theoretical ideal of perfect information hiding, and any demonstration that the ideal cannot be met, because it would violate some other graph property, guarantees information leakage.
PBS simply follows by applying PIH to a particular locally observed graph property, ecuv.
Both PIH and PBS are intimately tied to the idea of random sequences via the connection
to algorithmic information theory. Furthermore, the existence of PBS depends now, not only
on the average value of the mutual information, but also on the distribution of the terms
contributing to the mutual information as encountered in a walk on the graph.
For PIH to exist in a graph, every sequence of t local values obtained from a walk on the graph must be independent of the (t + 1)th value obtained in the next step, and the sequences of values must themselves be incompressible, or random. Conversely, if it can be demonstrated for a particular graph that independence does not hold between m1...t and mt+1, or if it can be demonstrated that the sequences of t values are compressible, then that graph can be said to leak information about itself.
4.9 Connections to Other Perspectives
Our perspectives in this chapter have been drawn from both ecology and a network archi-
tect’s focus on designing a robust network that resists cut-attacks and connection-attacks.
37 As an example, consider in particular invariant properties of graphs, where the same relationship holds for every graph. One example is the relationship between the sum of the degree distribution on vertices and the total number of edges, where the sum of the vertex degrees equals twice the number of edges [Chartrand 77]: pp. 28.
The concepts developed in this chapter can be applied to other perspectives such as error
and attack tolerance in technological networks, specific stabilizing mechanisms theorized to
operate in ecological networks, and to mechanisms believed to stabilize social networks. Additionally, the stability concepts developed here may have relations to other mathematical
approaches such as graph spectra. Application of the topological stability concepts developed
here complements stability concepts from other existing perspectives on complex networks
and provides additional insight into both the mechanisms associated with stability, as well
as the stability ramifications of the underlying network models. Connections to these other
perspectives are briefly discussed below. These connections identify specific lines of application along which the theory of topological network stability may be extended in future
work.
4.9.1 Error and Attack Tolerance for Complex Networks
Together our network architect and our ecologist have conceptualized topological network
stability by formalizing intuitive concepts from their respective fields. In this section we
briefly examine how our network architect could apply those concepts to gain insights into
the existing literature of network resistance to errors and attacks.
Our approach has been to develop our topological stability concepts in the general
context of undirected and directed graphs, and general attack strategies, cut-attacks and
connection-attacks, rather than in terms of specific graph generation models or detailed attack protocols38. We want to understand how a network's topology may be resistant to attacks that cannot be anticipated. Much of the model-specific literature deals with variants of our cut-stability and connection-stability concepts in the context of specific attack
protocols. In Chapter 1, we noted a number of literature examples of cut-stability in Internet [Albert 00, Calloway 00, Cohen 00b, Cohen 01, Crucitti 04, Gallos 05] and ecological
38 For example, the scale-free, small-worlds, or the Erdos-Renyi random graph models, which were briefly summarized in Chapter 3. Attack protocols could include random attacks, or directed attacks proceeding in descending vertex degree order.
[Sole 01] studies tied to the scale-free model, all of which indicate these networks can resist
random attacks, but are susceptible to directed attacks (that usually begin with the highest
degree vertices). From our perspective, these results follow naturally from scale-free graphs having a minimum vertex cover that is small relative to the total number of vertices in the graph, such that the ratio |MVC(G)|/|V (G)| is small. Therefore, the probability of hitting an element of MVC(G) in a random attack is small, but a directed attack on elements of MVC(G) will be highly effective in disrupting the graph structure, since only a relatively small portion of the total
network needs to be attacked. This is particularly true in Barabási and Albert's preferential attachment model for scale-free graphs [Barabasi 99], where the bias of new vertices to attach to existing vertices with high degree essentially guarantees that the minimum vertex cover will increase at a slower rate than the rate of increase for vertices, so that as the network grows, the ratio |MVC(G)|/|V (G)| gets smaller and smaller39.
A recent study [Buldyrev 10, Vespignani 10] notes that when two networks are coupled
together and therefore interdependent, such as increasingly occurs between power networks
and Internet networks, they are more vulnerable to cascading failures than any single network
prior to coupling. Their results were obtained using percolation theory applied to pairs
of Erdos–Renyi random networks and scale-free networks respectively. Within a pair of
networks (generated via the same model), every vertex in one network is assumed to be
39 Using reasoning similar to that applied to the preferential attachment model, we can easily see the topological stability consequences for other popular network generation models. Watts and Strogatz [Watts 98] and Kleinberg [Kleinberg 00] have both produced models which generate small-worlds graphs. Key to both models is the addition of extra edges to a regular graph (random rewiring) that directly connect nodes that would otherwise be distant (have a long path between them). The consequence of such additions is to increase cut-stability (the extra edges can add to the size of |MVC(G)|), but at the cost of decreasing connection-stability (the extra edges create shortcuts that can reduce the time to flood G). In particular, Kleinberg's ideas have been extended to examine the searchability of arbitrary graphs [Duchon 06a, Duchon 06b, Fraigniaud 09, Fraigniaud 10]. These methods depend on making a graph more searchable via augmenting a ‘base graph’ with additional random edges. The probability of an augmented edge from u to v is proportional to the distance (path length) between u and v. Again, any increase in searchability due to creation of edges comes at a cost of connection-stability. Finally, the Erdos-Renyi random graph generation model [Chung 06]: pp. 91-92, depends on two parameters, n, the number of vertices, and p, the probability of selecting an edge. For a fixed number of vertices (constant n), as p approaches one, the graphs generated are increasingly cut-stable as they approach the complete graph at p = 1. Via the antagonism of cut-stability and connection-stability, as p approaches 1 the generated graphs are decreasingly connection-stable.
linked to a single vertex in the other member of the pair, and is functionally dependent on it. Failure of a vertex in one network causes a coupled failure in a vertex of the paired network. We can understand this phenomenon of coupled networks being more vulnerable to cascading failures via connection-stability, without reference to a specific network model.
For a pair of networks, there is now a set of additional edges linking the two networks which
provide pathways so that an instability that begins in one network can move virally to the
other network. Let us consider two undirected networks G1 and G2 with the same number
of vertices, so |V (G1)| = |V (G2)|. G1 and G2 are each connected networks. Let Sc(G1) = T1
be the connection-stability of G1 on its own. Let Sc(G2) = T2 be the connection-stability
of G2 on its own. These are the connection-stabilities if the networks were independent of
each other. Now, consider each vertex in G1 to have an edge connecting it to some vertex
in G2. This allows the possibility of some pair of vertices in G1 (or G2) to have a shorter
path connecting them via vertices in G2 (or G1) than exists in G1 (or G2) on its own. Let
T (G1|G2) represent the number of iterations required to flood G1 given a coupled network G2
exists. Let T (G2|G1) represent the number of iterations required to flood G2 given a coupled
network G1 exists. Then T (G1|G2) ≤ T1 and T (G2|G1) ≤ T2. The connection-stability of
the networks thus becomes sub-additive so that Sc(G1 + G2) ≤ Sc(G1) + Sc(G2). Coupling
the networks results in equal or less connection-stability for each of the networks, due to the
creation of alternate routes for flooding. The cascade due to coupling can be particularly
rapid if vertices that are distant in one network are connected via vertices that are close in
the second coupled network, so that a cascade beginning in the first network, can utilize the
second network to rapidly flood nodes that would otherwise be distant.
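The sub-additivity argument can be illustrated by measuring time-to-flood directly. The Python sketch below is a toy example of my own (names are mine; the coupling follows the one-to-one vertex pairing described above): coupling a ring G1 to a densely connected G2 reduces the number of iterations needed to flood all of G1's vertices.

```python
def flood_time(adj, start, targets):
    """BFS iterations until every vertex in `targets` is flooded from `start`
    (a simple proxy for the connection-stability time-to-flood)."""
    seen, frontier, t = {start}, [start], 0
    while not targets <= seen:
        t += 1
        frontier = [v for u in frontier for v in adj[u] if v not in seen]
        seen.update(frontier)
        if not frontier:
            return None  # some target is unreachable
    return t

n = 8
g1 = {i: {(i + 1) % n, (i - 1) % n} for i in range(n)}   # G1: an 8-vertex ring
alone = flood_time(g1, 0, set(range(n)))

# Couple G1 to G2, a complete graph on vertices n..2n-1, via edges i <-> n+i.
adj = {i: set(g1[i]) for i in range(n)}
adj.update({n + i: {n + j for j in range(n) if j != i} for i in range(n)})
for i in range(n):
    adj[i].add(n + i)
    adj[n + i].add(i)
coupled = flood_time(adj, 0, set(range(n)))

print(alone, coupled)  # 4 3 -- coupling creates shortcuts, so T(G1|G2) <= T1
```

The dense G2 provides shortcuts between G1 vertices that are distant within the ring, which is precisely the mechanism by which coupled networks lose connection-stability.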
The brief examples above illustrate the utility of our topological stability approach to
provide a general route to insights originally obtained through, and tied to, specific network generation models. Recent studies of networks from a defence/security perspective
increasingly focus on (a) graph properties that are not tied to specific graph generation
models [Dekker 04], (b) the actual topologies that networks under attack or surveillance evolve toward [Lindelauf 09], or (c) strategies by which networks may actively re-structure themselves to resist attack [Nagaraja 06, Nagaraja 08]. The three defense strategies against topological attacks investigated by Nagaraja and Anderson [Nagaraja 06] can be interpreted in terms of topological
stability40. The ‘random replenishment’ strategy consists of replacing vertices lost due to
attack with new vertices that are randomly attached to existing vertices, leading to a more
amorphous network. This corresponds to replenishing the minimum vertex cover for the
network, and thus its cut-stability. The ‘dining steganographers’ strategy consists of replacing high degree vertices with rings of vertices, so that the external connections of the original
vertex are distributed uniformly across the ring. Essentially this defence strategy increases
the connection-stability of a network by slowing down the rate at which the network can be
flooded. The ‘revolutionary cells’ strategy is similar except that now high degree vertices
are replaced with a clique41. It increases the cut-stability of the network by replacing single
vertices with cliques. In a study where networks are attacked in alternating waves of
targeted and random attacks [Tanizawa 05], the network structure that was slowest to
degrade had a bimodal degree distribution, with a proportion of the vertices forming a single clique
(small contributions to mutual information) and the remainder each having a single edge (larger
contributions to mutual information) connecting it to the clique, a very simple form of mixed
stability.
4.9.2 Keystone Species, Indirect Effects and Cycling in Ecological Networks
What applications might our ecologist find for the topological stability concepts to integrate
theory in her field? Three concepts associated with the stability of ecosystems are 'keystone
species', 'indirect effects' and 'cycling'. Below we briefly examine each of these concepts and
40 The antagonism between cut-stability and connection-stability is defined in terms of alternate edge configurations for a graph with a fixed set of vertices. The defence strategies below all consist of adding vertices to a network, so their cut-stability and connection-stability implications must be interpreted with respect to alternate graphs with the same number of vertices.
41 A clique is a subset of the vertices of a graph, where every pair of vertices is adjacent.
how they can be related to topological stability.
Keystone species are members of an ecological community that have a disproportionately
large influence on community structure. In the context of ecological networks characterized
by trophic interactions, keystone species may be considered those species which, if removed,
lead to a major restructuring of the food web [Jain 02, Jordan 09, Jordan 05, Jordan 99,
Quince 05]. An intuitive concept, keystone species have been defined in various operational
ways by different authors42. Jordan notes [Jordan 09]:pg. 1735 that perhaps the simplest
identifier of a keystone species in a food web is the degree of the vertex associated with
that species (combining both incoming and outgoing edges)43. Thus the ‘keystoneness’ is
associated with vertex degree. Another relatively straightforward approach is to define
keystoneness via a simulation where certain species are knocked out, and species dependent
on them are eliminated. Keystoneness is then tied to whether such a simulated knock-out
significantly changes the structure of the graph beyond the initial vertex removed [Quince 05].
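The two operational definitions just described, degree-based keystoneness and knock-out simulation, can be sketched on a hypothetical toy food web (the species, edges, and helper names below are our own; this is only the general flavour of the cited approaches, not the procedure of [Quince 05]):

```python
# Hypothetical toy food web: a directed edge (u -> v) means energy flows
# from prey u to consumer v. Anchovy is a hub, as in the marine example.
web = {
    'plankton': ['anchovy'],
    'algae':    ['anchovy'],
    'anchovy':  ['tuna', 'seabird'],
    'tuna':     [],
    'seabird':  [],
}

def degree_keystoneness(web, s):
    """Simplest identifier: in-degree + out-degree of the species' vertex."""
    out_d = len(web[s])
    in_d = sum(s in prey_of for prey_of in web.values())
    return in_d + out_d

def knockout_extinctions(web, s):
    """Knock-out simulation: remove s, then repeatedly remove any consumer
    left with no surviving prey; return the number of secondary extinctions."""
    producers = {v for v in web if all(v not in web[u] for u in web)}
    alive = set(web) - {s}
    changed = True
    while changed:
        changed = False
        for v in list(alive - producers):
            if not any(v in web[u] for u in alive):   # no prey left
                alive.discard(v)
                changed = True
    return len(web) - 1 - len(alive)
```

Knocking out the hub species cascades (tuna and seabird starve), while knocking out one producer causes no secondary extinctions, since the hub has an alternate food source.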
Indirect effects can be defined in contrast to direct effects. If direct effects refer to
any two species that share an edge in a trophic network, indirect effects refer to species
between which a path exists [Fath 99]:pg. 173, [Jordan 01]:pg. 1844. Some researchers also
include as indirect effects the ability of one species to modify the direct effects (i.e. the
value of the edge) between a pair of directly linked species [Wootton 94]:pg. 445. Several
researchers have demonstrated that indirect effects may have greater influence than direct
effects such that two species connected by a path have a stronger relationship than the
direct connections of either species [Fath 98, Fath 99, Jordan 01, Higashi 86, Higashi 89]45.
42 Indeed, keystone species are defined differently not only across research groups, but also in several of the papers above with a common author, F. Jordan.
43 Jordan et al. [Jordan 05] provides a nice illustration of keystone species in marine ecosystems where a few species such as anchovies and sardines act like hubs connecting lower trophic levels (species anchovies and sardines can eat) to higher trophic levels (species that eat anchovies and sardines), where the lower and higher trophic levels have many more species than the hubs. Such ecosystems are then very sensitive to loss of these hub species, which act as keystones.
44 Indirect effects are also similar to Yodzis's [Yodzis 00] conception of 'diffuse effects', which refers to the mediating action of nonlocalized effects of other species on the interaction between a pair of species.
45 Bondavalli and Ulanowicz [Bondavalli 99] provide a nice illustration of indirect effects in cypress swamps. They find the direct negative effects of alligators on certain prey (frogs, mice and rats) to be more than
A recent series of empirical papers on ecological flow networks demonstrates that indirect
effects come to rapidly dominate direct flows [Borrett 07, Borrett 10, Salas 11]. Keystone
species and indirect effects are not mutually exclusive concepts, in that a keystone species'
influence might very well be via its indirect effects.
The concepts of indirect effects and keystone species can be viewed in light of topological
stability by asking of them seemingly naive questions. If keystone species are important for
stabilizing ecosystems, why are not all species keystone species? On the simplest definition of
a keystone species, vertex degree, this would imply a trophic network that can be represented
as a complete graph. It would also represent maximum cut stability. However, in that case,
the very meaning of keystone species would disappear, since loss of a single species would be
compensated for by connections through other species. However, from a connection-stability
perspective, in such a complete graph, any fluctuation in one species may rapidly transfer
to, and potentially disrupt, all other species. If indirect effects are important for stabilizing
ecosystems, why are they not maximal? If the trophic network were a large directed cycle,
most effects between species pairs would be indirect. It would also represent maximum
connection stability. However, in such a case, there are no alternate trophic pathways, and
the ecosystem would be extremely vulnerable to loss of species. Actual ecosystems are
arranged between these two extremes in some semblance of balanced stability.
Cycling is another concept long associated with stability in ecosystems. It concerns
whether a resource (either energetic or nutrient) will be used again by the same species, i.e.
recycled. Lindeman’s original diagramattic view of energy cycling in ecosystems [Lindeman 42]
has been formalized into various cycling indices [Fath 07a, Finn 76, Patten 85, Patten 84,
Patten 90, Ulanowicz 83, Ulanowicz 04] and is associated with the existence of strongly con-
nected components which create subsystems in the ecological network [Allesina 05, Borrett 07].
compensated by the indirect positive effects on those prey, by alligators also feeding on their predators (turtles and snakes). Notably, alligators also play a keystone species role in cypress swamps. Indirect effects are also a complicating factor in attempts to 'manage' ecosystems. The classic cautionary tale is the use of DDT on crop pests having unanticipated effects further up the food chain on predatory birds [Ulanowicz 09b]:pg. 7.
Intuitively, cycling concerns the movement of both energy (locked in the biomass of the different
species) and nutrient flows through an ecosystem in such a way that the energy or nutrients
cycle through the system, rather than being lost to the system. The usual picture of ecosys-
tem cycling begins with energy and nutrients captured initially in plants then flowing through
various levels of herbivores and predators, and then as individuals die, decomposers make
the energy and nutrients stored in biomass once again available to the system in simpler
form. Fath and Halnes [Fath 07a]:pg.18, provide a succinct structural definition that can be
applied to ecological networks:
‘A structural cycle is the presence of a pathway in the ecological network in which
matter-energy passes through biotic or abiotic stores returning for availability to the
same or lower trophic levels. Structural cycling is present in food webs due to intraguild
predation, cannibalism, or other predation events that connect laterally or backwards in
the hierarchy.’
The conditions under which mutual information can be interpreted as a topological sta-
bility measure require exactly such structural cycling, via the existence of strongly connected
components.
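The link between structural cycling and strongly connected components can be made concrete: a directed network exhibits structural cycling exactly when some strongly connected component contains more than one vertex (or a self-loop). The sketch below applies Kosaraju's algorithm to a hypothetical five-compartment web with a detritus loop; the web itself is our own toy example, not drawn from the cited studies.

```python
def sccs(adj):
    """Kosaraju's algorithm: strongly connected components of a digraph."""
    order, seen = [], set()
    def dfs(u):
        seen.add(u)
        for v in adj[u]:
            if v not in seen:
                dfs(v)
        order.append(u)
    for u in adj:
        if u not in seen:
            dfs(u)
    rev = {u: [] for u in adj}                 # transpose graph
    for u in adj:
        for v in adj[u]:
            rev[v].append(u)
    comps, assigned = [], set()
    for u in reversed(order):                  # collect components
        if u in assigned:
            continue
        comp, stack = set(), [u]
        while stack:
            x = stack.pop()
            if x not in assigned:
                assigned.add(x)
                comp.add(x)
                stack.extend(v for v in rev[x] if v not in assigned)
        comps.append(comp)
    return comps

# Hypothetical flow network: plants feed herbivores, herbivores feed a
# predator, and dead matter returns via detritus to the plants.
web = {
    'plant':     ['herbivore'],
    'herbivore': ['predator', 'detritus'],
    'predator':  ['detritus', 'parasite'],
    'detritus':  ['plant'],
    'parasite':  [],
}
cycling = any(len(c) > 1 for c in sccs(web))   # structural cycling present?
```

The detritus edge closes a strongly connected component of four compartments; the parasite, which receives flow but returns none, remains a singleton outside the cycle.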
The ecological concepts detailed in this chapter all hinge on material transfers from
species to species. They may have analogues in the field of economics, if we move from
trophic networks to networks of goods and services. Recent papers have begun to apply
lessons from stability in ecological networks, to economic networks taking both information-
theoretic [Goerner 09] and dynamical systems [May 08] perspectives into account.
4.9.3 Social Networks
The theory of topological network stability can provide insights into fields outside of the
perspectives of our network architect and ecologist. In this section, we take a brief look
at how topological stability can connect to theory developed in, or inspired by, the field of
social networks.
White and Harary [White 01] define social cohesion based upon the k-connectivity of a
network (also known as the vertex-connectivity), where k is the minimum number of vertices
that would have to be cut in a connected network to create a disconnected network
[Harary 69]:pg. 43, i.e. groups with no means of communicating with each other. For example,
a tree is 1-connected because a single vertex cut can split it into two components,
while a cycle is 2-connected because it requires at least two vertex cuts to split it into
two components. A group's social cohesion then is equivalent to the value k for the social
network of that group. By contrast, |MVC(G)| is a more extreme notion, which represents
the number of vertices that must be cut, so that all that remains are components that consist
of a single vertex46. While k-connectivity reflects the amount of effort required to reduce
a cohesive network into two or more groups, |MVC(G)| is the amount of effort required to
reduce the group into single-person islands, cut off from communication with anyone else47.
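The contrast between k-connectivity and |MVC(G)| can be illustrated by brute force on tiny graphs (feasible only at this scale, since minimum vertex cover is NP-complete; the helper names and example graphs below are our own):

```python
from itertools import combinations

def connected(adj, vertices):
    """Is the induced subgraph on `vertices` connected (and non-empty)?"""
    vs = set(vertices)
    if not vs:
        return True
    start = next(iter(vs))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vs and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vs

def vertex_connectivity(adj):
    """Smallest k such that removing some k vertices disconnects the graph."""
    vs = list(adj)
    for k in range(1, len(vs) - 1):
        for cut in combinations(vs, k):
            rest = set(vs) - set(cut)
            if len(rest) > 1 and not connected(adj, rest):
                return k
    return len(vs) - 1   # complete graph: no cut disconnects it

def min_vertex_cover_size(adj):
    """Brute-force |MVC(G)|: fewest vertices touching every edge."""
    edges = {frozenset((u, v)) for u in adj for v in adj[u]}
    vs = list(adj)
    for k in range(len(vs) + 1):
        for cover in combinations(vs, k):
            cs = set(cover)
            if all(cs & e for e in edges):
                return k

path = {'a': ['b'], 'b': ['a', 'c'], 'c': ['b', 'd'], 'd': ['c']}  # a tree
cycle = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}          # 6-cycle
```

The path (a tree) is 1-connected and the 6-cycle is 2-connected, matching White and Harary's examples, while their minimum vertex covers (2 and 3 respectively) measure the harsher task of isolating every member.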
We can similarly view connection-stability in the context of social networks. For a connected
graph, the connection-stability Sc(G) = T, where T is proportional to the minimum time
it takes for a piece of information known by one member to be known by all members via
gossip. As T decreases, gossip can spread more easily. Gossip, or rumour spreading, has been
the subject of a number of recent papers [Censor-Hillel 11, Chierichetti 09, Chierichetti 10,
Giakkoupis 11] which relate it to a graph theoretic measure, the conductance [Bollobas 98]:
pg. 321, and seek to find efficient algorithms to spread rumours. The performance of these
algorithms is tied to the conductance of a graph. The conductance of a graph is based on
cutting the graph into two sets of vertices: across all such cuts, the conductance is
the minimum value of the number of edges that cross between the two sets, divided
by the size of the smaller set of vertices. A complete graph would have the highest
46 The resulting graph is now not merely disconnected, but totally disconnected.
47 Given that the minimum vertex cover is a known NP-complete problem, whereas polynomial time algorithms exist for the k-connectivity of a graph [Henzinger 00], recursive application of a k-connectivity algorithm to a graph until it is reduced to disconnected vertices would result in an estimate of a vertex cover, though its membership will be larger than the membership of the minimum vertex cover and thus over-estimate the topological stability.
value for conductance, which might suggest conductance as another measure of cut-stability.
However, the lowest value for conductance would come from a graph composed of two equal-sized
complete subgraphs joined by a single edge, which would also have a large degree
of cut-stability (but very low connection stability).
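The definition of conductance just given can be brute-forced on small graphs. The sketch below (our own illustration, using the text's vertex-count denominator rather than the edge-volume denominator some authors prefer) contrasts a complete graph K8 with a 'barbell' of two K4's joined by a single edge:

```python
from itertools import combinations

def conductance(adj):
    """Brute-force the text's conductance: minimum, over all bipartitions
    of the vertices, of (# edges crossing the cut) / (smaller side's size).
    Exponential in |V|, so only for tiny illustrative graphs."""
    vs = list(adj)
    best = float('inf')
    for k in range(1, len(vs) // 2 + 1):
        for side in combinations(vs, k):
            s = set(side)
            crossing = sum(1 for u in s for v in adj[u] if v not in s)
            best = min(best, crossing / len(s))
    return best

def complete(n, offset=0):
    return {offset + i: [offset + j for j in range(n) if j != i]
            for i in range(n)}

k8 = complete(8)                 # complete graph: highest conductance
barbell = {**complete(4), **complete(4, offset=4)}
barbell[0].append(4)             # two K4's joined by a single edge:
barbell[4].append(0)             # dense halves, but a minimal bottleneck
```

For K8 every cut of size k crosses k(8-k) edges, so the minimum ratio is 4; for the barbell the even split crosses a single edge, giving 1/4, despite its densely connected (cut-stable) halves.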
A recent study by Kossinets et al. [Kossinets 08] examining the structure of informa-
tion pathways in an email network identified a network backbone, the subgraph over which
information flows quickest. The network backbone was found to balance two antagonistic
tendencies, ‘flows that arrive at long range over weaker ties; and flows that travel quickly
through densely clustered regions in the network' [Kossinets 08]:pg. 7. Such a description,
while arrived at via very different techniques than those applied here, sounds much like a system
balancing between connection-stability (via long-range weaker ties) and cut-stability (via
densely connected regions that rapidly disseminate information); and indeed bears some
resemblance to the dual roles of keystone species and indirect links in ecosystems.
4.9.4 Graph Spectra
Another fruitful area for future research that crosses disciplines is the link between topological
stability as developed here, and graph spectra in complex networks, an area of rapidly
increasing interest [Chung 06, Van Mieghem 11]. The spectrum of a graph is the set of
eigenvalues of a matrix associated with the graph, typically its adjacency matrix. Graph
spectra are thus ultimately based on the properties of an adjacency matrix, as is the
mutual information approach we have developed in this chapter. Graph spectra have found
applications in both ecology and the
study of viral processes.
In ecology, Fath and Halnes [Fath 07a] have argued that the strength of structural cycling
in an ecological network is given by the size of the largest eigenvalue (also called the spectral
radius) of the corresponding adjacency matrix. Borrett et al. [Borrett 07] have argued that
such cycling plays a strong role in functionally integrating subgroups of species via cyclic
indirect effects. Together with [Allesina 05] these studies imply that strongly connected
components in an ecological network act as functional modules, and spectral analysis is
a useful indicator of the strength of cycling relationships within such modules.
In studying virus spread in networks, Wang et al. proposed [Wang 03] and Van Mieghem
et al. demonstrated [Van Mieghem 09] that the epidemic threshold is the inverse of the
largest eigenvalue, while Draief et al. [Draief 08] concluded that the ratio of cure to infection
rates must be greater than the largest eigenvalue for a virus to be contained and not result
in an epidemic.
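These spectral results can be illustrated with a short power-iteration sketch (our own code; the rates and graphs are hypothetical). A sparse ring has spectral radius 2, while the complete graph on ten vertices has spectral radius 9, so a cure/infection ratio of 2.5 satisfies the containment condition summarized above on the ring but not on the dense graph.

```python
def spectral_radius(adj, iters=100):
    """Largest adjacency eigenvalue via power iteration (a rough numerical
    sketch; assumes an undirected, connected graph)."""
    x = {v: 1.0 for v in adj}
    lam = 0.0
    for _ in range(iters):
        y = {v: sum(x[u] for u in adj[v]) for v in adj}
        lam = max(abs(val) for val in y.values())
        x = {v: val / lam for v, val in y.items()}
    return lam

def is_contained(cure_rate, infection_rate, adj):
    """The condition attributed to Draief et al. in the text: containment
    requires the cure/infection ratio to exceed the spectral radius."""
    return cure_rate / infection_rate > spectral_radius(adj)

ring10 = {i: [(i - 1) % 10, (i + 1) % 10] for i in range(10)}   # λ1 = 2
k10 = {i: [j for j in range(10) if j != i] for i in range(10)}  # λ1 = 9

sparse_ok = is_contained(0.5, 0.2, ring10)   # 2.5 > 2: condition holds
dense_ok = is_contained(0.5, 0.2, k10)       # 2.5 > 9 fails
```

The condition is sufficient, not necessary: failing it on the dense graph does not guarantee an epidemic, but the contrast shows how a larger spectral radius demands a stronger cure rate.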
In both the ecological and viral applications of graph spectra, the size of the first eigen-
value is related to the ability of a process (energy cycling, viral spread) to move rapidly
through the network, a feature we have associated with high cut-stability (and correspond-
ingly low connection stability). This raises the interesting question: could changes in the
spectral radius of a network be related to cut-stability and connection-stability via being
correlated to changes in the mutual information of a network?
4.10 The Story So Far, The Road Ahead
We are now at the half-way point of our journey. The story so far has focussed on con-
ceptualizing topological stability in complex networks. We began with intuitive concepts of
cut-stability and connection-stability (Chapter 1), surveyed stability concepts across several
scientific disciplines (Chapter 2), then introduced basic ideas from graph, probability, and
information theory required to formalize our stability concepts (Chapter 3). In this chapter
(Chapter 4) we have formalized our concepts of cut-stability and connection-stability, then
extended them to mutual information, and finally used this extension to derive a concept of
balanced stability where network architectures can resist both cut and connection attacks.
We have provided a simple visual technique to test for balanced stability. We have shown
that these topological stability concepts apply to such disparate problems as the diversity-
stability debate in ecology, and to conceptualizing a network’s susceptibility to cut and
connection attacks, which we might call 'topological network security'.
Up to this point, our view has been focussed on the structure of a network. We have
alternated in this chapter between the perspectives of a network architect, and an ecologist,
as each learns from, and extends upon the other’s perspective. From these perspectives
we have been able to develop theory about the structural limitations of networks to resist
perturbations. However, networks are not merely static structures, they are structures within
which processes operate.
We now turn in the next two chapters to developing a dynamic framework in which we
can talk about processes on a network that lead to flows of messages, and examine how
dynamic processes can circumvent structural limitations. Again, we will alternate between
a computational and a biological perspective, but now in the context of a computational
systems biologist who must bridge both skill sets48. Her goal is to develop a flexible modelling
framework capable of representing the wide range of signalling processes in biology that
can be used to integrate experiment and theory in systems biology. The perspective of
our computational systems biologist could again be applied to that of a network architect.
Now his concern focusses on what kinds of processes he might incorporate into a network
to stabilize it. Having learned from our ecologist the benefits of looking to biology for
inspiration and evolutionarily tested design examples, he may consider processes that have
been known to stabilize biological systems.
Chapter 5 introduces a probabilistic multi-agent model for message passing in networks,
the Probabilistic Network Model (PNM). Chapter 6 demonstrates that the model can be used
to capture a broad range of process interactions relevant to biologists, which allows for (a)
from our network architect’s perspective, abstracting biological processes into computational
models that can be implemented on networks, and (b) from our computational systems
48 Systems biology is an interdisciplinary field focussed on interactions in biological systems, usually at the molecular level. Computational systems biology focusses on computational techniques to analyze data reflecting patterns of biological interaction, and to develop computational simulation techniques that can be used to study how such interaction systems develop over time. Ideally computational systems biology can provide a way of linking experimental data to theory in systems biology.
biologist’s perspective, develops a conceptual tool set that allows one to develop probabilistic
simulations of biological processes that can incorporate experimental data into the network
structures used as well as initialization of probabilities.
Chapter 7 brings together our conceptualization of topological stability and PNMs. The
cut-stability and connection-stability definitions in this chapter are extended to define a
new concept, resilience, the ability of a specific dynamic process to circumvent topological
limitations in cut-stability and connection-stability. Specifically, we explore a virus-immune
PNM introduced in Chapter 6 to examine the resilience provided by the very simple immune
response of sending a warning message, to limit the effects of a viral connection attack. Since
viruses occur in both biological and computational systems, we take both the perspectives
of our network architect, stress-testing a prospective topology, and also that of an epidemi-
ologist/immunologist seeking to understand situations in which a potential epidemic can be
damped by an early warning system. From both these perspectives, PNM simulations now
provide a framework to explore the relationship between topological stability and dynamic
resilience.
Finally, a few brief comments about lessons from the diversity-stability debate in ecology
that provide further avenues for future research. In our discussion of the diversity-stability
debate we saw the historical progress of both topological and dynamical notions of stability.
The topological and dynamic systems approaches stress different aspects of stability, and
their associated mathematical concepts have been developed independently. The original
information theoretic arguments about ecosystem stability arose from considering choices in
alternate pathways if a network’s structure is perturbed. The original dynamical systems
views on ecosystem stability focussed on perturbations from an equilibrium state for a given
network topology. If one takes a message passing viewpoint, common in computer science,
topological stability can be considered to focus on the paths via which messages flow, while
dynamical stability focuses on the values associated with messages; and whether the system
arrives at a single value, cycles among values, or moves chaotically through the complete
range of values. In principle, it is possible for messages to flow in an orderly manner, but
the values associated with the messages to be chaotic. Equally, in principle, it is possible
for message flows to be disorderly, but the values associated with messages to be uniform.
In that sense, topological stability and dynamical stability may be considered orthogonal; in
principle they can be combined at all possible levels of topological and dynamical stability.
However, in specific systems in nature and technology, both topological and dynamic factors
may come into play. Ways to integrate these approaches provide avenues for future research.
Our development of the PNM models in Chapters 5-8, while grounded in the multi-agent
literature of computer science rather than dynamical systems theory per se, considers both
the paths via which messages may flow, and the actions triggered by messages having a
particular value. These models begin to combine both topological and dynamic aspects of stability.
Chapter 5
Probabilistic Network Models
Behaviours
are the signals
that made a difference.
5.1 Abstract
In this chapter we develop Probabilistic Network Models (PNMs), a network centric multi-
agent system. PNMs are loosely inspired by cellular signalling in biology [Dray 90, Bray 09].
In a PNM, agents are represented as vertices, directed edges represent the communication
network amongst agents, and each agent is assigned a set of probabilistic behaviours that
determines how it responds to messages from other agents in the network. This chapter
establishes the PNM model, while Chapter 6 demonstrates how it can be used to capture the
structure of various biologically inspired information processes.
5.2 Introduction and Motivation
In the preceding chapters we have introduced the concepts of cut-stability and connection-
stability (Chapter 1), reviewed stability concepts in various sciences (Chapter 2), and finally
developed a theory of topological stability by formalizing our concepts of cut-stability and
connection-stability (Chapter 4). Our view so far has focussed on the architecture of a
network, vis-a-vis its ability to resist different forms of perturbation. We now move from
an architectural perspective to one that is more dynamic, and begin to consider processes
operating on networks. Our ultimate goal is to bring together our earlier architectural con-
siderations on the limits of topological stability in networks, with a framework for examining
how dynamic processes built into a network may overcome such limitations. To move from
architecture to dynamics, we begin to consider networks as multi-agent systems. Specifically
we are inspired by cellular signalling in biology [Dray 90, Bray 09] which plays a role in co-
ordinating development, and construct a model that can capture aspects of such systems. In
this chapter, we introduce Probabilistic Network Models (PNMs), and in the next we survey
how PNMs can be used to abstract various biological processes associated with stability.
PNMs allow us to do two things. First, going from biology to computer science, they
allow us to abstract from biological processes and mechanisms to computational models
that can be applied to distributed message passing systems. Secondly, going from computer
science to biology, they provide us with a computational modelling framework in which to
develop theoretical models of biological processes. Ideally, such models are constructed so
they can be tied to experimental data on the structure and likelihood of different kinds of
signalling-based interactions in biology1. Our ability to move in both directions is based on
the idea that systems as disparate as computer networks and cells in a tissue can both be
viewed in the abstract as distributed message passing systems, where the dynamics of the
system are tied to the types, initial distribution, and responses to messages received; be the
messages bits passed through a wire or chemicals bound at a cell surface.
A PNM is a form of multi-agent system (MAS). The agents are represented as vertices
on a network. The edges of the network represent communication channels between agents.
The agents are very limited – they cannot move, they can merely send messages. Agents
send messages or change state based on a combination of their current state, and messages
received from their local neighbourhood (incoming edges). Conditional probabilities are
used to represent transition functions of an agent sending a particular message, or entering
into a particular state. These conditional probability transition equations and the network
1 The PNM approach is well suited to incorporating experimental data from biological networks; such data can be used to initialize the transition probabilities and network structure. In cases where no experimental data may be available, it allows the development of in-silico experiments, looking at the dynamics of a model under a range of transition probabilities and various network structures.
topology govern the dynamics of a particular model.
By focussing on such simple agents, whose main capabilities are to send and receive mes-
sages, and to change their state in response to messages, these models stress the information
dynamic capabilities of a messaging or signalling system. Message passing is a common
technique in framing distributed systems in computer science [Attiya 04]. We show that similar
techniques can be used to model the information dynamics of several biological processes.
PNMs consist of the following elements:
1. A set of agents, which are represented as vertices in a directed graph.
2. A set of random variables that hold state. Each agent has its own set of
random variables.
3. A set of behaviours which are defined by conditional probability transition
functions describing state transitions in the random variables. Each agent has
one or more behaviours associated with it. Different agents can have different
behaviours.
4. A communication network, which is represented as directed edges amongst the
vertices.
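The four elements above can be sketched as a minimal data structure. The skeleton below is our own illustrative implementation, not the thesis's code; the class, function, and state names are invented.

```python
import random

class PNM:
    """Minimal sketch of the four PNM elements: agents as vertices, per-agent
    random variables as a state dict, behaviours as probabilistic transition
    functions, and directed edges as the communication network."""
    def __init__(self, edges, behaviours, state, seed=0):
        self.edges = edges                      # tail -> list of heads
        # In-neighbourhood Γ(v): tails with a directed edge into v.
        self.nbhd = {v: [u for u in edges if v in edges[u]] for v in edges}
        self.behaviours = behaviours            # agent -> list of functions
        self.state = state                      # agent -> {variable: value}
        self.rng = random.Random(seed)

    def step(self):
        """One synchronous round: behaviours read the *previous* states of
        the agent and its in-neighbourhood, then all updates commit."""
        old = {v: dict(s) for v, s in self.state.items()}
        for v in self.edges:
            nbr = [old[u] for u in self.nbhd[v]]
            for f in self.behaviours[v]:
                f(self.state[v], old[v], nbr, self.rng)

def infect(z):
    """Example behaviour: a healthy agent becomes infected with probability
    z per infected in-neighbour (z = 1 makes the transition deterministic)."""
    def f(new, old_self, nbr, rng):
        if old_self['status'] == 'healthy':
            if any(n['status'] == 'infected' and rng.random() < z
                   for n in nbr):
                new['status'] = 'infected'
    return f

# A three-agent line a -> b -> c with a deterministic infection behaviour.
edges = {'a': ['b'], 'b': ['c'], 'c': []}
state = {v: {'status': 'healthy'} for v in edges}
state['a']['status'] = 'infected'
net = PNM(edges, {v: [infect(1.0)] for v in edges}, state)
net.step()   # the message reaches b
net.step()   # ...and then c
```

With z = 1 the behaviour is deterministic, so the infection advances one hop per synchronous round along the directed line.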
PNMs allow biological processes to be translated into a computational framework in a
very compact form. Biological details are abstracted to the specific transition equations, how
they may be assigned in the network, and the architecture of the network. For the specific
PNM based biological models introduced in Chapter 6, we emphasize the stability question
underlying the modelled biological process. The stability in question in these models, may be
broader than the formally defined notions of cut-stability and connection-stability in Chapter
4, reflecting the various nuances of stability considerations as covered in Chapter 2.
The PNM approach, by focussing on agents interacting via messages, meets Mitchell’s
challenge to move beyond static analysis of network structures, and focus on information
processing in networks ([Mitchell 06]:pg. 1202):
‘To understand and model information propagation, one has to take into account not
only network structure, but also the details of how individual nodes and links propagate,
receive, and process information, and of how the nodes, links, and overall network
structure change over time in response to those activities. This is a much more complex
problem than characterizing static network structure’
5.3 The PNM Model
A probabilistic network model (PNM) consists of a directed graph and a set of behaviours.
Vertices of the graph represent agents, and directed edges represent communication channels
between agents. Consider two agents, A and B. If A communicates with B, there is a directed
edge from A to B. If A and B communicate with each other, there is a directed edge from A to
B and a second directed edge from B to A. Behaviours are represented in stimulus-response
fashion as functions, where the inputs are signals or messages received from neighbouring
agents, and the outputs are the signal/message sent by a target agent in response. The
models are stochastic, in that messages/signals are received (or sent) with some probability.
A given PNM is initialized by setting the initial conditions for each agent (its internal state),
and initializing message receipt probabilities to specific values between 0 and 1.
Let S be a PNM model, consisting of a directed graph and a set of associated behaviours.
S = (G,B) where G is the directed graph and B are the associated behaviours.
Vertices in G represent agents while directed edges in G represent communication channels amongst pairs of agents2.
Directed edges are defined by an ordered pair (u, v) ∈ G denoting a directed edge from
u into v. The first member of the ordered pair, u, is called the tail and the second member
of the ordered pair, v, is called the head. Self loops, where u = v, are allowed, and provide
a channel for an agent to send a message to itself3. Undirected graphs are emulated by
symmetric directed graphs for which every directed edge (u, v) ∈ G has a matching inverted
edge (v, u) ∈ G.
The indegree of a vertex, v, is the number of directed edges into v. The outdegree of a
vertex, v, is the number of directed edges out from v.
The neighbourhood of a vertex, v, is represented symbolically as Γv and for our purposes
consists of those vertices with a directed edge into v.4
There are N agents represented by vertices:
V = {v1, v2, . . . , vN}, |V| = N.
There are O behaviours represented by functions:
B = {f1, f2, . . . , fO}, |B| = O.
Each agent j is associated with a specific group of behaviours:
∀vj ∈ V, ∃Bj ⊆ B, |Bj| ≥ 1.
If two agents i and j have identical sets of behaviour, Bi = Bj, then they are considered
of the same class. Otherwise they are of different classes. The biological analogy is that two
2 Directed graphs are described as per Chapter 4. Our graph notation is briefly recapped here, and follows [Chartrand 77, Chung 06]. A graph is a pair, G = (V, E), where V are the vertices, and E are the directed edges. Denote V(G) as the vertex set of G. Denote E(G) as the edge set of G.
3 The model may be extended to multigraphs, allowing for several edges between two vertices, which could represent different message channels. In that case, each directed edge from u to v is uniquely labelled with a subscript, so that (u, v)l ≠ (u, v)m, and l, m represent different message passing channels between the two vertices.
4 In an undirected graph the neighbourhood would be all vertices adjacent to v.
agents with the same set of behaviours are analogous to members of the same species. The
computational analogy is that two agents with the same behaviours are similar to instances
of the same class with the same methods. The biological analogy breaks down if taken too
far, as biological species membership also assumes two agents share an ancestor-descendant
lineage.
For our purposes, we will represent behaviours via conditional probability transition
functions. These functions consist of random variables and probabilities in the general form:
p(X_j^t = a | X_j^{t*<t} = b ∧ f_{i∈Γ(j)}(Y_i^{t*<t} = c)) = z.
X_j^t is a random variable of the jth agent, at the current time t, which holds state, and
X_j^{t*<t} is the same random variable of the jth agent at an earlier time; a and b are specific
states of the random variable X_j. Y_i^{t*<t} is another random variable in state c for agents
i that are in the neighbourhood of j. In summary, the transition probability of a change
of state in X depends on its previous state in the agent, and also depends on the previous
state of j’s neighbourhood5. Finally, z is the probability of a message being received given
all conditions are met. So, if a message is received with probability z, X will change state
from b to a. If z = 1, the transition becomes deterministic rather than stochastic.
The message probabilities are either directly given in the model, or inferred from the
neighbourhood of agents that can send messages to a given agent. If inferred from the
neighbourhood around an agent, j,
z : f(Γ(j)).
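As a concrete sketch of this formalism (the function name `transition`, its arguments, and the dict-free state representation are our own illustrative choices, not part of the thesis notation), a conditional probability transition function might look like:

```python
import random

def transition(x_prev, neighbour_states, b, a, c, z, rng=random):
    """Return the new state of an agent's random variable X.

    X moves from state b to state a with probability z, but only if X was
    previously in state b and the neighbourhood condition on the Y variables
    holds (here: at least one neighbour was previously in state c).
    """
    condition = (x_prev == b) and any(y == c for y in neighbour_states)
    if condition and rng.random() < z:
        return a          # the transition fires with probability z
    return x_prev         # otherwise the state is unchanged

# With z = 1 the transition is deterministic, as noted above.
assert transition("b", ["c"], b="b", a="a", c="c", z=1.0) == "a"
```

The neighbourhood condition here is a simple disjunction; as noted in footnote 5, real gene systems may implement AND gates, sums, or products over the neighbourhood states instead.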
Given the PNM formalism, we tend to interpret it in terms of a system of incoming
and outgoing messages/signals. We interpret state changes in a PNM to be associated with
5 The expression f_{i∈Γ(j)}(Y_i^{t*<t} = c) reflects that for the required state transition to occur, a calculation must be made on the neighbourhood; the calculation usually involves logical conditions and mathematical operations such as summation or products on the state values. Alon [Alon 06]:16 notes that genes emulating logical AND gates, logical OR gates, and summation functions occur rather frequently. Some gene systems such as the lac system in E. coli may exhibit even more complex functions.
the sending of a message/signal. So, an agent receives a message with some probability, z,
which causes a state change that triggers an outgoing message. We can interpret the specific
conditions associated with receiving a message to refer to different kinds of messages; the
biological corollary is that the different conditions required for a state change are analogous
to different binding sites for chemical signals in a cell. Let us use a virus model as a simple
illustration of the application of the conditional probability function above6. Consider an
agent j. In the previous time step, j was uninfected (X_j^{t*<t} = b) and one agent, i, in j’s
neighbourhood was infected (Y_i^{t*<t} = c). The probability that j will be infected in the current
time step (X_j^t = a) is z. While our PNM models can capture fairly complex situations, this
example illustrates in simple form the common pattern of message transmission.7
Since the message output by one function can serve as the message input by another
function, these functions can be composed. If the message output by a function f is the
input to a function g then,
(g ◦ f)(x) = g(f(x)) where x is a message received.
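A minimal sketch of such composition (the message-appending behaviours `f` and `g` below are hypothetical, chosen only to make the composition visible):

```python
# Hypothetical message-transforming behaviours: each consumes a list of
# messages received so far and appends its own output message.
def f(msgs):
    return msgs + ["m1"]

def g(msgs):
    return msgs + ["m2"]

def compose(g, f):
    """(g ∘ f)(x) = g(f(x)): the output of f becomes the input of g."""
    return lambda x: g(f(x))

gf = compose(g, f)
assert gf([]) == ["m1", "m2"]   # f fires first, then g consumes its output
```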
An agent’s state is thus held collectively by the states of its associated random variables,
which in turn are associated with the specific behaviours of that agent.
A PNM model needs to be initialized for a simulation run. The additional informa-
tion required are the initial states of the random variables associated with each conditional
probability transition function for an agent. Additionally, the probability values that are
explicitly given at run-time for the conditional probability transition functions need to be
set to a value between 0 and 1.
6 In Chapter 6 we introduce a more elaborate version of the virus model, considering both the virus and an immune response.
7 From the point of view of the infected agent i in the virus model, we could say it rolls a die along each of its directed edges outwards, and transmits the virus with probability z. This is distinct from two other possible cases of message transmission. The first is where i broadcasts the infection on all edges directed outwards; the second is where i produces a single unit of a message, which is randomly assigned to one outwardly directed edge – in which case the probability of message transmission would depend on the number of outgoing edges.
Let Ij be the initialized version of Bj where the initial states of all variables are assigned,
and probability values are assigned.
Then I∗ is the set of all initialized conditional probability functions assigned to the N
agents.
I∗ = {I1, I2, I3, ..., IN}, |I∗| = N .
If two agents i and j of the same class (i.e. having identical sets of behaviour) have addi-
tionally identical initializations, Ii = Ij, then they are considered to have identical potential
behaviours. Otherwise they do not have identical potential behaviours. If i and j had their
locations swapped in the network, they would behave exactly like their counterpart in that
location. Identical potential behaviour, however, is not the same as identical expressed be-
haviour. Given i and j have different positions in the network and different neighbourhoods,
their expressed behaviour over a simulation run could be quite different. The biological anal-
ogy is that two agents that are behaviourally identical are like genetically identical twins,
or like clones. While PNM models can apply to many levels of biology, at the organismal
level, where each agent represents an individual organism, we can apply a particular inter-
pretation. All agents in the same class are like members of the same species. All agents with
identical potential behaviour can be considered genetically identical individuals.
A simulation run requires the PNM model, S, and the initialization of all individual agent
behaviours, I∗:
Run(S, I∗) : f(S, I∗).
Two runs of a PNM with the same initializations are guaranteed to be identical in their
dynamics only if all probability values in the transition functions are set to 1.
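This reproducibility claim can be illustrated with a toy one-dimensional PNM (the chain topology, the function `run`, and its parameters are our own illustrative choices, not part of the thesis formalism):

```python
import random

def run(z, steps=20, n=10, seed=None):
    """One run of a toy chain PNM: agent i becomes infected with
    probability z at each step if its left neighbour was infected."""
    rng = random.Random(seed)
    state = [1] + [0] * (n - 1)          # agent 0 starts infected
    history = []
    for _ in range(steps):
        new = state[:]
        for i in range(1, n):
            if state[i] == 0 and state[i - 1] == 1 and rng.random() < z:
                new[i] = 1
        state = new
        history.append(tuple(state))
    return history

# With all transition probabilities set to 1, two runs with the same
# initialization are guaranteed to have identical dynamics.
assert run(1.0) == run(1.0)
# With z < 1 the dynamics are stochastic: two unseeded runs will
# generally differ, even though the initializations are identical.
```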
5.4 Computational and Biological Contexts
While the next chapter considers specific biologically inspired PNMs, as well as connections
between PNMs and other modelling approaches, we would like to briefly consider PNMs in
computational and biological contexts.
Denzinger and colleagues have created a general classification of multi-agent systems
[Denzinger 04], which provides insight into the PNM as a particular kind of MAS. They
view an agent, Ag, as a triplet, Ag = (Sit, Act, Dat). Sit represents the situations an Ag
experiences. In PNMs these correspond to the patterns of messages that can be received by
an agent from its neighbourhood. Act represents the actions an agent can take. In PNMs
these correspond to the messages an agent can output. Dat are the internal states an agent
can possibly be in. These internal states could be seen as akin to memory. In PNMs these
correspond to the state of the random variables assigned to PNM behaviours in an agent. The
actions an agent can choose are defined by a function, fAg, where fAg : Sit×Dat→ Act.
In PNMs these correspond to the conditional probability transition functions that represent
behaviours. Within this classification of MAS a distinction is made between reactive and
proactive agents. An Ag is reactive if the influence of Dat on actions is relatively small. An
Ag is proactive (or knowledge based) if the influence of Dat on actions is relatively large. In
the current incarnation of PNMs, where Dat corresponds to the states of random variables
assigned to an agent, an agent’s memory of its history consists largely of counters developed
via sequential transitions of an agent’s internal state. PNMs are thus reactive MAS. For them
to be proactive MAS would require addition of mechanisms that allow an agent’s internal
state to become increasingly complex over time as it encodes its history. Development of
such mechanisms is a natural future direction for modification of the probabilistic network
model.
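A minimal sketch of how the MAS triple maps onto a PNM agent (all names here are illustrative, drawn neither from [Denzinger 04] nor from the thesis formalism):

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

@dataclass
class Agent:
    """Ag = (Sit, Act, Dat): Dat is the internal state (the PNM random
    variables); f_ag : Sit × Dat → Act chooses the outgoing messages."""
    dat: Dict[str, int]
    f_ag: Callable[[List[str], Dict[str, int]], List[str]]

    def step(self, situation: List[str]) -> List[str]:
        return self.f_ag(situation, self.dat)

# A reactive agent: its action depends almost entirely on the situation
# (incoming messages), with Dat exerting little influence.
reactive = Agent(dat={"X": 0},
                 f_ag=lambda sit, dat: ["warn"] if "virus" in sit else [])
assert reactive.step(["virus"]) == ["warn"]
```

A proactive agent, by contrast, would make `f_ag` depend heavily on an increasingly rich `dat`, encoding the agent's history.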
While we have examined the PNM framework in the context of multi-agent systems, it
could be compared to other modelling frameworks developed in computer science and statis-
tics. For instance, there are similarities with Petri-nets which are used to model concurrency,
as well as with Bayesian networks which are used to model causality. We have previously
related PNMs to Boolean networks which are a special class of dynamical systems. The
unique motivation and scope of PNMs is to describe, as simply as possible, the signal pass-
ing amongst agents, particularly as it occurs in biology, where there is a mix of determinism
and stochasticity. In the next section we briefly consider the biological application of PNMs
to bring theory and experiment together.
The probabilistic network model (PNM) is inspired by signalling systems in biology.
Complex signalling systems are found at almost every level in biology from cells to ecosys-
tems. Let us consider the cellular level. At any point in time in a cell, numerous processes
are occurring simultaneously, each dependent on different chemical signals. Much of the
technical development in molecular biology has depended on the creation of more sophisti-
cated tools that can detect such chemical signalling. Since PNMs depend upon the types,
initial distribution, and responses to messages received, they provide a flexible framework
to translate information on chemical signalling in specific cells and tissues to agent based
models. As part of that translation, the details of biological systems are abstracted to a
form that emphasizes their structure as distributed systems. In doing so, conceptual tools
from distributed systems become available to think about biological systems. Since we have
populations of agents (with potentially different behaviours), this approach leads naturally
to thinking about signalling in populations of cells. For example, consider heterogeneity in the
states of a population of cells that are otherwise genetically identical. Such heterogeneity
has been implicated in lineage choice in cells [Chang 08, Huang 11].8
Let us think about heterogeneity and lineage choice in cells from the perspective of po-
tential and expressed behaviours in PNMs. For example, if we consider the set of behaviours
associated with an agent as the potential behaviour of that agent; the actual expressed be-
8 Lineage choice in cells concerns whether a cell differentiates into a particular cell type. Heterogeneity in cell populations can thus lead to differentiation into different kinds of cells.
haviour will depend on both the behaviours of an agent j, and the incoming signals from
its neighbourhood. Some behaviours may never be expressed given the neighbourhood of an
agent j. These represent states a particular agent (cell) will never realize. Since behaviours
may be composed, we can define a sequence of signalling behaviours that works its way
across multiple agents (cells)9. In the case where every agent has only a single behaviour,
such a sequence of k behaviours is only possible if there is a directed path through agents
containing behaviours 1 through k. Consider a PNM model in which two agents, i and
j have identical potential behaviours and whose conditional probability functions all have
probability 1, i.e. the system is deterministic. Under what conditions will the dynamics
of the two agents in terms of the messages they send be identical? If the agents have no
dependence on their neighbourhoods, then their output messages will be identical. However,
if their state changes depend on the incoming signals from their neighbourhood, the situation
becomes more complex. For a single time iteration, if the two agents have neighbourhoods
in the same state their message sending dynamics will be identical. Over two time iterations,
dependence will be on both the neighbourhoods of the two agents, and the neighbourhood
of all agents in the neighbourhood of agents i and j. With three time iterations, dependence
is on the neighbourhood twice removed from the immediate neighbourhoods of i and
j. Working through this thought experiment a bit further, it becomes clear that even a
little heterogeneity with respect to incoming signals from agents’ neighbourhoods can lead
two agents who are otherwise identical in their initializations and behaviours (i.e. geneti-
cally identical, with identical potential behaviours) to behave differently, and can possibly
move into very different states (realizing very different expressed behaviours). Indeed, the
situation where the neighbourhoods would be guaranteed to be identical for a deterministic
9 Such a signalling sequence may be considered a list of distinct messages between agents at subsequent time steps. Thus a series of agents passing on the same message in each time step might have a sequence [m1, m1, m1, ..., m1], while a series of agents passing on a pair of alternating messages might lead to a sequence of the form [m1, m2, m1, m2, ..., m1, m2]. Such sequences are common in gene regulation networks [Alon 06] where transcription factor a may bind to gene A leading to the generation of transcription factor b, binding to gene B and so on; where the final gene product is not transcribed unless the complete sequence can be actualized.
PNM (where all transition function probabilities are 1) is just the case where all agents have
identical potential behaviour, and the network is regular (each vertex has the same number
of neighbours). For a PNM that is not deterministic (the usual case where transition func-
tion probabilities are less than 1) incoming signals from the neighbourhoods of identically
initialized agents will not be uniform. Such reasoning may provide insight into non-genetic
heterogeneity in clonal cell populations [Chang 08, Huang 11] in that it immediately identi-
fies two general mechanisms that can lead to clonal variability: the length and order of a
sequence of signalling behaviours leading to a final message output (and final state), and the
heterogeneity of neighbourhoods in terms of incoming signals. From the PNM perspective,
if cells are modelled as agents, heterogeneity in expressed behaviour arises quite naturally
even if all cells (agents) have the same potential behaviour. PNMs thus provide a context
for translating biological detail about specific types of messages and the actions (and in-
teractions) they trigger into distributed systems models that can be used to examine the
dynamics arising from biological signalling.
In the next chapter we focus on abstracting PNMs from biological processes that have
been associated with the stability of biological systems.
Chapter 6
Modelling with PNMs
Life began
when cyclic reactions
began to signal
then intermingle.
6.1 Abstract
In this chapter we survey Probabilistic Network Models (PNMs) that are broadly inspired by
processes that have been associated with stability in biology. This chapter establishes how
such processes may be modelled. Our goal is breadth, to show that a wide range of mecha-
nisms associated with stability in biology can be modelled via the PNM framework. In doing
so, we can bring these mechanisms into computer science, where they can be investigated
as approaches to stabilizing distributed systems. We look at how the PNM framework can
be used to construct models of viral processes, mutualism and autocatalysis, gene regula-
tion, differentiation, chemical signalling amongst organisms, and ecological networks. The
model of viral processes introduced here, is further explored in Chapter 7, where it is used
to demonstrate that the antagonism between cut-stability and connection-stability introduced
in Chapter 4 may be circumvented via resilient processes. In computer science we might
implement such processes by software mechanisms; in biology they are often tied to chemical
and configurational changes involving biomolecules.
6.2 Introduction and Motivation
Philosophers have a predilection towards compressing deep philosophical issues into apho-
risms they then proceed to unfold. What if I were a brain in a vat [Putnam 82] is philosoph-
ical shorthand as to whether you can determine if there truly is an external world. Biologists
have begun to ask a similarly deep question which could be stated aphoristically as, What
if I were a computer in soup? Well, not soup exactly, but colloidal materials such as exist
in cells [Pollack 01]. Exactly this idea inspires Dennis Bray’s Wetware: A Computer in
Every Living Cell. Wetware examines how molecular diffusion, molecular interactions, and
conformational changes in molecules within the cells can orchestrate a complex of biological
circuits where proteins act as both switches and message carriers, and where diffusion gra-
dients act as the wires1. This allows us to abstract a logical picture of cellular processes as
a network of interactions, a set of composed circuits. Such an abstraction has both power,
and peril; as Bray notes [Bray 09]:p.87:
‘So there are no wires. In fact, the term biochemical circuits is flawed in several respects,
a product, no doubt, of our propensity to attach spatial metaphors to processes of all kinds.
In reality, a signal traveling through a cell is a change in the numbers of specific molecules at
particular locations. Signals move from one place to another by diffusion and the influence
of enzyme catalysis. It sounds like a clumsy and haphazard process to us. But in the world
of atoms and molecules it can be astonishingly rapid and efficient. Let us hope it is anyway,
since our thoughts and actions depend on this very mechanism!’
Wetware like many recent works in computational systems biology2 has a small set of
1 Living cells are likened not only to computers, but also to distributed systems. Indeed, signalling proteins have been likened to ‘smart’ agents that play the coordinating role in the distributed system that is the network of protein signalling interactions in a cell [Fisher 99].
2 Systems biology is an emerging biological field that focusses on complex interactions in biology (primarily at the molecular level). A good review of systems biology principles is available in [Huang 04]. Computational systems biology focusses on developing computational models that can integrate data from systems biology into simulation models that can both be compared to, and suggest future, experiments. High-level summaries of computational systems biology are available in [Kahlem 06, Kitano 02].
paradigmatic analogies: complex molecular interactions can be viewed as circuits; cells and
cellular machinery can be viewed as performing computations; system level understanding
arises from comprehending how individual biological circuits and computations are coordi-
nated and feed back into each other.3 These computational analogies can be run in the
opposite direction, to ask, how can a computer be more like a cell? In a cell, there is no de-
signer of hardware or algorithm; rather the line between hardware and software is blurred as
molecular interactions work in synchrony to maintain the functions of a cell. This synchrony
has come about over evolutionary time scales – the innovations we see today are those that
have resulted in systems sufficiently stable to persist and evolve.
In this chapter, we will develop several abstract models of biological processes. In ab-
stracting biological processes into message passing systems, we are designing distributed
systems models based on biological systems that have withstood rather harsh tests of time
and environmental change. Some of our abstractions will be from intra-cellular systems,
where the ‘agents’ are molecules of various types. Other abstractions will be at a higher level,
where the respective agents may be cells, organisms, or species.
Imagine a computational systems biologist wishes to explore the similarities and differ-
ences of various biological processes that have been associated with stability. Is it possible
3 Cellular components and processes, viewed as the means by which cells produce computations, have been an effective analogy spurring biological investigations from the mid-twentieth century onwards. Bray’s Wetware is a current example and popularization of a long line of work benefitting from the cell-as-computer analogy. Some landmarks are the elucidation of specific regulatory components such as the lac operon by Jacob and Monod as a kind of regulatory circuit [Jacob 66], Kauffman’s investigation of generic properties of gene regulation networks using Boolean networks [Glass 73, Kauffman 69b, Kauffman 69a, Kauffman 74, Kauffman 93, Kauffman 04], Conrad’s investigation of information processing capabilities of biomolecules [Conrad 72, Conrad 79, Conrad 85, Conrad 90, Conrad 81], Alon’s recent investigations of specific types of circuit patterns unique to living systems [Alon 06, Alon 07, Milo 02, Shen-Orr 02], and Davidson’s extended series of elegant experimental studies deciphering specific gene regulatory elements and circuits, primarily in sea urchins [Davidson 08, de Leon 07, Erwin 09, Istrail 05, Istrail 07, Levine 05, Nam 10, Oliveri 08, Yuh 98, Yuh 01]. The work of Davidson’s group, in particular, gives detailed empirical flesh to the bones of the analogy that regulation in development can be viewed as if it were a network of computations. The analogy between cellular processes and computations also figures in efforts to design specific types of computers and algorithms using biological materials [Abelson 00, Adleman 98, Conrad 85, Knight 98]. Evelyn Fox Keller has provided a historical analysis of the enduring power of computational metaphors in biology, particularly development [Keller 02], while from the opposite direction, Nancy Forbes has documented the wide range of biologically inspired computing [Forbes 04].
to use a single framework to model a wide range of processes? On the biology side she
would want the flexibility to incorporate biological details of individual processes. Ideally,
she would like to be able to incorporate into the models details from current experiments.
On the computational side, she would want a framework that allows the different models to
be subject to similar types of analyses, allowing for theoretical comparisons of the processes
underlying the models. Modelling thus becomes the means by which our computational
systems biologist can integrate theory and experiment.
Our goal in this chapter is to build on the PNM model introduced in the previous chapter
and develop specific PNM models of different biological processes. In doing so, we are in
essence taking a short tour through several areas of computational biology. As we do so, we
will use each model to first establish the flexibility and utility of the PNM approach, and
secondly we will use the abstract model to further investigate biological notions of stability.
It is no exaggeration to say that stability concerns pervade biology; and that the nuances
of such concerns are never fully captured in formal definitions – either the stability definitions
in dynamical systems theory or the topological stability definitions developed in Chapter 4.
In both the evolutionary time scale of a species and the developmental time scale of an
individual, subsystems must persist as conditions change, but also be flexible enough to
alter as new conditions emerge.
Our models strip away the biological details – the specifics of molecular components,
chemical interactions, conformation changes, diffusion rates – to develop a cartoon picture
of the structure of the interactions as a PNM. We are then able, within our simplified
picture or abstraction, to examine the different kinds of stability concerns that might be
associated with different kinds of biological processes. Our cartoon models are simple multi-
agent systems that provide a way of bridging biology and computer science. This allows
conceptual flow in both directions.
The PNM framework, when applied to developing specific biologically inspired models,
is a means to unfolding the pattern of interactions characteristic of biology at various scales
from the cell to the ecosystem. The resulting models, if they capture aspects of the biological
systems they are abstracted from, can build up our theoretical understanding of biological
processes, so we see more clearly the similarities between processes that operate at different
scales and upon different components. Alternatively, we can unmoor the models from their
original biological context, and examine their suitability as elements of a distributed system.
By abstracting from biological systems for which there is no designer, we arrive at design
principles we can use for technological or even modified biological systems we may design.
To recap from the last chapter: a probabilistic network model (PNM) is a form of multi-
agent system. The agents are represented as vertices on a network. Edges represent commu-
nication channels between agents. Depending on the network we wish to represent, the edges
may be considered either directed or undirected. The agents are very limited – they can not
move, they can merely send messages. Agents send messages or change state based on a
combination of their current state, and the messages received from their local neighbour-
hood (incoming edges). Conditional probabilities are used to represent transition functions
of an agent sending a particular message, or entering into a particular state. These condi-
tional probability transition functions and the network topology govern the dynamics of a
particular model.
We will first introduce a model of viral processes, the virus and immune response model.
This model is further investigated in Chapter 7. This PNM incorporates in simple form
ideas from both epidemiology and immunology about how a virus may spread and how it
may be limited in a system. Since the idea of a virus has implications to stability in both
biology and computer science, it seems like a good starting point for our investigation of
PNMs. We then proceed with a trio of models of specific biological processes which have
been considered critical to discussions of the origin and maintenance of stability in biological
systems: mutualism and autocatalytic networks, gene regulation, and differentiation. Next
we introduce a model of message passing amongst organisms (semiochemicals) that is based
on recent literature in multi agent systems. Finally, we introduce a simple ecological flow
network model inspired by the ecological networks that motivated much of the theoretical
development in Chapter 4. With these half dozen models we illustrate the diverse range of
biological processes that can be modelled within the PNM framework.
The PNM approach meets Mitchell’s challenge [Mitchell 06] to move beyond static analy-
sis of network structures, and focus on information processing in networks ([Mitchell 06]:pg. 1202):
‘To understand and model information propagation, one has to take into account not
only network structure, but also the details of how individual nodes and links propagate,
receive, and process information, and of how the nodes, links and overall network
structure change over time in response to those activities. This is a much more complex
problem than characterizing static network structure’
The PNM approach is well suited to incorporating experimental data, where the transi-
tion probabilities and network structure can be based on empirical results. In cases where
no experimental data may be available, they allow the development of in-silico experiments,
looking at the dynamics of a model under a range of transition probabilities.
In the sections below, we focus on the conditional probability transition functions asso-
ciated with a given model. A specific model includes these functions, but further requires
(a) the assignment of particular functions to each agent to define agent behaviours, (b) the
specific network architecture representing the communication channels amongst agents and
(c) initialization of random variables and probabilities. The same set of functions can lead
to diverse models, depending on how they are assigned to and initialized within agents, and
the network architecture they are applied to.
For each biologically inspired model, we emphasize the underlying stability question that
the modelled biological process applies to. The stability in question in these models may
be broader than the formally defined notions of cut-stability and connection-stability in
Chapter 4, reflecting the various nuances of stability considerations as covered in Chapter
2. In general, we are seeking some property S of a system, that can be maintained as some
other property of a system P is perturbed, such that S is invariant given perturbations in
P . Then S is stable with respect to P .
6.3 Model 1 – Virus and Immune Response
A natural starting point for our computational systems biologist is viral phenomena, which
are implicated in the stability of both biological and technical networks. In the form of
rumour spreading and gossip, viral processes also affect social networks. The virus and
immune response model is our choice for detailed investigation since viruses exist in both
the technological and biological realms. The dynamics of the virus model4 under various
network topologies is the main subject of Chapter 7, ‘Dynamic Resilience’.
We consider a virus infection and an immune response moving through a network of
agents. The immune response is the simplest possible in that it is the sending of a warning
message. This proto immune response is found in systems as diverse as the human immune
system and the root communities of plants (‘allelochemicals’). Immune responses stabilize
biological systems in the face of externally originating perturbations such as pathogens,
wounding, and foreign substances.
Our model attempts to capture as simply as possible the interplay between a viral process
subverting the function of agents and an immune response that can lead to an agent response.
We focus only on the spread of the warning message, not the specifics of the orchestrated
response, since we assume the warning message must first travel through a system before a
response can be orchestrated. Future models can explore various forms of response.
Consider a viral process operating on a network, where an agent (a processor, an organ-
ism) is able to send a warning message with some probability q before transmitting a viral
4 Model 1 was co-developed with the Virus Group at University of Calgary Computer Science, which also included: John Aycock, Ken Barker, Lisa Higham, Jalal Kawash and Philipp Woelfel.
package with some probability r. Given a particular network, under what combinations of
q and r will the immune response run ahead of the viral contagion, and vice versa? We are
considering the notion of a ‘viral process’ broadly to be any case where information can be
communicated through a network via the connectivity of that network. Usually we asso-
ciate the notion of a ‘virus’ with mal-information that negatively affects the functioning of
a system. However, a viral process may also be beneficial – for example the spread of an
innovation or a new idea through a social network. In the model we develop, both our ‘virus’
and our ‘immune response’ could be considered viral processes.
In computer science viral processes could include models for Internet viruses and worms.
In biology this could include models for viruses, bacteria, pesticides or other poisons moving
through an ecosystem, as well as allelopathic5 responses in plants (‘chemical warfare’). On
the beneficial end of the spectrum, viral processes could represent marketing efforts, the
spread of innovation, dissemination of ideas and other social phenomena that spread via
word-of-mouth.
In the virus model we make some assumptions that allow us to simplify our model. We
begin by assuming the following round structure, including phases within rounds:
• Round 0 – initial infection (of a single node in the network).
• Round 1, Phase 1 – an immune message is sent.
• Round 1, Phase 2 – the viral payload is sent.
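The round-and-phase structure above can be sketched as a toy simulation (the function `simulate`, the synchronous state arrays, and the collapsing of the model's time lags into one round are our own simplifications, not the thesis implementation):

```python
import random

def simulate(neighbours, q, r, rounds=10, start=0, seed=0):
    """Toy round structure for the virus and immune response model.
    Round 0 infects a single node; each later round has an immune phase
    (warnings sent by infected nodes with probability q, and immunity
    relayed by immune nodes with probability 1) followed by a viral
    phase (payloads sent with probability r)."""
    rng = random.Random(seed)
    n = len(neighbours)
    state = ["N"] * n                 # N(eutral), I(mmune), V(iral)
    state[start] = "V"                # Round 0: initial infection
    for _ in range(rounds):
        # Phase 1: warning messages spread
        new = state[:]
        for j in range(n):
            if state[j] == "N":
                if any(state[i] == "I" for i in neighbours[j]):
                    new[j] = "I"      # immune neighbour confers immunity
                elif any(state[i] == "V" and rng.random() < q
                         for i in neighbours[j]):
                    new[j] = "I"      # infected neighbour warns with prob q
        state = new
        # Phase 2: the viral payload spreads
        new = state[:]
        for j in range(n):
            if state[j] == "N" and any(state[i] == "V" and rng.random() < r
                                       for i in neighbours[j]):
                new[j] = "V"
        state = new
    return state

# A 4-node ring; with q = 1 every neutral neighbour of the infected node
# is warned in phase 1 before the payload arrives, so the virus is contained.
ring = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
assert simulate(ring, q=1.0, r=1.0).count("V") == 1
```

Because the warning phase precedes the payload phase within each round, a high q lets the immune response run ahead of the contagion even when r = 1.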
We assume that immunity is conferred by the reception of a warning message from a
node i to a node j.
Within a network, if a neighbour of j is immune in the previous time step, it conveys its immunity to j with probability 1. If j has infected neighbour(s) in the previous time step, immunity is conferred on j with probability q. If j has infected neighbour(s) in the previous time step, and no neighbour has conferred immunity, the probability that j gets infected is r.
5 In allelopathy, chemicals produced by roots have usually detrimental (though sometimes positive) effects on neighbouring plants of differing species.
The random variable X_j^t is 1 if node j is infected at time t, and 0 otherwise.
The random variable Y_j^t is 1 if node j is immune at time t, and 0 otherwise.
In our model, there are three different possible states for an agent: V(iral), I(mmune)
and N(eutral)6.
The infection model equations are:
The previously infected case:
p(X_j^t = 1 | X_j^{t-2} = 1) = 1.
The previously uninfected case:
p(X_j^t = 1 | X_j^{t-2} = 0 ∧ Y_j^{t-1} = 0 ∧ ∀i∈Γ(j): Y_i^{t-1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t-2} · r).
Put into words, if an agent j is previously infected, it remains infected (non-functional). A previously uninfected agent j has a probability of being infected at step t (given that j is not previously immune) if any of its neighbours are infected. The probability of infection for j is dependent on the number of its neighbours that are infected.
The immune model equations are:
The previously immune case:
p(Y_j^t = 1 | Y_j^{t-2} = 1) = 1.
The previously immune neighbours case:
p(Y_j^t = 1 | Σ_{i∈Γ(j)} Y_i^{t-2} > 0 ∧ Y_j^{t-2} = 0) = 1.
The previously infected neighbours case:
p(Y_j^t = 1 | X_j^{t-1} = 0 ∧ Y_j^{t-2} = 0 ∧ ∀i∈Γ(j): Y_i^{t-2} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t-1} · q).
Put into words, if an agent j is previously immune, it remains immune. An agent j
6 The neutral state can switch to either viral or immune given receipt of the appropriate messages. In conventional epidemiological models this state is often called 'susceptible'.
that was not previously immune is immune at step t if any of its neighbours were previously immune, OR it is immune with some probability if any of its neighbours are infected (and no neighbour was immune in the previous time step).
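The round structure and transition rules above can be sketched as a small simulation. The following is a minimal illustrative sketch in Python; the function and parameter names are our own, not part of the model's formal statement, and the two phases of each round are implemented as an immunity pass followed by an infection pass over the previous round's states.

```python
import random

def simulate(adj, q, r, seed_node=0, rounds=20, rng=None):
    """Race a warning message (sent w.p. q) against a viral payload (sent w.p. r).

    adj maps each node to its list of neighbours; states are
    'N' (neutral), 'I' (immune), 'V' (viral).  Within each round,
    immune messages (phase 1) are processed before payloads (phase 2).
    """
    rng = rng or random.Random(0)
    state = {v: 'N' for v in adj}
    state[seed_node] = 'V'                       # round 0: initial infection
    for _ in range(rounds):
        prev = dict(state)
        # Phase 1: immunity spreads from the previous round's states.
        for j in adj:
            if prev[j] != 'N':
                continue
            if any(prev[i] == 'I' for i in adj[j]):
                state[j] = 'I'                   # immune neighbour confers immunity w.p. 1
            else:
                stay = 1.0
                for i in adj[j]:
                    if prev[i] == 'V':
                        stay *= (1 - q)          # each infected neighbour warns w.p. q
                if rng.random() < 1 - stay:
                    state[j] = 'I'
        # Phase 2: the virus spreads to nodes still neutral after phase 1.
        for j in adj:
            if prev[j] != 'N' or state[j] != 'N':
                continue
            stay = 1.0
            for i in adj[j]:
                if prev[i] == 'V':
                    stay *= (1 - r)              # each infected neighbour transmits w.p. r
            if rng.random() < 1 - stay:
                state[j] = 'V'
    return state
```

On a five-node path, for example, q = 1 lets the warning outrun the virus so that only the seed node is ever infected, while q = 0 with r = 1 lets the infection sweep the whole path.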
The virus model can be easily extended to include other kinds of behaviours, such as
resistance to a virus, or spontaneous recovery from a virus. In the latter case, the model
may never terminate. Our stability question, which is addressed in the next chapter, is simply this: under what network topology and choice of q and r may the warning message run ahead of the virus, essentially blocking or limiting a viral attack and thus dynamically providing connection-stability?
This model can be seen as a variant of the compartmental models prevalent in the epi-
demiological modelling literature (see models and references in [Daley 99]) with a few key
distinctions. First, we state the models in terms of transition probabilities of an individual
agent moving from one compartment to another, rather than in terms of differential equa-
tions characterizing the population's movement through various compartments. Secondly, we
incorporate both the spread of the virus and the immune message into our model. Third,
our model can be interpreted as focussed on the cellular rather than organism level. While
organisms may recover from a virus, cells are often destroyed via lysis at the point of viral
transmission to other cells.
The model can also be considered to reflect a simple scenario in a distributed computing
setting. Say a computer virus deploys its payload to a specific logical port. The warning
message, if received by a server, blocks it from listening on that logical port.
In the next chapter we will consider this model under several different network archi-
tectures, and determine the conditions under which the virus races ahead (leading to an
epidemic) or the warning message races ahead (thus blocking the virus from creating an
epidemic), dynamically stabilizing the network.
In the next three sections our computational systems biologist considers a trio of models
that could be said to represent stability concerns at different qualitative levels of complex-
ity in a biological hierarchy from molecules, to cells, to tissues. The first model(s) concern
mutualism and autocatalysis, which are often associated with origin of life scenarios. Gene
regulation models then raise the question of how, within a cell, gene products can be main-
tained at a level required for the cell to be viable, via genes regulating other genes through
both negative and positive feedback. Finally, differentiation concerns how the development
of cell types within a tissue can be coordinated in a stable way as an organism develops
from a single-cell stage to multicellularity. While the first three models are all focussed on
processes within and between cells, the final two models look at interactions amongst organ-
isms and integrate the ideas from both systems biology and ecology. We examine a model
of the interactions of bees and their pollen sources inspired by the multi-agent literature in
computer science and recent ecological literature on pollinator networks in biology. Finally,
we integrate the perspective of our computational systems biologist and our ecologist from
Chapter 4 to look at predator-prey relationships in a simplified ecosystem.
6.4 Model 2 – Mutualism and Autocatalysis
One of the most fascinating questions in biology is: how do stable sub-systems originate that
can maintain themselves separately from their environment? The origin scenario is often a
primordial 'soup' of chemical reagents. Biologists wonder how likely a series of reactions is that can self-stabilize in the sense that the reactions maintain themselves. It is assumed that such a series of reactions was necessary to create the first proto-metabolism. In terms of chemical reactions this phenomenon is called 'autocatalysis'. In ecology the term 'mutualism'
is used for a set of positive interactions that mutually sustain each other.
Strictly speaking, autocatalysis requires both a set of processes that causally promote
each other, and a set of catalysts. Models of autocatalysis figure prominently in ori-
gin of life scenarios [Deacon 06, Kauffman 86, Hordijk 04, Hordijk 10, Konnyu 08, Kun 08,
Maynard Smith 99, Szathmary 06, Szathmary 07]. We will begin with the simpler case of
mutualism and then add in the features for autocatalysis. Rather than considering specific
chemical or ecological interactions, we will take an approach that could represent phenomena
in a distributed computational system, a series of processes that can turn each other on (or
off).
Consider three processes whose states are represented by the Boolean random variables,
X, Y, Z. Each process is either on (state = 1) or off (state = 0). Let o, q, r represent transition probabilities.
o: probability X is switched on for an agent i if Z is already on in a neighbouring agent.
q: probability Y is switched on for an agent i if X is already on in a neighbouring agent.
r: probability Z is switched on for an agent i if Y is already on in a neighbouring agent.
The mutualism model equations are:
Process 1: p(X_j^t = 1 | X_j^{t-1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Z_i^{t-1} · o).
Process 2: p(Y_j^t = 1 | Y_j^{t-1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t-1} · q).
Process 3: p(Z_j^t = 1 | Z_j^{t-1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Y_i^{t-1} · r).
If we limit each agent to being assigned only a single process, then a self-perpetuating cycle of positive interactions requires that each agent's precursor process run in its in-neighbourhood: the set of vertices with edges incoming to that agent. A biological example of such a mutualistic
set of processes is given by Ulanowicz [Ulanowicz 97] (pp: 42-45). Bladderworts are aquatic
plants. They absorb nutrients via filamentous stems and leaves. These leaves provide the
substrate for a film called ‘periphyton’ (itself a mix of bacteria, diatoms and blue-green algae).
Zooplankton feed on the periphyton film. These zooplankton are absorbed into the bladders
of the bladderwort where they decompose. The nutrients from their decomposition promote
the growth of new stems and leaves of the bladderwort, thus closing the cycle of mutualistic
interactions. So the bladderworts (analogous to process 1) are promoted by the zooplankton. The periphyton film (analogous to process 2) is promoted by the growth of the bladderwort leaves. The zooplankton (analogous to process 3) are promoted by the periphyton film.
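The three switch-on equations above can be sketched as a synchronous update over a directed network. This is a minimal illustrative sketch in Python; the agent and state representations are our own assumptions, and (following the equations, which only give switch-on probabilities) a process that is on stays on.

```python
import random

def step(states, in_nbrs, probs, rng):
    """One synchronous update of the mutualism model.

    states: agent -> {'X': 0/1, 'Y': 0/1, 'Z': 0/1}
    in_nbrs: agent -> list of agents with edges incoming to it
    probs = (o, q, r): Z promotes X w.p. o, X promotes Y w.p. q,
    Y promotes Z w.p. r.
    """
    o, q, r = probs
    chain = [('X', 'Z', o), ('Y', 'X', q), ('Z', 'Y', r)]
    new = {j: dict(s) for j, s in states.items()}
    for j in states:
        for var, precursor, p in chain:
            if states[j][var] == 0:
                stay_off = 1.0
                for i in in_nbrs[j]:
                    stay_off *= (1 - states[i][precursor] * p)
                if rng.random() < 1 - stay_off:
                    new[j][var] = 1
    return new
```

On a directed three-agent ring with o = q = r = 1, switching X on at one agent propagates Y and then Z around the cycle over successive steps.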
To mutualism, let us now add the idea that in addition to a set of processes that mutually
support each other, there are some other processes that act as catalysts. For the purposes
of our model, if there exists some process X which if running can promote some process Y
with probability o, then the action of a catalyst is to increase the probability from o to o′
(o′ > o).
Let us imagine the mutualistic cycle as before, but now with three catalysts, A, B and C. For our purposes, a catalyst must at least double the probability of a reaction.
In the presence of catalyst A (A = 1): s = o + l, where o + l ≤ 1 and l ≥ o.
In the presence of catalyst B (B = 1): t = q + m, where q + m ≤ 1 and m ≥ q.
In the presence of catalyst C (C = 1): u = r + n, where r + n ≤ 1 and n ≥ r.
In chemical reaction systems, the likelihood of a reaction in the absence of a catalyst
(o, q, r) is negligible relative to the likelihood of a reaction in the presence of a catalyst
(s, t, u).
The original equations from the mutualism model still apply, but we must now add the
following three equations to also account for the presence of catalysts, A, B and C.
The autocatalysis model equations are:
p(X_j^t = 1 | X_j^{t-1} = 0 ∧ (Σ_{i∈Γ(j)} A_i^{t-1} ≥ 1 ∨ A_j^{t-1} = 1)) = 1 − ∏_{i∈Γ(j)} (1 − Z_i^{t-1} · s).
p(Y_j^t = 1 | Y_j^{t-1} = 0 ∧ (Σ_{i∈Γ(j)} B_i^{t-1} ≥ 1 ∨ B_j^{t-1} = 1)) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t-1} · t).
p(Z_j^t = 1 | Z_j^{t-1} = 0 ∧ (Σ_{i∈Γ(j)} C_i^{t-1} ≥ 1 ∨ C_j^{t-1} = 1)) = 1 − ∏_{i∈Γ(j)} (1 − Y_i^{t-1} · u).
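A small helper can illustrate how a catalyst modifies a switch-on probability. This sketch (the function and parameter names are ours) computes the probability that process X switches on at an agent j, using the catalysed rate s only when catalyst A is present in j's neighbourhood or at j itself.

```python
def switch_on_prob(z_nbrs, a_nbrs, a_self, o, s):
    """Probability that process X switches on at agent j at time t.

    z_nbrs: Z states (0/1) of j's in-neighbours at t-1
    a_nbrs, a_self: catalyst A states at t-1 in the neighbourhood / at j
    o: uncatalysed rate; s = o + l (with l >= o and o + l <= 1): catalysed rate
    """
    rate = s if (sum(a_nbrs) >= 1 or a_self) else o
    stay_off = 1.0
    for z in z_nbrs:
        stay_off *= (1 - z * rate)   # each active Z neighbour independently promotes X
    return 1 - stay_off
```

With two active Z neighbours, o = 0.1 and s = 0.5, for instance, the switch-on probability jumps from 0.19 uncatalysed to 0.75 catalysed, reflecting how negligible uncatalysed rates become substantial in the presence of a catalyst.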
In both the mutualism and autocatalysis models, the stability question concerns the
distribution of agents and the initial agent states that allow for mutualistic or autocatalytic
cycles such that the whole system is continually in the ‘on’ state for all random variables in
all agents7.
6.5 Model 3 – Gene Regulation
Random Boolean Networks (RBN) were an early model of regulation (specifically, gene
regulation) developed in the late sixties by Stuart Kauffman [Kauffman 69b, Kauffman 69a,
Kauffman 93]. In this model, the agents represent genes. Each ‘gene’ holds one Boolean
logical function. The logical functions are randomly assigned to the genes. Each gene is
connected to k other genes. The model is deterministic in that the output of each gene
depends only on the inputs. Randomness only exists in (a) the initial assignment of logical
functions to genes and (b) the initial state of each gene (1, 0), but not in the course of
a simulation run given those initial conditions. The model was originally developed to
examine homeostasis and differentiation [Kauffman 69a], which we previously encountered
in Chapter 2. To recap from that chapter, homeostasis is the ability of interdependent
elements in a biological system to be relatively stable in their relationships to each other
even as external conditions change, whereas differentiation is a constrained change in the
relationships amongst parts in the course of development, usually from less specific to more
specific cell types.
In the RBNs Kauffman initially investigated, he noted two patterns that he proposed
may explain some general features of homeostasis and differentiation. First he noted that
while there are 2^N possible states (where N is the number of genes), only a small proportion
of those states are realized in the simulations. Consider a disordered system to be one where each state has an equal probability of appearing during the course of a simulation. Relative
7 In biology, stability considerations are often in the context of maintaining functional processes. Homeostasis, covered in Chapter 2, is with respect to maintaining an internal environment where processes keeping an organism alive can function. Death could be considered an extremely stable state, but not functional.
to that, the RBNs appeared to be highly ordered. This was inferred to be evidence for
homeostasis arising from interacting genes. He further noted that across runs with different
initial conditions, the RBNs ultimately fell into occupying different subsets of the possible
states; either a single state or a small cycle of states, which if reached at some time t,
would hold for that run for all subsequent times t∗ > t. These different cycles and points in
the space of all possible states were inferred to represent different cell types arising in the
course of differentiation from a common set of genes. Kauffman’s initial discoveries of these
patterns in the late sixties were ahead of their time. By the mid eighties, such patterns were
commonly known in both technical and non-technical jargon as ‘attractors’, and the initial
conditions that led to the same attractor were called ‘basins of attraction’8.
Let us consider a very simple example of a Boolean Network, one with only three genes
(agents). Each gene is connected to the two other genes by a directed edge. We will consider
only three Boolean functions, and, or and xor as the behaviours of the genes. In the
network we consider, each gene is assigned a different function. Since each gene has only one
behaviour, we could consider our three genes to be enacting, and, or, and xor.
The Boolean Network model is:
and: p(X_j^t = 1 | min_{i∈Γ(j)} X_i^{t-1} = 1) = o.
or: p(X_j^t = 1 | max_{i∈Γ(j)} X_i^{t-1} = 1) = q.
xor: p(X_j^t = 1 | Σ_{i∈Γ(j)} X_i^{t-1} = 1) = r.
Where o = q = r = 1.
This model is considerably simpler than the models previously considered. The output
of a gene at time t depends on the inputs from its neighbourhood (the other two genes) at time t − 1.
8 See [Kauffman 93]:175-179 for a non-technical description and [Ruelle 89]:24 for an operational definition of attractors and basins of attraction. A brief high-level survey of the mathematical ideas behind attractors is in [Ruelle 06].
We can give a message passing interpretation to the model. If a gene is in state 1 a
chemical message is sent to the genes it is connected to, whereas if it is in state 0, no message
is sent. Working through simple Boolean networks like this by hand is a good way to gain
intuition. Let us consider the above network beginning in the state: and = 1, or = 1, xor = 1.
Note that if the case and = 0, or = 0, xor = 0 occurs, no messages are sent, and we could
consider this system to halt at that point. The state of the system at a particular point in
time, is just the triplet of the states of the individual genes. Thus, (1, 1, 1) and (0, 0, 0) are
system states that reflect the state of all three genes at a particular point in time.
Say our initial state is (1, 1, 1). This is shorthand to indicate: and = 1, or = 1, xor = 1.
The initial and following states are listed below, one system state per line (time step).
Our associated probabilities for o, q, r are all set at 1, so the system essentially behaves
deterministically.
Run 0: Deterministic
and, or, xor
(1, 1, 1)
(1, 1, 0)
(0, 1, 0)
(0, 0, 1)
(0, 1, 0)
(0, 0, 1)
(0, 1, 0)
....
Even in this extremely simple example, we can see that the system runs through only 4 of
8 possible states for the given initial conditions, and after a few iterations, cycles only between
two states, (0, 1, 0) and (0, 0, 1). Secondly, given that the model is effectively deterministic when all probabilities are 1, if in any run a state is repeated, a cycle ensues. Since the state space is finite, the system must eventually repeat a state, and therefore must cycle. For our particular model, alternate initial conditions not encountered in this run are (1, 0, 0), (0, 1, 1) and (1, 0, 1), and they will also end up in the same two-state cycle. Thus, this particular system has a single attractor that cycles between two states. All initial states except (0, 0, 0) are in the basin of that attractor.
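The hand-worked run above is easy to verify mechanically. Below is a minimal Python sketch of this particular three-gene network (the function name is ours); it reproduces the deterministic trajectory from (1, 1, 1) and its two-state attractor.

```python
def run_boolean_network(init, steps=8):
    """Three genes with mutual inputs: 'and' reads (or, xor), 'or' reads
    (and, xor), 'xor' reads (and, or); all message probabilities are 1,
    so the update is deterministic."""
    trace = [tuple(init)]
    for _ in range(steps):
        a, o, x = trace[-1]
        trace.append((int(o == 1 and x == 1),   # and: both inputs on
                      int(a == 1 or x == 1),    # or: any input on
                      int(a + o == 1)))         # xor: exactly one input on
    return trace
```

Starting from (1, 1, 1), this reproduces the run listed above: (1, 1, 1), (1, 1, 0), (0, 1, 0), (0, 0, 1), and then the two-state cycle, and starting from the other non-zero initial states it falls into the same cycle.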
The model moves from deterministic to probabilistic if we loosen the condition o = q = r = 1 to allow probability values 0 ≤ o, q, r ≤ 1. Now, let us consider the weaker condition o = q = r = 0.5, where, given the logical condition is met, the sending of a message is determined by a coin toss.
Next we examine three runs, given the same initial conditions as before, with a coin-toss
determining if the message is sent. If the coin is heads, a message is sent (and arrives at
its destination). Thus, 1H would indicate the gene is in state 1 and its output message is
forwarded. If the coin is tails, the message is not sent (or is sent, but does not arrive at its
destination). Thus 1T would indicate the gene is in state 1, but its output message is not
received in the next time step. As before, a gene in state 0 sends no message. One way to
interpret these messages biologically is as transcription factors (proteins which modify the
activity of a gene) with weak binding9.
Run 1: Probabilistic
and, or, xor
(1H, 1H, 1T )
(0, 1H, 0)
(0, 0, 1H)
(0, 1T, 0)
9 Transcription factors are proteins that can bind to specific regions of DNA, and can act to eitherpromote or repress the transcription of DNA to RNA.
(0, 0, 0)
Run 2: Probabilistic
and, or, xor
(1T, 1H, 1T )
(0, 0, 1T )
(0, 0, 0)
Run 3: Probabilistic
and, or, xor
(1H, 1H, 1T )
(0, 1H, 0)
(0, 0, 1H)
(0, 1H, 0)
(0, 0, 1H)
(0, 1T, 1H)
(0, 1H, 0)
(0, 0, 1H)
(0, 1T, 0)
(0, 0, 0)
Several things should be observed about these three runs, relative to the previous system.
First, given the same initial conditions, each run varies in both sequence of states and length
of the run. Secondly, the total length of a run without cycles can now be larger than the number of possible states (Run 3). Third, if we consider the sending of no messages at a time step a halting condition (i.e. 3 tails), it is possible that the system may halt at any iteration with some probability (0.5³ = 0.125 in this case). Changing our model from deterministic to stochastic has created increased variability in the states observed within and across runs
(even with the same initial conditions). As mentioned earlier, the chosen probabilities might reflect binding strengths that are experimentally observed. They could be further expanded to reflect other sources of variation in the experimental situation, such as the probability of a given chemical message being in the proximity of a binding site. For example, due to diffusion processes, a given quantity of transcribed chemical signals is more likely to affect nearby genes on the same chromosome (cis-regulation) than genes on other chromosomes (trans-regulation)10.
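The coin-toss variant can be sketched by making each sent message arrive with probability p. This illustrative Python sketch (the names, and the convention of stopping once the system reaches (0, 0, 0), after which no further messages can ever be sent, are our own simplifications) recovers the deterministic behaviour at p = 1 and immediate halting at p = 0.

```python
import random

def run_stochastic(init, p=0.5, max_steps=50, rng=None):
    """Coin-toss variant of the three-gene network: each gene in state 1
    gets its message through with probability p, and a gene's next state
    depends only on the messages it actually receives."""
    rng = rng or random.Random(42)
    trace = [tuple(init)]
    for _ in range(max_steps):
        a, o, x = trace[-1]
        da = a and rng.random() < p      # did 'and' deliver its message?
        do = o and rng.random() < p      # did 'or' deliver?
        dx = x and rng.random() < p      # did 'xor' deliver?
        state = (int(do and dx), int(da or dx), int((da + do) == 1))
        trace.append(state)
        if state == (0, 0, 0):           # no messages possible: the run halts
            break
    return trace
```

At p = 1 this recovers the deterministic run from (1, 1, 1); at p = 0 the system moves directly to (0, 0, 0) and halts; intermediate p values produce runs that vary in length and sequence even from the same initial state, as in Runs 1 to 3.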
If the relevant probabilities (in this simple example, o, q and r) are close to but less than 1, then the system will act similarly to a deterministic Boolean network much of the time,
but due to some inherent stochasticity it will be possible during a simulation run that has
settled on a cycle, to jump to another cycle (i.e. switch attractors). It may also be that
as the probabilities are lowered, the increased frequency of such events leads to the cycles
that represent attractors in the corresponding deterministic system becoming increasingly
smeared out in the stochastic system. Such stochasticity, which allows variation amongst runs even when given the same initial conditions, may be a source of developmental noise that allows some cells to switch to a cell fate different from that of the cells surrounding them (see differentiation below). Evidence that developmental noise does play a role in cell lineage choice was recently
given by Chang et al. [Chang 08].
Huang [Huang 04] has distinguished between two groups of researchers studying biomolec-
ular networks: globalists who seek to understand generic aspects of gene regulatory networks
[Kauffman 04], and localists who seek to elucidate individual pathways and units of regula-
tion [Istrail 05, Istrail 07]. Kauffman’s RBN approach, focussing on the dynamics of gene
regulatory networks is an example of the globalist approach that has developed over the last
forty years.11 The PNM framework, developed here, can be seen as a way of extending this
10 Cis-regulation refers to regulating the expression of genes on the same chromosome whereas trans-regulation refers to regulating the expression of genes on other chromosomes.
11 See [Kauffman 93] for a review of seminal literature on the globalist approach via RBNs; more recent literature is reviewed in [Huang 09c] with a focus on carrying this approach over to cancer research.
approach from deterministic to stochastic systems, in a natural way that allows for incorpo-
ration of specific signalling behaviours drawn from different areas of biology.12 Over the last
decade or so, a localist view has developed that has been very influential in the development
of systems biology, namely network motifs [Alon 07]. Network motifs are individual ‘circuits’
of interactions that appear in gene regulatory networks much more frequently than expected
in random networks. They too, can be described, and extended via the PNM framework.
Network motifs are typically one-to-several genes and their associated transcription factors.
Their extreme specificity allows them to be models for specific genes, or small sets of genes
that interact. Two common network motifs are negative autoregulation, where a transcrip-
tion factor represses the transcription of its own gene, and positive autoregulation, where a
transcription factor promotes the transcription of its own gene.
A very simple PNM model of autoregulation consists of a single agent that sends messages to itself. The simplest example would be to associate a gene (agent) with two probabilities,
o and q, that are respectively the probability of a gene transcribing itself in the case where
it has not previously received a transcription factor message, and the probability of a gene
transcribing itself in the case where it has previously received a transcription factor message.
p(X_j^t = 1 | X_j^{t-1} = 0) = o.
p(X_j^t = 1 | X_j^{t-1} = 1) = q.
In this model, o > q represents negative autoregulation, while q > o represents positive autoregulation. The absolute value of the difference o − q represents the intensity of the regulatory response.
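The two-probability model is a two-state Markov chain, and a single step can be sketched directly (the function name is ours):

```python
import random

def autoreg_step(x_prev, o, q, rng):
    """One step of single-gene autoregulation: transcribe w.p. o if the gene
    did not message itself at t-1, and w.p. q if it did.
    o > q gives negative autoregulation; q > o gives positive."""
    p = q if x_prev == 1 else o
    return 1 if rng.random() < p else 0
```

At the extreme o = 1, q = 0 the gene alternates on and off every step, the sharpest possible negative autoregulation; intermediate values give noisier versions of the same tendency.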
A more graduated example of autoregulation would be one where the probability of sending a message to itself changes with the number of messages received, i.e. creates a weak form
12 Within the RBN framework, Ilya Shmulevich has developed a probabilistic extension [Shmulevich 02a, Shmulevich 02b] where each agent, rather than being assigned a single Boolean function, is assigned a set of functions, and at each time step chooses amongst those functions with some probability.
of memory, essentially the building of a counter13.
Consider a series of probabilities s_1, s_2, s_3, ..., s_N for a gene transcribing itself. Let these probabilities be associated with an indexed series of random variables, where the nth stage towards complete suppression or complete promotion at some time t for some vertex j is represented by X_{(n)j}^t:
p(X_{(n)j}^t = 1 | X_{(n-1)j}^{t-1} = 1) = s_n.
If the probabilities are related such that s_1 > s_2 > s_3 > ... > s_N, we have graduated negative autoregulation. If the probabilities are related such that s_1 < s_2 < s_3 < ... < s_N, we have graduated positive autoregulation.
As a slightly more complex example, consider a feedforward loop, a network motif found
in many gene systems across a range of organisms. There are three genes (represented by
random variables), X, Y , Z. X regulates both Y and Z. Y also regulates Z. Therefore, Z is
regulated by both X and Y . Let us represent a feedforward loop, where Z transcribes only
if transcription factor messages are received from both X and Y , and where Y transcribes,
only if it has received a transcription factor message from X. Let o be the probability that
X sends a message to Y . Let q be the probability that X sends a message to Z. Let r be
the probability that Y sends a message to Z.
p(Y^t = 1 | X^{t-1} = 1) = o.
p(Z^t = 1 | X^{t-1} = 1 ∧ Y^{t-1} = 1) = q × r.
Again, under a message passing interpretation, we could consider the probabilities o, q,
and r to represent values determined from experiment that may represent either binding
strengths for the chemical messages or the likelihood of the messages being in proximity.
Such empirical facts, once determined, could either be put into single values for o, q and r, or result in these probabilities being determined by other probabilities, such as those
13 In the current incarnation of the PNM model, the amount of memory in a system is tied to the number of random variables that hold state, which act as simple registers that can hold a single value.
that may reflect the interaction of binding strength and proximity.
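The feedforward loop can be sketched as a small simulation in which X is held on and Y and Z latch on once they first transcribe (the latching, and the names, are our simplifying assumptions, not part of the equations); it illustrates the characteristic delay of Z relative to Y.

```python
import random

def simulate_ffl(o, q, r, steps, rng=None):
    """X is held on throughout; returns the first times Y and Z turn on.

    Y needs a message from X (prob o); Z needs messages from both X
    (prob q) and Y (prob r), so Z cannot fire before one step after Y."""
    rng = rng or random.Random(0)
    y = z = 0
    first_y = first_z = None
    for t in range(1, steps + 1):
        y_new = y or rng.random() < o                # message from X
        z_new = z or (y and rng.random() < q * r)    # needs X and last step's Y
        y, z = int(y_new), int(z_new)
        if y and first_y is None:
            first_y = t
        if z and first_z is None:
            first_z = t
    return first_y, first_z
```

With o = q = r = 1 the delay is exactly one step: Y fires at t = 1 and Z at t = 2. Lower probabilities stretch the delay stochastically, which is one reason this motif is often described as filtering out transient signals.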
The enduring stability question in gene regulation is quite simply this: how does an
assemblage of interacting genes maintain production of chemical messages so as to allow the
cell to maintain function in the face of both external and internal perturbations? In this
context, not only does every single identified ‘gene circuit’ have to stay within tight bounds,
but the system as a whole, across interacting circuits, must stay within tight bounds that
allow for cell functionality, i.e. maintain homeostasis.
Maintaining homeostasis is a necessary condition at the higher level of cells, affording the
possibility of differentiation: regular directed changes in cells that lead from a single generic
cell type to multiple more specialized cell types that allow for complex organisms.
6.6 Model 4 – Differentiation
In this section, we will briefly consider differentiation in the context of cell-cell signalling.
How does one cell become many in the course of differentiation? To even begin to consider
this question, we have to develop several tools. First we have to identify the simplest models
under which a single primary cell type can differentiate into a secondary cell type. Secondly
we have to examine the conditions under which a distribution of different cell types can be
stably maintained. Third, to link theory to experiment, we must develop models that can
be tuned with experimental data on specific chemical signals (cytokines14) to develop virtual
experimental systems.
Below is a very high-level model of the problem of differentiation, which serves as a
first step to develop more specific models of differentiation tied to particular tissues and cell
types15.
In the model, cells are organized into tissues. Within a tissue each cell type has a specific
distribution. The switch from one cell type to another depends both on signals generated
14 Cytokines are small proteins used in communication amongst cells.
15 Model 4 co-developed with Sui Huang.
internally and by neighbouring cells. The result is a family of PNMs of the general form:
p(C_j^t, S_j^t | C_j^{t-1}, S_j^{t-1}) = f(C_{i∈Γ(j)}^{t-1}, S_i^{t-1}).
In words, the probability of a new cell type C_j^t and its internal states S_j^t, given the immediately preceding cell type C_j^{t-1} and its internal states S_j^{t-1}, depends on the cells it can receive signals from, C_{i∈Γ(j)}^{t-1}, and their internal states S_i^{t-1}. For each cell at a particular point in time, the vector of internal cell states S takes the form [s_1, s_2, s_3, ..., s_N] and defines which signals the cell is capable of receiving or sending at that point in time.
In the context of such a model, network structure becomes vital, and is determined by
the pattern of signals passed both between and within cells. Since the network topology
reflects interactions via signalling, it can be more complex than the topology of cells that
are immediate physical neighbours.
In differentiation, change is directed so as to result in a viable organism with functional
parts. Any botanist who has looked at a leaf incongruously appearing where a flower petal should be recognizes the effect of mis-signalling: moving a cell towards a different fate.
Thus, differentiation is a process where change is strongly constrained, so as to lead to stable
forms. One of the classic tools of developmental morphology in botanical studies was to study
teratologies – just those situations in which developmental change has gone awry, and by
doing so, to elucidate what signals or processes must have been altered from the course of
‘normal’ development to cause such a change. For example, the observation of leaves where
petals should be led to the idea that petals are actually modified leaves.
To the extent that we can develop models of the signalling processes resulting in dif-
ferentiation, we have a virtual toolkit to allow the examination of how cell fates come into
being, and how they can possibly be stabilized, or in the case of creating stem cells, be re-
programmed so that differentiated cells can give rise to more general cell types [Huang 09b].
We have shown how several specific areas of biological phenomena at the cellular level and below are amenable to modelling via the PNM framework. We now turn to a biologically
inspired multi-agent model of signalling amongst organisms via semiochemicals. Finally, we
build up in several stages a PNM model of ecological flow networks.
6.7 Model 5 – Semiochemicals
Our computational systems biologist can seek inspiration equally from biological phenomena and from computational systems inspired by biology. We now turn to signalling amongst organisms, and amongst different species, which has inspired several types of multi-agent systems in computer science.
Semiochemicals, chemicals that carry a message, are means of signalling amongst organ-
isms. A well known case is the phenomenon of stigmergy16, or indirect coordination in ant
and termite colonies via laying down pheromone trails. Stigmergy is the inspiration behind
heuristic optimization techniques such as ant colony optimization [Bonabeau 99, Dorigo 04],
in which a multi-agent system collectively solves a problem. In these systems, local actions
by individual agents (self-)organize global coherent behaviour in the system as a whole.
Stigmergy is just one example of the use of semiochemicals. Semiochemical signalling mechanisms occur widely amongst organisms, in both plants and animals, between sexes, between species, and within species. They can be mutualistic, symbiotic, and antagonistic. Suffice it
to say that in biology one finds a long history and numerous examples of distributed message
passing systems leading to coordinated behaviour.
Our next PNM model is loosely inspired by the work of Kasinger, Bauer, and Denzinger on
multi-agent models based on semiochemicals. They use digital semiochemical coordination as
a framework for building self-organizing multi-agent systems to solve specific computational
problems [Hudson 10, Kasinger 06, Kasinger 08b, Kasinger 08a, Kasinger 09b, Kasinger 09a,
Kasinger 10, Steghofer 10], particularly the pickup and delivery problem [Savelsbergh 95].
16 Stigmergy is a form of indirect communication observed in insect populations which allows for decentralized coordinated behaviour. An organism leaves a trace in the environment that other organisms can sense and act upon. The standard example is an ant leaving a pheromone trail on returning from a food source. Other ants can follow these pheromone trails to find the same food sources.
Digital semiochemical coordination is one example of organic computing, the attempt to
incorporate self-organization, self-repair, and adaptive features inspired by biology into the
design of distributed computing systems [Branke 06]. Their work is inspired by pollination
biology, particularly the mutualistic relationship between insects (pollinators) and plants
(pollen sources and pollen destinations), where insects receive food from plants in the course
of pollination, and plants depend on the activity of the insects for pollination. These ac-
tivities are coordinated by a number of chemical and visual signals produced by plants, to
attract pollinators. The activity of pollinators can be related to pickup and delivery prob-
lems, where plants play the role of pickup and delivery sites, while pollinators play the role
of transport vehicles. As Kasinger, Bauer and Denzinger note [Kasinger 08b], biological solutions may not be efficient in their use of resources, in that pollinators can, and do, land on flowers that either have no pollen to pick up or require no pollen delivered. They go on to examine
how the use of various kinds of digital semiochemicals for coordination can lead to efficient
solutions of pickup and delivery problems.
We eschew efficiency concerns for now, and capture the basic structure of the problem in
a PNM. Our PNM model assumes three agent types: PollenSources (male flowers, analogous
to pickup areas), PollenDestinations (female flowers, analogous to delivery sites) and Polli-
nators (analogous to vehicles picking up and delivering packages). Each agent’s behaviour
is encapsulated in conditional probability functions that define the messages it can respond
to, and the messages it sends out in response.
As with the virus model, we will assume a round structure, where each round has the
following phases:
• Phase 1 – Messages from PollenSources received by Pollinators.
• Phase 2 – Messages from PollenDestinations received by Pollinators.
• Phase 3 – A Pollinator forwards PollenSource messages to other Pollinators.
• Phase 4 – A Pollinator forwards PollenDestination messages to other Pollinators.
• Phase 5 – A Pollinator picks pollen from a PollenSource and then delivers it to a
PollenDestination. It then re-initializes itself to await the next pollen delivery.
Pollinators have the following behaviours:
A Pollinator waits until it has received a message (PollenPickup = 1) either directly from a PollenSource, or forwarded from another Pollinator. A PollenSource sends messages with probability o if it requires a PollenPickup. A Pollinator forwards a PollenPickup message with probability q.
p(PickPollen_j^t = 1 | PickPollen_j^{t−5} = 0) = 1 − (∏_{i∈Γ(j)} (1 − PollenPickup_i^{t−5} ∗ o) × ∏_{i∈Γ(j)} (1 − PollinatorPickup_i^{t−3} ∗ q)).
A Pollinator waits until it has received a message (PollenDelivery = 1) either directly from a PollenDestination, or forwarded from another Pollinator. A PollenDestination sends messages with probability r if it requires a PollenDelivery. A Pollinator forwards a PollenDelivery message with probability s.
p(DeliverPollen_j^t = 1 | DeliverPollen_j^{t−5} = 0) = 1 − (∏_{i∈Γ(j)} (1 − PollenDelivery_i^{t−5} ∗ r) × ∏_{i∈Γ(j)} (1 − PollinatorDelivery_i^{t−3} ∗ s)).
Once both messages are received (PickPollen = 1, DeliverPollen = 1), the Pollinator
makes a pickup and delivery, then re-initializes itself (PickPollen = 0, DeliverPollen = 0)
to prepare for receipt of pickup and delivery messages in the next round. While the mechanics of pickup and delivery are an implementation detail, one approach is to have each agent hold a unique identifier, and to incorporate additional state variables that can hold the identifiers for a pickup and a delivery.
p(PickPollen_j^t = 0 ∧ DeliverPollen_j^t = 0 | PickPollen_j^{t−1} = 1 ∧ DeliverPollen_j^{t−1} = 1) = 1.
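Read operationally, the Pollinator's behaviour is two independent receipt rules followed by a reset. A minimal Python sketch of this reading follows; the function and class names, and the flat lists standing in for the neighbourhood Γ(j) and the message delays, are our own assumptions, not the thesis's implementation:

```python
import random

def receipt_prob(direct_signals, direct_p, forwarded_signals, forward_p):
    """Probability that at least one pickup (or delivery) message arrives:
    1 - prod(1 - signal * p) over direct senders and forwarding Pollinators."""
    p_none = 1.0
    for sig in direct_signals:       # e.g. PollenPickup_i at t-5
        p_none *= (1.0 - sig * direct_p)
    for sig in forwarded_signals:    # e.g. PollinatorPickup_i at t-3
        p_none *= (1.0 - sig * forward_p)
    return 1.0 - p_none

class Pollinator:
    def __init__(self):
        self.pick = 0
        self.deliver = 0

    def step(self, src_msgs, fwd_pick, dst_msgs, fwd_deliver, o, q, r, s):
        # Receipt rules fire only from the 0 state, as in the conditionals.
        if self.pick == 0 and random.random() < receipt_prob(src_msgs, o, fwd_pick, q):
            self.pick = 1
        if self.deliver == 0 and random.random() < receipt_prob(dst_msgs, r, fwd_deliver, s):
            self.deliver = 1
        # Reset rule: pickup and delivery done, re-initialize for the next round.
        if self.pick == 1 and self.deliver == 1:
            self.pick, self.deliver = 0, 0
```

With two direct senders active and o = 0.5, for example, receipt_prob gives 1 − 0.5 × 0.5 = 0.75.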
PollenSources have the following behaviours:
A PollenSource requires a PollenPickup with some probability u, and does not require a
PollenPickup with some other probability v. If u = 1 and v = 0, that PollenSource always
requires a PollenPickup. Conversely, if u = 0 and v = 1, that PollenSource never requires a
PollenPickup.
p(PollenPickup_j^t = 1 | PollenPickup_j^{t−5} = 0) = u.
p(PollenPickup_j^t = 0 | PollenPickup_j^{t−5} = 1) = v.
PollenDestinations have the following behaviours:
A PollenDestination requires a PollenDelivery with some probability w, and does not require a PollenDelivery with some other probability x. If w = 1 and x = 0, that PollenDestination always requires a PollenDelivery. Conversely, if w = 0 and x = 1, that PollenDestination never requires a PollenDelivery.
p(PollenDelivery_j^t = 1 | PollenDelivery_j^{t−5} = 0) = w.
p(PollenDelivery_j^t = 0 | PollenDelivery_j^{t−5} = 1) = x.
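The PollenSource and PollenDestination behaviours share one two-probability switch: u and v for sources, w and x for destinations. A minimal sketch of that switch (the function name is our own, hypothetical):

```python
import random

def update_demand(state, turn_on_p, turn_off_p):
    """Two-state demand switch: an off (0) agent begins requiring service
    with probability turn_on_p (u or w); an on (1) agent stops requiring
    it with probability turn_off_p (v or x)."""
    if state == 0:
        return 1 if random.random() < turn_on_p else 0
    return 0 if random.random() < turn_off_p else 1
```

With turn_on_p = 1 and turn_off_p = 0 the agent always requires service, matching the u = 1, v = 0 case in the text; reversing the values gives an agent that never does.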
In a PNM, increasing the efficiency of solutions with respect to resource use would primarily involve tuning the probabilities associated with the behaviour of the agents, given a particular network structure that defines both the distribution of agents and their communication channels.17 In the semiochemical models developed by Kasinger, Bauer and Denzinger, much of the modelling effort was towards developing rules and data structures that
allow for efficient pickup and delivery [Hudson 10, Kasinger 09b, Kasinger 09a, Kasinger 10,
Steghofer 10], so that pickup and delivery sources are neither under nor over served, and all
vehicles are taking the shortest routes possible to ensure all required pickups and deliveries.
17 For example, given a network where each agent can receive messages from all other agents (which could represent the Pollinators in a central depot, and PollenSources and PollenDestinations that are roughly equidistant from that central depot), and in which there are more PollenSources and PollenDestinations than Pollinators, an efficient solution would be one where probabilities are tuned upwards from 0 such that in every round every available Pollinator makes a delivery. If there were more Pollinators than sources and destinations, probabilities would be tuned downwards from 1 so that in each round each PollenSource and PollenDestination is served by only a single Pollinator on average.
Our PNM model is much simpler, eschewing many of the implementation details of more efficient solutions. Adding such details back in would lead to the PNM model becoming increasingly elaborate.
In natural systems the solutions that have evolved tend towards redundancy rather than
resource utilization efficiency, in that much more pollen is generated than is required, insects
visit both flowers without pollen and flowers that do not require pollen, and more nectar is
produced than strictly required to attract pollinators. Why does life tend towards abundance
(and redundancy) rather than optimal resource utilization? One reason might be that in nat-
ural systems, the populations of plants and insect pollinators are changing both within and
between seasons. Redundancy allows more flexibility towards changing conditions, whereas
an optimal solution for one set of conditions might be very suboptimal if the conditions
change. In that sense, redundancy and efficiency seem to be antagonistic in many natural
systems. Flexibility to changing conditions is gained at the price of reduced efficiency for a
particular set of conditions.
The actual structure of insect-plant relationships with respect to pollination is much more
complex than we have indicated so far. Insects do not pollinate all plants. Some insect species
are generalists, and may pollinate a wide range of plants. Other insect species are specialists, pollinating only a few (or a single) species of plant(s). Within a single species of insect, different populations may preferentially pollinate different species of plants. Plant species may be generalists, specialists, or may not depend on insects at all (wind-pollinated). This more
complex set of relationships between insect-pollinators and the plants they pollinate results
in what are known as pollination networks. A current concern with such pollination net-
works is their stability, particularly with respect to extinction of pollinators [Bascompte 09,
Kaiser-Bunbury 10, Memmott 04, Oleson 07, Petanidou 08, Williams 11] due to factors such
as climate change [Halter 07, Visser 08], and ongoing crashes of bee populations covered in
both the popular press [McCarthy 11] and books [Halter 11a]. While pollination networks
can be seen as analogous to the predator-prey networks covered in Chapter 4, their structure is necessarily bipartite18: all edges in the network cross between two groups of species, pollinators and plants. Loss of generalist pollinators is implicated in the decline of plant species diversity [Memmott 04], and has been hypothesized to lead to complete collapse of a pollination
network [Kaiser-Bunbury 10].19 Pollination networks are but one example of insect-plant re-
lationships that result in bipartite networks; another example is host-parasite relationships
such as those between tree species and bark beetles [Halter 11c]. Semiochemical models,
while drawing inspiration from the signalling phenomena behind various plant-insect rela-
tionships, may also contribute to developing models which help us to understand the stability
of such networks under perturbations such as climate change, or loss of pollinators.
6.8 Model 6 – Ecosystem Flow Networks
We now move up another few levels in the biological hierarchy from interactions amongst
individuals to interactions amongst populations and species in an ecosystem.
With the advent of systems biology in the last decade, networks of interaction in molec-
ular and cell biology have been a growth area of scholarship. However, in ecology, in-
teractions have always been the raison d'être. The modelling of ecological interactions
has a long history with diverse mathematical approaches reflecting different conceptual
perspectives including: population dynamics [Allesina 12, Caswell 01, Fox 02, Lotka 20,
May 00, McCauley 99, McCann 12, McCauley 08, Sole 06, Vasseur 08, Wang 09], network
architecture [Allesina 08, Allesina 09, Williams 10, Williams 11, Williams 00], bioenerget-
ics20 [Otto 07, Williams 08, Williams 07, Yodzis 92], network flow relationships [Fath 04,
18 A bipartite graph is one where there are two disjoint sets from which vertices can be drawn, and every edge in the graph crosses these sets. So, for example, in pollinator networks, there are two disjoint sets, Plants and Pollinators, and all edges cross these two sets.
19 Topological stability theory, developed in Chapter 4, provides a nice context to think about stability limits in pollinator networks. In such networks, the size of the minimum vertex cover (cut-stability) will be less than or equal to the size of the smaller of the two sets from which vertices are drawn, Pollinators and Plants. Viral processes (connection-stability) are forced to weave their way across the two sets.
20 Bioenergetic models combine aspects of population dynamic models with energetic considerations, primarily related to body mass and size ratios ('allometry') between species.
Fath 06, Fath 07b, Pahl-Wostl 94, Salas 11], stoichiometry21 [Li 11, Loladze 00, Marleu 11,
Wang 10a], various spatial, temporal and geographic scales [Brown 89, Levin 92, Massol 11,
Maurer 99, Pahl-Wostl 92], environmental heterogeneity [Eveleigh 07, Levins 68] and com-
binations of the above models [Romanuk 09b]. Levins, in an essay [Levins 66] taken to heart
by several generations of ecologists, notes that ecological models must strategically trade off
between generality, precision and realism, depending on the goals of a particular study. This
leads to a multiplicity of models reflecting different conceptual and methodological approaches.
PNMs provide yet another conceptual perspective within which to explore ecological inter-
actions. In a PNM for an ecological network, nutrient or carbon transfers are seen as akin
to passing messages.
We expand on our mutualism model to first build a flow network model, and then to build
an ecosystem model with a simple trophic structure22. The PNM model for mutualism is built
up to incorporate additional constraints abstracted from ecosystems. While our example is
specific to ecosystems, this approach illustrates how a simple model can be elaborated by
adding in the constraints particular to specific systems.
Every model we have considered so far could be considered a flow network, in that there
is a flow of messages passing through the system. However, ecological flow networks tracking
nutrient or biomass passage through species include several additional constraints that reflect
the energetic constraints ecosystems must operate within.
With respect to an ecosystem flow network these constraints are:
1. There is a limited amount of matter in the system.
2. Matter is conserved.
21 Ecological stoichiometry models are essentially population dynamics models that consider both energy flow and nutrient cycling by incorporating constraints on the ratios of different nutrients relative to carbon in the interaction between a primary producer and a grazer.
22 Trophic structure refers to the roles species play as food consumers and food producers, and is intimately related to energy flow through an ecosystem ([Ricklefs 79b]: pp. 780-781). Examples of such roles are green plants, which are primary producers; herbivores, which are primary consumers; and carnivores, which are secondary and higher consumers.
3. Some matter is lost from the system, or is in a form no longer available to the
system (dissipated).
These constraints relate to the thermodynamic limitations on energy transfers in ecosys-
tems23. In a PNM they are translated into constraints on messages. The first constraint
restricts the number of messages that can be generated. The second constraint indicates
that all messages must be accounted for. In a PNM of ecosystem flows, the messages rep-
resent matter. Matter can move from one part of the system to another, it can exit the
system boundary, or it can enter the system from outside its boundary. The third constraint
indicates that some messages will become unavailable for use by other agents in the PNM
of ecosystem flows. Our tracking of energy changes through an ecosystem assumes that mass is conserved across the boundary between a designated ecosystem and its environment.
To introduce these constraints into a PNM requires an accounting system that tracks
messages (material transfers)24. Such accounting systems form the basis for ecological flow
analysis, and define the flow values assigned to edges in an ecosystem flow network.
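The accounting idea reduces to a single invariant: every message that has ever entered the system is either still inside, has crossed the boundary outward, or has been dissipated. A sketch of such a bookkeeping check (the function and parameter names are hypothetical, not the thesis's implementation):

```python
def balanced(initial, imported, in_system, exported, dissipated):
    """Flow-accounting invariant: initial stock plus imports must equal
    the standing stock plus everything exported or dissipated. This
    mirrors Feynman's blocks: each block is in the bedroom, out the
    window, or otherwise accounted for."""
    return initial + imported == in_system + exported + dissipated
```

For example, 10 initial messages with 6 still in the system, 3 exported, and 1 dissipated balance; a count of 2 dissipated would signal a bookkeeping error.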
To extend the mutualism model to a flow network, we must now add an accounting framework. We do so by first modifying the round structure, so that each round has two phases:
1. Accounting phase.
2. Message passing phase.
23 The second constraint follows from the first law of thermodynamics, though strictly speaking, it is energy that is conserved. The third constraint follows from the second law of thermodynamics, though it, strictly speaking, refers to energy becoming unavailable to do work. Since energetic transfers in an ecosystem, above the level of plants, are via biomass transfers from feeding, the ecological constraints are in terms of matter. A more detailed overview of ecosystems viewed from a thermodynamic perspective is in Patten's classic paper on the cybernetics of ecosystems, which links thermodynamic concerns to information theory [Patten 59].
24 Such an accounting system is used by Feynman to illustrate conservation of energy via the conceit of a small child playing with blocks. His mother accounts for where the blocks have gone. Some of the blocks are in his bedroom, i.e. in the system. Some of the blocks have exited the bedroom via an open window, i.e. passed beyond the system boundary ([Feynman 95]: pp. 69-72). See also ([Ness 69]: pp. 3-8) for a slightly more complex elaboration of Feynman's conceit, now involving sugar cubes which, unlike blocks, can dissolve.
The message passing phase operates as previously described. The accounting phase must
first account for the number of messages currently in Processes 1–3. Secondly, it must prevent
conflicts that would violate the three constraints on material flows we have described above.
In the model below, we will emphasize the message passing phase, while providing some
basic details of the items the accounting phase must take care of. To simplify the model, we
will assume the accounting phase occurs immediately following message passing, and does
not require its own time step.
First, we must introduce a message counting function. Let P be a process random variable. P_k is the random variable assigned to the kth process (in the mutualism model, k = 3). Count(P_{jk}^{t−1}) returns the number of messages M associated with a process random variable P_k for agent j at time step t−1. For example, Count(Z_j^{t−1}) would count all the messages associated with Process 1, whose associated random variable is Z.
Count(P_{jk}^{t−1}) = M_{jk}^{t−1}.
Secondly, we must introduce another Boolean state variable Q_{jk}^{t−1} that depends on the returned value of Count(P_{jk}^{t−1}).
p(Q_{jk}^{t−1} = 1 | Count(P_{jk}^{t−1}) ≥ 1) = 1.
Q_{jk}^{t−1} determines if agent j has any messages that could be sent in the next round. The constraint of having to check whether there are any messages available to send is now incorporated into the mutualism model via the multiplier Q_{jk}^{t−1}. Since, in the mutualism model, we are assuming each agent has only a single process, we can drop the k subscript.
The mutualism model equations incorporating message flow constraints are25:
Process 1: p(X_j^t = 1 | X_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ Z_i^{t−1} ∗ o).
25 An implementation would require some additional functions to decrement M_j^t upon a message being passed, and increment it upon a message being received. As well, the number of messages associated with each process within an agent would have to be initialized.
Process 2: p(Y_j^t = 1 | Y_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ X_i^{t−1} ∗ q).
Process 3: p(Z_j^t = 1 | Z_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ Y_i^{t−1} ∗ r).
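Read operationally, the Q multiplier gates every product term: when an agent holds no messages, Count returns 0, Q is 0, every factor becomes 1, and the activation probability collapses to 0. A sketch of one such flow-constrained update, under our own naming assumptions:

```python
import random

def q_flag(message_count):
    """Q = 1 iff at least one message is available (Count >= 1)."""
    return 1 if message_count >= 1 else 0

def flow_update(state_j, q_j, neighbour_states, p_pass):
    """One flow-constrained process update:
    p(on) = 1 - prod_i (1 - Q_j * state_i * p_pass)."""
    if state_j == 1:          # the conditional only governs the 0 -> 1 transition
        return 1
    p_none = 1.0
    for s in neighbour_states:
        p_none *= (1.0 - q_j * s * p_pass)
    return 1 if random.random() < 1.0 - p_none else 0
```

With q_j = 0 the update can never fire, which is exactly the message-conservation constraint: no messages available, nothing to pass on.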
Having added mechanisms to our mutualism model to incorporate constraints analogous
to those in flow networks, our next step is to further expand the mutualism model so it
can represent a simple trophic network of autotrophs, herbivores, and carnivores such as in
MacArthur’s modified food web in Chapter 4.26
Figure 6.1: MacArthur’s Modified Food Web
In the context of MacArthur’s modified food web, vertices A and B are autotrophs and
are assigned Process 1, vertex C is a herbivore and assigned Process 2, and vertex D is a
carnivore and assigned Process 3.
We incorporate additional features into the model to reflect ecological details. First,
26 Autotrophs are plants, which receive energy from light. Herbivores are those animals that eat plants. Carnivores are those animals that eat other animals.
autotrophs will have an intrinsic growth rate, which is reflected in the model via a message
generation mechanism. Second, all organisms dissipate some energy, which is reflected in
the model as lost messages.
For each autotroph, we will model the intrinsic growth rate via the logistic growth equa-
tion27. This equation in its differential and difference forms is often used as a beginning
point for ecological models of both bounded growth within species and competition between
species ([Begon 81]: pp. 77-80, [May 76b], [McCann 12]: pp. 53-56). To model logistic growth, we add another behaviour for autotrophs that reflects growth. Let M_j^t be the messages associated with the jth autotroph. Let c be the total number of messages that can
be assigned to that autotroph. In ecological modelling, c is interpreted as the maximum
population a site can support, called the carrying capacity. The growth rate is denoted by
g.
The logistic growth equation used for autotrophs is:
M_j^t = g ∗ M_j^{t−1} ∗ (c − M_j^{t−1})/c.
Every agent j has a probability dj of losing messages to the environment. Whether an
agent will lose a message in the current round is held by a random variable, D.
p(D_j^t = 1) = d_j.
The number of messages M held by the dissipation agent is:
M_j^t = M_j^{t−1} + Σ_{i∈Γ(j)} D_i^{t−1}.
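The autotroph growth rule and the dissipation bookkeeping might be sketched as follows (illustrative only; the function names are ours, and the dissipation counter simply tallies this round's losses):

```python
import random

def logistic_step(m_prev, g, c):
    """Discrete logistic growth: M_t = g * M_{t-1} * (c - M_{t-1}) / c,
    with c the carrying capacity and g the growth rate."""
    return g * m_prev * (c - m_prev) / c

def dissipated_this_round(loss_probs):
    """Each neighbouring agent i loses a message with probability d_i;
    return how many messages the dissipation agent gains this round."""
    return sum(1 for d in loss_probs if random.random() < d)
```

Iterating logistic_step with g = 1.5 and c = 100 settles toward the fixed point c(1 − 1/g) ≈ 33.3; larger growth rates produce the cycles and chaos May analysed.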
MacArthur’s assumption in constructing the modified food web is that energy lost from
the ecosystem equals that arriving at the ecosystem. Therefore, our final agent, E, receives
the dissipated messages. E, in turn, can pass messages onto the autotrophs.
27 The logistic difference equation, while exceedingly simple, has a special place in the hearts of both ecologists and chaos theorists, due to Robert May’s elucidation of the chaotic properties28 at the heart of some of the simplest models ecologists use.
We now have four trophic compartments: autotrophs, herbivores, carnivores, and dissi-
pation. The mutualism model is now further extended to reflect the relationships between
these trophic compartments.
Our trophic model is:
Autotroph Trophic Process 1: p(W_j^t = 1 | W_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ Z_i^{t−1} ∗ o).
Herbivore Trophic Process 2: p(X_j^t = 1 | X_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ W_i^{t−1} ∗ q).
Carnivore Process 3: p(Y_j^t = 1 | Y_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ X_i^{t−1} ∗ r).
Dissipation Process 4: p(Z_j^t = 1 | Z_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Q_j^{t−1} ∗ D_i^{t−1} ∗ s).
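All four trophic processes share one functional form and differ only in which neighbour variable drives them, so a compact table-driven sketch is possible. The wiring below is illustrative (a toy web of our own, not MacArthur's exact topology), with E the dissipation agent feeding back to the autotrophs:

```python
import random

# Who each compartment draws messages from, and its transfer probability.
SOURCES = {"A": ["E"], "B": ["E"],        # autotrophs fed by the dissipation agent
           "C": ["A", "B"],               # herbivore grazing both autotrophs
           "D": ["C"],                    # carnivore eating the herbivore
           "E": ["D"]}                    # dissipation agent
RATE = {"A": 0.9, "B": 0.9, "C": 0.7, "D": 0.6, "E": 1.0}

def trophic_round(state, counts):
    """One synchronous round: each inactive agent j fires with probability
    1 - prod_i (1 - Q_j * state_i * rate_j), following the trophic model."""
    new = {}
    for j, preys in SOURCES.items():
        if state[j] == 1:                 # already active this round
            new[j] = 1
            continue
        q_j = 1 if counts[j] >= 1 else 0  # Q gate from the message count
        p_none = 1.0
        for i in preys:
            p_none *= (1.0 - q_j * state[i] * RATE[j])
        new[j] = 1 if random.random() < 1.0 - p_none else 0
    return new
```

A round with all agents inactive leaves everything inactive, and with RATE["E"] = 1.0 an active carnivore deterministically activates the dissipation agent.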
If we used MacArthur’s modified food web as the network structure for this simple ecolog-
ical model, it would represent a strongly connected graph. Strongly connected components
in ecological networks have recently been found to act as integrating modules in ecosystems
[Borrett 07]. Their internal structure deviates from that expected from random graphs with
identical numbers of edges and vertices. Such evidence suggests strongly connected compo-
nents may be an ecological network motif29 [Bascompte 09, Milo 02, Rip 10]. In this context,
each agent could be considered to represent a species, and the network represents the flows
of matter between species. The message transfers represent individuals of one species being
eaten by another. So, M_j^t actually represents the population of the jth species at a particular point in time.30 As emphasized in Chapter 4, our stability concerns with respect to
ecosystems are both topological and dynamic. From a topological perspective we could ask
a pair of questions. First, if the network structure is modified, and certain agents are lost
(extinction), is there still a path amongst the remaining agents via which messages can flow
29 In the case of a strongly connected component, the motif would be a specific class of directed graphs, rather than a specific directed graph. This might more properly be called a motif class.
30 We leave as an exercise the extension of this model, where species are agents, to one where individuals of a species are agents, and the network structure represents proximity information. Population, rather than being represented by the number of messages in the system, is now represented by the number of vertices, and the proximity of individuals is reflected by the network structure. Extra credit for incorporating preferential food choice into carnivore and herbivore agents.
between any two species and cycle (cut-stability)? Second, if a toxin were to pass through the ecosystem in the form of a tainted message, how long would it take to pass through all agents (connection-stability)? Finally, from the perspective of dynamics, what are the initial number of messages assigned to each agent, and the probabilities for message transfer associated with each agent j and the specific process behaviour it is assigned? These initial values and probabilities will govern, first, whether the system reaches a steady state of message flows, and second, how it would recover from disruptions to that steady state.
Within the context of PNMs, a common abstraction of message passing emphasizes the unity of processes ranging from those occurring in cells and tissues, to signalling amongst organisms, to processes occurring at the level of ecosystems.
6.9 Future Directions
In his last book, Tools for Thought [Waddington 77], the great developmental biologist, C.H.
Waddington, drew into biology conceptual tools first developed in other fields – dynamical
systems, information theory, and game theory, amongst others. He introduced each of these
tools via biological analogies, often from organismal development. In doing so, he created
bridges between biological phenomena and techniques in other fields, that allowed the widen-
ing of his biological intuition outward towards those fields. Our goal in this chapter has been
to work in the opposite direction, to draw concepts developed within biology into other
fields, particularly computer science when it is concerned with complex interacting systems
that communicate via, and change state due to, the passing of messages. In doing so, we
also draw on the biological intuition that has been developed around phenomena such as
viral spread and immune response, mutualism and autocatalysis, gene regulatory networks,
differentiation, semiochemical signalling and ecological flow networks. Developing models of
these biological phenomena as multi-agent systems sets the stage to explore how they can
be expanded out to technological message passing systems such as the Internet.
The PNM framework allows us to compactly represent a wide array of biological phe-
nomena as distributed message passing systems, where each agent receives and generates
messages according to a set of behaviours. In doing so we emphasize the role of various
forms of message passing in biology in shaping interactions at the molecular, cellular, or-
ganismic, and ecological scales. The primary role of interactions at all levels in biological
systems provides a unifying perspective for diverse biological phenomena. In representing
these phenomena in terms of a multi-agent system, we also develop a toolkit of biologically
inspired processes that we can apply to stabilizing computational systems.
One important question to ask in identifying future directions for the PNM model is,
what is the expressive range for these models? While many of the examples in this chapter have focussed primarily on Boolean conditions being met, conditional probability functions can easily be extended to deal with conditions from predicate logic, as in the virus and immune response PNM, which uses the universal quantifier. The PNM approach, in
representing physical situations (either biological or technological) depends, first on there
existing a detector of logical conditions based on incoming signals, secondly on a network
that represents the paths signals may travel along, and finally on a response mechanism, so
that if a logical condition is met for incoming signals, an outgoing signal is generated with
some probability.
As noted in the previous chapter, the current version of the PNM model corresponds to
a reactive multi-agent system. What additions would be needed to allow for more expressive
behaviours? Currently, memory is simply contained in the state of the random variables
assigned to an agent. Histories can be built via the addition of counters. However, in biological
systems, memory appears to be held in conformations, and itself appears to be a network
property involving connections between different cell types31, rather than a value held in
31 Computer architect and neurobiologist Jeff Hawkins makes the point that human memory is fundamentally different than computer memory [Hawkins 04], and concerns storage and recall of sequences of patterns in invariant form; human memory is not about storage of specific values in a register, but a dynamic process. Recent work in neurobiology is finding that astrocytes, non-electrically-conducting cells that are densely connected to neurons (which would otherwise be more sparsely connected), play a role in the formation of long-term memory [Bezzi 11, Santello 10, Henneberger 10].
a register. In biology, persistent memory is considered to be due to a configuration of neural architecture; as the architecture changes, so does memory. As long as the configuration does not change, the memory persists. Every time a new synaptic connection is built, there is the possibility of modifying existing memories or creating new ones. This is inherently different from computer memory, where a memory is instantiated into a fixed state: the architecture does not change as persistent memories change. Finally, behaviour in biological systems
is often anticipatory, and this would seem to imply the underlying logic must be extended to
at least modal logic. Incorporation of more elaborate memory structures, and anticipatory
behaviours are two obvious directions in which to extend the model.
With our emphasis on the behaviours assigned to each agent, another natural extension
is towards social agents. A large literature has grown up in the social sciences based on
relatively simple agents with no more memory or anticipatory power than currently incor-
porated into PNMs [Batty 05, Epstein 06, Epstein 96, Gilbert 08, Miller 07]. Such agents
often have goals that must be satisfied, but that is often manifested as changes in behaviour
triggered by internal state variables, which can be expressed in the PNM framework.
One area in biology where the PNM approach may be applied is the study of robustness.
Models of gene regulation and cell-fates as we have sketched in this chapter also provide
insight into the biological study of robustness. Robustness in biological systems is a concept
that has appeared repeatedly in the history of biology in various forms, and has most recently
been developed in modern form by Wagner [Wagner 05]. A capsule definition of robustness is:
‘Robustness is the persistence of an organismal trait under perturbations’ ([Felix 08]:pg 132).
As used by Wagner’s group, it applies to all levels of biology from molecular to organismal.
The definition is quite close to the multi-disciplinary definition of stability we developed
in Chapter 2, so robustness as applied to biological systems can be considered yet another
stability concept. In robustness, the perturbations are seen as occurring from three different
sources: stochastic noise, environmental change, and genetic variation. Sources of robustness
are found in redundancy (the same part exists in multiple copies; whether it be copies of a
gene, an organ, or a cell type) and in alternative pathways (to the same functional result).
PNMs may provide a way to explore robustness, by explicitly examining alternate paths in the network that lead to the sending of a particular message type, and by composing a network of redundant agents with the same behaviours.
The focus on redundancy and alternate pathways in the biological conception of robust-
ness should trigger in a computer scientist analogous thoughts on the roles of redundancy
and alternate pathways in the design of fault tolerant systems, where no single failure should
be allowed to disrupt a system. Robustness is often considered to have evolved in biolog-
ical systems, which introduces the idea of combining the PNM model with evolutionary
computation methods that modify network architecture and tune probabilities.
Another area of biology where PNMs may contribute is the current renaissance of epigenetics, originally introduced in the middle of the last century by C.H. Waddington [Huang 12, Jamniczky 10, Slack 02, Speybroeck 02, Waddington 42]. Epigenetics is the
study of heritable changes above the level of the genome, which can include gene expres-
sion noise, and the complex activity of gene regulation networks [Chang 08, Huang 09a,
Huang 12, Wang 10b]. Experimental knowledge of the specific signalling behind gene regu-
lation can be the basis of PNM behaviours, whereas the stochastic nature of PNMs can be
used to explore the specific combination of behaviours and network conditions under which
gene expression noise can switch cell fates [Huang 09b, Huang 10, Huang 11].
Finally, as computational systems begin to reach a level of complexity usually associated
with the biological, the PNM approach provides a framework in which we can model stabi-
lizing mechanisms for systems of such complexity that we cannot anticipate all forms of attack.
In the next chapter, we begin a very preliminary approach to that goal. We develop the no-
tion of a proto-immune system on a network via implementing the virus model developed in
this chapter. Combining the PNM framework from the last two chapters, with theory from
Chapter 4, we also introduce one final theoretical concept, resilient processes, and examine
the virus and immune response PNM to see how a simple warning message may act as a
resilient process enhancing connection-stability.
Chapter 7
Dynamic Resilience
Homeostasis
is like a ballerina
on point.
Resilient processes
bend, block, redirect, sashay
perturbations through structure.
7.1 Abstract
We want to distinguish the stability (cut-stability or connection-stability) passively afforded
by a given network architecture from the stability that could be dynamically afforded by one
or several processes acting from within that architecture. We will call this latter form of
dynamic stability, resilience. The concept of dynamic resilience is developed in the context
of topological stability theory (Chapter 4) and PNMs (Chapters 5–6). Resilience is the ad-
ditional stability provided by active processes above and beyond that provided passively by
network structure. A resilient process is one that dynamically maintains cut-stability or
connection-stability. In biology, resilient processes can be seen in various organic responses
that provide for immunity and homeostasis in living systems. Similarly, resilient processes in
a networked technological system would be those actions the system itself may be programmed
to take that can confer greater stability to it. We explore resilience by first looking at several
simple cases of cut-resilience and connection-resilience, to demonstrate that resilient pro-
cesses can compensate for architectural limitations. We further explore connection-resilience
via the virus–immune PNM under various network architectures from sparsely connected to
highly connected, and at several levels of viral propagation. Counter-intuitively, network
architectures that favour the virus also favour the warning message running ahead. Dynamic
resilience thus allows an architectural weakness in connection-stability to be circumvented
by processes as simple as sending a warning message. These results suggest that there
is benefit in building such immune capabilities into distributed technological systems.
In this chapter we unite the topological perspective of Chapter 4 with the dynamical per-
spective of PNMs developed through Chapters 5–6. We first define resilient processes and
resilient mechanisms in terms of their effect on cut-stability and connection-stability. We
then explore the consequences of resilience through several simple examples. Next we use the
virus–immune PNM (called the ‘virus model’ in Chapter 6) to examine via simulation the
interplay of network structure, and dynamic processes that can act to stabilize a network. Fi-
nally we explore ways in which we can elaborate upon the virus–immune PNM to incorporate
further resilience. A virus model is particularly apt, as there is a large amount of conceptual
transfer between the biological and computational epidemiology literature focussed on net-
works [Cohen 03, Danon 11, Draief 08, Goel 04, Li 07, Lloyd 01, May 06, May 01, Meyers 05,
Mishra 07, Newman 02a, Newman 02b, Pastor-Satorras 01, Van Mieghem 09, Vogt 07, Yuan 08].
Two take-home messages emerge out of our examples and simulations. First, dynamic pro-
cesses can be used to circumvent structural weakness. Secondly, topology modifies dynamics,
and hence resilience. Dynamically stable systems depend on the interaction of structure and
process. The most stable architecture, in the absence of resilient processes, may not be the
architecture that best supports stability via a resilient process.
We will again begin with a contrast in perspectives, that of our network architect, and
that of an immunologist.
7.2 Introduction and Motivation: Viruses in Computer Science and Biology
Our network architect has a problem. He recognizes that, owing to the trade-off between
the two forms of stability, any highly connected network that is cut-stable is also
connection-unstable. Designing a scale-free network does not get around this problem,
since it too is susceptible to viral attacks. A
network with balanced stability will have some resistance to viral attacks while also being
resistant to other forms of attack. Initially he thought that was the best he could do, but now
he wonders, is there a way to do more? Even a connected network with balanced stability
merely slows viral progress, rather than stopping it. He can, of course, harden the system
by locking down resources, removing unnecessary software, and having tightly restricted
access protocols. However, it is usually only a matter of time before some ingenious though
unscrupulous malware designer finds a way around his hardened system. While he fervently
hopes all the network administrators are regularly updating their anti-viral software, he
knows this to be not in fact the case. He also has the queasy feeling that, despite all the
antivirus solutions available, the system is no more secure, because virus designers seem
able to develop new viruses faster than anti-virus designers can decipher them. He also
worries that his network must interface with other networks, and he has seen how the
connection-instability of linked networks is greater than that of each network on its own.
Having previously gotten useful ideas from an ecologist, he now decides
to consult with an immunologist on their common problem: viruses, be they computational
or biological.
Our immunologist listens closely to the network architect’s dilemma, then smiles, and
says, ‘I think I can help you.’ He explains to our network architect that human inoculation
programs work largely because humans already have an immune system, and inoculation
programs are simply augmenting the existing capabilities of the human immune system. He
also introduces some new concepts unfamiliar to our network architect: specific gene-for-
gene resistance versus broad multi-gene resistance. He says that anti-virus programs appear
to be analogous to vertical (gene-for-gene) resistance, in that they depend on deciphering a
specific virus signature, and building an antiviral solution that will detect and eliminate that
specific virus. He then explains the idea of horizontal (multi-gene) resistance, which results
in a systemic capability to resist viruses, that is not tied to any specific viral signature.
Finally, he introduces a truly enigmatic concept to our network architect, the Red Queen
Hypothesis from evolutionary theory. The Red Queen Hypothesis concerns the fact that
co-evolving systems, such as hosts (computers) and parasites (viruses), continually respond
to each other, leading to an escalating arms race simply to maintain themselves relative
to each other. While neither vertical nor horizontal resistance can circumvent such arms
races, gene-for-gene (vertical) resistance systems require only a single breakthrough on the
part of the parasite (virus) to regain advantage, though they offer complete protection prior
to such a breakthrough. By contrast, multi-gene (horizontal) resistance usually provides
less than complete protection, but is also more difficult to circumvent. In short, vertical
resistance leads to more accelerated evolutionary arms races than horizontal resistance. An
arms race seems to our network architect apropos of the current state of affairs vis-à-vis
antivirus solutions and malicious viruses that can enter the network. Our network architect
asks, ‘What is the simplest thing I can do?’ Our immunologist answers, ‘Have the system
recognize when something is wrong, and send a warning message; this is one of the simplest
forms of horizontal resistance possible. Here’s a simple model you can play with, developed
by a computational systems biologist I work with'. With the virus–immune PNM in hand,
our network architect gets to work to try and understand what kinds of processes may
augment his network’s stability.
7.3 Dynamic Resilience
7.3.1 Dynamical Resilience in terms of Topological Network Stability and PNMs
A standard dictionary definition of resilience is [Sykes 82]:
‘resilient. a. recoiling, springing back; resuming original form after stretching, bending,
etc. (of person) readily recovering from depression, etc., buoyant.’
In short, we are looking for processes by which a system can ‘spring back’ from a pertur-
bation that may affect either its cut-stability or connection-stability. By ‘process’, we simply
mean a series of steps or actions taken to achieve a particular end, in this case, system sta-
bilization. While resilient processes can refer to any series of actions a system (biological or
technical) might take, in the more restricted case of computers, computational models and
specific algorithms, we might speak of resilient mechanisms. This leads us to a few initial
definitions of resilient processes and mechanisms below. We will define the resilient processes
generally for a system, while we define the resilient mechanisms more specifically in terms of
networks containing agents with behaviours, that is, in terms of our PNMs. If we see such
behaviours as shared algorithms, we can extend these ideas to other kinds of multi-agent
systems, and to distributed computing models. Recall from Chapter 5 that in a PNM:
There are N agents represented by vertices:
V = {v1, v2, ..., vN}, |V| = N.
There are O behaviours represented by functions:
B = {f1, f2, ..., fO}, |B| = O.
Each agent j is associated with a specific group of behaviours:
∀vj ∈ V, ∃Bj ⊆ B, |Bj| ≥ 1.
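For concreteness, the ingredients just listed (agents as vertices, a pool of behaviours as functions, and a non-empty behaviour subset per agent) can be sketched in a few lines of Python. The names and the random assignment policy are illustrative, not taken from the thesis implementation.

```python
import random

# Minimal PNM skeleton (illustrative): N agents as vertices, a pool of O
# behaviours as functions, and a non-empty behaviour subset B_j per agent j.
def make_pnm(num_agents, behaviours, rng):
    agents = list(range(num_agents))
    assignment = {
        j: rng.sample(behaviours, rng.randint(1, len(behaviours)))
        for j in agents
    }
    return agents, assignment

# Two toy behaviours; in a full PNM these would send messages and change state.
def warn(agent, state):
    state[agent] = "I"

def infect(agent, state):
    state[agent] = "V"

agents, assignment = make_pnm(5, [warn, infect], random.Random(0))
assert all(len(bs) >= 1 for bs in assignment.values())  # |B_j| >= 1 for all j
```

The random subset assignment is only one possible policy; the definitions require no more than that each agent carry at least one behaviour.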
Our intuition is that a resilient process is a sequence of actions taken by a system to
stabilize itself against perturbations. A resilient process provides a system with enhanced
stability above and beyond its structure.
Definition 8. A resilient mechanism is a behaviour(s) fi associated with one (minimally)
to all (maximally) agents vj in a network that conveys to the network greater stability (cut-
stability or connection-stability) than that due purely to network topology.
A system may have multiple resilient processes (mechanisms) for different kinds of per-
turbations, or several resilient processes (mechanisms) may work in tandem to stabilize a
system against a particular perturbation. Following from the definitions above, we can speak
of cut-resilient mechanisms (processes) and connection-resilient mechanisms (processes). It
is possible that some mechanisms (processes) may be both cut-resilient and connection-
resilient. It is also possible that different resilient mechanisms (processes) can augment or
interfere with each other's effects on system resilience.
Recall from Chapter 4, that cut-stability and connection-stability both depend on certain
sets of vertices in a network. The cut-stability of a network G depends on the minimum
vertex cover MVC(G) of that network. The connection-stability of a network G depends
on the maximum flood set that is possible in the network MFS(v∗, G), and T , the time or
number of iterations required to create MFS(v∗, G), where v∗ is a vertex in G from which G
can be flooded in the fewest iterations. We can define cut-resilient and connection-resilient
mechanisms based on these sets.
Definition 9. A cut-resilient mechanism is a behaviour(s) fi associated with one (minimally)
to all (maximally) agents vj in a network that conveys to the network greater cut-stability by
protecting members of MVC(G) from being cut.
Definition 10. A connection-resilient mechanism is a behaviour(s) fi associated with one
(minimally) to all (maximally) agents vj in a network that conveys to the network greater
connection-stability by preventing a malicious viral process from reaching members of MFS(v∗, G),
or by increasing the number of iterations T required to flood MFS(v∗, G).
7.3.2 Resilience Concepts In Other Areas of Computer Science
In the above formulations of resilient mechanisms, we have left open the exact nature of the
problem(s) a multi-agent network might be engaged in. Our rationale for this is that we
do not want to make much in the way of assumptions as to a specific problem domain. We
simply assume networked systems, and that any particular problem solving capability for
the multi-agent network benefits from the resilience (cut or connection) of the network.
The notion of resilience also appears in work of J. Halpern and colleagues that attempts to
unite ideas in computational game theory [Abraham 06, Halpern 08], where a 'secret sharing
game' is played in a distributed computing environment, with ideas in the context of Byzantine
agreement or consensus problems [Abraham 08]. In these works resilience is related to the
number of agents that can default from protocol while still allowing the problem to be solved.
In the game theory work, the notion of k-resilient Nash equilibria is introduced – essentially
an extension of the notion of Nash equilibrium from a single default (k = 1)
to the default of a coalition (of size k). In the Byzantine agreement work, the focus is on
optimal resilience, the notion of allowing a coalition up to the bound for the
Byzantine agreement problem, where an impossibility proof exists that consensus cannot
be reached for n processes if at least 1/3 of the processes default. The game theory paper,
by first defining an extension of Nash equilibrium where coalitions default, and secondly
by introducing a notion of non-rational defaults (‘t-immunity’) develops bridge concepts
between game theory and the Byzantine problem. From our network-centric perspective we
should note that both works assume a fully connected network. This is achieved in the game
theory work via the notion of cheap talk, and in the Byzantine agreement work by the notion
of public broadcast. While the resilience in these works is technically defined with respect to
a particular problem and a particular assumed network architecture, the commonality with
resilience as I have defined it above is that resilience is due to dynamic actions of agents,
some of whom may default. Since solutions to the consensus and secret sharing problems both
depend on the messages getting through to all agents, they could be seen as potentially
compromised by perturbations leading to connection-instability and cut-instability.
In a recent review of the role of ‘network thinking’ as it may contribute to artificial
intelligence, M. Mitchell [Mitchell 06] notes that the network literature has emphasized static
structural properties of networks (as opposed to their dynamic properties). To make the case
for dynamic properties, she first reviews dynamics in the context of cellular automata (which
could be considered extremely simple 1 or 2-D grid networks) [Wolfram 84, Wolfram 94] to
develop a notion of information processing on a network, followed by an examination of
information processing as it appears to occur in immune systems, from which she derives
several general principles for information processing in decentralized systems. ‘Information
processing’ in the sense Mitchell uses it, is to account both for network structure and how
vertices and edges deal with messages they may receive and/or propagate, including how the
network structure itself may alter over time. The notion of a resilience mechanism we have
introduced is in a similar spirit.
As different as the fields of game theory, distributed systems (and the Byzantine consensus
problem) and biologically inspired artificial life may be, they have as commonality the notion
of multi-agent systems and that the communication on such systems1 can be represented by
a network. Thus, the stability of multi-agent systems with communication, depends on the
development of (a) suitable architectures that passively provide stability and (b) resilience
methods that dynamically provide stability when architecture fails.
7.4 Resilience Examples
Our definition of resilience in terms of resilient mechanisms operating in networks allows
us to examine the gains to topological stability by adding particular resilient mechanisms.
We briefly run through several numerical examples to see how resilient mechanisms can
augment the topological stability of a system.
7.4.1 Resilience Example 1: Agent Hardening
Hardening a computer network can take various forms: securing the operating systems
of individual computers via mechanisms such as passwords, restricting access to critical files
the operating system needs to run, removing unnecessary services, and keeping patches up to date.
1 Not all game theory formulations assume communication between agents.
Hardening a network usually includes hardening the individual operating systems, and
additionally reducing network access, using secure communication protocols, and limiting
protocols and services allowed to operate on the network. From the context of PNMs, a
simple way to think about various hardening mechanisms is in terms of the resistance of
an agent to an attack. Thus any set of actions that allows an agent to resist failure under
attack can be considered a cut-resilient mechanism. What do we gain from such resilient
mechanisms?
Consider a simple resistance model where an agent can be in one of two states, F = 1 and
F = 0, representing failure and functionality respectively. Let A = 1 represent an attack,
and A = 0 represent no attack. Let t represent discrete time. Finally, let fail represent the
probability of failure if attacked:
p(F_j^t = 1 | A_j^(t-1) = 1) = fail.
The value for fail could be obtained from empirical data of the failure rate of computers
in a network under particular types of attack. If fail = 1, agents always fail if attacked. If
fail = 0, agents never fail if attacked. Given such a resistance model, what degree of cut-
resilience is afforded a system under attack? Let us consider a directed attack on MVC(G)
which can be considered the minimal effort an attacker can make for the maximum effect of
cutting the network to pieces. If the agents have no resistance, obviously the network is cut
to pieces in the first wave of attack. Let w be the attack wave, where in each attack wave all
remaining agents in MVC(G) are targeted for attack. We will call these remaining agents
RMVC_w(G). Then we can approximate the number of remaining agents after each attack wave as:
|RMVC_w(G)| ≈ |MVC(G)| × (1 − fail)^w.
We can consider fail = 1 the situation where there is no resilient mechanism, and hence
the whole network is cut to pieces in the first wave of the directed attack. In all other situations,
where there is some resilient mechanism such that 0 ≤ fail < 1, some component of the
network will survive the first wave of attack and, as fail → 0, subsequent waves of attack.
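As a sanity check on this approximation, a short Monte Carlo (with illustrative values for |MVC(G)| and fail, not drawn from any empirical failure data) tracks the expected number of surviving cover members per wave:

```python
import random

def surviving_cover(mvc_size, fail, waves, rng):
    """Directed attack on the minimum vertex cover: in each wave every
    remaining cover member is attacked and fails with probability `fail`."""
    remaining = mvc_size
    for _ in range(waves):
        remaining = sum(1 for _ in range(remaining) if rng.random() >= fail)
    return remaining

rng = random.Random(1)
mvc, fail, w, trials = 100, 0.5, 3, 2000
avg = sum(surviving_cover(mvc, fail, w, rng) for _ in range(trials)) / trials
expected = mvc * (1 - fail) ** w  # |RMVC_w(G)| ~ |MVC(G)| * (1 - fail)^w = 12.5
assert abs(avg - expected) < 2.0
```

Even with fail = 0.5 (failure under attack equivalent to a fair coin flip), roughly an eighth of the cover survives three full waves, rather than the network being cut to pieces in the first.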
7.4.2 Resilience Example 2: Viral Propagation
Let us first informally recap the main features of the virus and immune PNM from Chapter
6. Agents can be in one of three states: (V)iral, (I)mmune, and (N)eutral. The viral state
refers to agents infected with the virus. All three states are mutually exclusive; an agent can
only be in one of these states at each time step. The immune state refers to agents immune
to the virus. The neutral state refers to agents prior to having changed state due to receiving
either a viral or immune message. Initially all but one agent are in the neutral state, and a
single agent is infected with the virus. When an agent is infected with the virus, it can send a
warning message to its neighbours with some probability q that immunizes recipients, before
transmitting a viral payload with some probability r that infects recipients. An agent that is
immune, transfers its immunity to all neighbours with probability 1; i.e. a functional agent
can act effectively to warn all its immediate neighbours. Within each round, the immune
messages are sent (either from infected agents, or immune agents) before viral messages are
sent.
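A single round of the model, as recapped above, might be sketched as follows. This is a simplified reading (adjacency dicts, sets of pending messages are my own illustrative choices), with the round ordering, immune messages before viral ones, enforced by letting immunization win ties.

```python
import random

def step(adj, state, q, r, rng):
    """One round of the virus-immune PNM (simplified reading).
    States: 'N' neutral, 'I' immune, 'V' viral. Immune agents warn all
    neighbours (prob 1); infected agents warn with prob q, then send the
    viral payload with prob r. Immunization wins ties within a round."""
    to_immunize, to_infect = set(), set()
    for v, s in state.items():
        if s == "I":
            to_immunize.update(adj[v])
        elif s == "V":
            for u in adj[v]:
                if rng.random() < q:
                    to_immunize.add(u)
            for u in adj[v]:
                if rng.random() < r:
                    to_infect.add(u)
    for u in to_immunize:
        if state[u] == "N":
            state[u] = "I"
    for u in to_infect:
        if state[u] == "N":  # agents immunized this round resist infection
            state[u] = "V"

# Directed 5-cycle, vertex 0 infected, q = 0 (no resilience), r = 1.
adj = {i: [(i + 1) % 5] for i in range(5)}
state = {0: "V", **{i: "N" for i in range(1, 5)}}
rng = random.Random(0)
for _ in range(4):
    step(adj, state, q=0.0, r=1.0, rng=rng)
assert all(s == "V" for s in state.values())  # whole cycle infected in n - 1 steps
```

With q = 0 and r = 1 the sketch reproduces the no-resilience cycle case discussed below: the virus sweeps the whole cycle in n − 1 rounds.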
Let us consider two cases of the virus and immune PNM under very different network
structures, a directed cycle and a directed complete graph. We will assume both networks
have the same number of vertices. The absence of a resilient mechanism corresponds to
q = 0: no immune messages get sent.
In the case of a directed cycle, or maximum connection-stability, each infected agent can
only infect one other neighbour. Let us set r = 1, for maximum virality. In the absence
of a resilient mechanism, the virus will run through the rest of the network in n − 1 steps,
where n is the number of vertices in the cycle. So, in the absence of a resilient mechanism,
the probability the whole network will eventually be infected is 1. We will express this as
p(allinfected) = 1. However, if we now add some degree of resilience via 0 < q ≤ 1, the
expression for p(allinfected) for a directed cycle is:
p(allinfected) = (1 − q)^(n−1).
In such a case, where the immune response is low, say q = 1/n, then as the directed cycle gets
large, n → ∞, p(allinfected) = (1 − 1/n)^(n−1) → 1/e ≈ 0.37.
We have just considered the case of maximum connection-stability. What about the case
of a directed complete graph, which has high cut-stability but very low connection-stability?
In this case, if r = 1, and there is no resilient mechanism, the whole network is infected
in a single step. However, in the presence of a resilient mechanism, 0 < q ≤ 1, the whole
graph can only be infected if no immune message is sent in the first round from the viral
agent. Since there are n − 1 adjacent vertices, each of which has probability q of receiving
a warning message, the probability that none of these vertices receives a warning message is,
again,
p(allinfected) = (1 − q)^(n−1).
Thus, a warning message provides the same level of immunization in both the connection-
stable case of a directed cycle, and the connection-unstable case of a directed complete graph.
Note that, for the directed cycle case, the virus could proceed through several rounds before
a warning message appears to halt its progress, while in the directed complete graph, only
two rounds are possible. If a single agent is immunized in the first round, then in the second
round it will immunize all remaining vertices that have not been infected.
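This closed form is easy to spot-check with a tiny Monte Carlo. Under the r = 1 assumption the outcome for the directed complete graph is decided entirely by the first warning round, so only that round needs simulating; the parameter values below are illustrative.

```python
import random

def all_infected_complete(n, q, rng):
    """Directed complete graph with r = 1: the initially viral agent warns
    each of its n - 1 neighbours independently with probability q, then
    infects every neighbour that was not warned. The whole network ends
    up infected iff no warning got through in that first round."""
    return all(rng.random() >= q for _ in range(n - 1))

rng = random.Random(2)
n, q, trials = 10, 0.2, 20000
est = sum(all_infected_complete(n, q, rng) for _ in range(trials)) / trials
exact = (1 - q) ** (n - 1)  # p(allinfected) = (1 - q)^(n-1), about 0.134 here
assert abs(est - exact) < 0.02
```

The empirical frequency of total infection settles on (1 − q)^(n−1), matching the calculation for both extreme topologies.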
We can characterize these two extreme cases of viral propagation and immune response
in a directed cycle and a directed complete graph by a simple measure called 'connectivity',
commonly used in the ecological network literature [May 00, Rossberg 06, Williams 00,
Yodzis 80, Zorach 03]2. Let m be the number of directed edges, n be the number of vertices,
and c be the connectivity defined on directed edges and vertices. Then c = m/n², and is
interpreted as the ratio of actual to possible directed edges. For a directed cycle, m = n, so
c = m/n² = n/n² = 1/n. For a complete directed graph with self-loops, m = n², so
c = m/n² = n²/n² = 1. Thus, for directed graphs that are strongly connected, 1/n ≤ c ≤ 1.
2 The ecological literature has some terminological inconsistencies and variant definitions that could lead to confusion. 'Connectivity' as defined here is also occasionally called 'complexity' in the ecological literature, and sometimes connectivity is defined as c = m/v [Zorach 03], which is also known as the 'link density'. I am using connectivity as it was used by May [May 00]: pg 63 and [Williams 00]: pg 180, as the ratio of actual to topologically possible directed edges. However Yodzis [Yodzis 80], even in citing May [May 00], limits connectivity to the fraction of off-diagonal elements. The common idea in all these variant definitions is to measure how many directed edges there are relative to the directed edges possible.
Our simple calculations suggest the warning message may have greater opportunity to
race ahead at higher connectivities. In the next section we take a closer look at how network
connectivity may affect resilience.
7.4.3 Resilience Example 3: Virus Immune Response Under Different Network Connectivities
Essentially the two cases from the previous section could be seen as a ‘game’, somewhat like
Go, between the viral and immune response played out on a board that is the network’s
topology. We want to understand what the board needs to look like to give the immune
response its best chance of running ahead. We will do so by imagining how this game gets
played out in networks with different connectivities.
Let us consider a small simulation-based experiment where we examine networks of
various connectivities, from low to high, and various levels of virulence and immune response,
from low virulence and high immune response to high virulence and low immune response.
We simulate the virus and immune PNM on small networks with 25 agents. We have
four levels of connectivity: (S)parse = 25 edges (c = 25/625 = 0.04), '(C)ritical' = 50 edges
(c = 50/625 = 0.08)3, (M)oderate = 100 edges (c = 100/625 = 0.16) and (H)igh = 300 edges
(c = 300/625 = 0.48). Let GD(n,m) be a directed graph drawn randomly from the family of
3 The '(C)ritical' network size of 50 edges is so-called because it represents a connectivity level greater than the point in the corresponding undirected random graph at which the giant component appears in Erdős–Rényi random graph models [7]. In an undirected graph, the giant (connected) component appears at a phase transition where the probability of an edge being selected for a graph with n vertices is 1/n. The average degree of a vertex at that point is n × 1/n = 1. Chung and Lu [7] note that when the average degree is less than 1, all connected components are small and there is no giant component. When the average degree is greater than 1, the giant component is present. By implication, when the probability of edge selection is 1/n, the giant component is in the process of emerging.
all graphs with n vertices and m directed edges [Luczak 90], which is the directed graph
extension of the G(n,m) model [Erdos 60, Luczak 94]. In these simulations, n = 25, and m
varies from 25 to 300. For each connectivity level, three different virus and immune response
levels are considered. Low virulence, high immune response is r = 0.1 and q = 0.9. Moderate
virulence, moderate immune response is r = q = 0.5. High virulence, low immune response
is r = 0.9 and q = 0.1. For each connectivity level, three different random networks are
generated from the GD(n,m) model. For each network ten simulations are run at each of
the three virus and immune response levels. The initial infected vertex is randomly selected
in each simulation. In total, 360 simulations were run on twelve different networks, with
30 simulations for each of the twelve combinations of network connectivity level and virus
and immune response level.
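The experimental design above can be laid out as a loop sketch; `gd_graph` is an illustrative stand-in for the GD(n, m) draw (self-loops excluded for simplicity), and the PNM run itself is elided.

```python
import random

def gd_graph(n, m, rng):
    """Illustrative GD(n, m) draw: m distinct directed edges chosen
    uniformly from the n*(n-1) possible (self-loops excluded here)."""
    possible = [(i, j) for i in range(n) for j in range(n) if i != j]
    return rng.sample(possible, m)

n = 25
edge_counts = {"S": 25, "C": 50, "M": 100, "H": 300}    # connectivity levels
response_levels = [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]  # (r, q) pairs

rng = random.Random(3)
runs = 0
for level, m in edge_counts.items():
    for _ in range(3):                      # three random networks per level
        edges = gd_graph(n, m, rng)
        for r, q in response_levels:
            for _ in range(10):             # ten simulations per combination
                start = rng.randrange(n)    # random initial infected vertex
                # ... run the virus-immune PNM on `edges` from `start` ...
                runs += 1
assert runs == 360                          # 4 x 3 x 3 x 10 simulations
```

The loop structure makes the bookkeeping explicit: four connectivity levels by three networks by three response levels by ten runs gives the 360 simulations reported.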
In the spirit of exploratory data analysis [Chambers 83, Cleveland 85, Diaconis 85, Wainer 05,
Tukey 66] we focus on a simple graphical summary of average trends in our experiment as
we vary connectivity and the relative strength of viral and immune responses. Simulation
results are summarized in Figure 7.1, a simulation plot matrix which illustrates the average
progress of simulations across connectivity levels and virus and immune response levels. For
each plot in the matrix, blue squares represent neutral vertices that have not changed their
state, green triangles represent vertices that are immune, and red diamonds represent vertices
that are infected with the virus. Looking at the matrix from left to right (low virus/high
immune to high virus/low immune), the immune response acts as a resilient mechanism that
protects some portion of the network, even when virus response is high relative to immune
response. Looking at the matrix from top to bottom (low connectivity to high connectivity),
the immune response increases in effectiveness (number of vertices immunized) as connec-
tivity increases, and regardless of the specific combination of viral and immune response.
In all cases, the immune response acts as a resilient mechanism, protecting some agents.
In the top row, representing low connectivity, the lack of large connected components limits
Figure 7.1: Simulation Plot Matrix. From left to right, viral level increases. From top to bottom, network connectivity increases. Red diamonds: viral vertices. Blue squares: neutral vertices. Green triangles: immune vertices. For each combination of virus level and network connectivity the average of 30 trials is summarized by iteration.
the progress of both the virus and the immune response. Finally, if we focus on the middle
column where the viral and immune response approximate the flipping of a fair coin, the
immune response exceeds the viral response in all four instances. For the bottom two rows,
representing moderate and high connectivity, all agents that begin neutral switch to viral or
immune by the end of the simulation, indicating they are part of the same strongly connected
component.
This small simulation experiment confirms the idea suggested by our earlier calculations
that the immune response, as a connection-resilient mechanism, is most effective as connec-
tivity increases. It provides increasing resilience as the structural connection-stability of the
network decreases. The reason the immune response of sending a warning message can run
ahead of the virus, even when both are at the same level, is simply that if a
'healthy' agent receives the warning message, it can warn all its neighbours, a reasonable
assumption in biological systems. If applied to a computational setting, this is equivalent
to assuming that healthy processors can propagate the warning message to neighbours with
probability 1.
7.4.4 Insights from the Examples: A Little Resilience Can Go A Long Way.
In all three examples, the resilient mechanisms provide a degree of protection, but not nec-
essarily absolute protection. In the case of resistant agents (agent hardening), the idea of
absolute resistance (no failure under attack) is unrealistic at the level of individual agents4.
However, even moderate resistance (say failure under attack is equivalent to flipping a fair
coin), preserves parts of the network for several rounds of attack, where the network would
have otherwise been cut to pieces in the first round. In the examples concerning a viral and
immune response, if the immune response and viral response have the same initial proba-
bility of propagation, the immune response has an advantage that increases as the system
4 It might however be more realistic as a goal at the level of networks or systems that absolutely must not fail, where guarantees of lack of failure might be achieved via multiple redundant systems.
connectivity increases. If we can assume that healthy agents (cells, processors) having re-
ceived a warning message can communicate efficiently with all their neighbours, the immune
response can race ahead. In a sense, the immune response could be looked at as a ‘good
virus’5 that has a home court advantage over the ‘bad virus’. The implication is that re-
silient mechanisms need not provide absolute guarantees of protection under various kinds
of attack, but a higher propensity for parts of the system to remain functional under various
kinds of attack, whether they be cut-attacks or connection-attacks.
Our examples have been exceedingly simple, focussed on extremely basic resilient mechanisms
acting singly. In many biological systems homeostasis is maintained by multiple
redundant systems and processes, which suggests that the goal should not be to design
'the' resilient mechanism that can handle all situations, but to design an array of resilient
mechanisms that may act together to handle different situations. Furthermore, while for
simplicity, we have the same resilient mechanism operating in each agent, diversity of re-
silient mechanisms across agents will make life harder for a malware designer, by putting
them essentially in the same position the anti-virus designer is in today.
7.5 Refining Resilient Mechanisms
We now briefly consider a few possible refinements for the virus–immune PNM which, while
they add complexity to the model, move it towards greater realism, and illustrate the idea
that different resilient mechanisms may work together.
7.5.1 Combining Resilient Mechanisms: Agent Resistance and Immune Response
One resilient mechanism that can contribute to the stability of a network is to make agents
‘resistant’ to failure. We have examined the consequences of such a resilient mechanism
earlier in the context of directed attacks. Now let us consider adding such a form of resilience
5 The idea of viewing the immune response as a beneficial virus was suggested by John Aycock.
to the existing virus–immune PNM. We will start off with a basic PNM for viral resistance,
and then combine it with our existing virus–immune PNM.
As before, we would like to black-box the low-level mechanisms – either due to hardware,
software, or cellular processes – by which a network component can be resistant to viruses,
and focus instead on the effects of a degree of resistance on the network; we are interested
in the consequences of a particular level of resistance. This approach is common in bio-
logical studies of epidemiology [Daley 99], where the immunological specifics of low level
host-intrusion and defence mechanisms at the level of individual agents (vertices in our case)
are black-boxed, to concentrate on how a contagion spreads through a population (networks
in our case) given particular assumptions about contact (adjacent agents in our case).
An agent is considered ‘exposed’ if it receives a viral message. Let us assume that having
been exposed, an agent transits to the viral state with some probability s that reflects how
susceptible the agent is to being infected upon exposure. Resistance, then, is the complementary probability 1 − s of not transiting to the viral state when exposed. Let r be the
probability that an infected neighbour i transmits a viral message to j. The neighbourhood
around some agent j is symbolized by Γ(j).
We first need to model exposure in a network, and then resistance given exposure.
The random variable Z_j^t is 1 if vertex j is exposed at time t, otherwise it is 0.
The random variable X_j^t is 1 if vertex j is infected at time t, otherwise it is 0.
The basic exposure model is:
p(Z_j^t = 1 | Z_j^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t−1} ∗ r).
The basic resistance model is6:
6 This is the same resistance model we have seen earlier, p(F_j^t = 1 | A_j^{t−1} = 1) = fail. We have merely changed notation to be consistent with the virus–immune PNM. Thus we now speak of ‘exposure’ and ‘susceptibility’, rather than ‘attack’ and ‘failure’. We use the probability s for susceptibility, rather than its complement, so we may make statements in terms of probability of a change of state (rather than probability of remaining in the same state).
p(X_j^t = 1 | Z_j^{t−1} = 1) = s.
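To make the two rules concrete, they can be simulated directly on a small graph. The sketch below is illustrative only: the adjacency-list representation, parameter values, and function names are my own assumptions, not part of the thesis.

```python
import random

def step_exposure(graph, infected, exposed, r, rng):
    """Basic exposure model: an unexposed vertex j becomes exposed with
    probability 1 - prod over i in Gamma(j) of (1 - X_i * r)."""
    newly_exposed = set()
    for j in graph:
        if j in exposed:
            continue
        p_not_exposed = 1.0
        for i in graph[j]:
            if i in infected:
                p_not_exposed *= (1.0 - r)  # infected neighbour i fails to transmit
        if rng.random() < 1.0 - p_not_exposed:
            newly_exposed.add(j)
    return exposed | newly_exposed

def step_resistance(exposed, infected, s, rng):
    """Basic resistance model: an exposed, uninfected vertex transits to the
    viral state with susceptibility s; with probability 1 - s it resists."""
    newly_infected = {j for j in exposed - infected if rng.random() < s}
    return infected | newly_infected

# A ring of six agents, each adjacent to its two neighbours.
graph = {j: [(j - 1) % 6, (j + 1) % 6] for j in range(6)}
rng = random.Random(1)
infected, exposed = {0}, {0}
for _ in range(4):
    exposed = step_exposure(graph, infected, exposed, r=0.9, rng=rng)
    infected = step_resistance(exposed, infected, s=0.5, rng=rng)
```

Note that an agent which resists remains in the exposed set and is re-tested on later rounds, matching the simplifying assumption (revisited below) that resistance is constant under repeated exposure.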
To incorporate this simple viral resistance PNM into our viral–immune PNM, we first
have to modify our round structure from Chapter 6.
• Round 0 – initial infection (of a single agent in the network).
• Round 1, Phase 1 – an immune message is sent.
• Round 1, Phase 2 – the viral payload is sent (exposure).
• Round 1, Phase 3 – the viral payload is either accepted (infection) or rejected (resis-
tance) by the agent.
The phase structure within rounds serves to linearize the order of various message types,
thus preventing race conditions where two types of messages (say immune and viral) arrive
simultaneously. Practically speaking, the phases represent the situation where, as a virus
infects an agent, the agent has some probability of sending out a warning message before it
becomes contagious, and where, upon exposure to a virus, there is a delay before the vertex
is either infected or able to resist the virus.
Our modified model(s) incorporating both resistance and immune response is (are):
Resistance Model:
The previously infected case:
p(X_j^t = 1 | X_j^{t−3} = 1) = 1.
The previously uninfected case:
p(X_j^t = 1 | X_j^{t−3} = 0 ∧ Z_j^{t−1} = 1) = s.
Exposure Model:
p(Z_j^t = 1 | Z_j^{t−3} = 0 ∧ Y_j^{t−1} = 0 ∧ ∀i∈Γ(j) Y_i^{t−1} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t−2} ∗ r).
Immune Model:
The previously immune case:
p(Y_j^t = 1 | Y_j^{t−3} = 1) = 1.
The previously immune neighbours case:
p(Y_j^t = 1 | Σ_{i∈Γ(j)} Y_i^{t−3} > 0 ∧ Y_j^{t−3} = 0) = 1.
The previously infected neighbours case:
p(Y_j^t = 1 | X_j^{t−1} = 0 ∧ Y_j^{t−3} = 0 ∧ ∀i∈Γ(j) Y_i^{t−3} = 0) = 1 − ∏_{i∈Γ(j)} (1 − X_i^{t−1} ∗ q).
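The round and phase structure of the combined model can be exercised in a small simulation. The following is a rough sketch under my own simplifying assumptions (synchronous updates, a ring topology, invented parameter values, and agents summarized as one state rather than separate X, Y, Z variables); it is not the thesis's implementation.

```python
import random

NEUTRAL, INFECTED, IMMUNE = "neutral", "infected", "immune"

def run_combined_pnm(graph, start, r, s, q, rounds, seed=0):
    """One virus-and-immune-response run with per-round phases:
    Phase 1: warning messages (immune neighbours immunize with probability 1,
             infected neighbours warn with probability q);
    Phase 2: viral exposure (each infected neighbour transmits with probability r),
             blocked for agents that just became immune;
    Phase 3: exposure resolves into infection with susceptibility s, else resisted."""
    rng = random.Random(seed)
    state = {j: NEUTRAL for j in graph}
    state[start] = INFECTED                  # Round 0: a single initial infection.
    for _ in range(rounds):
        # Phase 1: an immune message is sent.
        immune_next = set()
        for j in (v for v in graph if state[v] == NEUTRAL):
            if any(state[i] == IMMUNE for i in graph[j]):
                immune_next.add(j)
            elif any(state[i] == INFECTED and rng.random() < q for i in graph[j]):
                immune_next.add(j)
        # Phase 2: the viral payload is sent (exposure).
        exposed = set()
        for j in (v for v in graph if state[v] == NEUTRAL and v not in immune_next):
            if any(state[i] == INFECTED and rng.random() < r for i in graph[j]):
                exposed.add(j)
        # Phase 3: the payload is accepted (infection) or rejected (resistance).
        for j in immune_next:
            state[j] = IMMUNE
        for j in exposed:
            if rng.random() < s:
                state[j] = INFECTED
    return state

# A ring of ten agents; the virus starts at agent 0.
ring = {j: [(j - 1) % 10, (j + 1) % 10] for j in range(10)}
final = run_combined_pnm(ring, start=0, r=0.8, s=0.6, q=0.8, rounds=10)
```

Varying s, q, and r in such a sketch lets one observe the race between the warning message and the virus under different levels of resistance and immune responsiveness.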
The key effort in creating a joint model incorporating both resistance and immunity is to
first modify the phase structure within rounds, and second to modify the conditions associ-
ated with the probability of an event such as exposure. In this combined model, resistance
and immunity via the warning message work in a complementary fashion to promote the
resilience of the network to a viral attack. Immunity via the warning message blocks the
paths along which the virus may propagate (so not all of MFS(v∗, G) can be reached) while
resistance slows down the rate of viral propagation (increasing the time, T , required for the
virus to propagate through the network). Jointly, these two effects promote the likelihood
of the warning message racing ahead and the virus being contained to a portion of the net-
work, and are together more effective than either mechanism would be singly. The warning
message can race ahead even faster now relative to the viral message. This illustrates, in a
simple way, the benefits of multiple resilient mechanisms working together, which is the true
basis of immune responses in biology.
Hardening agents so they are resistant to attacks can provide resilience in the face of
both cut-attacks, such as targeted denial of service attacks, and connection-attacks
such as viruses. In transferring these insights from models to real systems, we must of
course dig into the black box of the specific mechanisms used to harden a real world agent
(be it a cell, an individual organism, or a processor). The resilient mechanism for hardening
against a denial of service attack is unlikely to be the same as that required for providing
viral resistance. While it is useful to first understand the effects of different kinds of resilient
mechanisms in general, a natural progression in the development of PNMs is to next consider
the detailed mechanics of specific resilient mechanisms7. For now, we will continue with our
investigation of how the current virus–immune PNM might be further elaborated without
getting into the specifics of how resistance and immunity will be orchestrated in detail. As
those details are added in, one begins to move from model to virtual implementation.
7.5.2 Further Refinements to the Virus and Immune Response PNM
In creating our combined model, we have made a number of simplifying assumptions, to keep
the model reasonably tractable. The simplifying assumptions include:
• Resistance is constant under repeated exposure.
• An immune response where receipt of a warning message is sufficient to convey
immunity, and where immune nodes immunize their neighbours with probability 1.
We briefly sketch how such assumptions may be made more realistic, at the price of
increased model complexity. As we elaborated the mutualism PNM into an ecosystem PNM
in Chapter 6, we can continue to refine the virus–immune PNM by refining and combining
existing resilient mechanisms, and developing new ones.
First of all, let us assume resistance is not constant, but a function of repeated exposure,
such that on each subsequent exposure of an agent, resistance decreases (or conversely susceptibility increases). Let us assume a series of probabilities s_1, s_2, s_3, ..., s_N with relationship s_1 < s_2 < s_3 < ... < s_N. Let us assume an indexed series of random variables where the nth stage towards infection at some time t for some vertex j is represented by X(n)_j^t. Then,
7 The Virus Group at University of Calgary Computer Science, in developing the initial version of the virus–immune PNM, has considered several more detailed mechanisms that could be elaborations of that PNM. A partial list includes: (a) adding a virulence period to the PNM, (b) adding terms to the PNM that reflect the work it might take to send an immune message to all adjacent agents, (c) adding delay parameters to reflect cases where a virus is not immediately detected, (d) having an initial distribution of infected agents, rather than a single infected agent, and (e) having identifiers of infected agents travel with the immune agent, so only messages from those agents are selectively blocked (which essentially also dynamically alters network structure in terms of open and blocked channels). All of these elaborations consist of expanding on the base case that immunity is conferred via first receiving a warning message.
p(X(n)_j^t = 1 | Z_j^{t−1} = 1 ∧ X(n−1)_j^{t−3} = 1) = s_n.
In making such an adjustment, we have done two things. First, we have made agents
increasingly susceptible upon repeated exposures, which is often a realistic assumption.
Second, we have added a very weak form of memory. In some sense, by introducing an indexed
series of states, each agent now ‘knows’ how many times it has been exposed.
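A declining-resistance schedule of this kind might be sketched as follows. The linear interpolation and the function names are my own illustrative choices; the thesis requires only that the series be increasing.

```python
def susceptibility_schedule(s_low, s_high, N):
    """An increasing series s_1 < s_2 < ... < s_N of susceptibilities,
    interpolated linearly between a floor s_low and a ceiling s_high.
    (The linear form is an assumption made for illustration.)"""
    if N == 1:
        return [s_low]
    return [s_low + (s_high - s_low) * n / (N - 1) for n in range(N)]

def susceptibility_on_exposure(n, schedule):
    """Susceptibility s_n applied on an agent's nth exposure (1-indexed);
    exposures beyond the Nth saturate at s_N."""
    return schedule[min(n, len(schedule)) - 1]

sched = susceptibility_schedule(0.1, 0.9, 5)
```

The index n into the schedule is exactly the weak memory described above: each agent carries a count of how many times it has been exposed.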
We could further make immunity more realistic by assuming that an immune neighbour does
not immunize with probability 1, but with some probability l between 0 and 1. The equation
for the case of immune neighbours would then have to be suitably adjusted along the lines
of,
p(Y_j^t = 1 | Σ_{i∈Γ(j)} Y_i^{t−3} > 0 ∧ Y_j^{t−3} = 0) = 1 − ∏_{i∈Γ(j)} (1 − Y_i^{t−3} ∗ l).
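The adjusted rule reduces to the original deterministic rule when l = 1. A minimal sketch, with naming of my own invention:

```python
import random

def immunized_by_neighbours(neighbour_immune_flags, l, rng):
    """Probabilistic immune-neighbours case: each immune neighbour (flag Y_i = 1)
    independently confers immunity with probability l, so overall
    p(immunized) = 1 - prod over i of (1 - Y_i * l)."""
    p_not_immunized = 1.0
    for y in neighbour_immune_flags:
        p_not_immunized *= (1.0 - y * l)
    return rng.random() < 1.0 - p_not_immunized

rng = random.Random(7)
# l = 1 with at least one immune neighbour recovers the deterministic rule.
certain = immunized_by_neighbours([1, 0, 1], l=1.0, rng=rng)
# l = 0 can never confer immunity.
never = immunized_by_neighbours([1, 1, 1], l=0.0, rng=rng)
```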
Other possibilities are easily imaginable, such as having immunity conferred only on
agents some distance away from the original warning message. This corresponds to the
notion that as the warning message races ahead, only those vertices at a distance from the
infection centre have the time to develop an effective immune response8.
Each of these modifications towards greater realism adds complexity to the model, so
there is a necessary balance between the effort required to make a useful model, and the
additional complexity added along the way. With reasonable effort, future versions of the
combined model may be made much more ‘realistic’ with respect to the nature of the resis-
tance and immune responses.
Even our simplifying assumption of a phase structure, which linearizes the order in which
different types of messages arrive, may be removed, though at the additional complexity of having
to add precedence rules for different kinds of messages that may be seen to be arriving (near)
simultaneously, moving us into the realm of asynchronous models [Attiya 04].
8 This idea was suggested by J. Denzinger.
7.6 The Epidemiological and Immune Metaphors in Computer Science
Some metaphors run deep, and the concept of a virus resonates in both the literature of
biology and computer science. Responses to viruses can be looked at from two different
biological perspectives, epidemiological and immunological. The epidemiological perspec-
tive concerns how viruses progress through a population of individuals, population patterns
(contact structure) that can promote or delay viral propagation, as well as techniques that
can slow or prevent such viral progress, such as an inoculation program. The immunological
perspective concerns how cellular mechanisms within an individual (the immune system) can
first recognize, and secondly block the spread of a virus, or any other foreign agent (antigen).
Immunological processes incorporate the functioning of several tissues and organs as well as
their interactions and chemical products (antibodies) to allow individuals to maintain home-
ostasis in the face of both external invaders, and internal processes9. Depending on how
we choose to interpret the virus–immune PNM under a particular network structure, as a
population of cells within an individual, or as a population of individuals, we could adopt
either perspective. Both perspectives have been applied to computer networks, but in different
ways. Each perspective has different implications. For example, immunization programs
are focussed on mass inoculation and not on the propagation of an immune response. The
inoculation approach has been adopted in the development of software anti-virus packages.
However, immune system responses have been adopted in biologically inspired network secu-
rity. We will briefly examine each perspective, the way it has entered into computer science,
and its relationship to resilient mechanisms.
The literature on mathematical epidemiology is broad and has a long history [Hethcote 00,
Nowak 06] in the biological literature. However, the starting point for a mathematical epi-
demiology model is usually to begin with a set of equations that represent reasonable as-
sumptions about the dynamics of a virus (either based on intuition, or the study of existing
empirical data). The basic epidemiological model, of which most other models can be
considered elaborations, is the (S)usceptible, (I)nfected, (R)ecovered model (SIR model)
[Hethcote 00]:pg. 604. This model consists of three differential equations, tracking
Susceptible, Infected, and Recovered population members over time.
9 Cohen [Cohen 00a]:pp. 103–105 presents an agent based view of the immune system in terms of the agents required, their arrangement in space, and their interactions in time.
dS/dt = −BIS.
dI/dt = BIS − vI.
dR/dt = vI.
S, I, and R represent the states susceptible, infected and recovered, respectively. B is
the contact rate, or likelihood of obtaining a disease via contact with an infected subject,
and v is the rate of recovery from an infection (the time it takes an individual to get over a
viral infection).
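The three rate equations can be integrated numerically. Below is a simple forward-Euler sketch; the parameter values are illustrative choices of my own, since the thesis quotes the equations but not any particular values.

```python
def sir_step(S, I, R, B, v, dt):
    """One forward-Euler step of the SIR system:
    dS/dt = -B*I*S,  dI/dt = B*I*S - v*I,  dR/dt = v*I."""
    dS = -B * I * S
    dI = B * I * S - v * I
    dR = v * I
    return S + dS * dt, I + dI * dt, R + dR * dt

# Fractions of a closed population; a small initial infection seeds the epidemic.
S, I, R = 0.99, 0.01, 0.0
for _ in range(1000):
    S, I, R = sir_step(S, I, R, B=0.5, v=0.1, dt=0.1)
```

Because the three derivatives sum to zero, S + I + R is conserved: individuals only move between compartments, never enter or leave the population.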
Essentially the progress of a virus through a population is modeled as transfers between
sequential compartments. Susceptible individuals are transferred to the Infectious compartment and finally to the Recovered compartment. Several studies on network epidemiology have
focussed on transferring classic models of population epidemiology from their original bi-
ological context [Daley 99] to a network context that can apply to Internet epidemiology
[Calloway 00, Newman 02a, Pastor-Satorras 01, Yuan 08]. This interaction has also worked
in the reverse direction, where network based models are applied back to biology [Meyers 05].
A recent work that considers immunity [Mishra 07] in a network context models it as part
of the recovery state in an SIR-type model, and so does not consider its propagation.
The virus–immune PNM is even simpler than these epidemiological models in one sense,
and more complex in another. It is simpler in that the model has no recovery component.
This is akin to considering the infection deadly and without recovery. It is more complex
in that a new compartment is added, immune. While our virus–immune PNM is based on a
probabilistic rule set, the simulation results of the model (see Figure 7.1) can be recast as
rate equations similar to those in the SIR model10.
The obvious insight from Figure 7.1 is that the nature of the resulting rate equations
would be dependent on network connectivity. At each level of connectivity, there is a different
pattern of propagation for the virus and the immune response. To make a direct comparison
between our virus–immune PNM, and the simplest classic epidemiological model, SIR, we
would have to both add a component to and remove a component from our model. We would
have to add a state transition for recovery and an associated lag time. We would have to
remove the immune response, which is not a feature of classic epidemiological models.
Our current virus–immune PNM freezes or halts eventually in that over iterations every
agent that changes state, enters either a viral or immune state, and once in that state,
does not change. If the PNM were extended so that there is a recovered state (essentially
a transition from the viral state back to the neutral state such as in SIS models), there
is the possibility of cycles and chaotic behaviour developing within the model. Modified
models where the recovered state is again susceptible to viral infection, and where immunity
is temporary, may result in waves of viral and immune responses running through the system
without ever settling down.
While epidemiological models in biology focus on populations of organisms [Hethcote 00,
Nowak 06], immune responses are properties of individuals, arising from interactions between
cells and tissues. We could consider immunity an intrinsic property we can attempt to
build into a distributed system so that it autonomously deals with infection. Just such
an approach is beginning to appear in the literature on Internet worms [Cheetancheeri 06,
Costa 05]. Immune system inspired resilient mechanisms do come into play in the network
security literature [Forrest 97a, Somayaji 04], however at the low level of the mechanics of
a particular immunological mechanism, focussing on anomaly detection and distinguishing
‘self’ from ‘non-self’ (i.e. autonomous detection of potential viruses or other malware). This
10 A closely related variant is the SIS model (Susceptible, Infected, Susceptible), whose major difference is that the infected vertices’ recovery state is one in which they are again susceptible to infection.
has led to the concept of developing agent based artificial immune systems for computational
networks whose behaviour is analogous to that of natural immune systems [Forrest 07]. The
effectiveness of such low level resilient mechanisms, whether immunologically inspired or
not, can be used to empirically assign the resistance and immune response probabilities in
PNMs. Simultaneously, the biological immunology literature is beginning to incorporate
network structure into models of specific immunological processes [Callard 05]. Unlike the
epidemiological literature, where existing models are being transferred directly from biology
to computer science, it is the concept of an immune system orchestrating specific immune
mechanisms to recognize self from non-self, to isolate non-self, and to engage in self-repair
that is being transferred from the biological literature into computer science. The virus–
immune PNM, if it is interpreted as a virus propagating through cells, fits well within this
immunological perspective.
From a distributed systems perspective [Attiya 04, Ozsu 99] the propagation of an im-
mune response arises naturally from a design perspective that focuses on message passing
and communication overhead. In such contexts, the network represents processors (agents)
and communication channels (directed edges). Distributed processors can send varying mes-
sages along channels. Viruses and immune responses are essentially just differing message
types, whose receipt either compromises functionality (viruses) or helps maintain function-
ality (immune responses) relative to the goals the system is intended to fulfill.
Topological network stability, and the derived concepts of resilient processes and mechanisms, provide a theoretical framework within which we can investigate specific epidemiological models and immune system inspired processes and mechanisms. The PNM framework
provides a general modelling approach to explore the interplay of these different models and
mechanisms.
7.7 An Evolutionary Perspective
The perspective that unifies sub-disciplines in modern biology is evolution. Resilient pro-
cesses in biology are not designed, but are products of evolution. Increasingly we are seeing
evolutionary themes, originally developed in a biological context, entering into computer
science. We began this chapter with our network architect learning a little about resilient
processes in biology from an immunologist with an evolutionary bent. The Red Queen Hy-
pothesis developed by Leigh van Valen [Ridley 93, van Valen L. 73], focuses our attention on
the fact that host-parasite arms races have a strong tendency to develop in complex systems.
They are to some extent unavoidable. The notions of vertical and horizontal resistance are
well established in crop science [Robinson 95], and lead to alternate breeding strategies11.
Antivirus inoculation systems may be an effective resilient mechanism, but one with a high
overhead in terms of keeping up in the resulting arms race (by staying current with viral sig-
natures), and with both the strengths (total resistance for a period of time) and weaknesses
(when resistance is overcome susceptibility is complete) of vertical resistance. As computer
security expert Bruce Schneier notes [Schneier 04]:pg. 154, ‘Viruses have no cure. It’s been
mathematically proven that it is always possible to write a virus that any existing antivirus
program can’t stop. ... if the virus writer knows what the antivirus program is looking for,
he can always design his virus not to be noticed. Of course, the antivirus programmers can
always create an update to their software to detect the new virus after the fact.’ In the
evolutionary arms race in technology between viruses and antivirus programs, the antivirus
programs are always one step behind, playing catchup. In a similar vein, Balthrop et al.
conclude that vaccination strategies, whether targeted or random, are unlikely to be effective
across all network structures, and suggest instead a dynamic mechanism such as ‘throttling’
(limiting the number of connections a computer can make to other machines), essentially a
connection-resilient mechanism that increases T.
11 Specifically, breeding programs for vertical resistance have emphasized inbred lines, while programs for horizontal resistance have emphasized heterogeneous populations [Robinson 95].
In this chapter we have illustrated the ways in which a few resilient mechanisms analogous
to horizontal resistance may be useful in providing resilience to a network. Such mechanisms
do not provide absolute protection of all agents (cells, processors), but they do provide
general mechanisms by which simple aspects of immune systems drawn from biology can be
built into technological networks.
Diversity, homeostasis, and immune system inspired defence mechanisms are suggested
as sources of design principles that can be transferred from evolutionary biology to computer
science to develop more robust systems [Forrest 97a, Forrest 97b, Forrest 05, Somayaji 07a].
Indeed some computer scientists have begun scouring the evolutionary literature for addi-
tional evolutionary metaphors [Somayaji 04] that can be used as design principles.
In describing the many ‘bad things that can happen to a network’ our language for
networked systems can borrow from biology [Somayaji 07b]. Thus, in this chapter we have
used terms such as ‘exposure’, ‘virus’, ‘resistance’ and ‘immune response’, all of which are
common in the immunological [Cohen 00a] and epidemiological [Daley 99] literature. We
have chosen to focus on a high-level view of resilient mechanisms, concentrating on the effects
of such mechanisms as resistance, or the propagation of a warning message, but black-boxing
lower level details as to the construction of such resistance or warning systems. This allows
us to concentrate on the effects if such mechanisms existed, to quickly examine different
mechanisms, and to determine those that might be most effective given a particular network
architecture. A natural follow-up is then to begin working on the constructive details of
those resilient mechanisms that prove particularly effective under a particular architecture.
Studies of how technological systems change over time are increasingly leading to analo-
gies between the behaviour of complex technical systems and development and evolution
in ecosystems and organisms. Forbes in ‘Imitation of Life’ [Forbes 04] details the range of
concepts and techniques from biology that are being brought into computer science. Biology
provides vocabulary, concepts, and analytical methodology to track changing organization
in systems, and these tools are increasingly applied outside of biology. Examples of other disciplines
borrowing from biology are Arthur’s theory of the evolution of technology [Arthur 09], Huberman’s interpretation of the Internet as an ecology of information [Huberman 01], and
Dovrolis’s view of the Internet as an evolving ecosystem that has many parallels with biological evolution [Dhamdere 08, Dhamdere 10, Dovrolis 08a, Dovrolis 08b, Rexford 10]. Dovrolis
focuses on the concept of evolvability, those features of the Internet that allow it to adapt as
its environment changes. This is an example where technological systems are borrowing evo-
lutionary concepts even while they are in active debate in biology [Kitano 04, Wagner 05]. At
the fundamental level of computation itself, there is the recognition that computations occur
within biological structures such as cells and DNA, and that the molecular interaction and
diffusion-based methods of computation in these biological structures provide a ‘biotechnology’ for computation [Adleman 98, Bray 09, Calude 01, Conrad 85, Shapiro 06, Winfree 98]
that is very different from our traditional computer architectures and their underlying com-
putational models.
Given the analogies between biological patterns and the development of technological sys-
tems, it is natural to adapt into computer science concepts first developed in biology. In the
opposite direction, biology borrows concepts from computer science such as the application
of distributed system concepts in the study of immune systems [Segel 01], the analogy that
biological processes are like distributed system protocols [Doyle 05, Doyle 07] or the explicit
use of message-passing as a framework for biological signalling that is applied in this chapter.
In new interdisciplinary fields such as systems biology, where theory construction includes
computational models encompassing complex biological phenomena, the opportunities for
conceptual transfer are particularly rich. In particular, studies of system structure and process designs that have resulted from evolution in biological systems may be apt starting points
for testable designs for complex technological systems [Doyle 05, Doyle 07].
Computer science is perhaps unique among the sciences for its ability to borrow concepts
widely from other disciplines and fashion them into tools for its own use. Past examples have
been the incorporation of biological ideas into the development of neural networks (which by
a long chain of development led to graphical models), genetic algorithms and evolutionary
computing, reinforcement learning, swarm computing, and semiochemical models. As such
computer scientists have unique opportunities to develop cross disciplinary insight as they
draw methods and concepts into their discipline by abstracting from other disciplines.
While biology may be a source for guiding concepts for resilient mechanisms that can
dynamically stabilize complex networks, there is no reason to assume that effective resilient
mechanisms are limited to the biological. It may even be possible to take a genetic algorithms
approach and search through the space of possible resilient mechanisms for those whose
effects prove most stabilizing under a particular network architecture (which we could call
‘narrow-sense resilience’) and those which are stabilizing under a wide range of network
architectures (which we could call ‘broad-sense resilience’).
Topological network stability, dynamic resilience, and the PNM approach provide a the-
oretical and modelling framework which allows computer scientists to borrow concepts from
a wide range of biological systems focussed on interactions, strip away much of the biological
detail, until the interactions are laid bare, and then examine the structure of interactions, and
mechanisms for interactions. In doing so, there are conceptual benefits to computer science
in terms of inspiration for new algorithms and stabilizing mechanisms, and benefits back to
those sciences which are conceptually drawn upon by computer science in the development
of new computational models and analyses – new tools for thought useful in epidemiology,
immunology, ecology, systems biology and evolutionary biology. Along the way, we move
from loose analogies, to suggestive metaphors, to the transfer of logical structure from one
discipline to another, seeking unification. The goal of the previous seven chapters has been
to take a small step on this path, to elucidate a single concept – topological stability in
complex networks – and its consequences, dynamic resilient processes and mechanisms.
Chapter 8
The Nascent Moment
This is the moment
of stillness
when before
and after
truncate
at the birth
of a distinction:
what is now,
was not then.
8.1 Abstract
We briefly review the major contributions of this thesis and identify several future research
directions. Finally, we speculate on the origin of interactions that are the basis of com-
plex networks in biology, and that may begin to drive the evolution of complex networks in
technology.
8.2 Recap of Contributions
Chapter 1 listed the contributions this thesis makes towards a theory of topological stability
and dynamic resilience in complex networks:
1. Definitions of cut-stability, connection-stability and balanced-stability are pro-
vided. The ways in which these concepts may be related to information theory
are also developed (Chapters 1, 4).
2. The antagonism between cut-stability and connection-stability is demonstrated
(Chapter 4).
3. A formal model for PNMs is developed, and PNMs are designed that reflect a
range of biological processes associated with stability (Chapters 5, 6).
4. Resilient processes and resilient mechanisms are defined (Chapter 7).
5. A PNM representing a virus and immune response is explored to identify
conditions under which a resilient mechanism is effective (Chapter 7).
6. Interdisciplinary contributions are made at various points. Topological sta-
bility concepts are applied to error and attack tolerance in technological net-
works, to stability in ecosystems, and are connected to some current concepts
in social networks (Chapter 4). Concepts from computational systems biol-
ogy inspire the development of the PNM approach, and the design of specific
PNMs (Chapters 5, 6). Concepts from epidemiology, immunology, and evo-
lutionary biology are incorporated into our development of resilient processes
and resilient mechanisms (Chapter 7).
The essential contribution of this thesis is simply to argue over seven chapters for a theory
of topological stability and dynamic resilience in complex networks. To develop this thesis re-
quired us to develop arguments that cut across, and thereby connect, several sub-disciplines
in computer science and biology. We have emphasized interactions as the common basis of
complex networks in different fields. The stability of a complex network originates in the
stability properties of the interactions that underlie it. In biological systems, such interac-
tions originate as processes become coupled. In technological systems, we design some of the
interactions into the system, but others emerge due to the coupling of processes that may not
have been anticipated by the system designer, such as social processes, economic processes,
and as recent international events have shown, political processes. In the construction of
large technological networks, such processes, outside of our explicit design considerations,
may ultimately have strong influence on the stability of the systems we develop. In consid-
ering the effects of such interactions, it is useful to have a theory about the stability of a
network of process based interactions. Our goal in this thesis has been to provide a start to
such a theory.
8.3 Future Directions
Our theory is a starting point, from which implications, like directed edges, may connect in
various directions: theoretical, methodological, and empirical applications. In each of these
areas, I will briefly state some questions that interest me. The list is not exhaustive, but
reflects my mix of computational and biological interests. A social scientist, an engineer, or
an economist might come up with a very different list of next steps. A physicist might ask
similar questions, but pursue very different applications.
8.3.1 Theoretical Next Steps
Having developed a conception of perfect information hiding (Chapter 4) in the context of
balanced stability, one question that immediately interests me is whether perfect information
hiding is possible, or whether it violates some invariant topological property of a
graph. If the latter is true, it could be said that every network leaks information from which
an attacker can learn.
In Chapter 4, we demonstrated that the antagonism of cut-stability and connection-
stability could be related to mutual information under a specific construction of layering
cycle covers. I wonder whether this phenomenon might be much more general, and whether a
probabilistic argument may be developed for cut-stability and connection-stability, by which
the vast majority of edge-addition sequences (increasing cut-stability) result in monotonic
declines in the average mutual information.
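This conjecture can at least be probed numerically. The sketch below is a toy experiment of my own, not a result from the thesis: it grows a random graph one edge at a time and tracks the average mutual information of the joint degree distribution over edge endpoints, one common operationalization of network mutual information (the thesis's cycle-cover construction is not reproduced here).

```python
import random
from collections import Counter
from math import log2

def avg_mutual_information(edges):
    # Joint distribution of (deg(u), deg(v)) over edge endpoints,
    # counted in both directions so the two marginals coincide.
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    joint = Counter()
    for u, v in edges:
        joint[(deg[u], deg[v])] += 1
        joint[(deg[v], deg[u])] += 1
    total = sum(joint.values())
    marg = Counter()
    for (a, b), c in joint.items():
        marg[a] += c
    return sum((c / total) * log2((c / total) / ((marg[a] / total) * (marg[b] / total)))
               for (a, b), c in joint.items())

random.seed(1)
n = 60
nodes = list(range(n))
present = set()
edges = []
trace = []
# Grow a random graph one edge at a time and record how the average
# mutual information responds to increasing edge density.
while len(edges) < 400:
    u, v = random.sample(nodes, 2)
    key = (min(u, v), max(u, v))
    if key in present:
        continue
    present.add(key)
    edges.append((u, v))
    if len(edges) % 100 == 0:
        trace.append(round(avg_mutual_information(edges), 3))
print(trace)  # average MI of the joint degree distribution at m = 100, 200, 300, 400
```

Whether the decline is monotonic for the vast majority of edge-addition sequences is exactly the open question; the sketch only provides a harness for exploring it.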
Towards the end of Chapter 4, it was mentioned that spectral analysis of networks has
been related to estimates of a network’s susceptibility to viral attacks. These results, and
those in Chapter 7, suggest that we may be able to formalize the relationship between the
average mutual information and spectral analysis on a complex network.
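One such spectral result (see, e.g., [Draief 08]) ties epidemic behaviour to the largest eigenvalue of the adjacency matrix: in SIS-type models an infection tends to die out when the effective spreading ratio falls below 1/λ_max. A minimal sketch of the computation via power iteration, on an illustrative random graph of my own choosing rather than one from the thesis:

```python
import random

def largest_adjacency_eigenvalue(adj, iters=200):
    """Estimate the spectral radius of an adjacency list by power iteration."""
    n = len(adj)
    x = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        y = [sum(x[j] for j in adj[i]) for i in range(n)]
        lam = max(abs(v) for v in y)
        if lam == 0:
            return 0.0
        x = [v / lam for v in y]
    return lam

# Build an Erdos-Renyi-style random graph as an adjacency list.
random.seed(7)
n, p = 80, 0.06
adj = [[] for _ in range(n)]
for i in range(n):
    for j in range(i + 1, n):
        if random.random() < p:
            adj[i].append(j)
            adj[j].append(i)

lam = largest_adjacency_eigenvalue(adj)
# In SIS-type models, an infection with effective spreading ratio
# below 1/lambda_max is predicted to die out.
print(f"lambda_max = {lam:.2f}, threshold = {1 / lam:.3f}")
```

Relating this eigenvalue to the average mutual information of the same graph is the formalization proposed above.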
Finally, concepts developed in this thesis originated in my slow recognition over a period
of twenty years that there are multiple conceptualizations of stability. In Chapter 4 we
illustrated the differing perspectives of topological and dynamical approaches to stability
in the ecological literature. The relationship between topological stability and what I have
called Poincaré stability in this thesis needs further exploration. One possible road inward
is the recognition that the community matrix [McCann 12], the basis of dynamical
approaches to the diversity-stability debate in ecology, may be interpreted topologically in
terms of the cut-stability and connection-stability properties of the graph it corresponds to.
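A hypothetical sketch of this interpretation: read the sign pattern of a small community matrix as an undirected interaction graph, then summarize it with two crude topological probes, articulation points (a cut-fragility check) and connectance (a rough spread proxy). Both probes are my own stand-ins, not the thesis's formal cut-stability and connection-stability measures.

```python
from collections import deque

# Sign pattern of a four-species predator-prey ring near equilibrium;
# entry (i, j) is nonzero when species j affects species i.
community = [
    [ 0, -1,  0,  1],
    [ 1,  0, -1,  0],
    [ 0,  1,  0, -1],
    [-1,  0,  1,  0],
]
n = len(community)
adj = {i: set() for i in range(n)}
for i in range(n):
    for j in range(n):
        if community[i][j] != 0:
            adj[i].add(j)
            adj[j].add(i)

def connected_without(removed):
    """BFS over the graph with one vertex deleted."""
    nodes = [v for v in adj if v != removed]
    seen, queue = {nodes[0]}, deque([nodes[0]])
    while queue:
        for w in adj[queue.popleft()]:
            if w != removed and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(nodes)

# Articulation points: single species whose loss fragments the web.
cut_points = [v for v in adj if not connected_without(v)]
m = sum(len(s) for s in adj.values()) // 2
connectance = 2 * m / (n * (n - 1))
print("articulation points:", cut_points)  # [] means no single point of failure
print("connectance:", connectance)
```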
8.3.2 Methodological Next Steps
The majority of this thesis has been concerned with developing a single theory. However,
theory can be a guide to, and inspire new techniques for, data analysis. Chapter 4 introduced
a simple visual technique that can be used to explore how a particular network deviates from
balanced stability. Chapter 7 briefly illustrated how the analysis of simulations can provide
insight into resilient mechanisms. One area I am interested in developing methodologically
is an analysis framework specific to PNMs, at the level both of the model and of specific
simulations. Here, I am inspired by the lovely techniques that come out of dynamical systems
theory [Glass 88, Guastello 95, McCann 12, Sprott 03], and aspire to similarly develop
the theory of topological stability and dynamic resilience into the basis of a data-analytic
framework that can be used to bridge theory and experimental data.
In Chapter 6 I identified several directions for the elaboration of PNMs, including exploring
modal logic to create anticipatory systems within PNMs, and introducing more complex
memory mechanisms inspired by biology, particularly recent work in neurobiology. Another
obvious approach is to combine PNMs with methods from evolutionary computation to
create evolvable PNMs.
Finally, I am interested in the ways in which a combination of topological stability theory
and PNMs can be used to develop design principles for complex networks, and a framework
in which network and process designs can be tested in silico before being released into the
wild.
8.3.3 Empirical Applications
Complex-network concepts have provided a framework for integrating empirical data in several
fields, including computer science, systems biology, ecology, epidemiology, finance, and the
social and political sciences.
The empirical applications I list reflect my specific interests and the colleagues whose work
has inspired me to apply my techniques to their problems.
In computer science, I am interested in using topological stability as a way to elucidate
design principles for complex technological systems, be they the network of code relationships
in a complex piece of software, or the network of infrastructure and process in our most
complex technological network, the Internet.
Again, in computer science, I am fascinated with the ways in which multi-agent systems
(including PNMs) can generalize the notion of dynamical systems, and be used to explore,
via simulation, basic concepts in coordination, problem solving, and self-organization among
heterogeneous agents.
In systems biology, I am particularly interested in the application of topological stability
theory, PNMs, and other concepts from distributed systems theory to multicellularity, in
particular to understand the conditions under which cells can switch fates under the influence
of cell–cell signalling.
Epidemiology and immunology have already contributed to computer science, as referenced
in Chapter 7. The exploration of artificial immune systems and of epidemiological
dynamics within the context of PNMs is a natural extension of the latter stages of my thesis.
The majority of my working life has been focussed on understanding ecosystems at small,
intermediate, and large scales. This thesis began in my efforts to understand the stability I
saw in ecosystems relative to the numerous perturbations they were exposed to.1 No doubt,
the conceptual tools from this thesis will be focussed on further understanding stabilizing
processes in ecosystems under stress, whether it be due to climate change, loss of pollinators,
or human actions that change the structure and functioning of ecosystems.
8.4 On the Origin of Interactions
‘The Architecture of Complexity’ is the title of two essays separated by forty-five years.
Technology pioneer Herbert Simon’s essay of 1962 [Simon 62] was concerned with elucidating
design principles for complex technological systems, such as modularity, near decomposability,
and hierarchy (amongst others), that allow complex systems to be assembled out of
simpler constructions. Albert-László Barabási’s essay of 2007 [Barbasi 07] does not cite its
predecessor, but covers much of the same ground, now from the perspective of complex networks.
If such an essay were written again, a few decades hence, it might very well
cover similar ground once more, but from the perspective of the origin of interactions.
Biologists have several views on the origin of interactions. The evolutionary theorist
and systematist D.R. Brooks has pointed out that the explanation of many current ecologi-
cal interactions originates in the interactions amongst the ancestors of contemporary species
1 In particular, I can still visualize the small sphagnum patches that were on the study site for my M.Sc. in the Sooke Mountains. These patches created miniature sphagnum bogs in a larger subalpine forest ecosystem. Such miniature ecosystems were dependent both on the conditions of the larger ecosystem they were contained in, and on vagaries within that larger ecosystem, such as a fallen stump creating a pool in which some sphagnum moss was initially established, which then created the conditions for other bog plants. It impressed me then, and now, that while perturbations can disrupt ecosystems, they can also be the basis for the existence, and scale, of ecosystems.
[Brooks 02]. The biophysicist Koichiro Matsuno has noted that synchronized behaviour in
biology, such as the motion of a muscle fiber, originates in asynchronous stochastic phenomena
that suddenly become coordinated. He looks to the mechanisms of such initial
coordination [Matsuno 97, Matsuno 98], originating in the interplay of quantum and thermodynamic
constraints [Matsuno 99, Matsuno 01]. Several leading theoretical biologists have implicated
autocatalytic cycles in the origin of biological interactions. This idea is emphasized in the work of
Stuart Kauffman [Kauffman 93] and R. E. Ulanowicz [Ulanowicz 97, Ulanowicz 09b], as well as in
the evolutionary synthesis of John Maynard Smith and Eörs Szathmáry [Maynard Smith 99].
All these perspectives centre on the origin of biological processes.
Ulanowicz defines a process operationally as ([Ulanowicz 09b], p. 29):
‘A process is the interaction of random events upon a configuration of constraints that
results in a nonrandom but indeterminate outcome.’
All these approaches to the origin of interactions speak to that nascent moment when
random collisions become meaningful interactions, becoming increasingly constrained and
synchronized into a process. Processes, once sufficiently coherent, could interact with each other,
leading to a combinatorial hierarchy of interactions. A theory of the origin of such
interactions is necessary to understand the origin of biological processes. The logical switches in
gene regulatory systems became so only at the end of such a history of constraint and
synchronization. Put simply: how, out of many possible interactions, did a smaller set of regular,
relatively cut-stable and connection-stable interaction networks emerge and evolve to create
the processes that structure ecosystems, gene regulatory networks, and immune systems?
Hopefully, the conceptual tools developed in this thesis can be further developed to address
this question of the origin of interactions, the nascent moment.
Bibliography
[Abelson 00] H. Abelson & N. Forbes. Amorphous Computing. Complexity, vol. 5, no. 3, pages 22–25, 2000.
[Abraham 06] I. Abraham, D. Dolev, R. Gonen & J. Halpern. Distributed Computing Meets Game Theory: Robust Mechanisms for Rational Secret Sharing and Multiparty Computation. In PODC’06, 2006.
[Abraham 08] I. Abraham, D. Dolev & J. Y. Halpern. An almost-surely terminating polynomial protocol for asynchronous Byzantine agreement with optimal resilience. PODC’08, vol. 8, pages 405–414, 2008.
[Adleman 98] L. M. Adleman. Computing with DNA. Scientific American, August 1998, pages 54–61.
[Albert 00] R. Albert, H. Jeong & A.-L. Barabasi. Attack and Error Tolerance of Complex Networks. Nature, vol. 406, pages 378–382, 2000.
[Albert 02] R. Albert & A-L. Barabasi. Statistical Mechanics of Complex Networks. Reviews of Modern Physics, vol. 74, pages 47–97, 2002.
[Allesina 05] S. Allesina, A. Bodini & C. Bondavalli. Ecological Subsystems Via Graph Theory: The Role of Strong Components. Oikos, vol. 110, pages 164–176, 2005.
[Allesina 08] S. Allesina, D. Alonso & M. Pascual. A General Model for Food Web Structure. Science, vol. 320, pages 658–661, 2008.
[Allesina 09] S. Allesina & M. Pascual. Food Web Models: A Plea for Groups. Ecology Letters, vol. 12, pages 652–662, 2009.
[Allesina 12] S. Allesina & S. Tang. Stability Criteria for Complex Systems. Nature, vol. 483, pages 205–208, 2012.
[Alon 06] U. Alon. An introduction to systems biology. design principles of biological circuits. Chapman and Hall/CRC, 2006.
[Alon 07] U. Alon. Network Motifs: Theory and Experimental Approaches. Nat. Rev. Genet., vol. 8, pages 450–461, 2007.
[Anand 09] K. Anand & G. Bianconi. Entropy Measures for Networks: Towards an Information Theory of Complex Topologies. Physical Review E, vol. 80, page 045102(R), 2009.
[Aoki 01] I. Aoki. Biomass diversity and stability of food webs in aquatic systems. Ecological Research, vol. 16, pages 65–71, 2001.
[Arnold 92] V. I. Arnold. Catastrophe theory. third edition. Springer-Verlag, 1992.
[Arora 09] S. Arora & B. Barak. Computational complexity. a modern approach. Cambridge University Press, 2009.
[Arthur 09] B. Arthur. The nature of technology. what it is and how it evolves. Free Press, 2009.
[Attiya 04] H. Attiya & J. Welch. Distributed computing. fundamentals, simulations, and advanced topics. second edition. Wiley, Interscience, 2004.
[Ball 04] P. Ball. Critical mass. how one thing leads to another. Farrar, Straus and Giroux, 2004.
[Bambrough 63] R. Bambrough. The philosophy of aristotle. New American Library, 1963.
[Barabasi 99] A-L. Barabasi & R. Albert. Emergence of Scaling In Random Networks. Science, vol. 286, pages 509–512, 1999.
[Barbasi 07] A-L. Barabasi. The Architecture of Complexity. From Network Structure to Human Dynamics. IEEE Control Systems Magazine, August 2007.
[Bascompte 09] J. Bascompte & D. B. Stouffer. The Assembly and Disassembly of Ecological Networks. Phil. Trans. R. Soc. B, vol. 364, pages 1781–1787, 2009.
[Battini 07] D. Battini, A. Persona & S. Allesina. Towards a Use of Network Analysis: Quantifying the Complexity of Supply Chain Networks. Int. J. Electronic Customer Relationship Management, vol. 1, no. 1, pages 75–90, 2007.
[Batty 05] M. Batty. Cities and complexity. The MIT Press, 2005.
[Begon 81] M. Begon & M. Mortimer. Population ecology. a unified study of animals and plants. Blackwell Scientific Publications Ltd, 1981.
[Bell 87] J. S. Bell. Speakable and unspeakable in quantum mechanics. Cambridge University Press, 1987.
[Bersier 02] L-F. Bersier, C. Banasek-Richter & M-F. Cattin. Quantitative Descriptors of Food-Web Matrices. Ecology, vol. 83, no. 9, pages 2394–2407, 2002.
[Bezzi 11] P. Bezzi & A. Volterra. Astrocytes: Powering Memory. Cell, vol. 144, pages 644–645, 2011.
[Bianconi 07] G. Bianconi. A Statistical Mechanics Approach for Scale-Free Networks and Finite-Scale Networks. Chaos, vol. 17, page 026114, 2007.
[Bianconi 09a] G. Bianconi. Entropy of Network Ensembles. Physical Review E, vol. 79, page 036114, 2009.
[Bianconi 09b] G. Bianconi, P. Pin & M. Marsili. Assessing the Relevance of Node Features for Network Structure. PNAS, vol. 106, no. 28, pages 11433–11438, 2009.
[Bodini 02] A. Bodini & C. Bondavalli. Towards a Sustainable Use of Water Resources: A Whole-ecosystem Approach Using Network Analysis. Int. J. Environment and Pollution, vol. 18, no. 5, pages 463–485, 2002.
[Bollobas 98] B. Bollobas. Modern graph theory. Springer, 1998.
[Bonabeau 99] E. Bonabeau, M. Dorigo & G. Theraulaz. Swarm intelligence. from natural to artificial systems. Oxford University Press, 1999.
[Bondavalli 99] C. Bondavalli & R. E. Ulanowicz. Unexpected Effects of Predators Upon Their Prey: The Case of the American Alligator. Ecosystems, vol. 2, pages 49–63, 1999.
[Bondy 08] J. A. Bondy & U. S. R. Murty. Graph theory. Springer, 2008.
[Borrett 07] S. R. Borrett, B. D. Fath & B. C. Patten. Functional Integration of Ecological Networks Through Pathway Proliferation. J. Theor. Biol., vol. 245, pages 98–111, 2007.
[Borrett 10] S. R. Borrett, J. Whipple & B. C. Patten. Rapid Development of Indirect Effects In Ecological Networks. Oikos, vol. 119, pages 1136–1148, 2010.
[Branke 06] J. Branke, M. Mnif, C. Miller-Schloer, H. Prothmann, U. Richter, F. Rochner & H. Schmeck. Organic Computing – Addressing Complexity by Controlled Self-Organization. In Proceedings of ISoLA 2006, pages 200–206, 2006.
[Bray 09] D. Bray. Wetware. a computer in every living cell. Yale University Press, 2009.
[Brooks 02] D. R. Brooks & D. A. McLennan. The nature of diversity. an evolutionary voyage of discovery. The University of Chicago Press, 2002.
[Brown 89] J. H. Brown & B. A. Maurer. Macroecology: The Division of Food and Space Among Species on Continents. Science, vol. 243, pages 1145–1150, 1989.
[Buldyrev 10] S. V. Buldyrev, R. Parshani, G. Paul, H. E. Stanley & S. Havlin. Catastrophic Cascade of Failures in Interdependent Networks. Nature, vol. 464, pages 1025–1028, 2010.
[Callard 05] R. E. Callard & A. J. Yates. Immunology and Mathematics: Crossing the Divide. Immunology, vol. 115, pages 21–33, 2005.
[Calloway 00] D. S. Callaway, M. E. J. Newman, S. H. Strogatz & D. J. Watts. Network Robustness and Fragility: Percolation on Random Graphs. Phys. Rev. Lett., vol. 85, pages 5468–5471, 2000.
[Calude 01] C. S. Calude & G. Paun. Computing with cells and atoms. an introduction to quantum, dna and membrane computing. Taylor and Francis Inc., 2001.
[Caswell 01] H. Caswell. Matrix population models. construction, analysis, and interpretation. second edn. Sinauer Associates Inc Publishers, 2001.
[Censor-Hillel 11] K. Censor-Hillel & H. Shachnai. Fast Information Spreading in Graphs with Large Weak Conductance. In SODA 2011, pages 440–448, 2011.
[Chaitin 66] G. J. Chaitin. On the Length of Programs for Computing Finite Binary Sequences. Journal of the ACM, vol. 13, pages 547–569, 1966.
[Chaitin 99] G. J. Chaitin. The unknowable. Springer, 1999.
[Chambers 83] J. M. Chambers, W. S. Cleveland, B. Kleiner & P. A. Tukey. Graphical methods for data analysis. Wadsworth International Group and Duxbury Press, 1983.
[Chang 08] H. Chang, M. Hemberg, M. Barahona, D. Ingber & S. Huang. Transcriptome-wide noise controls lineage choice in mammalian progenitor cells. Nature, vol. 453, pages 544–547, 2008.
[Chartrand 77] G. Chartrand. Introductory graph theory. Dover Publications Inc., 1977.
[Cheetancheeri 06] S. G. Cheetancheeri, J. M. Agosta, D. H. Dash, K. N. Levitt, J. Rowe & E. M. Schooler. A distributed host-based worm detection system. SIGCOMM 06 Workshops, Sept 11–15, Pisa, Italy, 2006.
[Chierichetti 09] F. Chierichetti, S. Lattanzi & A. Panconesi. Rumour Spreading and Graph Conductance. In SODA 2010, pages 773–781, 2009.
[Chierichetti 10] F. Chierichetti, S. Lattanzi & A. Panconesi. Almost Tight Bounds for Rumour Spreading with Conductance. In STOC 2010, pages 399–408, 2010.
[Chung 06] F. Chung & L. Lu. Complex graphs and networks. American Mathematical Society, 2006.
[Chung 09] F. Chung. Graph theory in the information age. Noether Lecture at the AMS-MAA-SIAM Annual Meeting, Jan. 2009.
[Cleveland 85] W. S. Cleveland. The elements of graphing data. Wadsworth Advanced Books and Software, 1985.
[Cohen 00a] I. R. Cohen. Tending adam’s garden. evolving the cognitive immune self. Academic Press, 2000.
[Cohen 00b] R. Cohen, K. Erez, D. Ben-Avraham & S. Havlin. Resilience of the Internet to Random Breakdowns. Phys. Rev. Lett., vol. 85, pages 4626–4628, 2000.
[Cohen 01] R. Cohen, K. Erez, D. ben Avraham & S. Havlin. Breakdown of the Internet under Intentional Attack. Phys. Rev. Lett., vol. 86, pages 3682–3685, 2001.
[Cohen 03] R. Cohen, S. Havlin & D. ben Avraham. Efficient Immunization Strategies for Computer Networks and Populations. Phys. Rev. Lett., vol. 91, no. 24, page 247901, 2003.
[Collier 99] J. D. Collier & C. A. Hooker. Complexly Organised Dynamical Systems. Open Systems and Information Dynamics, vol. 6, pages 241–302, 1999.
[Collier 03] J. Collier. Hierarchical Dynamical Information Systems with a Focus on Biology. Entropy, vol. 5, pages 100–124, 2003.
[Collier 04] J. Collier. Self-Organization, Individuation and Identity. Revue Internationale de Philosophie, vol. 59, pages 151–172, 2004.
[Collier 07] J. Collier. A Dynamical Approach to Identity and Diversity. In P. Cilliers & K. Richardson, editors, Explorations in Complexity Thinking. Pre-Proceedings of the 3rd International Workshop on Complexity and Philosophy. Isce Publishing, 2007.
[Collier 08] J. Collier. A Dynamical Account of Emergence. Cybernetics and Human Knowing, vol. 15, no. 3-4, pages 75–100, 2008.
[Conrad 72] M. Conrad. Information Processing in Molecular Systems. Currents in Modern Biology, vol. 5, pages 1–14, 1972.
[Conrad 79] M. Conrad. Mutation-Absorption Model of the Enzyme. Bulletin of Mathematical Biology, vol. 41, pages 387–405, 1979.
[Conrad 81] M. Conrad & A. Rosenthal. Limits on the Computing Power of Biological Systems. Bulletin of Mathematical Biology, vol. 43, pages 59–67, 1981.
[Conrad 85] M. Conrad. On Design Principles for a Molecular Computer. Communications of the ACM, vol. 28, no. 5, pages 464–480, 1985.
[Conrad 90] M. Conrad. The Geometry of Evolution. Biosystems, vol. 24, pages 61–81, 1990.
[Costa 05] M. Costa, J. Cowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang & P. Barham. Vigilante: End-to-End Containment of Internet Worms. SOSP 05, pages 23–26, October 2005.
[Crovella 06] M. Crovella & B. Krishnamurthy. Internet measurement. infrastructure, traffic and applications. Wiley, John and Sons, 2006.
[Crucitti 04] P. Crucitti, V. Latora, M. Marchiori & A. Rapisarda. Error and Attack Tolerance of Complex Networks. Physica A, vol. 340, pages 388–394, 2004.
[Daley 99] D. J. Daley & J. Gani. Epidemic modelling. an introduction. Cambridge University Press, 1999.
[Danon 11] L. Danon, A. P. Ford, T. House, C. P. Jewell, M. J. Keeling, G. O. Roberts, J. V. Ross & M. C. Vernon. Networks and the Epidemiology of Infectious Disease. Interdisciplinary Perspectives on Infectious Diseases, vol. 2011, page 284909, 2011.
[Davidson 08] E. H. Davidson & M. S. Levine. Properties of Developmental Gene Regulatory Networks. PNAS, vol. 105, no. 51, pages 20063–20066, 2008.
[De Angelis 75] D. De Angelis. Stability and Connectance in Food Web Models. Ecology, vol. 56, pages 238–243, 1975.
[De Jong 06] K. A. De Jong. Evolutionary computation. a unified approach. The MIT Press, 2006.
[de Leon 07] S. B-T. de Leon & E. H. Davidson. Gene Regulation: Gene Control Network in Development. Annu. Rev. Biophys. Biomol. Struct., vol. 2007, pages 191–212, 2007.
[Deacon 06] T. W. Deacon. Reciprocal Linkage between Self-organizing Processes is Sufficient for Self-reproduction and Evolvability. Biological Theory, vol. 1, no. 2, pages 136–149, 2006.
[Dehmer 11] M. Dehmer & A. Mowshowitz. A History of Graph Entropy Measures. Information Sciences, vol. 181, pages 57–78, 2011.
[Dekker 04] A. H. Dekker & B. D. Colbert. Network Robustness and Graph Topology. In ASC2004, pages 359–368, 2004.
[Dell 05] A. I. Dell, G. D. Kokkoris, C. Banasek-Richter, L-F. Bersier, J. A. Dunne, M. Kondoh, T. N. Romanuk & N. D. Martinez. How Do Complex Food Webs Persist In Nature? In P. C. de Ruiter, V. Wolters & J. C. Moore, editors, Dynamic Food Webs: Multispecies Assemblages, Ecosystem Development and Environmental Change, pages 425–436. Academic Press, 2005.
[Denzinger 04] J. Denzinger & J. Hamdan. Improving Modeling of Other Agents Using Tentative Stereotypes and Compactification of Observations. In Proc. IAT 2004, pages 106–112, 2004.
[Dhamdere 08] A. Dhamdere & C. Dovrolis. Ten Years in the Evolution of the Internet Ecosystem. IMC’08, 2008.
[Dhamdere 10] A. Dhamdere & C. Dovrolis. The Internet is Flat: Modeling the Transition from a Transit Hierarchy to a Peering Mesh. ACM CoNEXT, vol. 2010, 2010.
[Diaconis 85] P. Diaconis. Theories of Data Analysis: From Magical Thinking Through Classical Statistics. In D. C. Hoaglin, F. Mosteller & J. W. Tukey, editors, Exploring Data Tables, Trends and Shapes. Wiley-Interscience, 1985.
[Dinur 05] I. Dinur & S. Safra. On the Hardness of Approximating Minimum Vertex Cover. Annals of Mathematics, vol. 162, pages 439–485, 2005.
[Dorigo 04] M. Dorigo & T. Stutzle. Ant colony optimization. The MIT Press, 2004.
[Dorogovtsev 03] S. N. Dorogovtsev & J. F. F. Mendes. Evolution of networks. from biological nets to the internet and www. Oxford University Press, 2003.
[Dovrolis 08a] C. Dovrolis. What Would Darwin Think About Clean-slate Architectures. ACM SIGCOMM Computer Communication Review, vol. 38, no. 1, pages 29–34, 2008.
[Dovrolis 08b] C. Dovrolis & J. T. Streelman. Evolvable Network Architectures: What Can We Learn From Biology. ACM SIGCOMM Computer Communication Review, vol. 40, no. 2, pages 72–77, 2008.
[Doyle 05] J. C. Doyle, D. L. Alderson, L. Li, M. Roughan, S. Shalunov, R. Tanaka & W. Willinger. The ‘Robust Yet Fragile’ Nature of the Internet. PNAS, vol. 102, no. 41, pages 14497–14502, 2005.
[Doyle 07] J. Doyle & M. Csete. Rules of Engagement. Nature, vol. 446, page 860, 2007.
[Draief 08] M. Draief, A. Ganesh & L. Massoulie. Thresholds for Virus Spread On Networks. The Annals of Applied Probability, vol. 18, no. 2, pages 359–378, 2008.
[Dray 90] D. Dray. Intracellular Signalling as a Parallel Distributed Process. J. Theor. Biol., vol. 143, pages 215–231, 1990.
[Duchon 06a] P. Duchon, N. Hanusse, E. Lebhar & N. Schabanel. Could Any Graph Be Turned Into a Small-World. Theoretical Computer Science, vol. 355, pages 96–103, 2006.
[Duchon 06b] P. Duchon, N. Hanusse, E. Lebhar & N. Schabanel. Towards Small World Emergence. In SPAA ’06, pages 225–232, 2006.
[Dunne 02a] J. A. Dunne, R. J. Williams & N. D. Martinez. Food-web Structure and Network Theory: The Role of Connectance and Size. PNAS, vol. 99, no. 20, pages 12917–12922, 2002.
[Dunne 02b] J. A. Dunne, R. J. Williams & N. D. Martinez. Network Structure and Biodiversity Loss in Food Webs: Robustness Increases with Connectance. Ecology Letters, vol. 5, pages 558–567, 2002.
[Dunne 04] J. A. Dunne, R. J. Williams & N. D. Martinez. Network Structure and Robustness of Marine Food Webs. Marine Ecology Progress Series, vol. 273, pages 291–302, 2004.
[Dunne 05] J. A. Dunne, U. Brose, R. J. Williams & N. D. Martinez. Modelling Food-Web Dynamics: Complexity-Stability Implications. In A. Belgrano, U. Scharler, J. A. Dunne & R. E. Ulanowicz, editors, Aquatic Food Webs: An Ecosystem Approach, pages 117–129. Oxford University Press, 2005.
[Dunne 06] J. A. Dunne. The Network Structure of Food Webs. In M. Pascual & J. A. Dunne, editors, Ecological Networks: Linking Structure to Dynamics in Food Webs, pages 27–86. Oxford University Press, 2006.
[Dunne 09] J. A. Dunne & R. J. Williams. Cascading Extinctions and Community Collapse in Model Food Webs. Phil. Trans. R. Soc. B, vol. 364, pages 1711–1723, 2009.
[Easley 10] D. Easley & J. Kleinberg. Networks, crowds and markets. reasoning about a highly connected world. Cambridge University Press, 2010.
[Elton 58] C. S. Elton. Ecology of invasions by animals and plants. Chapman and Hall, 1958.
[Epstein 96] J. M. Epstein & R. Axtell. Growing artificial societies. social science from the bottom up. The MIT Press, 1996.
[Epstein 06] J. M. Epstein. Generative social science. studies in agent-based computational modeling. Princeton University Press, 2006.
[Erdos 60] P. Erdos & A. Renyi. The Evolution of Random Graphs. Publications of the Mathematical Institute of the Hungarian Academy of Sciences, vol. 5, pages 17–61, 1960.
[Erwin 09] D. H. Erwin & E. H. Davidson. The Evolution of Hierarchical Gene Regulatory Networks. Nat. Rev. Genet., vol. 10, pages 141–148, 2009.
[Eveleigh 07] E. S. Eveleigh, K. S. McCann, P. C. McCarthy, S. J. Pollock, C. J. Lucarotti, B. Morin, G. A. McDougall, D. B. Strongman, J. T. Huber, J. Umbanhowar & L. D. B. Faria. Fluctuations In Density of an Outbreak Species Drive Diversity Cascades in Food Webs. PNAS, vol. 104, no. 43, pages 16976–16981, 2007.
[Fagan 97] W. F. Fagan. Omnivory as a Stabilizing Feature of Natural Communities. American Naturalist, vol. 150, pages 554–567, 1997.
[Fath 98] B. D. Fath & B. C. Patten. Network Synergism: Emergence Of Positive Relations In Ecological Systems. Ecological Modelling, vol. 107, pages 127–143, 1998.
[Fath 99] B. D. Fath & B. C. Patten. Review of the Foundations of Network Environ Analysis. Ecosystems, vol. 2, pages 167–179, 1999.
[Fath 04] B. D. Fath. Ecological Network Analysis Applied to Large-scale Cyber-ecosystems. Ecological Modelling, vol. 171, pages 329–337, 2004.
[Fath 06] B. D. Fath & W. E. Grant. Ecosystems as Evolutionary Complex Systems: Network Analysis of Fitness Models. Environmental Modelling and Software, vol. 22, no. 5, pages 693–700, 2006.
[Fath 07a] B. D. Fath & G. Halnes. Cyclic Energy Pathways in Ecological Food Webs. Ecological Modelling, vol. 208, pages 17–24, 2007.
[Fath 07b] B. D. Fath, U. M. Scharler, R. E. Ulanowicz & B. Hannon. Ecological Network Analysis: Network Construction. Ecological Modelling, vol. 208, pages 49–55, 2007.
[Felix 08] M-A. Felix & A. Wagner. Robustness and Evolution: Concepts, Insights and Challenges from a Developmental Model System. Heredity, vol. 100, pages 132–140, 2008.
[Feller 66] W. Feller. An introduction to probability theory and its applications. volume 1. second edition. Wiley, John and Sons Inc, 1966.
[Feynman 95] R. P. Feynman. Six easy pieces. essentials of physics explained by its most brilliant teacher. Addison Wesley, 1995.
[Finn 76] J. T. Finn. Measures of Ecosystem Structure and Function Derived From Analysis of Flows. J. Theor. Biol., vol. 56, pages 363–380, 1976.
[Fisher 99] M. J. Fisher, R. C. Paton & K. Matsuno. Intracellular Signalling Proteins as ‘Smart’ Agents in Parallel Distributed Processes. Biosystems, vol. 50, pages 159–171, 1999.
[Forbes 04] N. Forbes. Imitation of life. how biology is inspiring computing. The MIT Press, 2004.
[Forrest 97a] S. Forrest, S. A. Hofmeyr & A. Somayaji. Computer Immunology. Communications of the ACM, vol. 40, no. 10, pages 88–96, 1997.
[Forrest 97b] S. Forrest, A. Somayaji & D. Ackley. Building Diverse Computer Systems. In Proceedings of the Fourth Workshop on Hot Topics in Operating Systems, pages 67–72, 1997.
[Forrest 05] S. Forrest, J. Balthrop, M. Glickman & D. Ackley. Computation in the Wild. In K. Park & W. Willinger, editors, The Internet as a Large Scale Complex System. Oxford University Press, 2005.
[Forrest 07] S. Forrest & C. Beauchemin. Computer Immunology. Immunological Reviews, vol. 216, pages 176–197, 2007.
[Fox 02] J. W. Fox. Testing a Simple Rule for Dominance in Resource Competition. The American Naturalist, vol. 159, no. 3, pages 305–319, 2002.
[Fraigniaud 09] P. Fraigniaud & G. Giakkoupis. The Effect of Power-Law Degrees on the Navigability of Small Worlds. In PODC’09, pages 240–249, 2009.
[Fraigniaud 10] P. Fraigniaud & G. Giakkoupis. On the Searchability of Small-World Networks with Arbitrary Underlying Structure. In STOC 2010, 2010.
[Gallos 05] L. K. Gallos, R. Cohen, P. Argyrakis, A. Bunde & S. Havlin. Stability and Topology of Scale-Free Networks under Attack and Defense Strategies. Phys. Rev. Lett., vol. 94, page 188701, 2005.
[Garey 79] M. R. Garey & D. S. Johnson. Computers and intractability. a guide to the theory of np-completeness. W. H. Freeman and Company, 1979.
[Giakkoupis 11] G. Giakkoupis. Tight Bounds for Rumor Spreading in Graphs of a Given Conductance. In STACS 2011, pages 57–68, 2011.
[Gilbert 08] N. Gilbert. Agent-based models. SAGE Publications, 2008.
[Gill 08] P. Gill, M. Arlitt, Z. Li & A. Mahanti. The Flattening Internet Topology: Natural Evolution, Unsightly Barnacles or Contrived Collapse. In PAM 2008, Cleveland, 2008.
[Glass 73] L. Glass & S. A. Kauffman. The Logical Analysis of Continuous, Nonlinear, Biochemical Control Networks. J. Theor. Biol., pages 103–129, 1973.
[Glass 88] L. Glass & M. C. Mackey. From clocks to chaos. the rhythms of life. Princeton University Press, 1988.
[Gleick 87] J. Gleick. Chaos: Making a new science. Viking, 1987.
[Goel 04] S. Goel & S. F. Bush. Biological Models of Security for Virus Propagation in Computer Networks. ;Login, vol. 29, no. 6, pages 49–56, 2004.
[Goerner 09] S. J. Goerner, B. Lietaer & R. E. Ulanowicz. Quantifying Economic Sustainability: Implications for Free-Enterprise Theory. Ecol. Econ., vol. 69, pages 76–81, 2009.
[Gould 79] S. J. Gould & R. C. Lewontin. The Spandrels of San Marco and the Panglossian Paradigm: A Critique of the Adaptionist Programme. Proc. R. Soc. Lond. B, vol. 205, pages 581–598, 1979.
[Gowers 08] T. Gowers, editor. The princeton companion to mathematics. Princeton University Press, 2008.
[Grinstead 97] C. M. Grinstead & J. L. Snell. Introduction to probability. second revised edition. American Mathematical Society, 1997.
[Guastello 95] S. J. Guastello. Chaos, catastrophe and human affairs. applications of nonlinear dynamics in work, organizations and social evolution. Lawrence Erlbaum Associates Inc., Publishers, 1995.
[Hacking 01] I. Hacking. An introduction to probability and inductive logic. Cambridge University Press, 2001.
[Halpern 03] J. Y. Halpern. Reasoning about uncertainty. The MIT Press, 2003.
[Halpern 08] J. Y. Halpern. Beyond Nash Equilibrium: Solution Concepts for the 21st Century. In PODC’08, pages 1–10, 2008.
[Halter 07] R. Halter. Wild weather: The truth behind global warming. second edition. Altitude Publishing, 2007.
[Halter 11a] R. Halter. The incomparable honeybee and the economics of pollination. re-vised and updated. Rocky Mountain Books, 2011.
[Halter 11b] R. Halter. The insatiable bark beetle. Rocky Mountain Books, 2011.
[Harary 69] F. Harary. Graph theory. Perseus Books Publishing L.L.C, 1969.
[Hastings 84] H. M. Hastings. Stability of Large Systems. Biosystems, vol. 17, pages 171–177, 1984.
[Hawkins 04] J. Hawkins & S. Blakeslee. On intelligence. Henry Holt and Company, 2004.
[Hempel 66] C. G. Hempel. Philosophy of natural science. Prentice-Hall, 1966.
[Henneberger 10] C. Henneberger, T. Papouin, S. H. R. Oliet & D. Rusakov. Long-term Potentiation Depends on Release of D-serine from Astrocytes. Nature, vol. 463, pages 232–237, 2010.
[Henzinger 00] M. Henzinger, S. Rao & H. N. Gabow. Computing Vertex Connectivity: New Bounds from Old Techniques. Journal of Algorithms, vol. 34, pages 222–250, 2000.
[Hethcote 00] H. W. Hethcote. The Mathematics of Infectious Diseases. SIAM Review, vol. 42, no. 4, pages 599–653, 2000.
[Higashi 86] M. Higashi & B. C. Patten. Further Aspects of the Analysis of Indirect Effects in Ecosystems. Ecol. Modell., vol. 31, pages 69–77, 1986.
[Higashi 89] M. Higashi & B. C. Patten. Dominance and Indirect Causality in Ecosystems. Am. Nat., vol. 133, pages 288–302, 1989.
[Hoaglin 83] D. C. Hoaglin, F. Mosteller & J. W. Tukey, editeurs. Understanding robust and exploratory data analysis. John Wiley and Sons, 1983.
[Hoaglin 85] D. C. Hoaglin, F. Mosteller & J. W. Tukey, editeurs. Exploring data tables, trends, and shapes. John Wiley and Sons, 1985.
[Holling 73] C. S. Holling. Resilience and Stability of Ecological Systems. Annu. Rev. Ecol. Syst., vol. 4, pages 1–23, 1973.
[Hordijk 04] W. Hordijk & M. Steel. Detecting Autocatalytic, Self-sustaining Sets In Chemical Reaction Systems. J. Theor. Biol., vol. 227, pages 451–461, 2004.
[Hordijk 10] W. Hordijk, J. Hein & M. Steel. Autocatalytic Sets and the Origin of Life. Entropy, vol. 12, pages 1733–1742, 2010.
[Huang 04] S. Huang. Back to the Biology in Systems Biology: What Can We Learn from Biomolecular Networks? Briefings in Functional Genomics and Proteomics, vol. 2, no. 4, pages 279–297, 2004.
[Huang 09a] S. Huang. Non-genetic Heterogeneity of Cells in Development: More Than Just Noise. Development, vol. 136, no. 23, pages 3853–3862, 2009.
[Huang 09b] S. Huang. Reprogramming Cell Fates: Reconciling Rarity with Robustness. Bioessays, vol. 31, pages 546–560, 2009.
[Huang 09c] S. Huang, I. Ernberg & S. Kauffman. Cancer Attractors: A Systems View of Tumors from a Gene Network Dynamics and Developmental Perspective. Semin. Cell Dev. Biol., vol. 20, pages 869–876, 2009.
[Huang 10] S. Huang. Cell Lineage Determination in State Space: A Systems View Brings Flexibility to Dogmatic Canonical Rules. PLoS, vol. 8, no. 5, 2010.
[Huang 11] S. Huang. Systems Biology of Stem Cells: Three Useful Perspectives To Help Overcome the Paradigm of Linear Pathways. Phil. Trans. R. Soc. B., vol. 366, pages 2246–2259, 2011.
[Huang 12] S. Huang. The Molecular and Mathematical Basis of Waddington’s Epigenetic Landscape: A Framework for Post-Darwinian Biology? Bioessays, vol. 34, no. 2, pages 149–157, 2012.
[Huberman 01] B. A. Huberman. The laws of the web. patterns in the ecology of information. The MIT Press, 2001.
[Hudson 10] J. Hudson, J. Denzinger, H. Kasinger & B. Bauer. Efficiency Testing of Self-adaption Systems by Learning Event Sequences. In Proc. Adaptive-10, pages200–205, 2010.
[Hutchinson 59] G. E. Hutchinson. Homage to Santa Rosalia or Why Are There So Many Kinds of Animals? Am. Nat., vol. 93, pages 145–159, 1959.
[Istrail 05] S. Istrail & E. H. Davidson. Logic Functions of the Genomic Cis-regulatory Code. PNAS, vol. 102, no. 14, pages 4944–4959, 2005.
[Istrail 07] S. Istrail, S. B-T. de Leon & E. H. Davidson. The Regulatory Genome and the Computer. Developmental Biology, vol. 310, pages 187–195, 2007.
[Jacob 66] F. Jacob. Genetics of the Bacterial Cell. Science, vol. 152, no. 3278, pages 1470–1478, 1966.
[Jain 02] S. Jain & S. Krishna. Large Extinctions in an Evolutionary Model: The role of innovation in keystone species. PNAS, vol. 99, no. 4, pages 2055–2060, 2002.
[Jamniczky 10] H. A. Jamniczky, J. Boughner, C. Rolian, P. N. Gonzalez, C. D. Powell, E. J. Schmidt, T. E. Parsons, F. L. Bookstein & B. Hallgrímsson. Rediscovering Waddington in the Post-genomic Age. Bioessays, vol. 32, pages 1–6, 2010.
[Jordan 99] F. Jordan, A. Takacs-Santa & I. Molnar. A Reliability Theoretical Quest for Keystones. Oikos, vol. 86, no. 3, pages 453–462, 1999.
[Jordan 01] F. Jordan. Strong Threads and Weak Chains? - A Graph Theoretical Estimation of the Power of Indirect Effects. Community Ecology, vol. 2, no. 1, pages 17–20, 2001.
[Jordan 05] F. Jordan, W-C Liu & T. Wyatt. Topological Constraints on the Dynamics of Wasp-Waist Ecosystems. Journal of Marine Systems, vol. 57, pages 250–263, 2005.
[Jordan 09] F. Jordan. Keystone Species and Food Webs. Phil. Trans. R. Soc. B, vol. 364, pages 1733–1741, 2009.
[Junker 09] B. H. Junker & F. Schreiber, editeurs. Analysis of biological networks. John Wiley and Sons, 2009.
[Kahlem 06] P. Kahlem & E. Birney. Dry Work in a Wet World: Computation in Systems Biology. Molecular Systems Biology, vol. 2, page 40, 2006.
[Kaiser-Bunbury 10] C. N. Kaiser-Bunbury, S. Muff, J. Memmott, C. B. Müller & A. Caflisch. The Robustness of Pollination Networks to the Loss of Species and Interactions: A Quantitative Approach Incorporating Pollinator Behaviour. Ecology Letters, vol. 13, no. 4, pages 442–452, 2010.
[Kasinger 06] H. Kasinger & B. Bauer. Pollination - A Biologically Inspired Paradigm for Self-Managing Systems. Journal of Systems Science and Applications, vol. 3, no. 2, pages 147–156, 2006.
[Kasinger 08a] H. Kasinger, B. Bauer & J. Denzinger. The Meaning of Semiochemicals to the Design of Self-Organizing Systems. In Proceedings of SASO, pages 139–148, 2008.
[Kasinger 08b] H. Kasinger, J. Denzinger & B. Bauer. Digital Semiochemical Coordination. Communications of SIWN, vol. 4, pages 133–139, 2008.
[Kasinger 09a] H. Kasinger & J. Denzinger. Design Pattern for Self-Organizing Emergent Systems Based on Digital Infochemicals. In Proc. EASe 2009, pages 45–55, 2009.
[Kasinger 09b] H. Kasinger, J. Denzinger & B. Bauer. Decentralized Coordination of Homogenous and Heterogenous Agents by Digital Infochemicals. In SAC’09, pages 1223–1224, 2009.
[Kasinger 10] H. Kasinger, B. Bauer, J. Denzinger & T. Holvoet. Adapting Environment-Mediated Self-Organizing Emergent Systems by Exception Rules. In Proc. SOAR 2010. SOAR, 2010.
[Kauffman 69a] S. Kauffman. Homeostasis and Differentiation in Random Genetic Control Networks. Nature, vol. 224, pages 177–178, 1969.
[Kauffman 69b] S. A. Kauffman. Metabolic Stability and Epigenesis in Randomly Constructed Genetic Nets. J. Theor. Biol., vol. 22, pages 437–467, 1969.
[Kauffman 74] S. Kauffman. The Large Scale Structure and Dynamics of Gene Control Circuits: An Ensemble Approach. J. Theor. Biol., vol. 44, pages 167–190, 1974.
[Kauffman 86] S. Kauffman. Autocatalytic Sets of Proteins. J. Theor. Biol., vol. 119, pages 1–24, 1986.
[Kauffman 93] S. A. Kauffman. The origins of order. self organization and selection in evolution. Oxford University Press, 1993.
[Kauffman 04] S. Kauffman. A Proposal for Using the Ensemble Approach to Understand Genetic Regulatory Networks. J. Theor. Biol., vol. 230, pages 581–590, 2004.
[Keller 02] E. F. Keller. Making sense of life. explaining biological development with models, metaphors, and machines. Harvard University Press, 2002.
[Keller 05] E. F. Keller. Revisiting ’Scale Free’ Networks. BioEssays, vol. 27, pages 1060–1068, 2005.
[King 09] J. P. King. Mathematics in 10 lessons. the grand tour. Prometheus Books, 2009.
[Kitano 02] H. Kitano. Computational Systems Biology. Nature, vol. 420, pages 206–210, 2002.
[Kitano 04] H. Kitano. Biological Robustness. Nature Reviews Genetics, vol. 5, pages 826–837, 2004.
[Kleinberg 00] J. Kleinberg. Navigation In A Small World. Nature, vol. 406, page 845, 2000.
[Kleinberg 06] J. Kleinberg & E. Tardos. Algorithm design. Addison Wesley, 2006.
[Kleinberg 08] J. Kleinberg. The Convergence of Social and Technological Networks. Communications of the ACM, vol. 51, no. 11, pages 66–72, 2008.
[Knight 98] T. F. Knight & G. J. Sussman. Cellular Gate Technology. In C. S. Calude & M. J. Dinneen, editeurs, Unconventional Models of Computation, pages 257–272. Springer-Verlag, 1998.
[Kolmogorov 68a] A. N. Kolmogorov. Logical Basis for Information Theory and Probability Theory. IEEE Transactions on Information Theory, vol. 14, no. 5, pages 662–664, 1968.
[Kolmogorov 68b] A. N. Kolmogorov. Three Approaches to the Quantitative Definition of Information. International Journal of Computer Mathematics, vol. 2, pages 157–168, 1968.
[Konnyu 08] B. Konnyu, T. Czaran & E. Szathmary. Prebiotic Replicase Evolution In A Surface-bound Metabolic System: Parasites As a Source of Adaptive Evolution. BMC Evolutionary Biology, vol. 8, page 267, 2008.
[Kossinets 08] G. Kossinets, J. Kleinberg & D. Watts. The Structure of Information Pathways in a Social Communication Network. In KDD’08, 2008.
[Kun 08] A. Kun, B. Papp & E. Szathmary. Computational Identification of Obligatorily Autocatalytic Replicators Embedded in Metabolic Networks. Genome Biology, vol. 9, 2008.
[Kurose 08] J. F. Kurose & K. W. Ross. Computer networking. a top-down approach. 4th edition. Addison Wesley, 2008.
[Lamport 82] L. Lamport, R. Shostak & M. Pease. The Byzantine Generals Problem. ACM Transactions on Programming Languages and Systems, vol. 4, no. 3, pages 382–401, 1982.
[Lehninger 65] A. L. Lehninger. Bioenergetics. the molecular basis of biological energy transformations. W.A. Benjamin Inc, 1965.
[Leiner 03] B. M. Leiner, V. G. Cerf, D. D. Clark, R. E. Kahn, L. Kleinrock, D. C. Lynch, J. Postel, L. G. Roberts & S. Wolff. A brief history of the internet version 3.32. Internet Society, http://www.isoc.org/internet/history/brief.shtml, 2003.
[Leskovec 08] J. Leskovec. Dynamics of large networks. Ph.D. dissertation. Carnegie Mellon University, 2008.
[Levin 92] S. A. Levin. The Problem of Pattern and Scale in Ecology. Ecology, vol. 73, no. 6, pages 1943–1967, 1992.
[Levine 05] M. Levine & E. H. Davidson. Gene Regulatory Networks for Development. PNAS, vol. 102, no. 14, pages 4936–4942, 2005.
[Levins 66] R. Levins. The Strategy of Model Building in Population Biology. American Scientist, vol. 54, no. 4, pages 421–431, 1966.
[Levins 68] R. Levins. Evolution in changing environments. some theoretical explorations. Princeton University Press, 1968.
[Lewin 92] R. Lewin. Complexity. life at the edge of chaos. Macmillan Publishing Company, 1992.
[Li 97] M. Li & P. Vitanyi. An introduction to kolmogorov complexity and its applications. second edition. Springer, 1997.
[Li 04] M. Li, X. Chen, X. Li, B. Ma & P. M. B. Vitanyi. The Similarity Metric. IEEE Transactions on Information Theory, vol. 50, no. 12, pages 3250–3264, 2004.
[Li 05] L. Li, D. Alderson, J. C. Doyle & W. Willinger. Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications. Internet Mathematics, vol. 2, no. 4, pages 431–523, 2005.
[Li 07] J. Li & P. Knickerbocker. Functional similarities between computer worms and biological pathogens. Computers and Security, vol. 26, pages 338–347, 2007.
[Li 11] X. Li, H. Wang & Y. Kuang. Global Analysis of a Stoichiometric Producer-grazer Model with Holling Type Functional Response. Mathematical Biology, DOI 10.1007/s00285-010-0392-2, 2011.
[Lindelauf 09] R. Lindelauf, P. Borm & H. Hamers. Understanding Terrorist Network Topologies and their Resilience Against Disruption. CentER Discussion Paper No. 2009-85, 2009.
[Lindeman 42] R. L. Lindeman. The Trophic Dynamic Aspect of Ecology. Ecology, vol. 23, pages 399–418, 1942.
[Lloyd 01] A. L. Lloyd & R. M. May. How Viruses Spread Among Computers and People. Science, vol. 292, pages 1316–1317, 2001.
[Loladze 00] I. Loladze, Y. Kuang & J. J. Elser. Stoichiometry in Producer-Grazer Systems: Linking Energy Flow with Element Cycling. Bulletin of Mathematical Biology, vol. 62, pages 1137–1162, 2000.
[Lotka 20] A. J. Lotka. Analytical Note on Certain Rhythmic Relations in Organic Systems. PNAS, vol. 6, pages 410–415, 1920.
[Luczak 90] T. Luczak. The Phase Transition in the Evolution of Random Digraphs. Journal of Graph Theory, vol. 14, no. 2, pages 217–223, 1990.
[Luczak 94] T. Luczak. Phase Transition Phenomena In Random Discrete Structures. Discrete Math., pages 225–242, 1994.
[Luenberger 06] D. G. Luenberger. Information science. Princeton University Press, 2006.
[MacArthur 55] R. MacArthur. Fluctuations of Animal Populations and a Measure of Community Stability. Ecology, vol. 36, pages 533–536, 1955.
[MacKay 03] D. J. C. MacKay. Information theory, inference, and learning algorithms. Cambridge University Press, 2003.
[Maier 09] G. Maier, A. Feldmann, V. Paxson & M. Allman. On dominant characteristics of residential broadband internet traffic. In Proc. ACM IMC, November 2009.
[Marleu 11] J. Marleau, Y. Jin, J. G. Bishop, W. F. Fagan & M. A. Lewis. A Stoichiometric Model of Early Plant Primary Succession. American Naturalist, vol. 177, no. 2, pages 233–245, 2011.
[Massol 11] F. Massol, D. Gravel, N. Mouquet, M. W. Cadotte, T. Fukami & M. A. Leibold. Linking Community and Ecosystem Dynamics Through Spatial Ecology. Ecol. Lett., vol. 14, pages 313–323, 2011.
[Matsuno 97] K. Matsuno. Biodynamics for the Emergence of Energy Consumers. Biosystems, vol. 42, pages 119–127, 1997.
[Matsuno 98] K. Matsuno. Dynamics in Time and Information in Dynamic Time. Biosystems, vol. 46, pages 57–71, 1998.
[Matsuno 99] K. Matsuno. Cell Motility As An Entangled Quantum Coherence. Biosystems, vol. 51, pages 15–19, 1999.
[Matsuno 01] K. Matsuno. Cell Motility and Thermodynamic Fluctuations Tailoring Quantum Mechanics for Biology. Biosystems, vol. 62, no. 1–3, pages 67–85, 2001.
[Maurer 99] B. A. Maurer. Untangling ecological complexity. the macroscopic perspective. University of Chicago Press, 1999.
[May 74] R. M. May. Biological Populations with Nonoverlapping Generations: Stable Points, Stable Cycles and Chaos. Science, vol. 186, pages 645–647, 1974.
[May 76a] R. M. May & G. F. Oster. Bifurcations and Dynamical Complexity in Simple Ecological Models. Am. Nat., vol. 110, pages 573–599, 1976.
[May 76b] R. M. May. Simple Mathematical Models with Very Complicated Dynamics. Nature, vol. 261, pages 459–467, 1976.
[May 00] R. M. May. Stability and complexity in model ecosystems. princeton landmarks in biology edn. Princeton University Press, 2000.
[May 01] R. M. May & A. L. Lloyd. Infection Dynamics on Scale-free Networks. Phys Rev. E, vol. 64, page 066112, 2001.
[May 06] R. M. May. Network Structure and the Biology of Populations. Trends Ecol. Evol., vol. 21, no. 7, pages 394–399, 2006.
[May 08] R. M. May, S. A. Levin & G. Sugihara. Complex Systems: Ecology for Bankers. Nature, vol. 451, pages 893–895, 2008.
[May 09] R. M. May. Food-web Assembly and Collapse: Mathematical Models and Implications for Conservation. Phil. Trans. R. Soc. B, vol. 364, pages 1643–1646, 2009.
[Maynard Smith 99] J. Maynard Smith & E. Szathmary. The origins of life: From the birth of life to the origins of language. Oxford University Press, 1999.
[Mayo 96] D. G. Mayo. Error and the growth of experimental knowledge. The University of Chicago Press, 1996.
[McCann 00] K. S. McCann. The Diversity–Stability Debate. Nature, vol. 405, pages 228–233, 2000.
[McCann 12] K. McCann. Food webs. Princeton University Press, 2012.
[McCarthy 11] M. McCarthy. Decline of Honey Bees Now a Global Phenomenon, Says United Nations. The Independent, http://www.independent.co.uk/environment/nature/decline-of-honey-bees-now-a-global-phenomenon-says-united-nations-2237541.html, March 2011.
[McCauley 99] E. McCauley, R. M. Nisbet, W. W. Murdoch, A. M. de Roos & W. S. C. Gurney. Large-amplitude Cycles of Daphnia and its Algal Prey in Enriched Environments. Nature, vol. 402, pages 653–656, 1999.
[McCauley 08] E. McCauley, W. A. Nelson & R. M. Nisbet. Small-amplitude Cycles Emerge from Stage-structured Interactions in Daphnia-algal Systems. Nature, vol. 455, pages 1240–1243, 2008.
[McKeon 92] R. McKeon, editeur. Introduction to aristotle. The Modern Library, 1992.
[Memmott 04] J. Memmott, N. M. Waser & M. V. Price. Tolerance of Pollination Networks to Species Extinctions. Proc. R. Soc. Lond. B, vol. 271, pages 2605–2611, 2004.
[Meyers 05] L. A. Meyers, B. Pourbohloul, M. E. J. Newman, D. M. Skowronski & R. C. Brunham. Network Theory and SARS: Predicting Outbreak Diversity. J. Theor. Biol., vol. 232, pages 71–81, 2005.
[Miller 07] J. H. Miller & S. E. Page. Complex adaptive systems. an introduction to computational models of social life. Princeton University Press, 2007.
[Milo 02] R. Milo, S. Shen-Orr, S. Itzkovitz, N. Kashtan, D. Chklovskii & U. Alon. Network Motifs: Simple Building Blocks of Complex Networks. Science, vol. 298, pages 824–827, 2002.
[Mishra 07] B. K. Mishra & D. K. Saini. SEIRS Epidemic Model with Delay for Transmission of Malicious Objects in Computer Network. Appl. Math. Comput., vol. 188, pages 1476–1482, 2007.
[Mitchell 06] M. Mitchell. Complex Systems: Network Thinking. Artificial Intelligence, vol. 170, no. 18, pages 1194–1212, 2006.
[Mitchell 09] M. Mitchell. Complexity: a guided tour. Oxford University Press, 2009.
[Mooney 93] C. Z. Mooney & R. D. Duval. Bootstrapping. a nonparametric approach to statistical inference. Sage Publications, 1993.
[Muller 07] S. J. Muller. Asymmetry: The foundation of information. Springer, 2007.
[Naeem 97] S. Naeem & S. Li. Biodiversity Enhances Ecosystem Reliability. Nature, vol. 390, pages 507–509, 1997.
[Nagaraja 06] S. Nagaraja & R. Anderson. The Topology of Covert Conflict. In T. Moore, editeur, Pre-Proceedings of the Fifth Workshop on the Economics of Information Security, 2006.
[Nagaraja 08] S. Nagaraja. Robust Covert Network Topologies. PhD thesis, University of Cambridge, 2008.
[Nam 10] J. Nam, P. Dong, R. Tarpine, S. Istrail & E. H. Davidson. Functional Cis-regulatory Genomics for Systems Biology. PNAS, vol. 107, no. 8, pages 3930–3935, 2010.
[Ness 69] H. C. Van Ness. Understanding thermodynamics. Dover Publications Inc., 1969.
[Newman 02a] M. E. J. Newman. Spread of Epidemic Disease On Networks. Phys. Rev. E., vol. 66, page 016128, 2002.
[Newman 02b] M. E. J. Newman, S. Forrest & J. Balthrop. Email Networks and the Spread of Computer Viruses. Phys. Rev. E., vol. 66, page 035101, 2002.
[Newman 03] M. E. J. Newman. The Structure and Function of Complex Networks. SIAM Review, vol. 45, pages 167–256, 2003.
[Newman 06] M. Newman, A.-L. Barabási & D. J. Watts, editeurs. The structure and dynamics of networks. Princeton University Press, 2006.
[Newman 10] M. E. J. Newman. Networks. an introduction. Oxford University Press, 2010.
[Nicolis 89] G. Nicolis & I. Prigogine. Exploring complexity. an introduction. W.H. Freeman and Company, 1989.
[Nowak 06] M. A. Nowak. Evolutionary dynamics. exploring the equations of life. Belknap/Harvard, 2006.
[Nuland 97] S. B. Nuland. The wisdom of the body. Alfred A. Knopf, 1997.
[Odum 53] E. P. Odum. Fundamentals of ecology. Saunders, 1953.
[Oleson 07] J. M. Olesen, J. Bascompte, Y. Dupont & P. Jordano. The Modularity of Pollination Networks. PNAS, vol. 104, pages 19891–19896, 2007.
[Oliveri 08] P. Oliveri, Q. Tu & E. H. Davidson. Global Regulatory Logic for Specification of an Embryonic Cell Lineage. PNAS, vol. 105, no. 16, pages 5955–5962, 2008.
[Otto 07] S. B. Otto, B. C. Rall & U. Brose. Allometric Degree Distributions Facilitate Food Web Stability. Nature, vol. 450, pages 1226–1230, 2007.
[Ozsu 99] M. T. Ozsu & P. Valduriez. Principles of distributed database systems. second edition. Prentice Hall, 1999.
[Pahl-Wostl 92] C. Pahl-Wostl. Information Theoretical Analysis of Functional Temporal and Spatial Organization in Flow Networks. Mathl. Comput. Modelling, vol. 16, no. 3, pages 35–52, 1992.
[Pahl-Wostl 94] C. Pahl-Wostl. Sensitivity Analysis of Ecosystem Dynamics Based On Macroscopic Community Descriptors: A Simulation Study. Ecological Modelling, vol. 75/76, pages 51–62, 1994.
[Pastor-Satorras 01] R. Pastor-Satorras & A. Vespignani. Epidemic Spreading in Scale-free Networks. Phys. Rev. Lett., vol. 86, no. 14, pages 3200–3203, 2001.
[Pastor-Satorras 04] R. Pastor-Satorras & A. Vespignani. Evolution and structure of the internet. a statistical physics approach. Cambridge University Press, 2004.
[Patten 59] B. C. Patten. An Introduction to the Cybernetics of the Ecosystem: The Trophic-Dynamic Aspect. Ecology, vol. 40, no. 2, pages 221–231, 1959.
[Patten 84] B. C. Patten & M. Higashi. Modified cycling index for ecological applications. Ecol. Modell., vol. 25, no. 1-3, pages 69–83, 1984.
[Patten 85] B. C. Patten. Energy cycling in the ecosystem. Ecol. Modell., vol. 28, pages 1–71, 1985.
[Patten 90] B. C. Patten, M. Higashi & T. P. Burns. Trophic Dynamics in Ecosystem Networks: Significance of Cycles and Storage. Ecol. Modell., vol. 51, pages 1–28, 1990.
[Pease 80] M. Pease, R. Shostak & L. Lamport. Reaching Agreement in the Presence of Faults. Journal of the Association for Computing Machinery, vol. 27, no. 2, pages 228–234, 1980.
[Petanidou 08] T. Petanidou, A. Kallimanis, J. Tzanopoulos, S. Sgardelis & J. D. Pantis. Long Term Observation of a Pollination Network: Fluctuations in Species and Interactions, Relative Invariance of Network Structure, and Implications for Estimates of Specialization. Ecol. Lett., vol. 11, pages 564–575, 2008.
[Peterson 03] I. Peterson. Newton’s clock. chaos in the solar system. W.H. Freeman and Company, 2003.
[Pimm 79] S. L. Pimm. The Structure of Food Webs. Theoretical Population Biology, vol. 16, pages 144–158, 1979.
[Pollack 01] G. H. Pollack. Cells, gels and the engines of life. a new, unifying approach to cell function. Ebner and Sons, 2001.
[Prigogine 80] I. Prigogine. From being to becoming. time and complexity in the physical sciences. W.H. Freeman and Company, 1980.
[Prigogine 84] I. Prigogine & I. Stengers. Order out of chaos. man’s new dialogue with nature. Bantam Books, 1984.
[Putnam 82] H. Putnam. Reason, truth, and history. Cambridge University Press, 1982.
[Quince 05] C. Quince, P. G. Higgs & A. J. McKane. Deleting Species from Model Food Webs. Oikos, vol. 110, pages 283–296, 2005.
[Renyi 87] A. Renyi. A diary on information theory. John Wiley and Sons, 1987.
[Rexford 10] J. Rexford & C. Dovrolis. Future Internet Architecture: Clean-Slate Versus Evolutionary Research. CACM, vol. 53, no. 9, pages 36–40, 2010.
[Ricklefs 79a] R. E. Ricklefs. Ecology. second edition. Chiron Press, 1979.
[Ridley 93] M. Ridley. The red queen: Sex and the evolution of human nature. Harper Perennial, 1993.
[Rip 10] J. M. K. Rip, K. S. McCann, D. H. Lynn & S. Fawcett. An Experimental Test of a Fundamental Food Web Motif. Proc. R. Soc. B: Biological Sciences, vol. 277, pages 1743–1749, 2010.
[Robinson 95] R. A. Robinson. Return to resistance: Breeding crops to reduce pesticidedependence. IDRC, 1995.
[Romanuk 06a] T. N. Romanuk, B. E. Beisner, N. D. Martinez & J. Kolasa. Non-omnivorous Generality Promotes Population Stability. Biol. Lett., vol. 2, pages 374–377, 2006.
[Romanuk 06b] T. N. Romanuk, R. J. Vogt & J. Kolasa. Nutrient Enrichment Weakens the Stabilizing Effect of Species Richness. Oikos, vol. 114, pages 291–302, 2006.
[Romanuk 09a] T. N. Romanuk, R. J. Vogt & J. Kolasa. Ecological Realism And Mechanisms By Which Diversity Begets Stability. Oikos, vol. 118, pages 819–828, 2009.
[Romanuk 09b] T. N. Romanuk, Y. Zhou, U. Brose, E. L. Berlow, R. J. Williams & N. D. Martinez. Predicting Invasion Success In Complex Ecological Networks. Philosophical Transactions of the Royal Society B, vol. 364, pages 1743–1754, 2009.
[Romanuk 10] T. N. Romanuk, R. J. Vogt, A. Young, C. Tuck & M. W. Carscallen. Maintenance of Positive Diversity-Stability Relations Along A Gradient of Environmental Stress. PLoS ONE, vol. 5, no. 4, 2010.
[Rossberg 06] A. G. Rossberg, K. Yanagi, T. Amemiya & K. Itoh. Estimating Trophic Link Density from Quantitative but Incomplete Diet Data. J. Theor. Biol., vol. 243, pages 261–272, 2006.
[Ruelle 89] D. Ruelle. Chaotic evolution and strange attractors. Cambridge University Press, 1989.
[Ruelle 06] D. Ruelle. What is a Strange Attractor? Notices of the AMS, vol. 53, no. 7, pages 764–765, 2006.
[Rutledge 76] R. W. Rutledge, B. L. Basore & R. J. Mulholland. Ecological Stability: An Information Theory Viewpoint. J. Theor. Biol., vol. 57, pages 355–371, 1976.
[Salas 11] A. K. Salas & S. R. Borrett. Evidence for the Dominance of Indirect Effects in 50 Trophic Ecosystem Networks. Ecological Modelling, vol. 222, pages 1192–1204, 2011.
[Salisbury 85] F. B. Salisbury & C. W. Ross. Plant physiology, third edition. Wadsworth Publishing Company, 1985.
[Salthe 85] S. N. Salthe. Evolving hierarchical systems. Columbia University Press, 1985.
[Salthe 93] S. N. Salthe. Development and evolution: Complexity and change in biology. The MIT Press, 1993.
[Santello 10] M. Santello & A. Volterra. Astrocytes as Aide-memoires. Nature, vol. 463,pages 169–170, 2010.
[Savelsbergh 95] M. W. P. Savelsbergh & M. Sol. The General Pickup and Delivery Problem. Transportation Science, vol. 29, pages 17–29, 1995.
[Schneier 04] B. Schneier. Secrets and lies. digital security in a networked world. with new information about post-9/11 security. Wiley Publishing Inc., 2004.
[Segel 01] L. A. Segel & I. R. Cohen, editeurs. Design principles for the immune system and other distributed autonomous systems. Oxford University Press, 2001.
[Seshadri 08] M. Seshadri, S. Machiraju, A. Sridharan, J. Bolot, C. Faloutsos & J. Leskovec. Mobile call graphs: Beyond power-law and lognormal distributions. In KDD’08, 2008.
[Shannon 49] C. Shannon. Communication Theory of Secrecy Systems. Bell System Technical Journal, vol. 28, no. 4, pages 656–715, 1949.
[Shannon 63] C. E. Shannon & W. Weaver. The mathematical theory of communication. University of Illinois Press, 1963.
[Shapiro 06] E. Shapiro & Y. Benenson. Bringing DNA Computers To Life. Scientific American, May 2006, pages 45–51.
[Shen-Orr 02] S. S. Shen-Orr, R. Milo, S. Mangan & U. Alon. Network Motifs in the Transcriptional Regulation Network of Escherichia coli. Nature Genetics, vol. 31, pages 64–68, 2002.
[Shmulevich 02a] I. Shmulevich, E. R. Dougherty, S. Kim & W. Zhang. Probabilistic Boolean Networks: A Rule-based Uncertainty Model for Gene Regulatory Networks. Bioinformatics, vol. 18, no. 2, pages 261–274, 2002.
[Shmulevich 02b] I. Shmulevich, E. R. Dougherty & W. Zhang. From Boolean to Probabilistic Boolean Networks as Models of Genetic Regulatory Networks. Proceedings of the IEEE, vol. 90, no. 11, pages 1778–1792, 2002.
[Shoham 09] Y. Shoham & K. Leyton-Brown. Multiagent systems. algorithmic, game-theoretic, and logical foundations. Cambridge University Press, 2009.
[Simon 62] H. A. Simon. The Architecture of Complexity. Proc. Am. Philos. Soc., vol. 106, no. 6, pages 467–482, 1962.
[Slack 02] J. M. W. Slack. Conrad Hal Waddington: the Last Renaissance Biologist? Nat. Rev. Genet., vol. 3, pages 889–895, 2002.
[Sober 84] E. Sober, editeur. Conceptual issues in evolutionary biology. The MIT Press, 1984.
[Sole 01] R. V. Sole & J. M. Montoya. Complexity and Fragility in Ecological Networks. Proc. R. Soc. Lond. B., vol. 268, pages 2039–2045, 2001.
[Sole 04] R. V. Sole & S. Valverde. Information Theory of Complex Networks: On Evolution and Architectural Constraints. Lect. Notes Phys., vol. 650, pages 189–207, 2004.
[Sole 06] R. V. Sole & J. Bascompte. Self-organization in complex ecosystems. PrincetonUniversity Press, 2006.
[Solomonoff 64a] R. J. Solomonoff. A Formal Theory of Inductive Inference Part I. Information and Control, vol. 7, no. 1, pages 1–22, 1964.
[Solomonoff 64b] R. J. Solomonoff. A Formal Theory of Inductive Inference Part II. Information and Control, vol. 7, no. 2, pages 224–254, 1964.
[Somayaji 04] A. Somayaji. How To Win an Evolutionary Arms Race. IEEE Security andPrivacy, vol. 2, no. 6, pages 70–72, 2004.
[Somayaji 07a] A. Somayaji. Immunology, Diversity, and Homeostasis: The Past and Future of Biologically-Inspired Computer Defenses. Inf. Secur. Tech. Rep., vol. 12, no. 4, pages 228–234, 2007.
[Somayaji 07b] A. Somayaji, M. Locasto & J. Feyereist. Panel: The Future of Biologically-Inspired Security: Is There Anything Left To Learn? New Security Paradigms Workshop, vol. 2007, 2007.
[Speybroeck 02] L. Van Speybroeck. From Epigenesis to Epigenetics. The Case of C. H. Waddington. Ann. N.Y. Acad. Sci., vol. 981, pages 61–81, 2002.
[Sprott 03] J. Sprott. Chaos and time-series analysis. Oxford University Press, 2003.
[Steghofer 10] J. P. Steghofer, J. Denzinger, H. Kasinger & B. Bauer. Improving the Efficiency of Self-Organizing Emergent Systems by an Advisor. In Proc. EASe 2010, pages 63–72, 2010.
[Sykes 82] J. B. (ed) Sykes. The concise oxford dictionary of current english. OxfordUniversity Press, 1982.
[Szathmary 06] E. Szathmary. The Origin of Replicators and Reproducers. Phil. Trans. R.Soc. B, vol. 361, pages 1761–1776, 2006.
[Szathmary 07] E. Szathmary. Coevolution of Metabolic Networks and Membranes: the Scenario of Progressive Sequestration. Phil. Trans. R. Soc. B, vol. 362, pages 1781–1787, 2007.
[Tanizawa 05] T. Tanizawa, G. Paul, R. Cohen & H. E. Stanley. Optimization of Network Robustness to Waves of Targeted and Random Attacks. Physical Review E, vol. 71, 2005.
[Tijms 07] H. Tijms. Understanding probability. chance rules in everyday life. second edition. Cambridge University Press, 2007.
[Tilman 94] D. Tilman & J. A. Downing. Biodiversity and Stability in Grasslands. Nature, vol. 367, pages 363–365, 1994.
[Tilman 96] D. Tilman. Biodiversity: Population Versus Ecosystem Stability. Ecology, vol. 77, no. 2, pages 350–363, 1996.
[Tilman 99] D. Tilman. The Ecological Consequences of Changes in Biodiversity: A Search For General Principles. Ecology, vol. 80, pages 231–251, 1999.
[Trudeau 76] R. J. Trudeau. Introduction to graph theory. Dover Press, 1976.
[Tukey 66] J. W. Tukey & M. B. Wilk. Data Analysis and Statistics: An Expository Overview. AFIPS Conf. Proc., Fall Joint Comput. Conf., vol. 29, pages 695–709, 1966.
[Ulanowicz 83] R. E. Ulanowicz. Identifying the Structure of Cycling In Ecosystems. Mathematical Biosciences, vol. 65, pages 219–237, 1983.
[Ulanowicz 91] R. E. Ulanowicz & W. F. Wolff. Ecosystem Flow Networks: Loaded Dice? Mathematical Biosciences, vol. 103, pages 45–68, 1991.
[Ulanowicz 97] R. E. Ulanowicz. Ecology, the ascendent perspective. Columbia University Press, 1997.
[Ulanowicz 99a] R. E. Ulanowicz. Life After Newton: An Ecological Metaphysic. Biosystems, vol. 50, pages 127–142, 1999.
[Ulanowicz 99b] R. E. Ulanowicz & D. Baird. Nutrient Controls on Ecosystem Dynamics: the Chesapeake Mesohaline Community. Journal of Marine Systems, vol. 19, pages 159–172, 1999.
[Ulanowicz 04] R. E. Ulanowicz. Quantitative Methods for Ecological Network Analysis.Computational Biology and Chemistry, vol. 28, pages 321–339, 2004.
[Ulanowicz 09a] R. E. Ulanowicz. The Dual Nature of Ecosystem Dynamics. Ecological Modelling, vol. 220, pages 1886–1892, 2009.
[Ulanowicz 09b] R. E. Ulanowicz. A third window. natural life beyond newton and darwin.Templeton Foundation Press, 2009.
[Ulanowicz 09c] R. E. Ulanowicz, S. Goerner, B. Lietaer & R. Gomez. Quantifying Sustainability: Resilience, Efficiency and the Return of Information Theory. Ecol. Complex., vol. 6, pages 27–36, 2009.
[Van Miegham 11] P. Van Mieghem. Graph spectra for complex networks. Cambridge University Press, 2011.
[Van Mieghem 09] P. Van Mieghem, J. Omic & R. Kooij. Virus Spread In Networks. IEEE/ACM Transactions on Networking, vol. 17, no. 1, pages 1–14, 2009.
[van Steen 10] M. van Steen. Graph theory and complex networks. an introduction. Maarten van Steen, 2010.
[van Valen L. 73] L. van Valen. A New Evolutionary Law. Evolutionary Theory, vol. 1, pages 1–30, 1973.
[Vasseur 08] D. A. Vasseur & J. W. Fox. Phase-locking and Environmental Fluctuations Generate Synchrony in a Predator-prey Community. Nature, vol. 460, pages 1007–1011, 2008.
[Vespignani 10] A. Vespignani. The Fragility of Interdependency. Nature, vol. 464, pages984–985, 2010.
[Visser 08] M. E. Visser. Keeping up with a Warming World; Assessing the Rate of Adap-tation to Climate Change. Proc. R. Soc. B, vol. 275, pages 649–659, 2008.
[Vogt 07] R. Vogt, J. Aycock & M. J. Jacobson. Quorom Sensing and Self-StoppingWorms. WORM’07, pages 16–22, 2007.
[Volchan 02] S. B. Volchan. What is a Random Sequence? The Mathematical Associationof America Monthly, vol. 109, pages 46–63, 2002.
[von Mises 57] R. von Mises. Probablity, statistics and truth. Dover Publications, Inc, 1957.
[von Mises 81] R. von Mises. Probability, statistics and truth. Dover Publications, Inc.,1981.
226
[Waddington 42] C. H. Waddington. Canalization of Development and the Inheritance ofAcquired Characters. Nature, vol. 150, pages 563–565, 1942.
[Waddington 77] C. H. Waddington. Tools for thought. how to understand and apply thelatest scientific techniques of problem solving. Basic Books, 1977.
[Wagner 05] A. Wagner. Robustness and evolvability in living systems. Princeton UniversityPress, 2005.
[Wainer 05] H. Wainer. Graphic discovery. a trout in the milk and other visual adventures.Princeton University Press, 2005.
[Waldrop 92] M. M. Waldrop. Complexity. the emerging science at the edge of order andchaos. Simon and Schuster, 1992.
[Wang 03] Y. Wang, D. Chakrabarti, C. Wang & C. Faloutsos. Epidemic Spreading inReal Networks: An Eigenvalue Viewpoint. In Proc. 22nd Int. Symp. ReliableDistributed Systems (SRDS’03)22nd Int. Symp. Reliable Distributed Systems(SRDS’03, pages 25–34, 2003.
[Wang 09] H. Wang, J. D. Nagy, O. Gilg & Y. Kuang. The Roles of Predator MaturationDelay and Functional Response in Determining the Periodicity of Predator-Prey Cycles. Mathematical Biosciences, vol. 221, pages 1–10, 2009.
[Wang 10a] H. Wang. Revisit Brown Lemming Population Cycles in Alaska: An Exam-ination of Stoichiometry. International Journal of Numerical Analysis andModeling, Series B, vol. 1, no. 1, pages 93–108, 2010.
[Wang 10b] J. Wang, L. Xu, E. Wang & S. Huang. The Potential Landscape of GeneticCircuits Imposes The Arrow of Time In Stem Cell Differentiation. BiophysicalJournal, vol. 99, pages 29–39, 2010.
[Watts 98] D. J. Watts & S. H. Strogatz. Collective Dynamics of ‘Small-World’ Networks.Nature, vol. 393, pages 440–442, 1998.
[White 01] D. R. White & F. Harary. The Cohesiveness of Blocks in Social Networks: NodeConnectivity and Conditional Density. Sociological Methodology, vol. 31, no. 1,pages 305–359, 2001.
[Williams 00] R. J. Williams & N. D. Martinez. Simple Rules Yield Complex Food Webs.Nature, vol. 404, pages 180–183, 2000.
[Williams 07] R. J. Williams, U. Brose & N. D. Martinez. Homage to Yodzis and Innes1992: Scaling Up Feeding-Based Population Dynamics To Complex EcologicalNetworks. In From Energetics to Ecosystems: Dynamics and Structures ofEcological Systems, pages 37–51. Springer, 2007.
[Williams 08] R. J. Williams. Effects of Network and Dynamical Model Structure On SpeciesPersistence in Large Model Food Webs. Theoretical Ecology, pages 141–151,2008.
227
[Williams 10] R. J. Williams. Simple MaxEnt Models Explain Food Web Degree Distribu-tions. Theor. Ecol., vol. 3, pages 45–52, 2010.
[Williams 11] R. J. Williams. Biology, Methodology or Chance? The Degree Distributionsof Bipartite Ecological Networks. PLoS ONE, vol. 6, no. 3, 2011.
[Winfree 98] E. Winfree. Algorithmic Self-Assembly of DNA. PhD thesis, California Insti-tute of Technology, 1998.
[Wolfram 84] S. Wolfram. Cellular Automata as Models of Complexity. Nature, vol. 311,pages 419–424, 1984.
[Wolfram 94] S. Wolfram. Cellular automata and complexity. collected papers. AddisonWesley, 1994.
[Wootton 94] J. T. Wootton. The Nature and Consequences of Indirect Effects in EcologicalCommunities. Annu. Rev. Rev. Ecol. Syst., vol. 25, pages 443–466, 1994.
[Worster 77] D. Worster. Nature’s economy. a history of ecological ideas. Cambridge Uni-versity Press, 1977.
[Yodzis 80] P. Yodzis. The Connectance of Real Ecosystems. Nature, vol. 284, pages 544–545, 1980.
[Yodzis 81] P. Yodzis. The Stability of Real Ecosystems. Nature, vol. 289, pages 674–676,1981.
[Yodzis 82] P. Yodzis. The Compartmentation of Real and Assembled Ecosystems. Am.Nat, vol. 120, no. 5, pages 551–570, 1982.
[Yodzis 92] P. Yodzis & S. Innes. Body Size and Consumer-Resource Dynamics. Am. Nat,vol. 139, pages 1151–1175, 1992.
[Yodzis 00] P. Yodzis. Diffuse Effects in Food Webs. Ecology, vol. 81, no. 1, pages 261–266,2000.
[Yuan 08] H. Yuan & G. Chen. Network Virus-epidemic Model With the Point-To-GroupInformation Propagation. Appl. Math. Comput., vol. 206, pages 357–367, 2008.
[Yuh 98] C-H Yuh, H. Bolouri & E. H. Davidson. Genomic Cis-Regulatory Logic: Exper-imental and Computational Analysis of a Sea Urchin Gene. Science, vol. 279,pages 1896–1902, 1998.
[Yuh 01] C-H Yuh, H. Bolouri & E. H. Davidson. Cis-regulatory Logic in the Endo16Gene: Switching from a Specification to a Differentiation Mode of Control.Development, vol. 128, pages 617–629, 2001.
[Zorach 03] A. C. Zorach & R. E. Ulanowicz. Quantifying the Complexity of Flow Networks:How Many Roles are There? Complexity, vol. 8, no. 3, pages 68–76, 2003.
228