The theory of complex systems. Complexity theory arises from the need to understand the richness in structure and behavior often seen in large systems. The property that distinguishes complex systems from systems that are large but simple is the emergence of global features from local interactions, as captured in the popular saying “the whole is greater than the sum of its parts.” For example, a flock of birds emerges when individual birds coordinate their behavior with each other.
Complexity often appears as unpredictability, or as a tendency for a system to undergo sudden, rapid change. Typically, it arises within systems that have strong interrelationships between components. For instance, the coordinated action of millions of individual cells, all tightly bound together, produces the action of muscles in the body. Epidemics emerge from the spread of infection from individual to individual.
Complexity theory has its roots in many fields of research, including biology, computer science, ecology, engineering control systems, nuclear physics, and operations research. In physics, for instance, there is a need to understand properties that emerge within large ensembles of atoms and other particles. General systems theory, developed by the biologist Ludwig von Bertalanffy in the 1950s, identified the role of internal interactions and processes in self-organization. In the 1960s, W. Ross Ashby and Norbert Wiener developed cybernetics, which concerned communications and feedback in the control of systems. Complexity theory came into its own during the late 1980s, as increasing computing power made large-scale simulation studies practical. The formation of specialist research centers, such as the Santa Fe Institute, did much to promote the new science during the 1990s. See also: Cybernetics; Ecology; Nuclear structure; Operations research
Several fields of research have become closely linked to complexity theory. The link to graph theory is assured by the network model of complexity (discussed below). The relevance of chaos theory arises because complexity in dynamic systems often leads to nonlinear and chaotic behavior. Another related field is artificial life (Alife), which concerns lifelike features found in artificial systems. By addressing complexity in living systems, this field has influenced the development of theory in evolution, ecology, and sociobiology. Evolutionary computing concerns computational processes that address complex problems by mimicking species evolution. The best-known example is the genetic algorithm, introduced by John Holland in 1975. This solves complex optimization problems by treating key parameters as genes within a population of potential solutions. Complex adaptive systems (CAS) are systems, such as brains or societies, composed of interacting elements that adapt or change in response to experience. See also: Chaos; Evolutionary computation; Genetic algorithms; Graph theory
In most cases, complexity can be understood in terms of networks. This model treats systems as sets of objects (“nodes”) with connections (“edges”) linking them. For instance, within a flock of birds the nodes would be birds and their interactions would be the edges. In a nuclear chain reaction, the nodes would be atoms and the edges would be moving neutrons. The network model also applies to processes and behavior. In a game of chess, for instance, each arrangement of pieces on the board is a node in a network of board arrangements, and moves by the players provide the edges that link different arrangements. In many games, for instance, a general strategy is to restrict your opponent's moves while maximizing the number of possible moves that you can make.
Strictly speaking, a set of nodes linked by edges forms a graph (for example, people who are neighbors of each other). A directed graph is a graph in which each edge has direction (for example, descendants in a family), and a network is a graph in which the nodes or edges have attributes associated with them (for example, people have names).
In any network, some features arise from the nature of the objects and their relationships, and some emerge from the underlying network. A network is connected if there is a path (a sequence of edges) leading from any one node to any other. For example, the edges A-B, B-C, C-D form a path between nodes A and D. Attributes that are often used to express the structure of different networks include:
Connectivity (the number of edges that must be removed to break the network apart).
Edge density (the number of edges as a fraction of the total number possible).
Clustering (the density of links between sets of nodes linked to a given node).
Diameter (the greatest separation between pairs of nodes, where separation is taken to mean the shortest path between a given pair of nodes).
Several kinds of network structure are common because they confer important properties:
1. Cycles or circuits (Fig. 1a) are regular networks in which a single path joins every node. The sequence of edges A-B, B-C, C-D, D-A forms a cycle linking the nodes A, B, C, and D.
2. Trees (Fig. 1b) are connected networks that contain no cycles. Hierarchies are trees with a root node, a node that all edges lead away from (for example, a common ancestor in a family tree). In computing, objects can form hierarchies in two ways: whole-part hierarchies (for example, in a whole book, the parts are chapters), and gen-spec hierarchies (for example, a novel is a special kind of book, and a book is a special kind of publication).
3. Random networks (Fig. 1c) are networks in which edges are distributed randomly between nodes.
4. Regular networks (Fig. 1d) are connected networks in which the edges form a pattern (for example, a lattice or grid).
5. Scale-free networks (Fig. 1e) form when new nodes in a growing network attach preferentially to existing nodes with highest connectivity. The resulting distribution of links per node obeys an inverse power law. That is, some nodes have many connections, but most have few. Web sites on the Internet, connected by hypertext links, provide a good example.
6. Small worlds (Fig. 1f) are connected graphs that contain both local clusters and long-range interactions. They fall between random networks at one extreme and regular networks at the other. Social networks often take this form.
Complex systems have a tendency to undergo sudden change. Examples include crystallization, water freezing, and epidemics. In all cases, these phase changes occur at a fixed value (the critical point) of some order parameter (for example, water freezes at a temperature of 0°C). Phase transitions can usually be understood as connectivity avalanches in an underlying network (Fig. 2). If a network is formed by adding edges at random to a set of N nodes, then a phase change (connectivity avalanche) occurs when the number of edges reaches the critical density N/2 (Fig. 3): the nodes become absorbed into a unique giant component. This connectivity avalanche translates into many physical processes. In crystallization, for instance, it involves small crystal seeds merging to form large structures. In all epidemic phenomena, the process requires a certain critical level of interaction to sustain itself. Nuclear fission, for instance, starts when the fuel reaches a critical mass. See also: Critical mass; Critical phenomena; Crystallization; Epidemic; Phase transitions
Automata whose state spaces lie close to the phase change in connectivity (the so-called edge of chaos) often exhibit interesting behavior, leading to speculation that only these automata are capable of universal computation and the ability to evolve.
Self-organized criticality occurs when internal processes maintain a system in a critical state (for example, a sand pile). Characteristically, the frequency (f) of events in such a system (for example, the sand pile collapses) follows an inverse power law f(s) = s−τ, where s is the size of the event and τ is a self-similarity exponent.
Encapsulation involves grouping nodes in a network into modules that are highly connected internally, but with limited connections to the rest of the network. In both natural and artificial systems, encapsulation is the most common way of coping with complexity. It appears in many different guises. In human activity, large tasks are usually broken down into manageable parts: Large corporations are typically divided into work units such as divisions, departments, and sections. Likewise, filing systems are divided into cabinets, drawers, files, and folders.
Modular design simplifies construction. Computer software typically consists of self-contained, reusable objects and modules. Living trees grow by the repeated addition of modules, such as buds, leaves, and branches. Engineering systems are often designed around modular components. A motor car, for instance, has electrical, fuel, and steering systems, and so on.
Encapsulation reduces complexity by eliminating long-range connections. In a human organization, for instance, specialization reduces the scope of problems that each person needs to deal with. On the other hand, modular designs can be brittle, often failing when long-range connections occur, such as by placing a library book back on the wrong shelf.
Modular structures often form into hierarchies of organization. A book, for instance, is a hierarchy of modules consisting of chapters, sections, paragraphs, sentences, and words. In management, hierarchies arise from the dual desires to simplify complex tasks and to control complex organizations (“divide and rule”).
Internal interactions can lead to the emergence of order within complex systems, even in the absence of external influences. Several processes are known to promote self-organization. Ilya Prigogine introduced the term “dissipative systems” to describe systems (including living things) that are far from equilibrium and share energy with their environment. He argued that in dissipative systems local irregularities can grow and spread, allowing large-scale patterns and order to emerge. Crystal growth is an example. See also: Crystal growth
Feedback plays an important role in self-organization. It occurs where the output of a system is fed back as input to the same system. Negative feedback dampens disturbances and stabilizes local patterns and behavior. Familiar examples include thermostats, speed governors, and other control devices that maintain systems in a steady state (sometimes called homeostasis). Positive feedback amplifies local effects, allowing them to grow and spread. Control systems are designed to avoid positive feedback because of its destabilizing effect. In contrast, positive feedback plays an important role in natural systems by contributing to the formation of large-scale order. In interstellar gas clouds, for instance, gravity leads to the formation of large centers of mass. These centers gather material faster, and at the expense of smaller ones, eventually forming stars and planets. Within an ant colony, piles of food, eggs, and waste grow in a similar way. Eventually this sorting of materials creates the order seen in ant colonies. Firing of a laser occurs when groups of atoms synchronize their outputs by “enslaving” other atoms. See also: Control systems; Homeostasis; Laser; Molecular cloud
Other processes that promote self-organization include encapsulation, adaptation, and evolution within living systems.
In a system of objects, interactions lead to self-organization, which results in the emergence of large-scale features. An ant colony emerges from the joint activity of thousands of individual ants interacting with each other and with their environment. Emergent properties are not predictable from the properties of the individuals, but often involve the imposing of constraints on individuals. In module formation, for instance, individual interactions with the outside could disrupt the module's integrity.
Module formation is a crucial mechanism in the emergence of order in many complex systems. In a growing embryo, for instance, cells differentiate at a very early stage of development, forming modules that grow into separate limbs and organs. See also: Cell differentiation
A related problem is the paradox of self-replication. This originated during the seventeenth century with attempts to explain human reproduction. Human eggs or sperm were thought to contain homunculi, tiny bodies from which children would grow. However, this idea led to a paradox: a single homunculus would need to have nested within it smaller homunculi from which future generations would spring. The paradox of self-replication arises because some part a self-replicating system apparently needs to contain a copy of the whole system. How is it possible, for instance, for a plant or animal to reproduce without including a complete description of itself? The paradox is resolved by observing that only information about the process of replication needs to be embedded within the system, not the entire system itself. The rest of the information can be derived from interaction of the system with its environment. John von Neumann suggested that there is a threshold of complexity below which automata are too simple to be capable of self-replication. However, later researchers, such as Lionel and Roger Penrose, demonstrated that even certain simple block structures could replicate themselves. Tom Ray's Tierra model provided a practical demonstration of a computational environment within which programs emulated organisms by containing code for reproducing themselves.
In computing, the complexity of a problem is measured by the time (or space) required to solve it, and the way that time increases as the size of the input increases. For instance, given a list of the locations of n towns, the time needed to find the most northerly town increases in proportion to n [denoted O(n)], but the time required to find the shortest distance between pairs of towns increases in proportion to n2 [denoted O(n2)].
In problem solving, an important advantage of modularity is to reduce combinatorial complexity. In a rail network linking, say, 100 stations, separate fares for every pair of stations would require a table with 4950 entries, but grouping the stations into, say, 3 zones would reduce the table to no more than 6 different fares.
Problems are often classified according to their computational complexity. Conventional terminology defines the class P to consist of all problems that can be solved by a deterministic algorithm for which the time complexity is a polynomial function of the input (for example, the towns examples above). The class NP consists of problems that a nondeterministic machine can solve in polynomial time. An unresolved theoretical question is whether the classes P and NP are identical. See also: Cryptography
Hard problems occur when the time complexity increases faster than any polynomial of the input. In the traveling salesman problem, for instance, a salesperson needs to find a route that minimizes the total distance traveled while visiting a number of towns (without doubling back). The solution involves checking all possible orderings of a list of the towns.
Attempts to link complexity to thermodynamics and information theory led to the Kolmogorov-Chaitin definition of algorithmic complexity, which is the length of the shortest program needed to compute a given pattern. For example, a table of sports scores in general contains more or less random entries and cannot be compressed. In contrast, a table of logarithms can be condensed to a formula that calculates all the entries. Describing the path of a flock of birds greatly reduces the detail needed to describe the paths of all the individual birds. These ideas extend into many aspects of computing. Methods such as minimum message length (MML) adapt them to provide practical methods of classification and problem solving. Data compression algorithms typically employ a fixed program and reduce a file to the data needed to regenerate the original pattern. Fractals may be regarded as patterns that are produced by repeating a given algorithm on different scales. See also: Data compression; Fractals; Information theory
As information technology has spread throughout society, the idea of natural computation has been increasingly influential. This idea treats natural processes as forms of computing. For instance, scientific studies sometimes employ computational analogies (for example, animals as programmed robots) to understand complex processes in natural systems. Likewise, computing has borrowed ideas from nature, which has evolved ways of solving complex problems. This practice has resulted in a proliferation of biological terms such as cellular automata, genetic algorithms, neural networks, and swarm intelligence. See also: Neural network
Role of simulation
Complex systems usually behave in nonlinear fashion. They are often mathematically intractable, and the richness of their behavior can make them inherently unpredictable. Simulation models provide a practical way of representing them. Certain representations are widely used to simulate specific kinds of complex systems, notably cellular automata, spin glasses, and agent-based models.
Cellular automata are grids of cells, each of which is an automaton that interacts with its neighbors. First introduced by John von Neumann, cellular automata became popular during the 1970s as a result of John Conway's cellular automaton Game of Life. They have been applied to many physical processes, such as flows and spread of epidemics (Fig. 4). In a simple cellular automaton model of disease spread, for instance, each cell would denote an area within a landscape, and the cell's state at any given time might be “susceptible,” “infected,” or “immune.” The epidemic would spread from infected cells to neighboring susceptible cells. The virulence of the disease would determine the chances of a susceptible cell becoming infected; otherwise it would become immune. See also: Cellular automata
Spin glasses (regular lattices, in which each node has a “spin”) have been used to study processes, such as glass formation, involving transitions from incoherent to coherent configurations of spin states in magnetic materials. See also: Spin glass
Agent-based models consist of many independent agents (for example, people, animals, or companies) that interact with one another and behave according to defined sets of rules. Cellular automata are a subclass of agent-based models. Early examples of this approach included Paulien Hogeweg's work on the emergence of social order within bumblebee colonies and Craig Reynold's “boids” models of flocking behavior in birds.
Simulation models provide several practical advantages for dealing with complexity. One is the potential to perform “virtual experiments” for problems where real-world experiments are impractical. Another is the ability to explore “what-if” scenarios when complexity makes prediction impossible. See also: Simulation
Complexity theory is becoming more important as the problems faced by science and society grow more complex. Is it possible to evolve designs to specification for huge networks that are both robust and efficient? The Internet, for instance, poses enormous challenges as a self-organizing system for gathering, storing, and distributing information. See also: Internet
In biology perhaps the greatest challenge at present is to understand how genetic regulatory networks control growth and development. During the 1970s, Stuart Kauffmann proposed a switching model of genetic control that was based on Boolean networks. Research has increasingly pointed to modular organization within the genome. Development of the eye, for instance, is encoded by a set of genes whose activity is controlled by a single “eyeless gene.” See also: Developmental genetics; Molecular biology
One far-reaching impact of communications and globalization is that the world is increasingly rich in interactions. Cascading interactions often mean that events in one part of the world can affect events on the other side of the globe.