Synchronicity

This is a web-based lesson for teaching an undergraduate, STEM-focused introduction to graph & network theory, synchronicity, and random graphs.

It was made as a midterm project in Studio APMA: Topics in Emergent Behavior, a student-designed class at Brown University. It uses MathJax for math typesetting.

Thanks to Studio Applied Math, our advisor Björn Sandstede, d3.js, jQuery, numeric.js, and MathJax. In addition, thanks to Steven Strogatz at Cornell University, whose research frames almost all of this content.

Treat this page like a textbook. Read it through and do the exercises, marked with ✿. The comments, marked in the left margin by ☞, are clickable for some small notes by your author.

Table of Contents:

0. Introduction

1. Graph as Matrix

2. Small-World Networks

3. Graph Generation

4. Graph Statistics

5. Synchronicity

6. Final Remarks

7. Interactive Force Graph

What is a network? Mathematicians have many names for these—some call them maps (but this term is loaded from other fields of math), some call them graphs, and some call them, well, networks.

We frequently will think of a graph as sets of nodes/vertices \[V=\{A,B,C,D,E,\ldots\}\] and pairwise-defined edges \[E=\{(A,B),(C,D),(A,E),\ldots\}\] Recall that for a set \(S\), its size (number of elements) is written \(|S|\).

A graph is an abstract object and can represent any number of things. Nodes and edges can be cities and roads, friends and friendships, particles and their interlocking forces, sets and the functions between them. We will be discussing graphs in the general case here, but have a few framing questions relevant to many fields:

What different qualitative types of graphs can we depict?

What different statistics can we use to characterize and compare graphs?

What kinds of algorithms can make graphs with certain properties?

And, due to the subject material for Studio Applied Math—how do graphs, or graph construction algorithms, display *emergent behavior*?

Assigning a firm definition to "emergent behavior" is difficult, but broadly we can say it is the display of novel or complex behavior from a collection of simple parts.

The second part of the title, *Synchronicity*, is a layer on top of graphs. If these simple agents are connected (as a graph connects its nodes), can their behavior "synchronize" or fall into some cohesive regime? This depends on the dynamics, the differential equations that we assign to model each agent's behavior. How can dynamical systems interact with graphs, and what can they learn from each other? Do certain graphs lend themselves to emergent dynamics?

Let's imagine a five-node (\(N=5\)) graph called \(G\). \(G\) looks like this:

A common and efficient way of representing \(G\) in mathematical terms is as an *adjacency matrix*: an \(N \times N\) matrix \(A\) whose entry \(A_{ij}\) is \(1\) if nodes \(i\) and \(j\) share an edge and \(0\) otherwise.

If, for example, as in \(G\), the *second* node is connected to the *fifth* node, then the *second* row, *fifth* column position has a value \(A_{25}=1\). If the nodes weren't connected, we would have \(A_{25}=0\).

So, for our graph \(G\), our adjacency matrix is: $$ A = \left( \begin{array}{ccccc} 0 & 1 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 1 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 \end{array} \right) $$
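As a quick concreteness check, here is a minimal Python sketch (my own illustration, not part of the original lesson; node labels are shifted to 0-indexing, and the edge list is read off the drawing of \(G\)) that builds this matrix from the edge set:

```python
# Build the symmetric adjacency matrix for the five-node graph G.
# Node labels are 0-indexed here (node 1 in the text is index 0).
N = 5
edges = [(0, 1), (1, 2), (1, 3), (1, 4), (3, 4)]  # edges of G

A = [[0] * N for _ in range(N)]
for i, j in edges:
    A[i][j] = 1
    A[j][i] = 1  # undirected: the matrix is symmetric

for row in A:
    print(row)
```

Printing the rows reproduces the matrix above, one row per line.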

Up until now, we have been talking about *undirected* graphs: an edge \((A,B)\) is the same as \((B,A)\), which is why the adjacency matrix above is symmetric. One can also consider directed graphs, weighted graphs, and graphs with self-loops.

For the time being, we will stick with simple undirected graphs, so \(A\) is symmetric with zeros along its diagonal.

A graph of \(N\) nodes can have a possible \(\frac{N(N-1)}{2}\) edges.

We will use the adjacency matrix as our primary means of evaluating and manipulating graphs, as a Linear Algebraic framework provides many analysis tools.
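One small example of that payoff: powers of \(A\) count walks. The \((i,j)\) entry of \(A^n\) is the number of length-\(n\) walks from node \(i\) to node \(j\). A self-contained sketch (pure Python, with a hand-rolled matrix multiply purely for illustration):

```python
# The (i, j) entry of A^n counts walks of length n between nodes i and j.
# Demonstrated on the five-node graph G from above.
A = [[0, 1, 0, 0, 0],
     [1, 0, 1, 1, 1],
     [0, 1, 0, 0, 0],
     [0, 1, 0, 0, 1],
     [0, 1, 0, 1, 0]]

def matmul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A2 = matmul(A, A)
# A2[i][i] recovers the degree of node i (each edge gives a walk
# out and straight back): node 1 has degree 1, node 2 has degree 4.
# A2[0][2] counts two-step walks from node 1 to node 3 (via node 2).
print(A2[0][0], A2[1][1], A2[0][2])
```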

Graph theory is a rich and active field; I recommend the Dover edition of *Introduction to Graph Theory* as a gentle introduction from a pure mathematics viewpoint. The study of the eigenvalues and eigenvectors of an adjacency matrix \(A\) and its graph \(G\) is called *spectral graph theory*. Graph-theoretical frameworks prove useful in many kinds of applied analysis, including operations research, finite element methods, and network theory.

In 1998, Duncan Watts, a Ph.D. student at Cornell University, and his advisor Steven Strogatz published a *Nature* paper, *"Collective dynamics of ‘small-world’ networks."* It went on to become one of the most highly-cited physics papers ever.

In it, they present a mechanism for constructing graphs ("networks") with nodes that are well-connected to nearby nodes, with a few long-distance connections tossed in. They sought to imitate networks seen in various fields of application, where graphs were not strictly structured, but not entirely random either. One simple example is a social network: you know your neighbors and people within your communities of work & school, but there are also some relationships at a distance from chance encounters and other contact.

Bret Victor has a beautiful and effective interactive page for describing the Small-World effect paper.

The Watts-Strogatz model has three parameters:

1. \(N\), the number of nodes in the graph.

2. \(k\), the number of nearest neighbors each node is connected to in the initial ring lattice (an even number, \(k/2\) on each side).

3. \(p\), the probability that any given edge is rewired to a random node.

We will discuss the exact Watts-Strogatz (W-S) algorithm in §3. The major findings of the paper boil down to this plot:

That is, the average path length \(L\) drops off faster than the amount of clustering \(C\) as a graph becomes more random. The "small-world" effect is that just a few random connections can make the distance between nodes shrink quickly, without otherwise changing the interconnectivity very much.

The \(L(p)\) and \(C(p)\) values shared in the paper correspond to particular fixed \(N\) and \(k\) values—more research has been done by Watts and associates to uncover deeper relationships to \(N\) especially. A review of much of this research (and many network-theory topics) can be found in Albert & Barabási's 2002 paper *Statistical Mechanics of Complex Networks*.

For a network with parameters \((N, k, p)\), the total number of possible edges is \( e=\frac{N(N-1)}{2} \).

\(L(p)\) is just the average shortest *distance* (in number of edges to travel) between any two nodes in a graph made with rewiring probability \(p\).

\(C_i(p)\) is, for node \(i\), the amount of inter-connectivity among its neighbor points. It reflects how much of a fully-connected "cluster" there is around node \(i\).

With this, \(C(p)\) is just the average of all \(C_i(p)\) values.


$$
L(p) = \frac{1}{e}\sum_{i<j} \mbox{shortest path length from } i \mbox{ to } j
$$
$$
C_i(p) = \frac{\mbox{number of neighbor edges for node } i \mbox{ not rewired}}{k(k-1)/2}
$$
$$
C(p) = \frac{1}{N} \sum_{i=1}^N C_i(p)
$$
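These statistics can be computed directly from an adjacency matrix. The sketch below is my own Python illustration (all function names are mine): it measures \(L\) by breadth-first search, and computes \(C_i\) in the standard form, as the fraction of possible edges actually present among node \(i\)'s neighbors, which for an unrewired lattice agrees with the "not rewired" count above.

```python
# Average path length L and clustering coefficient C for any
# 0/1 adjacency matrix given as a list of lists.
from collections import deque

def shortest_paths_from(A, src):
    """BFS distances from src; unreachable nodes are absent."""
    dist = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v, connected in enumerate(A[u]):
            if connected and v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    return dist

def average_path_length(A):
    n = len(A)
    total, pairs = 0, 0
    for i in range(n):
        d = shortest_paths_from(A, i)
        for j in range(i + 1, n):
            if j in d:
                total += d[j]
                pairs += 1
    return total / pairs

def clustering(A):
    n = len(A)
    cs = []
    for i in range(n):
        nbrs = [j for j in range(n) if A[i][j]]
        k = len(nbrs)
        if k < 2:
            cs.append(0.0)
            continue
        # edges actually present among i's neighbors, over k(k-1)/2 possible
        links = sum(A[u][v] for u in nbrs for v in nbrs if u < v)
        cs.append(links / (k * (k - 1) / 2))
    return sum(cs) / n

# The 7-node, k=4 ring lattice used in section 3:
ring = [[1 if min(abs(i - j), 7 - abs(i - j)) in (1, 2) else 0
         for j in range(7)] for i in range(7)]
print(average_path_length(ring), clustering(ring))  # about 1.33 and 0.5
```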

Note! The *expected value* for \(C(p)\) is just \((1-p)\). This straight line becomes the curve we see in Fig. 3 when put on a log-log scale.

OK, we've seen how the Watts-Strogatz Small-World model works and the statistics they used to capture the "small-world" effect. Unfortunately the \(C(p)\) statistic is boring—I wonder if we can create a better measure for "cluster" effects.

The natural question is: how did Watts and Strogatz make graphs that exhibited this statistical behavior? They carefully made an algorithm that gave them exactly what they wanted: a network that is well-connected locally, with a sparse sprinkling of random long-range shortcuts.

The above graph has an adjacency matrix as such:
$$
\left( \begin{array}{cccccccccccccccccccc}
0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0
\end{array} \right)
$$
See it? See the big diagonals of ones around the central diagonal of zeros? Structure can be difficult to see in a wall of text, so sparse matrices with values of only \(0\) or \(1\) are often visualized as *spy diagrams*, where a dot is placed in the x-y plane for each value of 1. As a spy diagram, this matrix looks like this:

It is very easy to create this lattice-like graph. However, as \(p\) increases, the adjacency matrix's strict structure degrades. Let us walk through the W-S mechanism for creating small-world networks.
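Indeed, building that \(p=0\) starting lattice takes only a few lines. A minimal Python sketch (the function name is mine; printing one character per entry gives a crude text-mode spy diagram):

```python
# Ring lattice: each node links to its k nearest neighbors on a ring,
# k/2 on each side. N=20, k=4 reproduces the matrix shown above.
def ring_lattice(N, k):
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for step in range(1, k // 2 + 1):
            j = (i + step) % N  # clockwise neighbor at this offset
            A[i][j] = A[j][i] = 1
    return A

A = ring_lattice(20, 4)
# Text-mode spy diagram: a bullet for each 1, a dot for each 0.
for row in A:
    print("".join("•" if x else "." for x in row))
```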

Let us do an example with \((N, k, p)=(7,4,0.1)\).

At first, our 7-node, 4-neighbor adjacency matrix clearly reflects the neighbor connections.

After this, we randomly rewire the closest clockwise neighbor for each node (and then second-closest, and third-closest, etc. up to \(k/2\)). When we rewire, we remove the existing edge and connect the node in question to a node it does not already share an edge with.


$$
\left( \begin{array}{ccccccc}
0 & 1 & 1 & 0 & 0 & 1 & 1 \\
1 & 0 & 1 & 1 & 0 & 0 & 1 \\
1 & 1 & 0 & 1 & 1 & 0 & 0 \\
0 & 1 & 1 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 0 & 0 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 & 0
\end{array} \right)
$$

We have \(p=0.1\), so only about a tenth of our connections will get rewired. Let's say the edge at \(A_{3,4}\) is removed (set to \(0\)) and replaced by \(A_{3,6}=1\), updating the symmetric entries as well.

On our second go-around (rewiring second-nearest neighbors), we remove \(A_{6,1}\) and replace it with \(A_{6,2}\). These are our only two rewirings, and our updated adjacency matrix is complete.


$$
\left( \begin{array}{ccccccc}
0 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 0 & 1 & 1 & 0 & 1 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 1 & 0 & 0 & 1 & 1 & 0 \\
0 & 0 & 1 & 1 & 0 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 & 0 & 1 \\
1 & 1 & 0 & 0 & 1 & 1 & 0
\end{array} \right)
$$
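Putting the walkthrough together, here is a sketch of the whole W-S procedure (my own Python rendering; the function name and tie-breaking details, such as skipping a rewire when no valid new endpoint exists, are my choices rather than the paper's):

```python
# Watts-Strogatz: start from the ring lattice, then visit each node's
# clockwise neighbor at offsets 1..k/2 and rewire that edge with
# probability p, reconnecting to a node not already linked.
import random

def watts_strogatz(N, k, p, seed=None):
    rng = random.Random(seed)
    A = [[0] * N for _ in range(N)]
    for i in range(N):                     # build the p=0 ring lattice
        for step in range(1, k // 2 + 1):
            j = (i + step) % N
            A[i][j] = A[j][i] = 1
    for step in range(1, k // 2 + 1):      # nearest, then next-nearest, ...
        for i in range(N):
            if rng.random() < p:
                j = (i + step) % N
                # candidate new endpoints: not i, not already linked to i
                candidates = [m for m in range(N) if m != i and not A[i][m]]
                if candidates:
                    new_j = rng.choice(candidates)
                    A[i][j] = A[j][i] = 0
                    A[i][new_j] = A[new_j][i] = 1
    return A

A = watts_strogatz(20, 4, 0.1, seed=1)
# Rewiring removes one edge and adds one, so the count stays N*k/2.
total_edges = sum(sum(row) for row in A) // 2
print(total_edges)
```

Because each rewiring swaps exactly one edge for another, the total edge count is always \(Nk/2\), no matter how large \(p\) is.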

Also in the late 90s, Réka Albert (now at Penn State) and Albert-László Barabási (now at Northeastern) were studying graph properties of many naturally occurring networks in social and biological realms, and used one crucial statistical measure to frame their approach: the *degree distribution*, \(P(k)\).

\( P(k) \) represents the proportion of all nodes in a graph with

From an adjacency matrix \(A_{ij}\), we can define the degree of node \(i\): $$ k_i = \sum_{j=1}^N A_{ij} $$
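Read off an adjacency matrix, the degree distribution is then one short computation. A Python sketch of my own, reusing the five-node graph \(G\) from §1 (the function name is mine):

```python
# Degrees and the degree distribution P(k): k_i is the i-th row sum,
# and P(k) is the fraction of nodes with each observed degree.
from collections import Counter

def degree_distribution(A):
    n = len(A)
    degrees = [sum(row) for row in A]   # k_i = sum_j A_ij
    counts = Counter(degrees)
    return {k: counts[k] / n for k in sorted(counts)}

# the five-node graph G from section 1
A = [[0, 1, 0, 0, 0],
     [1, 0, 1, 1, 1],
     [0, 1, 0, 0, 0],
     [0, 1, 0, 0, 1],
     [0, 1, 0, 1, 0]]
print(degree_distribution(A))  # {1: 0.4, 2: 0.4, 4: 0.2}
```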

Albert and Barabási found that most of these naturally-occurring networks had degree distributions that were approximate power laws, that is,
$$
P(k) \sim k^{-\gamma}
$$
where \(\gamma > 0\) generally, and \(\gamma\) depends on the exact network but is typically between 2 and 3. A network with a power-law degree distribution is said to be *scale-free*.

The authors created their own algorithm for generating graphs which produce this kind of degree distribution, which we will now detail.

The model begins with \(m_0\) initial nodes, and they are a complete graph (i.e. every node is connected to every other node).

Next, nodes are incrementally added. For each added node \(i\), the probability that \(i\) will be connected to an existing node \(j\) is $$ \Pi = \frac{k_j}{\sum_{m} k_m} $$ where the sum runs over all existing nodes.

Nodes are added until the final desired total \(N\) is achieved. This definition for \(\Pi\) is called *preferential attachment*: well-connected nodes are more likely to receive new connections.

We will now walk through the addition of two nodes starting from a graph with \(m_0 = 4\), the only defining parameter besides \(N\). (Remember, the Watts-Strogatz model has two others, \(k\) and \(p\).)

We begin with \(N=m_0=4\) nodes, fully connected.

$$
\left( \begin{array}{cccc}
0 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 \\
1 & 1 & 0 & 1 \\
1 & 1 & 1 & 0 \\
\end{array} \right)
$$

Now, we add a new fifth node. We calculate its probabilities of connection simultaneously, so it has a \(\frac{3}{12} = \frac{1}{4} = 0.25\) chance of being connected to each of the four existing nodes. Let us suppose only a connection to the third node is made.

$$
\left( \begin{array}{ccccc}
0 & 1 & 1 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 \\
1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 & 0
\end{array} \right)
$$

We will now add a sixth and final node. Its probabilities of connection to each of the previous 5 nodes, respectively, are

\(\frac{3}{14}\), \(\frac{3}{14}\), \(\frac{4}{14}\), \(\frac{3}{14}\), and \(\frac{1}{14}\). Let connections to the 3rd and 4th nodes be made.
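Those fractions can be checked mechanically from the five-node adjacency matrix above (a small Python sketch of my own; exact arithmetic via `fractions` so nothing is lost to rounding; note that \(\frac{4}{14}\) reduces to \(\frac{2}{7}\)):

```python
# Attachment probabilities Pi_j = k_j / sum_m k_m for the sixth node,
# computed from the 5-node matrix after the fifth node was added.
from fractions import Fraction

A = [[0, 1, 1, 1, 0],
     [1, 0, 1, 1, 0],
     [1, 1, 0, 1, 1],
     [1, 1, 1, 0, 0],
     [0, 0, 1, 0, 0]]

degrees = [sum(row) for row in A]        # [3, 3, 4, 3, 1]
total = sum(degrees)                     # 14
probs = [Fraction(k, total) for k in degrees]
print(probs)
```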

And so concludes our graph construction via the A-B model.


$$
\left( \begin{array}{cccccc}
0 & 1 & 1 & 1 & 0 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 \\
1 & 1 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 1 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 1 & 1 & 0 & 0
\end{array} \right)
$$
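Here is a sketch of the full A-B procedure as described in this section: connect each newcomer to each existing node \(j\) independently with probability \(\Pi\). (A caveat: this follows the text's probabilistic formulation; the canonical Barabási–Albert model instead adds a fixed number \(m\) of links per new node. The function name is my own.)

```python
# A-B growth: start from a complete graph on m0 seed nodes, then add
# nodes one at a time, linking the newcomer to each existing node j
# independently with probability k_j / sum(k)  (preferential attachment).
import random

def albert_barabasi(m0, N, seed=None):
    rng = random.Random(seed)
    A = [[0] * N for _ in range(N)]
    for i in range(m0):                   # complete graph on the seed nodes
        for j in range(i + 1, m0):
            A[i][j] = A[j][i] = 1
    for new in range(m0, N):
        degrees = [sum(A[j]) for j in range(new)]
        total = sum(degrees)
        for j in range(new):
            if rng.random() < degrees[j] / total:
                A[new][j] = A[j][new] = 1
    return A

A = albert_barabasi(4, 20, seed=2)
print([sum(row) for row in A])  # degrees; early nodes tend to be largest
```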

Yes, another two-name graph model: the random graph of Paul Erdős and Alfréd Rényi.

In their formulation, a random graph \(G(N,p)\) is built on \(N\) nodes by including each of the \(\frac{N(N-1)}{2}\) possible edges independently with probability \(p\).

They found that many properties—like the size of the largest component, for instance—change their value suddenly, at specific relationships between \(N\) and \(p\). Hence we might call these properties *emergent*, arising suddenly from otherwise innocuous rules.
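For completeness, a sketch of an Erdős–Rényi generator (my own Python illustration; the function name and parameter values are arbitrary choices): each possible edge is flipped on independently with probability \(p\).

```python
# Erdos-Renyi G(N, p): each of the N(N-1)/2 possible edges is
# included independently with probability p.
import random

def erdos_renyi(N, p, seed=None):
    rng = random.Random(seed)
    A = [[0] * N for _ in range(N)]
    for i in range(N):
        for j in range(i + 1, N):
            if rng.random() < p:
                A[i][j] = A[j][i] = 1
    return A

A = erdos_renyi(100, 0.05, seed=3)
edges = sum(sum(row) for row in A) // 2
print(edges)  # the expected count is p*N*(N-1)/2 = 247.5
```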

I have now laid out algorithms for the W-S, A-B, and E-R graph generation models. I will refer to them by these acronyms henceforth.

We saw in §2 that Watts & Strogatz created their algorithm as they did towards specific statistical goals: the average path length \(L\) and the clustering coefficient \(C\), both as functions of \(p\). By tuning our graph-construction algorithms, or creating new ones from scratch, we can achieve a great variety of qualitatively different graphs. Statistics are often excellent ways to quantify these differences.

Let us review the three key statistics we have examined for graphs, on a graph \(G\) with adjacency matrix \(A_{ij}\). $$ P(k) = \frac{\mbox{number of nodes with degree } k}{|G|} $$ $$ C_i(p) = \frac{\mbox{number of neighbor edges for node } i \mbox{ not rewired}}{k(k-1)/2} $$ $$ C(p) = \frac{1}{N} \sum_{i=1}^N C_i(p) $$ $$ L(p) = \frac{1}{e}\sum_{i<j} \mbox{shortest path length from } i \mbox{ to } j $$

Outside of the W-S rewiring mechanism for network creation, our notion of clustering does not make sense. What if \(C\) instead represented the proportion of nodes in the largest strongly-connected component?

Of importance to Erdős and Rényi was the size of the largest connected component, and how it changes with \(N\) and \(p\).

What is happening in this video? The metronomes, simple dynamic pendulums, fall into a state of synchronization with each other, seemingly regardless of their random starting arrangements. How are they communicating—or what is the structure of their inter-connectedness?

We now enter into discussion of another dimension to explore—what happens when we lay a set of differential equations over a graph? Let each node represent a state variable \(x_i\), \(i\in\{1,2,\ldots,N\}\), and let its set of edges determine the form of that state variable's governing equation. Here we enter rich territory: exploring the relationships between the dynamics of the system and the structural properties of the graph.

One example, regularly returned to thanks to its popularization in Steven Strogatz's book on nonlinear dynamics, is the *Kuramoto model* of coupled oscillators.

Let us have \(N\) state variables, \(\theta_i\) where \(\theta_i \in [0,2\pi)\) is periodic. Our dynamics are governed by: $$ \dot{\theta}_i = \omega_i + \sum_{j=1}^N \Gamma_{ij}(\theta_j - \theta_i) $$ where \(\omega_i\) represents the natural frequency of oscillator \(i\), and \(\Gamma_{ij}(x)\) couples one oscillator's dynamics to those of all others.

In the Kuramoto model,
$$
\Gamma_{ij}(\theta_j - \theta_i) = \frac{K}{N}\sin(\theta_j - \theta_i)
$$
where \(K\) is a constant, the *coupling strength*.

*OK—what is going on here?* The speed at which the *i*th oscillator moves, \(\dot{\theta}_i\), is described. This value starts from a fixed given \(\omega_i\) and is further pulled around based on how close it is, on the unit circle \([0,2\pi)\), to all other oscillators. An oscillator will try to "catch up" or "slow down" to meet other oscillators (sending \(\theta_j-\theta_i \rightarrow 0\)), and the rate at which it adjusts to its peers is dictated by the value \(K\).

\(\omega_i\) values are generally assumed to come from a unimodal symmetric distribution \(g(\omega)\).

Kuramoto's key insight in its analysis was to see that each oscillator's dependence on all other oscillators could be simplified. Let us define, in the complex plane, two values \(r\) and \(\psi\) by the equation $$ re^{\sqrt{-1}\psi} = \frac{1}{N}\sum_j^N e^{\sqrt{-1}\theta_j} $$ where I am using \(\sqrt{-1}\) as the imaginary number instead of \(i\) to reserve \(i\) for index notation.

Now let us multiply both sides by \(e^{-\sqrt{-1}\theta_i}\): $$ re^{\sqrt{-1}(\psi-\theta_i)} = \frac{1}{N}\sum_j^N e^{\sqrt{-1}(\theta_j-\theta_i)} $$ Now, if we focus only on the imaginary part, we find: $$ r\sin{(\psi-\theta_i)} = \frac{1}{N}\sum_j^N \sin{(\theta_j - \theta_i)} $$

With this, we can see that for an individual oscillator, $$ \dot{\theta}_i = \omega_i + Kr\sin{(\psi - \theta_i)} $$

Thus we have found that each oscillator is really just interacting with the *bulk* properties of the \(N\) oscillators—only the averages, \(r\) and \(\psi\), affect an oscillator's dynamics at any time \(t\)!
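As a rough numerical illustration of this reduction (not from the original text; the function name and all parameter values are my own choices), here is an Euler integration of \(\dot{\theta}_i = \omega_i + Kr\sin(\psi - \theta_i)\), recomputing \(r\) and \(\psi\) from the current phases at every step. With \(K\) well above the synchronization threshold, \(r\) should climb toward 1:

```python
# Mean-field Kuramoto model via forward Euler, tracking the order
# parameter r e^{i psi} = (1/N) sum_j e^{i theta_j}.
import cmath
import math
import random

def simulate(N=200, K=4.0, dt=0.01, steps=5000, seed=0):
    rng = random.Random(seed)
    theta = [rng.uniform(0, 2 * math.pi) for _ in range(N)]
    omega = [rng.gauss(0, 1) for _ in range(N)]  # unimodal, symmetric g(w)
    for _ in range(steps):
        z = sum(cmath.exp(1j * t) for t in theta) / N
        r, psi = abs(z), cmath.phase(z)
        # Euler step of theta_i' = omega_i + K r sin(psi - theta_i)
        theta = [(t + dt * (w + K * r * math.sin(psi - t))) % (2 * math.pi)
                 for t, w in zip(theta, omega)]
    z = sum(cmath.exp(1j * t) for t in theta) / N
    return abs(z)

r_final = simulate()
print(r_final)  # close to 1 means the oscillators have largely synchronized
```

Lowering \(K\) toward zero in this sketch sends \(r\) back down toward the \(O(1/\sqrt{N})\) value expected of incoherent phases.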

Per Sebastian Skardal has made an excellent stand-alone Kuramoto model visualization app called Synched.

Many questions follow. For a given \(g(\omega)\) distribution, does there exist a steady-state solution, where \(\dot{\theta}_i = 0\) for all \(i\)? If so, is it stable or unstable? These questions remain unanswered in completely general cases, but some specific extensions of the Kuramoto model yield fruit.

Wiley, Strogatz, and Girvan set \(K=N\) and restrict the sum to a range \(k\) wide on either side of the node of interest \(i\). That is, $$ \dot{\theta}_i = \omega_i + \sum_{j=i-k}^{i+k} \sin(\theta_j - \theta_i) $$

Here our earlier work on graphical methods becomes relevant. Instead of coupling each oscillator to *all others*, we connect it only to its \(2k\) nearest neighbors. This is akin to a highly-ordered (\(p=0\)) W-S graph.

Now we have another layer of framework for asking the above questions. Will the existence of a stable equilibrium be affected by rewirings of the graph, with \(p \neq 0\)? How stable are the equilibria, and are there bifurcation points in our parameter space? These questions and more are explored further in the aforementioned Wiley et al. paper.

Most results of research in this field are complicated either in analysis or computation and are outside the scope of this page. As a starting point, see another Strogatz paper, *From Kuramoto to Crawford*. Recall: this is only *one* dynamical system! The possibilities for others are practically endless.

This lesson is meant to provoke curiosity and questions in a few directions. What mechanisms do we have for creating and describing networks? How intertwined are our generation algorithms and the processes by which we evaluate them? What can structural properties of networks tell us about dynamics that may occur on them?

Paul Baran, one of the forefathers of the modern internet, put forth a theory and justification for a distributed network of computers in his 1964 paper, On Distributed Communications. At one point, the following image appears:

We all intuitively, instantly, recognize the difference between these. The first has one central node, the next is formed by a few small clusters, and the last is a fully spread network. But what if a computer had to differentiate them? This is my last question for you.

Using D3, I made an interactive sandpit to create graphs. Click and drag from nodes to create new nodes or new edges, use backspace/delete to remove your selections, and watch statistics change.

Go to the interactive graph HERE.

That's all! Thanks for reading. Send any thoughts, comments, or criticism to *lukas@brown.edu*.