Network Structure
"Connected": a word of many meanings
The word "connected" speaks to the most basic structural properties of networks. It is arguably both the most important and the most overused term in the network vocabulary.

Important and proper uses of "connected":

Connected has several distinct formal definitions, each of which is important. We introduce three of them here and elaborate the details later.

  • A path connects its origin and destination nodes
  • Two nodes are connected if there is at least one path that connects them
  • A graph is connected if each of its nodes is connected to all the others

Examples:

  • Nodes 6 and 2 are connected in graph G1 below
  • Nodes a and b are not connected in graph G2 below
  • Graph G1 is connected; graph G2 is not connected
 connectednot connected

 Graph G1 = (V1,E1)

 Graph G2 = (V2,E2)

Common sloppy uses of "connected":

Sometimes "connected" is used informally when another term would be more clear. For example, avoid using "connected" when one of these unambiguous words would apply:

  • Joined
  • Adjacent
  • Incident
 
Induced Subgraphs

It is often useful to consider only part of a graph. Induced subgraphs are one particularly convenient way to define a specific sub-part of a larger graph.

Given a graph G=(V,E), an induced subgraph is defined by a subset of vertices S (i.e., S ⊆ V). Then

  • The nodes of the subgraph induced by S are simply the nodes in S, and
  • The edges of the subgraph induced by S are all edges in E that have both endpoints in S.
We can write the induced subgraph formally as GS = (S,ES), where ES is the set of edges defined above (edges in E that have both endpoints in S).

Example 1: Consider the graph G=(V,E) drawn below.

graph

The subgraph induced by the subset of nodes S={1, 2, 3, 4} can then be drawn:

induced subgraph

Example 2: Let G=(V,E) be the Facebook friends network:

  • V = {x: x has a Facebook account}
  • E = {{x,y}: x and y are Facebook friends}

This is a big network: if we want to draw it, we must consider a small subgraph. One way to define a subgraph is to run the Touchgraph Facebook browser and ask to see your top 4 friends, not including yourself. When Bruce Hoppe does that, Touchgraph calculates his top 4 friends as the set S={KenM, AlanP, MarkB, Patti}.

Although the notion of "top 4 friends" is admittedly murky, once we have defined a subset of Facebook accounts, such as S={KenM, AlanP, MarkB, Patti}, then that subset S unambiguously induces a subgraph of the Facebook friends network. In this case, Touchgraph draws the induced subgraph for us:

touchgraph subgraph

 
"Connected" defined formally

Earlier we stated that

  • A path connects its origin and destination nodes
  • Two nodes are connected if there is at least one path that connects them
  • A graph is connected if each of its nodes is connected to all the others

"Connected nodes" defined more formally:

Given an undirected graph G=(V,E), two nodes x and y (i.e., x∈V and y∈V) are connected if and only if one of the following is true:

  • x=y, or
  • there is at least one path in G with origin x and destination y

The most important clarification in our formal definition of connected nodes is that any node is by definition connected to itself.

 
Connected graphs and connected components

A connected graph (as defined above) is said to consist of a single connected component. If a graph is not connected, then it consists of two or more connected components.

Six Degrees offers this intuitive metaphor for connected component: Suppose the nodes of a graph are buttons and the edges of the graph are threads joining the buttons. The buttons and threads are all on the floor. Grab one button and lift it. If that button has any threads, then other buttons will start to rise in addition to the one button you are holding. Each additional button that rises off the floor may have even more threads that cause even more buttons to rise. Keep lifting until every button connected to the one in your hand (by any combination of threads) is off the floor. The buttons and threads that have risen off the floor are a connected component.

The following examples illustrate the idea of connected component.

Graph G1 is connected and therefore has 1 connected component:

connected graph

Graph G2 has 2 connected components:

2 components

Graph G3 has 3 connected components:

3 components

A connected component can be defined as an induced subgraph. For example, G3 (above) consists of three connected components:

  • the subgraph induced by {1,2,5,6}
  • the subgraph induced by {3,4}
  • the subgraph induced by {7,8}
 
Hubs

A hub is a node in a graph with much higher degree than average. For example, nodes 1 and 2 are both hubs in the figure below:

hubs
 
Clusters

Informally speaking, a cluster is a group of nodes in a graph that are more connected to each other than they are to the rest of the graph. For example, the red and yellow regions below are clusters:

clusters
 
Defining clusters, part one: connected components

The following sections present a "semi-formal" definition of clusters in three parts:

  1. First, we formally define connected component
  2. Second, we introduce and formally define clique
  3. Third, we informally define cluster based on the above two formal definitions

Given an undirected graph G=(V,E), we define a connected component to be a subgraph induced by node set SV (i.e., GS = (S,ES)) with the following two properties:

  1. GS is connected
  2. There is no edge in E that joins a node in S to a node not in S

The above 2 mathematical properties of a connected component translate into the button-and-thread metaphor as follows:

  1. GS is connected: The only buttons that rise off the floor do so because of the one button you are lifting and the threads that ultimately connect that button to other buttons (i.e., paths)
  2. There is no edge in E that joins a node in S to a node not in S: You have lifted the one button in your hand high enough so that no more buttons will come off the floor no matter how much higher you lift the button you are holding.

For example:

The graph G2 (drawn above and again below) is not a connected component because it violates property #1: it is not connected.

2-comp

The subgraph induced by S={c,d,e,f} (drawn with dark nodes and edges below) is not a connected component because it violates property #2. There are edges in E that join nodes in S to node b, which is a node not in S.

not component

 
Defining clusters, part two: cliques

In an undirected graph G=(V,E) a clique is a subgraph induced by node set SV (i.e., GS = (S,ES)) such that the density of GS is 1. Put another way, in a clique, every pair of nodes is adjacent.

Example: Consider undirected graph G=(V,E) drawn below.

2-comp

The following are cliques:

  • The subgraph induced by {c,d,e,f}
  • The subgraph induced by {b,f}

The following are not cliques:

  • The subgraph induced by {a,b,c}
  • The subgraph induced by {b,c,d,e}, which is drawn below
bcde
This subgraph is not a clique:
Its density is less than 1.

The subgraph induced by {c,d,e,f} is the largest clique in G: the clique with the most nodes. Usually the largest clique in a graph is the most interesting clique in that graph.

 
Defining clusters, part three

Our definition of cluster informally draws upon our formal definitions of connected components and cliques.

In an undirected graph G=(V,E) a cluster is a subgraph induced by node set S⊆V (i.e., GS = (S,ES)) with the following two properties:

  1. The average degree of GS is "relatively high"; (a relaxed adaptation of clique-ness)
  2. There are "relatively few" edges in E that join a node in S to a node not in S; (a relaxed adaptation of connected component-ness)

Example: In the graph G=(V,E) drawn below, the following subsets of nodes induce subgraphs that can fairly be called clusters:

Subsets of nodes:
  • {1,2,3,4,5,7}
  • {20,21,22,23,24,25}
  • {14,15,16}
Corresponding clusters: 
clusters

Not all connected components are clusters. For example:

  • In the graph above, the subgraph induced by {9,10,11,12,13} is a connected component, but it is arguably not a cluster, because the average degree of that induced subgraph is relatively low.
  • The subgraph induced by {6} is a connected component, but it is definitely not a cluster, because the average degree of the induced subgraph is zero.

Not all cliques are clusters--even relatively large cliques. For example:

  • The subgraph induced by {1,2,3,4} is a clique, but it is not a cluster because every single one of those nodes is adjacent to node 5; there are too many edges joining nodes in {1,2,3,4} to nodes not in {1,2,3,4}.
  • Even the largest clique in the above graph--the subgraph induced by {1,2,3,4,5}--is arguably still not a cluster because node 7 is adjacent to so many nodes in {1,2,3,4,5}.
 


Joomla Templates by Joomlashack