Fork me on GitHub

# ST_GraphAnalysis

### Description

Uses Brande’s betweenness centrality algorithm to calculate closeness and betweenness centrality for vertices and betweenness centrality for edges.

Let $$d(s, t)$$ denote the distance from $$s \in V$$ to $$t \in V$$, i.e., the minimum length of all paths connecting $$s$$ to $$t$$. We have $$d(s, s) = 0$$ for all $$s \in V$$.

Let $$\sigma_{st}$$ denote the number of shortest paths from $$s \in V$$ to $$t \in V$$ and set $$\sigma_{ss}=1$$ by convention. Let $$\sigma_{st}(v)$$ denote the number of shortest paths from $$s$$ to $$t$$ containing $$v \in V$$.

We have the following definitions for vertices:

\begin{array}{l r} C_C(v) = \left(\sum_{t \in V} d(v, t)\right)^{-1} & \qquad \textrm{closeness centrality} \\ C_B(v) = \sum_{s \neq t \neq v \in V} \frac{\sigma_{st}(v)}{\sigma_{st}} & \qquad \textrm{betweenness centrality} \\ \end{array}

Betweenness centrality for edges is defined similarly.

A high closeness centrality score indicates that a vertex can reach other vertices on relatively short paths; a high betweenness centrality score indicates that a vertex lies on a relatively high number of shortest paths.

##### All centrality scores are normalized.

But this normalization depends on the graph being connected. Use ST_ConnectedComponents to make sure you're calling ST_GraphAnalysis on a single (strongly) connected component.

##### A few caveats.

Results will not be accurate if the graph:

• contains "duplicate" edges (having the same source, destination and weight)
• is disconnected. If all closeness centrality scores are zero, this is why.

Though Brande's algorithm is much faster than a naïve approach, it still requires an augmented version of Dijkstra's algorithm to be run starting from each vertex. Thus, calculation times can be rather long for larger graphs.

##### Input parameters
Variable Meaning
INPUT_EDGES Table containing integer columns EDGE_ID, START_NODE and END_NODE; and optionally a weight column w (if the graph is weighted) and/or an edge orientation column eo (required if global orientation is not undirected)
o Global orientation string: directed, reversed or undirected
eo Edge orientation column name indicating individual edge orientations: 1 (directed), -1 (reversed) or 0 (undirected); required if global orientation is directed or reversed
w Edge weights column name
##### Screenshots

Closeness centrality of Nantes, France. Notice the limited traffic zone in the center. Edge betweenness centrality of Nantes, France. Notice how the beltway and bridges really stand out. The above screenshots were generated in OrbisGIS.

### Examples    ##### Exercises
1. Use ST_ShortestPathTree to calculate the shortest path trees for nodes 1 through 5. Then use the definition of node betweenness to calculate betweenness by hand and verify the results above.
2. Make use of ST_ShortestPath to write a script that calculates node betweenness automatically without using ST_GraphAnalysis. Make sure you get the same results as above.
3. Repeat exercise 2 using ST_ShortestPathLength to verify node closeness.