author Henk P. Penning date Sun Dec 3 13:02:06 CET 2006

- The PGP
web of trustcan be viewed as a directed graph where thepointsare the PGP keys, and thearrows(directed lines) are the signatures.- If there is a path from key
Ato keyB, thedistancefromAtoBis the length of theshortestpath fromAtoB.- The
strong setis the largest set of keys such that for any two keys in the set, there is a path from one to the other.- The
mean shortest distance(MSD) of a key is the average distance to that key.- The
avarage mean shortest distance(AMSD) is the average of the MSD's of all keys.

The size of thestrong setis growing ...... and dived down in 2004-12 ; a massive cleanup removed a lot of revoked/expired keys.

... and the average distance between the points is shrinking!

A possible reason : the average degree (number of signatures per key) is rising.

The population of the strong set changes over time, as keys enter and leave the set.In the graph below, for a given date

About half of the keys in the first set (april 2003) have disappeared since then.

- a blue line represents the size of the strong set
- a red line represents the number of keys that disappeared from the strong set

The plot below shows the distribution of degree (the number of signatures per key).The distribution hardly changes with time. The fraction of keys with only one signature is remarkably stable at some 25%.

The graph below shows last month's distribution of degree (number of signatures per key) using a

log/logscale.The graph is very typical for a social network exhibiting the small world phenomenon, a scale-free network.

- the red points show the raw degree/count scatter graph
- the blue points show the average degree per count (level)
- the smooth green line follows the blue points

The graph below shows the distribution of the distances between all PGP keys in thestrong set.Since the

strong setgrows, the number of paths grows too. If you look closely, you can see that the most common distance decreases. In april 2003 it was just below 6, now it is creeping to 5.5.

The plot below shows the same data, but the counts are normalized (divided) by the number of paths in thestrong set:#paths = |G| × (|G| - 1), where |G| is the number of points in the graphThe plot is interesting, because it is so boring! The shape of the distribution curves is very stable; over time, the pattern shifts slowly toward smaller distances.

Some people are interested in key ranking ... See the top 50 and top 1000.

Following the footsie idea of Matthew Wilcox, here is the

footsiefor thestrong set:

What happens to the size of the largest strongly connect component (scc) if we delete points ? The graph will break up and the size of the biggest component will go down.Howthis happens, depends on which points we remove. Below we will see what happens when we removecore,fringeorrandompoints.The plots below show what happens if we remove

kcore keys: the keys with low msd's.To see how the size of the scc (

- the
red lineshows the size of the remaining largest strongly connected component- the
green lineshows the number of strongly connected components- the
blue lineshows (an estimate for) the average mean shortest distance in the largest strongly connected component ; the msd is computed for 10% of the pointsred line) goes to zero, the second plot shows the same data on alogscale.The graph crumbles gracefully until some 10.000 keys are removed. After that, only splinters remain. As expected, the amsd shoots up, to over 15. Note that the

top-1000aren't very special. The crumbling is evenly distributed over the entire range, until there is nothing but graph dust, probably local clusters.

The plots below show what happens if we remove

kfringe keys: the keys with high msd's. To get a better view of the number of components (green line), the second plot shows the same data on alogscale.The graph remains very much intact, all the way down. The number of components remains low and very stable.

At first, the amsd decreases more than average because the the largest connected component becomes 'rounder'.

What happens if we remove

randomkeys ?The scc of the graph remains big, almost all the way down. The amsd (

blue line) remains very stable, between 5 and 6.

This plot shows thefractal natureof the strong set. A random subgraph has the same structure as the entire graph.

## history

- PGP Web of Trust Statistics by Neal McBurnett ; a first analysis of the PGP web of trust, 1997

- keyanalyze - Analysis of a large OpenPGP ring by M. Drew Streib ; a analysis of the PGP web of trust, starting in 2001.
## current

- Dissecting the Leaf of Trust by Jörgen Cederlöf is an analysis of trust relations in the strongly connected component of the pgp web of trust. Data is gathered from the level of trust assigned to signatures by the signer.

- The Footsie Web of Trust analysis by Matthew Wilcox ; shows the state of the Web-of-Trust at regular points of time ; average MSD's of the top 50 and top 1000 keys, etc.

- keyanalyze reports by Jason Harris

- Betweenness Centrality reports by Matthias Bauer ; the computations are based on Ulrik Brandes, A Faster Algorithm for Betweenness Centrality,
Journal of Mathematical Sociology, 25(5):163-177, 2001. The usefulness of a key is measured by the number of shortest paths that run through it.## glossary, theory

- wikipedia.org : PGP, web of trust, graph, graph theory, scale-free network social network, small world phenomenon
- Towards a Theory of Scale-Free Graphs: Definition, Properties, and Implications, Lun Li, David Alderson, Reiko Tanaka, John C. Doyle, Walter Willinger ; Condensed Matter, cond-mat/0501169, december 2004.

The plots were done by Henk P. Penning → home → pgp → pathfinder and key statisticsThe data was kindly provided by Patrick Feisthammel and Jörgen Cederlöf.