Above is a thumbnail from a large network visualization produced for SDFB by the talented folks at KNALIJ. Click here to view the whole image, which is large (~12 MB) but can be zoomed and navigated using your web browser. The image includes only the top nodes and edges in our inferred network. For a rather unwieldy visualization of all 6,000-odd nodes and their edges, without labels, click here.
The proximity of the nodes is determined by their connection strength. If multiple nodes are all connected with a high degree of confidence, they will cluster together. So, for example, you can see members of the Elizabethan court clustered in the bottom left-hand corner. The graph takes the shape of a circle because it’s what’s called a force-directed graph, in which links or edges are treated as springs whose stiffness varies based on confidence estimates. It’s as though the nodes and their connections had been compressed, then left to settle in place in accordance with Hooke’s law for springs and elasticity. Node size is a function of the number of connections, which is why a figure like Charles II is significantly larger than, say, Samuel Palmer. The color of the nodes is an indication of community. Nodes are members of the same community when they share a set number of edges with other members of the community. When a node is part of multiple communities, its color is determined by the community with which it shares the most edges.
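To make the spring analogy concrete, here is a minimal Python sketch of a force-directed layout. It is not KNALIJ’s implementation: the function name `spring_layout`, the pairwise repulsion term, every parameter value, and the four-node example are assumptions made purely for illustration. Each edge pulls its endpoints together with a Hooke’s-law force proportional to its confidence, while a generic repulsion keeps nodes from collapsing onto one another.

```python
import math

def spring_layout(nodes, edges, iterations=2000, step=0.05,
                  rest_length=1.0, repulsion=0.5, max_move=0.1):
    """Toy force-directed layout (illustrative assumptions throughout).

    nodes: list of node names.
    edges: dict mapping (u, v) -> confidence, used as spring stiffness.
    """
    # Start from evenly spaced points on a unit circle so runs are repeatable.
    pos = {}
    for i, n in enumerate(nodes):
        angle = 2 * math.pi * i / len(nodes)
        pos[n] = [math.cos(angle), math.sin(angle)]

    for _ in range(iterations):
        force = {n: [0.0, 0.0] for n in nodes}

        # Assumed repulsion between every pair of nodes (inverse-square).
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = repulsion / dist ** 2
                force[a][0] += push * dx / dist
                force[a][1] += push * dy / dist
                force[b][0] -= push * dx / dist
                force[b][1] -= push * dy / dist

        # Hooke's law along each edge: force = stiffness * (length - rest length).
        for (u, v), confidence in edges.items():
            dx = pos[v][0] - pos[u][0]
            dy = pos[v][1] - pos[u][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = confidence * (dist - rest_length)
            force[u][0] += pull * dx / dist
            force[u][1] += pull * dy / dist
            force[v][0] -= pull * dx / dist
            force[v][1] -= pull * dy / dist

        # Move each node along its net force, capping the step for stability.
        for n in nodes:
            fx, fy = force[n]
            magnitude = math.hypot(fx, fy)
            scale = step if step * magnitude <= max_move else max_move / magnitude
            pos[n][0] += scale * fx
            pos[n][1] += scale * fy

    return pos

def distance(pos, u, v):
    return math.dist(pos[u], pos[v])

# Hypothetical data: A, B and C are linked with high confidence; D is tied to A only weakly.
nodes = ["A", "B", "C", "D"]
edges = {("A", "B"): 0.9, ("B", "C"): 0.9, ("A", "C"): 0.9, ("A", "D"): 0.1}
pos = spring_layout(nodes, edges)
```

In this toy network the three high-confidence nodes settle into a tight triangle, while the weakly attached node drifts further out, which is the same mechanism that draws the members of a court into a visible cluster.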
What is remarkable about the image, from our perspective, is how much meaningful information it displays given the relatively sparse dataset on which it is based. All that we sent to KNALIJ was a matrix of nodes and edges with confidence intervals. But from this minimal data their clustering and community inference algorithms have recovered a great deal of structure.
For example, though our data includes no dates or other temporal information, the graph has an obvious, though not entirely consistent, chronological organization. Starting with the Elizabethan court in the lower left-hand corner, the graph proceeds counterclockwise through the reigns of James, Charles I, and Charles II. Nodes at 12 o’clock are largely post-Restoration and/or 18th-century. Nodes at 10 o’clock are part of James’s Scottish court.
At first we wondered why the center of the graph is basically empty. Then we realized that to occupy the center, a node would need to share edges with communities stretching over 150 years. The empty center is, in effect, a sign of the temporal scope of our network. Presumably a network stretched over a longer time period would have an even more pronounced doughnut hole.
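The reasoning behind the empty center can be sketched with a simplified model. Assume, purely for illustration, that a node’s neighbors are pinned on a unit ring and that each edge is a zero-rest-length spring. The node then comes to rest at the stiffness-weighted average of its neighbors’ positions. The helper names `equilibrium` and `on_ring` and the angles below are hypothetical, not taken from our data.

```python
import math

def equilibrium(anchors):
    """Resting point of a node tied to fixed anchors by zero-rest-length springs.

    anchors: list of ((x, y), stiffness). Setting the net spring force to zero
    gives the stiffness-weighted mean of the anchor positions.
    """
    total = sum(k for _, k in anchors)
    x = sum(k * p[0] for p, k in anchors) / total
    y = sum(k * p[1] for p, k in anchors) / total
    return (x, y)

def on_ring(degrees):
    """Point on the unit circle at the given angle."""
    a = math.radians(degrees)
    return (math.cos(a), math.sin(a))

# Hypothetical node whose neighbors all sit in one stretch of the ring.
local = equilibrium([(on_ring(d), 1.0) for d in (200, 220, 240)])

# Hypothetical node with neighbors spread evenly around the whole ring.
spanning = equilibrium([(on_ring(d), 1.0) for d in (0, 90, 180, 270)])

local_radius = math.hypot(*local)        # stays close to the rim
spanning_radius = math.hypot(*spanning)  # collapses to the center
```

Under these assumptions, only a node with comparably strong ties all the way around the ring ends up in the middle; a node whose ties fall within one period stays near the rim.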
It’s worth at this point acknowledging some of the embarrassing things that this image makes evident about the current state of our inferred network. We still have some named-entity recognition problems. The “Society of Antiquaries” should not show up in our network. There’s more work to do on date limitations, since figures appear from both much earlier (King John) and later (Lloyd George) than our proposed date range of 1550-1700. As we’ve discussed in earlier posts, there are still de-duping problems, especially with regard to monarchs. Some of these should be simple to iron out: King James and James I should not have separate nodes.
But in other cases the duplication provides potentially significant information. Even though James VI of Scotland became James I of England, it is fascinating to see different communities and networks surrounding the two names. Nor, to scholars of the period at least, is it self-evident that James VI of Scotland and James I of England should be treated as the same person. As Jenny Wormald asked long ago, “James VI and I: Two Kings or One?”
King James’s northern and southern subjects shared one attitude: both treated this man, who embarked on his dual role three months short of his thirty-seventh birthday, as their king, dividing him as far as possible into two separate individuals.
At stake in the question “Two Kings or One?” is the category of Britain itself.
In some cases the use of colors to indicate communities shows fascinating breakdowns in social coherence. A light blue Elizabeth I is surrounded by a sea of relatively unbroken light blue Protestantism. But the pink Charles I is cut off from his Laudian community and hemmed in by the darker blues of Cromwell, Fairfax, and Henry Vane. Henrietta Maria appears to have her own small and dispersed community, set apart from the rest of the mid-century milieu, that is more closely connected to the courts of Charles II and James II, and doubtless to the court in exile starting in 1644.
As tempting as it is to turn these images into narrative, it would be unwise to draw any strong conclusions at this point. We can’t be sure which aspects of the visualization are artifacts of our highly imperfect network data, or of the arbitrary thresholds with which the visualization algorithms organize that data into a coherent image. Revised data, or different thresholds (particularly thresholds manipulable by users), could and doubtless will yield very different pictures.
That said, we see the filtered SDFB graph as a rather large map of problems. Why does Henrietta Maria have a community distinct from her husband and those most proximate to her? Why does Jacob Tonson sit so far out in the upper right-hand corner? Why are certain nodes so evidently out of place? When and why don’t communities align with proximity, color with clustering? The problems call out for further explanation, interpretation, and speculation, especially by experts with knowledge of the period, of particular figures, or of graph learning and/or visualization.