Any project that sets out to map a social network - a network of relations between persons - will need to decide how to represent and categorize the types of relationships between people - the various ways that people are associated with one another. The big decision will be a matter of choosing between a controlled vocabulary (we pick a limited number of relationship types in advance) and an uncontrolled vocabulary (users can add new relationship types without restrictions). Some of our initial thinking about this significant decision can be found in this podcast (especially around the 22-minute mark), but this post is intended to survey the advantages and disadvantages of these choices more fully.
The advantage of a controlled vocabulary of types is that all relationships can be sorted, searched, and ordered by finite categories chosen in advance. There’s no danger of users adding redundant or specious types or of unseen overlapping hierarchies. Presumably two nodes can be connected by more than one relationship type (one person could be an uncle and a trade master and a guardian of another), so that one way to achieve specificity is by layering multiple relationship types. But the downsides of controlled vocabularies are even more glaring. They are inherently tendentious. One can always ask why some relationship (teacher and student? father and “natural” (i.e. born out of wedlock) son?) is omitted while others are included. By the same token they are inherently normative, giving recognition to some relationships but not others. They universalize, imposing relationship types across different periods and communities. Controlled vocabularies will normalize and universalize regardless of how carefully one assembles them, subjecting them to both historical critique (are their terms as appropriate in 1500 as in 1700?) and localist critique (do the same types apply in rural communities as in urban ones? in the north as in the south of England?). Historians and literature scholars notice and care about these problems and will (should?) bridle at the constraints such vocabularies place on their ability to characterize connections between people. Worse still, controlled vocabularies standardize as matters of fact precisely the things that historians and literature scholars treat as central matters of concern and debate.
You can see the problem of controlled vocabularies most vividly in any popular social networking site. I may wish to be connected to someone on Facebook but think it imprecise or even absurd to identify her or him as my “friend.” This ends up changing the very definition of the term “friend,” making it into the general type of social relations qua relations, rather than one particular type of relation among others. This problem of imposing relationship categories is exacerbated when we’re aiming to reconstruct the network of a period distant in time and culture from our own.
So why not just use an uncontrolled vocabulary of relationship types? Why not let users characterize relationships with unconstrained subtlety, detail, and specificity? The disadvantages of an uncontrolled vocabulary of types are roughly the negation of its advantages. Uncontrolled types can proliferate endlessly, making them nearly useless for searching, sorting, filtering, or ordering. If, as Aristotle observes, there is no science of particulars, only of categories, then dispensing with categories also dispenses with the science, leaving only the proliferation of disparate, particular relations. Since an uncontrolled vocabulary is not shared between members of a community (you have your preferred types, I have mine), the community lacks a shared set of categories for querying or analyzing the network - or at least, overlap in categories will be the product of local and fleeting agreement. Without a controlled vocabulary of relationship types, it wouldn’t be feasible to filter the network to display only persons related through “Family” or through “Profession,” since those general categories would be thrown into the mix indiscriminately with more specific categories like “Step-Son” or “co-Member of Parliament.” Put simply, an uncontrolled vocabulary of relations would negate many of the practical benefits for which we’ve decided to reconstruct the social network in the first place.
That said, I believe that we (the Six Degrees team) have already decided, of necessity, to use an uncontrolled vocabulary for nodes, which amounts to letting users tag persons with basically an unlimited range of group membership descriptors. There’s no way around this because there’s no principled way to decide, in advance of historical inquiry, what kinds of groups an early modern person could have participated in. The groups in which persons take part change over time, they overlap, and they are debatable (what was the status of the group “Ranters”?) both in their own time and in historical retrospect. The only option for nodes is to have contributors deploy the fullest range of group types, including both general (“Puritan”) and specific (“Arminian”) tags, and without any attempt to impose hierarchical relations. Any attempt to enumerate and categorize all of the radical sects of the late 1640s and early 1650s into a taxonomic scheme would, I think, be to repeat the futile project of Edwards’s _Gangraena_. It would be beset with the problem of overlapping hierarchies. For example, Milton could (arguably) be tagged as a Puritan (or “left Protestant,” anti-episcopal, etc.) and as an Arminian, but Arminianism is a subcategory of Anglican as well, even though Puritan and Anglican are, for most scholars, exclusive categories; a classical categorization scheme wouldn’t work. So no controlled vocabulary and no hierarchy for node types.
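To make the flat, uncontrolled tagging of nodes concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions, not the project's actual data model: the names `tags_by_person`, `tag`, and `persons_tagged` are hypothetical, and the second person is a placeholder. The point is that tags are free text, carry no parent or child relation, and can overlap without conflict.

```python
from collections import defaultdict

# Hypothetical store: each person maps to a flat set of free-text group tags.
# There is no list of allowed tags and no hierarchy between them.
tags_by_person = defaultdict(set)


def tag(person, label):
    """Attach any free-text group tag to a person (uncontrolled vocabulary)."""
    tags_by_person[person].add(label)


def persons_tagged(label):
    """Return, in sorted order, every person carrying exactly this tag."""
    return sorted(p for p, tags in tags_by_person.items() if label in tags)


# The Milton example from the paragraph above: both tags coexist, and nothing
# forces "Arminian" to sit under either "Puritan" or "Anglican".
tag("John Milton", "Puritan")
tag("John Milton", "Arminian")
tag("placeholder person", "Arminian")  # illustrative, not a historical claim

arminians = persons_tagged("Arminian")
puritans = persons_tagged("Puritan")
```

Because the tags are unrelated strings, a query for “Puritan” will not return someone tagged only with a narrower label; that is the cost of declining to impose a hierarchy.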
How different are relationship types from nodes? One intuition is that while groups (i.e. Ranters or members of the “Hartlib Circle”) are highly contingent historical categories, some relationship types have validity across periods and cultures. All periods (so the intuition goes) have notions of what it means to be related by family, even if the kinds of relations that are counted as family relationships vary dramatically between and even within periods and cultures. In the early modern period, as in earlier and later periods, the notion of a “natural” or “bastard” or “illegitimate” child occupies a liminal role in the family, inside in some respects or with respect to some family members, but outside in others. Yet this kind of liminal case, even as it troubles the coherence of the category “family,” at the same time demonstrates its indispensability. A more current example: our concept of what counts as a family has changed dramatically (and for the better, I scarcely need to say) as a result of the gay rights movement. People now speak publicly and proudly of same-sex spouses, same-sex partners, “gaybys” and other relations as family relations in a way that wasn’t the case decades, much less centuries, ago. But this change in the content of the category “family” demonstrates, rather than undermines, the perdurability and generality of the category itself. (It’s unclear whether anti-normativity and anti-marriage gay theorists would think it possible or desirable to dispense with the normative category of family relations tout court; this is a question worth asking.)
Digital humanities projects (as opposed to DH scholarship) force us to stop poking at our basic categories somewhere and make a decision. This halt to fundamental questioning is the thing about DH that makes humanists like me uncomfortable. Humanities disciplines have taught us to think of ourselves as poking, deconstructing, troubling, and questioning categories indefinitely. But the discomfort with any halt to questioning is not peculiar to DH. The decisions required for DH just make it harder to forget that it is only possible to trouble or deconstruct any particular category or set of categories by leaving in place a whole set of background categories and assumptions. This is as true of radical critique as it is of any digital database. Total skepticism about categories just isn’t possible - or desirable, since it would mean the cessation of thought, not thought’s highest pitch. We can trouble anything, but we can’t trouble everything at once. An advantage of a DH project like Six Degrees of Francis Bacon is that it lets us clarify our background categories in a systematic and visible way, essentially disclosing new objects for critique. Its relative positivism (it is concerned to record, store, and make systematically available facts about how people were related) need not be opposed to critiques of categories. Rather, the project can serve as a basis for further critique. In the terms of Bruno Latour, we can’t dispense with “matters of fact” if we hope to pursue “matters of concern” (in this case rethinking relationship types).
In that practical spirit, let me propose one possible way forward on the question of relationship types. Instead of choosing between a practically useful but theoretically indefensible controlled vocabulary, on the one hand, and a theoretically defensible but practically disastrous uncontrolled vocabulary on the other, we should mix the two. High-level, relatively general and perdurable categories of relations – like family relations, work relations, or pedagogical relations – would be controlled. These would, for example, allow users to search and sort and filter all Royalist nodes connected by family relations. But the lower-level types, which would be sub-specifications of the higher levels, would be uncontrolled. That is, we would leave scholars/users free to elaborate the types of familial relationships without constraint, even if this makes it harder to filter and sort coherently at the lower level. We would have a split-level hierarchy, one that (as with node types) would put no constraints on overlapping hierarchies (Master and Apprentice could be classed as both a pedagogical and a professional relation). The network would support debates about the family category by enabling debates over the specific kinds of relations that fall under the general category of family relations. This proposal offers a practical and technical compromise (and by compromise I mean not wholly satisfactory in any respect) to a fundamentally theoretical - i.e. conceptual and ontological - question: what kinds of relations between people are there?
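One way to picture the split-level scheme is the following Python sketch. It is a hedged illustration, not a description of how Six Degrees of Francis Bacon is actually implemented: the class `Relationship`, the function `filter_by_category`, the constant `CONTROLLED_CATEGORIES`, and the placeholder person identifiers are all assumptions introduced here. The three top-level categories are simply the examples named above. Each relationship carries a free-text label (uncontrolled) and one or more top-level categories (controlled), so a single label can sit under several categories at once.

```python
from dataclasses import dataclass

# Assumed controlled upper level: the three example categories from the post.
CONTROLLED_CATEGORIES = frozenset({"family", "work", "pedagogical"})


@dataclass(frozen=True)
class Relationship:
    source: str            # placeholder person identifier
    target: str            # placeholder person identifier
    label: str             # uncontrolled: any free-text sub-specification
    categories: frozenset  # controlled: one or more top-level categories

    def __post_init__(self):
        # Only the upper level is validated; the label is never checked.
        if not self.categories:
            raise ValueError("at least one controlled category is required")
        unknown = set(self.categories) - CONTROLLED_CATEGORIES
        if unknown:
            raise ValueError(f"not in the controlled vocabulary: {sorted(unknown)}")


def filter_by_category(relationships, category):
    """Keep the relationships filed under the given controlled category."""
    return [r for r in relationships if category in r.categories]


network = [
    Relationship("person_a", "person_b", "step-son", frozenset({"family"})),
    # Overlapping hierarchy: one label, two controlled categories.
    Relationship("person_c", "person_d", "master and apprentice",
                 frozenset({"pedagogical", "work"})),
    Relationship("person_e", "person_f", "co-Member of Parliament",
                 frozenset({"work"})),
]

work_labels = [r.label for r in filter_by_category(network, "work")]
family_labels = [r.label for r in filter_by_category(network, "family")]
```

In this sketch, filtering by “work” returns both the master-and-apprentice relation and the parliamentary one, while the labels themselves remain open to elaboration and dispute. Filtering at the lower level stays as unreliable as the proposal concedes, since nothing prevents two contributors from using different labels for the same relation.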