Have you ever read an article that makes claims like, “Plato often talks about W” or “Kant typically associates X and Y” or “In his early work, Nietzsche seldom engages with Z”? I have. When I read these claims, I want to ask simple-minded questions like, “How often?” and, “What do you mean, ‘typically’?” and, “How seldom is seldom?” If these sorts of claims have any evidential value, it should be possible to verify or falsify them. Or — to turn the conditional around — if it’s not possible to verify or falsify them, then these sorts of claims have no evidential value.
As preparation for my in-progress book on Nietzsche’s moral psychology, I’m developing a methodology for quantifying, mapping, and analyzing the concepts used in philosophical corpora. My hope is that this methodology will make it possible to answer the simple-minded questions mentioned above, and that answering these questions systematically will lead to new insights. Furthermore, if my approach is on the right track, it should be fairly easy to retool it for the study of corpora by other philosophers, as well as corpus comparisons between (groups of) philosophers.
The three questions I started with ask about prevalence, association, and change. If we cut up a philosopher’s corpus into chunks and label each chunk based on its semantic content (i.e., whether it contains an expression for concept W, X, Y, and/or Z) as well as bibliographic information (i.e., which book it’s from and when it was published), it becomes possible to answer these questions. A concept is prevalent to the extent that it shows up in a large proportion of passages. It’s associated with another concept to the extent that it’s more likely than chance to be present in a passage when the second concept is present. A concept becomes more prevalent over a philosopher’s career to the extent that it shows up in higher proportions of passages over time.
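These three definitions translate almost directly into arithmetic over labeled passages. What follows is a minimal sketch, not the pipeline used for the book: the toy passages, the function names, and the use of a simple ratio ("lift") as the measure of association are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each passage is (publication year, concepts found in it).
passages = [
    (1878, {"fear", "virtue"}),
    (1878, {"fear"}),
    (1882, {"virtue", "value"}),
    (1882, {"virtue", "value", "laughter"}),
    (1886, {"value"}),
    (1886, {"virtue", "value"}),
]

def prevalence(passages, concept):
    """Proportion of passages that mention the concept."""
    hits = sum(1 for _, concepts in passages if concept in concepts)
    return hits / len(passages)

def lift(passages, a, b):
    """P(a | b) / P(a): above 1 means a is more likely than chance when b is present."""
    with_b = [concepts for _, concepts in passages if b in concepts]
    p_a = prevalence(passages, a)
    if not with_b or p_a == 0:
        return float("nan")
    p_a_given_b = sum(1 for concepts in with_b if a in concepts) / len(with_b)
    return p_a_given_b / p_a

def prevalence_by_year(passages, concept):
    """Prevalence of the concept within each publication year, in date order."""
    by_year = defaultdict(list)
    for year, concepts in passages:
        by_year[year].append(concept in concepts)
    return {year: sum(flags) / len(flags) for year, flags in sorted(by_year.items())}

print(prevalence(passages, "virtue"))         # 4 of 6 passages
print(lift(passages, "virtue", "value"))      # 1.125 on this toy data
print(prevalence_by_year(passages, "value"))  # {1878: 0.0, 1882: 1.0, 1886: 1.0}
```

A lift above 1 is only a descriptive signal; whether it is distinguishable from chance is a separate, inferential question.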
How big or small a passage should be depends on the philosopher in question. It’s natural to make the chunks sentence-sized, since sentences express whole thoughts. It’s also natural to make the chunks paragraph-sized, since paragraphs express more complex thoughts and arguments. In the case of Nietzsche, a handy size is the numbered/titled section. At least after the Untimely Meditations (1873-76), his sections tend to be roughly the same length (half a page to a couple of pages), and they’re the standard unit of reference in the literature. For Plato and Aristotle, Stephanus pagination and Bekker numbering, respectively, furnish natural units of analysis. The choice of unit is a bit arbitrary, but the basic idea is clear.
In labeling sections, it helps enormously to have a searchable digital version of the text. Otherwise, one simply has to read everything by the philosopher under study from cover to cover, keeping a weather eye out for every concept of interest. That’s difficult, stressful, and time-consuming. Fortunately, for many prominent philosophers, searchable digitizations exist. In the case of Nietzsche, I was able to consult the Nietzsche Source (http://www.nietzschesource.org/), which I used to find every passage in which each of the concepts in Table 1 occurs.
Table 1: core constructs and operationalizations for querying the Nietzsche Source
Of course, Nietzsche wrote in German, so I couldn’t just search for the concepts directly. Instead, I had to operationalize each concept with a (disjunction of) word stem(s). Appending an asterisk to a search query returns every passage in which at least one word that begins with the word stem occurs. This is likely to return a few extraneous passages (false positives) and miss a few passages (false negatives), but it’s still highly reliable and reproducible.
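The stem-plus-asterisk query can be imitated offline with a regular expression. This is a rough sketch only: the two German stems below are hypothetical stand-ins (the actual operationalizations are those in Table 1), and the Nietzsche Source search engine may tokenize text differently from Python's `re` module.

```python
import re

# Hypothetical stems for illustration; the real list is the one in Table 1.
STEMS = {
    "virtue": ["tugend"],
    "drive": ["trieb"],
}

def stem_pattern(stems):
    """Regex matching any word that begins with one of the stems (like 'stem*')."""
    alternatives = "|".join(re.escape(stem) for stem in stems)
    return re.compile(r"\b(?:" + alternatives + r")\w*", re.IGNORECASE)

def concepts_in(text, stem_table=STEMS):
    """Return the set of concepts whose stem query matches the passage text."""
    return {
        concept
        for concept, stems in stem_table.items()
        if stem_pattern(stems).search(text)
    }

print(concepts_in("Die Tugenden und der Trieb"))  # both concepts match
print(concepts_in("Der Triebwagen"))              # {'drive'}: a false positive
print(concepts_in("Der Antrieb"))                 # set(): stem is not word-initial
```

The last two calls show both kinds of error in miniature: "Triebwagen" (a railcar) begins with the stem and is caught even though it has nothing to do with drives, while a related word that merely contains the stem is missed.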
I entered identifying information about each of the 3327 passages in Nietzsche’s published and authorized manuscripts into a spreadsheet, then dummy-coded each passage for the presence or absence of each of the concepts of interest. A representative piece of this spreadsheet is displayed in Figure 1.
Figure 1: data structure for cleaning query results from the Nietzsche Source
The section of data pictured in Figure 1 is the preface and first 23 sections of The Anti-Christ, which was published in 1888 and has 64 total sections. Sections 1 and 2 refer to both virtue and value, but not to type or drive. Section 7 refers to instinct, virtue, value, and nobility. Prevalence within a book or within the whole corpus can be calculated by summing a column. For example, there are 152 passages that refer to drive and 303 passages that refer to virtue. Overall, there are 4439 total references. Co-occurrence of a pair of constructs within a passage is a bit more complicated: a pair of constructs co-occur when the columns associated with both constructs have a ‘1’ in the same row. Since there are 39 constructs in this dataset, there are 741 potential co-occurrence pairs or edges (38+37+…+1).
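The column sums and co-occurrence counts described above can be sketched in a few lines. The four rows below are hypothetical and do not reproduce Figure 1; only the arithmetic for the number of possible pairs (39 constructs, 741 pairs) comes from the text.

```python
from itertools import combinations
from math import comb

CONSTRUCTS = ["drive", "instinct", "virtue", "value"]

# Hypothetical dummy-coded rows: 1 = the construct occurs in the passage.
rows = [
    {"drive": 0, "instinct": 0, "virtue": 1, "value": 1},
    {"drive": 0, "instinct": 0, "virtue": 1, "value": 1},
    {"drive": 1, "instinct": 1, "virtue": 0, "value": 0},
    {"drive": 0, "instinct": 1, "virtue": 1, "value": 1},
]

def column_sums(rows, constructs):
    """Number of passages in which each construct occurs."""
    return {c: sum(row[c] for row in rows) for c in constructs}

def cooccurrence_counts(rows, constructs):
    """Number of rows in which both members of each pair are coded 1."""
    counts = {}
    for a, b in combinations(constructs, 2):
        n = sum(1 for row in rows if row[a] and row[b])
        if n:  # keep only pairs that co-occur at least once
            counts[(a, b)] = n
    return counts

print(column_sums(rows, CONSTRUCTS))
# {'drive': 1, 'instinct': 2, 'virtue': 3, 'value': 3}
print(cooccurrence_counts(rows, CONSTRUCTS)[("virtue", "value")])  # 3

# Potential edges among 39 constructs: 38 + 37 + ... + 1 = C(39, 2).
print(comb(39, 2), sum(range(1, 39)))  # 741 741
```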
For the book, I’ll use this data structure to construct timelines, treemaps, section-by-section guides of each book, and semantic network visualizations, as well as to calculate inferential statistics such as Fisher’s exact test. In this post, I’m just going to show some of the network visualizations. I built these visualizations by converting the data to an adjacency list, then uploading the list to Gephi, an open-source network visualization application. In previous work, I’ve collaborated with Andrew Higgins and Jacob Levernier to map psycho-semantic networks of values, virtues, and constituents of wellbeing extracted from obituary texts.
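To make two of these steps concrete, here is a hedged sketch of an export for Gephi and of Fisher's exact test on a pair of concepts. Several things are assumptions: the export is a weighted edge list rather than the adjacency list mentioned above, the column names (Source, Target, Type, Weight) are the ones Gephi's spreadsheet importer is generally understood to recognize, and the one-sided form of the test is a choice made here for illustration, not necessarily the one used in the book.

```python
import csv
import io
from math import comb

def edge_list_csv(weights):
    """Render {(a, b): weight} as CSV text, one undirected edge per row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["Source", "Target", "Type", "Weight"])
    for (a, b), weight in sorted(weights.items()):
        writer.writerow([a, b, "Undirected", weight])
    return buf.getvalue()

def fisher_exact_greater(both, only_a, only_b, neither):
    """One-sided Fisher's exact test on a 2x2 table of passage counts.

    Returns the probability of seeing at least `both` co-occurrences if the
    two concepts were distributed independently (hypergeometric upper tail).
    """
    n = both + only_a + only_b + neither
    with_a = both + only_a  # passages containing concept a
    with_b = both + only_b  # passages containing concept b
    upper = min(with_a, with_b)
    tail = sum(
        comb(with_a, k) * comb(n - with_a, with_b - k)
        for k in range(both, upper + 1)
    )
    return tail / comb(n, with_b)

print(edge_list_csv({("virtue", "value"): 3}), end="")
# Hypothetical counts: 3 passages with both, 1 with only a, 1 with only b, 3 with neither.
print(fisher_exact_greater(3, 1, 1, 3))  # 17/70, about 0.243
```

The test uses only the standard library so that it stays self-contained; a statistics package would normally be the more convenient choice for a full corpus.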
The current project is similar, but, instead of working from obituaries to map laypeople’s normative structures, I’m working from Nietzsche’s writings to map his moral psychology. I made one overall map based on all of the data (Figure 2), as well as maps associated with each book. Nietzsche had a habit of republishing his books with new prefaces and new sections (e.g., Human, All-too-human and Gay Science); in those cases, I made a map of both the original book and the revised book. This resulted in 23 maps, starting with The Birth of Tragedy in 1872 and ending with Ecce Homo in 1889.
Figure 2: semantic map of Nietzsche’s overall moral psychology
Here is how to read such a map: the size of a node indicates its “weighted degree.” This is the sum of all of the co-occurrences of the concept in question. For example, suppose X occurs in three separate passages of a book. In the first passage, it co-occurs with Y but no other concept under study. In the second passage, it co-occurs with Y and Z but no other concept under study. And in the third passage it again occurs with Y but no other concept under study. X would then have a weighted degree of 4 (3 from Y plus 1 from Z). Size is thus a rough indicator of connectedness and therefore of prevalence in a moral psychological context. Edge width directly indicates weight: the wider an edge between a pair of nodes, the more frequently the concepts associated with those nodes co-occur. To clean up the “hairball” effect that emerges from having too many overlapping edges, I also dropped edges with low weight (sometimes just weight 1, sometimes 2 or 3 — it’s a bit arbitrary, but the idea is to cut enough noise to make the graph legible to the eye). The color of a node indicates its membership in a “community” or modularity group. The math here is a bit hairy, but the basic idea is that nodes are classed into the same community with other nodes that they tend to co-occur with, and into a different community from nodes that they tend not to co-occur with. Edge color is determined by the nodes the edges connect. If both nodes are blue, the edge will also be blue; if one is blue and the other orange, the edge will fade from blue to orange. Finally, the position of a node is determined holistically based on three forces: 1) all nodes are attracted to the center of the graph, 2) all nodes repel each other, and 3) a node attracts other nodes based on the weight of the edge connecting them.
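The weighted-degree example and the edge pruning can be checked with a short sketch. The three passages are the X, Y, Z example from the paragraph above; the function names and the pruning threshold of 2 are illustrative assumptions, not Gephi's internals.

```python
from collections import Counter
from itertools import combinations

# The worked example: X occurs in three passages, with Y each time and Z once.
passages = [{"X", "Y"}, {"X", "Y", "Z"}, {"X", "Y"}]

def edge_weights(passages):
    """Count, for every pair of concepts, the passages in which both occur."""
    weights = Counter()
    for concepts in passages:
        for a, b in combinations(sorted(concepts), 2):
            weights[(a, b)] += 1
    return weights

def weighted_degree(weights):
    """Sum of the weights of all edges touching each node."""
    degree = Counter()
    for (a, b), weight in weights.items():
        degree[a] += weight
        degree[b] += weight
    return degree

def prune(weights, min_weight):
    """Drop edges lighter than min_weight to thin out the 'hairball'."""
    return {edge: w for edge, w in weights.items() if w >= min_weight}

weights = edge_weights(passages)
print(weighted_degree(weights)["X"])  # 4 (3 from Y plus 1 from Z)
print(prune(weights, min_weight=2))   # {('X', 'Y'): 3}
```

Note that Y also ends up with a weighted degree of 4 here, because its single co-occurrence with Z counts toward its total as well.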
There are three modules in Figure 2: 1) a group of (mostly) emotions in blue, 2) a smaller group of normative statuses in green, and 3) a group of psycho-social constructs in orange. Among the most prominent emotions are doubt, contempt, disgust, trust, sadness, joy, guilt, and curiosity.
You might find this map a bit surprising. When we teach Nietzsche to our students, we tend to focus on resentment, leaving out most of the other emotions that he actually talks about. My hunch is that this is because most translations of Nietzsche into English leave ‘ressentiment’ in the French and always italicize it, despite the fact that Nietzsche only italicizes it twice and only refers to it in a couple dozen passages. This distracts readers and leads them to fetishize resentment and ignore the other emotions.
You might also strain your eyes looking for ‘will to power’ — another Nietzschean construct that gets a lot of airtime despite playing only a modest role in his moral psychology. In Figure 2 it’s the little node between ‘value’ and ‘guilt’ (abbreviated ‘wtp’). Again, my guess is that because Nietzsche sometimes puts this striking phrase in italics, it’s received an undue amount of attention in the secondary literature. What Nietzsche actually talks about when he engages with moral psychology, though, is concepts like virtue, value, instinct, fear, doubt, emotion, contempt, courage, nobility, disgust, laughter, solitude, drive, and forgetting. Some of these concepts receive adequate attention in the secondary literature, but many don’t. Just to single out a few, check www.philpapers.org for contempt, courage, and solitude.
To see how Nietzsche’s views change over time, we can look at all of the maps, but it may be even more helpful to compare maps of his significantly revised books. Figures 3-6 show Human, All-too-human as it existed in 1878, 1879 (with the addition of “Assorted Opinions and Maxims”), 1880 (with the addition of “The Wanderer and His Shadow”), and 1886 (with the addition of prefaces for both the original book and “Assorted Opinions and Maxims”). Figures 7-8 show The Gay Science in 1882 and 1887 (with the addition of book 5 and a new preface).
Figure 3: semantic map of the moral psychology of Human, All-too-human in 1878
Figure 4: semantic map of the moral psychology of Human, All-too-human in 1879
Figure 5: semantic map of the moral psychology of Human, All-too-human in 1880
Figure 6: semantic map of the moral psychology of Human, All-too-human in 1886
Figure 7: semantic map of the moral psychology of The Gay Science in 1882
Figure 8: semantic map of the moral psychology of The Gay Science in 1887
These progressions suggest a few observations about Nietzsche’s changing positions. Start with Human, All-too-human. First, virtue and value move from the periphery to the center, and their node sizes increase, indicating that Nietzsche becomes more interested in these concepts over time. Second, obligation shrinks and moves to the periphery, indicating that Nietzsche is moving from an ethic of rights and duties to an ethic of virtue. Third, drive and instinct start in the same modularity class but show up in different modularity classes in all three of the updated versions of the book. This suggests that Nietzsche may not use them interchangeably (a bone of contention in the secondary literature). Turn now to Gay Science. While drive and instinct are in the same modularity class in both versions of this book, there are other interesting changes. First, whereas fear is by far the most prominent emotion in Human, All-too-human, it is significantly demoted in Gay Science, where it sits side-by-side with disgust, joy, contempt, and doubt. This suggests that Nietzsche moves from a moral psychology centered on the fear of an individual in a state of nature in Human, All-too-human to a moral psychology that also makes a place for hierarchies maintained by contempt and disgust, as well as the skeptical epistemic emotion of doubt. Second, laughter plays a much larger role in Gay Science, which should be unsurprising given the title of the book. Laughter plays an even larger role in Thus Spoke Zarathustra (1885 edition, Figure 9), which is often thought of as a ponderous book even though it’s full of giggles.
Figure 9: semantic map of the moral psychology of Thus Spoke Zarathustra, 1885 edition
I’ll end with my favorite book of Nietzsche’s: Beyond Good and Evil (Figure 10).
Figure 10: semantic map of the moral psychology of Beyond Good and Evil, published 1886
This map once again places drive and instinct in separate modularity classes. It also continues to give a major role to the emotion of fear while contextualizing it with other emotions such as contempt, disgust, curiosity, doubt, surprise, and — to a lesser extent — admiration, sadness, trust, and guilt. Will to power plays a minor role, and resentment is completely absent.
A digital humanities approach like the one on offer here doesn’t answer all of the questions we might have about Nietzsche, but it does help to answer some of them and suggests under-explored lines of inquiry. I hope that other researchers find it useful and that some will be inspired to employ these methods in their own work — whether on Nietzsche or on other philosophers.