Research Report by Jamal Russell
Created 2 July 2018 (rev. 31 August 2020)
For the WhatEvery1Says project and the digital humanities writ large, the edition paradigm must be considered as a theoretical construct composed of three separate conceptualizations of the edition: the scholarly edition itself, the digital scholarly edition, and the scholarly edition of a literary system as described by Katherine Bode in her book A World of Fiction. What unites these three theories of the edition is their consistent focus on the centrality of the document as a repository for the historical situatedness of discourse, and their commitment to constructing a scholarly apparatus that will not only preserve as many of the material specificities of the document as possible, but will also function as a tool by which the historical context of the documents in question can be conveyed properly to readers.
The relationship that the WE1S project has to the edition paradigm, and specifically to the scholarly edition of a literary system, is quite fraught. While the WE1S project can learn much from the edition paradigm, particularly by examining how both WE1S and those theorizing the scholarly edition of a literary system, such as Bode, are invested in the way in which each project’s respective objects of analysis (the word/topic/concept examined through the topic modeling interface for the former; the document for the latter) convey the historical movements of discourse over time, it must also be understood that these are two separate projects. I say this because the fundamental concepts underpinning the projects differ, the material means by which each project pursues its goals differ, and their engagement with the media ecologies of their objects of analysis differs as well.
The edition paradigm, particularly in relation to the digital humanities and how the field employs the paradigm via the design of digital editions, can be said to be a theoretical construct composed of three separate conceptualizations of the edition:
The first consists of theories of the scholarly edition itself, which determine how the concept was, is and continues to be applied to work done with print-based and born-digital documents of all types. A simple and concise definition of the scholarly edition is provided by Patrick Sahle, who states that “a scholarly edition is the critical representation of historic documents”.1 From this definition, Sahle specifies and elaborates on four fundamental terms that he deems critical to his definition of the scholarly edition. The first term is “representation,” which refers to “the recoding of a document or an abstract work and its transformation in the same or another kind of media”.2 The second term is “critical,” a term which “must stand for all processes that engage in a critical or reflective way–that is, on the basis of a scholarly agenda–with the material in question”.3 This encapsulates work such as identifying textual structures, describing the treatment of texts within an edition, and deciding with which additional material, to what extent, and in what form an edited text should be contextualized for purposes of accessibility and comprehensibility, as well as larger disciplinary paradigms such as those of philology. “Critical,” in this context, becomes a “container for all those activities that apply scholarly knowledge and reasoning to the process of reproducing documents and transforming a document or text into an edition”.4 The third term is “documents,” which is used as a way of keeping in mind the relationship between the material document that an editor handles, and the “abstract notion of text or work” that editions are built upon.
Finally, the last term that Sahle elaborates on is “historic,” which he uses to articulate how editions “explore the uncharted circumstances of documents, texts and their transmission” and, in doing such work, “bridge a distance in time, a historical difference” for a given reader.5 In this sense, the edition as Sahle describes it can be understood as the material and methodological means by which a particular text or set of historically situated texts are made comprehensible to a reader living in a different epoch or culture.6
Such a conception of the scholarly edition points toward the influence of D.F. McKenzie’s 1986 collection of lectures Bibliography and the Sociology of Texts on the practice of textual criticism and scholarly editing, in particular, his definition of bibliography as “the study of the sociology of texts”.7 Bibliography, and by extension the scholarly edition, “has an unrivalled power to resurrect authors in their own time” by using the historically and culturally situated material text as the way in which a reader or editor can “gain access to social motives”.8 One begins to understand how Sahle’s categories interact in the design of a scholarly edition, as the affordances of the material text are the means by which documents “[carry] the history of [their] own making” and, thus, “bears along with and as itself the gathered history of all its engagements.”9 The primary objective of the critical process underpinning the construction of the scholarly edition is the creation of an apparatus that employs the historical, material and cultural specificities of its documents for the purpose of, to paraphrase McGann, putting the relations of primary text and curated critical and archival materials into operation “as a machine to be used for accessing, analyzing, and evaluating” the historical and cultural record and assessing the material histories and histories of transmission that surround a given text.10
The second element of the Edition paradigm, the theory and practice of creating a digital scholarly edition, follows from both McKenzie’s theories of bibliography, as well as the issues surrounding the material affordances of historical documents and the processes underpinning the construction of the scholarly edition. Concerning the former, what is most notable about McKenzie’s definition of text is that it does not limit itself to print-based documents. He, in fact, expands it to include “verbal, visual, oral, and numeric data, in the form of maps, prints, and music, of archives of recorded sound, of films, videos, and any other computer-stored information, everything in fact from epigraphy to the latest forms of discography.”11 With the latter issues articulated above in mind, two questions follow from McKenzie’s words: First, how will the new media of the day allow bibliographers, textual critics, and scholarly editors to do the historical and critical bridgework that defines those academic fields? And second: how will the shift in medium from the book to the computer (bracketing for now the complex questions surrounding the computer as a medium) affect how the material affordances of the historically/culturally situated work will be represented using the scholarly edition apparatus?
In Radiant Textuality, McGann begins to answer these questions by focusing on the problematic of employing the book/codex form as the primary tool by which scholars and critics study bookbound and print-based materials. McGann argues that, when the apparatus used to analyze the operations of another apparatus is of the same order, whether it be a book or merely a hard-copy text in general, “the scale of the tools seriously limits the possible results” one can glean from said analyses.12 In McGann’s formulation, then, the use of electronic tools in textual scholarship and the corresponding design of scholarly digital editions does not merely have the function of providing scholars with new tools of literary criticism, but also serves to comprehensively “lift one’s general level of attention to a higher order.”13
McGann’s claim is worth investigating further. Scholarly editions were initially developed in response to the complexity of literary works, specifically works that had “evolved through a long historical process,” as the Bible or Shakespeare’s works had done.14 The different mechanisms developed to deal with these transformations (the majority of which fall under the general umbrella of the “scholarly edition”) were delimited by the affordances of the book as a medium, and how those affordances affect how information is organized and accessed. The organization of information in a codex is linear but able to be accessed randomly, the structure of which is revealed to the reader by a table of contents, appendices, indexes, introductions to the edition, etc., all of which can be used to aid navigation through the text and access to information about the text the edition is built around.15 Nevertheless, the apparatus’s representation of historical and material specificities of documents within a print-based scholarly edition will be linear, and even historically teleological at times.
These logical and informational structures are those of both the scholarly edition and the majority of the materials that said editions will be built to analyze. According to McGann, because the logical and informational structures of the analytical tool are on the same level as those of the object of analysis, the power of the scholarly edition’s analytical mechanisms is checked. This becomes especially problematic when one wants to analyze either a non-semantic aspect of the text or connect documents to one another in ways beyond the affordances of the codex form. Thus, the point of “hyperediting” (as McGann terms born-digital techniques of textual editing in Radiant Textuality) and resultant digital scholarly editions is not only to “secure freedom from the analytic limits of hardcopy text,” but also to use digital tools to bring the techniques of bibliography, textual criticism, and book history to bear on historical texts in new, qualitatively different ways as a part of a new order of textual criticism and analysis.16 In other words, if the baseline of an understanding of the scholarly edition as a theoretical construct is that it is a “web of discourses…interrelated and of equal understanding,” then the digital scholarly edition should be able to model such an understanding of the edition at the level of design and process.17
McGann illustrates how this shift occurs using two related examples of edition construction documented in Radiant Textuality. In the first example, McGann was asked to edit the New Oxford Book of Romantic Period Verse and wanted the edition’s texts to retain as much similarity to the original text as possible. To this end, this edition would also be designed around how the works of Romantic poets such as William Blake exploited the affordances of the print medium for expressive effects via the illustrations that accompanied and articulated the text of his poems. A scholarly edition of Blake’s work, in the eyes of McGann, would need to include the various illustrations, paintings, engravings, and lithographs Blake made over his lifetime as well as the text of the poetry itself. Unfortunately, McGann’s plans to do so with the New Oxford Book of Romantic Period Verse were, for the most part, shot down. Given that authors such as Blake operate within two media, McGann considers scholarly editions of such works that do not take this into consideration to be fundamentally incomplete, as they do not allow the critic to link documents to other media forms to analyze how they may have interoperated within a given author’s oeuvre. What would be needed here, in McGann’s words, is not a critical edition of Blake’s work, but a critical archive. This archive, he writes, must be able to “accommodate the collation of pictures and the parts of pictures with each other as well as with all kinds of purely textual materials.”18
McGann’s initial answer to these issues of the print-based scholarly edition was his second example: The Rossetti Archive, which began development in 1993 and was completed in 2008. The Rossetti Archive was designed to escape the limitations of the print-based scholarly edition as described in Radiant Textuality by orienting the user’s focus not on the single textual document that has been reconstructed and contextualized by a given textual critic(s), but on the relationship between the numerous documentary remains of a given author or set of authors. This is necessary because for the Rossetti Archive, the documentary remains it is constructed around are not only very numerous, but the history of those documents is not fractured or incomplete. Thus, the goal of an edition or archive crafted around the works of a Dante Gabriel Rossetti (or, for that matter, a William Blake) is to “sort out the relationships of the documents and put all of those relationships on display” in a manner that conveys both the media poetics (so to speak) of the authors’ works, as well as the historical context within which the works operated.19 To use the words of Hans Walter Gabler, the 70,000 digital files and 42,000 hyperlinks of the Archive are organized and designed within its architecture so that any given user is able not merely to shift their focus between the relationships between different types of documents and media forms, but also to generate and view material instantiations of the different webs of discourses running throughout a given digital edition.
Because computational media affords this mapping of relationships between works and versions of works as manifested in different documents, it is important that as many specificities of the documents’ original media are retained as possible. Mats Dahlstrom terms this process a media translation of materials from one medium or document type (print, poem, painting, codex, etc.) to a born-digital format that allows specific features of the document in question to be “preserved…carved into the flesh of the new medium and be expressed by its architecture and the language of its web of signs.”20 While Dahlstrom calls this a translation and Sahle calls it a representation, it would be more prudent to call this process a remediation (to use the term introduced by Jay David Bolter and Richard Grusin in their book of the same name) or a transcoding (to use Lev Manovich’s term introduced in The Language of New Media). I say this because the latter two terms highlight in a more theoretically and media-specific manner how the process Dahlstrom describes constitutes an employment of new media for the purpose of representing not only the relationships between documents within a given historical context and milieu, but also a modeling of their medium-specific effects upon their readers.21 In other words, Bolter/Grusin and Manovich’s terms articulate how this process is constitutive of the effects of the shift to a digital/electronic paradigm on the construction of digital scholarly editions.
As Sahle notes, “it can be said that digital editions follow a digital paradigm, just as printed editions have been following a paradigm that was shaped by the technical limitations and cultural practices of typography and book printing.”22 Sahle puts forth digitally encoded text as an example of the digital/electronic paradigm realized, stating that, as opposed to the print-based scholarly edition providing only one view of the text, “the deeply marked up textual code of the digital edition theoretically covers several views of the text and may lead to various presentations generated by specific algorithms.”23 This view of the digital/electronic paradigm dovetails with that of Marilyn Deegan and Kathryn Sutherland, who combine Sahle’s perspective with those of McGann, Gabler, and Dahlstrom when they state that not only does the computer become a “modeler of displaced materiality” when used to provide a sense of the discursive relationships between media and materials organized within a given digital edition, but also “redefines previous understanding[s] of how knowledge and information are organized and…how they are legitimated.”24 Scholars such as G. Thomas Tanselle are less effusive about the difference the digital/electronic paradigm makes when constructing a digital edition; he explicitly states that what makes possible the associative reading valued by the scholars above is not the digital medium itself, but how the medium allows for new considerations of “how thoughtfully a text or set of texts is cross-referenced…with a network of searching capabilities as well as linked apparatuses and other editorial material,” all of which are vital to McGann’s and others’ conception of the digital scholarly edition.25
A definition of the digital scholarly edition would thus look something like this: “Scholarly digital editions are scholarly editions that are guided by a digital paradigm in their theory, method and practice.”26 However, even a definition as seemingly simple as this one contains complexities that must be teased out. For example, the first clause of that definition, “scholarly digital editions are scholarly editions,” connotes a point that Sahle makes two pages prior to his definition, namely that while “digital editions are essentially different from printed editions in their content, structure and role,” they still share the same subject and have the same goals even though the printed and digital editions may be working on different scales in terms of medium, format, and scale of materials.27 Even though he provides a definition of the digital scholarly edition, Sahle does not see himself as describing a fundamentally different object. Rather, he is describing the same type of object constructed on a different analytical and material level, the definition of which still falls under the general definition of the scholarly edition itself. The definition of the digital scholarly edition is needed to make the level distinction that is so important to textual critics such as McGann, as “changes in our technological and media environment make us aware of the fact that there is an alternative to the print edition. The print edition is no longer the edition but becomes recognizable as a particular form.”28 It is this distinction in form, or level in McGann’s words, that makes the necessary operations of digital scholarly editions such as the Rossetti Archive, operations that were not affordances of the print-based scholarly edition, available for their users.
While Sahle uses the singular form in the quote above, when directly addressing the distinction between print and digital scholarly editions, he addresses the issue using plurals: “the difference is not so much between editions and digital editions, but between the various forms of editions.”29 This underscores another anxiety that plagues discussions of digital scholarly editions: for many, “there is no universally accepted definition of digital scholarly edition.”30 When discussing digital editions, one must confront the fact that “scholars continuously experiment with old and new tools in order to achieve the optimal digital experience of a manuscript and…the resulting projects often differ greatly.”31 Thus, while one can discuss the particulars of a project such as the Rossetti Archive, discussing the larger contexts of digital editions becomes difficult because one must confront the fact that, as Dahlstrom notes, “to even talk about digital editions as one particular type of edition is…debatable.”32 When one does attempt to make a sharp distinction between the print and digital edition, what one is in fact dividing is the conception of the edition as the product of the affordances of a particular medium from the conception of the edition as the product of specific “epistemological foundation[s] and theoretically based [strategies].”33 Discussions of the digital scholarly edition that merge the two perspectives will have to discuss how particular approaches to merging the perspectives result in particular types of digital editions designed to provide specific perspectives on a document or a set of documents to their users.
For instance, McGann sees the construction of a digital scholarly edition (or archive) as the deployment of “the system of philology as a digital emergence.”34 To that end, he states that a digital edition should meet the following requirements: it should have a comprehensive depository of artifacts and materials, the different component parts of the edition must be organized in a network of internal links and external connections that can be represented as conventions (that is, classifications), the total system must rest in a single perspective that reflects the conception of the system generally agreed upon by its users, and the system must have the flexibility to license and store an indefinite number of particular views of its artifacts, their relations, and the system as a whole.35 What this is meant to result in is an edition conceived as a philological machine that provides a set of materials and documents to the user for the purpose of studying the historical, cultural, and material means by which webs of discourses run throughout the edition’s contents. This is not a conception of the digital scholarly edition that only addresses either the medium or theoretical/epistemological foundations, but one that merges the two.
This version of the digital scholarly edition underpins McGann’s concept of a “philology in a new key,” which itself is the basis of the scholarly edition of a literary system introduced by Katherine Bode in her book A World of Fiction. Both McGann and Bode consider philology in a new key as an implementation of the philological principles outlined above and their use for navigating the complexity of the documentary record to respond to the challenges posed by digitization. From this, Bode describes the scholarly edition of a literary system as “a model of literary works that were published, circulated and read–and thereby accrued meaning–in a specific historical context, constructed with reference to the history of transmission by which the documentary evidence of those works is constituted.”36 This form of the digital scholarly edition is comprised of a curated dataset that offers a “stable and accessible model of the existence and interconnections of literary works in the past, and a critical apparatus that reveals the hypothetical nature of that modeled literary system and establishes it as a reliable basis for analysis.”37 The scholarly edition of a literary system builds on the approach that projects such as the Rossetti Archive take to the organization of materials by emphasizing not only the collection of the materials themselves, but the necessity of organizing them and constructing an apparatus around them that properly represents to the user the contexts within which the documents in question operate.
It is an approach meant to use the theoretical and material apparatus of the scholarly edition to “attend to the specificity of collections from which documents were gathered, the different documentary manifestations incorporated, and the partiality of the digitized record” and represent these facets of a specific set of documents to the user of a given edition.38 In this sense, the scholarly edition of a literary system is a crystallized, theoretical reaction to the way in which new media has transformed perspectives on what the scholarly edition is, and how the use and objective of said scholarly apparatus has changed with the creation of digital scholarly editions alongside their print-based counterparts.
What the WE1S Project Can Learn
The immediate question that comes to mind for the WhatEvery1Says project is: can the WE1S project be considered a scholarly edition of a literary system? Or, rather, is it a project that, while having similarities to Bode’s conception of where the scholarly edition is going, is something quite different? I would argue that the scholarly edition of a literary system and the WE1S project, while sharing a similar goal of charting the historical transformations of discourse as represented in a corpus of documents, are distinct in three ways: the two projects are conceptually different undertakings, they have materially different methods of achieving their goals, and finally, the way in which the workings of media ecologies figure in the goals of each project differs in a quite fundamental manner.
The conceptual distinction between the scholarly edition of a literary system that Bode is developing from McGann’s ideas, and the goals of the WE1S project, can be gleaned by comparing each entity’s objectives. Bode’s conception of the scholarly edition is developed for the purpose of constructing a born-digital apparatus around a set of documents to create an edition that will provide insight into (or a material argument about) the many lives of a given cultural object and how those lives developed within particular historical contexts. In this sense, what Bode describes is an edition that builds upon the foundation McGann built with the development of the Rossetti Archive. However, the WE1S project is doing something a bit different, which can be gleaned from the project’s Mellon grant proposal. When a reader looks at the outline of the project’s goals in the proposal, said reader is confronted with the following language: “WE1S’s sample corpus of public discourse about the humanities is the basis for exploring,” among other things, “that articles containing the literal phrases ‘humanities,’ ‘liberal arts,’ and ‘the arts’ are likely places to look for focused discussion of the humanities…and socially broad discussion of the humanities” and “that the crossing point between such focused and broader views can help us understand the ‘architecture’ of the ‘complex idea’ of the humanities.”
In other words, the WE1S project is not focused on how documents can provide a researcher with the sense of the documents’ historical context as a component of an interpretive apparatus. Rather, it is concerned with how the organization of words and phrases within a corpus of documents allows one to understand the relations of concepts by which discourses about the humanities and the liberal arts are advanced. To put it more simply: Bode is more concerned with the relations between documents and the historical contexts that can be gleaned from said documents, while the WE1S project is more concerned with the concepts about the humanities advanced via the language constituting the documents, and how that language runs across the corpus and throughout a given temporal window (currently 1981-2014).
The use of the word “concept” here is vital, and it is worth taking some time to unpack how it is being used in the context of the WE1S project. “Concept” is taken from the vocabulary of Peter de Bolla as he uses the term in his 2013 book The Architecture of Concepts. Concepts, according to de Bolla, “activate and support cognitive processing and enable us to sense that we have arrived at understanding. They are ‘ways of thinking’ whose identified or identifiable labels provide in shorthand the names we give to particular routes for…getting from one thought to another.”39 Thus, the concept “humanities” has condensed within it not only its own definition (what constitutes the field of the humanities?), but also a set of discourses that provides the user with a conceptual and semantic field by which ideas about the humanities, and concepts related to the humanities, may orient themselves within a given context. And it is the orientation of the concept of the humanities that the project wishes to analyze, particularly as it transforms over time, such that one could argue that WE1S is not only concerned with the “architecture” of the concepts outlined above, but the history of the application of those concepts as well.40
It is for this reason that I believe it is necessary to differentiate the conceptual goals of the WE1S project from those of the scholarly edition of a literary system. This does not mean that there are no similarities between the two approaches; in fact, I would argue that one of the main things the two approaches share is a focus on what McGann calls “documentary collation” as a foundational element of the respective projects. Given that the primary goal of such work is to “expose the precise lines of a text’s transmission history in its real times and material circumstances,” one could say that the scholarly edition of a literary system and the WE1S project work toward that goal in distinct ways (McGann). The scholarly edition of a literary system achieves this goal through careful attention to the organization of materials within a digital apparatus, as well as how that digital apparatus allows for an analysis of those materials and their historical contexts. The WE1S project is more interested in creating a digital apparatus and methodology for decomposing the carefully collected documents constituting its corpus into words. The relations between the words themselves are analyzed for the purpose of seeing what kinds of concepts and relations between concepts are brought forth, as well as what longitudinal changes in the discourse around concepts such as the “humanities” develop. It is for this reason that topic modeling has, to this point, been the main analytical tool by which the project pursues this goal.
A discussion of the material distinction between the scholarly edition of a literary system and the WE1S project thus begins with the distinction between the use of the topic model, and the model in general, for both entities. For the WE1S project, the primary importance of the topic modeling method for helping its researchers articulate the relations between concepts motivating particular discourses about the humanities emerges via the means by which topic modeling software produces its main object of analysis, the topic modeling interface. Under the algorithmic and probabilistic transformations of topic modeling software such as MALLET, documents cease to be sequences of words and punctuation that resolve into a certain kind of historical formation through which they gain importance to the editor of a scholarly edition. Rather, to paraphrase Matt Burton, documents become a word census, “a sum total of the number of times each word occurs in the original, natural document.” While the problem of exactly what unit of text will represent a document emerges for a researcher employing the method (similar to the problems the expanded documentary record caused for the print-based edition that digital editions and scholarly editions of literary systems are ostensibly meant to solve), the treatment of the document is nonetheless quite different. Documents, in this context, become a repository for a multitude of topics that run through them as a function of how sequences of words are constructed to create said documents. A topic can thus be defined as a “probability distribution over terms” in which “different sets of terms have a high probability” of occurrence within a given topic.41
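The idea of a document as a word census, and of a topic as a probability distribution over terms, can be illustrated with a minimal Python sketch. This is not WE1S code or MALLET output: the function name word_census, the sample sentence, and the toy topic probabilities are all hypothetical, chosen only to show what is discarded (word order, punctuation) and what is kept (counts) when a document is prepared for topic modeling.

```python
from collections import Counter
import re


def word_census(text):
    """Count how often each word occurs, discarding order and punctuation.

    Hypothetical helper for illustration; real pipelines also handle
    stop words, tokenization rules, and the choice of document unit.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)


# A made-up "document" (assumption, not drawn from the WE1S corpus).
doc = "The humanities matter. The arts and the humanities endure."
census = word_census(doc)

# A made-up "topic": each term is assigned a probability, and the
# probabilities sum to 1 (a probability distribution over terms).
toy_topic = {"humanities": 0.5, "arts": 0.3, "the": 0.2}
total_probability = sum(toy_topic.values())

print(census.most_common(2))
print(total_probability)
```

Once the document is reduced to counts, the sentence structure that an editor of a scholarly edition would attend to is no longer recoverable from the census alone, which is the contrast the paragraph above draws.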
One can glean from this description a similarity between what de Bolla defines as a concept, and the topics generated by topic modeling software. This relationship between the words that resolve into topics and the topics that are expressed in the documents of a given corpus, and the way in which topics (and relations between topics) can be understood as expressions of concepts (and relations between concepts) is interpreted through the use of a topic modeling interface. An interface such as Andrew Goldstone’s DFR-Browser (the interface used by the WE1S project [see M-7]) displays in a visual form the topics a document or collection contains and the relationships among the topics.”42 DFR-Browser mobilizes these relationships between topic and document, as well as topic and concept, by situating the list of most highly weighted words within a topic alongside the list of top documents within that topic. The function of the topic modeling interface then is to allow a human being to “look at and identify…a topic from the word list that the model produces” and thus render the topic interpretable and coherent.43 Goldstone thus presents, as other designers of topic modeling interfaces do, a design solution to the problem of producing an interface that provides the researcher with the most relevant information that a topic model produces.44
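The pairing of a topic's most highly weighted words with its top documents can be sketched in a few lines of Python. The sketch below is an assumption-laden illustration, not DFR-Browser's implementation: the weights, the article names, and the helper top_items are hypothetical, and a real interface would read these values from a trained model.

```python
# Hypothetical weights of words within one topic (not real model output).
topic_words = {
    "humanities": 0.40,
    "college": 0.30,
    "arts": 0.20,
    "budget": 0.10,
}

# Hypothetical share of each document attributed to that same topic.
doc_topic_share = {
    "article_a": 0.70,
    "article_b": 0.15,
    "article_c": 0.45,
}


def top_items(weights, n):
    """Return the n highest-weighted keys, heaviest first."""
    ranked = sorted(weights.items(), key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:n]]


# The two lists a researcher would read side by side to label the topic.
top_words = top_items(topic_words, 3)
top_docs = top_items(doc_topic_share, 2)

print(top_words)
print(top_docs)
```

Reading the word list against the document list is what lets a human interpreter decide whether a topic corresponds to a recognizable concept.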
This method of gleaning concepts from documents shares with the scholarly edition of a literary system a more general interest in the use of modeling, in that modeling “instantiates an attempt to capture the dynamic, experiential aspects of a phenomenon.”45 In this sense, a topic model of a corpus of documents is meant to be a representation of how topics and concepts are formed and articulated by the sequences of words constructing the documents in question, and its interface is designed to tell the researcher what cannot be immediately gleaned merely by reading the corpus of documents. It is for this reason that Bode finds modeling to be a potentially useful element of the data-rich literary history that the scholarly edition of a literary system is meant to present to a given reader. As Bode notes, modeling “[offers] a mechanism through which to interrogate and refine conceptions of literary works and systems.”46 However, modeling by itself does not provide a mechanism by which one can “recognize and represent the inevitably transactional nature of the documentary record.”47 Because it cannot attend directly to the often-fraught history of the documents constituting an edition, it requires an apparatus that does attend to those historical and material specificities for the researcher’s use.
These differences are magnified when one examines the differences in the scholarly apparatuses that each project employs to construct and analyze its objects of study. As with other editions, the scholarly edition of a literary system places tremendous importance on the integrity of the document itself, as it is through the material specificities of the document that a researcher will be able to glean the historical contexts within which the document works. It is for this reason that Bode’s work with the National Library of Australia’s (NLA) Trove database of digitized historical newspapers puts so much emphasis on the necessity of an “archaeology of a mass-digitization,” that is, a “critical analysis of the amplifications and exclusions created by collection and remediation.”48 The conceptual underpinnings of such a call are not too different from those underpinning the use of Text Encoding Initiative (TEI) standards when representing texts employed in the creation of digital editions. In particular, both share what John Lavagnino terms a “belief in transcription,” the ability to use text encoding practices to “reproduce…a selection of features from that object” that one is transcribing.49 In this sense, the belief in transcription underpinning Bode’s project is also the same belief informing Dahlstrom’s concept of media translation, that practices of remediation and transcoding can be used to overcome the media-specificity of a document for the purpose of preserving the historicity of the document in a new format. The scholarly apparatus that emerges from such practices is, like the print-based and digital edition, an apparatus constructed around the transcoded historical and material specificities of the documents within it. Accordingly, it is also built to excavate and articulate an argument about those selfsame historical specificities of the document to the researchers constituting its audience.
A discussion of the WE1S project’s scholarly apparatus, the WE1S Research Environment, begins with a discussion of the affordances of the topic modeling interface. Given that the objective of a topic modeling interface is to concisely and effectively render to a researcher the most important information needed to determine how words resolve into topics and how topics cohere into larger concepts, what an interface such as the DFR-Browser does and does not convey about the corpus becomes quite a trenchant issue. This is especially important when one is invested in the relationships between topics and how the interrelations between concepts structure discourse about the humanities and liberal arts across time. For this reason, there is always a tension between the goals of the project and how the topic model interface obfuscates the full architecture of its topics for the sake of concision and effectiveness. This leads to problems such as top articles within topics that contain none of the top twenty words of said topic, and solutions such as the method Miriam Posner uses when engaging with an interface’s representation of knowledge: not only using a given interface to communicate knowledge about a given corpus, object, or discourse, but presenting it in such a manner that we are asked to “question the purpose of an interface” and what different interface designs ask of their users.50 These problems closely mirror Bode’s investment in the historically situated document, in that both put a particular importance on the function of the object of analysis within the context of the goals of each project.
The difference between the two treatments of the principal objects of analysis emerges when one examines how each object is constructed for such analyses and considerations. While the method of constructing transcoded documents for digital editions is based around preserving as many of the qualities of the original document as the new medium will allow, the WE1S Research Environment decomposes documents into their component words, as it is on this level that the analysis of topics and the resolution of topics into concepts proceed. Because it is through this algorithmic decomposition of the corpus’s documents that the creation of the DFR-Browser interface is possible, it is necessary that the WE1S Research Environment that creates these models be fed not representative facsimiles of its documents, but plain text versions of those documents with accompanying metadata. Plain text, as “the most tractable format for computational analysis,” can easily be ingested, cleaned, processed, and rebuilt as the component parts of a DFR-Browser interface by the Virtual Workspace System employed by the WE1S project. In short, whereas the methods of working with the material affordances of documents within the scholarly edition of a literary system are fundamentally preservationist, the methods used by the WE1S project are fundamentally generative. The former preserves the affordances and materialities of the original document as much as possible because the relevant histories the project is interested in inhere in those qualities; the latter generates new objects of analysis because doing so better allows its researchers to analyze the workings of the word, the topic, and the concept.
Finally, I would also note that there is a fundamental difference between the scholarly edition of a literary system and the WE1S project that emerges from how the two projects deal with the media ecologies of their objects of analysis. The term “media ecology” is used by Matthew Fuller to describe the “massive and dynamic interrelation of processes and objects, beings and things, patterns and matter” that coalesces into a mode or dynamics that “properly form or make sensible an object or process.”51 Both projects are invested in the workings of media ecologies and what they express about the historical transformations of discourses. However, the scholarly edition of a literary system (and scholarly editions in general) and the WE1S project are focused on the workings of quite different media ecologies. The scholarly edition of a literary system is focused very tightly on the media ecology of the document, and how the material and conceptual “dynamics of arrangement” governing the coalescence of “words, concepts, footnotes, [and] the mechanics of a book” in general produce an object that expresses a kind of historical situatedness that a scholar can convey to a reader through the scholarly apparatus of the edition.52 The WE1S project focuses more broadly on the media ecologies through which topics interrelate and congeal into concepts that generate and direct discourses over time. It is for this reason that, while the project began by building a corpus of newspapers published over a 30-plus year period, it has since begun to expand its scope to other forms of journalistic media such as television, radio, and podcasts. Because its objects of analysis are the word, topic, and concept rather than the document itself, the WE1S project can incorporate a wider range of materials than the scholarly edition of a literary system, or scholarly editions in general, would be able to.
Perhaps this is ultimately the solution to the problem of the print edition that McGann perceived when working on the New Oxford Book of Romantic Period Verse so many decades ago: while still retaining its lessons regarding the necessity of a scholarly apparatus for interpretive work, as well as many of its conceptual underpinnings, one must move away from the edition paradigm altogether in order to create an apparatus proper for the work one must do with digital materials.
To answer the question posed above, I believe that it is important for the WE1S project to understand that, on a conceptual and material level, what the project is creating is fundamentally different from any kind of edition, particularly the scholarly edition of a literary system that Bode advocates for. I make this statement for three important reasons:
- The objects of analysis, while related, are fundamentally different: the scholarly edition of a literary system retains the edition’s focus on the document as its primary object of analysis, while the WE1S project is much more concerned with the document’s component words, and analyzing how those words resolve into topics and concepts via the topic modeling interface. Because they operate on different scales, each set of objects requires different methods of examining how they express the historical situatedness of particular discourses.
- The relationship between the objects of analysis and the apparatuses built to facilitate the analysis of said objects differs between the two types of projects. The scholarly edition of a literary system, because it is so centrally focused on the integrity of the document, must be an apparatus built to preserve the integrity of the document through all transcodings and remediations, given that it is through the affordances of the document that historical analysis is possible. The WE1S Research Environment is built to decompose the document into its component words, and to generate DFR-Browser interfaces that allow one to glean the relations between words, topics, and concepts.
- The scholarly edition of a literary system and the WE1S project analyze media ecologies at different scales. The scholarly edition of a literary system is very tightly focused on the workings of the document’s media ecology, while the WE1S project is more concerned with multiple media ecologies, and how concepts work across them to structure discourse about the humanities, the liberal arts, and the arts.
This is not to say that nothing can be learned by looking at the edition; as one can see above, the scholarly edition of a literary system (and the edition writ large) shares much with the WE1S project, particularly as it relates to the historical focus of both projects. However, I believe that it is very important that the WE1S project differentiates what it is doing from what editions do with documents conceptually and theoretically. Doing such work will not only emphasize the importance of the work the project is doing with tracing the workings of concepts within its corpus, but will also emphasize the importance of the fundamental concepts underpinning the creation of the edition as they apply to the methods employed by the WE1S project.
Blei, David M. “Topic Modeling and Digital Humanities.” Journal of Digital Humanities, vol. 2, no. 1, 2012, http://journalofdigitalhumanities.org/2-1/topic-modeling-and-digital-humanities-by-david-m-blei/. Accessed 24 July 2017.
Bode, Katherine. A World of Fiction: Digital Collections and the Future of Literary History. U of Michigan P, 2018.
Burnard, Lou, Katherine O’Brien O’Keeffe, and John Unsworth. Electronic Textual Editing. Modern Language Association of America, 2006.
Burton, Matt. “The Joy of Topic Modeling.” mcburton.net, 21 May 2013, http://mcburton.net/blog/joy-of-tm/. Accessed 24 July 2017.
Dahlstrom, Mats. “How Reproductive is a Scholarly Edition.” Literary and Linguistic Computing, vol. 19, no. 1, 2004, pp. 17-32.
De Bolla, Peter. The Architecture of Concepts: The Historical Formation of Human Rights. Fordham UP, 2013.
Deegan, Marilyn, and Kathryn Sutherland. Transferred Illusions: Digital Technology and the Forms of Print. Ashgate, 2009.
Driscoll, Matthew James, and Elena Pierazzo, editors. Digital Scholarly Editing: Theories and Practices. Open Book Publishers, 2016.
Franzini, Greta, Melissa Terras, and Simon Mahony. “A Catalogue of Digital Editions.” Driscoll and Pierazzo 161-182.
Fuller, Matthew. Media Ecologies: Materialist Energies in Art and Technoculture. MIT Press, 2005.
Gabler, Hans Walter. “Theorizing the Scholarly Edition.” Literature Compass, vol. 7, no. 2, 2010, pp. 43-56.
Hayles, N. Katherine. Writing Machines. MIT Press, 2002.
Jockers, Matthew L. Macroanalysis: Digital Methods and Literary History. U of Illinois P, 2013.
Lavagnino, John. “When Not to Use TEI.” Burnard, O’Keeffe, and Unsworth 334-338.
McCarty, Willard. Humanities Computing. Palgrave Macmillan, 2005.
McGann, Jerome. A New Republic of Letters: Memory and Scholarship in the Age of Digital Reproduction. Harvard UP, 2014.
---. Radiant Textuality: Literature After the World Wide Web. Palgrave, 2001.
McKenzie, D.F. Bibliography and the Sociology of Texts. Cambridge UP, 1999.
Posner, Miriam. “What’s Next: The Radical, Unrealized Potential of Digital Humanities.” Debates in the Digital Humanities 2016, edited by Matthew K. Gold and Lauren F. Klein, U of Minnesota P, 2016, pp. 32-41.
Ruecker, Stan, Milena Radzikowska, and Stéfan Sinclair. Visual Interface Design for Digital Cultural Heritage: A Guide to Rich-Prospect Browsing. Ashgate, 2011.
Sahle, Patrick. “What is a Scholarly Digital Edition.” Driscoll and Pierazzo 19-39.
Stallybrass, Peter. “Books and Scrolls: Navigating the Bible.” Books and Readers in Early Modern England: Material Studies, edited by Jennifer Andersen and Elizabeth Sauer, U of Pennsylvania P, 2002, pp. 42-79.
Tanselle, G. Thomas. “Foreword.” Burnard, O’Keeffe, and Unsworth 1-6.
1 Patrick Sahle, “What is a Scholarly Digital Edition,” 23.
3 Sahle, 24.
5 Ibid., 26.
6 Unlike Sahle, I am hesitant to limit the elucidating operations of the scholarly edition to its historical function. Given the fact that, when one looks at the Norton Critical Editions catalog, the publisher not only produces critical editions of texts originally produced between the Early Modern Period and the late 1930s, but also produces critical editions of works by authors such as Chinua Achebe, Wole Soyinka, and Adrienne Rich, it seems necessary to note that the critical edition has a necessarily cultural function to it, in that it articulates not only the historical specificities of a given text, but its expression of certain cultural (I hesitate to use the term “subcultural” or “countercultural,” given the canonical and belletrist function of the critical edition) particularities in its construction as well.
7 D.F. McKenzie, Bibliography and the Sociology of Texts, 13.
8 Ibid., 28-29.
9 Jerome McGann, A New Republic of Letters, 39; 78. Emphasis his.
10 Ibid., 158.
11 Ibid., 13. Emphasis mine.
12 Jerome McGann, Radiant Textuality, 55.
15 See Peter Stallybrass, “Books and Scrolls: Navigating the Bible.”
16 Jerome McGann, Radiant Textuality, 57.
17 Hans Walter Gabler, “Theorizing the Scholarly Edition,” 44.
18 Jerome McGann, Radiant Textuality, 62.
19 Ibid., 69.
20 Mats Dahlstrom, “How Reproductive is a Scholarly Edition,” 23.
21 The term “media-specific” is taken from “media-specific analysis,” as described by N. Katherine Hayles in Writing Machines. See N. Katherine Hayles, Writing Machines, 29-33.
22 Sahle, 26.
23 Ibid., 27.
24 Marilyn Deegan and Kathryn Sutherland, Transferred Illusions, 65; 67.
25 G. Thomas Tanselle, “Foreword,” 4.
26 Sahle, 28.
27 Ibid., 26.
29 Ibid. Emphasis his.
30 Greta Franzini, Melissa Terras and Simon Mahony, “A Catalogue of Digital Editions,” 162. Emphasis theirs.
32 Dahlstrom, 20.
34 McGann, A New Republic of Letters, 27.
35 Ibid., 27-28.
36 Katherine Bode, A World of Fiction, 7.
37 Ibid., 55.
39 Peter de Bolla, The Architecture of Concepts, 4.
40 Building upon the work of Quentin Skinner, de Bolla notes that while “concepts are historical in nature,” it is less certain that “concepts as such are historical forms” (27). The historicity of the concept inheres instead in the “relation between words and concepts,” which is more obviously susceptible to historical analysis according to de Bolla, and which is one of the main foci of our analytical methods (27).
41 David Blei, “Topic Modeling and Digital Humanities.”
42 Stan Ruecker, Milena Radzikowska, and Stéfan Sinclair, Visual Interface Design for Digital Cultural Heritage: A Guide to Rich-Prospect Browsing, 166.
43 Matthew Jockers, Macroanalysis, 128.
44 For this reason, it is key that any researcher or project employing the topic modeling method take great care in selecting the interface best suited for their needs. The WE1S project selected Goldstone’s DFR-Browser from a collection of 14 different systems and interfaces. DFR-Browser emerged as the one best suited to the project’s objectives at the end of the evaluation process.
45 Willard McCarty, Humanities Computing, 23.
46 Bode, 10.
47 Ibid., 11.
48 Ibid., 79.
49 John Lavagnino, “When Not to Use TEI,” 338.
50 Miriam Posner, “What’s Next: The Radical, Unrealized Potential of Digital Humanities,” 38.
51 Matthew Fuller, Media Ecologies, 2. Emphasis his.
52 Ibid., 61; 11.