About this Project

Technological Process and Infrastructure

The process of converting the original Montel manuscript to a TEI-encoded document occurred in several stages. With permission from the director of Digital Initiatives and Services, I began by photographing each page of the Newberry Library’s manuscript. This stage in particular was more time-consuming than it might have been if I had had access to more resources. Given the cost to produce digital images through the Library, I opted to take the photos myself and was thus limited to the hours of the Newberry’s Special Collections Reading Room. After collecting my images in JPEG format, I converted each to a PDF before beginning Optical Character Recognition (OCR) through Google Drive, which uses computer algorithms to convert images into text documents. Following the OCR process, I merged the text files into one working document and began cleaning errors from OCR, which commonly included missed apostrophes, lowercase “l”s changed to number “1”s, and exclamation marks replaced with colons.
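One of the OCR errors described above, a lowercase “l” read as the number “1,” can be flagged automatically for review. The following is a minimal sketch in Python, not the method used in this project: the function name and the sample sentence are hypothetical, and the suggestions are meant to be checked by hand, since a token like “1st” would be flagged incorrectly.

```python
import re

# Tokens made only of letters and the digit 1, e.g. "wou1d" or "ca11".
# Pure numbers such as "1921" contain other digits and are not matched.
_MIXED = re.compile(r"\b[A-Za-z1]+\b")

def suggest_l_fixes(text):
    """List (token, suggestion) pairs where a '1' appears inside a word.

    Hypothetical helper for illustration; it only proposes changes and
    does not alter the text.
    """
    pairs = []
    for token in _MIXED.findall(text):
        if "1" in token and any(ch.isalpha() for ch in token):
            pairs.append((token, token.replace("1", "l")))
    return pairs

if __name__ == "__main__":
    # Placeholder sentence, not taken from Montel.
    for token, suggestion in suggest_l_fixes("She wou1d ca11 in 1921."):
        print(token, "->", suggestion)
```

Returning suggestions rather than rewriting the text keeps the editor in control of each correction, which matters when the source is a single manuscript with no other drafts for comparison.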

After minimizing the errors in my document, I worked with collaborators to create a markup schema that would make the text of Montel no longer just digital, but also machine-readable. In accordance with common practices in most digital humanities projects, we opted to use the Text Encoding Initiative (TEI), which provides standardized encoding practices for machine-readable texts in the humanities (TEI: Home). TEI is carried out using Extensible Markup Language (XML), a formal model based on “ordered hierarchies” (Birnbaum). The use of “ordered hierarchies,” which are less formally known as trees, simply suggests there is a logic behind how a text is encoded (Birnbaum).

For example: you can mark up chapters of a novel with <div> tags to tell the computer that these chapters are smaller divisions of one source text. Within each <div> tag, you can have <p> tags to indicate individual paragraphs. Since the <p> tag is a smaller element than a <div> tag, <p> tags must go within the <div> tags; you cannot have a <p> tag that encompasses anything outside of one set of <div> tags. Accordingly, if you encode one paragraph within a chapter, you also must encode every other paragraph within that chapter in order to have valid TEI and to indicate that each of the smaller elements (paragraphs) belongs to a larger element (a chapter). With this understanding, XML is used for two primary reasons in the digital humanities: documents (like books) traditionally have a natural hierarchy, making it easy to map this onto the “ordered hierarchies” of XML; and computers can analyze and manipulate trees more efficiently than non-hierarchical texts (Birnbaum).
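The nesting rule described above can be illustrated with a short Python sketch using the standard library’s XML parser. The fragment is hypothetical placeholder text rather than a passage from Montel, it omits the TEI namespace and header for brevity, and the function names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-chapter fragment; simplified, with no TEI namespace.
SAMPLE = """<body>
  <div type="chapter" n="1">
    <p>First paragraph of chapter one.</p>
    <p>Second paragraph of chapter one.</p>
  </div>
  <div type="chapter" n="2">
    <p>Only paragraph of chapter two.</p>
  </div>
</body>"""

# A <p> that straddles a chapter boundary breaks the tree.
OVERLAPPING = "<body><div><p>text</div><div>more</p></div></body>"

def paragraphs_per_chapter(xml_text):
    """Map each chapter number to the count of <p> children in its <div>."""
    root = ET.fromstring(xml_text)
    return {div.get("n"): len(div.findall("p")) for div in root.findall("div")}

def is_well_formed(xml_text):
    """Return True if the parser can read the text as a single tree."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

if __name__ == "__main__":
    print(paragraphs_per_chapter(SAMPLE))
    print(is_well_formed(OVERLAPPING))
```

Because each paragraph sits inside exactly one chapter, the computer can answer a question like “how many paragraphs are in each chapter?” by walking the tree, while the overlapping fragment is rejected before any analysis can begin.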

The Gladys Fornell Project employs TEI for multiple reasons within the corpus of text. On one level, the markup functions structurally, such as in the example above, where it differentiates elements like paragraphs or renders all quoted speech with a uniform appearance. This structural markup inherently transforms the text from its original form, a typescript manuscript, to an interpretation of this manuscript, filtered through my own lens to weigh the importance of individual elements of the text. On a second level, the markup functions as interpretative analysis. I created a series of <interpGrp> elements with corresponding <interp> tags to make visible my interest in representations of femininity and their relationship with place in Montel. This was useful in allowing me to examine the text categorically, e.g. separating “constraint” into “mechanical,” “clothing,” “domestic,” and “literal,” but also challenging in the need to create categories and their more specific interpretations. While there could be some overlap between categories (for example: I have included <interp type = “angelic”> under <interpGrp> for both representations of femininity and religious icon) and a phrase can be encoded with multiple <interp> tags (marking up the phrase “[Alice] believed in God and in the Immaculate Conception” using the two aforementioned types of “angelic”), I was limited in my ability to weight these tags (the phrase is more an angelic representation of femininity than it is an angelic religious icon) and potentially clarify ambiguity in assigning a phrase to two broad categories. The ability to categorize a phrase using multiple <interp> types, encoded in the corpus using <seg type = “____”>, minimized the overall need for a weighted markup schema at this level of encoding.
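To show how interpretive markup of this kind can be examined categorically, the following Python sketch counts <seg> elements by type and collects the phrases tagged with a given type. It is an illustration under stated assumptions rather than the project’s actual encoding: the phrases are placeholders, the function names are hypothetical, and representing a doubly categorized phrase as one <seg> nested inside another is an assumption about how multiple types might be recorded.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical fragment. The type names come from the "constraint"
# subcategories named above; the text and the nesting are assumptions.
SAMPLE = (
    "<p>"
    '<seg type="clothing"><seg type="domestic">placeholder phrase one</seg></seg>'
    ' then <seg type="clothing">placeholder phrase two</seg>'
    ' then <seg type="literal">placeholder phrase three</seg>'
    "</p>"
)

def count_by_type(xml_text):
    """Count how many <seg> elements carry each type value."""
    root = ET.fromstring(xml_text)
    return Counter(seg.get("type") for seg in root.iter("seg"))

def phrases_for(xml_text, seg_type):
    """Collect the text of every <seg> with the given type, in document order."""
    root = ET.fromstring(xml_text)
    return [
        "".join(seg.itertext()).strip()
        for seg in root.iter("seg")
        if seg.get("type") == seg_type
    ]

if __name__ == "__main__":
    print(count_by_type(SAMPLE))
    print(phrases_for(SAMPLE, "clothing"))
```

Note that the counts treat every tag equally: the doubly tagged phrase contributes one “clothing” and one “domestic,” with no way to record that one reading is stronger than the other, which is the weighting limitation described above.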

The final step in translating the TEI-encoded document to something that could be visible online was the use of TEI Boilerplate, which allows the file to be displayed in a web browser. Already having set up server space through the Ohio Five, I was able to download TEI Boilerplate files and include an XML declaration before the <TEI> root of my file, thereby making my encoded document accessible through a browser. This TEI Boilerplate version of Montel is included as a page on the Gladys Fornell Project’s WordPress site, a self-hosted blogging tool chosen because of its user friendliness and customizable aesthetics.

Notes on editorial practices

As with all projects that involve analyzing and synthesizing materials, this thesis has warranted numerous editorial decisions. These decisions were inherent to the project: beginning with one manuscript of Montel and no other drafts for points of comparison, I surmised that this novel was in one of its final drafts. It is possible, however, that it incurred heavy revisions as Fornell corresponded with publisher Mark Patton, whose name and contact information are on the cover of the manuscript, suggesting that this was a copy returned from his office. Knowing the likelihood that multiple individuals looked over and made notes in the manuscript, this project grappled with decisions about marginalia. Were the typescript markings, unable to be easily distinguished as handwriting would be, suggestions on behalf of Patton or revisions inserted by Fornell? Given the low frequency of these markings and their seemingly unobtrusive changes to the text, I opted to preserve changes suggested in the manuscript’s marginalia, reasoning that these were either inserted by the author herself or agreed upon by author and publisher. Additionally, there were several inconsistencies in the manuscript’s use of quotation marks and commas for dialogue. These inconsistencies or distinctions seemed intentional, and thus I did not regularize punctuation in the transcription unless a change was marked in the marginalia. I did, however, regularize spelling when words were overtly misspelled or mistyped.

When using TEI to encode the manuscript, I faced similar decisions about preserving the integrity of the original manuscript. Since markup inherently gives argument to a text, I often found myself questioning my editorial authority and how I weighted reader experience against Fornell’s possible, yet unknowable, intentions. Even for something as granular as quotation marks surrounding dialogue, there were inconsistencies. Where most speech was surrounded by quotation marks, there were also passages that denoted speech with language like “Philip said,” but omitted quotation marks. Was there something distinct about these passages that warranted a different format? Could it just have been an inconsistency intended to be fixed in final edits? Did the button on Fornell’s typewriter physically jam and prevent her from adding quotation marks? It is impossible to know which, if any, of these is the answer. However, being able to ask these questions and consider my own bias within the text reminds both me and you, reader, that what you might see as an entirely objective presentation of Montel is not: it strives to be impartial, yet is innately tinted with the complexities of editorial decisions.

Future research

The current version of the Gladys Fornell Project allows for a number of continued lines of research. Because the text of Montel is TEI-encoded, it can be further examined using a number of digital humanities tools. One interesting approach would be to use stylo, a tool that applies computational statistics to analyze genre elements, authorship attribution, and style development. With regard to Montel, this could be employed to examine the differences between Montel’s core narrative and the frame narrative, considering how the style and genre might have changed given the decade gap between Fornell’s writing of the two sections; to compare Montel with other contemporary novels to evaluate how the writing diverges from canonical texts, potentially pointing to the presence (or lack thereof) of elements that may have hindered Fornell’s chances of publication; or to look at Montel in comparison with specific literature Fornell documented reading, either works she edited at the Princeton University Press or works she listed reading through a seminar Mark Schorer facilitated at Princeton.

Another tool for further examination of the novel would be TokenX, which can generate word concordances and key words in context and allows for word play. Additionally, one could take the markup schema currently used in the file and modify it: if a scholar were interested in further exploring Montel’s representation of masculinity, which I address in my markup schema though not in depth, he or she would be able to strip the TEI document of <interp> groups related to femininity or build on them to highlight the relationship between representations of masculinity and femininity.
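For readers unfamiliar with the term, a “key word in context” listing shows each occurrence of a word alongside the words on either side of it. The Python sketch below illustrates the general idea only; it is not TokenX and does not reflect that tool’s interface. The function name, the simple tokenizing rule, and the sample sentence are all assumptions made for illustration.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, match, right) context tuples for each hit of keyword.

    Matching ignores case; `width` is the number of words kept on each side.
    """
    # Simplified tokenizer: runs of letters and apostrophes count as words.
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits

if __name__ == "__main__":
    # Placeholder sentence, not taken from Montel.
    sample = "The house stood quiet. She left the house at dawn and the House waited."
    for left, match, right in kwic(sample, "house", width=2):
        print(f"{left:>12} [{match}] {right}")
```

Lining up the hits in this way makes it easier to see at a glance which words tend to cluster around a term of interest.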

Beyond continued lines of research with the encoded file for Montel, the Gladys Fornell Project points to a new, developing mode of digital research at the undergraduate level. Using TEI and digital humanities tools to provide a deeper understanding of a humanities text underscores the value of digital literacy, demonstrating that we as humanists are able to adapt our skills to new mediums, and, more importantly, emphasizes that conducting close reading in new ways can allow one to bolster his or her existing skill set of asking humanities-driven research questions. In being able to adapt our perspective as scholars to a new medium of dissemination, we ultimately become better researchers and better investigators within our current and continually developing academic contexts.

Thank you to the Newberry Library, with particular recognition to the Digital Initiatives and Services, Modern Manuscripts, and Special Collections departments, for the permissions to use the Montel manuscript in this Independent Study. Additional thanks to my advisor, Professor Jennifer Hayward, and collaborators, Jacob Heil, Catie Newton, and Stephen Flynn.

