Orlando: Women's Writing

Scholarly Introduction

Going Electronic

Markup in Orlando

[See "What is Markup For?" for the basics of tagging: what tags (and the attributes that modify some tags) look like, and an overview of their purposes. Note also that clicking the Show Markup button in the textbase (top right on screens for entries and search results) reveals the tagging in individual passages.]
Textual encoding provides the textbase materials with a consistent intellectual structure without compromising readability and without fixing text in a linear sequence. It opens the material to multiple uses beyond simple reading: to detailed searching and on-the-fly restructuring. In effect, encoding resides beneath the visible surface of texts, giving them the power of a database and the reading experience of regular prose. Orlando's encoding allows the extraction and ordering, from among thousands of documents, of material with which to create chronologies. It enables the grouping of all writers who lived in or travelled to the same places, who wrote in the same genres, or who responded to the same texts. This structuring of text enables the investigation of interrelationships. It supplements a traditional emphasis on the single writer with several possible views of a writer operating in relation with others, either contemporary or across generations. It supports a view of literary production as resulting from the circulation of words and ideas.
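The extraction, ordering, and grouping described above can be sketched in a few lines. This is only an illustration under stated assumptions: the element and attribute names (`entry`, `event`, `place`, `genre`, `writer`, `date`), the placeholder writers, and the two helper functions are all hypothetical and are not taken from the Orlando tagset or its delivery system.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical mini-textbase. Tag names, attributes, and writers are invented
# for illustration; they are not Orlando's actual tagset or data.
SAMPLE = """\
<textbase>
  <entry writer="Writer A">
    <event date="1792-03-01">Moved to <place>Bath</place>.</event>
    <event date="1801-06-15">Published a <genre>novel</genre> in <place>London</place>.</event>
  </entry>
  <entry writer="Writer B">
    <event date="1798-09-30">Travelled to <place>Bath</place>.</event>
  </entry>
</textbase>
"""

def writers_by_place(root):
    """Group writers by every place tagged anywhere in their entries."""
    groups = defaultdict(set)
    for entry in root.iter("entry"):
        for place in entry.iter("place"):
            groups[place.text].add(entry.get("writer"))
    return {place: sorted(names) for place, names in groups.items()}

def chronology(root):
    """Pull dated events out of all entries and order them by date."""
    rows = []
    for entry in root.iter("entry"):
        for event in entry.iter("event"):
            # Flatten the event to readable prose, dropping the tags.
            text = " ".join("".join(event.itertext()).split())
            rows.append((event.get("date"), entry.get("writer"), text))
    return sorted(rows)  # ISO dates sort correctly as strings

root = ET.fromstring(SAMPLE)
print(writers_by_place(root))
for row in chronology(root):
    print(row)
```

The same tagged prose yields both a grouping of writers by place and a cross-entry chronology, without the entries themselves being stored in either order.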
The encoding embeds in our text explicit representations of the formal and conceptual structures and priorities governing Orlando. The formal markup works by providing a structure from which a stylesheet then renders the text in a web browser. The conceptual markup embodies literary-historical priorities and provides a rubric of the features in lives and texts that are attended to in this history: these tags, specific to Orlando, provide a common structure for the entries on writers. Discussions of lives, for instance, almost invariably employ Birth, Cultural Formation, Family, Education, Location, and Death tags, and may, as required by the individual life, employ tags for Health, Politics, Occupation, or Violence (covering the range of violence from spousal abuse to war). Discussions of writing almost invariably employ tags for Production, Textual features, Genre, and Theme or Topic, and may, as required by the individual oeuvre, employ tags for Reception, Influence, Intertextuality, Relations with publisher, and so on. These tags provide a basis for grouping writers, or excerpts from their entries, according to the tags those entries contain, and for performing precise searches across the textbase.
Electronic text in any form is itself a mode of representation; its tagging works in dialogue with the 'readable' text to open new ways of both writing and reading literary history. Electronic text markup as it is used by the Web (HTML, or Hypertext Markup Language) is rudimentary: it instructs web browsers on how to display text (italicized, indented, laid out in a table) or on linkage to other web materials. But in digital humanities work, SGML/XML has purposes beyond display. It sets out to describe the character of the text itself. So, for instance, instead of marking up a periodical title with a tag indicating that the title should appear in italic, Orlando markup designates it as a journal title. This practice of describing the nature of the text rather than instructing how to render it follows markup principles established in the Text Encoding Initiative and elsewhere. It enables systems to be designed to represent journal titles in various ways: with italics, underlining, hyperlinking, and so on. The representational act of describing the piece of text is separate from the representational act of encoding to produce format. This may seem trivial as applied to titles, but it is easy to perceive the utility of a computer system being able to distinguish between 'London' as a place and 'London' as a word in a title or an organization name. Insofar as markup, as a form of knowledge representation, involves applying logic, or a set of rules, to a system of ontology in order to represent a knowledge domain, it largely constrains what a computer can do with a text and what conclusions can be drawn from the way the markup is applied. This kind of markup thus has far-reaching implications and more flexible and powerful applications than format-oriented markup.
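A minimal sketch may make the distinction between describing and formatting concrete. The sample sentence, the tag names (`title` with a `level` attribute, `place`), and the functions `render` and `places` are assumptions made for this illustration; they do not reproduce Orlando's actual tags or stylesheets.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical descriptive markup: the tags say what each phrase IS,
# not how it should look. Tag and attribute names are invented.
PASSAGE = (
    '<p>She reviewed for the <title level="journal">London Magazine</title> '
    'after settling in <place>London</place>.</p>'
)

# Several possible presentations of the same journal title.
RENDERERS = {
    "italic": lambda t: f"<i>{t}</i>",
    "underline": lambda t: f"<u>{t}</u>",
    "link": lambda t: f'<a href="#journal">{t}</a>',
}

def render(xml_text, title_style):
    """Render the passage as HTML, styling journal titles as requested."""
    root = ET.fromstring(xml_text)
    parts = [escape(root.text or "")]
    for child in root:
        text = escape("".join(child.itertext()))
        if child.tag == "title" and child.get("level") == "journal":
            text = RENDERERS[title_style](text)
        parts.append(text)
        parts.append(escape(child.tail or ""))
    return "".join(parts)

def places(xml_text):
    """Return only the strings tagged as places."""
    return [p.text for p in ET.fromstring(xml_text).iter("place")]

print(render(PASSAGE, "italic"))
print(render(PASSAGE, "link"))
print(places(PASSAGE))          # only the place, not the word in the title
print(PASSAGE.count("London"))  # a plain string search cannot tell them apart
```

Because the title is described rather than formatted, switching from italics to hyperlinks is a change to the renderer alone, and a search for places ignores the 'London' that occurs inside the journal title.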
Text markup is thus an interpretive activity: it creates meaning and it licenses various inferences about the text. The semantic or conceptual markup used here constitutes a particular approach to literary history. Like all history, Orlando has had to select, frame, and organize its materials according to certain priorities, many of which it has embedded in the tagset. It is finally the fundamentally interpretive quality of the tagset that makes Orlando a history with a difference.
This tagging has been an experiment in using computers to undertake primarily qualitative rather than quantitative work. As a result, not all readers will agree with all of the tagging judgments in the Orlando textbase. For instance, the tag for Responses carries an attribute 'ad feminam' for personalized reactions to a writer or her works; use of this attribute requires the tagger to make a critical judgment, and the history of the reception of texts is rich in violent and long-lasting disagreements over this very issue. Orlando creates patterns and groupings emerging from the categorizations and judgments that are intrinsic to literary historical analysis, but given the selective nature of historical evidence and historical reporting, we do not present the textbase as a statistically representative sample of the field of women's writing. (A number of scholars have pursued quantitative work in literary historical studies, including Gaye Tuchman with Nina Fortin, in Edging Women Out, 1989, and Simon Eliot in Some Patterns and Trends in British Publishing, 1994. Franco Moretti has offered in Graphs, Maps, Trees: Abstract Models for Literary History, 2005, some provocative approaches to literary history, which he calls "a process of deliberate reduction and abstraction," in which quantitative analysis, mapping, and evolutionary models produce "specific form[s] of knowledge" that allow interconnections between texts to emerge more clearly.)

Interpretation and Markup

The question posed in the application of markup to text is what, and what not, to tag. The answer to that question embodies the values, priorities, and hypotheses of the tagging.
The Orlando tagset is designed to identify elements of writers' lives and writing that are important to an understanding of women's literary history: it attempts to map the diverse and changing literary conditions under which women's writing has been shaped and received. Many of its tags are markedly different from those associated with the tagging of primary text editions or a linguistic corpus.
The act of encoding often involves choosing between two or more applicable tags. When Florence Nightingale's entry mentions that her "father had no male heir, so her cousin William Shore Smith acquired the family estate. He increased her income to £2,000 annually," this material seems to invite use of several tags which are structurally incompatible: a Wealth tag, or a Family Member tag with an attribute for the father, or another for the cousin. Choice of the Wealth tag (and the decision not to lengthen or break up the discussion by using more than one conceptual tag) indicates that the tagger considered the increase in income and the historical pattern of excluding women from inheritance most pertinent to a history of women's writing. This example brings home the extent to which tagging is inescapably interpretive and partial. (These issues are explored more fully in relation to the Orlando Project in "Can a Team Tag Consistently" and "Intertextual Encoding in the Writing of Women's Literary History".) Like any historical enterprise, Orlando is selective, produced according to the priorities of the historians. It has the advantage, however, that many of the overall priorities are legible in the tagset itself, and the results of particular decisions made are legible in the particulars of the markup.
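The consequence of such a choice for later searching can be illustrated with a small sketch. The tag names `wealth` and `family_member`, the `entry` wrapper, and the helper `passages_tagged` are hypothetical stand-ins for this example only; the passage itself is the one quoted above.

```python
import xml.etree.ElementTree as ET

# The passage quoted above, wrapped in a single hypothetical conceptual tag.
# "wealth" and "family_member" are invented names, not Orlando's actual tags.
ENTRY = """\
<entry writer="Florence Nightingale">
  <wealth>Her father had no male heir, so her cousin William Shore Smith
  acquired the family estate. He increased her income to £2,000 annually.</wealth>
</entry>
"""

def passages_tagged(root, tag):
    """Return the prose of every passage carrying the given conceptual tag."""
    return [" ".join("".join(e.itertext()).split()) for e in root.iter(tag)]

root = ET.fromstring(ENTRY)
wealth_hits = passages_tagged(root, "wealth")
family_hits = passages_tagged(root, "family_member")

print(wealth_hits)   # the passage is found by a search on wealth
print(family_hits)   # and missed by a search on family members
```

Under these assumptions the passage surfaces in a search on wealth and not in a search on family members, which is the sense in which the tagger's decision is legible in the markup.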


The questions of what to hyperlink, by what criteria, and where to target the link constitute another set of hard decisions for an electronic interface. Orlando consistently provides internal links using four 'core' tags: people's Names, the Titles of texts, Place names, and names of Organizations. Whenever a person, text, place, or organization is tagged more than once in the textbase, it is hyperlinked. Rather than making the link point to a single spot in the textbase, Orlando uses its encoding system to offer readers informed choices about which links to follow. Links screens (available also as part of author entries) provide links to all entries mentioning that person, text, place, or organization, with associated screens for viewing the relevant excerpts and timeline. The Links screen provides a perspective on authors produced from beyond their own particular entries: these links reflect the dialogue among the narratives and critical histories of various writers. The underlying tagging architecture allows a reader to choose which context(s) to pursue. These contexts (defined by specific tags or by groups of tags) offer a choice of directions for exploration.
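One way to picture the index behind such Links screens is sketched below. It is an assumption-laden illustration, not a description of how Orlando is implemented: the four tag names in `CORE_TAGS`, the entry identifiers, the sample content, and both functions are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical names for the four 'core' tags; Orlando's real names may differ.
CORE_TAGS = ("name", "title", "place", "orgname")

# Invented sample entries with placeholder content.
SAMPLE = """\
<textbase>
  <entry id="e1">Met <name>Writer B</name> in <place>Bath</place> at the <orgname>Reading Society</orgname>.</entry>
  <entry id="e2">Set <title>Example Novel</title> in <place>Bath</place>.</entry>
  <entry id="e3">Corresponded with <name>Writer B</name>.</entry>
</textbase>
"""

def build_links_index(root):
    """Map each (tag, value) pair to the entries in which it is tagged."""
    index = defaultdict(list)
    for entry in root.iter("entry"):
        for elem in entry.iter():
            if elem.tag in CORE_TAGS:
                key = (elem.tag, "".join(elem.itertext()).strip())
                index[key].append(entry.get("id"))
    return index

def hyperlinked(index):
    """Keep only items tagged more than once; list the entries to offer."""
    return {key: sorted(set(ids)) for key, ids in index.items() if len(ids) > 1}

root = ET.fromstring(SAMPLE)
links = hyperlinked(build_links_index(root))
for key, entries in links.items():
    print(key, entries)
```

In this sketch a link resolves to a list of entries rather than to a single target, so the reader chooses which mention to follow; items tagged only once are left unlinked.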
A select number of external links to online texts are provided, chosen conservatively in the hope of avoiding the frustration produced by 'stale' links or blocked access to licensed sites, not to mention inaccuracies in non-scholarly online editions.