The Proximal Past: Digital Archives, and the Here and Now

[This is the text of my keynote for Digital Transformers, Manchester Metropolitan University, 23 May 2013.]

Web access to digital resources of various kinds has made the past proximal. Digitized historical objects and the databases that house them are only a keyword search (and maybe a few logins) away. Yet proximal is an interesting word. Through its connection to proximate, it implies succession and causation, a kind of temporality; yet when opposed to distal, proximal operates spatially, indicating that which is closest to some sort of feature or phenomenon. What I want to do today is use these two meanings – proximity in time and proximity in space – to think about how the forms of digital media affect the way we construct the past. Whereas it might appear that digitization pushes the past away from us by rewriting the objects that document it, what it reveals is that these objects are already to some extent written: situated, stabilized and ready for use. Digital transformation changes the past by changing what we can do in the present.

I am a historian and theorist of nineteenth-century media based in a department of English. I tell the histories of printed objects of various kinds by inverting the traditional aims of the discipline: rather than focus on content, I’m interested in form; rather than identify the exceptional, I want to know about the generic; and the object of my analysis is not just text, but a version of textuality that encompasses materiality. There has never been a better moment to study the history of media. The different materialities that digital media put into play have denaturalized print culture, making explicit the role of media in producing meaning. For people like me who are interested in the actual objects that people read, digitization offers further advantages. The bibliographic complexities of nineteenth-century print culture – there is a lot of it, it’s very diverse, and it survives in fragmented runs – have encouraged researchers to approach it as if it were an archive of interesting texts rather than the surviving record of a set of cultural practices. However, its advantageous copyright status has attracted large-scale digitization projects from commercial publishers, libraries and scholars that have each exerted different degrees of bibliographic control. It has never been easier to read nineteenth-century texts in their original print contexts. But the crucial point is that this is reading predicated on seeing: these remediated forms are radically different, but this difference is itself mediated to enable something that resembles reading them in print. This morning, I want to focus on this attempt to reproduce but enhance, and what it means for understanding the past. Making historical objects proximal means making them something else.

My argument is in three stages. In the first, I explore why digital transformations are often understood in terms of deficit, with the new, digital object seen as an impoverished imitation of the historical object on which it is based. In the second, I argue that digital transformations are always interpretive, identifying one aspect of whatever they transform and making it into something new. Finally, I will claim that deficit models of digitization are the result of a mistaken conception of historical objects. The past does not reside in the various objects, printed or otherwise, that we have inherited; rather, it must be produced by the ways these surviving objects are disciplined and put to use. These objects, in other words, are already subject to transformation. Throughout, my focus is on digitizing the various nondigital objects that survive from the past – what Tim Hitchcock has recently described as ‘stuff we inherit from dead people.’ Underpinning my paper is an argument about the performative nature of both materiality and historical significance: the traditional, nondigital archive is already constructed and interpreted; when we transform it, we reconstruct and reconfigure, granting its objects new forms that invite new uses. All digital transformations are abstractions, but moving away from the archival objects does not mean that we move away from the past. Quite the opposite: it is easy to mistake the objects as they survive for the objects as they once were, enmeshed in various contingent moments of use. Yet unmediated access to the past has always been impossible. Going away is also coming back again.

Digitization as Deficit

If you ask most people what they want from digitization, it is access to the stuff. The affordances of digital objects – the way they can be reproduced, distributed and processed – mean that even the crudest – a page scan, for instance – can dramatically increase access. But the key question is access to what? If digital objects only serve as surrogates for nondigital objects, then they will always be in some way deficient. This deficiency is mitigated by identifying the key features of the source material – the way it looks, if it is to be read – and reproducing those at the cost of other, less important features. As the bulk of users come to digital resources to do something similar to what they would do with nondigital objects – read, in the case of the material I work with – it is important that designers put as few barriers in their way as possible. The rhetoric of the digital archive, where interface is presented as portal and distinguished from content, is about tempering the digital difference. The interface is where you search; the page images are what you read.

If we understand genre as social action, as a way of negotiating unfamiliar circumstances and transforming them into a species of the same, then we can see that many resources are designed to accommodate behaviours learned from interacting with nondigital objects. For commercial resources that seek to recoup costs through subscription this is particularly important, and so many are marketed as if they are transparent gateways to content. However, users never really leave the interface and it is what they do there that produces the distinction between the mediating framework of the resource and the content that it contains. Digitization takes place in an economy of loss and gain: what happens in these resources is that whatever is gained is appropriated as a kind of compensatory functionality that provides access to ‘content’ that is consequently marked as deficient. The new and distinctly digital properties are separated off, leaving only the minimal set of features that have been reproduced from the nondigital media and that allow users to do whatever it is that they already know how to do.

When lined up against the nondigital object upon which it is based, the digital object can only ever appear impoverished. A quick example. Charles Dickens had great success with his weekly periodical, Household Words. This was a weekly miscellany containing a range of literature, reviews, and commentary on a variety of subjects. It was explicitly designed to reach lower middle- and upper working-class readers, part of a typically Victorian project to extend ‘good’ reading to this constituency – while making lots of money for the journal’s proprietors, of course. Some of Dickens’s novels were first serialized in Household Words, but he also published other well-known authors from the period such as Elizabeth Gaskell and Wilkie Collins. Household Words was available in a range of formats:

  • weekly (2d)
  • monthly (9d)
  • six-monthly volumes (5s 6d, bound in cloth)
  • a set of 10 vols (£2 10s)
  • There was also the Monthly Narrative (2d; 3s for an annual volume) and an annual Almanac (4d)

The text of Household Words exists in multiple forms, each designed for a particular group of readers and moment of reading. Each object has its own set of meanings and each can be constructed as a kind of ‘original.’ The letterpress might be more or less the same, but each has its own history: reading the cheap weekly parts was not the same as reading them bound together as a (more expensive) monthly.

All serials have a complex relationship with temporality. The current issue is different to all those that have gone before as it, and only it, speaks to a moment that is still unfolding. On the appearance of the next issue, what was the current issue becomes part of the past, perhaps preserved as part of an archive, perhaps thrown away. Serials tend to be marked by their moment of publication and their place within the series. Each issue of Household Words, for instance, has a date on the masthead, as well as its number in the sequence; the bottom left-hand corner also records the volume number and in the bottom right is the page number, which runs in a series throughout the volume. In the monthly parts, each part is numbered in the inside cover, meaning that not only are weekly issues numbered, but so too are the monthly parts and six-monthly volumes, each in their own sequence. Serials segment time, marking it spatially. As the weeks pass, the pile grows; as the years pass, the volumes accrue on the shelves. Although constituting one series among many, each serial insists on a linear, standard time, reaching back and stretching beyond the current issue. However, serials also keep the past in play in the present. Each issue offers something new, but this novelty is carefully tempered by a set of recurring forms already known to readers. Layout, typography, the range of articles, the tone – a large part of the current issue is material that has been seen before, resurrected from previous issues. Readers are already good at recognizing this sort of material as part of what makes a serial, but with each issue they become adept at identifying it as separate from, and incidental to, the changing content. This repetition, which is integral to all serials, insists that the future is knowable and predictable, that any new content can be assimilated to known forms that have already been established and that will recur into the future.

The image shows the various ways that seriality is marked on the first page of the weekly number, cover of the monthly number, and titlepage of the six-monthly volume.

Serials are currently digitized as a set of page images that are indexed through an OCR-generated transcript. This allows keyword searching and, if they have been appropriately marked up, the retrieval of articles. For example, take ProQuest’s British Periodicals, published in 2007. In allowing users to search for text strings, the resource returns a list of articles, which are read, one after the other. By exerting a measure of bibliographic control over the archive, this resource makes it much more navigable than it ever could be in print; by offering page images as the reading text, it makes it much easier to see what the printed page looked like; and by making the articles accessible, it brings this material much closer to hand, allowing scholars to place it alongside the more familiar texts that have become canonized over time. However, this resource conceives of its source material as a repository of texts that are understood as identical to the words that each article contains. It privileges the verbal over the visual; the article over the section, page or issue; and the content that changes over the various formal features that mark seriality through their repetition. Although what is read is an image of the article on the page, rather than the transcript, there is no way to read in sequence, or interrogate any of the elements that make serials serial.
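
To make the logic of such a resource concrete, here is a minimal sketch in Python of how page images might be indexed through their OCR transcripts. It is not the implementation behind British Periodicals, which I have not seen: the record structure, the file names, the transcript text, and the functions `tokenize`, `build_index` and `search` are all invented for illustration.

```python
import re
from collections import defaultdict

# Invented records for illustration: one entry per article, pairing a
# page-image file (what the user reads) with an OCR transcript (what is searched).
ARTICLES = [
    {"id": "a1", "image": "vol01_p001.png", "ocr": "The railway station at night"},
    {"id": "a2", "image": "vol01_p004.png", "ocr": "A visit to the post office"},
    {"id": "a3", "image": "vol01_p009.png", "ocr": "Night at the post office"},
]


def tokenize(text):
    """Lower-case the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_index(articles):
    """Map each word in the transcripts to the ids of the articles containing it."""
    index = defaultdict(set)
    for article in articles:
        for token in tokenize(article["ocr"]):
            index[token].add(article["id"])
    return index


def search(index, articles, query):
    """Return the page images of articles whose transcripts contain every query word."""
    terms = tokenize(query)
    if not terms:
        return []
    matching = set.intersection(*(index.get(term, set()) for term in terms))
    return [a["image"] for a in articles if a["id"] in matching]


INDEX = build_index(ARTICLES)
print(search(INDEX, ARTICLES, "post office"))
```

Even in this toy form, the sketch shows what is privileged: the search matches words in the transcript, it returns articles rather than pages or issues, and it knows nothing of sequence.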

If we approach resources such as this as surrogates for the nondigital objects that they represent, they will always be deficient in some way. Yet deficiency can be instructive. This is transformation as interpretation: in focusing on one particular aspect of the source material, these resources make an implicit argument about what they think the objects are, and are for. In the case of British Periodicals, it is a searchable database of articles, which can be read as usual. However, in advancing this particular aspect of the source material, the resource also makes visible those aspects that had hitherto been taken for granted. When users remark that page size, for instance, is misrepresented in a digital resource, they are at least acknowledging its significance. The key is to reconceive loss as difference and use the way the transformed object differs to reimagine what it actually was. Critical encounters with digitized objects make us rethink what we thought we knew. And, because these digitized objects are radically different, the underlying data can be used to model this difference, creating new representations of both old objects and the connections between them.

People have begun to do this for resources based on nineteenth-century serials. I’m particularly interested in what Tim Sherratt has been doing with the data from Trove, the search engine of the National Library of Australia. His QueryPic allows users to generate instant visualisations of the occurrence of search terms over time, both as a percentage of the total content and as a raw total. His project The Front Page maps the constitution of the front pages of Australian newspapers by genre. This allows users to quickly visualize the decline of advertising and the rise of news; but it also makes clear the remarkable consistency of a print genre that deals with passing events. One of Sherratt’s arguments is that the emergence of news on the front page turns the newspaper inside out. The same is true of these visualizations: they subordinate content to form, making patterns visible by mapping both repetition and change.

The lines plot occurrences of ‘Charles Dickens’ and ‘Household Words’. The two lines spike and converge after 1850, suggesting that it was the popularity of Household Words that brought Dickens’s name before the Australian public.
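
The two measures behind a plot of this kind, raw total and percentage of total content, can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Sherratt's code and not a call to the Trove API: the `(year, text)` records are invented, and `term_series` is a hypothetical helper that counts articles containing a phrase.

```python
from collections import Counter

# Invented records for illustration: (year of publication, transcript) pairs.
ARTICLES = [
    (1849, "Shipping intelligence and notices"),
    (1849, "Sale of land by auction"),
    (1850, "Household Words, the new journal conducted by Mr Dickens"),
    (1850, "Shipping intelligence"),
    (1850, "Extract from Household Words"),
    (1850, "Notices to correspondents"),
    (1851, "From Household Words"),
    (1851, "Sale of stock"),
]


def term_series(articles, term):
    """For each year, return (raw count, percentage) of articles containing the term."""
    needle = term.lower()
    totals = Counter()  # all articles published in each year
    hits = Counter()    # articles in each year whose text contains the term
    for year, text in articles:
        totals[year] += 1
        if needle in text.lower():
            hits[year] += 1
    return {year: (hits[year], 100 * hits[year] / totals[year])
            for year in sorted(totals)}


for year, (raw, percent) in term_series(ARTICLES, "Household Words").items():
    print(year, raw, percent)
```

In this invented sample the raw count falls between 1850 and 1851 while the percentage stays the same, which is why it matters that both measures are available: each models a different version of the archive.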

This visualization plots the number of articles in each genre on the front page of the Sydney Morning Herald. What is striking is its formal consistency for almost a century.

Digitization operates in this economy of deficit because we misrecognize the status of the nondigital objects on which resources are based. Reading the content of digital resources as deficient representations of nondigital objects implies a transmission model of digitization whose ultimate goal is perfect simulation. Such a process is impossible as it depends upon being able to fully describe the source object, whatever it might be. We tend to think of these objects as well-defined and bounded, with a set of given properties and stable meanings. Yet materiality simply does not work like this. All material objects, digital or nondigital, are always in excess of their stated properties and always have the potential to become something new. Materiality is emergent, the properties of an object becoming tangible as they are put to some sort of use. The same is true for what historical objects might mean. As we know, not only does the historical significance of surviving objects change over time, but so too does the way in which they are identified and defined. This is the reason why using data to model historical phenomena does not necessarily take us further away from the past: all historical practice is already compensatory. What we have – whether around us, or in designated archives – is what has survived, and the persistence of these objects cannot but remind us of that unknown sum that has been lost. The virtual and actual acts they record – what an object was designed to do; what was actually done with it – similarly point to a set of histories that can only be reconstructed. Even well-researched historical objects mark an absence. They are already deficient.

History has a considerable stake in the stability of its objects; their fixity provides a necessary point of departure to which the discipline can return and correct itself. Yet the objects are also deficient, the product of a practice that defines them and, crucially, maintains that definition even as it acknowledges their deficiency through supplementary historical explanation. The objects of history are not out there, waiting to be found, but are produced by historians. They become interfaces that enable a particular practice, history, that, in turn, transforms them into something else, something authentic, the raw witnesses of the past. Our digital transformations are an extension of this process, reinterpreting historical objects by reconfiguring them, and thus enabling them to anchor new narratives that depend on digital properties. Digital media appear to threaten the aura of authenticity that allows an object to link to the absent past; but the authentic object is already the result of a discursive transformation, with its own history, materials, and processes. The proximal past is made possible by transformation, digital or otherwise.

Going Away and Coming Back

If digital resources are considered surrogates, then they can only be conceptualised in terms of deficit. This deficit can be turned to account if we use it to recognize the interpretive nature of digitization. Digital transformations assert something about both the material before transformation and the material that results. There is, therefore, always a politics at play and so transformation requires reflection and critique. We have to continue to choose objects for digitization for particular reasons, and be prepared to argue the case for them. This involves thinking carefully about the way objects are digitized, the way digitized objects are processed in specific environments, and the composition of the archives themselves. No archive is ever neutral and transformation is a cultural practice.

Digital transformations can make the archive proximal, bringing disparate collections together, and making them reconfigurable and processable. The proximal past seems to come at the cost of the aura of the original; yet this historical presence is a product of a particular relationship to the nondigital object and, to follow Benjamin, is amplified through the ways in which it is mediated. The point is not to recreate the object in all its latent materiality and significance, but rather to model a particular version of it for a particular end. In the case of historical documents, the most common historical object, this might be simply to encode textual content; however, even these objects mean more than is written upon them, and so can – and should – be transformed in multiple ways.

The archive, however conceived, has always been a more or less formal institutional repository for the past and a point of departure for the way the past is made knowable by the discourse of history in the present. Some digital archives recreate the logic of the library, declaring themselves places where history can be carried out, much as it had been previously. Yet these archives take their place alongside the many other products of digital culture, whether these are the digital artefacts generated, intentionally or not, as part of everyday life, or the way the present is narrated across various social media. The proximal past seems to collapse the difference between the ratified documents of a formal, institutionalised and disciplined history and the vast, apparently disordered evidence of a culture trying to understand itself. The boundaries between these well-demarcated archives and the rest of digital culture are porous and their constituent objects have the potential to be repurposed and reconfigured, combined and visualized in new combinations and at various levels of scale. However, the differences that are imperilled are themselves the products of a history, a particular way of conceiving of and doing things with objects, and so they are recoverable. This new archive – heterodox, unbounded, and changing – both witnesses the archives that have preceded it and allows their constituent objects to be used in new ways. History is a practice that transforms its objects, whatever they might be. Digital history, or rather, doing history today, requires scholars able to understand digital transformations and make them for themselves; to recognize how the objects in the archive have been transformed and will be transformed again. In other words, it requires scholars both to be digital transformers and to recognize that transformation creates a new object that, at the same time, redefines whatever it once was.