Where is the national Web, exactly? A case study

[A summary of my chapter in The Historical Web and Digital Humanities. The case of national web domains, edited by Niels Brügger and Ditte Laursen.
It is due to be published by Routledge in April. The full title is ‘Understanding the limitations of the ccTLD as a proxy for the national web: lessons from cross-border religion in the Northern Irish web sphere’]

The writing of modern history has often depended on a stable idea of the state; on the idea that persons have some form of citizenship, a legal identification with a political unit. Even if they may hold more than one, each citizenship may stand on its own without legal ambiguity. Another fundamental assumption is that geographical space (at least on land) can usually be clearly divided into units under unified and monopolistic systems of law and government. To elaborate an insight of Max Weber, in order for a state successfully to enforce a monopoly on the use of violence, it must first know where its boundaries are.

Scholars have also been interested in the interactions between states and their peoples across borders, but still (by and large) supposing a fixity in those states at any one point in time. Studies of migration presuppose a point of origin and a point of arrival. Printed publications may circulate freely, but their publication is still governed by a national legal framework; something similar may be said of television and other broadcast media.

The advent of the web presents historians with a new and somewhat perplexing question: where is it? What does it mean to think of the web in spatial and quasi-geographic terms? How may we write national histories of the web? Where did a particular website ‘live’? Of where was it a resident or citizen, so to speak?

In most cases, the task of defining a national web domain has begun with one or more country code top-level domains (ccTLD) even if it has not ended with them. Here I examine the nature of the .uk ccTLD as a proxy for the UK web by means of a case study of the web estate of the Christian churches in Northern Ireland.

The society of Northern Ireland is marked by an interlinking of religious and national identity, which may be unique in Europe if not in the world. The chapter uses publicly available data, including that provided by the British Library, to reconstruct the link relationships between churches in Northern Ireland, examining the regional, national, and cross-border relationships that they imply.

Due to its very particular religious and political history, Northern Irish society has been characterised by an exceptional sensitivity to symbols, to history, and to place. How far has that sensitivity to space and symbol been transferred online? Amongst the churches, Catholic and Protestant, in a province where the symbols of national identity have such prominence, does the location of a website within or outside the .uk domain carry any symbolic weight? Might those churches most associated with unionism be more likely to register in the UK ccTLD than Roman Catholic churches?

Based on the patterns of domain registration for the churches of Northern Ireland in 2015 and 2016, it would seem that Roman Catholic congregations were likely to register domains outside the UK ccTLD, a finding broadly in line with the initial hypothesis. However, the converse – in relation to the Protestant churches – is not borne out; no particular prioritisation of registration within the UK ccTLD is evident in the data. Both conclusions point to important areas of future research on the nature of national webs, and the limitations of the ccTLD as a proxy for them. If organisations that might be expected to want their web estate to reside within a particular national domain do not in fact register their domains there, it suggests that the ‘gravitational pull’ of the ccTLD is weaker than might be supposed.

The second half of the chapter takes the case of one of the Protestant denominations in Ireland in order to investigate the mapping (or lack of it) between the nation and the ccTLD. It recreates the networks of links between individual Baptist churches on both sides of the border, and asks: are these link networks influenced by the fact of the ccTLD, or are there more geographic and cultural factors in play that determine their shape? It is based on an analysis of the .uk link graph for the period 1996-2010.
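For readers who want a concrete sense of what this kind of analysis involves, the following is a minimal sketch in Python and not the method actually used in the chapter. It assumes a link graph reduced to pairs of hostnames; the hostnames, the `links` list and the helper names are all hypothetical, invented for illustration only.

```python
from collections import Counter


def tld(host: str) -> str:
    """Return the final label of a hostname, lower-cased (e.g. 'uk', 'org')."""
    return host.lower().rstrip(".").rsplit(".", 1)[-1]


# Hypothetical host-to-host links (source, target); not data from the chapter.
links = [
    ("church-a.example.org.uk", "church-b.example.com"),
    ("church-b.example.com", "church-c.example.ie"),
    ("church-a.example.org.uk", "church-d.example.co.uk"),
    ("church-d.example.co.uk", "church-e.example.org"),
]


def tld_counts(links):
    """Count how many distinct hosts in the graph fall under each TLD."""
    hosts = {host for pair in links for host in pair}
    return Counter(tld(host) for host in hosts)


def links_within(links, cc):
    """Keep only the links whose source and target both sit under one TLD."""
    return [(s, t) for s, t in links if tld(s) == cc and tld(t) == cc]


print(tld_counts(links))
print(links_within(links, "uk"))
```

In a toy graph like this one, most links either start or end outside .uk, which is the kind of pattern that would be invisible to an analysis confined to a single ccTLD.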

I conclude that although less than half of the Baptist web in Northern Ireland is registered in the UK ccTLD, the links between churches show in fact a very tight geographic concentration on the domains of churches in the eastern counties of Antrim, Down and (to a lesser extent) Armagh. Detailed local studies are needed to establish why this might be the case, although some lines of enquiry might be advanced. Is this a representation of a wider divide between rural and urban churches, or a reflection of the greater resources or perceived influence of churches in certain areas, particularly Belfast? Or is the prominence of certain individual churches merely the product of their particular local circumstances and understanding of their role? For whatever reason, the link graph shows little sign of sentiment regarding the common identity of all the Baptist churches in Northern Ireland.

These churches are linked together in a single organisation, the Association of Baptist Churches in Ireland: what evidence is there of link networks in the archived web that might reflect a sense of an all-Ireland identity? Approximately a quarter of Irish Baptist congregations are located in the Republic. What of the links from churches in the north to those in the south? The link graph connects only four Northern Irish congregations to twelve in the Republic, a very small proportion. Little all-Irish sentiment is to be detected in the Northern Irish Baptist web.

Why might that be? Is the weakness of link connections between north and south characteristic of all churches in Northern Ireland, or only the Protestant churches, or is it unique to the Baptists? Is the network particularly weak in the Baptist case because of the relative weakness of its national organisational structure? These questions could in part be answered by the application of the approach used here to the web estate of the other churches.

More generally, a history of the web is required that also asks what it is that causes the human actors in control of websites to link to others. A substantial project of oral history interviews and fine-grained examination of individual websites is needed to understand the communicative strategies organisations adopt and their evolution over time. That said, I show what may be observed at a distance with a new kind of data. Macro-level analysis of the web such as this offers an additional tool for historians and other scholars to deploy alongside their existing methods.

The chapter has also pointed out a particular challenge that historians and analysts of national webs face. In the Baptist case, a network of links that is very tightly geographically concentrated is at the same time spread across four different TLDs. Studies of particular web spheres such as this are so far very few. However, if the kind of pattern I have outlined is at all typical of other web spheres, it suggests that for web archivists and scholars alike the ccTLD is a weak proxy indeed for the national web.

In addition, it brings into sharp relief one of the structural disadvantages of the division of world web archiving activities into national programmes. Though many web archives collect national material beyond their ccTLD, no organisation has any statutory responsibility to archive the non-geographic domains such as .com and .org as a whole. Unless and until it becomes possible to access web archives on a transnational basis, scholars will continue to work with fragmentary and non-commensurable data from several archives to reconstruct the national web.

Existing Web archives: an orientation

Web archives are fast becoming the fundamental source with which the history of the Web is written. Scholars coming to them for the first time are in need of some orientation, however, since those archives are brought into being by many different organisations for varying purposes and by different means. Their scope and structure also vary widely, as do the means of first locating and then using them.

My chapter in the new Sage Handbook of Web History aims to provide just that orientation.

It begins with a brief historical sketch of the development of Web archiving over the last 20 years, which I discussed at greater length here. It then moves on to outline the different means by which these archives are created, and what implications those differences have for how they must be interpreted. It outlines the varied kinds of collections in existence, and the different questions of method that this variety raises for scholars. Finally, it details the means by which scholars may first locate archived Web content, and (once located) how it may be used.

Along the way, it raises several points of necessary critical engagement for Web historians regarding the archived Web as a new class of primary source. Some of these issues have their analogues in print, manuscript or other sources; a scholar needs to understand who produced an object, whether it be a book, a manuscript, a painting or a PDF. But some of the issues presented here are peculiar to the archived Web, and must be thought through afresh.

The technologies that are used to create archived Web resources fundamentally shape those resources, and so understanding those technologies is a prerequisite to understanding the archive. Crucial also is an understanding of how the archive is structured: along national lines, by the institution or sector that created the content, by format or by a more general subject.

Finally, users must also understand something of the means by which they discover, search within, view and analyse archived objects, since those means are both relatively new and in a state of flux and development. That thinking will be greatly enabled by close collaboration between scholars and archivists: a partnership of mutual benefit which shows welcome early signs of growth.

[See also, in the same volume, ‘Religion in Web history’, my essaying of an agenda for the religious history of the Web.]

Religion in Web history

My chapter in the new Sage Handbook of Web History is now published. I summarise it here.

The literature on the phenomenon of religion in computer-mediated contexts is now very large, having built up over two decades. That literature is also produced both in, and in the spaces between, more than one discipline: Internet Studies (which concerns itself with the nature of the medium); the sociology of religion; and religious studies, where scholars are concerned in particular with the relationship between religion and the media in general. The disciplinary labels vary between countries, but however it is named, little of this writing concerns itself directly with the kind of questions that most preoccupy historians.

This essay surveys the current state of Web history as it relates to religion, and falls into two halves.

Its first half attends to some debates of particular historical and methodological note with which the emerging history of religions on the Web may fruitfully be brought into conversation. These include debates concerning both the Web itself as a technological system, and religious responses to technological change in general.

It then sets out some points of contact between Web history and three key themes in contemporary religious history: secularisation; religious radicalism; and the place of religion in civic life and the law. It also argues for a fresh integration of the Web, and the archived Web in particular, with the study of offline religion, in pursuit of an ideal state in which the archived Web is merely one of many kinds of primary sources with which historians work.

The second half then takes a fourfold schema of different aspects of religions as they may be studied. The first of these is doctrine and religious knowledge: the symbols and forms of words that describe the divine, the world, the human person and their interrelations. Second are religious organisations and their representatives (clerical or lay). Third is religious practice: communal and solitary activities of prayer, worship and other rituals. Finally, the section on religions and the Other deals with the modes in which religious people and organisations encounter those outside: as potential proselytes, as discussion partners about wider social issues, and as antagonists. In each case, I identify the current state of research and set out elements of an agenda for future Web history research.

[See also, in the same volume, my introduction to existing Web archives.]

The silence of the archive. A review

[A review that appeared first in the LSE Review of Books.]

David Thomas, Simon Fowler and Valerie Johnson.
The Silence of the Archive.
Facet, 2017.

In the past two to three decades, the archival profession has been caught between two currents of cultural and technological change: simultaneous, largely unrelated, both apparently inexorable. Largely confined to the academy, but resonating beyond it, has been a radical scepticism about the stability of meaning in language resulting from the postmodern turn in historical thinking. Coupled with this epistemological scepticism has been a hermeneutic of suspicion of the power relations that are embedded in the creation, description and accessing of archival records. This has been bound up with the emergence of a wider politics of identity, and the assertion of the experience of marginalised groups as being equally worthy of documentation and study as those more ‘official’ voices that have traditionally dominated archives.

At much the same time, the transition from paper to digital in records management and archiving has presented the profession with challenges of exceptional scale and complexity, as laid out by David Thomas, former Director of Technology at the National Archives of the UK, in Chapter Three of this fascinating book. This transformation has fundamentally changed the ways in which live records are created and managed by organisations, with the significant added risk of mis-description as frontline staff are pressed into becoming their own archivists, and also of discontinuity in working IT systems such that data is lost or rendered uninterpretable. As these records pass to the archive, new and intractable challenges of scale come into play as archivists must select content for archiving and appraise it, presenting the difficulty of finding effective ways of describing these records and designing access systems that meet the needs of users.

For most working historians, much of the ferment of the discussion that these changes have prompted amongst archivists and theorists has been largely obscure; most of the literature that the authors (all three of them present or former TNA staff) synthesise here is to be found in the journals of the archival profession, into which historians rarely look. For those scholars whose only contact with archives is in the search room, this book will likely come as something of a revelation of just how far-ranging and radical some of that thought has been in the last ten years, and should be widely read for that reason alone. One might expect it also to find its way onto reading lists for introductory courses in the methods of archival research. It is therefore a matter for regret that the book, even in paperback form, is priced at a level that makes it unlikely that it will find its way into many private collections.

As a whole, the book has two major themes, one of which is acknowledged by both the title and the back cover, and another, equally if not more important, which is everywhere implied but rarely stated (to which this review returns below). Firstly, the theme of the title: the silence of the archive. The authors, along with Anne J. Gilliland in the foreword, identify an image that has formed in the public imaginary of the archive as a comprehensive repository of all known facts about the past. Scholars will differ on how potent and pervasive that image is, but the authors set out to show firstly that archives are neither comprehensive in this sense nor purely objective, even supposing such a state were possible.

Chapter One, by Simon Fowler, deals with ‘enforced silences’, whereby organisations conceal, amend or destroy records before they reach the archive, or where (as an unintended consequence of freedom of information legislation) records are never created as business is transacted informally. All manner of decisions are then made as the archivist selects which records should be preserved, appraises those records that are selected and removes material in the process, and then catalogues records in ways that bring certain aspects of a record to the fore while effectively silencing other voices. In addition, neither the transient quality of everyday life nor the lives of the majority of the population often come under the gaze of the state and so leave few traces. (Chapter Six by Valerie Johnson is instructive on the ways in which marginalised communities may be intentionally brought into view and their stories documented as a result.)

Professional historians are of course accustomed to engaging critically with the ways in which their archival sources come into being, but they will benefit nonetheless from this wide-ranging survey of the particular issues. In several places, however, a strangely critical note is struck: a suppressed frustration with the users of archives and their apparent inability to understand the issues. In Chapter Two, ‘Inappropriate Expectations’, Fowler quotes the historian Nicholas Rodger on the distaste of staff of the Public Record Office when asked to provide subject indexes: to do so ‘would imply that the Office had a duty to provide something the public wanted, instead of the public having a duty to shift for itself and leave the archivists in peace’ (54).

While this whole book is a testament to how far those kinds of attitudes have been eclipsed, glimpses occasionally show through. Archivists, we are told on page 45, are familiar with being ‘bombarded’ with questions which cannot be answered, by users who ‘struggle to understand’ the issues (60). Johnson writes (after Lisa Jardine) of the ‘longing of historians and researchers to find that golden key which will unlock the secret they are investigating’, which in some cases leads to false assumptions about evidence that does not in fact exist and (at the extreme) to the sorts of conspiracy theorising, fictionalisation and fabrication that Thomas explores in Chapter Five. Whilst some researchers can and do cross this line, the experience of this reviewer, at least, is that such cases are rare, and are perhaps overstressed here. Most historians are able to control their longing. That said, archive users, for their part, have no doubt been guilty of failing to appreciate the role of the archivist as something more than a mere fetcher and carrier of files as Johnson notes (146): there is work to do perhaps on both sides of the relationship.

To a certain extent, the book is let down by its title and chapter headings, since the focus on what is not possible obscures a more hopeful and arguably more important thread which appears explicitly only on page 141. Johnson asks where the responsibility for the documentation of society lies, and answers: ‘it has been the implicit argument of this book that we are all responsible, whether as creators of records or professional curators of those papers, or as users, researchers, historians and informed citizens.’ At this point, this reviewer must declare an interest, as one working to facilitate precisely this better working between archives and the users of their digital services.

Nonetheless, The Silence of the Archive is throughout a call for a new relationship between archivists, the ‘archival subjects’ (those whose lives are documented) and those who use the archived record. Johnson writes of the process whereby those archival subjects are engaged in the process of creating the archive of their existence, thus becoming co-creators with the archivist (149-53). Thomas points out the acute need in a digital archive for close engagement with end users, both in the selection of material and in the design of the interfaces that make those records first discoverable and then usable (70-72). It is a shame, then, that this call for change – necessary and urgent – is somewhat muted here; indeed, in general, the authors have a tendency to quote and expound the work of others rather than elaborate an argument, and could have been bolder. However, it is a case that should be widely heard. Records managers, archivists, historians and other users of archives should read this timely and important book.

New article: Users, technologies, organisations – towards a cultural history of Web archiving

This article is now published, in Web 25. Histories from 25 Years of the World Wide Web, edited by Niels Brügger. It is published by Peter Lang, in hardback, paperback and ebook formats. A postprint version is available to download here.

From the Introduction:

If 2015 marked the elapse of 25 years since the birth of the web, 2016 marked the 20th anniversary of web archiving: of systematic attempts to preserve web content and make it accessible to scholars and the public. As such, the time is ripe to make an initial assessment of the history of the movement, and the patterns into which it has already fallen. This chapter represents the first attempt to document the subject at length. It concentrates on what might be termed the cultural history of the movement. It does not address the question of how web archiving has been carried out, but why, by whom, and on whose behalf.

Historians have for long known that, in order to interpret archival materials properly, it is first necessary to understand how that archive came into being. Why is a particular object to be found, and not another? What does the archive seek to document, and whose interests does it serve? The last very few years have seen a very welcome growth in interest in the archived web among scholars. However, that interest is not yet accompanied by the necessary familiarity with how the archived web came into being, and to be thus familiar is arguably even more important in this context than for traditional paper-based archives. Older distinctions with which historians are familiar — between published document, ‘grey literature’ and institutional records — have become blurred, as have those between personal and institutional publication. As a result, it has become less clear where the responsibility for preserving which types of content lies among the established institutions in the library and archives field. In addition, the archived web resource is unlike the live version from which it was derived in subtle and complex ways that do not apply to print publications or to manuscripts. If this chapter serves to orient users as to some of the questions they should be asking of their sources, and of the institutions that provide them, it will have achieved its aim.

It falls into the following sections: The Internet Archive / National libraries / The corporate record / Research-driven archiving / Activist archiving / Users and the future