Towards a cultural history of web archiving

[UPDATE: this article is now published; see the free pre-print version here.]

This week I’m writing the first draft of a chapter on the cultural history of web archiving, for a forthcoming volume of essays (details here). It is subject to peer review and so isn’t yet certain to be published, but here’s the abstract.

I should welcome comments very much, and there may also be a short opportunity for open online peer review.

Users, technologies, organisations: towards a cultural history of world web archiving

‘As systematic archiving of the World Wide Web approaches its twentieth anniversary, the time is ripe for an initial historical assessment of the patterns in which web archiving has fallen. The scene is characterised by a highly asymmetric pattern, involving a single global organisation, the Internet Archive, alongside a growing number of national memory institutions, many of which are affiliated to the International Internet Preservation Consortium. Many other organisations also engage in archiving the web, including universities and other institutions in the galleries, libraries, archives and museums sector. Alongside these is a proliferation of private sector providers of web archiving services, and a small but highly diverse group of individuals acting on their own behalf. The evolution of this ecosystem, and the consequences of that evolution, are ripe for investigation.

‘Employing evidence derived from interviews and from published sources, the paper sets out to document at length for the first time the development of the sector in its institutional and cultural aspects. In particular it considers how the relationship between archiving organisations and their stakeholders has played out in different circumstances. How have the needs of the archives themselves and their internal stakeholders and external funders interacted with the needs of the scholarly end users of the archived web? Has web archiving been driven by the evolution of the technologies used to carry it out, the internal imperatives of the organisations involved, or by the needs of the end user?’

What’s in a (top-level domain) name?

[UPDATE (February 2019): this work is very shortly to be published; see details.]

I think there would be general agreement amongst web archivists that the country code top-level domain alone is not the whole of a national web. Implementations of legal deposit for the web tend to rely at least in part on the ccTLD (.uk, or .fr) as the means of defining their scope, even if supplemented by other means of selection.

However, efforts to understand the scale and patterns of national web content that lies outside national ccTLDs are in their infancy. An indication of the scale of the question is given by a recent investigation by the British Library. The @UKWebArchive team found more than 2.5 million hosts that were physically located in the UK without having .uk domain names. This would suggest that as much as a third of the UK web may lie outside its ccTLD.

And this is important to scholars, because we often tend to study questions in national terms – and it is difficult to generalise about a national web if the web archive we have is mostly made up of the ccTLD. And it is even more difficult if we don’t really understand how much national content there is outside that circle, and also which kinds of content are more or less likely to be outside the circle. Day to day, we can see that in the UK there are political parties, banks, train companies and all kinds of other organisations that ‘live’ outside .uk – but we understand almost nothing about how typical that is within any particular sector. We also understand very little about what motivates individuals and organisations to register their site in a particular national space.

So as a community of scholars we need case studies of particular sectors to understand their ‘residence patterns’, as it were: are British engineering firms (say) more or less likely to have a web domain from the ccTLD than nurseries, or taxi firms, or supermarkets? And so here is a modest attempt at just such a case study.

All the mainstream Christian churches in the island of Ireland date their origins to many years before the current political division of the island in 1921. As such, all the churches are organised on an all-Ireland basis, with organisational units that do not recognise the political border. In the case of the Church of Ireland (Anglican), although Northern Ireland lies entirely within the province of Armagh (the other province being Dublin), several of the dioceses of the province span the border, such that the bishop must cross the political border on a daily basis to minister to his various parishes.

Anglican Ireland. (Church of Ireland, via Wikimedia Commons, CC BY-SA 3.0)

How is this reflected on the web? In particular, where congregations in the same church are situated on either side of the border, where do their websites live – in .uk, or in .ie, or indeed in neither?

I have been assembling lists of individual congregation websites as part of a larger modelling of the Irish religious webspace, and one of these is for the Presbyterian Church in Ireland (PCI). My initial list contains just over two hundred individual church sites, the vast majority of which are in Northern Ireland (as is the bulk of the membership of the church). Looking at Northern Ireland, the ‘residence pattern’ is:

.co.uk – 23%
.org.uk – 20%
.com – 17%
.org – 37%
Other – 3%

In sum, fewer than half of these sites – of church congregations within the United Kingdom – are ‘resident’ within the UK ccTLD: some 43%, counting .co.uk and .org.uk together. A good deal of research would need to be done to understand the choices made by individual webmasters. However, it is noteworthy that, for Protestant churches in a part of the world where religious and national identity are so closely identified, to have a UK domain seems not to be all that important.
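For anyone wanting to repeat the exercise on another sector, a tally of this kind can be sketched in a few lines of Python. This is only an illustration, not the method actually used here: the function names and the sample URLs are invented, and it assumes each URL carries a scheme (http:// or https://) so that the hostname can be parsed.

```python
from collections import Counter
from urllib.parse import urlparse

# Suffixes of interest (an assumption for this sketch); anything else is "other".
SUFFIXES = [".co.uk", ".org.uk", ".com", ".org"]


def classify(url: str) -> str:
    """Return the first suffix in SUFFIXES that the hostname ends with, or 'other'."""
    host = (urlparse(url).hostname or "").lower()
    for suffix in SUFFIXES:
        if host.endswith(suffix):
            return suffix
    return "other"


def residence_pattern(urls):
    """Return the percentage of sites falling under each suffix, rounded to whole numbers."""
    counts = Counter(classify(u) for u in urls)
    total = sum(counts.values())
    return {suffix: round(100 * n / total) for suffix, n in counts.items()}


# Invented sample URLs, for illustration only.
sample = [
    "http://congregation-a.example.org/",
    "http://congregation-b.example.org/",
    "http://congregation-c.example.co.uk/",
    "https://congregation-d.example.com/",
]
print(residence_pattern(sample))

# The percentages reported above, with the .uk suffixes summed.
reported = {".co.uk": 23, ".org.uk": 20, ".com": 17, ".org": 37, "other": 3}
uk_share = sum(p for s, p in reported.items() if s.endswith(".uk"))
print(uk_share)
```

The ‘other’ category is left out of the .uk sum because the figures above do not say what it contains; even if all of it were within .uk, the total would remain below half.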

Notes
1. My initial list (derived from one published by the PCI itself) represents only sites which the central organisation of the denomination knew existed at the time of compilation, and there are more than twice as many congregations as there are sites listed. However, it seems unlikely that that in itself can have skewed the proportions.

2. For the very small number of PCI congregations in the Republic of Ireland (that appear in the list), the situation is similar, with fewer than 30% of churches opting for a domain name within the .ie ccTLD. However, the number is too small (26 in all) to draw any conclusions from it.

Dramatic adaptations of James Joyce’s ‘Dubliners’ in 1960s Belfast

Scholars of James Joyce (one of whom I am assuredly not) may be interested in a chance discovery in the archival collections of the National University of Ireland Galway. Obscured by an incomplete catalogue record is the existence of adaptations for the stage of three of the stories in Joyce’s Dubliners, one of which at least was produced by the Lyric Players in Belfast in March 1963.

File T4/75 in the Lyric Theatre/O’Malley archive is catalogued as concerning a triple-bill production of plays by W.B. Yeats, J.M. Synge, and Lady Augusta Gregory. On examination of the file, however, the programme shows that the production was in fact of four plays rather than three. The fourth was an adaptation of ‘Grace’, one of the stories in Dubliners, made by Maureen O’Farrell and James O’Connor. In the same file there is a script of the same that establishes the point. O’Farrell (later Maureen Charlton) was involved in the Belfast theatrical scene, and adapted Synge’s Playboy of the Western World as a musical. The file also contains a number of photographs of the production of ‘Grace’.

In the same file there is a second script, typed on the same yellow paper, with a missing first page. This appears to be a similar adaptation of ‘Ivy Day in the Committee Room’, also from Dubliners. However, it doesn’t seem to have been produced, although it was presumably considered.

If my identification is correct, it also makes sense of file T4/432 in the same archive, which contains a third adaptation in the same typescript on the same yellow paper of ‘The Dead’, a third Dubliners story. The catalogue records this as of an unknown adaptor, although it seems likely that this was also the work of O’Farrell and O’Connor.

It may be that these adaptations are well known to Joyce scholars; but I record them here in case they are not.

Understanding the web of faith: forthcoming book chapter

[UPDATE: this chapter has now been published.]

I’m very pleased to say that an essay of mine has been accepted for a forthcoming volume: The Web as History: the first two decades. It is edited by Niels Brügger and Ralph Schroeder, and will appear Open Access with UCL Press in 2016.

Here’s my abstract:

‘Much of the discourse that historians of contemporary religion until recently tracked in correspondence, periodical publication and print ephemera has migrated online. But the task of understanding religious discourse in the UK web space has hardly begun. The task is hard to undertake at the highest level since there are no second-level domains that serve as useful units of analysis — there is no faith.uk to match nhs.uk or ac.uk.

‘This chapter represents a first step towards understanding the evolution of the UK religious web space, by means of two interrelated case studies, which between them point to the agenda and content of a larger research project. Both studies utilise the JISC UK Web Domain Dataset for the period 1996-2008, as held by the British Library.

‘Firstly, it will examine the web archive footprint left by the public controversy in 2008 over the comments made by Rowan Williams, archbishop of Canterbury, on the matter of sharia law. Using both the link graph and a direct qualitative analysis of archived content, it will explore both the shape and the content of the controversy and show the degree to which religious debate had not only migrated from print to the web, but in doing so had engaged different actors and lost others, and changed in its tone.

‘Secondly, it will consider the growing tension in religious discourse between faith groups and organisations with a secularist agenda. Again, using the link graph and some qualitative analysis, it will explore the patterns in which linkages grew and shifted between the web estates of key but opposed organisations in relation to issues including faith schools and creationism, the reform of the law on blasphemy, and the place of the bishops in the House of Lords.’

Summer 1914: an exhibition

While in Paris last month I had the chance to visit the splendid exhibition at the Bibliothèque Nationale de France, on ‘Summer 1914’ (Été 14), which reflects on the First World War, as do so many other events and exhibitions at the moment.

One can feel the early signs of over-exposure to WWI, with a number of months still to go until the anniversary of the outbreak on 28 July. Nonetheless, I wanted to draw attention to this stunning exhibition. Its effect is sobering, melancholy perhaps, and is achieved without any of the usual props and devices of Grande Guerre remembrance; it is all the more effective for it. Without a trench or a mannequin in uniform to be seen, it weaves its spell.

It is confined in scope to the period immediately before 28 July, and the few days that followed, and succeeds triumphantly in evoking a civilisation sleep-walking into a catastrophe that only a few could imagine, and that only dimly. In the long summer of 1914, travellers still took the Chemin de Fer du Nord to resorts such as Boulogne-sur-Mer; paysans continued to make hay while the Parisian elite partied in the parks; and a Frenchman won the London Marathon. Not all is calm – suffragette agitation is one sign of gears grinding – but Europe’s interlocked monarchies had, it seemed, little to fear from the assassination of Franz Ferdinand – a little local difficulty.

Even once war had begun, this exhibition shows all sides marching resolutely backwards into the conflict. There are manuals of troop movements here, cavalry techniques and siege warfare, reflections on colonial wars past. Old understandings of old enemies are pressed into service again: one cartoon shows Thor, the ‘most barbarous of all the gods of the old Germany’ (‘la plus barbare d’entre les barbares divinités de la vieille Germanie’). There was also a revival of older Christian means of sanctifying the fallen, a cult of martyrs ‘pour la patrie’ and their ‘glorious mourning’.

As one might expect, interwoven with the diplomatic correspondence and other official documentation are sources from writers and artists: Romain Rolland, André Gide, de Chateaubriand and others are represented here. But the exhibition neatly avoids any easy implication that artists were any more perceptive of the horrors to come. Thomas Mann could write in 1914 of a coming war as ‘une purification, une libération’, even an immense hope.

If you find yourself in Paris before 3 August with a couple of hours to spare, don’t miss this exhibition.