Web 25: Histories from 25 years of the World Wide Web

Niels Brügger (editor)
Web 25. Histories from the First 25 Years of the World Wide Web
New York, Peter Lang, 2017. Paperback at £36.

It’s always a great pleasure to have sight of a book in which some of your own work appears. Web 25 contains my short cultural history of the first 20 years of world Web archiving. But the book as a whole is full of other intriguing things, some of which I draw out here.

One of the most interesting areas (for me) in the emerging field of Web history is that of the early intellectual history of the Web: the modes in which people told stories about how the Web came into being and what it was good for (and the dangers it held). It was just this kind of research that my own paper at the ReSAW conference in June was aiming at (‘Utopia, dystopia and Christian ethics in the history of the Web’ (podcast)), and there are several points of contact with two papers here: Marguerite Barry on the ways in which the Web entered general public conversation; Simone Natale and Paolo Bory on understanding the early history of the Web as one instance of a ‘biography of media’.

There are also several intriguing chapters that examine the concrete histories of particular parts of the Web: Sybil Nolan on one particular news site (the Australian The Age Online); Elisabetta Locatelli on the genre of the blog in an Italian context; Michel Hockx on the development of the Chinese Web; Jean Marie Deken on one particular organisation, the Stanford Linear Accelerator Centre. Here we have case studies at every level of magnification: organisations, particular kinds of content, whole nations.

There is also methodological reflection: from Matthew S. Weber (‘The challenges of 25 years of data: an agenda for Web-based research’); Federico Nanni and Anwesha Chakraborty on integrating archived Web materials with other sources including interviews to build diachronic accounts of the evolution of a particular site; Anne Helmond on the importance of embedded third-party code as a means of understanding what she terms ‘historical website ecology’. It’s a potentially very fruitful approach that complements the kind of analysis of link relations between sites that I’ve attempted elsewhere. It also connects with Niels Brügger’s own chapter, a short history of the hyperlink.

Finally, in the same section as my own there are chapters on the experience of creating and managing Web archives themselves, both in national library contexts (Paul Koerbin on Australia, and Ditte Laursen and Per Møldrup-Dalum on Denmark) and beyond them: Camille Paloque-Berges writes on Usenet as an archive that falls outside the more established patterns into which Web archiving has fallen.

All in all, the volume is another part of an exciting upswing in interest in the idea of Web history, represented by The Web as History, the new journal Internet Histories and the forthcoming Sage Handbook to Web History.


Religion, law and national identity in the archived Web: new article

I’m delighted to say that an article of mine has appeared this week in a new collection of essays, edited by Niels Brügger and Ralph Schroeder: The Web as History (London: UCL Press, 2017, ISBN: 9781911307563).

My article is ‘Religious discourse in the archived web: Rowan Williams, Archbishop of Canterbury, and the sharia law controversy of 2008’ (pp. 190-203). It examines the controversy over a public lecture given by the archbishop on the interaction of civil and religious law, but from a new angle: the imprint the controversy left in the archive of the UK web. It makes particular use of British Library data documenting the link structure of the .uk country code top level domain for the period 1996-2010.

The whole thing is available as an Open Access PDF, but here’s my conclusion.

It is a brave historian who attempts to interpret the very recent past, as opposed to merely documenting it. As with most aspects of very recent history, the full significance of Rowan Williams’ lecture about sharia law will only become clear as the passage of time grants the historian a sufficiently long perspective from which to view it. An exhaustive qualitative examination of both the published record, and memoirs and private papers that are as yet inaccessible (not least the papers of the archbishop himself, not due to be released until 2038) will be needed to place the episode in its fullest context. Without these, we cannot yet know how changes in patterns of communication that are observable in the archived web were motivated, or how opinions expressed online related to broader patterns of social and intellectual change. However, even if it is difficult to explain changing patterns of religious discourse on the web, we may nonetheless document those changes.

First, the sharia law episode prompted a step-change in the levels of attention paid to the domain of the archbishop of Canterbury, as evidenced by the incidence of inbound links, and also a broadening of the types of hosts that contained those links. Second, a comparison of the inbound links to the Canterbury domain to that of the archbishop of York suggests that the historic privilege given to the views of Canterbury over those of York was extended onto the web. Regardless of their actual status in relation to each other within the Church of England, the media and the public at large seemed only to pay attention to Canterbury. Finally, a qualitative examination of the site of the British National Party shows that at least one organization, with a very particular concern with the place of Islam in British life, certainly took new account of the person of the archbishop as a result of the 2008 controversy.

This chapter has also sought to use the episode as a means of demonstrating both the potential for historians to utilize the archived web to address older questions in a new way, and some of the particular issues of method that web archives present. At one level, the methodological complications presented here – understanding the meaning of a link from one resource to another, say – are peculiar to the archived web and must be understood anew. As with all other born-digital sources, there is work to be done amongst historians in understanding these issues of method, and in acquiring the skills needed to handle data at scale. At the same time, it is part of the historian’s stock-in-trade to assess the provenance of a body of sources, its completeness and the contexts in which those sources were transmitted and received. The task at hand is in fact the application of older critical methods to a new kind of source: a challenge which historians have confronted and overcome before.

This chapter has also tried to show some of the potential available to historians, should they accept the challenge. In the study of public controversy, the archived web allows the detection of changing communication patterns at scale that would be impossible using a traditional qualitative method. It also enables the detection of attention being paid online in places where a scholar would not think to look. More generally, the chapter has attempted to outline an approach that combines quantitative readings of the links in web archives with qualitative examination of particular subsets of resources. When dealing with a new superabundance of historical sources, a combination of distant and close reading will be required to understand the archived web.

Forthcoming web archive conferences

2017 offers not one but two international conferences for scholars interested in the way we use the archived web. I’m particularly pleased to promote them here as I am a member of the programme committee for both of them.

There are calls for papers open now for both.

Curation and research use of the past Web
(The Web Archiving Conference of the International Internet Preservation Consortium)
Lisbon, 29-30 March 2017
Call for Papers now open

Researchers, practitioners and the archived Web
(2nd conference of ReSAW, the Europe-wide Research Infrastructure for the Study of Archived Web Materials)
London, 14-15 June 2017
Call for Papers now open

British blogs in the web archive: some data

While working on another project, I’ve had occasion to compile some data relating to the blog aggregator site britishblogs.co.uk (now apparently defunct), which appears in the Internet Archive between 2006 and 2012. I am unlikely to exploit it very much myself, and so I have made it available on figshare, in case it should be of use to anyone else.

Specifically, it is data derived from the UK Host Link Graph, which states the presence of links from one host to another in the JISC UK Web Domain Dataset (1996-2010), a dataset of archived web content for the UK country code top level domain captured by the Internet Archive.

It has 19,423 individual lines, each expressing one host-to-host linkage in content from a single year.
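For anyone wanting to read lines of this kind programmatically, here is a minimal Python sketch. It assumes each line has four pipe-separated fields (year, source host, target host, number of links); that layout, the sample hosts and the function names are assumptions made for illustration and should be checked against the file itself before use.

```python
from collections import Counter
from typing import NamedTuple


class Link(NamedTuple):
    year: int
    source: str
    target: str
    count: int


def parse_line(line: str, sep: str = "|") -> Link:
    """Parse one 'year|source|target|count' line (an assumed layout)."""
    year, source, target, count = line.strip().split(sep)
    return Link(int(year), source.lower(), target.lower(), int(count))


def links_per_year(links):
    """Count how many host-to-host linkages are recorded for each year."""
    return Counter(link.year for link in links)


# Invented sample lines, not taken from the real dataset.
sample = [
    "2007|britishblogs.co.uk|example-blog.blogspot.com|3",
    "2007|britishblogs.co.uk|another-example.wordpress.com|1",
    "2008|britishblogs.co.uk|example-blog.blogspot.com|2",
]
links = [parse_line(line) for line in sample]
print(sorted(links_per_year(links).items()))  # [(2007, 2), (2008, 1)]
```

If the real file uses a different separator, passing it as `sep` should be enough; a different number of fields would make `parse_line` raise a `ValueError` rather than silently misread the line.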

Since the blog as a format seems to be particularly prone to disappearance over time, scholars of the British blogosphere may find this useful in locating now defunct blogs in the Internet Archive or elsewhere. My sense is that the blogs included in the aggregator were nominated by their authors as being British, and so this may be of some help in identifying British content in platforms such as WordPress or Blogger.

Some words of caution. The data is offered very much as-is, without any guarantees about robustness or significance. In particular:

(i) I have made no effort to de-duplicate where the Internet Archive crawled the site, or parts of it, more than once in a single year.

(ii) also present are a certain number of inbound links – that is to say, other hosts linking to britishblogs.co.uk. However, these are very much the minority.

(iii) there is also some analysis needed in understanding which links are to blogs, and which are to content linked to from within those blogs (and aggregated by British Blogs).
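Caveats (i) and (ii) could be handled along the following lines. This is only a sketch: it works on (year, source, target) tuples, treats rows that repeat the same year and pair of hosts as duplicates, and treats a leading "www." as insignificant. Both of those choices are assumptions rather than properties of the dataset, and the sample hosts are invented.

```python
AGGREGATOR = "britishblogs.co.uk"


def normalise(host: str) -> str:
    """Lower-case a host name and drop a leading 'www.' (an assumption)."""
    host = host.lower()
    return host[4:] if host.startswith("www.") else host


def deduplicate(rows):
    """Collapse repeated (year, source, target) rows, keeping first-seen order."""
    seen = set()
    unique = []
    for year, source, target in rows:
        key = (year, normalise(source), normalise(target))
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


def split_by_direction(rows, hub=AGGREGATOR):
    """Separate links leaving the hub from links pointing at it.

    Expects rows that have already been normalised by deduplicate().
    """
    outbound = [r for r in rows if r[1] == hub]
    inbound = [r for r in rows if r[2] == hub and r[1] != hub]
    return outbound, inbound


# Invented sample rows, not taken from the real dataset.
rows = [
    (2007, "britishblogs.co.uk", "example-blog.blogspot.com"),
    (2007, "www.britishblogs.co.uk", "example-blog.blogspot.com"),
    (2008, "britishblogs.co.uk", "example-blog.blogspot.com"),
    (2008, "some-directory.example.co.uk", "britishblogs.co.uk"),
]
unique = deduplicate(rows)
outbound, inbound = split_by_direction(unique)
print(len(unique), len(outbound), len(inbound))  # 3 2 1
```

Caveat (iii) is not addressed here: telling blogs apart from the content they link to needs a judgement about each target host that a simple filter cannot make.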


Understanding the shape of the Anglo-Irish web: a pilot project

I’m delighted to be able to say that I shall be a Visiting Research Fellow at the Moore Institute of the National University of Ireland at Galway in 2015. Here are some details of what I plan to get up to.

The task of understanding what constitutes the nation in the web archive is only in its infancy. Web archivists in national libraries have long known that top-level domains such as .uk or .ie do not encompass all the content that should be considered British or Irish for the purposes of analysis. But even the task of understanding the shape of those top-level domains has only just begun. My project begins that process for the Irish web.

One of the live questions about the nature of the national web is the degree to which it interacts with other national domains. This is of particular interest in the Irish context, since many institutions on the island of Ireland interact in cyberspace in ways that do not respect the physical and political border between Northern Ireland and the Republic.

This pilot study will begin to examine this interaction by the triangulation of analyses of data available from the Internet Archive and from the British Library. In particular, the data from the British Library lists all of the outbound links in the .uk webspace for the period 1996-2010 (see this earlier post). Such a dataset does not exist for the Irish webspace, but by analysing the composition of links from .uk sites to those in the .ie domain, it will be possible to read the growth and composition of the Irish webspace in its reflection in the UK. It will also shed valuable and hitherto unseen light on one aspect of the relation between the UK and the Republic of Ireland.
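To give a sense of the kind of reading this involves, here is a minimal Python sketch that picks out links from .uk hosts to .ie hosts and groups the distinct Irish hosts by year. It assumes the link data has already been reduced to (year, source host, target host) tuples; that shape, the function names and the sample hosts are illustrative assumptions, not a description of the British Library's files.

```python
from collections import defaultdict


def is_ie_host(host: str) -> bool:
    """True if the host name sits under the .ie top-level domain."""
    return host.lower().rstrip(".").endswith(".ie")


def ie_targets_by_year(rows):
    """Map each year to the set of distinct .ie hosts linked to from .uk hosts."""
    by_year = defaultdict(set)
    for year, source, target in rows:
        if source.lower().endswith(".uk") and is_ie_host(target):
            by_year[year].add(target.lower())
    return dict(by_year)


# Invented sample rows for illustration only.
rows = [
    (1998, "example.co.uk", "example.ie"),
    (1998, "example.ac.uk", "example.ie"),
    (1998, "example.co.uk", "example.com"),
    (2004, "example.org.uk", "another.example.ie"),
]
growth = {year: len(hosts) for year, hosts in ie_targets_by_year(rows).items()}
print(growth)  # {1998: 1, 2004: 1}
```

Counting distinct target hosts per year is only one possible measure of growth; counting linking .uk hosts, or weighting by the number of links, would tell a different part of the story.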

The initial outputs will be a series of small case studies, documented on this blog. Over time, these will be synthesised into an appropriate article or articles. I also plan to make subsets of the data available for reuse by other scholars.

The Big UK Domain Data for the Arts and Humanities project has shown an appetite amongst humanities and social sciences scholars to understand the content of web archives, and also to understand the methodological implications of working with what amounts to a new class of primary source. I intend to use the period of the Visiting Fellowship to engage with scholars across the humanities and social sciences at NUI Galway and in other Irish universities, with a view to sowing the seeds of a community of scholars interested in exploring the archive of the Irish webspace.