A little while ago I wrote a post about the need to plan for archiving the digital “papers” of historians. In that post I talked about research data (what we used to call “notes”); about the systems that form the bridge between that data and the writing process; and about written outputs themselves, and their various iterations. It looked forward to a time when all these digital objects, in multiple formats but from one mind, are available to future students of the way the discipline has developed.
What that post neglected was data about the way I publicise my work. Perhaps one of the reasons we’ve been slow to think about this is that, at one time, most academics didn’t need to. Apart from giving papers at gatherings of the learned, the task of publicising one’s work belonged to the publisher. And if one’s publisher was the right one, then the work would inevitably end up in the hands of the small group of people who needed to know about it. And whilst the media don is not a new phenomenon, most historians might have thought such self-publicity outside the academy something of an embarrassment, even rather vulgar.
How times change. Universities are training their staff in dealing with the traditional media and in the most effective way of using social media. And this opens up a new category of data that ought to be archived, if only to understand how the push for ‘impact’ actually played out in these early years. And some of it is being archived. The Library of Congress are archiving every tweet, although it isn’t yet clear how that archive may be made available for use. The UK Web Archive, along with other national web archives, have been archiving selected blogs (including this one) for several years, and the EU-funded BlogForever project is looking to join those projects up. But this approach, valuable though it is, separates the content from the author, and from the rest of their digital archive. Whilst that link might be retrievable at a higher discovery layer, something important is still lost.
But now the helpful folk at Twitter, in a move that ought to be applauded, have made it very quick and easy to download an archive of one’s own tweets, right back to the beginning. And so I did: 1682 tweets, over 14.5 months. But what to do with it?
Straight away, scrolling through a long CSV file starts to tell the story of the making of other things: the first retweet of someone else’s work which was subsequently to influence my own; the first traces of an idea, or even of a question I was beginning to ask, which spawned a blog post, and then a paper. I also find that I shared at least one link in more than two thirds of my tweets, which sounds public-spirited until I add that a good proportion were my own posts. I can start mining the data for key terms and themes, and how they ebbed and flowed.
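For anyone who wants to try the same counts on their own download, here is a minimal sketch in Python. It assumes the CSV has a column called `text` holding the body of each tweet; that column name, the `summarise` function and the three sample tweets are my own illustrations rather than anything taken from Twitter's documentation, so check them against your own file.

```python
import csv
import io
import re
from collections import Counter

# Rough patterns for links and hashtags; good enough for a first pass.
URL_RE = re.compile(r"https?://\S+")
HASHTAG_RE = re.compile(r"#(\w+)")


def summarise(rows, text_field="text"):
    """Count tweets, tweets containing a link, and hashtag frequencies.

    `rows` is any iterable of dicts (for example a csv.DictReader).
    `text_field` is an assumed column name; change it to match your file.
    """
    total = 0
    with_link = 0
    hashtags = Counter()
    for row in rows:
        text = row.get(text_field) or ""
        total += 1
        if URL_RE.search(text):
            with_link += 1
        # Lower-case the tags so that #History and #history count together.
        hashtags.update(tag.lower() for tag in HASHTAG_RE.findall(text))
    link_share = with_link / total if total else 0.0
    return {
        "total": total,
        "with_link": with_link,
        "link_share": link_share,
        "hashtags": hashtags,
    }


# Invented sample data standing in for a real archive file.
SAMPLE = """text
Reading about web archives http://example.org/a #webarchive
New post on archiving historians' papers http://example.org/b #History #webarchive
Thinking about impact
"""

summary = summarise(csv.DictReader(io.StringIO(SAMPLE)))
print(summary["total"], summary["with_link"], round(summary["link_share"], 2))
print(summary["hashtags"].most_common(2))
```

To run it against a real download, replace the `io.StringIO(SAMPLE)` part with an opened file, for instance `open("tweets.csv", newline="", encoding="utf-8")`, where the file name is again only a guess. Counting by month, to see how themes ebb and flow, would need a timestamp column as well, which I have left out here.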
It would be useful if there were a way to keep this data fresh, of course, to avoid going back to Twitter for a new download every so often. And, thanks to @mhawksey, there is a simple way of doing this, using Google Drive. Martin explains all here, with a handy video set-up guide.
And so I now have a cloud-based archive of my tweets, complete with a basic search and browse web interface. This is now a lazy man’s look-up of old tweets and the resources they pointed to, searchable by handle, hashtag or key term.
But perhaps this is something about which most people are lazy. Social media provides us with an overwhelming stream of quite-interesting things, in amongst which are nuggets of gold. Those nuggets I can manage in the old way, by recording them properly, perhaps in a bibliography. I might even read them, one day. But the quite-interesting stuff, whilst being too much ever to record properly, will probably remain quite interesting. And so this provides a middle way between formal curation of a webliography and just searching the live web (which assumes I can remember enough about what I’m looking for).
Might this archive now change my future tweeting? Early days to judge perhaps. But I think it may, since I may now retweet and share in preference to using favourites, in order to get a link to a resource into the archive. I can also imagine starting to use personal hashtags, as a way of structuring my own archive at the same time as I tweet. Real-time curation perhaps?
And I might share it too. Since this is now unambiguously my own data, rather than Twitter’s, I can license it for reuse by others in larger corpora for analysis. Imagine a pooled archive of the tweets of many historians. Now that would be interesting.