Thursday, 8 March 2012

from 'traditional' to 'digital' history

Amanda Goodrich gave a paper on the meaning of aristocracy at the C18 Britain seminar at the IHR yesterday. She explored text-mining various sources (British Library 19th century newspapers, Eighteenth Century Collections Online, House of Commons Parliamentary Papers) and using other digital resources such as Google ngram viewer to chart the use of the word 'aristocracy' in eighteenth and nineteenth-century print.

Two points emerged from the paper, which related more to methods than to content:

1. using new digitised sources and databases as a 'layperson'. 

Amanda (and indeed I) researched and wrote our PhD theses back in the days before digitised sources. The old ways of calling up books and pamphlets in the British and Bodleian libraries, trawling through card catalogues in public libraries and local studies centres, transcribing text, cross-referencing using notecards and folders and post-its: these were the main methods of doing text-based research.

Now we have huge databases of digitised sources at our disposal. This is a massive change that has only really taken off in the past couple of years. We can call up those books in minutes at home rather than having to travel to repositories and wait for them to be delivered from the stacks. We can use computer databases and keyword searching to do some of the legwork of making connections for us.

How does that change our research, both in the methods we use and the results we come up with?

As Tim Hitchcock hinted in his plenary lecture to the Gerald Aylmer seminar the week before, to some extent historical research is no longer based on a selected number of texts. That number was often circumscribed by various factors, including the particular library the researcher used, the collection held by that library, the time taken to make notes from those books, and the intellectual capabilities of the researcher in making connections between those texts.

Now we have a potentially limitless number of books and texts to 'mine', with constantly evolving and increasingly sophisticated tools to do so. Some are scholarly and 'curated', such as Eighteenth Century Collections Online, but others are less well catalogued, in particular Google Books.

So although historians have always looked quantitatively at sources, they can now do so with much larger numbers of sources, from a much wider range of repositories, than is usually physically feasible. Where a survey of 500 pamphlets once took three years to complete, it can now take a few hours to consider 5,000 texts.

Do our research methods therefore change? Moreover, do we come up with new conclusions based on more quantitative, 'data-mined' research?

Most of the talks and papers that I've seen and heard on this topic are normally given by the producers of the databases and resources, rather than by their end users. We need to consider in more detail how 'laypeople' use, and can train themselves to use, these resources. How do we make sure they are aware of the problems? The OCR of the digitised texts is still very inaccurate. Keyword searches are often clumsy, too fuzzy, or simply miss relevant results. Issues of cost and access to certain sites still come into the process.
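The OCR problem can, to a degree, be worked around in the search itself. As a rough illustration (the sample text, cutoff value and approach here are entirely my own invention, not a feature of any of the databases mentioned), an approximate string match can catch common OCR misreadings of a search term, such as the long 's' being read as 'f' in eighteenth-century print:

```python
import difflib

def fuzzy_hits(keyword, tokens, cutoff=0.8):
    """Return tokens that approximately match the keyword, catching
    common OCR misreadings (long-s read as 'f', swapped letters, etc.)."""
    return [t for t in tokens
            if difflib.SequenceMatcher(None, keyword, t.lower()).ratio() >= cutoff]

# OCR-mangled sample sentence (invented for illustration)
ocr_text = "the ariftocracy and the arisfocracy of England ; the aristocracy itself"
tokens = [t.strip(';,.') for t in ocr_text.split()]
print(fuzzy_hits("aristocracy", tokens))
# → ['ariftocracy', 'arisfocracy', 'aristocracy']
```

A plain keyword search would have found only one of those three occurrences; the trade-off, of course, is that loosening the cutoff too far produces exactly the 'too fuzzy' results complained of above.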

2. how to use these new sources in teaching and PhD supervision.

New students starting out on their PhD research (and indeed undergraduates) now have all these sources at their disposal.

How does this change the nature of their historical research methods? How do we as supervisors, who did our PhDs the 'old fashioned' (and some would argue hard, although I beg to differ) way, teach students the skills to use such resources? Is it possible, and intellectually defensible, to research an entire thesis using just digitised sources and database search methods to do so?

We all have to consider these new possibilities, our methodologies, and our training as historians in this new age of the 'digital humanities'.

Friday, 2 March 2012

Locating the Past: part III

Finally, Tim Hitchcock of the University of Hertfordshire gave the plenary lecture, 'Place and the Intellectual Politics of the Past'.

It was a lecture of two halves. First, he lamented the lack of collaboration between historians and geographers, who have been divided by the STEM-versus-arts fragmentation encouraged by universities and funding bodies. He reflected on the 'spatial turn' currently en vogue among historians, which he rightly suggested was a casual rapprochement with geography, motivated by current academic fashions rather than a genuine desire to connect methods.

The 'spatial turn', as I have commented elsewhere, is in my view another extension of the cultural turn. It has a valuable emphasis on the symbolic and representative elements of space, but cannot provide a complete answer to the wider structures influencing historical action. Principally, it disregards the importance of place (as defined by custom, law, belonging, memory) in society and the economy.

Yet, as Southall had pointed out at the start of the day, Hitchcock underlined the problem of bridging the divide: most historians still trade in texts, mediated by ambiguity and disagreement, while most geographers trade in places of known certainty.

With the advent of the internet and digitised sources, however, we no longer read a circumscribed number of texts in detail. We are moving towards what Franco Moretti calls 'distant reading'. This is where Hitchcock pointed to the possibilities offered (and indeed already beginning to be achieved) by his involvement in such projects as Old Bailey Online, London Lives and Connected Histories. Digitised texts and connected data provide the bridge between geography and history, and point to whole new ways of thinking about historical evidence. Hitchcock argued that we now 'share a new culture of data', and this, he provocatively asserted, could and should lead to a 'bonfire of the disciplines'.

The second half of the lecture gave a more muted warning. Despite the amazing possibilities of data mining, geo-semantic tagging and the like, Hitchcock worried that we could become too immersed in technological approaches to history and geography. There was too much data and too few people.

He therefore closed with a tale of the late 18th/early 19th century street character Charles MacKay/McGee, who frequented the same spot outside the Wilkes obelisk in Fleet Street every day for more than 30 years. Tim pondered whether ‘some people have a greater right to appear on a map than many buildings’.

A fitting end to an inspiring day of talks and discussion.

Locating the Past, part II

My review of 'Locating the Past' at the IHR, 29 February, continued...

David Thomas, Director of Technology at The National Archives, explained some of the ways in which TNA is digitising its most popular maps. Nothing particularly radical there, but sorely needed.

Panel 2: Applications

Ian Gregory of Lancaster University, Richard Coates of UWE, and Nigel Walford of Kingston University showed us how GIS is transforming their research.

Gregory's paper was the most interesting for me, and it dealt with some of the issues raised earlier by Humphrey Southall about geo-semantics. Spatial Humanities: Text, GIS, Places is an ERC-funded project seeking to develop a GIS tool for text, amongst other aims.

The pilot project was Mapping the Lakes, which data-mined the texts of the journeys of the Romantics Gray and Coleridge to map their emotional responses to the landscapes of Cumbria. Colin Jones raised a query about the issue of literary genre and fictional licence with regard to this project (again, the question of spaces of representation came through here).

But I'm really excited about the other parts of the wider project, particularly their development of semantic tagging (using XML, possibly natural language processing) of place names in British newspapers. Andrew Hardie of Lancaster is working on mapping places mentioned in London-based newspapers of 1653-4, but they also aim to work on the British Library 19th century newspapers.

This is something that I wish to do. I would love a way of automating what I currently do by hand: map sites of meetings recorded in early nineteenth-century northern newspapers, and use something like natural language processing to associate those place names with their contextual information: type of meeting, numbers attending, and so on. This would enable me to map sites of protest and how they changed over time in response to physical changes in the urban landscape (especially significant in the early nineteenth-century wave of urbanisation and industrialisation) as well as political pressures.
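To make the idea concrete, here is a very rough sketch of the kind of automation I have in mind. Everything in it is a hypothetical placeholder of mine (the gazetteer, co-ordinates, keyword list, regular expressions and sample report); real newspaper text would of course need a proper historical gazetteer and genuine natural language processing rather than this toy matching:

```python
import re

# Hypothetical gazetteer: place name -> (lat, lon). Real work would draw on
# a historical gazetteer; these entries and co-ordinates are illustrative.
GAZETTEER = {"Manchester": (53.48, -2.24), "Leeds": (53.80, -1.55)}

MEETING_TYPES = ("radical meeting", "public meeting", "dinner")

def extract_meetings(report):
    """Find gazetteer place names in a newspaper report and attach simple
    contextual cues: the type of meeting and any attendance figure nearby."""
    hits = []
    for place, coords in GAZETTEER.items():
        for m in re.finditer(place, report):
            # Look at an 80-character window either side of the place name
            window = report[max(0, m.start() - 80): m.end() + 80]
            mtype = next((t for t in MEETING_TYPES if t in window.lower()), None)
            crowd = re.search(r"([\d,]+)\s+(?:persons|people)", window)
            hits.append({"place": place, "coords": coords, "type": mtype,
                         "attendance": crowd.group(1) if crowd else None})
    return hits

report = ("A public meeting was held at Manchester on Monday, "
          "attended by 4,000 persons.")
print(extract_meetings(report))
```

Even this crude window-matching shows the shape of the output I'd want: a list of place records, each with co-ordinates and contextual attributes, ready to be plotted and compared over time.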

Panel 3: Audiences and Engagement

Caroline Kimbell of TNA, Bruce Gittings of the Edinburgh Earth Observatory, and Nick Stanhope, CEO of HistoryPin, showed how outreach and community engagement are the most important and far-reaching implications of all this new technology. This truly demonstrated how historians and geographers can break the barriers of academia and give anyone the power to explore their history and place in ways unthinkable even a couple of years ago.

HistoryPin was, by a long mile, the most forward-thinking of all the projects showcased. I had initially been a little agnostic about the project, as it didn't feel that dissimilar from something like Flickr in enabling individuals to 'pin' their old photos onto places on the map, showing the 'before' and 'after' of historical change. Yet Nick Stanhope, a public speaker and thinker (perhaps in the 'TED' mode), knocked the spots off all the academic speakers in showing us how to speak and convince. The mission of his company is liberal-social, yet beyond the buzzwords of 'enabling communities' and 'bridging inter-generational gaps', I could see the real point of HistoryPin, both socially and intellectually. Their emphasis is on community action and groups, and their experiment with communities in Reading, charting their own histories using their own personal archives of photos, seemed like a great achievement.

Stanhope explained how HistoryPin's next objectives involve:

1. further local events, getting more groups together to form community archives.

This is crucial, not just for improving community relations, but also in opening up a wealth of photographs (and they hope oral histories and other documents) previously unavailable to historians. Although arguably local studies libraries and history groups have undertaken this sort of community history for a long time, this project feels like it can achieve much more, from the bottom up, giving individuals and groups a chance to curate their own archives and mini-museums.

2. channels and embeds

giving local museums and groups a 'channel' on HistoryPin with their own content. This also involves developing the technology for apps and other features on mobile devices.

This emphasis on face-to-face events to upload material onto HistoryPin has also enabled it to go beyond being just a faceless form of social media. However, Ben Anderson of the University of Gloucestershire asked whether this sort of community-based history-making risked one group imposing its own narrative over another, avoiding the searching questions about local histories that museums and historians might ask.

I look forward to how HistoryPin develops over the next year or so.

Thursday, 1 March 2012

Locating the Past, the Gerald Aylmer seminar at the IHR, 29 Feb 2012, part I

I attended the Gerald Aylmer seminar day at the Institute of Historical Research on 29 February, which had the theme of 'Locating the Past'. It was a stimulating and exciting day showcasing different ways in which historians, geographers, archivists and, for want of a better term, pioneers in social media for community histories, were using GIS and other technologies. It was definitely a forward-looking event, highlighting the great possibilities offered by mapping historical data of all kinds, but also indicated the potential problems and tensions with what's going on at the moment.

Humphrey Southall

Humphrey Southall began with a whizz-through overview of the basics of GIS and its underlying principles.

His main argument revolved around the way in which geographers and historical geographers, using current GIS software, are often confined to a geo-spatial definition of geography (maps and space). Historians, however, are generally most interested in a geo-semantic definition (text and place): 'places' are how most people think about geography, rather than co-ordinates.

It is difficult to represent 'places' and 'localities' adequately as points or polygons on a map. Historical places become even more problematic the further back in time we go: medieval parishes, for example, cannot be pinned down to specific co-ordinates, since we often know only their administrative centres, while their boundaries fade away into marshes and forests.

Southall stressed the need to develop geo-semantics within GIS, for example, using 'IsNear' and 'IsAdministrativelyPartOf' as possible terms, although these can never express all the relationships between places. He also highlighted the importance of representing history as linked data, a process that the Ordnance Survey is currently undertaking.
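To sketch what such geo-semantic relations might look like as data (the place names and triples below are illustrative inventions of mine, not drawn from Southall's work or the Ordnance Survey's actual linked-data sets), relations like 'IsNear' and 'IsAdministrativelyPartOf' can be held as simple subject-predicate-object triples and then queried or chained:

```python
# A toy triple store for geo-semantic relations between places.
# These particular places and relations are illustrative examples only.
triples = [
    ("Ancoats", "IsAdministrativelyPartOf", "Manchester"),
    ("Manchester", "IsAdministrativelyPartOf", "Lancashire"),
    ("Salford", "IsNear", "Manchester"),
]

def related(place, relation):
    """All places linked to the given place by the given relation."""
    return [o for s, p, o in triples if s == place and p == relation]

def part_of_chain(place):
    """Follow 'IsAdministrativelyPartOf' upwards to build a hierarchy."""
    chain = [place]
    while (parents := related(chain[-1], "IsAdministrativelyPartOf")):
        chain.append(parents[0])
    return chain

print(part_of_chain("Ancoats"))  # → ['Ancoats', 'Manchester', 'Lancashire']
```

The appeal of representing places this way is exactly Southall's point about linked data: the relationships carry the meaning, without forcing a fuzzy historical place into a single point or polygon, though as he noted, no fixed vocabulary of relations could ever express everything that connects places.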

Southall concluded by 'opening' his project Old Maps Online, which brings together all the freely available old maps through an easily searchable portal. It's still a work in progress, but the range of maps already there is pretty impressive. It's a little thin for my area of interest, the north of England: for example, it hasn't got the layers that the Manchester Public Profiler has, and there are no [?] pre-1836 maps.

Panel 1: Sources

Kimberley Kowal, lead curator of digital mapping at the British Library, Dominic Fontana from the University of Portsmouth, and Andrew Hudson-Smith from the Tales of Things project, offered us three projects that are in one sense very different but actually dove-tailed in their attitude to sources.

The British Library maps online project successfully used crowd-sourcing to geo-reference their historical maps.

Fontana's project involved using GIS software to map paintings of the Battle of the Solent of 1545 and speculate about why the Mary Rose sank in the place she did.

Tales of Things, by contrast, uses QR codes to tag objects with 'memories', allowing them to tweet and/or play back recordings. Hudson-Smith gave a slick 'TED'-style presentation, and although he showed how their collaboration with Oxfam raised the charity's takings, he seemed very unsure about the wider purpose of this technology and, more significantly, its consequences. [BTW, who really uses QR codes anyway? Will they really last as a form of technology with these objects, as he suggested, 'from cradle to grave'?]

I played the pernickety historian and put the difficult question to all three speakers. What concerned me was their attitude to the sources. Two inter-related issues bothered me (and, to my relief, the other historians in the room too):

1. accuracy and reliability. 

All three projects seemed to be concerned with what they saw as 'the truth', or the most 'accurate' mapping.

Fontana's selection of paintings made a great story because they were surprisingly geographically accurate in the placing of forts, coastline and so on. Kowal said that the British Library selected a quarter of their map collection on grounds of geographic accuracy, so that the maps would not warp too much when geo-referenced. But what about the other three-quarters? Can't those maps tell historians something about why and how mapping techniques changed over time? The fact that they are inaccurate is important, and not a reason to reject them. I wondered what would happen if participants in the Tales of Things project lied about their objects, as those misrepresentations are surely as interesting as truths, but Hudson-Smith seemed perturbed by this suggestion and proclaimed that he had faith in human nature not to lie.

2. spaces of representation

Cultural geographers have all read their Denis Cosgrove, their Henri Lefebvre and their Edward Soja. Indeed, I thought much of this was old hat. So it seems obvious to me that both maps and objects are never neutral, but rather inhabit what Lefebvre calls spaces of representation (Soja's thirdspace).

Maps and objects are not meant to be accurate representations of reality, but rather (as with all cultural items) are representations of ideology, power, politics, and the intentions of their producers and patrons. The paintings of Henry VIII in the Solent are very much a case in point of ideological representation. So whether or not they are geographically accurate is beside the point: they are something more than that. I asked what happens with maps that aren't geographically accurate because they have a point to make - a ducal claim to assert, a landowner's sketch of his estate, a town's portrayal of itself as civil rather than slum. What about the many paintings which use artistic licence to show certain landscapes, emphasise certain features, or symbolise their owner's power? And how are those meanings received, and indeed subverted (Soja's thirdspace), by their viewers and users?

Again, this did not seem to be taken into account by the three speakers, who gave the impression that their objects had neutral meanings. I worried about the three-quarters of the maps rejected by the BL because they were not accurate - again, that's what makes them more interesting to political and cultural historians, because of the layers of meaning they portray.

More review of the day to follow in the next part...