On Monday March 31 the Recordkeeping Roundtable hosted an evening event exploring the possibilities for better, more connected online access to archives. Our aim was to consider ideas put forward by Chris Hurley in his ‘modest proposal’ for improving access to archives and other records. To do this we enlisted the help of three speakers: Richard Lehane, Mike Jones and David Roberts. Read about our speakers here.
After an introduction by moderators Cassie Findlay and Kate Cumming, Chris Hurley started things off by addressing some of the observations and ideas in his proposal.
Chris has been thinking about how we can improve access to very large volumes of records while ensuring that ‘in depth’ contextual and structural description is implemented and available as needed. His report in 1986 to the now defunct Australian Council of Archives proposed a methodology by which archives could collectively pool descriptive information. Since that time, he noted, we seem to have made little progress.
Chris made the point that archival data is volatile; unlike library based description, the ‘object’ is always changing. That has implications for federation. Federation without standardised rules is the only valid model, he argued, as it’s a model that doesn’t interfere with existing arrangements. We want the ‘gathered’ and the ‘ungathered’ stuff in our federation; room for both the ‘gold plated’ and what he calls the ‘barefoot’ descriptions, from small community archives, for example. It is the functional requirements for federation that need to come first, Chris argued – not standards.
Chris then took us through some models for descriptive practices, which he characterised as the blender (for example RAD), the harvester (for example Trove), and ‘through the looking glass’, a new model. In ‘the blender’ model you get everyone working with the same standards, put everything in and churn it through, and a unified product comes out at the end. The problem is that everyone has to be aligned at the front: you have to fit the gold plated standard. With ‘the harvester’ you go out and scoop up other people’s descriptions. The harvested descriptions are flat, with no depth, and they raise duplication issues because the description then exists both in the harvester and in the native application. This is a problem to be solved in federated access – there will be duplicated content, new records effectively, that will require management.
Under the third model, the ‘through the looking glass’ model, you could have a wiki structure that envelops the native descriptive environment – describers are essentially projecting themselves out into the world. The native environment stands alone, supported by standardised wiki content around it. Wiki results would display based on documents, deeds and doers, on categories based on ABS descriptions, and on subjects based on big picture headings like the Great Depression. Users could filter by date and format, and by other groups and categories such as deeds, doers and documents. As with all wikis, there would be capacity for citizen archivists to come in and add search terms etc within the wiki structure. Chris was, however, careful to note that his ideas may not mean a wiki per se. It’s about contributing description to wrap around and reflect ‘native’ descriptions – however you achieve this technologically.
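To make the ‘wrap around’ idea a little more concrete, here is a minimal sketch in Python of what a wrapper entry and a simple filter might look like. It is our own illustration, not something Chris presented: the `WrapperEntry` structure, its field names, the sample entries and the example.org URLs are all hypothetical, and a real implementation could look quite different. The point is only that the wrapper holds a small amount of shared description and a link back, while the full description stays in the native system.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class WrapperEntry:
    """A thin, shared description that points back to a native description.

    All field names are hypothetical and for illustration only.
    """
    title: str
    entity_type: str   # "document", "deed" or "doer"
    native_url: str    # link back to the full description in the native system
    start_year: int
    end_year: int
    subjects: Set[str] = field(default_factory=set)         # big picture headings
    community_terms: Set[str] = field(default_factory=set)  # added by citizen archivists


def filter_entries(entries: List[WrapperEntry],
                   entity_type: Optional[str] = None,
                   subject: Optional[str] = None,
                   year: Optional[int] = None) -> List[WrapperEntry]:
    """Return the entries that match every filter that was supplied."""
    results = []
    for entry in entries:
        if entity_type is not None and entry.entity_type != entity_type:
            continue
        # A subject search also looks at terms contributed by the community.
        if subject is not None and subject not in (entry.subjects | entry.community_terms):
            continue
        if year is not None and not (entry.start_year <= year <= entry.end_year):
            continue
        results.append(entry)
    return results


# Invented sample data from two imaginary archives.
ENTRIES = [
    WrapperEntry("Relief work correspondence files", "document",
                 "https://archive-a.example.org/series/1", 1930, 1936,
                 subjects={"Great Depression"}),
    WrapperEntry("Unemployment Relief Council", "doer",
                 "https://archive-a.example.org/agency/2", 1930, 1935,
                 subjects={"Great Depression"}),
    WrapperEntry("School admission registers", "document",
                 "https://archive-b.example.org/item/3", 1950, 1975,
                 subjects={"Education"}),
]

# A citizen archivist adds a search term without touching the native description.
ENTRIES[2].community_terms.add("Boarding schools")

for hit in filter_entries(ENTRIES, entity_type="document", subject="Great Depression"):
    print(hit.title, "->", hit.native_url)
```

Nothing here requires the contributing archives to share a descriptive standard; the only thing each one supplies is a handful of wrapper fields and a stable link to its own description.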
Next we heard from the first of our three ‘responders’, Richard Lehane. Richard said that he broke down Chris’s proposal into three groups – administrivia (funding, who does what), presentation (what interfaces do we present) and the big one – integration. This, Richard argued, was the main question that Chris was really posing: how to meet the challenges of integrating different systems. Do we integrate by standardisation or do we allow for freedom? Could you allow different descriptions within institutions?
Richard suggested that a wiki is a good conceptual model because it is a system where you can add anything, just navigate your way through, create your own trails, your own webs of interconnected things: free form is a good thing. However the risk is that by following the Wikipedia model too much, you could be constrained by conventions. Wikipedia is on the surface very open but there are a lot of conventions in play. Our methods also have a lot of conventions.
Richard asked: why do we have ‘Description’? Why not just assert facts where we have them and then tell stories about things? Communities could tell stories, archivists could tell stories, the disenfranchised could tell stories.
Richard agreed with Chris regarding greater openness and associativeness but worried about the boxes that archivists have to fill in. How do we break these apart and think more freely about the things we want to describe?
Mike Jones talked about how his work with the eScholarship Research Centre creates federated resources such as Find and Connect, women’s archives and science archives. They have been looking at how to bring this information together, which is presently a very manual process. For that reason, he said, he was very supportive of Chris’s proposal. Mike also responded to Chris on the role of standards, noting that the EGAD group is trying to create a very broad and conceptual model, but whether this will follow through in reality remains to be seen.
Mike thought that a wiki is possibly not the best tool but certainly a good open model for discussion. Whatever is used, it needs to be open for users. In the case of Find and Connect, he said, people are looking for records about themselves across very small and large organisations. Most of these organisations have very poorly described records. People walk through their doors looking for records and often the organisations have to say ‘we don’t know’. What’s needed is a modular approach, Mike suggested: how do you help those with poor description, and how do you share description in ways that don’t share personal information?
Mike likes standards, and uses EAD and EAC all the time to mark up information and make it more useful. But it is important not to require conformance to ‘gold plated’ standards. He said we need tools that create very flexible data sets that can be exported, carried around and used to create different records for the different communities that may need access to them. He also spoke about scale and depth: do we want broad scaffolding that enables lots of organisations to take part, or depth of description?
David Roberts spoke from his experience in dealing with resource allocators, who would, he argued, ask: “Why do we need something else? Can’t this be built into Trove? Trove covers archives and data sets already.” We may, David said, have answers that make sense to us, but will these be understandable and meaningful to others?
Back in the early 2000s, he noted, CAARA tried to establish a model for federated access. The result was a highly unsatisfactory report, with massive costs. Everyone was pushed into the museums community model, CAN. State Records did a lot to shoehorn its complex series system descriptions into this simple museum based model. This was a real life example of resource allocators requiring a specific – and not ideal – approach to try to connect archives, libraries, museums, galleries and broader cultural heritage.
David also spoke about different expectations regarding information access in the private sector. We are used to the idea of federated access providing a public good. But in business, private equals private. David’s current employer, an independent school, would say that no one has the right to access private archives. How does this work with federated access? Any contributions can impact branding, risk, managing, marketing, value etc. This could really limit the extent to which commercial or private organisations would get involved.
During the general discussion, we heard lots of great contributions and ideas from the floor.
Greg Rolan has been looking at interoperability in archival systems. He has a different conception of federation: what about cooperation of systems using web APIs? What about fringe groups etc? Shirky says ontologies are an exercise in mind reading.
Chris Hurley agreed that this was an important point: how ‘controlling’ should this tool be? What about non-centralised control? How can different views be brought in, helped by a lack of control?
Luke Bacon is a web developer and, he suggested, perhaps one of the ‘citizen archivists’. Luke noted some parallels with his work in web design – where they work with both presentation and the back end. Some years ago people were creating records in silos, eg Yahoo and Geocities – so there was a big push for a decentralised social web, similar to the federation discussion. This turned into a standards exercise which fell apart. He has recently been following the indie web movement. The idea, he argued, is to build tools and then discuss them. Innovate by making failure cost nothing. Luke also commented on the idea of control / authority, and compared it to open source branch development methodology – owners of the project have the master branch while others can fork and create their own branches.
At the conclusion of the evening we posed a number of questions that need answering in further discussions:
– Is the first priority breadth or depth?
– Is this envisaged as a centralised hub and spoke model or a decentralised network model?
– What is controlled and what is uncontrolled?
– What is the balance between automation and manual curation? This may vary over time – we may need to start with more manual curation and aim to automate over time, or aim to do more curation as content and resources increase.
– Can this be achieved by using or adapting existing technology/tools or is a new tool required?
– Do we want to try and maintain a single record and distribute access to this/views of this across multiple systems, or will we allow/support the creation of multiple records which may diverge over time?
– If multiple records, how do we manage divergence and version control?
– We want to be inclusive, but there must be some minimum requirements for inclusion – what are the entry requirements?
And perhaps the biggest question: how do we give this legs, a spark? Build a prototype? Form a working group? And who should take the lead? If not an institution like the NAA, or CAARA, then who?
We at the Roundtable hope that this event was the beginning of further discussion of how we can make federated access to archives (and other records) online a reality.
With many thanks to our speakers, our sponsor ATP Innovations, and all the participants on the night.