By Cassie Findlay
In the flow of information all around us – in businesses, governments and personal spaces, in the physical and online worlds – there is information that we want to fix at a point in time and give an identifier that we know we can use to find it again. That is, we want it kept so that it remains not only identifiable by a meaningful name, but also inviolate and trustworthy over time. The information might be born digital (emails, datasets, web pages, tweets, PDF documents), a digitised copy of a physical format (books, paper documents) or still in physical form only. It might be unique or duplicated many times, secret or published and widely disseminated.
Fixing information at a point in time and keeping it as evidence is recordkeeping. Traditionally this is about a person or organisation responding to a need for evidence to be kept (whether for personal, legal, business or other reasons), and keeping the thing (such as an email) in a recordkeeping system. That is, linking the thing to its business context, and using metadata to relate it to other records. Recordkeeping systems can be established for an individual, a business, or, in the case of state / national archives, for a society. An archive is simply another form of recordkeeping system.
Who ‘keeps’ the records?
Traditionally, the organisation or other entity doing the recordkeeping has ideally possessed certain characteristics. Government archives are usually statutory authorities that can reasonably expect continued support from the Parliament and citizenry to keep records and make them available as a feature of a democratic society.
A set of functional requirements for recordkeeping, established in the 1990s by David Bearman at the University of Pittsburgh, has been the basis of many standards and recordkeeping / archives projects since, and remains really useful in the digital recordkeeping world. The first section of these requirements relates to the ‘Conscientious Organization’ – the minimum set of characteristics that a record creating organisation should have if it is to set up a recordkeeping system that will keep records as good evidence.
In the digital preservation world a widely used standard is OAIS – the Open Archival Information System standard – originally developed by the Consultative Committee for Space Data Systems, whose members include NASA, for managing data from missions into space. This requires any organisation seeking to keep long term digital information to have demonstrated and transparent long term funding / legal status / governance arrangements, so that users of the archive can trust that its contents will be there not just next week but in one hundred years.
In all of these instances, the keeping of a digital object as a record remains the responsibility of actors who are well resourced or legislated and operate within legal / juridical / organisation / nation State boundaries. But the line between the public and the private domains is blurring, the cloud is making organisational boundaries for technology increasingly meaningless, and civic groups, NGOs, activists and others increasingly wish to keep their own archives. Surely there is a need for a system or framework that will keep trusted, authentic and well-named digital objects, in all forms, and that will sit above corporate and government entities – a decentralised archive?
The role of inviolability
In archival terms, records can be trusted as good evidence if they are the result of routine and consistent recordkeeping processes, have been kept in recordkeeping systems with controls over alteration and tampering, and have good metadata to show the contexts in which they have been created and kept (their provenance). The challenge of keeping records inviolate – that is, protected from tampering and unauthorised removal or destruction – has been met in a variety of ways over the years, from practices of listing file contents and page numbers to restricted permissions enforced according to a user’s login details. Usually the methods were imperfect, and record tampering and removal could take place where there was a strong enough motivation.
New forms of archives
As more independent archives appear online – particularly those established by publishers, activists and others in adversarial and risky situations where the veracity and trustworthiness of what they keep are paramount – how can they best ensure the inviolability and longevity of the records they hold?
Is it possible to capture reliable and authentic records of online content, documents, sets of documents or data or any other piece of information as records without the anchor of Bearman’s ‘Conscientious Organization’? Without a ‘large, persistent organisation’ (as Mike Jones succinctly described them in his recent post on link rot and the ephemerality of the Web)? Or perhaps with that ‘organisation’ taking a new form?
Is it possible to have a trusted and robust decentralised archive and uncontested space for keeping records for all kinds of digital content that we want to find and rely on in the very long term? Can such an archive operate programmatically and without the need for ownership or control by a particular entity? And can we be assured of the integrity of the archive’s holdings even where it exists in a politically charged and volatile environment?
The blockchain is, to use a physical analogy, like a ledger or registry. It keeps track of entries. Those entries relate mostly to the transfer of money from one address to another, but an entry for almost anything can be recorded in the blockchain. A key distinguishing feature of the blockchain is consensus; the blockchain algorithm enables distributed (global) consensus on who owns what currency.
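To make the ledger analogy concrete, here is a minimal sketch in Python of an append-only register in which each entry stores the hash of the entry before it. It is a toy, not the Bitcoin protocol: there is no network, no mining and no consensus, and the names `entry_hash`, `add_entry` and `verify_chain`, along with the sample entries, are invented for illustration. All it shows is why altering an old entry becomes detectable.

```python
import hashlib
import json


def entry_hash(entry):
    """Return the SHA-256 hex digest of an entry's canonical JSON form."""
    payload = json.dumps(entry, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def add_entry(chain, data):
    """Append a new entry that points back to the hash of the previous one."""
    prev = entry_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev})
    return chain


def verify_chain(chain):
    """Check that every entry still points at the true hash of its predecessor."""
    for i in range(1, len(chain)):
        if chain[i]["prev_hash"] != entry_hash(chain[i - 1]):
            return False
    return True


# Hypothetical sample entries: two payments and one registered document digest.
ledger = []
add_entry(ledger, "A pays B 5 units")
add_entry(ledger, "B pays C 2 units")
add_entry(ledger, "digest of minutes-2015-01.pdf registered")

print(verify_chain(ledger))  # the untouched ledger verifies

ledger[0]["data"] = "A pays B 500 units"  # tamper with an early entry
print(verify_chain(ledger))  # the link stored in the next entry no longer matches
```

In this toy the most recent entry has nothing after it to protect it; an entry only becomes hard to change once later entries have been added on top of it.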
BitCoin is just one application utilising the infrastructure of the blockchain. In recent times developers have started to use blockchain in new ways that sound very familiar to those of us working in recordkeeping – building applications for keeping trusted records in a neutral, decentralised environment.
A decentralised archive utilising the blockchain as a storage mechanism could offer an uncontested space from which records could be accessed. Documents and other sets of data can be validated by the blockchain – even if the application you used to get them there is no longer working. It is decentralised proof which can’t be erased or modified by anyone: competitors, third parties or governments. This is what distinguishes using the blockchain from other forms of data timestamping and authentication. A number of businesses that are making the most of this capability have sprung up: Proof of Existence and Blocksign, for example.
The technology potentially offers a means for society – or at least groups within society – to keep their own records with some assurance about inviolability and longevity that was not possible before. This has huge ramifications in terms of the ability to guard against censorship of information that is damaging to the powerful. As WikiLeaks’ Julian Assange has observed:
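Validation of this kind typically works by registering a cryptographic digest of a document rather than the document itself. The sketch below, in Python, illustrates the idea with an in-memory dictionary standing in for the blockchain. The names `registry`, `digest`, `register` and `check`, and the sample documents, are assumptions made for illustration; this is not the API of any of the services mentioned above.

```python
import hashlib
from datetime import datetime, timezone

# Hypothetical stand-in for the blockchain: maps a digest to its registration time.
registry = {}


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of the document's bytes."""
    return hashlib.sha256(content).hexdigest()


def register(content: bytes) -> str:
    """Record the digest with a timestamp; the first registration wins."""
    d = digest(content)
    registry.setdefault(d, datetime.now(timezone.utc))
    return d


def check(content: bytes):
    """Return the registration time if these exact bytes were registered, else None."""
    return registry.get(digest(content))


original = b"Minutes of the meeting held on 12 January 2015."
register(original)

# The registered bytes are found; a copy with one character changed is not.
print(check(original) is not None)
print(check(b"Minutes of the meeting held on 13 January 2015.") is None)
```

Because only the digest is registered, the document itself can stay private: anyone who later holds the same bytes can confirm that they existed at the registered time, while the register reveals nothing about their content.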
At its core the blockchain provides global proof of publishing at a certain time. That means that once something is in the blockchain it identifies precisely what moment in history it occurred and can’t be undone. This breaks Orwell’s dictum that he who controls the past controls the future and who controls the present controls the past.
Startups like BlockTech have taken this notion and turned it into tools for capturing and preserving evidence of public life using the blockchain. They assert that their ‘Alexandria’ distributed app:
..preserves the integrity of the historical record. It taps into collective, on-the-ground reporting by scraping Twitter as events unfold and prevents after the fact censorship by archiving the information on a blockchain. Alexandria’s visual word cloud and timeline slider illuminate surprising connections. It’s history written by everyone, not just the victors.
The blockchain can also be an authenticating mechanism. Instead of relying on a central authority to certify the authenticity of a document, it can be used to assert the proof of its veracity via distributed cryptographic confirmation. Silicon Valley tech entrepreneur and author Andreas Antonopoulos describes this as “trust by computation”:
Trust does not depend on excluding bad actors, as they cannot ‘fake’ trust. They cannot pretend to be the trusted party, as there is none. They cannot steal the central keys as there are none. They cannot pull the levers of control at the core of the system, as there is no core and no levers of control.
But is it – or will it be – persistent? How does the blockchain promise longevity for the information it encodes?
Much like the approach taken by the LOCKSS project, the answer rests on redundancy. Those who say that the blockchain will be around forever say it doesn’t matter if another technology takes the place of BitCoin. As reddit user 1blockologist said last year:
The 28gb list of current blocks is stored in enough places to be around forever. What is stored in the blockchain is stored amongst thousands of machines (and their backups) and won’t disappear just because a different technology became more popular.
However, digital preservation specialist and LOCKSS expert David Rosenthal is less enthusiastic, focusing on the ups and downs of BitCoin, and saying in relation to blockchain:
Clearly, a technology with this much volatility is a wonderful basis for gambling – shorting Bitcoin would have been a terrific investment over the past year had it been possible. But why would anyone think that it would make a suitable basis for any important social function, such as elections, or long-term information storage?
I am new to the blockchain world but it seems to me that 1blockologist has a very valid and LOCKSS-like response to this criticism.
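The redundancy argument can be illustrated with a small sketch in Python: several holders each keep a copy, and a damaged copy is identified by comparing digests and taking the most common one. This is a simplification made for illustration only. It is not the LOCKSS polling protocol, nor how Bitcoin nodes validate blocks, and `majority_digest`, the node names and the sample copies are hypothetical.

```python
import hashlib
from collections import Counter


def sha256_hex(content: bytes) -> str:
    """Return the SHA-256 hex digest of a copy's bytes."""
    return hashlib.sha256(content).hexdigest()


def majority_digest(copies):
    """Return the digest held by most copies and the holders that disagree with it."""
    digests = {holder: sha256_hex(content) for holder, content in copies.items()}
    winner, _count = Counter(digests.values()).most_common(1)[0]
    dissenters = sorted(h for h, d in digests.items() if d != winner)
    return winner, dissenters


# Hypothetical holders of the same data; one copy has been damaged or altered.
copies = {
    "node-a": b"block data",
    "node-b": b"block data",
    "node-c": b"block dat4",
    "node-d": b"block data",
}

agreed, damaged = majority_digest(copies)
print(damaged)  # the holders whose copy differs from the majority
```

The more independent copies there are, the harder it becomes for loss or tampering at any one holder to change what the majority agrees on.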
ASCII Bernanke, Rickrolls and WikiLeaks
The blockchain is a ledger and a registry but can also be a home for full content. People who have found ways to store things in it have added a surprising variety of material. Ken Shirriff’s blog post ‘Hidden surprises in the Bitcoin blockchain and how they are stored’ listed a few examples, including the original Bitcoin white paper, a 2.5 MB WikiLeaks Cablegate backup, the lyrics to Rick Astley’s Never Gonna Give You Up! and a picture of former Chairman of the Federal Reserve Ben Bernanke executed in ASCII.
Maybe these are just the early markers of a greatly expanded use of the blockchain for keeping evidence and making sure it’s never lost. Working out ways for independent archives and others to make use of the blockchain seems like a very useful thing for us recordkeepers to explore. Hopefully we Roundtablers can find some experts and have a conversation about it soon.
‘Alternative chain’ The BitCoin Wiki https://en.bitcoin.it/wiki/Alternative_chain
Kevin Cruz, ‘Blocksign: Signing Documents on the Blockchain’, BitCoin Magazine, Nov 2014 http://bitcoinmagazine.com/17950/blocksign-signing-documents-on-the-blockchain/
‘Factom Outlines Record-Keeping Network That Utilises Bitcoin’s Blockchain’ http://www.coindesk.com/factom-white-paper-outlines-record-keeping-layer-bitcoin/
Robert Graham ‘BitCoin is a public ledger’ http://blog.erratasec.com/2013/05/bitcoin-is-public-ledger.html#.VLw7eEeUeVM
Nozomi Hayase ‘How Bitcoin’s Blockchain Could Stop History Being Rewritten’ http://www.coindesk.com/block-chain-aid-fight-free-speech/
Nozomi Hayase ‘The Blockchain and the Rise of Networked Trust’ http://www.coindesk.com/blockchain-rise-networked-trust/
Walter Isaacson ‘How Bitcoin Could Save Journalism and the Arts’, Time, Oct 7, 2014 http://time.com/3476313/can-bitcoin-save-journalism/
Ken Shirriff ‘Hidden surprises in the Bitcoin blockchain and how they are stored: Nelson Mandela, Wikileaks, photos, and Python software’ http://www.righto.com/2014/02/ascii-bernanke-wikileaks-photographs.html