Recordkeeping Roundcasts Episode 5: Personal data, privacy and looking ahead

In part 2 of our chat with Ellen Broad, we talk about privacy and changing attitudes to data about – and of – ourselves. Ellen’s book, Made by Humans: The AI Condition, is available from Melbourne University Press, or from all the big online book retailers. You can read Ellen’s bio after the transcript.


CF: But I mean data is, to those companies, whether it’s a specifically focused company that does a very particular thing or whether it’s Google, it’s gold. I’m living in San Francisco now, and we’re experiencing a gold rush bigger than 1849 because data that is collected and amassed and used by the big tech companies here, it’s just the gold of the 21st century. And so the stakes do get very high and it gets much harder for the public authorities to assert that control against these tech giants. Do you see, I mean, and this is also a bit of a side issue I suspect, but the sort of centralization of the few sort of massive companies that seem to be gobbling up all of our data, do you see any chance of a change away from that kind of semi-monopoly that they have?

EB: So I find the language like data is gold and oil and their comparisons to precious metals or finite resources really problematic because it shapes even how those companies treat the data that they have. So they sit on it as having innate value, so you know you accumulate. When I think of gold, I’m like they’re accumulating massive reserves of what I think of as inherently valuable information that can be exchanged for money when actually data is also incredibly fragile and difficult to preserve and very quickly out of date and cannot really just be kept in a database and sat on and still have that same value preserved.

It doesn’t innately grow in value over time. If anything, if you fail to think of it the way archivists think of information, the quicker the value in your data will decline as in, even the way that organizations approach data collection and storage, I don’t know many technology companies that think of preservation of the data that they have or the need to introduce quality controls around data collection itself to ensure a reliable historic record. I don’t know many organizations that actually approach data with that longevity in mind. It’s just, let’s pull it out of the ground to use the gold language. Let’s rip it out and see as much to them.

And then unfortunately if they just focus on, “Let’s get as much as we can and store as much as we can,” they’ll open it up eventually and realize that a lot of the value might have crumbled away. And that’s where I think where we have a reckoning coming, as in we have just kind of gotten into this mindset of we’re in a gold rush. The more you have, the more valuable it is, but we’re not investing in curation, preservation, ongoing access, long-term access…

CF: Yes. And understanding where… What I’ve seen certainly in some of the settings that I’ve had a bit of exposure to here is a lack of kind of being able to look across a number of areas of business and understand where the criticality and the value is and where to focus, if it’s on preserving that historical record, and all data is not created equal. And so what I’m doing in my current work is trying to kind of map out some of those, I guess, understandings of business to overlay on the technologies that are collecting, storing, sharing, destroying data.

And it’s fascinating stuff, but I wasn’t here to talk about me. I have a third question which is going to get to the privacy piece ’cause it is so much a part of this, and anyone who has a heartbeat and goes on to the internet understands that we are every day making choices about what we are willing to give away in return for a little bit of information about us or sometimes a lot of information about us in terms of convenience and you touched on it early when you sort of were talking about how we could be given certain preferences based on what we’ve done in the past and all of that.

So, I feel like though in the last twelve months globally, there’s been this kind of really interesting recognition of that bargain or that exchange. And with the introduction of things like the general data protection regulation in the EU, there’s a new law coming in in California, which has a lot of similarities to that, and at the same time, a lot of high profile cases of companies in particular, not just companies, governments as well, but companies like Facebook, breaching their customers’ trust by sharing data with third parties and in that case, Cambridge Analytica. So I guess the question I have is, where do you see this all heading? I mean, we’ve got the regulation on one hand, we’ve got data breaches seemingly happening at an ever increasing pace. What’s your kind of crystal ball prediction on where this is all going?

EB: Maybe I can talk about where I hope it’s going.

CF: Okay.

EB: Where I’m afraid it might go in part is I think like it’s really hard to predict what the world will look like in a few years because at the moment it seems like we’re almost being driven in several directions and it remains to be seen kind of what approach wins out. So definitely something that I think is happening that is very positive is that we’re seeing a responsibility for safeguarding and making choices about personal data shifting from individuals on to entities collecting that data and that is I think a shift in responsibility that’s reflected in instruments like the general data protection regulation in the European Union, but also a reflection of how our sense of the responsibilities of organizations in this space is changing.

I think in the book, I have a throwaway half page reflection on reading that excellent book, The Immortal Life of Henrietta Lacks, which looked at specifically the story of Henrietta Lacks and her cancer cells, HeLa cells, that have been used widely throughout the world for cancer research, but were taken without her consent and the family’s battle to get control over that tissue back today. But what really struck me just looking back at kind of consent issues around blood and tissue in the 20th century was how much the debate has played out in almost exactly the same way with the same language.

We in privacy are going through exactly the same mindset shift that they went through in the medical sector: it started out being, this is just the way the sector works, if you don’t like what we’re doing, it is up to you to stop it from happening. That’s really interesting if you look back, you know, at the language that Mark Zuckerberg uses and a lot of technology founders. It’s, if you are unhappy with the way that your data is being processed, don’t use our services. And how could this be unethical if this is just the way everybody is? And this is the way everybody uses these services.

And just as we saw with kind of consent issues and autonomy concerns in relation to blood and tissue, I think that’s what we’re starting to move into with personal data, is actually that quite often individuals don’t have the power to shape how information about them is collected, they don’t have any meaningful mechanisms to control it. And that even if you’re not in these spaces, information about you is being processed anyway. So number one, I think that has been a really good and positive shift that we’re seeing responsibility move away from individuals more on to organizations collecting data, and that should have flow-on effects in how we treat data breaches, in how we consider ancillary services, secondary services, provided using data that’s being collected.

So I think that is positive and I hope that that continues and kind of becomes the norm in other jurisdictions. What I do see as well that worries me because it could shape the way that we think about information in the future is the increasing approach to personal data as a commodity that should be bought and sold. So, as well as a sense that if you’re an organization collecting personal information, you need to follow certain practices and be responsible. We’re also seeing a lot of discussion around what is personal data worth. This is data that I own, how can I monetize it? Here are some services that can help you monetize your data. And information is not a commodity like your house or your car. Your providing information to somebody else doesn’t prevent you from having access to it yourself.

It is something more than, in my mind, an asset that can be bought and sold, particularly because we have relied on information about people throughout history to help us make lots of different kinds of decisions. So if we move so far down the path of ownership over personal data like individual ownership and control, I think it’s really hard to divide what is mine and what is yours, in relation to information. Like you and I, who owns this recording? I mean, I know we’re under copyright law, which of us owns it? But as information where I’m revealing information about myself and you’re revealing information about yourself, control over what happens with this is something that’s hard to just separate. It’s hard to just cut in half and give me certain rights and you certain rights. I have mentioned other people in our conversation. It’s just networks are complicated.

And I think we just can’t forget that, that we live in these complex webs of connectedness to other people and if we try and lock down information too much, we might prevent some really powerful uses of information as well.

CF: Yeah, I mean it’s… What you’re saying is really resonating with me. And that, of course, archival institutions have been about capturing data at scale that represents those networks and how they function. And of course, archives are full of personal information that has been collected, and it also makes me think about the interesting debates that have happened in more recent years around the census collection and around people… Now, you have people wanting to walk out on having themselves recorded as part of the census partly because they mistrusted how the agency charged with doing the census would store and handle that data, which is probably not an unreasonable fear, but also I think part of it is that sort of distancing themselves from wanting to be part of a giant data collection effort. But then on the flip side, that giant data collection, if it is there to help build roads and schools, is part of being part of a community, part of a society. I’m not sure if you touched on the census issue. I can’t remember. In the book, did you have a look at that?

EB: I didn’t touch on it. I removed it. I remember writing about it early on, but it was more because I just couldn’t pack any more into the book. But the census is a really interesting example of an instrument, as you say, that’s collected for really socially beneficial purposes, and the desire to link census instruments across kind of four-year periods is not malevolent. They’re not trying to create a surveillance economy in which every citizen can be monitored but at an aggregated level, that kind of societal level could help social researchers, statisticians, policy makers kind of understand trends that are changing in society.

The complexity though is that because we collect so much information digitally now, like I was incredibly sympathetic to the concerns around linkage in the census and digital collection of census data. Because this information is digital, it is inherently easier to share and easier to access. Whereas everything being on paper created a kind of friction; it disincentivized easy linkage and mass distribution of information to lots of different actors because it’s just expensive and difficult.

And I say this, I’d probably be preaching to the converted given that this is an archival audience, but I think about this a lot in the context of a family history research. My family is not a famous family, it’s a very ordinary family background. And then when you find these tiny glimpses of past ancestors in official records held in state local national archives, there is a real thrill.

Because they are the tangible manifestations of people that came before you and are related to you and you wanna have access to that, and that it is digital now means that I can search for migration records for my great grandparents that are in a West Australian archive while I’m in Canberra. So I was seeing this real benefit there, but I also have friends who have notorious families who hate the fact that their family is instantly searchable and that information is much easier to access because it’s been digitized and is accessible. And so, I mean I think that’s just the complexity of it, and I don’t think that anything should be changed, but it just, to me, speaks to how we can’t forget the importance of preserving this kind of identifiable individual information because for every person that it might be a problem to, there is someone else that benefits by its presence, by its existence.

CF: Yeah. Look, yeah, I think there’s a lot in what you’re saying and I think one of our challenges as recordkeeping professionals and archivists is navigating this transitional period and coming to grips with what being born digital or digitized means in terms of access, in terms of availability, in terms of impact on those people living in society because of course time periods are getting much more squished down as well. We’re not waiting for decades until things become available. So Tim Sherratt has written some brilliant stuff about this and about different concepts of access when you’re thinking in terms of digital. Yes, he is. So I’d encourage people to go and check his work out. But look, I probably need to wrap it up. I think that was a really nice observation on which to end actually, and a call to arms a little bit for recordkeeping people, but I guess all that I wanted to say to finish up is do go and read Ellen’s fabulous book, Made By Humans: The AI Condition. And thank you so much, Ellen, for making the time to chat with me today.

EB: Thank you so much, Cassie. I really enjoyed it.

About Ellen Broad

Ellen is Head of Technical Delivery, Consumer Data Standards for CSIRO’s Data61. She returned to Australia from the UK in late 2016, where she was Head of Policy for the Open Data Institute (ODI), an international non-profit founded by Sir Tim Berners-Lee and Sir Nigel Shadbolt. While in the UK Ellen became an expert adviser to senior UK government minister Elizabeth Truss on data. She has also held roles as Manager of Digital Policy and Projects for the International Federation of Library Associations and Institutions (Netherlands) and Executive Officer for the Australian Digital Alliance.

Ellen’s written and spoken about AI, open data and data sharing issues in places like the New Scientist and the Guardian, for ABC Radio National’s ‘Big Ideas’ and ‘Future Tense’ Programmes and at SXSW. Unrelated to her job, Ellen built an open data board game, Datopolis together with ODI CEO Jeni Tennison, which is being played in 19 countries. 


About Cassie Findlay

Digital archivist and recordkeeping professional, co-founder of the Recordkeeping Roundtable. @CassPF on Twitter.
