Recordkeeping Roundcasts Episode 4: AI, accountability and archives

In this latest episode of Recordkeeping Roundcasts, we talk to Ellen Broad, author of Made by Humans: The AI Condition, about the way that rapidly advancing technologies like artificial intelligence and machine learning are being deployed in business, government and society, and the wide-ranging implications of their adoption. Ellen and the Roundtable’s Cassie Findlay discuss real-world results flowing from machine decision making, accountability for the use of these systems, the role of recordkeeping and archives, and changing perceptions of privacy in the data economy.

Made by Humans is available from Melbourne University Press, or from all the big online book retailers. You can read Ellen’s bio after the transcript.

Transcript

CF: I am thrilled to be talking to Ellen Broad today. Ellen is the author of Made By Humans: The AI Condition. There’s a link available from the site too. There’s information about the book and where to buy it. I heartily recommend it. It’s a very timely look at the ethics and the societal implications really of what seems like an ever-increasing gallop towards the use of artificial intelligence and machine learning in so many different aspects of our lives. And of course, for the recordkeeping people like myself and hopefully people who follow this series of recordings, there are lots of things about the adoption of this technology that make us think about what our jobs are and potentially what our jobs might be in the future, from keeping evidence of the way that these systems are set up through to trying to promote accountability around decisions to deploy the technology. So, first of all, hello Ellen, and thank you very much for joining me.

EB: Thank you very much for having me.

CF: So I’ve got three questions. And I think the first one is probably an opportunity for you to help our listeners just understand what it is we’re talking about a little bit because not all of us come at this with lots of background knowledge. And so, the first of my three questions is, in your book, you talk about how we seem to be moving from lower stakes AI systems to higher stakes AI systems. I was just wondering if you could explain a little bit more about what you mean by that and give us some examples of those higher stakes systems.

EB: Sure. So perhaps it’s worth walking it back a little bit to talk about the context in which I refer to lower stakes and higher stakes systems, because the book isn’t about artificial intelligence in general. There are many, many different technologies that sit under that umbrella: there’s virtual reality, robotics, drones. What I’m interested in, in the context of the book, is the increasing use of data to make predictions and decisions about people. And we have started doing this more and more in our day-to-day lives. So the lower stakes systems that I talk about are systems helping us choose what we watch next on Netflix, for example, or the information Google uses to shape what it is that we see in search results.

In some contexts, even those examples are not what you would call lower stakes. There’s a lot of discussion, increasing discussion, around how Google shapes the information that we see in the world, for example. But we’re moving from these systems that essentially are used to make individual predictions to us about content (what we might like to watch, what we might like to buy, if we’ve just purchased this book, we might be interested in this one over here) into starting to use the same techniques to make what I would call higher stakes decisions. These are decisions about the kind of person that we might be. So whether we’re trustworthy, whether we’re a good fit for a job that we’re applying for, whether we should be accepted to university and the types of degrees that we should study.

And the reason I call these higher stakes decisions is they have real world implications. I mean, Netflix recommending me a terrible movie and not understanding my preferences doesn’t shape me in the same way that being mischaracterized for a job does, so the consequences are more significant, and these predictions are just genuinely quite hard to make. Predictions like the jobs that we might be interested in or be suitable for are shaped by a range of factors, and our kind of past preferences and experiences are only one part of that. So that’s kind of where I’m coming from on lower stakes systems and why I think we need to focus on those higher stakes systems more.

CF: I think I know from reading your book that you did travel around the world, and you talked with a lot of people, and you mentioned a lot of very specific examples. Could you mention an example of one of those systems that has been put in place, where it is something as serious as a job or a decision or a university application, that is in place now?

EB: So one of the really key ones that I talk about in the book that has been getting a lot of focus and has really drawn attention to the complexities in these kinds of systems is the COMPAS sentencing algorithm, which is used to predict who is most likely to be recommitting a crime, and that’s an example of a probabilistic system that’s using hundreds of variables to estimate a person’s likelihood of re-offending but in practice was discovered to skew certain ways in its output. So it overpredicted on African American defendants being likely to recommit a crime and underpredicted on white defendants being likely to recommit a crime.

And the reasons for that are complex and difficult to unpick, and solutions to that problem are also complex. And a lot of discussions in machine learning have started to focus on how you quantify those kinds of outputs from a system and how you try to change them or at least articulate why it is that that is the outcome. I talked a little bit about the growing use of predictions in job recruitment, but COMPAS is probably the most scoped out example of a high stakes system in the book.

CF: And, I mean, this does get us sort of fairly rapidly to what is I think for me one of the really key issues in this and so I’m happy to go there quite rapidly, which is around the lack of accountability, the lack of transparency for the algorithms that are making the decisions that have such significant effects. I don’t know who created the COMPAS application, it was a proprietary company?

EB: Previously called Northpointe but I think they’re now called Equivant.

CF: Yeah, and all the sort of complexity that you mentioned that goes into how that algorithm arrives at its decisions and what data is it analyzing, where does it come from, what’s the provenance of it, what biases went into it, and we just don’t know in most cases, I think is fair to say. Do you wanna speak a little bit to that?

EB: Yeah, absolutely, and definitely the one thing that I’d preface talking about the opaque nature of a lot of these systems with is that sometimes our human decision making is even harder to unpick and has similar effects, and all these systems have to learn from those previous kind of human decisions and human patterns. So, in sentencing contexts, for example, judges, parole officers, policy makers have tried to predict who is likely to recommit a crime in a variety of ways for decades, nearly a century, in fact, because it’s a really crucial part of criminal justice and policing: you wanna make sure that you are releasing people quickly and expediently into the community where they serve no further risk while trying to identify and kind of protect the community from genuine risk in your population.

And that’s a very emotional issue. I talked a little bit in the book about how this is a genuinely hard problem to solve, and we’ve been trying to solve it for a long time. And so the Northpointe/Equivant algorithm is really just a kind of a manifestation of the same type of decision making except this time it’s in software. And so while it’s opaque on purpose in certain ways, as you say, it was very hard to uncover exactly what variables Northpointe/Equivant used for their prediction algorithm. It was hard to unpick what the training data was, information about testing that had been undertaken, the range of the kind of accuracy that the system had. That’s all proprietary information that’s been quite difficult to extract over time. Some of it is increasingly being extracted, but it was essentially a black box on purpose. But because it is software, we can actually at least quantify and identify the sources or the reasons for certain kinds of bias and decision making that in humans we have a feel for but can’t quite systematically unpick.

So COMPAS, what was really interesting about the COMPAS algorithm was not only that, yes, it’s an example of a proprietary algorithm that is being designed to have real world impact, it makes decisions or it helps, I should say, helps judges and parole officers make decisions, so they’re relying on it to help them make other decisions, and it’s difficult to scrutinize. But what I found also particularly interesting about the Northpointe example is that follow-up research has looked at the complexity of the algorithms. So they promote it as having these hundreds of variables to make its predictions, and some follow-up research that I cite in the book has indicated that they can achieve exactly or close to the same levels of accuracy using just five or six variables about individuals, that adding all of these extra data points doesn’t necessarily improve the quality of your system. And I think that’s a crucial kind of point to keep coming back to, particularly as we talk about big data: these more complex algorithms, these algorithms that use lots and lots of inputs, may not be any better than a survey with five questions.

CF: That’s so interesting. And, yeah, I think that when we look at your earlier point around that humans have been making these decisions with all of the information that they have at their disposal and with all of the sort of potential for error that comes with it, it brings to mind the way that we kind of scrutinize and try to validate that decision making in the old world, which is through things like FOI, examining the evidence behind decision making. But with the increasing adoption of, you know, the increasing partnerships with the private sector in government, some of those opportunities to scrutinize the data, even if back in the day it was a sort of analog data, are now being cut off. Have you sort of seen any evidence of better practices in government about trying to force accountability in partners where they’re performing a public service?

EB: So specifically in relation to predictive systems and kind of AI in Australia, not so much, but in the US, there are some really interesting developments emerging around regulating automated services provided by private companies in partnership with government. I’ve been watching New York City’s algorithmic task force with interest. They are looking at how they might hold the providers of automated services for New York City accountable and what that might include. What I think we can look at is, like, we use a variety of mechanisms and structures, rules, laws, principles, expectations to shape the way that private sector companies partner with government to deliver services in other areas. If you are providing medical tools, for example, or medical services on behalf of a government provider, you have to comply with a range of rules that shape how you deliver that service.

We have rules around how employees acting as government public servants utter, act, and operate. It’s never been a kind of free for all. It’s never been unfettered. There’s never been absolute freedom for private sector partners to government. And I found the discussion of IP in relation to automated systems kind of a red herring. We have never… The origin of intellectual property law has never been absolute. It has always been a balance between information that needs to be accessible in the public interest and information that should be protected and we manage that balance in every sector using both explicit exceptions on the intellectual property laws and rules that are just supposed to shape responsible conduct in that sector.

Pharmaceutical companies, for example, can’t keep absolutely secret everything they do in relation to the pharmaceuticals that they offer to the public because they have a responsibility to protect the public from harm, and part of that is exposing information about their products, undergoing testing, being audited, having to demonstrate accountability. So I find the emphasis on IP in relation to technology a total red herring because IP has never been intended to prevent information being accessible where it benefits the public and helps us make sure that people are protected from harm and able to make informed choices about what’s going on around them.

CF: Yeah. And I mean, I know from my own background working in government recordkeeping that the sort of negotiations and arrangements that go into a government agency outsourcing a public function of some sort would touch on recordkeeping. But even as recently as sort of five years ago when I was still working in government, it was a very sort of old-fashioned type of, oh, at the end of the contract, you will give us all your records sort of thing, rather than building in mechanisms for sharing of real-time information or at least expectations for a greater transparency where there were programs that were directly affecting government by proxy decision making. So I suppose what you’re saying is that there need to be some more levers, some more mechanisms, that start to become normal in the way that government strikes these agreements?

EB: Yeah, I would classify our procurement in relation to software, which is this is all software. When I talk about artificial intelligence and predictive systems, they’re all still software products being developed, and I think we’re still pretty immature in how we procure software and that’s partly because there’s a relatively lower understanding of kind of how software works and what you need to know and understand if you’re procuring software in order to make sure that you can understand how it works and build on it and get someone else to fix it. I used to do a lot of work in the UK around procurement of data. And quite often one of the real challenges that we would run into is kind of in a similar way when you say a lot of the requirements in procurement in relation to information is give us your records at the end and similarly in the kind of datasets, there might just be a clause saying you should, you must give over all of the data that you collect at the end, but they wouldn’t specify.

Sometimes you would see language like you must give over and anonymize the dataset at the end of the project, but actually what is really useful for an organization to have are the methodologies that went into collecting data. So if you wanna repeat the process, it’s easily repeatable so that you can understand how data was collected. You want to know where it was collected from, who it was collected from in case you need to go back and recreate it. You don’t just want the end result typically. And even getting the end result may not even give you the necessary rights you need to be able to do whatever you want with that anyway. So it’s just I think we’re still figuring out exactly what it is that we need in procurement.

CF: Yeah, and you make the point in the book I think around understanding the mechanisms and the methodologies behind the data collection from the point of view of fairness and the point of view of the kind of silences in the data that go back to the way archives have had silences around particular groups in society, particularly marginalized groups, and so sort of having some way to have a greater involvement as the public body that has hopefully society’s interest at heart in sort of engaging with those collection methods upfront is a much more appropriate way to go, I would have thought.

EB: I remember, I actually think you had John Sheridan on your podcast before. I haven’t really judged it from…

CF: Yes.

EB: Yes, but he and I always have fascinating, mindblowing conversations about data in archives. I remember one conversation we had that was both about preservation of data and decision making but also the impact that that kind of privacy preserving technology and changing relationships to privacy have on kind of future preservation. But on government records, I remember we were talking about how the consequence of having outsourced lots of data related functions is that it can take preservation of that information out of scope for the recordkeeping authority. And that it’s kind of like, you know, when you are trying to do some research on a project and you can find a report that references some research and that’s what you have, but you have no access to the actual research underpinning it, you can’t look at the data for yourself. And that’s increasingly what I find we’re bumping up against: we can get the outputs, we can archive reports, we can archive snapshots.

But the preservation of data that’s collected by private companies, but essentially acting as public services, has been really difficult. It feels [inaudible] about privacy in a really interesting way, but that’s kind of a separate…

CF: Oh don’t worry. I will have a quick question about privacy before we finish.

In the next episode we do indeed talk about data and privacy, and Ellen offers some thoughts on where this is all heading.

About Ellen Broad

Ellen is Head of Technical Delivery, Consumer Data Standards for CSIRO’s Data61. She returned to Australia from the UK in late 2016, where she was Head of Policy for the Open Data Institute (ODI), an international non-profit founded by Sir Tim Berners-Lee and Sir Nigel Shadbolt. While in the UK Ellen became an expert adviser to senior UK government minister Elizabeth Truss on data. She has also held roles as Manager of Digital Policy and Projects for the International Federation of Library Associations and Institutions (Netherlands) and Executive Officer for the Australian Digital Alliance.

Ellen has written and spoken about AI, open data and data sharing issues in places like the New Scientist and the Guardian, for ABC Radio National’s ‘Big Ideas’ and ‘Future Tense’ programmes and at SXSW. Unrelated to her job, Ellen built an open data board game, Datopolis, together with ODI CEO Jeni Tennison, which is being played in 19 countries.

CC BY-SA
