Regulating Inscrutable Systems
(in progress) (with Solon Barocas)
From scholars seeking to “unlock the black box” to regulations requiring “meaningful information about the logic” of automated decisions, recent discussions of machine learning—and algorithms more generally—have turned toward a call for explanation. Champions of explanation charge that algorithms must reveal their basis for decision-making and account for their determinations. But so far, these calls lack a rigorous examination of what it means in practice for a machine learning system to explain itself, or how explanation might or might not vindicate the normative goals that its champions support.
This Article undertakes this task by reorienting the discourse on the call for explanation itself. The Article identifies inscrutability, rather than secrecy, opacity, or validity, as the key property of machine learning systems that sets them apart from previous technologies. Indeed, machine learning may produce models that differ so radically from the way humans make decisions that they resist sense-making. The Article argues that explanations of decision systems can usefully be divided into a tripartite framework focused on explanations of the outcome, the logic, and the system design, and that this framework corresponds to the instincts of both lawmakers and computer scientists in trying to achieve explainable systems. Any regulation-by-explanation approach must demand that the technology or its designers be able to produce explanations at each of the three layers.
Having established a framework for the explanation of inscrutable systems, the Article then undertakes an examination of regulation-by-explanation in current law, using two examples as case studies: the adverse action notices of the Fair Credit Reporting Act and the Equal Credit Opportunity Act, which require a “statement of specific reasons” for adverse actions, and the automated processing provisions of Articles 13-15 of the EU General Data Protection Regulation, which require data processors to provide “meaningful information about the logic” of automated decisions. The Article considers how well these legal structures work in providing useful explanations, and how research on interpretability in computer science can aid in compliance.
The Article concludes with an examination of the costs and benefits of explanation. Ultimately, it argues that the picture is quite complicated. While machine interpretability may make compliance with existing legal regimes easier, or possible in the first instance, a focus on explanation alone fails to fulfill the overarching normative purpose of the law, even when compliance can be achieved. The Article ends with a call to consider where such goals would be better served by other means, including mechanisms to directly assess whether models are fair and just.
Disparate Impact in Big Data Policing
49 Ga. L. Rev. __ (forthcoming 2017)
Police departments large and small have begun to use data mining techniques to predict the where, when, and who of crime before it occurs. But data mining systems can have a disproportionately adverse impact on minority communities, and predictive policing is no different. While scholars have begun to examine traditional Fourth Amendment concerns, none have yet considered the disparate impact that these systems will produce.
Reviewing the technical process of predictive policing, the Article begins by illustrating that use of this technology will often result in disparate impact on marginalized communities. After reviewing the possibilities for Fourth Amendment regulation and finding them wanting, the Article turns toward a regulatory proposal.
The Article proposes the use of a rulemaking procedure centered on “discrimination impact assessments.” Modeled on the environmental impact statements of the National Environmental Policy Act, such a policy would require police departments to publicly consider mitigation procedures and to evaluate the potential discriminatory effects of competing alternative algorithms. This regulatory response balances the need for police expertise in the adoption of new crime control technologies with transparency and public input, backed up by the possibility of litigation. Such a public process will also serve to increase trust between police departments and the communities they serve.
A Mild Defense of Our New Machine Overlords
We must make policy based on realistic ideas about how machines work. In Plausible Cause, Kiel Brennan-Marquez argues first that “probable cause” is about explanation rather than probability, and second that machines cannot provide the explanations necessary to justify warrants under the Fourth Amendment. While his argument about probable cause has merit, his discussion of machines relies on a hypothetical device that obscures several flaws in the reasoning. As this response essay explains, machines and humans have different strengths, and both are capable of some form of explanation. Going forward, we must examine realistically not only where machines might fail, but also where they can improve upon the failures of a system built with human limitations in mind.
Big Data’s Disparate Impact
104 Calif. L. Rev. 671 (2016) (with Solon Barocas)
Advocates of algorithmic techniques like data mining argue that these techniques eliminate human biases from the decision-making process. But an algorithm is only as good as the data it works with. Data is frequently imperfect in ways that allow these algorithms to inherit the prejudices of prior decision makers. In other cases, data may simply reflect the widespread biases that persist in society at large. In still others, data mining can discover surprisingly useful regularities that are really just preexisting patterns of exclusion and inequality. Unthinking reliance on data mining can deny historically disadvantaged and vulnerable groups full participation in society. Worse still, because the resulting discrimination is almost always an unintentional emergent property of the algorithm’s use rather than a conscious choice by its programmers, it can be unusually hard to identify the source of the problem or to explain it to a court.
This Essay examines these concerns through the lens of American antidiscrimination law — more particularly, through Title VII’s prohibition of discrimination in employment. In the absence of a demonstrable intent to discriminate, the best doctrinal hope for data mining’s victims would seem to lie in disparate impact doctrine. Case law and the Equal Employment Opportunity Commission’s Uniform Guidelines, though, hold that a practice can be justified as a business necessity when its outcomes are predictive of future employment outcomes, and data mining is specifically designed to find such statistical correlations. Unless there is a reasonably practical way to demonstrate that these discoveries are spurious, Title VII would appear to bless its use, even though the correlations it discovers will often reflect historic patterns of prejudice, others’ discrimination against members of protected groups, or flaws in the underlying data.
Addressing the sources of this unintentional discrimination and remedying the corresponding deficiencies in the law will be difficult technically, difficult legally, and difficult politically. There are a number of practical limits to what can be accomplished computationally. For example, when discrimination occurs because the data being mined is itself a result of past intentional discrimination, there is frequently no obvious method to adjust historical data to rid it of this taint. Corrective measures that alter the results of the data mining after it is complete would tread on legally and politically disputed terrain. These challenges for reform throw into stark relief the tension between the two major theories underlying antidiscrimination law: anticlassification and antisubordination. Finding a solution to big data’s disparate impact will require more than best efforts to stamp out prejudice and bias; it will require a wholesale reexamination of the meanings of “discrimination” and “fairness.”
Contextual Expectations of Privacy
Fourth Amendment search jurisprudence is nominally based on a “reasonable expectation of privacy,” but actual doctrine is disconnected from society’s conception of privacy. Courts rely on various binary distinctions: Is a piece of information secret or not? Was the observed conduct inside or outside? While often convenient, none of these binary distinctions can adequately capture the complicated range of ideas encompassed by “privacy.” Privacy theorists have begun to understand that a consideration of social context is essential to a full understanding of privacy. One such theory is Helen Nissenbaum’s contextual integrity, which characterizes a right to privacy as the preservation of expected information flows within a given social context. Grounded as it is in context-based normative expectations, the theory describes privacy violations as unexpected information flows within a context, and does a good job of explaining how people actually experience privacy.
This Article reexamines the meaning of the Fourth Amendment’s “reasonable expectation of privacy” using the theory of contextual integrity. Consider United States v. Miller, in which the police gained access to banking records without a warrant. The theory of contextual integrity shows that Miller was wrongly decided because diverting information meant purely for banking purposes to the police altered an information flow in a normatively inferior way. Courts also often demonstrate contextual thinking below the surface, but get confused because the binaries prevalent in the doctrine hide important distinctions. For example, application of the binary third party doctrine in cases subsequent to Miller obscures important differences between banking and other settings. In two recent cases, United States v. Jones and Florida v. Jardines, the Supreme Court has seemed willing to consider new approaches to search, but it has lacked a framework in which to discuss complicated privacy issues that defy binary description. In advocating a context-based search doctrine, this Article provides such a framework, while realigning a “reasonable expectation of privacy” with its meaning in society.
The Journalism Ratings Board: An Incentive-Based Approach to Cable News Accountability
The American establishment media is in crisis. With news organizations primarily driven by profit, sensationalism and partisanship shape news coverage at the expense of information necessary for effective self-government. Focused on cable news in particular, this Note proposes a Journalism Ratings Board to periodically rate news programs based on principles of good journalism. The Board will publish periodic reports and display the news programs’ ratings during the programs themselves, similar to parental guidelines for entertainment programs. In a political and legal climate hostile to command-and-control regulation, such an incentive-based approach will help cable news fulfill the democratic function of the press.