Publications and Research

Fairness and Abstraction in Sociotechnical Systems

ACM Conference on Fairness, Accountability, and Transparency (FAT*) (forthcoming 2019) (with danah boyd, Sorelle Friedler, Suresh Venkatasubramanian, Janet Vertesi)

A key goal of the FAT* community is to develop machine-learning based systems that, once introduced into a social context, can achieve social and legal outcomes such as fairness, justice, and due process. Bedrock concepts in computer science—such as abstraction and modular design—are used to define notions of fairness and discrimination, to produce fairness-aware learning algorithms, and to intervene at different stages of a decision-making pipeline to produce “fair” outcomes. In this paper, however, we contend that these concepts render technical interventions ineffective, inaccurate, and sometimes dangerously misguided when they enter the societal context that surrounds decision-making systems. We outline this mismatch with five “traps” that fair-ML work can fall into even as it attempts to be more context-aware than traditional data science. We draw on studies of sociotechnical systems in Science and Technology Studies to explain why such traps occur and how to avoid them. Finally, we suggest ways in which technical designers can mitigate the traps through a refocusing of design in terms of process rather than solutions, and by drawing abstraction boundaries to include social actors rather than purely technical ones.

The Intuitive Appeal of Explainable Machines

87 Fordham L. Rev. 1085 (2018) (with Solon Barocas)

Algorithmic decision-making has become synonymous with inexplicable decision-making, but what makes algorithms so difficult to explain? This Article examines what sets machine learning apart from other ways of developing rules for decision-making and the problem these properties pose for explanation. We show that machine learning models can be both inscrutable and nonintuitive and that these are related, but distinct, properties.

Calls for explanation have treated these problems as one and the same, but disentangling the two reveals that they demand very different responses. Dealing with inscrutability requires providing a sensible description of the rules; addressing nonintuitiveness requires providing a satisfying explanation for why the rules are what they are. Existing laws like the Fair Credit Reporting Act (FCRA), the Equal Credit Opportunity Act (ECOA), and the General Data Protection Regulation (GDPR), as well as techniques within machine learning, are focused almost entirely on the problem of inscrutability. While such techniques could allow a machine learning system to comply with existing law, doing so may not help if the goal is to assess whether the basis for decision-making is normatively defensible.

In most cases, intuition serves as the unacknowledged bridge between a descriptive account and a normative evaluation. But because machine learning is often valued for its ability to uncover statistical relationships that defy intuition, relying on intuition is not a satisfying approach. This Article thus argues for other mechanisms for normative evaluation. To know why the rules are what they are, one must seek explanations of the process behind a model’s development, not just explanations of the model itself.

Disparate Impact in Big Data Policing

52 Ga. L. Rev. 109 (2017)

Data-driven decision systems are taking over. No institution in society seems immune from the enthusiasm that automated decision-making generates, including—and perhaps especially—the police. Police departments are increasingly deploying data mining techniques to predict, prevent, and investigate crime. But all data mining systems have the potential for adverse impacts on vulnerable communities, and predictive policing is no different. Determining individuals’ threat levels by reference to commercial and social data can improperly link dark skin to higher threat levels or to greater suspicion of having committed a particular crime. Crime mapping based on historical data can lead to more arrests for nuisance crimes in neighborhoods primarily populated by people of color. These effects are an artifact of the technology itself, and will likely occur even assuming good faith on the part of the police departments using it. Meanwhile, predictive policing is sold in part as a “neutral” method to counteract unconscious biases when it is not simply sold to cash-strapped departments as a more cost-efficient way to do policing.

The degree to which predictive policing systems have these discriminatory results is unclear to the public and to the police themselves, largely because there is no incentive in place for a department focused solely on “crime control” to spend resources asking the question. This is a problem for which existing law does not provide a solution. Finding that neither the typical constitutional modes of police regulation nor a hypothetical anti-discrimination law would provide a solution, this Article turns toward a new regulatory proposal centered on “algorithmic impact statements.”

Modeled on the environmental impact statements of the National Environmental Policy Act, algorithmic impact statements would require police departments to evaluate the efficacy and potential discriminatory effects of all available choices for predictive policing technologies. The regulation would also allow the public to weigh in through a notice-and-comment process. Such a regulation would fill the knowledge gap that makes future policy discussions about the costs and benefits of predictive policing all but impossible. Being primarily procedural, it would not necessarily curtail a department determined to discriminate, but by forcing departments to consider the question and allowing society to understand the scope of the problem, it is a first step towards solving the problem and determining whether further intervention is required.

Meaningful Information and the Right to Explanation

7 Int’l Data Privacy L. 233 (2017) (with Julia Powles)

There is no single, neat statutory provision labeled the “right to explanation” in Europe’s new General Data Protection Regulation (GDPR). But nor is such a right illusory.

Responding to two prominent papers that, in turn, conjure and critique the right to explanation in the context of automated decision-making, we advocate a return to the text of the GDPR.

Articles 13-15 provide rights to “meaningful information about the logic involved” in automated decisions. This is a right to explanation, whether one uses the phrase or not.

The right to explanation should be interpreted functionally and flexibly, and should, at a minimum, enable a data subject to exercise his or her rights under the GDPR and human rights law.

A Mild Defense of Our New Machine Overlords

70 Vand. L. Rev. En Banc 87 (2017)

We must make policy based on realistic ideas about how machines work. In Plausible Cause, Kiel Brennan-Marquez argues first that “probable cause” is about explanation rather than probability, and second that machines cannot provide the explanations necessary to justify warrants under the Fourth Amendment. While his argument about probable cause has merit, his discussion of machines relies on a hypothetical device that obscures several flaws in the reasoning. As this response essay explains, machines and humans have different strengths, and both are capable of some form of explanation. Going forward, we must examine realistically not only where machines might fail, but also where they can improve upon the failures of a system built with human limitations in mind.

Big Data’s Disparate Impact

104 Calif. L. Rev. 671 (2016) (with Solon Barocas)

Advocates of algorithmic techniques like data mining argue that these techniques eliminate human biases from the decision-making process. But an algorithm is only as good as the data it works with. Data is frequently imperfect in ways that allow these algorithms to inherit the prejudices of prior decision makers. In other cases, data may simply reflect the widespread biases that persist in society at large. In still others, data mining can discover surprisingly useful regularities that are really just preexisting patterns of exclusion and inequality. Unthinking reliance on data mining can deny historically disadvantaged and vulnerable groups full participation in society. Worse still, because the resulting discrimination is almost always an unintentional emergent property of the algorithm’s use rather than a conscious choice by its programmers, it can be unusually hard to identify the source of the problem or to explain it to a court.

This Essay examines these concerns through the lens of American antidiscrimination law — more particularly, through Title VII’s prohibition of discrimination in employment. In the absence of a demonstrable intent to discriminate, the best doctrinal hope for data mining’s victims would seem to lie in disparate impact doctrine. Case law and the Equal Employment Opportunity Commission’s Uniform Guidelines, though, hold that a practice can be justified as a business necessity when its outcomes are predictive of future employment outcomes, and data mining is specifically designed to find such statistical correlations. Unless there is a reasonably practical way to demonstrate that these discoveries are spurious, Title VII would appear to bless its use, even though the correlations it discovers will often reflect historic patterns of prejudice, others’ discrimination against members of protected groups, or flaws in the underlying data.

Addressing the sources of this unintentional discrimination and remedying the corresponding deficiencies in the law will be difficult technically, difficult legally, and difficult politically. There are a number of practical limits to what can be accomplished computationally. For example, when discrimination occurs because the data being mined is itself a result of past intentional discrimination, there is frequently no obvious method to adjust historical data to rid it of this taint. Corrective measures that alter the results of the data mining after it is complete would tread on legally and politically disputed terrain. These challenges for reform throw into stark relief the tension between the two major theories underlying antidiscrimination law: anticlassification and antisubordination. Finding a solution to big data’s disparate impact will require more than best efforts to stamp out prejudice and bias; it will require a wholesale reexamination of the meanings of “discrimination” and “fairness.”

Contextual Expectations of Privacy

35 Cardozo L. Rev. 643 (2013)

Fourth Amendment search jurisprudence is nominally based on a “reasonable expectation of privacy,” but actual doctrine is disconnected from society’s conception of privacy. Courts rely on various binary distinctions: Is a piece of information secret or not? Was the observed conduct inside or outside? While often convenient, none of these binary distinctions can adequately capture the complicated range of ideas encompassed by “privacy.” Privacy theorists have begun to understand that a consideration of social context is essential to a full understanding of privacy. Helen Nissenbaum’s theory of contextual integrity, which characterizes a right to privacy as the preservation of expected information flows within a given social context, is one such theory. Grounded, as it is, in context-based normative expectations, the theory describes privacy violations as unexpected information flows within a context, and does a good job of explaining how people actually experience privacy.

This Article reexamines the meaning of the Fourth Amendment’s “reasonable expectation of privacy” using the theory of contextual integrity. Consider United States v. Miller, in which the police gained access to banking records without a warrant. The theory of contextual integrity shows that Miller was wrongly decided because diverting information meant purely for banking purposes to the police altered an information flow in a normatively inferior way. Courts also often demonstrate contextual thinking below the surface, but get confused because the binaries prevalent in the doctrine hide important distinctions. For example, application of the binary third party doctrine in cases subsequent to Miller obscures important differences between banking and other settings. In two recent cases, United States v. Jones and Florida v. Jardines, the Supreme Court has seemed willing to consider new approaches to search, but it has lacked a framework in which to discuss complicated privacy issues that defy binary description. In advocating a context-based search doctrine, this Article provides such a framework, while realigning a “reasonable expectation of privacy” with its meaning in society.

The Journalism Ratings Board: An Incentive-Based Approach to Cable News Accountability

44 U. Mich. J.L. Reform 467 (2011)

The American establishment media is in crisis. With newsmakers primarily driven by profit, sensationalism and partisanship shape news coverage at the expense of information necessary for effective self-government. Focused on cable news in particular, this Note proposes a Journalism Ratings Board to periodically rate news programs based on principles of good journalism. The Board would publish periodic reports and display the news programs’ ratings during the programs themselves, similar to parental guidelines for entertainment programs. In a political and legal climate hostile to command-and-control regulation, such an incentive-based approach would help cable news fulfill the democratic function of the press.