Researchers suggest a new framework to regulate the “fairness” of analytics processes. What could this mean for your organization?

There are certain insights online retailers are able to derive about their customers using data and analytics. Now, brick-and-mortar stores are able to get similar insights — whether someone who walked by their store came in, how long they stayed, where they went inside the store — even by what path. Predictive analytics can be applied to determine what resources a store might need, optimal layouts, or what shoppers might be likely to purchase.

One such analytics provider, Euclid, has about 100 customers, including Nordstrom and Home Depot, and has already tracked about 50 million devices (your smartphone and mine) in 4,000 locations, according to The New York Times.

Euclid, which has a prominent “Privacy” tab on its homepage, supplies retailers with reports culled from aggregated, anonymized data. While it does collect data from cell phones that could identify an individual, it doesn’t actually use that data to pinpoint individuals (yet). To drive this point home — and to reassure, one would assume, jittery retailers worried about creeping out customers with Minority Report-like predictive technology — Euclid has worked with the Future of Privacy Forum to develop the Mobile Location Analytics Code of Conduct, a self-regulatory framework for consumer notification.

But, say researchers, these types of frameworks — not unlike the White House’s Consumer Privacy Bill of Rights — do not go far enough to protect individuals from the potential harms of predictive algorithms in an era of big data, particularly as their use expands beyond retail into law enforcement, health care, insurance, finance, and human resources.

In a recent research paper, Kate Crawford, principal researcher at Microsoft Research, visiting professor at the MIT Center for Civic Media, and senior fellow at the NYU Information Law Institute, and Jason Schultz, associate professor of clinical law at NYU School of Law, propose a new method to address predictive privacy harms: procedural data due process, which would determine, legally, the fairness of an algorithm.

In their paper “Big Data and Due Process: Toward a Framework to Redress Predictive Privacy Harms,” the authors outline their framework concept:

Procedural data due process would, rather than attempt regulation of personal data collection, use, or disclosure ex ante, regulate the “fairness” of the analytical processes of big data with regard to how they use personal data (or metadata derived from or associated with personal data) in any “adjudicative” process — a process whereby big data is being used to determine attributes or categories for an individual.

The authors use the example of a health insurance provider that uses big data to determine the likelihood that a customer has a certain disease and denies coverage on that basis. The customer would have a data due process right with regard to that determination.

All well and good, but what does this mean for organizations that use predictive analytics with big data? A couple of things, should data due process pass policy and legislative muster. The authors outline three principles for implementation:

Notice: This principle requires that those who use big data to make categorical or attributive determinations about others must post some form of notice disclosing not only the type of predictions they are attempting, but also the general sources of data that they draw upon, including a mechanism that allows those whose personal data is included to learn of that fact.

Also, when a set of predictions has been queried, notice would be sent out to inform those affected by the issues predicted. For example, if a company were to license search query data from Google and Bing to predict which job applicants would be best suited for a particular position, it would have to disclose that to all applicants who apply.

Opportunity for a Hearing: Once notice is available, the question then becomes how one might challenge the fairness of the predictive process employed. The answer: a hearing and an ability to correct the record. This would include examining both the data input and the algorithmic logic applied. In contexts where security and proprietary concerns arise, the role could be given to a neutral, trusted data arbiter.

Impartial Adjudicator and Judicial Review (or separation of church and state): A neutral data arbiter could field complaints and then investigate sufficient allegations of bias or financial interest that might render a predictive outcome unfair. The arbiter could examine the relationship between those who designed the analytics and those who run the individual processes to make sure that their roles are appropriate and distinct.

Importantly for businesses, this would require some form of audit trail that recorded the basis of predictive decisions, both in terms of data used and the algorithm employed.
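The paper does not prescribe a technical design for such an audit trail, but as a rough illustration, a record of a predictive decision might capture the data sources drawn upon, the model version, a digest of the exact inputs, and the determination made. The sketch below is a minimal, hypothetical example (the field names and scenario are assumptions, not from the authors):

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class DecisionRecord:
    """One audit-trail entry recording the basis of a predictive decision."""
    subject_id: str        # pseudonymous identifier for the affected person
    model_version: str     # which algorithm and version produced the decision
    data_sources: list     # general sources of data drawn upon
    inputs_digest: str     # hash of the exact inputs, so a reviewer can verify them later
    outcome: str           # the categorical determination that was made
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def record_decision(subject_id, model_version, data_sources, inputs, outcome):
    """Create a reviewable record of a predictive decision.

    Hashing a canonical serialization of the inputs lets an arbiter
    confirm, at a hearing, exactly what data the decision rested on.
    """
    digest = hashlib.sha256(
        json.dumps(inputs, sort_keys=True).encode("utf-8")
    ).hexdigest()
    return DecisionRecord(subject_id, model_version, data_sources, digest, outcome)


# Hypothetical usage: an insurer's risk model denies coverage.
rec = record_decision(
    subject_id="applicant-4411",
    model_version="risk-model-2.3",
    data_sources=["claims history", "pharmacy records"],
    inputs={"age_band": "50-59", "rx_count": 14},
    outcome="coverage-denied",
)
```

A record like this addresses both halves of the authors' requirement: the data used (via the sources and input digest) and the algorithm employed (via the model version), without itself storing raw personal data.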

The bottom line, according to the authors, is this: “Unless one decides that privacy regulations must govern all data ever collected, processed, or disclosed, deciding where and when to draw the lines around these activities becomes extremely difficult with respect to big data information practices.” Due process, they argue, strikes the best regulatory balance between privacy and analytics.

2 Comments On: Why Predictive Analytics Needs Due Process

  • Marie Wallace | December 18, 2013

Renee, I couldn’t agree more… A model for privacy and ethics is absolutely critical if people analytics is ever to have a sustainable future. I work more on the enterprise analytics side, where a lack of ethics or respect for privacy and personal autonomy is just not an option — especially when you work in Europe, where there are really strong (and totally appropriate) legal frameworks and work practices designed to protect employees’ rights.

A few years ago I wrote an article bemoaning the lack of attention privacy and ethics were getting (http://allthingsanalytics.com/2012/02/15/why-is-privacy-the-software-industrys-sopa/). It was after attending a social media analytics session in the Bay Area where I was totally shocked by what data scientists were proposing to do with customers’ data, which is probably reflected in the slightly heated tone :-)

  • Marie Wallace | December 18, 2013

And… it’s not just about predictive analytics… social network analytics can expose some hugely invasive insights about individuals.
