Algorithmic decision-making has been in the news of late. From Ofqual’s downgrading of students’ A-level results[1] to the complaint lodged by None of Your Business (noyb) against the credit rating agency CRIF for failing (amongst other things) to be transparent about the reasons why a particular applicant had been given a negative rating[2], we have been reminded of the potential backlash that can result from decisions which are perceived as incorrect or unfair, made by algorithms whose workings are largely unknown to the individuals they affect. This presents challenges for organisations, which are increasingly adopting Artificial Intelligence-based solutions to make more efficient decisions. It is timely, then, that the UK Information Commissioner’s Office (ICO) has recently issued its “Guidance on AI and data protection”[3] to assist organisations in navigating the complex trade-offs that use of an AI system may require.

Background

If you’ve been following the ICO’s work on AI, much of what is in the guidance will not come as a surprise. Over the past two years, the ICO has published blog posts and issued public consultations on the topic, and the guidance largely crystallises views which it had previously expressed. Having said that, the guidance does provide more detail, particularly from a technical perspective, as to how these views might be implemented in practice.

It is important to note that the guidance only sets out the data protection considerations which apply to the development and implementation of AI solutions within an organisation. It is not meant to provide generic ethical or design principles for the use of AI. In that vein, the rules that apply are simply those enshrined in the EU General Data Protection Regulation (GDPR) and the UK Data Protection Act 2018. The guidance explains how the seven core data processing principles come under strain when applied in the AI context.

Key Takeaways

This section explains what we consider to be the key takeaways from the ICO’s guidance which may require you to modify the way in which you would normally assess risk when developing an AI model, or when purchasing one from a third-party provider.

  1. DPIA, DPIA, DPIA. A Data Protection Impact Assessment (DPIA) is necessary when an organisation is about to undertake processing activities which are likely to result in a high risk to individuals’ rights and freedoms. However, the ICO has expressly stated that it expects the “vast majority” of AI projects to involve processing that would be considered “high risk”, and that it expects to see a DPIA where such projects are being considered. Even if you assess that a particular use of AI does not involve “high risk” processing, the guidance states that you will need to document how you came to that conclusion.

DPIAs should always be carried out with input from stakeholders across different parts of the business. But, with AI projects, the ICO has stated that it may even be worth maintaining two versions of the DPIA: one containing a technical description for specialist audiences, and another containing more high-level explanations of the processing and the logic behind it. This is because it can be difficult to explain some of the more complex models which affect the fairness of the data processing within the body of one “regular” DPIA. Other AI-specific areas which the ICO expects to see covered in a DPIA include: the degree of any human involvement in the decision-making processes; any risk of bias or inaccuracy in the algorithms being used; and the measures that will be put in place to prevent such bias or inaccuracy.

  2. “Explainability” – the new “transparency”. Transparency is one of the trickiest considerations when it comes to processing personal data in an AI system. In fact, it is so tricky that the ICO previously published an entirely separate piece of guidance on this issue (“Explaining decisions made with AI”[4]). The ICO’s latest guidance confirms the principles set out in the Explainability guidance, and reiterates that striking the appropriate balance between explainability and statistical accuracy, security, and commercial secrecy will require organisations to make trade-offs. A process should be in place to weigh and consider these trade-offs. This may include: considering any technical approaches to minimise the need for any trade-offs; having clear lines of accountability over final trade-off decisions; and taking steps to explain any trade-offs to individuals.

On the last point, the Explainability guidance makes clear that you are not expected to disclose every detail (in particular, any commercially sensitive details) about the workings of your algorithm. An appropriate explanation is always context-specific. What sort of setting will my AI model be deployed in? A safety-critical setting requires further safety and performance explanations, as compared to lower-stakes domains such as e-commerce. What kind of impact will the decision have on the individual? Decisions affecting someone’s liberty or legal status require detailed explanations around any safeguards which have been put in place to ensure a fair result. What is the nature of the data being used? How urgent is the decision being taken? Who is the individual concerned and how sophisticated is he/she?

Thinking through the context is therefore your first task. The next is to decide what goes into the actual explanation. This may include information as to the rationale behind a decision; who was responsible for making that decision (e.g. the persons involved in the development and management of an AI system); the data used in taking that decision; and the steps taken to ensure the system’s safety and performance (e.g. design features which maximise its accuracy, reliability, security and robustness). If this seems like a lot to consider, the Explainability guidance includes a number of examples of explanations which the ICO would consider adequate. It also provides worked examples of the process you would be expected to go through in order to settle on that explanation.
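By way of illustration only, the short Python sketch below shows how the rationale behind a decision might be surfaced in plain terms for a hypothetical linear credit-scoring model. The feature names, weights and threshold are entirely made up, and this is not a method prescribed by the ICO; it simply shows the kind of per-feature breakdown that could feed into a context-appropriate explanation.

```python
# Illustrative sketch only: a hypothetical linear credit-scoring model with
# made-up feature names and weights, used to show how the rationale behind a
# decision might be surfaced in plain language. Not an ICO-prescribed method.
import math

WEIGHTS = {  # hypothetical model coefficients
    "months_since_last_missed_payment": 0.08,
    "credit_utilisation_ratio": -2.5,
    "years_at_current_address": 0.05,
}
BIAS = -0.2
THRESHOLD = 0.5  # hypothetical approval threshold

def score(applicant: dict) -> float:
    """Return the model's prediction score (a probability, not a fact)."""
    z = BIAS + sum(WEIGHTS[f] * applicant[f] for f in WEIGHTS)
    return 1 / (1 + math.exp(-z))

def explain(applicant: dict) -> list[str]:
    """List each feature's contribution to the decision, largest first."""
    contributions = {f: WEIGHTS[f] * applicant[f] for f in WEIGHTS}
    ranked = sorted(contributions.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return [f"{feature}: {'raised' if c > 0 else 'lowered'} the score by {abs(c):.2f}"
            for feature, c in ranked]

applicant = {
    "months_since_last_missed_payment": 24,
    "credit_utilisation_ratio": 0.9,
    "years_at_current_address": 3,
}
s = score(applicant)
print(f"Prediction score: {s:.2f} ({'approve' if s >= THRESHOLD else 'refer for human review'})")
for line in explain(applicant):
    print(" -", line)
```

In a deployed system, an explanation along these lines would still need to be translated into language appropriate to the audience and the stakes of the decision, as the Explainability guidance emphasises.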

  3. SARs – searching for needles in an AI haystack. Where an AI system makes decisions about individuals, SARs (subject access requests, or requests to exercise rights under the GDPR) are going to be part and parcel. The trouble is that as personal data flows through a complex AI system, it is pre-processed, transformed and processed in ways that make it challenging for organisations to implement effective mechanisms for individuals to exercise their rights. It comes as something of a relief, then, that the guidance is sympathetic to these challenges. The ICO endorses reliance on exemptions to fulfilling SARs where this is permissible under the GDPR. For example, if a request is manifestly unfounded or excessive, you may be able to charge a fee or refuse to act on the request. If you are not able to identify an individual in the training data, directly or indirectly (and you are able to demonstrate this), the individual rights under Articles 15 to 20 will not apply. Separately, if it would be impossible or involve a disproportionate effort to identify or communicate with the relevant individuals in a training dataset (e.g. where it has been stripped of personal identifiers and contact addresses), you might be able to claim an exemption from providing the fair processing information required under Articles 13 or 14 directly to the individual.

None of this is to say that you can throw your hands up and decline to fulfil SARs because compliance is “too difficult”. You would still need to comply if, for example, the individual making the request provides additional information that would then enable you to identify his/her data within the AI model. The threshold for claiming that a request is “manifestly unfounded or excessive” remains high, and cannot be met simply by pointing to the fact that the request relates to an AI model, or that the individual’s motives for requesting the information may be unclear. Also, if it is not possible to contact each individual with the relevant Article 13/14 information, then efforts should be made to provide public information explaining where you obtained the data used to train your AI system, and how individuals may object to their inclusion in the dataset.

If you are required to fulfil a SAR, your systems should have been developed from the outset in a manner that enables you to isolate data and comply with access requests. Ways to achieve this include keeping careful logs of all processes applied to personal data and recording where data is stored and moved. You should separate data into the different phases in which it will be used, in particular the training phase (when past examples are used to develop algorithms) versus the inference phase (when the algorithm is used to make a prediction about new instances). This will make it easier to respond to a SAR that relates only, for example, to the inference phase. Good data minimisation practices will also make your job easier, as they will help reduce the amount of personal data you would need to provide in the first place. Therefore, you should (amongst other measures) eliminate features in any training dataset which are not relevant to your purpose (for example, not all financial and demographic features will be necessary to predict credit risk).
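As an illustration (not drawn from the guidance itself), the sketch below uses hypothetical column names to show two of these ideas in miniature: stripping out features that are not relevant to the stated purpose, and keeping a simple per-phase log keyed to a pseudonymous ID so that a SAR scoped to, say, the inference phase can be answered from the log.

```python
# Illustrative sketch only: hypothetical column names and a toy in-memory log.
# Shows (a) dropping features not needed for the stated purpose and
# (b) recording which phase (training vs inference) each record was used in,
# so a SAR limited to one phase can be answered from the log.
from datetime import datetime, timezone

import pandas as pd

applicants = pd.DataFrame([
    {"pseudo_id": "a1", "income": 42_000, "credit_utilisation": 0.30,
     "postcode": "AB1 2CD", "marital_status": "single"},
    {"pseudo_id": "a2", "income": 55_000, "credit_utilisation": 0.75,
     "postcode": "EF3 4GH", "marital_status": "married"},
])

# Data minimisation: keep only the features relevant to predicting credit risk.
RELEVANT = ["pseudo_id", "income", "credit_utilisation"]
training_data = applicants[RELEVANT]

# Simple processing log, keyed by pseudonymous ID and processing phase.
processing_log: list[dict] = []

def log_use(pseudo_id: str, phase: str, operation: str) -> None:
    processing_log.append({
        "pseudo_id": pseudo_id,
        "phase": phase,  # "training" or "inference"
        "operation": operation,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })

for pid in training_data["pseudo_id"]:
    log_use(pid, "training", "included in model training set")

def records_for(pseudo_id: str, phase: str) -> list[dict]:
    """Answer a SAR scoped to a single phase from the log."""
    return [e for e in processing_log if e["pseudo_id"] == pseudo_id and e["phase"] == phase]

print(records_for("a1", "inference"))  # [] - a1's data was only used in training
```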

One last point on SARs: where the output of an AI model could be perceived as inaccurate or unfair, and particularly where that output has negative effects on the individual concerned, it is likely that requests for rectification of the output will be made. The ICO’s view is that predictions are not inaccurate if they are intended as prediction scores rather than statements of fact about the relevant individual. If the underlying personal data used to make the decision is not inaccurate, then the right to rectification does not apply to the AI model’s output (although the human review process, where required, may achieve a similar outcome for the data subject). It is therefore important that, in documenting your processes, you make sure such data is clearly labelled as inferences and predictions, and is not presented as fact about a particular individual.
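A minimal sketch of what that labelling might look like in practice is set out below; the field names and the model identifier are our own assumptions, not taken from the guidance.

```python
# Illustrative sketch only: field names and values are hypothetical.
# The point is to store the output labelled as an inference (a prediction
# score with provenance), not as a factual attribute of the individual.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass(frozen=True)
class InferredScore:
    pseudo_id: str
    score: float                           # prediction score, not a statement of fact
    data_category: str = "inference"       # labels the record as an inference
    model_version: str = "credit-risk-v1"  # hypothetical identifier
    generated_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

record = InferredScore(pseudo_id="a1", score=0.37)
print(record)
```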

  4. Computer (or human) says no – the need for meaningful human review. You need to be clear about where the AI system will fit into your decision-making process. Will it, for example, be used to make decisions about applicants automatically, or will it be used as a decision-support tool by a human decision-maker in their deliberations? The distinction is important because solely automated decisions are caught by Article 22 of the GDPR, with its restricted legal bases and requirement that additional safeguards be put in place. These include the requirement that individuals who are the subject of solely automated decisions are given the opportunity to obtain human intervention, to express their point of view, to contest the decision made about them, and to obtain an explanation of the logic behind the decision. This is not to say that a similar appeals process should not be offered to individuals who are the subject of non-solely automated decisions. There are sound commercial reasons for doing so, including avoiding complaints of the scale we have seen lately regarding the UK exam results. But the process described is mandatory, and more prescriptive, where the decision is solely automated.

How, then, do you take a decision out of the scope of Article 22? By building meaningful human input into the decision-making process. Having a human at the end of the line “rubber-stamping” the output of an AI system will not do. Human reviewers must be active in their involvement, and be willing to go against the recommendation of the system. The guidance also reiterates the potential issues of “automation bias” and “interpretability” which we covered in a previous blog post[5]. To mitigate these risks, you should look to: (a) design appropriate training for human reviewers on how they should interpret any decision recommended by an AI system; (b) make sure human reviewers are given the right incentives and support to escalate a decision by the AI system that appears problematic or, if necessary, to override it; and (c) design the front-end interface of the AI system in a manner that takes account of the thought processes and behaviours of the eventual human reviewers and allows them to intervene effectively.
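The sketch below is one possible (and deliberately simplified) shape for such a review step, in which the reviewer must record reasons and can override or escalate the AI recommendation; the thresholds, statuses and field names are assumptions made purely for illustration.

```python
# Illustrative sketch only: thresholds, statuses and field names are assumptions.
# The aim is a review step in which the human must record a reasoned decision
# and can override or escalate, rather than rubber-stamping the AI output.
from dataclasses import dataclass

ESCALATION_BAND = (0.4, 0.6)  # hypothetical "borderline" range forcing escalation

@dataclass
class ReviewOutcome:
    ai_recommendation: str
    final_decision: str
    reviewer: str
    reasons: str
    escalated: bool

def review(score: float, reviewer: str, agrees: bool, reasons: str) -> ReviewOutcome:
    ai_recommendation = "approve" if score >= 0.5 else "decline"
    escalate = ESCALATION_BAND[0] <= score <= ESCALATION_BAND[1]
    if not reasons:
        raise ValueError("Reviewer must record reasons - no rubber-stamping.")
    final = ai_recommendation if agrees else (
        "decline" if ai_recommendation == "approve" else "approve")
    return ReviewOutcome(ai_recommendation, final, reviewer, reasons, escalated=escalate)

print(review(0.55, "j.smith", agrees=False,
             reasons="Recent missed payments were already explained and resolved."))
```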

  5. Balancing the scales – reducing the risk of bias. The ICO recognises that the increasing deployment of AI systems may result in decisions which have discriminatory effects on people based on their gender, race, age, health, religion, disability, sexual orientation or other characteristics. It proposes that the first step in mitigating the risks of bias and discrimination is to understand their cause. One possible culprit, for example, is imbalanced training data. An AI model may observe that more men than women have historically been hired at a company and therefore give greater priority to male candidates than to female candidates. The proposed solution in that case would be to balance out the training dataset by adding or removing data about under- or overrepresented subsets of the population (e.g. adding more data about female candidates or removing data about men).
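For illustration, the sketch below rebalances a toy hiring dataset with a hypothetical “gender” column by upsampling the underrepresented group; the data and column names are invented and the approach is just one of several possible.

```python
# Illustrative sketch only: a toy hiring dataset with a hypothetical "gender"
# column, upsampled so both groups are equally represented before training.
import pandas as pd

candidates = pd.DataFrame({
    "gender": ["M"] * 8 + ["F"] * 2,  # imbalanced historical data
    "years_experience": [3, 5, 2, 7, 4, 6, 1, 8, 5, 4],
    "hired": [1, 1, 0, 1, 0, 1, 0, 1, 1, 0],
})

group_sizes = candidates["gender"].value_counts()
target = group_sizes.max()

balanced = pd.concat(
    [group.sample(n=target, replace=True, random_state=0)  # upsample smaller groups
     for _, group in candidates.groupby("gender")],
    ignore_index=True,
)
print(balanced["gender"].value_counts())  # M and F now equally represented
```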

Another possible culprit is training data that reflects past discrimination, such as where human reviewers have unfairly preferred candidates of certain ethnicities over others in the past. When this data is used to train an AI model, it is likely to reproduce these past patterns of discrimination, as candidates of one ethnicity may appear to have been more “successful” in the application process. The proposed solutions here are to modify the training dataset, to change the AI model’s learning process, or to modify the model after it has been trained.
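One possible way of “changing the learning process”, shown purely as an illustration below, is to weight training examples inversely to the frequency of their group so that an underrepresented group is not drowned out during training; the features and the weighting scheme are our own assumptions, not taken from the guidance.

```python
# Illustrative sketch only: inverse-frequency sample weights as one possible
# way of adjusting the learning process, using hypothetical features.
import numpy as np
from sklearn.linear_model import LogisticRegression

X = np.array([[3], [5], [2], [7], [4], [6], [1], [8], [5], [4]])  # years_experience
y = np.array([1, 1, 0, 1, 0, 1, 0, 1, 1, 0])                      # hired
group = np.array(["M"] * 8 + ["F"] * 2)                           # protected attribute

# Weight each example by the inverse of its group's frequency.
counts = {g: (group == g).sum() for g in np.unique(group)}
weights = np.array([len(group) / counts[g] for g in group])

model = LogisticRegression().fit(X, y, sample_weight=weights)
print(model.predict_proba([[4]]))
```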

Most importantly, you should document your approach to bias and discrimination mitigation from the very beginning of any AI model’s lifecycle, so that you can put in place the appropriate safeguards during the design phase. You should also document your processes for ensuring robust testing, discrimination monitoring, escalation and variance investigation procedures, as well as clearly set variance limits above which the AI system should stop being used.
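As a simple illustration of what a variance limit might look like in code, the sketch below monitors the gap in selection rates between groups against a pre-set limit; the choice of metric and the 0.1 limit are assumptions made for the example, not figures drawn from the guidance.

```python
# Illustrative sketch only: the metric (selection-rate difference between
# groups) and the 0.1 limit are assumptions chosen purely for the example.
VARIANCE_LIMIT = 0.1  # hypothetical pre-set limit agreed at design time

def selection_rates(decisions: list[tuple[str, bool]]) -> dict[str, float]:
    """decisions: (group, was_approved) pairs collected from the live system."""
    totals: dict[str, list[int]] = {}
    for group, approved in decisions:
        totals.setdefault(group, [0, 0])
        totals[group][0] += int(approved)
        totals[group][1] += 1
    return {g: approved / n for g, (approved, n) in totals.items()}

def within_limit(decisions: list[tuple[str, bool]]) -> bool:
    rates = selection_rates(decisions)
    spread = max(rates.values()) - min(rates.values())
    print(f"Selection rates: {rates}; spread: {spread:.2f}")
    return spread <= VARIANCE_LIMIT

decisions = [("M", True), ("M", True), ("M", False),
             ("F", True), ("F", False), ("F", False)]
if not within_limit(decisions):
    # In practice this would trigger escalation and suspension of the system.
    print("Variance limit exceeded - suspend the AI system and investigate.")
```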

Other areas of note

The guidance also covers other areas such as:

  1. determination of controller-processor relationships in the AI context (some examples are provided, but we expect to see more prescriptive classifications, including deemed controllerships, when the ICO updates its Cloud Computing Guidance in 2021);
  2. selection of appropriate legal bases when developing an AI model (noting the importance of defining the legal basis used during each specific phase (e.g. training / deployment), and the issues associated with relying on consent, which is withdrawable);
  3. security risks specific to AI models (including loss or misuse of the large amounts of personal data often required to train AI systems, and software vulnerabilities which may arise from the introduction of new AI-related code and infrastructure).

What lies ahead?

It is worth noting that, whilst fairly extensive, the guidance is only one part of a broader framework which the ICO is rolling out as it begins to audit the use of personal data in AI systems. The framework includes:

  • internal auditing tools and procedures that the ICO will use in audits and investigations (it is not clear if these will be published);
  • this guidance on AI and data protection; and
  • a toolkit designed to provide further practical support to organisations auditing the compliance of their own AI systems (to be released at a later date).

The ICO has not set out its enforcement strategy with respect to AI, but with the publication of the guidance, it is clearly prepared to take action in an appropriate case. We would imagine that enforcement is most likely where a complaint is raised in relation to the outcome of a high-profile or newsworthy AI-assisted decision, similar to the UK exam results controversy. Now would be the appropriate time, therefore, to review the parts of your business which rely on, or are planning to rely on, AI solutions, and to begin putting in place the documentation and processes needed to respond in the event a query is received.

[1] Financial Times, “Were this year’s A-level results fair?”: https://www.ft.com/content/425a4112-a14c-4b88-8e0c-e049b4b6e099

[2] Data Guidance, “Austria: NOYB issues complaint against CRIF for violating right to information, data correctness, and transparency”: https://www.dataguidance.com/news/austria-noyb-issues-complaint-against-crif-violating-right-information-data-correctness-and

[3] https://ico.org.uk/for-organisations/guide-to-data-protection/key-data-protection-themes/guidance-on-ai-and-data-protection/

[4] https://ico.org.uk/for-organisations/guide-to-data-protection/key-data-protection-themes/explaining-decisions-made-with-artificial-intelligence/

[5] https://www.dataprotectionreport.com/2019/04/ico-blog-post-on-ai-and-solely-automated-decision-making/