Clear-box systems: sustainable use of artificial intelligence for banks in view of GDPR
Financial institutions all over the world are interested in improving the efficiency of a wide range of activities: rating methods, profiling current and potential clients, assessing the quality of credit claims, planning marketing campaigns and personalising banking services. The best results in these areas are not obtained by processing personal data with patterns identified by humans, but by using decision-making and prediction processes carried out by what is conventionally termed “Artificial Intelligence” (AI).
At first glance, traditional Machine Learning techniques seem to be the way to go: they are able to produce predictive models in the form of mathematical functions that can make accurate decisions on credits, loans and the other issues mentioned above, by processing customer information through the function to produce a positive or a negative outcome.
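To make this concrete, here is a minimal sketch of such a model, with invented feature names and weights purely for illustration; a real credit model (a gradient-boosted ensemble or a neural network) is vastly more complex, but the interface is the same, customer data in, yes/no decision out:

```python
import math

def blackbox_score(income, debt_ratio, years_as_client):
    """Stand-in for a learned scoring function (weights are invented)."""
    # An opaque non-linear combination: the individual terms carry no
    # business meaning a human could explain to the customer.
    z = (0.4 * math.log1p(income)
         - 6.0 * debt_ratio ** 2
         + 0.5 * math.tanh(years_as_client / 4.0))
    return 1.0 / (1.0 + math.exp(-z))  # squash to a probability-like score

def decide(income, debt_ratio, years_as_client, threshold=0.95):
    """Approve the credit request only if the score clears the threshold."""
    score = blackbox_score(income, debt_ratio, years_as_client)
    return "approve" if score >= threshold else "reject"

print(decide(45_000, 0.35, 6))  # -> approve
print(decide(8_000, 0.85, 0))   # -> reject
```

The function answers *what* to decide with high accuracy, but nothing in it answers *why*, which is exactly the black-box problem discussed next.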
The problem? These models are called “black boxes” for good reason: the complex mathematics of the model cannot be used to explain why a given predictive decision was made. This is a serious problem in an era in which the European Commission is defining ethical guidelines for trustworthy AI, and the most important data privacy regulation in 20 years, the General Data Protection Regulation (GDPR), came into force in May 2018.
GDPR says: the logic used by Artificial Intelligence must be understandable by humans
The legal sustainability problem for the technological opportunities at hand derives from the fact that the European regulation governs a critical and essential aspect of data processing: the logic used by Artificial Intelligence must be understandable by humans. These points are addressed in Articles 22, 13, 14 and 15 of the GDPR.
Article 22 requires that any processing that determines a decision concerning the data subject must not be performed in an exclusively automated manner, but must be accompanied by human intervention.
To prevent this clause from being applied in a purely evasive way, the “human intervention” must be carried out with a full understanding of the processing logic and of the possibility of modifying it. So-called “black box” artificial intelligence, whose logic cannot be fully understood, therefore does not qualify for lawful processing under Article 22.
So how can banks meet the need to leverage innovative technological solutions to stay competitive, while remaining legally sustainable in light of European Regulation no. 679 of 2016 (GDPR) on personal data? And how can we, as consultants, help banks make the right choice?
“Clear-box” systems: XAI introduces new methods and technologies
The answer lies in a new wave of AI techniques now emerging, broadly called Explainable AI (XAI). XAI introduces new methods and technologies which aim to produce self-explanatory predictions rather than opaque mathematical functions. Explainable AI can be just as accurate as black-box algorithms, but in addition to making accurate predictions, it also produces what could be called “cognitive output”: knowledge of why each prediction is made the way it is.
Therefore – and in line with the regulation – the decision can be taken automatically, but with human knowledge of the logic applied. In other words, in AI terminology, decisions have to be taken through systems that allow them to be understood, and thus not through “black box” systems.
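What does a self-explanatory prediction look like? A minimal sketch, with rules, feature names and thresholds invented for illustration (a real rule set would be learned from data): the decision comes back together with the human-readable rule that produced it.

```python
# A "clear box" decision: an ordered list of human-readable rules.
# Each rule pairs a condition with the decision it triggers and the
# plain-language reason that can be shown to the customer.
RULES = [
    (lambda c: c["debt_ratio"] > 0.60, "reject",
     "debt ratio above 60% of income"),
    (lambda c: c["missed_payments"] >= 3, "reject",
     "three or more missed payments in the last year"),
    (lambda c: True, "approve",
     "no risk rule matched"),
]

def decide(customer):
    """Return the decision together with the rule that produced it."""
    for condition, decision, reason in RULES:
        if condition(customer):
            return decision, reason

decision, reason = decide({"debt_ratio": 0.72, "missed_payments": 1})
print(decision, "-", reason)  # -> reject - debt ratio above 60% of income
```

The “cognitive output” is the `reason` string: the same object that makes the decision also explains it, with no separate interpretation step.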
GDPR policies do not refer to the results of the decision-making process
The GDPR provisions (Article 13) that oblige the controller to inform the data subject of any meaningful information about the logic involved do not refer to the results of the decision-making process, nor to the mere disclosure of parts of the source code. They refer to the variables and the weights used by the system to reach a decision. This requirement evidently presumes that:
- the logic is fully understandable, as only with a full knowledge of the logic is it then possible for the controller, or third-party observer, to evaluate whether the effectively communicated information is significant or not;
- it is possible to communicate this information in a way the average person can understand, since the provision exists to ensure that all data subjects are fully informed, not just technical experts in the sector.
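For a model whose logic is fully understandable, such a disclosure can be generated directly from the model itself. A sketch, with invented variable names and weights, of what “meaningful information about the variables and weights” might look like in practice:

```python
# Hypothetical weights of a transparent linear scoring model.
WEIGHTS = {
    "annual_income": +0.4,
    "debt_ratio": -6.0,
    "years_as_client": +0.5,
}

def disclosure_report():
    """Plain-language summary of the variables and weights used,
    ordered by how strongly each one influences the score."""
    lines = []
    for variable, weight in sorted(WEIGHTS.items(), key=lambda kv: -abs(kv[1])):
        direction = "raises" if weight > 0 else "lowers"
        lines.append(f"- {variable} (weight {weight:+.1f}) {direction} the score")
    return "\n".join(lines)

print(disclosure_report())
```

Note that this only works because the model is transparent to begin with: for a black box there are no stable, nameable weights to report, which is precisely the compliance gap the article describes.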
There are essentially two ways to solve the critical issues:
Either by using data processing systems that are based on patterns identified by humans and are consequently fully understandable. Systems of this type, however, are not competitive when the data are large in volume and highly complex.
Or by using so-called “clear box” systems (platforms based on “logic learning machines” are a suitable example). These systems do not simply offer an efficient and legally sustainable technological option compared with “non-clear” alternatives; under certain conditions they can also act as technological and legal facilitators for the sustainable use of “opaque” decision-making software the controller has already acquired.
Usually, a credit request made by a client to a bank is evaluated by analysing many pieces of information and using risk indicators, such as a specific attributed rating class. Most rating systems are based on highly complex statistical algorithms, which makes them very difficult to explain to the client. These algorithms are therefore called “opaque”.
Together with a large Italian banking group, we used “clear-box” algorithms (the LLM, Logic Learning Machine, of RULEX) to explain how a rating is attributed in a credit risk environment. For six hundred thousand positions, we obtained the data needed to calculate the rating and the rating class assigned to each position, but not the calculation algorithm itself. The objectives were:
- to generate a “clear-box” model containing a set of comprehensible rules able to explain the assignment of “rating classes” in a clear way
- to provide the client with comprehensible answers to questions regarding their risk classification
Therefore, we devised a kind of “reverse engineering” of the existing statistical algorithm, thereby opening up the “opaque” algorithm.
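The general workflow can be sketched as follows. This is emphatically not the Logic Learning Machine algorithm of RULEX; it is a deliberately simple one-rule surrogate learner, with an invented feature and a toy stand-in for the opaque rating engine, shown only to illustrate the idea of querying the opaque system as an oracle and fitting transparent rules that reproduce its outputs:

```python
def opaque_rating(position):
    """Stand-in for the bank's statistical rating engine: we can query
    it, but by assumption we cannot inspect its internal logic."""
    return "high_risk" if position["debt_ratio"] > 0.5 else "low_risk"

def fit_surrogate(positions, labels, feature):
    """Find the single-feature threshold rule that best mimics the
    opaque model's labels (a OneR-style surrogate)."""
    best = None
    for t in sorted({p[feature] for p in positions}):
        preds = ["high_risk" if p[feature] > t else "low_risk" for p in positions]
        agreement = sum(p == l for p, l in zip(preds, labels)) / len(labels)
        if best is None or agreement > best[1]:
            best = (t, agreement)
    return best  # (threshold, fraction of decisions reproduced)

# 1) Collect historical positions; 2) query the opaque oracle;
# 3) induce a transparent rule that reproduces its decisions.
positions = [{"debt_ratio": d / 10} for d in range(10)]
labels = [opaque_rating(p) for p in positions]
threshold, agreement = fit_surrogate(positions, labels, "debt_ratio")
print(f"IF debt_ratio > {threshold} THEN high_risk ({agreement:.0%} agreement)")
# -> IF debt_ratio > 0.5 THEN high_risk (100% agreement)
```

The rule set is validated by its agreement with the original algorithm on historical positions; where agreement is high enough, the rules can stand in as the explainable face of the opaque system.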
In this way, when a customer asks a bank operator a question such as “why are you refusing to extend the credit you granted me last year?”, the answer would be very complicated with an “opaque” algorithm, whereas “clear-box” logic makes it much simpler and GDPR-compliant, because we are able to explain the decision through a set of understandable rules.