Ensuring Data Protection with Artificial Intelligence

Published on Wednesday, 17 November 2021

Data protection regulations are a cornerstone of our highly digitised society. With the vast amount of customer data being gathered by companies – likely more than ever before – data protection policies have become necessary to protect our fundamental right to privacy. It’s for this reason that in 2016, the European Union formalised all data protection requirements under the General Data Protection Regulation (GDPR), which quickly became the gold standard. Every organisation, no matter where it’s located, handling the personal data of European citizens must comply with GDPR. Any instance of non-compliance can be severely punished, with fines in the order of millions of euros.

The Challenge

Due to their nature and sensitivity, data protection policies need to be adapted – to a very high level of detail – in each individual case. However, being written in natural language, they often reflect some of its problems, such as ambiguity, incompleteness, and inconsistency. Checking policy compliance is a complex, time-consuming, and labour-intensive task, in which even the tiniest mistake can have far-reaching consequences. In fact, companies worldwide have been struggling with GDPR compliance, and there is an important need for cost-effective methods that can help them manage it more efficiently.

Artificial intelligence can assist users with checking the compliance of data protection policies to the highest level of precision, saving time and money. This is the goal of the partnership between SnT and global law firm Linklaters. “Our firm entered this partnership because of SnT’s experience in AI in relation to natural language processing. We wanted to explore the possibilities to build concrete client-focused products, which would accelerate answers to clients, and at the same time be more cost-effective for them,” said Katrien Baetens, managing associate of the Dispute Resolution practice at Linklaters.

Over the last few years, several AI solutions for this purpose have been developed. But most of the existing tools are based on what could essentially be called keyword search: they scan text for specific keywords that are often used to express certain GDPR requirements in legal documents, rather than trying to interpret the meaning of the text. Unlike AI, which provides the means to capture different formulations of the same content, keyword lists are finite and will always miss some formulations.
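To illustrate this limitation, here is a minimal sketch of a keyword-based check. The keyword list and the two example sentences are made up for illustration and are not taken from any existing tool.

```python
import re

# Hypothetical keyword list for a single requirement (illustrative only).
KEYWORDS = ["data protection officer", "dpo"]


def keyword_match(sentence, keywords=KEYWORDS):
    """Return True if any keyword occurs as a whole phrase in the sentence."""
    text = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(keyword) + r"\b", text) is not None
        for keyword in keywords
    )


# One sentence uses the expected wording, the other paraphrases it.
explicit = "You can contact our Data Protection Officer at any time."
paraphrase = (
    "Questions about privacy can be sent to the person "
    "responsible for safeguarding your information."
)

print(keyword_match(explicit))    # the exact phrase is present
print(keyword_match(paraphrase))  # the paraphrase contains no listed keyword
```

The second sentence plausibly addresses the same requirement, yet the finite keyword list does not cover its wording, so the check misses it.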

Artificial intelligence at the service of GDPR compliance

With the support of the Luxembourg National Research Fund (FNR), researchers from the Software Verification and Validation (SVV) research group at SnT have been working with Linklaters on the project entitled “Artificial Intelligence-enabled Automation for GDPR Compliance” (ARTAGO), an industry-driven project to develop AI-based solutions for checking the completeness of data protection policies according to the provisions of the GDPR. In fact, completeness checking is an essential prerequisite for compliance checking – but when executed manually, it is both a time-consuming and error-prone process.

The team, formed by Prof. Mehrdad Sabetzadeh, Prof. Lionel Briand and Dr. Sallam Abualhaija, has recently unveiled the first result of their project: CompAI, a tool for AI-assisted completeness checking of privacy policies according to the GDPR. CompAI is based on research work that has recently been accepted for publication in the IEEE Transactions on Software Engineering (TSE) journal, and it relies on a combination of advanced natural language processing and supervised machine learning. The tool parses a given privacy policy, then analyses and categorises its content into different information types. Then, based on the user’s answers to a set of questions, CompAI verifies the policy’s content against 23 criteria related to GDPR requirements, which list what the privacy policy must explicitly cover. The tool, which supports the analysis of documents in .doc, .docx and PDF format, produces a report listing the criteria that were satisfied, those that were violated, and the sections that need to be corrected, with a precision of 92.9%.
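The checking step described above can be sketched roughly as follows. This is not CompAI's implementation: the criterion names, information types and the `transfers_outside_eu` question are hypothetical placeholders, and the sketch assumes the policy text has already been categorised into information types.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Criterion:
    name: str                         # human-readable label for the report
    required_type: str                # information type the policy must contain
    applies_if: Optional[str] = None  # user answer that makes the criterion relevant


# Hypothetical criteria; the real tool checks 23, which are not listed here.
CRITERIA = [
    Criterion("controller identity", "CONTROLLER_IDENTITY"),
    Criterion("data subject rights", "DATA_SUBJECT_RIGHTS"),
    Criterion("transfer outside EU", "TRANSFER_OUTSIDE_EU",
              applies_if="transfers_outside_eu"),
]


def check_completeness(found_types: Set[str],
                       answers: Dict[str, bool],
                       criteria: List[Criterion] = CRITERIA) -> Dict[str, List[str]]:
    """Sort each criterion into satisfied, violated or not applicable."""
    report = {"satisfied": [], "violated": [], "not_applicable": []}
    for criterion in criteria:
        # A conditional criterion only applies if the user answered "yes".
        if criterion.applies_if and not answers.get(criterion.applies_if, False):
            report["not_applicable"].append(criterion.name)
        elif criterion.required_type in found_types:
            report["satisfied"].append(criterion.name)
        else:
            report["violated"].append(criterion.name)
    return report


# Information types assumed to have been found in the policy by the classifier.
found = {"CONTROLLER_IDENTITY", "DATA_SUBJECT_RIGHTS"}
report = check_completeness(found, {"transfers_outside_eu": True})
print(report)
```

In this example the user states that data is transferred outside the EU, but the policy contains no matching information type, so that criterion ends up in the `violated` list.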

Unlike keyword-based tools, CompAI can represent – to a certain degree – the actual meaning of sentences. “The solution we developed uses artificial intelligence and, in particular, a combination of natural language processing and machine learning. This means that we don’t represent words as textual entities, but use word embeddings. These are mathematical vectors that are generated with deep learning to represent syntactical and semantical characteristics of the sentence. This process provides the tool with a certain ‘understanding’ of the text,” says Dr. Sallam Abualhaija, research scientist at SnT.
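As a rough illustration of the idea behind word embeddings, the sketch below compares short phrases by the cosine similarity of their averaged word vectors. The three-dimensional vectors are hand-written toy values, not embeddings learned with deep learning, and averaging is just one simple way to combine them; the article does not specify how CompAI does this.

```python
import math

# Toy vectors invented for this example; real embeddings are learned from
# large text corpora and have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "erase":       [0.90, 0.10, 0.00],
    "delete":      [0.80, 0.20, 0.10],
    "data":        [0.10, 0.90, 0.10],
    "information": [0.15, 0.85, 0.10],
    "cookies":     [0.00, 0.20, 0.90],
    "browser":     [0.10, 0.10, 0.95],
}


def sentence_vector(sentence):
    """Average the vectors of all known words in the sentence."""
    words = [w for w in sentence.lower().split() if w in TOY_EMBEDDINGS]
    if not words:
        raise ValueError("no known words in sentence")
    dims = len(next(iter(TOY_EMBEDDINGS.values())))
    return [sum(TOY_EMBEDDINGS[w][i] for w in words) / len(words)
            for i in range(dims)]


def cosine(u, v):
    """Cosine similarity: close to 1 for vectors pointing the same way."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


a = sentence_vector("erase data")
b = sentence_vector("delete information")
c = sentence_vector("browser cookies")

print(cosine(a, b))  # different words, similar meaning: high similarity
print(cosine(a, c))  # unrelated topic: much lower similarity
```

The first two phrases share no words, so a keyword comparison would treat them as unrelated, while their vectors are nearly parallel.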

The tool is just the first result of the partnership taking place under the ARTAGO umbrella. “SnT has a wealth of experience in AI, including NLP-driven solutions. Our collaboration saw SnT provide its experience, while Linklaters provides its legal knowledge in the interpretation of the relevant legal texts. Thanks to this joint effort the first products are taking shape,” said Sylvie Forastier, Innovation & Information Management Specialist at Linklaters.

From privacy policies to data processing agreements and beyond

“Every organisation, EU-based or not, has to comply with GDPR – lest be fined up to € 20 million. Instead of having our lawyers spend their valuable time reviewing hundreds of legal documents, we reached out to SnT. Our partnership with the research centre paves the way for the development of effective AI-enabled automation for checking compliance,” said Patrick Geortay, managing partner, Linklaters LLP Luxembourg. “We are proud to be at the forefront of research together with SnT, and take a step towards integrating AI in the future of legal practice,” he added.

In fact, the partnership between the law firm and SnT covers many use cases of GDPR compliance. After CompAI, the researchers are now looking into other legal artifacts, covering different scenarios under the umbrella of GDPR. What’s more, the technology and the method being developed at SnT could in the future also be applied to different provisions and legal artifacts.

“We are now working on data processing agreements, which represent the next stage of our project,” said Abualhaija. “It is inspiring to be working on solutions that will one day help to protect the privacy of millions of users,” she added.