Bierens Terms and Conditions
ICT & Artificial Intelligence
Client company: Bierens Group
Tobin Stultiens
Project description
The main research question from Bierens Group was whether deep learning techniques could automate scanning a document and marking flags according to their context.
This meant the model had to look beyond the flagged text itself and understand the surrounding context, so that it could determine which type of flag it was dealing with.
Context
One of the tasks Bierens Group performs is evaluating Terms and Conditions: they assess how good or bad a given set of Terms and Conditions is and assign a score to reflect its quality. Currently their lawyers do this manually, and reading through these documents to mark every positive or negative point costs them a lot of time.
The idea of the project is therefore to analyse Terms and Conditions documents automatically and flag all points of attention. The flags are determined by the lawyers, who can use phrases or keywords to define a flag. Each flag is categorized by colour: green means its presence is positive, red means its presence is negative, and yellow means context is needed to determine its impact.
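The flag scheme described above can be illustrated with a minimal sketch. This is plain keyword matching, not the deep learning approach the project investigated, and every name in it (Colour, Flag, FLAGS, find_flags) as well as the example phrases and their colours are assumptions made for illustration; the real flags are defined by the lawyers.

```python
from dataclasses import dataclass
from enum import Enum


class Colour(Enum):
    GREEN = "green"    # positive when present
    RED = "red"        # negative when present
    YELLOW = "yellow"  # impact depends on context


@dataclass(frozen=True)
class Flag:
    phrase: str      # keyword or phrase that defines the flag
    colour: Colour


# Hypothetical flags, invented for this sketch only.
FLAGS = [
    Flag("retention of title", Colour.GREEN),
    Flag("limitation of liability", Colour.YELLOW),
    Flag("no interest is payable", Colour.RED),
]


def find_flags(text, flags=FLAGS):
    """Return (sentence, flag) pairs for every flag phrase found in the text."""
    hits = []
    # Naive sentence split on full stops; good enough for a sketch.
    for sentence in text.split("."):
        lowered = sentence.lower()
        for flag in flags:
            if flag.phrase in lowered:
                hits.append((sentence.strip(), flag))
    return hits


if __name__ == "__main__":
    sample = (
        "The seller keeps a retention of title. "
        "No interest is payable on late invoices."
    )
    for sentence, flag in find_flags(sample):
        print(f"[{flag.colour.value}] {sentence}")
```

A lookup like this can only tell that a phrase occurs. It cannot decide what a yellow flag means in its sentence, which is the part that requires a model that understands context.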
Results
The results show that the idea is feasible, but we lacked concrete labelled data for the flags. As a result, we were unable to train the RobBERT model to recognize the flags when running a sentiment analysis. We were able to train the model to become familiar with the texts so that a sentiment analysis could be run on them; the labelled data was the only thing missing to finish the project.
We are satisfied with the results, since we were able to prove that the concept is possible. Only by doing this project did we discover some of its limitations. The biggest one for us was the lack of context-related data: we had more than enough data to train the model to recognize and work with legal writing, but not enough context-related information to train it to recognize labels in the text with sentiment analysis.
Methodology
We followed the IBM data science methodology to approach the project in a constructive way, and we did a lot of research into the different options available to us.
We handled most of the problems and challenges we faced in a structured way: first analysing the problem and what it was related to, then looking for ways to either resolve it or work around it. This was the main method we used, and we did not encounter any challenge that we could not resolve.
About the project group
My previous education was an MBO in Applied Programming.
We have spent the last 18 weeks working on this project.
We worked mostly remotely because of the current circumstances.