Advanced Threat Monitoring System for RobotLab
Cyber Security
Semester programme: Cyber Security Professional
Research group: Cyber Security
Project group members: Afonso Costa
Matei Patrascu
Norbert Knez
Sebastiao Rodrigues
George Batca
Miguel Mesquita
Racxshan Nagalingam
Project description
Research and education labs are putting more and more of their work in the cloud, but they usually do not have their own security team watching over it. So the question we started with was simple: how can you build a complete threat monitoring platform for a lab like that, one that keeps an eye on the network, the systems and the cloud account all at once, and that can spot and help deal with attacks without needing a full security team behind it?
The design challenge was to take a set of open-source security tools and turn them into one working platform, what the industry calls an XDR (Extended Detection and Response). It had to watch several layers at the same time, bring all the alerts together in one place, and still be cheap enough that a small organisation could actually afford to run it.
Context
The project is about cybersecurity, and more specifically about security monitoring for a research and education setting. Our client is a robotics research laboratory that runs part of its work in the cloud. Just like a lot of smaller organisations, it has valuable research and data to protect, but it does not have a big IT security team keeping watch all day.
This is a problem that keeps getting bigger. Attacks against cloud and container environments have gone up a lot in the last few years, and they usually do not stay on one layer. An attacker might start with a stolen cloud login, get into a container, and from there try to move across the internal network. Tools that only look at one of those layers tend to miss what is really going on.
The usual answer to this in the industry is XDR, which is the idea of pulling network, system and cloud monitoring together so the signals from each layer can be connected into one picture. The catch is that most ready-made XDR products are expensive and run as closed services, which does not really fit the budget or the open way of working that a research lab likes.
So our project looks at whether you can get the same result using open-source tools in the cloud, set up fully as code. That puts the work somewhere between cloud computing, container security, network monitoring and threat intelligence, while still keeping a close eye on cost.
Results
The main thing we built is a working XDR platform that runs in the cloud and watches three layers at the same time, then brings everything together in one place. The whole thing is set up as code, so it can be torn down and rebuilt automatically. That makes it reproducible and a lot easier to keep up to date.
The platform is made up of several open-source tools, each picked after comparing the options, and they cover three layers:
For the network, two sensors look at the live traffic for things like intrusions, scans, brute-force attempts and suspicious connections, and they use threat-intelligence data to recognise known bad addresses.
For the systems and containers, sensors built on eBPF technology watch what processes are doing inside the cloud's container cluster, and they can even stop known hacking tools the moment someone tries to run them.
For the cloud itself, the platform also reads the activity log of the cloud account, so things like strange logins and changes to the setup show up too.
All of these signals end up in one central system, where the alerts are stored, searched and shown on a dashboard. On top of that there is a threat-intelligence platform and an automated response part to round off the chain.
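To make the idea of connecting signals from different layers concrete, here is a minimal sketch in Python. It is not the platform's actual code: the `Alert` schema, the field names, the ten-minute window and the sample alerts are all assumptions made for illustration. It groups alerts by a shared entity (here a source IP address) and flags an entity when at least two layers report on it within a short time window.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Alert:
    """Hypothetical common schema; all field names are assumptions."""
    layer: str           # "network", "host" or "cloud"
    entity: str          # shared key, e.g. a source IP address
    rule: str            # name of the rule that fired
    timestamp: datetime


def correlate(alerts, window=timedelta(minutes=10), min_layers=2):
    """Return entities reported by at least `min_layers` layers within `window`."""
    by_entity = defaultdict(list)
    for alert in alerts:
        by_entity[alert.entity].append(alert)

    incidents = {}
    for entity, items in by_entity.items():
        items.sort(key=lambda a: a.timestamp)
        start = 0
        for end in range(len(items)):
            # Move the left edge forward until the span fits in the window.
            while items[end].timestamp - items[start].timestamp > window:
                start += 1
            layers = {a.layer for a in items[start:end + 1]}
            if len(layers) >= min_layers:
                incidents[entity] = sorted(layers)
                break
    return incidents


# Made-up sample data, not taken from the project's logs.
t0 = datetime(2024, 1, 1, 12, 0)
sample = [
    Alert("cloud", "10.0.1.5", "unusual console login", t0),
    Alert("host", "10.0.1.5", "known attack tool executed", t0 + timedelta(minutes=3)),
    Alert("network", "10.0.2.9", "port scan", t0 + timedelta(minutes=4)),
    Alert("network", "10.0.2.9", "port scan", t0 + timedelta(hours=2)),
]
result = correlate(sample)
print(result)
```

In this sample only the first address is flagged, because it shows up on both the cloud and the host layer within three minutes, while the second address is only ever seen by the network sensors. A real central system would do this with its own query and correlation rules; the sketch only shows the principle.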
The biggest insight for us is that an open-source XDR like this is actually realistic for a small organisation. The platform ran from end to end in a real cloud environment, and during a controlled purple-team test it handled somewhere around two hundred thousand alerts in a single day. We checked the detections against the MITRE ATT&CK framework, a public knowledge base of attacker tactics and techniques, and we could confirm that common attacks like port scans, brute-force attempts and the use of hacking tools were picked up across the different layers.
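Checking detections against ATT&CK comes down to mapping each rule to a technique and then seeing which layers observed it. The sketch below shows one way to do that in Python. The technique IDs T1046 (Network Service Discovery) and T1110 (Brute Force) are real ATT&CK entries, but the rule names, the mapping and the sample detections are illustrative assumptions, not the project's actual rule set. The use of hacking tools is left out on purpose, because the right technique depends on the tool.

```python
from collections import defaultdict

# Technique IDs come from MITRE ATT&CK; the rule names and the
# rule-to-technique mapping are illustrative assumptions.
RULE_TO_TECHNIQUE = {
    "port scan": "T1046",               # Network Service Discovery
    "ssh brute force": "T1110",         # Brute Force
    "repeated failed logins": "T1110",  # Brute Force
}


def coverage_by_technique(detections):
    """Return (technique -> sorted layers that saw it, sorted unmapped rules)."""
    coverage = defaultdict(set)
    unmapped = set()
    for layer, rule in detections:
        technique = RULE_TO_TECHNIQUE.get(rule)
        if technique is None:
            unmapped.add(rule)  # keep track of rules that still need a mapping
        else:
            coverage[technique].add(layer)
    return {t: sorted(layers) for t, layers in coverage.items()}, sorted(unmapped)


# Made-up detections as (layer, rule) pairs.
observed = [
    ("network", "port scan"),
    ("network", "ssh brute force"),
    ("host", "repeated failed logins"),
    ("cloud", "new access key created"),
]
coverage, unmapped = coverage_by_technique(observed)
print(coverage)
print(unmapped)
```

A table like this makes it easy to see which techniques are covered by more than one layer and which rules have no mapping yet.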
We were also honest about what is not finished, which we think counts as a result too. The automated response part is built and we tested it on its own, but the full automatic response chain is not switched on for everyday use yet, and one of the cloud log sources still needs some work before its alerts come in reliably. Writing these gaps down clearly is useful for whoever continues the project, and it gives a real picture instead of a perfect one.
For technology readiness we put the platform at TRL 6, which means the technology was shown working in a relevant environment, and it is moving towards TRL 7. The whole system was demonstrated working together in the kind of cloud environment it is actually meant for, and we tested it by attacking it instead of only checking it on paper. It is not higher than that yet, because TRL 8 or 9 would need it running in real daily use, watched around the clock, with the open gaps closed.
The value is on two sides. The client gets a cheap and transparent monitoring platform they can run and extend themselves, without being locked into a commercial product. And for the wider field it shows a repeatable example of how smaller research and education organisations can reach a level of visibility that used to need a big company and a big budget.
About the project group
We are a group of seven students from the Cyber Security side of the ICT programme at our university of applied sciences. Before this project most of us had already done courses in software development, networking and cloud in the earlier years, so we came in with a bit of a mixed background. That turned out to be useful, because this project needed a little bit of everything.
We worked on it for one full semester, which was about 16 weeks. On average each of us spent two to three days a week on the project. We worked in an Agile way with sprints of three to four weeks. At the start of a sprint we planned what we wanted to do, and at the end we looked back at what we actually finished. For the research part we followed the DOT framework, which basically means we mixed reading and theory with talking to people who know the field and doing a lot of testing ourselves. We kept everything we built as code in a shared Git repository, so anyone in the team could rebuild or change it, and we always checked each other's work before it went live.