Generic test platform for LLMs
Project description
One of the biggest challenges in this project was finding a way to check model output for correctness, since BDO wanted a dynamic system that does not rely on fixed golden answers.
Context
The application will be used to test which LLM performs better in which cases. BDO is an accountancy firm, so accountancy is the sector it will be used in.
Results
We delivered a fully working LLM testing application that BDO can use the moment we hand it over. The application was built with modularity and flexibility in mind: whether the user wants to compare 1, 2 or even 5 models at the same time, the system is designed to handle it.
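To illustrate what comparing any number of models at once can look like, here is a minimal sketch. It is not the delivered code: the names `ModelFn`, `ComparisonRow` and `compare_models` are hypothetical, and the two stand-in functions replace real LLM calls so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Assumption: a "model" is any function that maps a prompt to a text answer.
ModelFn = Callable[[str], str]


@dataclass
class ComparisonRow:
    """All answers given to a single prompt, keyed by model name."""
    prompt: str
    answers: Dict[str, str]


def compare_models(models: Dict[str, ModelFn], prompts: List[str]) -> List[ComparisonRow]:
    """Send every prompt to every registered model and collect the answers."""
    rows = []
    for prompt in prompts:
        answers = {name: ask(prompt) for name, ask in models.items()}
        rows.append(ComparisonRow(prompt=prompt, answers=answers))
    return rows


# Stand-in models for illustration only; a real setup would call an LLM here.
def echo_model(prompt: str) -> str:
    return f"echo: {prompt}"


def upper_model(prompt: str) -> str:
    return prompt.upper()


if __name__ == "__main__":
    registry = {"echo": echo_model, "upper": upper_model}
    for row in compare_models(registry, ["What is VAT?"]):
        for name, answer in row.answers.items():
            print(f"{name}: {answer}")
```

Because the models live in a plain dictionary, adding a third or fifth model in this sketch means adding one more entry; the comparison loop itself does not change.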
About the project group
Our group worked in an agile way for 18 weeks, in two-week sprints.