How to track automated "performance"-type tests over time?
I'm pretty familiar with automated tests that compare a received value to an expected value (e.g. basically all unit/integration tests). In a CI/CD workflow, you handle a test failure by failing the whole pipeline, and the commit/PR/etc. then shows a failed pipeline next to it.
However, what if I have some kind of "performance" measure I want to track instead? Something that isn't pass/fail, but rather a set of experimental results over time (e.g. response times of an API, win/draw/loss rates for a chess bot, confusion matrix scores for a classifier, etc.)? Is there a tool that can show the results of that kind of "automated experiment" ordered by git commit, pull request, etc.?
I thought about sending the data to some kind of data store with a Grafana front-end, but I was hoping there might be some less "diy" method for creating such a display.
My particular use case is actually a hobby/fun project: developing a bot in Rust to play a game (specifically, Screeps), and I want to track how fast it hits certain game thresholds with each newly developed feature. I use Gitea Actions for CI/CD, but it's all running on my local network/home lab, so I'm happy to switch as needed.
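To make the "diy" option concrete, here is roughly the kind of record I imagine each pipeline run emitting before it gets shipped to a data store. This is only a minimal sketch under assumptions: the `MetricRecord` type, the `COMMIT_SHA` environment variable, and the `ticks_to_threshold` metric name are all hypothetical placeholders, not anything Gitea Actions or Screeps provides by those names.

```rust
use std::env;

/// One measurement from a single pipeline run.
/// (Hypothetical type; the field names are placeholders.)
struct MetricRecord {
    commit: String,
    metric: String,
    value: f64,
}

impl MetricRecord {
    /// Render the record as one CSV line: commit,metric,value
    fn to_csv_line(&self) -> String {
        format!("{},{},{}", self.commit, self.metric, self.value)
    }
}

/// Read the commit hash from a hypothetical `COMMIT_SHA` variable that the
/// pipeline would have to set; fall back to "unknown" if it is missing.
fn commit_from_env() -> String {
    env::var("COMMIT_SHA").unwrap_or_else(|_| "unknown".to_string())
}

fn main() {
    let record = MetricRecord {
        commit: commit_from_env(),
        // Hypothetical metric: game ticks until some threshold is reached.
        metric: "ticks_to_threshold".to_string(),
        value: 1234.5,
    };
    // The pipeline could append this line to a file or send it to a store.
    println!("{}", record.to_csv_line());
}
```

Appending one such line per run would give a time series keyed by commit, which is the thing I'd then want some tool to display.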