CI: Introduce FAAC Benchmark Suite for automated regression testing #78
nschimme wants to merge 5 commits into knik0:master from
Conversation
Seems we need to tweak some permissions for it to leave a GH comment: https://github.com/knik0/faac/actions/runs/22672563501/job/65726481319?pr=78. We should have seen something like nschimme#38.
Frankly, it feels a bit uneasy to introduce a test suite that's about as big as the library itself and that downloads some random samples from somewhere else under a questionable license. I'll put trust in your justice if you tell me that the changes you suggest will generate output identical to before. You know, for me this is just a little side project. I'm the last one alive here with commit rights. I haven't written a single line of the actual codec myself.
I get the hesitation, but I'm doing this specifically so you don't have to trust my 'justice.' I've already verified the changes are 100% bit-identical, and this suite is just the math to prove it to you so you don't have to audit code you didn't write. On the license/size concerns: the samples aren't in the repo; the CI just pulls them to run the check, which keeps the library clean. If the suite ever becomes a maintenance headache or the uneasiness doesn't go away, just rm -rf tests/ and delete the workflow. I'll be the one maintaining it anyway, so if it breaks, that's on me. I'd rather have the data than fly blind. How about we run with it, and if it's a pain in the ass, we scrap it?
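For reference, a bit-identical claim like the one above can be checked with a plain byte comparison of encoder output. The sketch below assumes two faac binaries built from the two branches; the binary names and sample path are placeholders, not part of this PR:

```shell
#!/bin/sh
# Sketch: encode the same sample with two faac builds and fail if the
# outputs differ by even one byte. Binary and file names are placeholders.
compare_encodes() {
    old_bin=$1 new_bin=$2 sample=$3
    "$old_bin" -o /tmp/old.aac "$sample"
    "$new_bin" -o /tmp/new.aac "$sample"
    cmp -s /tmp/old.aac /tmp/new.aac   # exit 0 only when bit-identical
}

# Usage:
#   compare_encodes ./faac-master ./faac-pr sample.wav && echo bit-identical
```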
That does beg the question: do you have access to give other people write access? If not, maybe we create a new
I only have commit rights, I cannot change anything about the repository. My idea is to get the remaining three PRs merged into the code (without the test suite) and release this as 1.40. Then I'd abandon this repository as well and will happily hand over maintenance to a more active fork. And please don't forget about the brother project faad2. |
Sounds good, I'll keep maintaining it on my side and leave comments with the results in my PRs.
We could be cheeky... I see that https://github.com/FAACD is free 😈 |
I extracted the code into a repo that I own and exposed it as a GitHub Action. This PR just uses it now. See the extracted solution at https://github.com/nschimme/faac-benchmark
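For anyone who wants to try the extracted action, a consuming workflow might look like the sketch below. This is illustrative only: the action reference is real, but its actual inputs should be taken from that repo's README, not from here.

```yaml
# Hypothetical consumer workflow; check nschimme/faac-benchmark for real inputs.
name: faac-benchmark
on: pull_request

jobs:
  benchmark:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nschimme/faac-benchmark@main   # the extracted action
```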
Answering my own question, I think I'll have to tweak the failure and win thresholds a bit (and possibly make changes together). I'll post this table here for our reference:
This PR introduces the FAAC Benchmark Suite, an automated CI/CD pipeline designed to provide objective data on every change.
Currently, the project lacks a formal regression and testing suite. For a maintainer, this makes merging optimizations or refactors a high-risk activity. This suite aims to act as a "safety net," providing the metrics needed to ensure that new code maintains the project's standards for quality, speed, and size.
The "Golden Triangle" Philosophy
I've designed the benchmarking logic around three pillars critical to the FAAC mission. Note that these are a first draft—I am completely open to adjusting this philosophy or the specific metrics based on what you value most for the project.
Binary size: libfaac.so. For embedded VSS and IoT targets, binary size is a primary feature, and this suite makes any "bloat" immediately visible.
Implementation
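As a sketch of how the binary-size pillar could be enforced in CI, the snippet below compares the built library against a recorded baseline. The threshold, file paths, and failure rule are assumptions for illustration, not what the suite actually ships:

```shell
#!/bin/sh
# Sketch: compare the built libfaac.so against a recorded baseline and
# fail when growth exceeds a threshold. All paths and values are assumptions.
check_size() {
    new=$1 baseline=$2 threshold=${3:-1024}   # allow up to 1 KiB of growth
    new_sz=$(wc -c < "$new")
    base_sz=$(wc -c < "$baseline")
    growth=$((new_sz - base_sz))
    echo "size: ${new_sz}B (baseline ${base_sz}B, delta ${growth}B)"
    [ "$growth" -le "$threshold" ]            # non-zero exit fails the CI job
}

# Usage: check_size build/libfaac.so baseline/libfaac.so 1024
```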
Focus & Feedback Requested
The primary goal of this draft is to establish the metrics. I would value your feedback on:
master branch.
This is intended to be a collaborative baseline. I want to ensure the metrics we track are the ones that give you the most confidence when reviewing contributions.
Sample Report from this PR: benchmark-report-full.zip