perf(fp-history): batch false positive history processing by valentijnscholten · Pull Request #14449 · DefectDojo/django-DefectDojo

valentijnscholten · 2026-03-05T20:54:48Z

Summary

- N+1 query fix: replaced per-finding match_finding_to_existing_findings() calls with a single product-scoped batch query (_fetch_fp_candidates_for_batch) shared across all findings in the batch
- Bulk update: replaced all per-finding save() / save_no_options() calls in false positive history paths with QuerySet.update(), which bypasses Django signals identically to the previous calls (all update sites carry a comment explaining this)
- post_process_findings_batch (import/reimport): now calls do_false_positive_history_batch() instead of a per-finding loop, so one DB query instead of N
- _bulk_update_finding_status_and_severity (bulk edit): findings grouped by (product, dedup_alg) and processed with a single batch call per group; retroactive reactivation also batched
- Dead-code fix: process_false_positive_history in the single-finding edit view had the condition finding.false_p and not finding.false_p (always False) because form.save(commit=False) with instance=finding mutates the object in place. Fixed by capturing old_false_p = finding.false_p before the form save and passing it as a keyword argument
- Algorithm dispatch unified: extracted _fp_candidates_qs() as the single source of truth for hash_code / unique_id_from_tool / unique_id_or_hash / legacy query building, shared by both match_finding_to_existing_findings (returns lazy QS for chaining) and _fetch_fp_candidates_for_batch (evaluates into a keyed dict)
- Moved to deduplication.py: all FP history helpers relocated from dojo/utils.py to dojo/finding/deduplication.py alongside the equivalent dedupe helpers; import sites in helper.py, views.py, and tests updated accordingly
- 4 new unit tests: batch single-query behaviour, retroactive batch FP marking, retroactive reactivation (previously unreachable), and the no-reactivation guard
- Query counts: added some asserts on query counts to make sure we don't regress to N+1 in the future. Didn't go the full monty as with the import/reimport performance test as FP History is much less used.
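
The dead-code fix above is easiest to see in isolation. The following is a minimal sketch in plain Python, not the DefectDojo code: `Finding`, `FakeForm`, `buggy_was_unmarked`, and `fixed_was_unmarked` are hypothetical stand-ins, and `FakeForm.save` only mimics the way a form bound with instance=finding writes its data onto that same object.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical stand-in for the real model; only the flag of interest.
    false_p: bool = False


class FakeForm:
    """Mimics a form bound to an instance: save() mutates that instance in place."""

    def __init__(self, data: dict, instance: Finding):
        self.data = data
        self.instance = instance

    def save(self, commit: bool = False) -> Finding:
        self.instance.false_p = self.data["false_p"]
        return self.instance  # the same object, not a copy


def buggy_was_unmarked(finding: Finding, form: FakeForm) -> bool:
    new_finding = form.save(commit=False)
    # finding and new_finding are the same object, so this is always False.
    return finding.false_p and not new_finding.false_p


def fixed_was_unmarked(finding: Finding, form: FakeForm) -> bool:
    old_false_p = finding.false_p  # capture before the form mutates the object
    new_finding = form.save(commit=False)
    return old_false_p and not new_finding.false_p


# A finding that was a false positive and is being unmarked in the edit form.
f1 = Finding(false_p=True)
buggy_result = buggy_was_unmarked(f1, FakeForm({"false_p": False}, f1))

f2 = Finding(false_p=True)
fixed_result = fixed_was_unmarked(f2, FakeForm({"false_p": False}, f2))
```

With the old value captured first, the "was a false positive, no longer is" transition becomes detectable, which is what makes the retroactive reactivation path reachable.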

Needs a Pro PR to cater for the moved/renamed methods.

Replaces the N+1 query pattern in false positive history with a single product-scoped DB query per batch, and switches per-finding save() calls to QuerySet.update() to eliminate redundant signal overhead.

Changes:
- Extract _fp_candidates_qs() as the single algorithm-dispatch helper shared by both single-finding and batch lookup paths
- Add do_false_positive_history_batch() which fetches all FP candidates in one query and marks findings with a single UPDATE
- do_false_positive_history() now delegates to the batch function
- post_process_findings_batch (import/reimport) calls the batch function instead of a per-finding loop
- _bulk_update_finding_status_and_severity (bulk edit) groups findings by (product, dedup_alg) and calls the batch function once per group; retroactive reactivation also batched the same way
- Fix dead-code bug in process_false_positive_history: the condition finding.false_p and not finding.false_p was always False because form.save(commit=False) mutates the finding in place; fixed by capturing old_false_p before the form save
- Replace all per-finding save()/save_no_options() in FP history paths with QuerySet.update() (bypasses signals identically to the old calls)
- Move all FP history helpers from dojo/utils.py to dojo/finding/deduplication.py alongside the matching dedupe helpers

All update() calls carry a comment explaining the signal-bypass equivalence with the previous save(skip_validation=True) calls.

Adds 4 unit tests covering: batch single-query behaviour, retroactive batch FP marking, retroactive reactivation (previously dead code), and the no-reactivation guard.
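
The batch shape described in this commit message (one candidate fetch evaluated into a keyed dict, then one bulk update) can be sketched without Django. Everything below is an illustrative assumption rather than the PR's code: `FakeDB` is a hypothetical in-memory stand-in that only counts round trips, `mark_false_positives_batch` is a made-up name, and matching is done on hash_code alone, whereas the real helpers dispatch on the configured deduplication algorithm.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Finding:
    # Hypothetical, heavily reduced stand-in for the real model.
    id: int
    product: str
    hash_code: str
    false_p: bool = False


class FakeDB:
    """In-memory stand-in for the ORM that counts round trips."""

    def __init__(self, rows):
        self.rows = {row.id: row for row in rows}
        self.queries = 0

    def select_by_hash_codes(self, product, hash_codes):
        self.queries += 1  # one SELECT ... WHERE hash_code IN (...)
        return [r for r in self.rows.values()
                if r.product == product and r.hash_code in hash_codes]

    def update_false_p(self, ids, value):
        self.queries += 1  # one UPDATE ... WHERE id IN (...)
        for pk in ids:
            self.rows[pk].false_p = value


def mark_false_positives_batch(db, product, new_findings):
    """Mark new findings as FP when an existing FP shares their hash_code."""
    hash_codes = {f.hash_code for f in new_findings}

    # Single product-scoped fetch, evaluated into a dict keyed by hash_code.
    candidates = defaultdict(list)
    for row in db.select_by_hash_codes(product, hash_codes):
        candidates[row.hash_code].append(row)

    to_mark = [f.id for f in new_findings
               if any(c.false_p and c.id != f.id for c in candidates[f.hash_code])]
    if to_mark:
        db.update_false_p(to_mark, True)  # single bulk update for all matches
    return to_mark


existing = [Finding(1, "p", "aaa", false_p=True), Finding(2, "p", "bbb")]
new = [Finding(3, "p", "aaa"), Finding(4, "p", "bbb"), Finding(5, "p", "ccc")]
db = FakeDB(existing + new)
marked = mark_false_positives_batch(db, "p", new)
```

In this sketch only finding 3 is marked, and the query counter stays at two no matter how many new findings are passed in. A per-finding loop would instead issue one lookup and one save per finding.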

Limit _fetch_fp_candidates_for_batch to only the fields actually read from candidate objects (id, false_p, active, hash_code, unique_id_from_tool, title, severity), avoiding loading unused columns. Correct update() comments to clarify that .only() does not constrain QuerySet.update() — Django generates UPDATE SQL independently — so the sync requirement is only for fields *read* from candidate objects.

assertNumQueries(7) on both batch tests covers: the System_Settings lookup, a 4-query lazy-load chain (test/engagement/product/test_type from findings[0]), the candidates SELECT with .only(), and the bulk UPDATE. The count is fixed regardless of batch size or number of retroactively marked findings.

New test creates 5 pre-existing findings and asserts the batch still uses exactly 7 queries regardless — proving the old O(N) per-finding save loop is gone and a single bulk UPDATE covers all affected rows.
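
The idea behind this flat-query-count assertion can be reproduced at the SQL level with the standard library alone. This is only an analogy to the Django test, under stated assumptions: the `finding` table here is a made-up three-column schema, `count_statements_for_batch` is a hypothetical helper, and sqlite3's trace callback stands in for assertNumQueries.

```python
import sqlite3


def count_statements_for_batch(n_existing: int) -> int:
    """Return how many SELECT/UPDATE statements the batch path issues."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE finding (id INTEGER PRIMARY KEY, hash_code TEXT, false_p INTEGER)"
    )
    conn.executemany(
        "INSERT INTO finding (hash_code, false_p) VALUES (?, 0)",
        [("abc",)] * n_existing,
    )
    conn.commit()

    statements = []
    conn.set_trace_callback(statements.append)  # record every statement from here on

    # One SELECT limited to the column that is actually read ...
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM finding WHERE hash_code = ? AND false_p = 0", ("abc",))]
    # ... and one UPDATE covering every affected row.
    placeholders = ",".join("?" * len(ids))
    conn.execute(f"UPDATE finding SET false_p = 1 WHERE id IN ({placeholders})", ids)
    conn.commit()

    conn.set_trace_callback(None)
    conn.close()
    # Ignore the implicit BEGIN/COMMIT that sqlite3 issues around the UPDATE.
    return len([s for s in statements if s.startswith(("SELECT", "UPDATE"))])


counts = [count_statements_for_batch(n) for n in (1, 5, 50)]
```

Every entry in `counts` is the same, which is the property the new test pins down: the number of statements depends on the shape of the batch path, not on how many rows it touches.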

github-actions bot added the unittests label Mar 5, 2026

valentijnscholten added 3 commits March 5, 2026 22:16

test(fp-history): assert query count stays flat with N affected findings

9232d60


valentijnscholten added this to the 2.57.0 milestone Mar 5, 2026

valentijnscholten added the affects_pro label (PRs that affect Pro and need a coordinated release/merge moment) Mar 5, 2026

valentijnscholten marked this pull request as ready for review March 6, 2026 06:58

valentijnscholten requested review from Maffooch and mtesauro as code owners March 6, 2026 06:58

valentijnscholten wants to merge 4 commits into DefectDojo:dev from valentijnscholten:fp-history-batching
