feature(parallel): add parallel plugin execution with resource-aware scheduling #10

Open
quinnjr wants to merge 1 commit into FIUBioRG:master from quinnjr:feature/parallel-execution

Contributor

@quinnjr commented Mar 27, 2026

Summary

Adds support for running independent pipeline plugins concurrently via explicit Parallel/EndParallel blocks in PluMA configuration files. Pipelines with independent processing steps can run those steps side by side, which should reduce wall-clock time (not yet benchmarked; see Test plan), while PluMA's modular plugin architecture is preserved.

Key features

  • Explicit parallel blocks: Parallel/EndParallel directives in config files with workers, memory, gpu, and fail options
  • Fork-based process isolation: each plugin runs in its own process, avoiding Python GIL and R interpreter limitations
  • Resource-aware scheduling: memory, GPU, and worker slot budgets prevent oversubscription
  • Configurable failure modes: fail=fast (default) aborts on first error; fail=continue runs all tasks
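
The fork-based dispatch and the two failure modes can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in src/ParallelScheduler.cxx: `FailMode`, `RunResult`, and `run_parallel` are hypothetical names, tasks are modeled as plain callables rather than PluMA plugins, and memory/GPU budgeting is left out.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cerrno>
#include <cstddef>
#include <functional>
#include <set>
#include <vector>

// Hypothetical names throughout; none of this is taken from the PR's sources.
enum class FailMode { Fast, Continue };

struct RunResult {
    std::size_t started;  // tasks that were actually forked
    std::size_t failed;   // children that did not exit 0, plus failed forks
};

// Runs each task in its own child process, at most `workers` at a time.
RunResult run_parallel(const std::vector<std::function<int()>>& tasks,
                       std::size_t workers, FailMode mode) {
    if (workers == 0) workers = 1;  // assumption: treat 0 as "one worker"
    RunResult result = {0, 0};
    std::set<pid_t> running;
    std::size_t next = 0;
    bool aborted = false;

    while (true) {
        // Fill free worker slots unless an abort is in progress.
        while (!aborted && next < tasks.size() && running.size() < workers) {
            pid_t pid = fork();
            if (pid < 0) {  // could not fork: stop launching new tasks
                ++result.failed;
                aborted = true;
                break;
            }
            if (pid == 0) {
                // Child: run the task, then leave without touching parent state.
                _exit(tasks[next]() == 0 ? 0 : 1);
            }
            running.insert(pid);
            ++result.started;
            ++next;
        }
        if (running.empty()) break;  // nothing left to wait for

        int status = 0;
        pid_t done;
        do { done = waitpid(-1, &status, 0); } while (done < 0 && errno == EINTR);
        if (done < 0) break;                     // no children left to reap
        if (running.erase(done) == 0) continue;  // not one of our tasks

        const bool ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
        if (!ok) {
            ++result.failed;
            if (mode == FailMode::Fast && !aborted) {
                aborted = true;
                // Ask still-running siblings to stop; the loop above reaps them.
                for (pid_t pid : running) kill(pid, SIGTERM);
            }
        }
    }
    return result;
}
```

In this sketch, siblings terminated during a fail=fast abort are also counted in `failed`, so the count is only deterministic when `workers` is 1. Whether the real scheduler reports them separately is not stated in this PR.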

Config syntax example

Prefix /data/pipeline
Plugin Normalize inputA.csv normalizedA.csv

Parallel workers=4 memory=8G fail=continue
  Plugin FeatureExtractA normalizedA.csv featuresA.csv
  Plugin FeatureExtractB normalizedB.csv featuresB.csv
  Plugin FeatureExtractC normalizedC.csv featuresC.csv
EndParallel

Plugin Merge featuresA.csv,featuresB.csv,featuresC.csv merged.csv
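
As an illustration of how a `Parallel` directive line like the one above might be broken into options, here is a minimal sketch. `ParallelOptions` and `parse_parallel_line` are hypothetical names, not the ConfigParser API from this PR. The defaults for workers and gpu, and the treatment of gpu as an integer count, are assumptions; only the fail=fast default comes from the description above.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical option holder; field names and defaults are assumptions,
// except fail=fast, which the PR describes as the default.
struct ParallelOptions {
    int workers = 1;
    std::string memory;  // raw size string such as "8G"; empty when unset
    int gpu = 0;
    bool fail_fast = true;
};

// Parses a non-negative decimal integer of at most 9 digits (fits in int).
static bool parse_small_int(const std::string& text, int& out) {
    if (text.empty() || text.size() > 9) return false;
    int value = 0;
    for (char c : text) {
        if (c < '0' || c > '9') return false;
        value = value * 10 + (c - '0');
    }
    out = value;
    return true;
}

// Returns true and fills `out` only if `line` is a well-formed Parallel directive.
bool parse_parallel_line(const std::string& line, ParallelOptions& out) {
    std::istringstream in(line);
    std::string word;
    if (!(in >> word) || word != "Parallel") return false;

    ParallelOptions opts;
    while (in >> word) {
        const std::size_t eq = word.find('=');
        // Reject tokens with no '=', an empty key, or an empty value.
        if (eq == std::string::npos || eq == 0 || eq + 1 == word.size()) return false;
        const std::string key = word.substr(0, eq);
        const std::string value = word.substr(eq + 1);

        if (key == "workers") {
            if (!parse_small_int(value, opts.workers) || opts.workers == 0) return false;
        } else if (key == "gpu") {
            if (!parse_small_int(value, opts.gpu)) return false;
        } else if (key == "memory") {
            opts.memory = value;  // size parsing is left to a separate step
        } else if (key == "fail") {
            if (value == "fast") opts.fail_fast = true;
            else if (value == "continue") opts.fail_fast = false;
            else return false;
        } else {
            return false;  // unknown option
        }
    }
    out = opts;
    return true;
}
```

Rejecting empty keys, empty values, and unknown options up front keeps malformed directives from reaching the scheduler. How strictly the actual parser treats such input is not specified here.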

Components

| File | Description |
| --- | --- |
| src/ParallelTypes.h | Core data structures (ParallelBlock, PluginTask, ResourceBudget, etc.) |
| src/ConfigParser.{h,cxx} | Parses config files including Parallel blocks, size strings, plugin tasks |
| src/ResourceBudget.{h,cxx} | Tracks and enforces memory/GPU/worker constraints |
| src/ParallelScheduler.{h,cxx} | Fork-based concurrent dispatch with signal handling and cleanup |
| tests/ | 67 Catch2 unit tests covering parser, budget, and scheduler |
| fuzz/ | 4 libFuzzer harnesses with seed corpus for parse_size, parse_config, parse_plugin_task, and resource_budget |
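
To make the budget-tracking idea concrete, the following is a minimal sketch of an acquire/release tracker for worker slots, memory, and GPUs. `BudgetSketch` and `ResourceRequest` are hypothetical names chosen to avoid confusion with the PR's ResourceBudget. The interface, and the assumption that each task declares its needs up front, are illustrative and not taken from the source.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical per-task requirement; every task also uses one worker slot.
struct ResourceRequest {
    std::uint64_t memory_bytes;
    unsigned gpus;
};

// Illustrative tracker, not the ResourceBudget class from this PR.
class BudgetSketch {
public:
    BudgetSketch(unsigned workers, std::uint64_t memory_bytes, unsigned gpus)
        : total_workers_(workers), free_workers_(workers),
          free_memory_(memory_bytes), free_gpus_(gpus) {}

    // Reserves resources for one task, or returns false and changes nothing.
    bool try_acquire(const ResourceRequest& r) {
        if (free_workers_ == 0) return false;
        if (r.memory_bytes > free_memory_) return false;
        if (r.gpus > free_gpus_) return false;
        --free_workers_;
        free_memory_ -= r.memory_bytes;
        free_gpus_ -= r.gpus;
        return true;
    }

    // Returns the resources of a finished task; must match a prior acquire.
    void release(const ResourceRequest& r) {
        assert(free_workers_ < total_workers_);
        ++free_workers_;
        free_memory_ += r.memory_bytes;
        free_gpus_ += r.gpus;
    }

private:
    unsigned total_workers_;
    unsigned free_workers_;
    std::uint64_t free_memory_;
    unsigned free_gpus_;
};
```

A scheduler built on this would call `try_acquire` before launching a task and `release` after reaping it, deferring any task whose request does not currently fit.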

Testing

  • 67/67 unit tests passing (Catch2)
  • Fuzz testing with libFuzzer + ASan/UBSan — found and fixed one crash in malformed option parsing
  • Memory leak testing with AddressSanitizer/LeakSanitizer — zero findings
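
For readers unfamiliar with libFuzzer, a harness for a size-string parser could look roughly like the sketch below. `parse_size_sketch` is a hypothetical stand-in for the PR's parse_size, whose real signature and accepted suffixes are not shown here. The sketch assumes decimal digits followed by at most one K/M/G/T suffix, interpreted as powers of 1024. Only the `LLVMFuzzerTestOneInput` entry point is the standard libFuzzer interface.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical stand-in for the PR's size parser (see assumptions above).
bool parse_size_sketch(const std::string& text, std::uint64_t& bytes) {
    const std::uint64_t max = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t value = 0;
    std::size_t i = 0;
    while (i < text.size() && std::isdigit(static_cast<unsigned char>(text[i]))) {
        const std::uint64_t digit = static_cast<std::uint64_t>(text[i] - '0');
        if (value > (max - digit) / 10) return false;  // would overflow
        value = value * 10 + digit;
        ++i;
    }
    if (i == 0) return false;  // no leading digits

    unsigned shift = 0;
    if (i < text.size()) {
        switch (std::toupper(static_cast<unsigned char>(text[i]))) {
            case 'K': shift = 10; break;
            case 'M': shift = 20; break;
            case 'G': shift = 30; break;
            case 'T': shift = 40; break;
            default: return false;  // unknown suffix
        }
        ++i;
    }
    if (i != text.size()) return false;          // trailing characters
    if (value > (max >> shift)) return false;    // scaled value would overflow
    bytes = value << shift;
    return true;
}

// Standard libFuzzer entry point: must tolerate arbitrary bytes without crashing.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data, std::size_t size) {
    const std::string input =
        size ? std::string(reinterpret_cast<const char*>(data), size) : std::string();
    std::uint64_t bytes = 0;
    if (parse_size_sketch(input, bytes)) {
        // Invariant check: accepted inputs must begin with a decimal digit.
        if (input.empty() || !std::isdigit(static_cast<unsigned char>(input[0]))) {
            std::abort();
        }
    }
    return 0;
}
```

A harness like this is typically built with something like `clang++ -g -fsanitize=fuzzer,address,undefined`, which matches the libFuzzer + ASan/UBSan setup mentioned above. The exact build flags used in this PR are not stated.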

Test plan

  • Review config parser behavior with existing PluMA pipeline files (backward compatibility)
  • Integration test with actual PluMA plugins in parallel blocks
  • Performance benchmarking with real bioinformatics workloads
  • Test on multi-GPU systems for GPU slot scheduling
  • Wire ConfigParser and ParallelScheduler into src/PluMA.cxx main loop

Made with Cursor

…scheduling

Introduces explicit Parallel/EndParallel blocks in pipeline config files
to run independent plugins concurrently via fork()-based process isolation.

Components:
- ConfigParser: parses Parallel blocks with workers/memory/gpu/fail options
- ResourceBudget: tracks memory, GPU, and worker slot allocation
- ParallelScheduler: fork-based dispatch with fail-fast/continue modes
- Unit tests (Catch2): 67 tests covering parser, budget, and scheduler
- Fuzz targets (libFuzzer): parse_size, parse_config, parse_plugin_task,
  resource_budget with seed corpus

Made-with: Cursor
@quinnjr force-pushed the feature/parallel-execution branch from ef26e38 to e4a8bdd on March 27, 2026 18:08