WIP: [feature] Allow parallel execution of SparkShuffleWriter #480

Open
markjin1990 wants to merge 1 commit into bytedance:main from markjin1990:parallel-shuffle-writer
@markjin1990
Collaborator

What problem does this PR solve?

Issue Number: close #xxx

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 🚀 Performance improvement (optimization)
  • ⚠️ Breaking change (fix or feature that would cause existing functionality to change)
  • 🔨 Refactoring (no logic changes)
  • 🔧 Build/CI or Infrastructure changes
  • 📝 Documentation only

Description

Describe your changes in detail.
For complex logic, explain the "Why" and "How".

Performance Impact

  • No Impact: This change does not affect the critical path (e.g., build system, doc, error handling).

  • Positive Impact: I have run benchmarks.

    Click to view Benchmark Results
    Paste your google-benchmark or TPC-H results here.
    Before: 10.5s
    After:   8.2s  (+20%)
    
  • Negative Impact: Explained below (e.g., trade-off for correctness).

Release Note

Please describe the changes in this PR

Release Note:
- Fixed a crash in `substr` when input is null.
- Optimized `group by` performance by 20%.

Checklist (For Author)

  • I have added/updated unit tests (ctest).
  • I have verified the code with local build (Release/Debug).
  • I have run clang-format / linters.
  • (Optional) I have run Sanitizers (ASAN/TSAN) locally for complex C++ changes.
  • No need to test or manual test.

Breaking Changes

  • No

  • Yes (Description: ...)

    Click to view Breaking Changes
    Breaking Changes:
    - Description of the breaking change.
    - Possible solutions or workarounds.
    - Any other relevant information.
    

@markjin1990 force-pushed the parallel-shuffle-writer branch from 69fbd33 to 347e9cd on April 4, 2026 00:48
@markjin1990 force-pushed the parallel-shuffle-writer branch 5 times, most recently from a7197a2 to 0d9806f on April 16, 2026 20:11
fix LocalPartitionWriter::merge

refactor not modifying stop()
@markjin1990 force-pushed the parallel-shuffle-writer branch from 0d9806f to 255e90a on April 16, 2026 20:14