[flink][test] Fix OOM when startTaskManager in FlinkMetricsITCase by Prajwal-banakar · Pull Request #2864 · apache/fluss

Prajwal-banakar · 2026-03-13T10:33:30Z

Purpose

Linked issue: close #2744

Fixes the error "OutOfMemoryError: Could not allocate enough memory segments for NetworkBufferPool", which occurred when TaskManagerRunner.startTaskManager was called in FlinkMetricsITCase (and its Flink-version subclasses Flink119MetricsITCase, Flink120MetricsITCase, etc.) while IT cases ran sequentially in the same JVM fork.

Brief change log

The root cause is that MiniClusterWithClientResource allocates JVM direct memory via NetworkBufferPool during before(), and this memory was not reliably released between test classes, exhausting the JVM direct memory budget for subsequent classes.
Three changes were made to FlinkMetricsITCase:

beforeAll: Wrap MINI_CLUSTER_EXTENSION.before() in a try/catch that explicitly calls MINI_CLUSTER_EXTENSION.after() on failure. JUnit 5 does not invoke @AfterAll when @BeforeAll throws, so without this, any direct memory partially allocated before the failure would never be freed.
afterAll: Wrap resource cleanup in a try/finally block so that MINI_CLUSTER_EXTENSION.after() is always called even if admin.close() or conn.close() throws.
buildTestConfig: Reduce the NetworkBufferPool size from the default 64MB to 32MB via taskmanager.memory.network.min/max. These tests do not exercise high-throughput network paths, so the smaller size is sufficient and reduces direct memory pressure when multiple IT cases run in the same JVM fork.
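
The three changes above can be sketched roughly as follows. This is not the code from the PR: FakeCluster is a hypothetical stand-in for MINI_CLUSTER_EXTENSION that only records calls, the method signatures are assumptions, and the config is modelled as a plain string map rather than a Flink Configuration object. Only the two taskmanager.memory.network keys and the 32MB value are taken from the description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class LifecycleSketch {

    /** Hypothetical stand-in for the mini cluster; records calls instead of allocating memory. */
    static class FakeCluster {
        final List<String> calls = new ArrayList<>();
        final boolean failOnStart;

        FakeCluster(boolean failOnStart) {
            this.failOnStart = failOnStart;
        }

        void before() {
            calls.add("before");
            if (failOnStart) {
                throw new IllegalStateException("simulated startup failure");
            }
        }

        void after() {
            calls.add("after");
        }
    }

    /** Setup: if before() throws, release anything partially allocated, then rethrow. */
    static void beforeAll(FakeCluster cluster) throws Exception {
        try {
            cluster.before();
        } catch (Exception e) {
            cluster.after();
            throw e;
        }
    }

    /** Teardown: close the clients first; the finally block shuts the cluster down regardless. */
    static void afterAll(FakeCluster cluster, AutoCloseable admin, AutoCloseable conn)
            throws Exception {
        try {
            if (admin != null) {
                admin.close();
            }
            if (conn != null) {
                conn.close();
            }
        } finally {
            cluster.after();
        }
    }

    /** Pins the network buffer pool to 32MB by setting min and max to the same value. */
    static Map<String, String> buildTestConfig() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("taskmanager.memory.network.min", "32mb");
        conf.put("taskmanager.memory.network.max", "32mb");
        return conf;
    }

    public static void main(String[] args) throws Exception {
        // Failed setup: after() still runs before the exception propagates.
        FakeCluster failing = new FakeCluster(true);
        try {
            beforeAll(failing);
        } catch (IllegalStateException expected) {
            System.out.println("setup failed, calls = " + failing.calls);
        }

        // Failed client close: after() still runs via the finally block.
        FakeCluster healthy = new FakeCluster(false);
        beforeAll(healthy);
        AutoCloseable brokenAdmin = () -> {
            throw new IllegalStateException("simulated close failure");
        };
        try {
            afterAll(healthy, brokenAdmin, null);
        } catch (IllegalStateException expected) {
            System.out.println("teardown failed, calls = " + healthy.calls);
        }

        System.out.println(buildTestConfig());
    }
}
```

In both failure paths the recorded calls end with "after", which is the property the fix relies on: the cluster's buffers are released whether setup or client cleanup fails. Note that in this sketch a failure in admin.close() skips conn.close(); the description only guarantees that the cluster shutdown itself always runs.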

Tests

Flink118MetricsITCase — passes
Flink119MetricsITCase — passes
Flink120MetricsITCase — passes
Flink22MetricsITCase — passes
Full fluss-flink-1.20 module (mvn verify -pl fluss-flink/fluss-flink-1.20 -am) — BUILD SUCCESS (225 IT tests, 0 failures), confirming no regressions introduced by this change

API and Format

No API or storage format changes

Documentation

No new feature introduced. No documentation changes required.

Prajwal-banakar · 2026-03-14T16:28:32Z

CC @loserwang1024

Fix OOM when startTaskManager in FlinkMetricsITCase

0c9a998

Prajwal-banakar wants to merge 1 commit into apache:main from Prajwal-banakar:Bug-fix-#2744
