
SPOC-481: Add DISCARDFILE support for TRANSDISCARD exception handling#393

Open
danolivo wants to merge 1 commit into main from spoc-481

Conversation

@danolivo
Contributor

Summary

  • In TRANSDISCARD mode, instead of re-executing each DML in a
    subtransaction during replay and logging it to exception_log, records
    are now dumped directly to a per-database binary file
    (pg_spock/discard_<dboid>.log). This follows the Oracle GoldenGate
    discard-file approach: on error, skip re-execution and simply record
    what was discarded.
  • discardfile_write() serialises records as length-prefixed JSON with a
    file format version header, using an atomic counter for unique IDs and
    a dedicated LWLock for concurrent append safety. Returns true/false to
    signal success or failure (with WARNING).
  • spock.discard_read() SQL function reads the binary file back as a set
    of jsonb records, e.g.:
    SELECT rec->>'node_name', rec->>'operation', rec->'remote_tuple'
    FROM spock.discard_read() AS rec
    ORDER BY (rec->>'id')::bigint;
  • TRANSDISCARD/SUB_DISABLE paths in handle_insert() are wired to use
    the discard file instead of subtransaction replay.
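The append path above can be sketched out of band. The following is a minimal illustration of the length-prefixed framing only, not the C implementation: the file name, the JSON field set, and native-endian uint32 framing are assumptions based on this description, and the LWLock/atomic-ID machinery is omitted.

```python
import json
import os
import struct
import tempfile

DISCARD_FILE_VERSION = 1  # assumed; the real header value lives in the C code

def discardfile_append(path, record):
    """Append one length-prefixed JSON record ([uint32 len][json bytes]),
    writing the uint32 format-version header first when the file is new."""
    payload = json.dumps(record).encode("utf-8")
    with open(path, "ab") as f:
        if f.tell() == 0:                          # fresh file: write the header
            f.write(struct.pack("=I", DISCARD_FILE_VERSION))
        f.write(struct.pack("=I", len(payload)))
        f.write(payload)

# Hypothetical record shape, mirroring the fields the SQL example selects.
path = os.path.join(tempfile.mkdtemp(), "discard_16384.log")
discardfile_append(path, {"id": 1, "node_name": "n1", "operation": "INSERT"})
discardfile_append(path, {"id": 2, "node_name": "n1", "operation": "INSERT"})
```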

Design details

  • File format: [uint32 version][uint32 len][json bytes]... — version
    header enables future format upgrades, length prefix makes reading
    simple and predictable.
  • JSON field names are defined via DF_* constants shared between writer
    and reader.
  • Shared memory: pg_atomic_uint64 discard_id and LWLock *discard_file_lock
    added to SpockContext.
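A matching reader for the [uint32 version][uint32 len][json bytes]... layout, sketched under the same assumptions (native-endian uint32s, hypothetical names and cap). It also shows the guard one would want before allocating from a length prefix, since a corrupted or truncated file must not drive an oversized read:

```python
import json
import struct

DISCARD_FILE_VERSION = 1   # assumed header value
MAX_RECLEN = 1 << 30       # illustrative sanity cap on a single record

def discardfile_read(path):
    """Return the JSON records from a version-headered, length-prefixed file."""
    records = []
    with open(path, "rb") as f:
        header = f.read(4)
        if len(header) < 4:
            return records                       # empty or truncated header
        (version,) = struct.unpack("=I", header)
        if version != DISCARD_FILE_VERSION:
            raise ValueError("unsupported discard file version %d" % version)
        while True:
            raw = f.read(4)
            if len(raw) < 4:
                break                            # clean end of file
            (reclen,) = struct.unpack("=I", raw)
            if reclen == 0 or reclen > MAX_RECLEN:
                break                            # corrupt length: stop, don't over-allocate
            payload = f.read(reclen)
            if len(payload) < reclen:
                break                            # writer crashed mid-append
            records.append(json.loads(payload))
    return records
```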

What's next

  • Further commits should wire discardfile_write() into handle_update() and
    handle_delete() replay paths.
  • We have some tests in the replication_set test, but more thorough tests should be added later.
  • We may redesign the read interface before the 5.1 release.

@danolivo danolivo requested a review from mason-sharp March 17, 2026 13:36
@danolivo danolivo self-assigned this Mar 17, 2026
@coderabbitai

coderabbitai bot commented Mar 17, 2026

📝 Walkthrough

Walkthrough

Adds a discard-file subsystem: a writer API prototype, a SQL reader function, and an LWLock in SpockContext for concurrent appends (initialized in shmem); integrates discard logging into apply-worker insert handling; and adds test diagnostics that read the discard file.

Changes

  • Public header (include/spock_exception_handler.h): Added an extern bool
    discardfile_write(...); declaration to expose the discard-file write API.
  • Spock context / shmem (include/spock_worker.h, src/spock_shmem.c): Added
    LWLock *discard_file_lock to SpockContext; request and initialize a
    dedicated LWLock tranche and wire up discard_file_lock during
    shared-memory startup.
  • Discard-file implementation & reader (src/spock_exception_handler.c):
    Implemented the discard-file subsystem: directory/path helpers, a
    versioned length-prefixed JSON record format, discardfile_write()
    appending records under an exclusive lock with header handling and
    warnings on I/O errors, and the SQL-callable spock_discard_read()
    returning SETOF jsonb. Added required includes and PG_FUNCTION_INFO_V1.
  • Apply worker (src/spock_apply.c): Refactored INSERT handling when
    use_try_block is true: for TRANSDISCARD/SUB_DISABLE, call
    discardfile_write() instead of performing the subtransaction insert;
    other behaviours use conditional PG_TRY/PG_CATCH with an adjusted
    subtransaction lifecycle and logging.
  • SQL extension (sql/spock--6.0.0-devel.sql): Added CREATE FUNCTION
    spock.discard_read() RETURNS SETOF jsonb ... binding to the C symbol
    spock_discard_read.
  • Tests, diagnostics (tests/regress/sql/replication_set.sql): Added three
    diagnostic blocks that call spock.discard_read() to display discard
    records for verification.

Poem

🐇 I tuck the lost rows in a hidden trail,

I stamp each record with a quiet key,
Locked and numbered in a snug JSON pail,
A rabbit's ledger of what used to be,
Hop — the discard file keeps our memory.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 75.00%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.
✅ Passed checks (2 passed)
  • Title check ✅ Passed: The title accurately and specifically describes the main change: adding DISCARDFILE support for TRANSDISCARD exception handling, which is the primary objective of the PR.
  • Description check ✅ Passed: The description comprehensively explains the DISCARDFILE feature, its design, implementation details, and next steps, all directly related to the changeset.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🧹 Nitpick comments (1)
src/spock_apply.c (1)

1346-1350: Remove unreachable exception-behaviour branch in DISCARD-only path.

At lines 1346-1348, the TRANSDISCARD/SUB_DISABLE check is dead code because this block only executes in the else branch, where those modes were already excluded.

♻️ Proposed refactor
-			if (!failed)
-			{
-				if (exception_behaviour == TRANSDISCARD ||
-					exception_behaviour == SUB_DISABLE)
-					RollbackAndReleaseCurrentSubTransaction();
-				else
-					ReleaseCurrentSubTransaction();
-			}
+			if (!failed)
+				ReleaseCurrentSubTransaction();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1346 - 1350, The conditional checking
exception_behaviour for TRANSDISCARD or SUB_DISABLE inside the DISCARD-only
execution path is unreachable; remove the TRANSDISCARD/SUB_DISABLE branch and
simplify the code to always call ReleaseCurrentSubTransaction() in that branch
instead of conditionally calling RollbackAndReleaseCurrentSubTransaction();
update the block containing exception_behaviour,
RollbackAndReleaseCurrentSubTransaction(), and ReleaseCurrentSubTransaction() so
it unconditionally invokes ReleaseCurrentSubTransaction() where the DISCARD-only
path is taken.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sql/spock--6.0.0-devel.sql`:
- Around line 101-102: The function declaration for spock.discard_read
incorrectly uses STABLE even though its C implementation (spock_discard_read)
reads mutable on-disk state; change the function volatility from STABLE to
VOLATILE in the CREATE OR REPLACE FUNCTION spock.discard_read() statement so the
planner won't assume constant results for identical arguments and to match the
behavior of the C function (spock_discard_read exported via MODULE_PATHNAME).

In `@src/spock_apply.c`:
- Around line 1322-1325: The discardfile_write() return value is ignored and
failed is unconditionally set to false; change the call in spock_apply.c so you
capture its return (e.g., int rc = discardfile_write(...)) and only set failed =
false when rc indicates success, otherwise leave failed true and handle the
error path (log the failure and propagate/return error or jump to the existing
cleanup/error handling path) so discard audit writes are not silently treated as
successful; update the block that references discardfile_write, failed,
local_node->node->name, rel, remote_origin_id, newtup, and remote_xid
accordingly.

In `@src/spock_exception_handler.c`:
- Around line 358-372: The JSON being built with appendStringInfo in
spock_exception_handler.c injects raw values (node_name,
timestamptz_to_str(...), quote_qualified_identifier(...), and operation) and
must be JSON-escaped to avoid breaking records; update the appendStringInfo call
to pass JSON-escaped versions of these strings (create/use a helper like
json_escape_string or escape_json and call it on node_name, the timestamp
string, the result of quote_qualified_identifier(rel->nspname, rel->relname),
and operation before formatting) so the produced JSON is always valid and
parseable by spock_discard_read().


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 70a6fc93-9889-4844-9bf2-701f24c73413

📥 Commits

Reviewing files that changed from the base of the PR and between 0ae4205 and 938f3bf.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (2)
src/spock_apply.c (1)

1324-1326: ⚠️ Potential issue | 🟠 Major

Handle discardfile_write() failure before marking operation successful.

Line 1324 ignores the return value and Line 1326 forces success. If the file write fails, the discard record is silently lost.

Proposed fix
-			discardfile_write(local_node->node->name, rel, remote_origin_id,
-							  "INSERT", NULL, &newtup, remote_xid);
-			failed = false;
+			if (!discardfile_write(local_node->node->name, rel, remote_origin_id,
+								   "INSERT", NULL, &newtup, remote_xid))
+				ereport(ERROR,
+						(errmsg("failed to write discard record for INSERT on %s.%s",
+								rel->nspname, rel->relname)));
+			failed = false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1324 - 1326, The discardfile_write call's
return value is ignored and failed is set to false unconditionally; update the
code around discardfile_write (the call using local_node->node->name, rel,
remote_origin_id, "INSERT", NULL, &newtup, remote_xid) to check its return
value, only set failed = false when discardfile_write reports success, and set
failed = true (and log an error via existing logging facilities) when it fails
so the operation isn't marked successful when the discard record couldn't be
written.
src/spock_exception_handler.c (1)

421-422: ⚠️ Potential issue | 🟠 Major

Use fixed-width length + validate before allocation.

The file format is defined as uint32 length-prefixed records, but writes/reads still use int-based flow and no guard before palloc(reclen + 1). Corrupted lengths can trigger unsafe allocations.

Proposed fix
-	int			reclen;
+	uint32		reclen;
...
-	if (write(fd, &buf.len, sizeof(buf.len)) != sizeof(buf.len) ||
-		write(fd, buf.data, buf.len) != buf.len)
+	if (buf.len < 0 || (Size) buf.len > (Size) PG_UINT32_MAX)
+	{
+		CloseTransientFile(fd);
+		LWLockRelease(SpockCtx->discard_file_lock);
+		ereport(WARNING,
+				(errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+				 errmsg("discard record too large to serialize")));
+		pfree(buf.data);
+		return false;
+	}
+	reclen = (uint32) buf.len;
+	if (write(fd, &reclen, sizeof(reclen)) != sizeof(reclen) ||
+		write(fd, buf.data, reclen) != (ssize_t) reclen)
 	{
...
-	while (read(fd, &reclen, sizeof(reclen)) == sizeof(reclen))
+	while (read(fd, &reclen, sizeof(reclen)) == sizeof(reclen))
 	{
...
-		rec = palloc(reclen + 1);
+		if (reclen == 0 || reclen > (uint32) (MaxAllocSize - 1))
+		{
+			ereport(WARNING,
+					(errcode(ERRCODE_DATA_CORRUPTED),
+					 errmsg("invalid record length %u in discard file \"%s\"",
+							reclen, path)));
+			break;
+		}
+		rec = palloc((Size) reclen + 1);

Also applies to: 459-460, 494-501

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_exception_handler.c` around lines 421 - 422, The code currently
uses an int-sized length and unsafe allocation (buf.len and palloc(reclen + 1)),
risking huge/negative allocations; change to a fixed-width uint32_t for the
on-disk length (use uint32_t reclen or buf.len), write/read using
sizeof(uint32_t) and validate reclen after converting/endian-handling, then
guard before allocation (ensure reclen > 0 and reclen <= a safe cap such as
MaxAllocSize-1 or a protocol max) before calling palloc(reclen + 1); also check
that write/read return values compare against sizeof(uint32_t) and reclen (use
ssize_t) and apply the same fixes to the other occurrences noted (the related
write/read/allocation sites around the other locations).
🧹 Nitpick comments (1)
src/spock_apply.c (1)

1348-1352: Remove unreachable exception-behaviour branch in DISCARD path.

This branch is inside the else case that already excludes TRANSDISCARD/SUB_DISABLE, so the conditional is dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1348 - 1352, The conditional checking
exception_behaviour against TRANSDISCARD/SUB_DISABLE inside the else branch is
dead code; remove the if/else and always call ReleaseCurrentSubTransaction() in
that branch (i.e., replace the conditional block with a single call to
ReleaseCurrentSubTransaction()), leaving
RollbackAndReleaseCurrentSubTransaction() only where it is legitimately
reachable.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_exception_handler.c`:
- Around line 320-322: Update the function comment to reflect the actual on-disk
format and behavior: change the phrase "newline-delimited JSON" to indicate the
function writes binary length-prefixed records (i.e., a binary length header
followed by the JSON payload) and keep the note that it is safe to call outside
a transaction and does not access catalog tables; mention the exact
length-prefix semantics used by the implementation (e.g., 32-bit/64-bit
little-endian) and that each call writes one length-prefixed record so readers
must parse the length header to read the JSON payload.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: df2fdda8-30f3-4ed5-8965-acc65c52f586

📥 Commits

Reviewing files that changed from the base of the PR and between 938f3bf and 28ab443.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (2)
  • sql/spock--6.0.0-devel.sql
  • src/spock_shmem.c


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

♻️ Duplicate comments (1)
src/spock_apply.c (1)

1311-1326: ⚠️ Potential issue | 🟠 Major

Capture the discard-file result here instead of forcing success.

This branch unconditionally clears failed after calling discardfile_write(). Downstream, the common logging block still calls log_insert_exception(), so a successful append is duplicated into spock.exception_log and a failed append is treated as handled. Track the write outcome, raise on false, and use that flag to skip the later per-row exception_log path when the discard file already has the row.

🛠️ Possible fix
+			if (!(wrote_discard_record =
+					discardfile_write(local_node->node->name, rel, remote_origin_id,
+									  "INSERT", NULL, &newtup, remote_xid)))
+				ereport(ERROR,
+						(errmsg("failed to write discard record for INSERT on %s.%s",
+								rel->nspname, rel->relname)));
-			discardfile_write(local_node->node->name, rel, remote_origin_id,
-							  "INSERT", NULL, &newtup, remote_xid);
 			failed = false;

Add bool wrote_discard_record = false; near the local declarations at the top of handle_insert(), then gate the later log_insert_exception() block on !wrote_discard_record.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1311 - 1326, In handle_insert(), don’t
unconditionally clear failed after calling discardfile_write(); capture its
boolean return (e.g., bool wrote_discard_record = discardfile_write(...)) and
use that value to: 1) treat a false return as an error (propagate/raise/fail
instead of silently setting failed = false) and 2) skip the later call to
log_insert_exception() when wrote_discard_record is true so you don’t duplicate
successful discard-file writes into spock.exception_log; update the control flow
around discardfile_write(), failed, and the log_insert_exception() call
accordingly.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_exception_handler.c`:
- Around line 468-524: spock_discard_read currently reads records without
coordinating with discardfile_write which writes length and payload in separate
syscalls; fix by acquiring the same lock used by discardfile_write in shared
mode (e.g. call the discardfile write-lock API or the module's RWLock in
shared/read mode) before opening/reading the file, or alternatively acquire that
lock, call lseek(fd, 0, SEEK_END) to snapshot EOF while still holding the lock
and then lseek back to start and only read up to that snapshot EOF; ensure you
hold the lock during the loop that reads reclen/rec (the code around fd,
read(fd,&reclen,...), palloc/pfree and tuplestore_putvalues) and release the
lock after finishing to prevent seeing a length header without its payload.
- Around line 443-450: The SQL example in the comment for the SQL-callable
function spock.discard_read() uses an unaliased identifier "rec" which will
fail; update the example to alias the function result (e.g. use FROM
spock.discard_read() AS rec) or reference the actual output column name instead
so the SELECT references a valid identifier; edit the comment block above the
spock.discard_read() declaration in spock_exception_handler.c to show the
corrected example line (SELECT rec->>'node_name', rec->>'operation',
rec->'remote_tuple' FROM spock.discard_read() AS rec;).

In `@tests/regress/sql/replication_set.sql`:
- Around line 217-222: The assertions currently read the entire
spock.discard_read() output and can pick up prior TRANSDISCARD rows; before each
scenario capture a baseline (SELECT max((rec->>'id')::bigint) FROM
spock.discard_read()) and then restrict subsequent checks to rows with
(rec->>'id')::bigint > baseline, or alternatively add explicit filters on
rec->>'relname' and rec->>'operation' in the SELECT/WHERE to only validate the
rows created in that scenario (also add the missing UPDATE check by filtering
for operation = 'UPDATE' when verifying update discard output); use the
spock.discard_read() call and the rec->>'id', rec->>'relname', rec->>'operation'
fields to implement these scoped assertions.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: a1fe25dd-f6e0-44bf-854e-6220460e6988

📥 Commits

Reviewing files that changed from the base of the PR and between 28ab443 and 191b8f1.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (3)
  • sql/spock--6.0.0-devel.sql
  • src/spock_shmem.c
  • include/spock_worker.h


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/spock_apply.c (1)

1324-1326: ⚠️ Potential issue | 🟡 Minor

Handle discardfile_write() failure.

The return value is ignored despite the function returning a boolean success indicator. The TODO comment at lines 1322-1323 acknowledges this needs to be addressed. At minimum, consider propagating the failure to failed so the exception logging path captures the issue.

🛠️ Suggested approach
-			discardfile_write(local_node->node->name, rel, remote_origin_id,
-							  "INSERT", NULL, &newtup, remote_xid);
-			failed = false;
+			if (!discardfile_write(local_node->node->name, rel, remote_origin_id,
+								   "INSERT", NULL, &newtup, remote_xid))
+			{
+				ereport(WARNING,
+						(errmsg("failed to write discard record for INSERT on %s.%s",
+								rel->nspname, rel->relname)));
+				/* Continue anyway - the WARNING from discardfile_write provides details */
+			}
+			failed = false;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1324 - 1326, The call to
discardfile_write(...) currently ignores its boolean return value; capture that
return (e.g., bool ok = discardfile_write(local_node->node->name, rel,
remote_origin_id, "INSERT", NULL, &newtup, remote_xid)) and if it returns false
set failed = true so the existing exception/logging path picks up the error (and
consider propagating/returning the failure out of the surrounding function if
appropriate); update the code around the discardfile_write call and the failed
variable handling to reflect this change in spock_apply.c.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/spock_apply.c`:
- Around line 1346-1353: The if-check inside the non-exception branch is dead
because this code path only runs when exception_behaviour is already known not
to be TRANSDISCARD or SUB_DISABLE; remove the unreachable conditional and the
RollbackAndReleaseCurrentSubTransaction() path and always call
ReleaseCurrentSubTransaction() in the block. Locate the block using the symbols
exception_behaviour, TRANSDISCARD, SUB_DISABLE,
RollbackAndReleaseCurrentSubTransaction, and ReleaseCurrentSubTransaction and
simplify it by deleting the inner if/else and leaving a single call to
ReleaseCurrentSubTransaction().


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 15b2baf5-2cff-4638-beb6-7d5e343cafa2

📥 Commits

Reviewing files that changed from the base of the PR and between 191b8f1 and 06ee6a3.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (4)
  • src/spock_shmem.c
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • tests/regress/sql/replication_set.sql


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/spock_apply.c (1)

1355-1360: ⚠️ Potential issue | 🟠 Major

Skip per-row exception_log writes for the DISCARDFILE insert path.

After the TRANSDISCARD/SUB_DISABLE branch above, this still calls log_insert_exception(). That keeps the old per-row exception_log behavior, so discarded INSERTs land in both stores instead of only discard_<dboid>.log.

💡 One way to gate the old logging path
-		{
+		if (!(exception_behaviour == TRANSDISCARD ||
+			  exception_behaviour == SUB_DISABLE))
+		{
 			char	   *error_msg = edata ? edata->message :
 				(exception_log_ptr[my_exception_log_index].initial_error_message[0] ?
 				 exception_log_ptr[my_exception_log_index].initial_error_message : NULL);

 			log_insert_exception(failed, error_msg, rel, NULL, &newtup, "INSERT");
 		}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1355 - 1360, The code unconditionally calls
log_insert_exception(failed, error_msg, rel, NULL, &newtup, "INSERT") even for
the DISCARDFILE insert path, causing discarded INSERTs to be written to both the
discard_<dboid>.log and the per-row exception_log; change the call site in
src/spock_apply.c (after the TRANSDISCARD/SUB_DISABLE handling) to skip calling
log_insert_exception when the current insert is the DISCARDFILE/SUB_DISABLE path
(i.e., detect the same condition/flag used for TRANSDISCARD/SUB_DISABLE) so that
for that branch you do not invoke log_insert_exception with
failed/error_msg/rel/newtup/"INSERT".
♻️ Duplicate comments (1)
src/spock_apply.c (1)

1324-1326: ⚠️ Potential issue | 🟠 Major

Don't treat a failed discard-file append as success.

The return value is still ignored and failed is cleared unconditionally, so an open/write failure drops the row from replay and from the discard audit trail.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_apply.c` around lines 1324 - 1326, The call to
discardfile_write(...) is currently ignored and failed is unconditionally set to
false; change the logic in the block around discardfile_write (the call that
passes local_node->node->name, rel, remote_origin_id, "INSERT", NULL, &newtup,
remote_xid) to check discardfile_write's return value and only clear the local
variable failed when that call succeeds; if discardfile_write fails, preserve
failed=true (and add an appropriate processLogger/elog/error report) so the row
is not dropped from replay or the discard audit trail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@sql/spock--6.0.0-devel.sql`:
- Around line 101-102: The upgrade script is missing the spock.discard_read()
function declaration, so upgraded databases won't get the function defined in
the base install; add a CREATE OR REPLACE FUNCTION spock.discard_read() ...
RETURNS SETOF jsonb VOLATILE LANGUAGE c AS 'MODULE_PATHNAME',
'spock_discard_read'; statement to the upgrade migration (the 5.0.6→6.0.0-devel
script) so it matches the base install script and ensures the spock_discard_read
symbol is installed during upgrades.

In `@src/spock_exception_handler.c`:
- Around line 424-425: The two separate write() calls that write buf.len then
buf.data can leave a truncated length header on partial failure; replace them
with a single atomic scatter/gather write using writev (prepare an iovec for
&buf.len and buf.data) and loop on EINTR, then verify the return equals
sizeof(buf.len) + buf.len; if it doesn’t, treat it as a full write failure and
handle cleanup/error reporting to avoid leaving a dangling header (this change
touches the location where write(fd, &buf.len, ...) and write(fd, buf.data, ...)
are called in spock_exception_handler.c).

In `@src/spock_shmem.c`:
- Around line 172-175: discard_id is zeroed in spock_shmem.c, which causes
ID reuse after restarts; instead of
pg_atomic_init_u64(&SpockCtx->discard_id, 0), reseed it at startup by scanning
the existing discard_<dboid>.log (or reading the last appended record) to
determine the highest-used id and initialize the atomic to that value + 1;
perform this scan/probe while holding SpockCtx->discard_file_lock (the lock
obtained from GetNamedLWLockTranche("spock discard file lock")) and set the
atomic using the appropriate pg_atomic write/fetch API so IDs remain unique
across postmaster restarts.

---

Outside diff comments:
In `@src/spock_apply.c`:
- Around line 1355-1360: The code unconditionally calls
log_insert_exception(failed, error_msg, rel, NULL, &newtup, "INSERT") even for
the DISCARDFILE insert path, causing discarded INSERTs to be written to both the
discard_<dboid>.log and the per-row exception_log; change the call site in
src/spock_apply.c (after the TRANSDISCARD/SUB_DISABLE handling) to skip calling
log_insert_exception when the current insert is the DISCARDFILE/SUB_DISABLE path
(i.e., detect the same condition/flag used for TRANSDISCARD/SUB_DISABLE) so that
for that branch you do not invoke log_insert_exception with
failed/error_msg/rel/newtup/"INSERT".


ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb8afade-e97a-4074-a253-1b65ffab3e0a

📥 Commits

Reviewing files that changed from the base of the PR and between 06ee6a3 and 0843312.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/regress/sql/replication_set.sql

Preparation commit.
In TRANSDISCARD mode, instead of executing each DML in a subtransaction
during replay (and logging to exception_log), dump records directly to a
per-database binary file under pg_spock/. This follows the Oracle
GoldenGate approach: on error, skip re-execution and just record what
was discarded.

- Add discardfile_write() to serialize records as length-prefixed JSON
  to pg_spock/discard_<dboid>.log, using an atomic counter for IDs and
  a dedicated LWLock for concurrent append safety.
- Add spock.discard_read() SQL function to read the file back as a table.
- Wire TRANSDISCARD/SUB_DISABLE paths in handle_insert() to use the
  discardfile instead of subtransaction replay.

Further commits should add this behaviour into other handle_* routines.

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
src/spock_exception_handler.c (2)

88-97: Clarify the distinction between DF_XID and DF_REMOTE_XID.

Both DF_XID (line 88) and DF_REMOTE_XID (line 97) are defined, and both are written in discardfile_write() at lines 359 and 383 with the same remote_xid value. This appears to be redundant—consider removing one or documenting why both are needed.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_exception_handler.c` around lines 88 - 97, DF_XID and DF_REMOTE_XID
are both defined and both currently written using the same "remote_xid" value in
discardfile_write(), causing redundancy; update the code so DF_XID represents
the local transaction id and DF_REMOTE_XID represents the remote transaction id
(or remove one if truly unnecessary). Locate the DF_XID and DF_REMOTE_XID macros
and the discardfile_write() calls (references: DF_XID, DF_REMOTE_XID,
discardfile_write) and either: 1) change the value passed for DF_XID to the
correct local XID variable and add a comment documenting the distinction, or 2)
if only one XID is needed, remove the unused macro and its corresponding field
write to eliminate duplication and update any callers and comments accordingly.

508-514: Consider adding a bounds check on reclen before allocation.

Per previous discussion, this is deferred to a TODO, but for reference: reclen is a signed int read directly from the file. A corrupted or malformed file could provide a negative or excessively large value, leading to undefined behavior or huge allocation in palloc(reclen + 1). A simple bounds check (e.g., reclen > 0 && reclen < MaxAllocSize) before the allocation would harden this path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/spock_exception_handler.c` around lines 508 - 514, The loop that reads
reclen and then calls palloc(reclen + 1) must validate reclen first to avoid
negative or huge allocations: in the function containing the while(read(...))
loop (the block that allocates rec from reclen), add a guard that checks reclen
> 0 && reclen < MaxAllocSize (or an appropriate upper bound constant) and handle
failures by logging/erroring and breaking out or returning an error instead of
calling palloc; reference the variables and calls reclen and palloc(reclen + 1)
in spock_exception_handler.c and ensure the error path frees/cleans up
consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: fdc6d8f4-a049-49ba-b007-0dae9c93fd8c

📥 Commits

Reviewing files that changed from the base of the PR and between 06ee6a3 and 4360ca6.

⛔ Files ignored due to path filters (1)
  • tests/regress/expected/replication_set.out is excluded by !**/*.out
📒 Files selected for processing (7)
  • include/spock_exception_handler.h
  • include/spock_worker.h
  • sql/spock--6.0.0-devel.sql
  • src/spock_apply.c
  • src/spock_exception_handler.c
  • src/spock_shmem.c
  • tests/regress/sql/replication_set.sql
🚧 Files skipped from review as they are similar to previous changes (3)
  • include/spock_worker.h
  • src/spock_shmem.c
  • include/spock_exception_handler.h

@danolivo danolivo added the enhancement New feature or request label Mar 18, 2026