Add comprehensive exception handling mode tests by rasifr · Pull Request #342 · pgEdge/spock

rasifr · 2026-02-11T13:59:59Z

Test coverage for all three exception handling modes:

DISCARD: Failed operations skipped, transaction continues
TRANSDISCARD: Entire transaction rolled back on failure
SUB_DISABLE: Subscription disabled on exception

Also tests exception log transactional behavior with deferred constraint triggers to verify logs survive transaction aborts.

coderabbitai · 2026-02-11T14:05:18Z

📝 Walkthrough

Walkthrough

Expanded a single test file from 25 to 43 tests to cover DISCARD, TRANSDISCARD, and SUB_DISABLE exception-handling modes with multi-phase subscription setup, per-mode transaction scenarios, replication waits, and exception_log validations.

Changes

Cohort / File(s)	Summary
Exception Handling Test Suite Expansion `tests/tap/t/013_exception_handling.pl`	Rewrote and extended the test to cover three exception modes (DISCARD, TRANSDISCARD, SUB_DISABLE). Added subscription creation/synchronization steps, per-mode setup and validation blocks, DDL replication waits, transaction/row assertions, exception_log content checks (non-NULL expectations), skip_lsn handling, and increased tests from 25 → 43. Lines changed: +323/-99.

Poem

🐇 I bounced through tests with twitchy nose and ear,
DISCARD, TRANSDISCARD, SUB_DISABLE now appear,
Logs are kept, transactions checked with care,
Forty-three hops of verification in the air,
A rabbit's cheer for bugs that veer!

🚥 Pre-merge checks | ✅ 3

✅ Passed checks (3 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: adding comprehensive test coverage for all three exception handling modes (DISCARD, TRANSDISCARD, SUB_DISABLE), which is the primary focus of the 323 line additions to the test file.
Description check	✅ Passed	The description is directly related to the changeset, clearly explaining the test coverage being added for the three exception handling modes and exception log transactional behavior, matching the expanded test scenarios in the file.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Post copyable unit tests in a comment
Commit unit tests in branch task/SPOC-253/exception_coverage

No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments

tests/tap/t/013_exception_handling.pl (4)
130-131: Debug output via backticks bypasses test harness error handling.

These backtick invocations shell out to psql directly for debug diagnostics. If psql is not on $PATH, or authentication fails, the error is silently swallowed into $log_output. Consider using scalar_query or psql_or_bail consistently, or at minimum check $? after the backtick call.

Also, if $db_password is set via environment (PGPASSWORD), it may appear in /proc/*/environ. This is a minor concern for test environments but worth noting.

Also applies to: 196-197, 263-264

211-215: Direct UPDATE on spock.subscription bypasses the spock API.

Lines 212 and 214 manipulate sub_enabled directly via SQL UPDATE instead of using spock.sub_disable / spock.sub_enable. This may skip internal state management (e.g., worker signaling). If this is intentional to force a clean state, a brief comment explaining why the API isn't used here would help future readers.

384-385: Use is() instead of ok(... eq ...) for better failure diagnostics.

When ok($deferred_row eq 't', ...) fails, TAP only reports "not ok" without showing the actual value. Using is($deferred_row, 't', ...) will display both expected and got values, making debugging easier. Same applies to $deferred_exception_count on line 378.
Proposed fix
-ok($deferred_exception_count > 0, 'Exception logs exist - apply worker handled deferred trigger gracefully');
+cmp_ok($deferred_exception_count, '>', 0, 'Exception logs exist - apply worker handled deferred trigger gracefully');

-ok($deferred_row eq 't', 'Data applied - apply worker handled deferred trigger gracefully');
+is($deferred_row, 't', 'Data applied - apply worker handled deferred trigger gracefully');
Note: cmp_ok would need to be available from Test::More (it is by default).
50-50: Heavy reliance on sleep for synchronization makes tests slow and potentially flaky.

There are 12 sleep calls totaling ~50 seconds of fixed waits. In faster environments these waste CI time; in slower environments (loaded CI runners, VMs) they may be insufficient, leading to flaky failures. If the test framework supports polling for a condition (e.g., checking replication caught up via LSN comparison), that would be more robust. This is a pre-existing pattern so not blocking, but worth considering for reliability.

Also applies to: 84-84, 105-105, 163-163, 178-178, 213-213, 215-215, 235-235, 248-248, 297-297, 331-331, 371-371

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Fix all issues with AI agents

In `@tests/tap/t/013_exception_handling.pl`:
- Around line 303-397: The test's intent is contradictory: the header for Test
4a says exception log entries should be LOST, but assertions check for existence
(deferred_exception_count > 0 and deferred_row eq 't'); decide which behavior is
correct and make the file consistent by updating both comments and assertions:
if the expected behavior is that logs are lost and data not applied, change the
header and inline comments to state "exception logs are LOST and data not
applied" and replace assertions using deferred_exception_count and deferred_row
with exact equality checks (is(deferred_exception_count, '0', ...) and
is(deferred_row, 'f', ...)); if the expected behavior is that the apply worker
handles the deferred trigger and logs survive and data is applied, change the
header/comments to state "exception logs survive and data applied" and keep or
strengthen the assertions to explicitly assert existence
(isnt(deferred_exception_count, '0', ...) or is(gt,0) and is(deferred_row, 't',
...)); also ensure messaging strings in the pass/diag lines around test_deferred
and abort_at_commit_fn reflect the chosen expectation.
- Around line 280-296: The test continues when scalar_query(...) returns undef
for $skip_lsn which causes downstream calls to psql_or_bail("SELECT
spock.sub_alter_skiplsn('exception_test_sub', '$skip_lsn')") and
psql_or_bail("SELECT spock.sub_enable('exception_test_sub', true)") to fail;
update the test after the ok($skip_lsn, ...) check to bail out immediately when
$skip_lsn is falsey (e.g., call bail/die or skip the remaining steps) so that
you don't interpolate an undef into sub_alter_skiplsn/sub_enable and you surface
the original extraction failure from scalar_query rather than cascading errors.
Ensure you reference the scalar_query call and the $skip_lsn variable when
adding the early exit.

🧹 Nitpick comments (3)

tests/tap/t/013_exception_handling.pl (3)
130-131: Backtick psql calls bypass test helpers and may fail silently.

Lines 130, 196, and 263 invoke psql directly via backticks instead of using SpockTest helper functions. If the connection fails (e.g., missing .pgpass, wrong port), the error is silently swallowed into $log_output and only visible in diag. Consider using the existing helper or at minimum checking $? after the backtick call.

50-50: Sleep-based synchronization is fragile and adds ~53 seconds of fixed wait.

The test relies on fixed sleep calls (ranging from 2–8 seconds each, totaling ~53s) for replication sync. Under CI load or slower machines, these may be insufficient, causing flaky failures. On fast machines, they waste time unnecessarily.

Consider replacing with a poll-and-retry loop that checks the expected condition (e.g., table exists on subscriber, row count matches, subscription status) with a timeout. For example:
# Poll for a condition with timeout
my $timeout = 30;
my $start = time();
while (time() - $start < $timeout) {
    my $result = scalar_query(2, "SELECT EXISTS (SELECT 1 FROM information_schema.tables WHERE table_name = 'test_discard')");
    last if $result eq 't';
    sleep 1;
}
This would make the tests both faster and more reliable.

Also applies to: 84-84, 105-105, 235-235, 328-328, 368-368

211-215: Inconsistent subscription re-enable: direct UPDATE vs. API function.

Lines 212–214 toggle the subscription by directly updating spock.subscription, while line 294 uses the proper API spock.sub_enable(...). Direct table manipulation may skip internal bookkeeping (e.g., worker restart signals, state transitions) that the API function handles.

Unless there's a specific reason the API can't be used here, prefer using the API consistently:
Suggested fix
-psql_or_bail(2, "UPDATE spock.subscription SET sub_enabled = false WHERE sub_name = 'exception_test_sub'");
+psql_or_bail(2, "SELECT spock.sub_disable('exception_test_sub')");
 system_or_bail 'sleep', '2';
-psql_or_bail(2, "UPDATE spock.subscription SET sub_enabled = true WHERE sub_name = 'exception_test_sub'");
+psql_or_bail(2, "SELECT spock.sub_enable('exception_test_sub', true)");

tests/tap/t/013_exception_handling.pl

Test coverage for all three exception handling modes: - DISCARD: Failed operations skipped, transaction continues - TRANSDISCARD: Entire transaction rolled back on failure - SUB_DISABLE: Subscription disabled on exception Also tests exception log transactional behavior with deferred constraint triggers to verify apply worker handles errors gracefully and exception logs survive. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

ibrarahmad

LGTM

coderabbitai bot reviewed Feb 11, 2026

View reviewed changes

tests/tap/t/013_exception_handling.pl Show resolved Hide resolved

tests/tap/t/013_exception_handling.pl Show resolved Hide resolved

rasifr force-pushed the task/SPOC-253/exception_coverage branch from 4791696 to 86f3e69 Compare February 12, 2026 11:12

ibrarahmad approved these changes Feb 12, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add comprehensive exception handling mode tests#342

Add comprehensive exception handling mode tests#342
rasifr wants to merge 1 commit intomainfrom
task/SPOC-253/exception_coverage

rasifr commented Feb 11, 2026

Uh oh!

coderabbitai bot commented Feb 11, 2026 •

edited

Loading

Walkthrough

Changes

Poem

Uh oh!

coderabbitai bot left a comment

Uh oh!

Uh oh!

Uh oh!

ibrarahmad left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

rasifr commented Feb 11, 2026

Uh oh!

coderabbitai bot commented Feb 11, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Poem

Uh oh!

coderabbitai bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ibrarahmad left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

coderabbitai bot commented Feb 11, 2026 •

edited

Loading