@oneby-wang oneby-wang commented Dec 28, 2025

Motivation

Since nextValidLedger:-1 is a valid markDeletePosition, we should move newPosition to nextValidLedger:-1 whenever possible, to avoid inconsistency between the cursor position and the ledger in ManagedCursorImpl.

PR #25087 also made this change.

Modifications

  1. Modify the ManagedCursorImpl.asyncMarkDelete() method to move newPosition to nextValidLedger:-1 when either of the following conditions holds:
    a. lastConfirmedEntry >= newPosition, the next ledger exists, and all entries in the current ledger have been consumed.
    b. lastConfirmedEntry < newPosition, the next ledger exists, and newPosition == nextValidLedger:-1.
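The two conditions above can be sketched as a small decision function. This is only an illustration under stated assumptions: `MarkDeleteSketch`, `Pos`, `adjust`, and the `TreeMap` of ledgerId to entry count are hypothetical stand-ins, not the Pulsar classes or the code in this PR, and rejecting every other position beyond lastConfirmedEntry is an assumption of the sketch.

```java
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical stand-ins for illustration only; not the actual Pulsar classes.
final class MarkDeleteSketch {

    /** Simplified position (ledgerId, entryId), ordered by ledger and then by entry. */
    record Pos(long ledgerId, long entryId) implements Comparable<Pos> {
        @Override
        public int compareTo(Pos o) {
            int c = Long.compare(ledgerId, o.ledgerId);
            return c != 0 ? c : Long.compare(entryId, o.entryId);
        }
    }

    /**
     * ledgers maps ledgerId -> number of entries (an assumed model of the ledger list).
     * Returns the position to mark-delete, or empty if this sketch rejects the request.
     */
    static Optional<Pos> adjust(TreeMap<Long, Long> ledgers, Pos lastConfirmedEntry, Pos newPosition) {
        if (lastConfirmedEntry.compareTo(newPosition) >= 0) {
            // Condition (a): a next ledger exists and the current ledger is fully consumed.
            Long entries = ledgers.get(newPosition.ledgerId());
            Long next = ledgers.higherKey(newPosition.ledgerId());
            boolean allConsumed = entries != null && newPosition.entryId() + 1 >= entries;
            return Optional.of(next != null && allConsumed ? new Pos(next, -1) : newPosition);
        }
        // Condition (b): only nextValidLedger:-1 is accepted beyond lastConfirmedEntry.
        Long next = ledgers.higherKey(lastConfirmedEntry.ledgerId());
        if (next != null && newPosition.equals(new Pos(next, -1))) {
            return Optional.of(newPosition);
        }
        return Optional.empty(); // assumption: everything else is treated as invalid
    }

    public static void main(String[] args) {
        TreeMap<Long, Long> ledgers = new TreeMap<>();
        ledgers.put(3L, 10L); // closed ledger holding entries 0..9
        ledgers.put(5L, 0L);  // next ledger, still empty
        Pos lac = new Pos(3, 9);
        System.out.println(adjust(ledgers, lac, new Pos(3, 4)));  // stays inside ledger 3
        System.out.println(adjust(ledgers, lac, new Pos(3, 9)));  // condition (a)
        System.out.println(adjust(ledgers, lac, new Pos(5, -1))); // condition (b)
        System.out.println(adjust(ledgers, lac, new Pos(5, 0)));  // rejected in this sketch
    }
}
```

In this model a position in the middle of a ledger is returned unchanged, while the last entry of a closed ledger is promoted to the next ledger with entry ID -1.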

I think the previous code has a small problem: if newPosition == nextValidLedger:n, where n is a non-negative number, we might set newPosition to nextValidLedger:n, which is greater than lastConfirmedEntry.

Position newPosition = ackBatchPosition(position);
if (ledger.getLastConfirmedEntry().compareTo(newPosition) < 0) {
    boolean shouldCursorMoveForward = false;
    try {
        long ledgerEntries = ledger.getLedgerInfo(markDeletePosition.getLedgerId()).get().getEntries();
        Long nextValidLedger = ledger.getNextValidLedger(ledger.getLastConfirmedEntry().getLedgerId());
        shouldCursorMoveForward = nextValidLedger != null
                && (markDeletePosition.getEntryId() + 1 >= ledgerEntries)
                && (newPosition.getLedgerId() == nextValidLedger);
    } catch (Exception e) {
        log.warn("Failed to get ledger entries while setting mark-delete-position", e);
    }
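To make the concern concrete, the boolean expression quoted above can be isolated with plain longs in place of the Pulsar types. This is a sketch, not the PR's code: `OldPredicateSketch` and both method names are hypothetical, and the stricter variant is just a literal reading of condition (b).

```java
// Hypothetical illustration; plain longs replace Position and ManagedLedger lookups.
final class OldPredicateSketch {

    /** Mirrors the quoted expression: only the ledger ID of newPosition is checked. */
    static boolean oldShouldMoveForward(Long nextValidLedger, long markDeleteEntryId,
                                        long ledgerEntries, long newLedgerId) {
        return nextValidLedger != null
                && (markDeleteEntryId + 1 >= ledgerEntries)
                && (newLedgerId == nextValidLedger);
    }

    /** Literal reading of condition (b): newPosition must be exactly nextValidLedger:-1. */
    static boolean strictShouldMoveForward(Long nextValidLedger, long newLedgerId, long newEntryId) {
        return nextValidLedger != null
                && newLedgerId == nextValidLedger
                && newEntryId == -1;
    }

    public static void main(String[] args) {
        // markDeletePosition = 3:9 in a 10-entry ledger, next valid ledger = 5, newPosition = 5:4
        System.out.println(oldShouldMoveForward(5L, 9, 10, 5));  // entry ID 4 is never inspected
        System.out.println(strictShouldMoveForward(5L, 5, 4));   // entry ID must be -1
        System.out.println(strictShouldMoveForward(5L, 5, -1));
    }
}
```

Because the old expression never looks at the entry ID, a position such as 5:4 passes it even though no entry 5:4 has been confirmed yet.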

The change also snapshots positions into local variables to avoid race conditions.
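A minimal sketch of the snapshot idea, assuming the position is held in a volatile field that another thread may replace at any time. `SnapshotSketch`, `Pos`, and the method names are hypothetical; which fields the PR actually snapshots is not shown here.

```java
// Hypothetical illustration of reading a shared field once into a local variable.
final class SnapshotSketch {

    record Pos(long ledgerId, long entryId) {}

    // Shared state that another thread may replace concurrently.
    private volatile Pos markDeletePosition = new Pos(3, 9);

    /** Racy: the field is read twice, so the two halves may come from different positions. */
    String describeRacy() {
        return markDeletePosition.ledgerId() + ":" + markDeletePosition.entryId();
    }

    /** Safer: a single read is stored in a local, so both halves belong to the same position. */
    String describeSnapshot() {
        Pos snapshot = markDeletePosition;
        return snapshot.ledgerId() + ":" + snapshot.entryId();
    }

    void advance(Pos next) {
        markDeletePosition = next;
    }

    public static void main(String[] args) {
        SnapshotSketch cursor = new SnapshotSketch();
        System.out.println(cursor.describeSnapshot());
        cursor.advance(new Pos(5, -1));
        System.out.println(cursor.describeSnapshot());
    }
}
```

The snapshot does not stop the field from changing; it only guarantees that one computation works with a single consistent value.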

  2. Add the tests testAsyncMarkDeleteMoveToNextLedgerInNonRolloverScenario, testAsyncMarkDeleteMayMoveToNextLedgerInRolloverScenario, testAsyncMarkDeleteMoveToNextLedgerOneByOne, and testAsyncMarkDeleteNextLedgerMinusOneEntryIdPosition to ManagedCursorTest to verify the code change.

  3. Fix tests affected by this PR's code change.

  4. Fix some flaky tests introduced by PR [fix][broker] Fix cursor position persistence in ledger trimming #25087.

Verifying this change

  • Make sure that the change passes the CI checks.

Does this pull request potentially affect one of the following parts:

If the box was checked, please highlight the changes

  • Dependencies (add or upgrade a dependency)
  • The public API
  • The schema
  • The default values of configurations
  • The threading model
  • The binary protocol
  • The REST endpoints
  • The admin CLI options
  • The metrics
  • Anything that affects deployment

Documentation

  • doc
  • doc-required
  • doc-not-needed
  • doc-complete

Matching PR in forked repository

PR in forked repository: oneby-wang#19

…position and ledger inconsistency in ManagedCursorImpl

Copilot AI left a comment

Pull request overview

This PR fixes cursor position and ledger inconsistency in ManagedCursorImpl by moving the mark delete position to nextValidLedger:-1 when all entries in the current ledger are consumed. This addresses potential inconsistencies where the cursor position and ledger state could become misaligned.

Key Changes

  • Modified ManagedCursorImpl.asyncMarkDelete() to intelligently move cursor position to nextValidLedger:-1 when appropriate
  • Added snapshot positions to local variables to prevent race conditions
  • Added four comprehensive test cases to verify the new mark delete behavior
  • Updated existing tests to reflect the new cursor positioning behavior

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| managed-ledger/src/main/java/org/apache/bookkeeper/mledger/impl/ManagedCursorImpl.java | Core logic change to move cursor position to nextValidLedger:-1 when ledger entries are fully consumed; snapshots positions to avoid race conditions |
| managed-ledger/src/test/java/org/apache/bookkeeper/mledger/impl/ManagedCursorTest.java | Added four new test cases (testAsyncMarkDeleteMoveToNextLedgerInNonRolloverScenario, testAsyncMarkDeleteMayMoveToNextLedgerInRolloverScenario, testAsyncMarkDeleteMoveToNextLedgerOneByOne, testAsyncMarkDeleteNextLedgerMinusOneEntryIdPosition) and updated existing tests to use relaxed assertions |
| managed-ledger/src/test/java/org/apache/bookkeeper/mledger/impl/ManagedLedgerTest.java | Updated testNoRetention test to use Awaitility for proper async handling and updated assertions for cursor position validation; fixed flaky tests |
| pulsar-broker/src/test/java/org/apache/pulsar/compaction/CompactedTopicTest.java | Updated test assertions to expect cursor position at currentLedgerId:-1 after compaction |
| pulsar-broker/src/test/java/org/apache/pulsar/broker/service/persistent/PersistentTopicProtectedMethodsTest.java | Changed exact equality assertions to greater-than-or-equal assertions for mark delete position |
| pulsar-broker/src/test/java/org/apache/pulsar/broker/service/PersistentMessageFinderTest.java | Updated message expiry tests to expect mark delete position at nextLedgerId:-1 |
| pulsar-broker/src/test/java/org/apache/pulsar/broker/service/MessageTTLTest.java | Fixed spelling from "exacly" to "exactly" and updated test to expect cursor at currentLedgerId:-1 |
| pulsar-broker/src/test/java/org/apache/pulsar/broker/service/BacklogQuotaManagerTest.java | Updated assertions to reflect that mark deleting moves cursor to nextLedgerId:-1 |

@codecov-commenter

Codecov Report

❌ Patch coverage is 80.64516% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.48%. Comparing base (9c5e1c3) to head (c337237).
⚠️ Report is 3 commits behind head on master.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...che/bookkeeper/mledger/impl/ManagedCursorImpl.java | 80.64% | 4 Missing and 2 partials ⚠️ |
Additional details and impacted files

@@             Coverage Diff              @@
##             master   #25117      +/-   ##
============================================
+ Coverage     74.44%   74.48%   +0.03%     
- Complexity    34046    34050       +4     
============================================
  Files          1899     1899              
  Lines        149655   149666      +11     
  Branches      17393    17397       +4     
============================================
+ Hits         111412   111477      +65     
+ Misses        29368    29313      -55     
- Partials       8875     8876       +1     
| Flag | Coverage Δ |
| --- | --- |
| inttests | 26.38% <58.06%> (+0.24%) ⬆️ |
| systests | 23.04% <51.61%> (+0.04%) ⬆️ |
| unittests | 74.01% <80.64%> (+0.02%) ⬆️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ |
| --- | --- |
| ...che/bookkeeper/mledger/impl/ManagedCursorImpl.java | 79.37% <80.64%> (+0.40%) ⬆️ |

... and 74 files with indirect coverage changes
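The coverage figures in the report above can be cross-checked with a few lines of arithmetic. This is a sketch under assumptions: `CoverageSketch` and `truncatedPercent` are hypothetical names, coverage is taken as hits / (hits + misses + partials), and the patch size of 31 lines is inferred from 6 uncovered lines at 80.64516%, not stated in the report. Truncating to two decimals (instead of rounding) is what reproduces the displayed base value.

```java
// Hypothetical helper for re-deriving the percentages shown in the Codecov report.
final class CoverageSketch {

    /** Coverage as a percentage, truncated (not rounded) to two decimal places. */
    static double truncatedPercent(long hits, long lines) {
        return Math.floor(10000.0 * hits / lines) / 100.0;
    }

    public static void main(String[] args) {
        // Lines = Hits + Misses + Partials, for base and head respectively.
        long baseLines = 111412 + 29368 + 8875; // 149655
        long headLines = 111477 + 29313 + 8876; // 149666

        System.out.println("base  " + truncatedPercent(111412, baseLines));
        System.out.println("head  " + truncatedPercent(111477, headLines));

        // Patch: 4 missing + 2 partial = 6 uncovered lines out of an inferred 31.
        System.out.println("patch " + truncatedPercent(31 - 6, 31));
    }
}
```

The sums confirm that the Hits, Misses, and Partials rows add up to the Lines row for both base and head.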


Labels

doc-not-needed Your PR changes do not impact docs ready-to-test
