Skip to content

Fix IP Allocation Bug: Reserved Range Not Detected#2657

Open
neddp wants to merge 1 commit intomainfrom
fix_ip_allocation_from_reserved_range_in_dynamic_networks
Open

Fix IP Allocation Bug: Reserved Range Not Detected#2657
neddp wants to merge 1 commit intomainfrom
fix_ip_allocation_from_reserved_range_in_dynamic_networks

Conversation

@neddp
Copy link
Copy Markdown
Member

@neddp neddp commented Jan 29, 2026

What is this change about?

This PR potentially fixes a bug where BOSH allocates IPs from reserved CIDR ranges, causing CPI failures with "Address is in subnet's reserved address range" (AWS) errors and similar for other providers.

The Bug

We were unable to reproduce the issue. We can only speculate what could be causing it.

The Fix

1. Add overlaps? method to IpAddrOrCidr — uses integer range comparison, bypassing IPAddr#coerce_other entirely:

def overlaps?(other)
  my_first = @ipaddr.to_range.first.to_i
  my_last = @ipaddr.to_range.last.to_i
  other_first = other.to_range.first.to_i
  other_last = other.to_range.last.to_i
  my_first <= other_last && other_first <= my_last
end

2. Prune by last address — only discard a CIDR when its entire range is behind the current position:

filtered_ips.reject! { |ip| ip.to_range.last.to_i < current_prefix.to_i }

3. Capture blocking_ip before mutation — stable reference for the advance calculation:

blocking_ip = filtered_ips.first
if blocking_ip&.overlaps?(current_prefix)
  if blocking_ip.count > current_prefix.count
    current_ip = to_ipaddr(blocking_ip.to_i + blocking_ip.count)
  else
    current_ip = to_ipaddr(current_ip.to_i + current_prefix.count)
  end

What tests have you run?

  • All existing unit tests pass
  • Added edge case tests for overlapping CIDR blocks with different prefix sizes

Release notes

Fixed: IP allocation now correctly detects reserved CIDR ranges. Previously, when allocating single IPs (/32) from a subnet with reserved ranges specified as larger blocks (e.g., /30), the algorithm failed to detect the overlap and attempted to allocate reserved IPs, causing CPI errors.

Breaking change?

No. This is a bug fix.

Tag your pair, your PM, and/or team!

@dudejas
@Ivaylogi98

@neddp neddp marked this pull request as ready for review January 29, 2026 11:34
@neddp neddp requested a review from fmoehler January 29, 2026 11:49
@aramprice aramprice requested review from a team and s4heid and removed request for a team January 29, 2026 16:03
@aramprice aramprice moved this from Inbox to Pending Review | Discussion in Foundational Infrastructure Working Group Jan 29, 2026
@neddp neddp changed the title Fix IP Allocation Bug: Non-Deterministic CIDR Deduplication Fix IP Allocation Bug: Reserved Range Not Detected Feb 6, 2026
@fmoehler
Copy link
Copy Markdown
Contributor

fmoehler commented Feb 11, 2026

Thanks for looking into this. Are you sure this is a real issue? I understand the issue from the description, but there were tests that already did cover the scenario in question e.g. in ip_repo_spec.rb:299 and they worked. I also checked out this branch, but used the logic from the current "main" branch in ip_repo.rb and all new tests were still green:

image

The scenario seems to be so basic that I think we should have stumbled across this before.

@neddp neddp marked this pull request as draft February 16, 2026 11:31
@neddp
Copy link
Copy Markdown
Member Author

neddp commented Feb 17, 2026

Moving to draft for the moment. Needs further attention.

@s4heid
Copy link
Copy Markdown
Contributor

s4heid commented Feb 27, 2026

@neddp I think you discovered a legitimate bug.

We have noticed sporadic deployment failures of the following kind:

{
  "error":{
    "code":"PrivateIPAddressInReservedRange",
    "message":"Private static IP address 192.168.11.2 falls within reserved IP range of subnet prefix 192.168.11.0/24.",
    "details":[]
  }
}

The subnet prefix in this example was belonging to the following bosh network:

networks:
- name: compilation_network
  subnets:
  - az: z1
    cloud_properties:
      application_security_groups:
      - asg-bosh
      subnet_name: subnet-bosh-z1
      virtual_network_name: vnet-bosh
    dns:
    - 168.63.129.16
    gateway: 192.168.11.1
    range: 192.168.11.0/24
    reserved:
    - 192.168.11.0 - 192.168.11.141
    - 192.168.11.242 - 192.168.11.255

If you look at the first reserved range (192.168.11.0 - 192.168.11.141), you notice that something is off, because bosh attempted to take IP 192.168.11.2, which is part of the reserved range.

Do you plan to continue to work on this change?

Copy link
Copy Markdown
Contributor

@s4heid s4heid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, it would also make sense to add an integration / regression test to avoid catching something critical like this in the future again.

@neddp
Copy link
Copy Markdown
Member Author

neddp commented Mar 4, 2026

Hi @s4heid,

We plan on continuing to work on it, but any help is appreciated in the meantime.

Thanks!

@neddp neddp mentioned this pull request Mar 17, 2026
@seidab24
Copy link
Copy Markdown

Hi @neddp,

Just checking in on the status of this PR. It's been in draft for a while now and as @s4heid's feedback confirms this is a real bug — we're seeing sporadic PrivateIPAddressInReservedRange failures in our deployments.

A couple of questions:

  1. What's currently blocking progress and is there anything we can help with?
  2. Are there plans to add the integration test @s4heid suggested?

A short update would be great to get a sense of the timeline.

Thanks!

Add overlaps? method to IpAddrOrCidr that uses integer range comparison
instead of IPAddr#include?, which loses CIDR prefix information during
coercion (coerce_other converts IpAddrOrCidr via .to_i, producing /32).

Fix find_next_available_ip in IpRepo:
- Use overlaps? instead of include? for blocking IP detection
- Prune filtered_ips by last address (not base) to retain CIDRs that
  span the current candidate position
- Capture blocking_ip reference before mutation for correct advance

Add unit tests for overlaps?, CIDR block handling, IPv6 allocation,
and an integration test for CIDR reserved ranges.

Co-authored-by: Ivaylo Ivanov <19604369+Ivaylogi98@users.noreply.github.com>
Co-authored-by: Saumya Dudeja <193609732+dudejas@users.noreply.github.com>
@neddp neddp force-pushed the fix_ip_allocation_from_reserved_range_in_dynamic_networks branch from 277734b to c6bfa1f Compare March 31, 2026 14:06
@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Mar 31, 2026

📝 Walkthrough

Walkthrough

The changes introduce a new overlaps? method to detect intersecting IP ranges and refactor dynamic IP allocation logic to use overlap-based checking instead of containment-based logic. This improves handling of reserved CIDR blocks and overlapping IP reservations during allocation.

Changes

Cohort / File(s) Summary
IP Overlap Detection
src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb
Added public overlaps?(other) method that checks if two IP address ranges (computed from to_range) intersect using inclusive range comparison logic.
Dynamic IP Allocation Refactoring
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
Refactored find_next_available_ip to pre-filter IPs by range end and replace containment-based logic with overlap detection using the new overlaps? method. CIDR advancement now uses blocking_ip.to_i + blocking_ip.count instead of previous containment-based calculation. Fixed indentation in try_to_allocate_dynamic_ip.
Unit Tests for Overlap Detection
src/bosh-director/spec/unit/bosh/director/ip_addr_or_cidr_spec.rb
Added comprehensive test coverage for overlaps? method with IPv4 and IPv6 scenarios including equality, containment, disjoint ranges, partial overlap, adjacency, nesting, and regression cases.
Unit Tests for IP Allocation
src/bosh-director/spec/unit/bosh/director/deployment_plan/ip_provider/ip_repo_spec.rb
Added test suite for dynamic IP allocation with CIDR blocks and overlapping ranges, including deduplication of overlapping/nested reservations, adjacent non-overlapping CIDRs, and IPv6 allocation scenarios.
Integration Test
src/spec/integration/global_networking/ip_reservations/allocating_dynamic_ips_spec.rb
Added integration test verifying that dynamic IP allocation correctly excludes addresses within reserved CIDR ranges (e.g., allocating from available IPs after a /28 reserved block).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A clever overlap, at last we can see,
When ranges collide in CIDR harmony,
No more containment, just overlaps true,
IP allocation now knows what to do! ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically describes the main bug fix: detecting reserved IP ranges during allocation.
Description check ✅ Passed The PR description covers all required sections: change rationale, technical details, test coverage, release notes, breaking changes, and team tags.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch fix_ip_allocation_from_reserved_range_in_dynamic_networks

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@neddp neddp marked this pull request as ready for review March 31, 2026 14:07
Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb (1)

117-126: ⚠️ Potential issue | 🟠 Major

Keep spanning reservations that start before the subnet.

Line 119 still filters on ip.to_i, so a broader block like 10.0.128.0/23 is dropped when allocating inside an overlapping 10.0.129.0/24, even though the /23 still covers the whole subnet. That lets the allocator hand out addresses that are already reserved on another overlapping network. Filter on the block’s last address instead, and add a regression for this shape.

Suggested fix
-      filtered_ips = addresses_we_cant_allocate.sort_by { |ip| ip.to_i }.reject { |ip| ip.to_i < first_range_address.to_i }
+      filtered_ips = addresses_we_cant_allocate
+        .sort_by { |ip| ip.to_i }
+        .reject { |ip| ip.to_range.last.to_i < first_range_address.to_i }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb`
around lines 117 - 126, The current filtering in find_next_available_ip uses
ip.to_i which drops reservation blocks that start before the subnet even if
their range still covers it; change the initial filtered_ips assignment to
reject entries where ip.to_range.last.to_i (the block's last address) is less
than first_range_address.to_i so spanning reservations (e.g., 10.0.128.0/23) are
kept when allocating inside an overlapping subnet (e.g., 10.0.129.0/24). Update
the subsequent loop logic that uses filtered_ips/current_prefix unchanged, and
add a regression test that creates a spanning reservation (10.0.128.0/23) and
verifies allocation inside the overlapping 10.0.129.0/24 does not return
reserved addresses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb`:
- Around line 87-92: The overlaps? method currently compares integer endpoints
and can falsely report overlap for mixed IPv4/IPv6 inputs (e.g., 0.0.0.1 vs
::1); update overlaps? to first check address family equality (IPv4 vs IPv6) for
self (`@ipaddr`) and the other argument (use other.family or
other.to_range.first.address_family) and return false immediately if families
differ, otherwise proceed with the existing integer range comparison; add a unit
test that calls overlaps? with an IPv4 address and an IPv6 address (e.g.,
"0.0.0.1" and "::1") asserting false to cover this case.

---

Outside diff comments:
In `@src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb`:
- Around line 117-126: The current filtering in find_next_available_ip uses
ip.to_i which drops reservation blocks that start before the subnet even if
their range still covers it; change the initial filtered_ips assignment to
reject entries where ip.to_range.last.to_i (the block's last address) is less
than first_range_address.to_i so spanning reservations (e.g., 10.0.128.0/23) are
kept when allocating inside an overlapping subnet (e.g., 10.0.129.0/24). Update
the subsequent loop logic that uses filtered_ips/current_prefix unchanged, and
add a regression test that creates a spanning reservation (10.0.128.0/23) and
verifies allocation inside the overlapping 10.0.129.0/24 does not return
reserved addresses.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: c007187a-dfce-41d0-8634-90ad4849b8ba

📥 Commits

Reviewing files that changed from the base of the PR and between 3f4620e and c6bfa1f.

📒 Files selected for processing (5)
  • src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
  • src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb
  • src/bosh-director/spec/unit/bosh/director/deployment_plan/ip_provider/ip_repo_spec.rb
  • src/bosh-director/spec/unit/bosh/director/ip_addr_or_cidr_spec.rb
  • src/spec/integration/global_networking/ip_reservations/allocating_dynamic_ips_spec.rb

@github-project-automation github-project-automation bot moved this from Pending Review | Discussion to Waiting for Changes | Open for Contribution in Foundational Infrastructure Working Group Mar 31, 2026
@neddp
Copy link
Copy Markdown
Member Author

neddp commented Mar 31, 2026

We spent more time analyzing the code logic and decided to go with a completely different approach. Instead of the bidirectional include? method to IpAddrOrCidr that compares integer ranges directly - bypassing the coercion entirely. We went with a more simple logic that checks if the two subnets overlap, by doing an integer range comparison, with the new overlaps? method. We believe this is a more robust and clean approach that should theoretically address the issue.

@s4heid, @seidab24, unfortunately, we were still unable to reproduce the issue in a test that would fail with the old code and pass with the new one. A new integration test was still added as per your suggestion, along with unit tests that try to cover the edge cases.

Any suggestions and reviews are welcome and appreciated.

@neddp neddp requested review from fmoehler and s4heid March 31, 2026 15:12
@github-project-automation github-project-automation bot moved this from Waiting for Changes | Open for Contribution to Pending Merge | Prioritized in Foundational Infrastructure Working Group Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Pending Merge | Prioritized

Development

Successfully merging this pull request may close these issues.

6 participants