Fix IP Allocation Bug: Reserved Range Not Detected#2657
Conversation
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
Outdated
Show resolved
Hide resolved
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
Outdated
Show resolved
Hide resolved
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb
Show resolved
Hide resolved
src/bosh-director/lib/bosh/director/deployment_plan/manual_network_subnet.rb
Outdated
Show resolved
Hide resolved
|
Moving to draft for the moment. Needs further attention. |
|
@neddp I think you discovered a legitimate bug. We have noticed sporadic deployment failures of the following kind: {
"error":{
"code":"PrivateIPAddressInReservedRange",
"message":"Private static IP address 192.168.11.2 falls within reserved IP range of subnet prefix 192.168.11.0/24.",
"details":[]
}
}The subnet prefix in this example was belonging to the following bosh network: networks:
- name: compilation_network
subnets:
- az: z1
cloud_properties:
application_security_groups:
- asg-bosh
subnet_name: subnet-bosh-z1
virtual_network_name: vnet-bosh
dns:
- 168.63.129.16
gateway: 192.168.11.1
range: 192.168.11.0/24
reserved:
- 192.168.11.0 - 192.168.11.141
- 192.168.11.242 - 192.168.11.255If you look at the first reserved range ( Do you plan to continue to work on this change? |
s4heid
left a comment
There was a problem hiding this comment.
Perhaps, it would also make sense to add an integration / regression test to avoid catching something critical like this in the future again.
src/bosh-director/lib/bosh/director/deployment_plan/manual_network_subnet.rb
Outdated
Show resolved
Hide resolved
|
Hi @s4heid, We plan on continuing to work on it, but any help is appreciated in the meantime. Thanks! |
|
Hi @neddp, Just checking in on the status of this PR. It's been in draft for a while now and as @s4heid's feedback confirms this is a real bug — we're seeing sporadic A couple of questions:
A short update would be great to get a sense of the timeline. Thanks! |
Add overlaps? method to IpAddrOrCidr that uses integer range comparison instead of IPAddr#include?, which loses CIDR prefix information during coercion (coerce_other converts IpAddrOrCidr via .to_i, producing /32). Fix find_next_available_ip in IpRepo: - Use overlaps? instead of include? for blocking IP detection - Prune filtered_ips by last address (not base) to retain CIDRs that span the current candidate position - Capture blocking_ip reference before mutation for correct advance Add unit tests for overlaps?, CIDR block handling, IPv6 allocation, and an integration test for CIDR reserved ranges. Co-authored-by: Ivaylo Ivanov <19604369+Ivaylogi98@users.noreply.github.com> Co-authored-by: Saumya Dudeja <193609732+dudejas@users.noreply.github.com>
277734b to
c6bfa1f
Compare
📝 WalkthroughWalkthroughThe changes introduce a new Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb (1)
117-126:⚠️ Potential issue | 🟠 MajorKeep spanning reservations that start before the subnet.
Line 119 still filters on
ip.to_i, so a broader block like10.0.128.0/23is dropped when allocating inside an overlapping10.0.129.0/24, even though the/23still covers the whole subnet. That lets the allocator hand out addresses that are already reserved on another overlapping network. Filter on the block’s last address instead, and add a regression for this shape.Suggested fix
- filtered_ips = addresses_we_cant_allocate.sort_by { |ip| ip.to_i }.reject { |ip| ip.to_i < first_range_address.to_i } + filtered_ips = addresses_we_cant_allocate + .sort_by { |ip| ip.to_i } + .reject { |ip| ip.to_range.last.to_i < first_range_address.to_i }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb` around lines 117 - 126, The current filtering in find_next_available_ip uses ip.to_i which drops reservation blocks that start before the subnet even if their range still covers it; change the initial filtered_ips assignment to reject entries where ip.to_range.last.to_i (the block's last address) is less than first_range_address.to_i so spanning reservations (e.g., 10.0.128.0/23) are kept when allocating inside an overlapping subnet (e.g., 10.0.129.0/24). Update the subsequent loop logic that uses filtered_ips/current_prefix unchanged, and add a regression test that creates a spanning reservation (10.0.128.0/23) and verifies allocation inside the overlapping 10.0.129.0/24 does not return reserved addresses.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/bosh-director/lib/bosh/director/ip_addr_or_cidr.rb`:
- Around line 87-92: The overlaps? method currently compares integer endpoints
and can falsely report overlap for mixed IPv4/IPv6 inputs (e.g., 0.0.0.1 vs
::1); update overlaps? to first check address family equality (IPv4 vs IPv6) for
self (`@ipaddr`) and the other argument (use other.family or
other.to_range.first.address_family) and return false immediately if families
differ, otherwise proceed with the existing integer range comparison; add a unit
test that calls overlaps? with an IPv4 address and an IPv6 address (e.g.,
"0.0.0.1" and "::1") asserting false to cover this case.
---
Outside diff comments:
In `@src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rb`:
- Around line 117-126: The current filtering in find_next_available_ip uses
ip.to_i which drops reservation blocks that start before the subnet even if
their range still covers it; change the initial filtered_ips assignment to
reject entries where ip.to_range.last.to_i (the block's last address) is less
than first_range_address.to_i so spanning reservations (e.g., 10.0.128.0/23) are
kept when allocating inside an overlapping subnet (e.g., 10.0.129.0/24). Update
the subsequent loop logic that uses filtered_ips/current_prefix unchanged, and
add a regression test that creates a spanning reservation (10.0.128.0/23) and
verifies allocation inside the overlapping 10.0.129.0/24 does not return
reserved addresses.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: c007187a-dfce-41d0-8634-90ad4849b8ba
📒 Files selected for processing (5)
src/bosh-director/lib/bosh/director/deployment_plan/ip_provider/ip_repo.rbsrc/bosh-director/lib/bosh/director/ip_addr_or_cidr.rbsrc/bosh-director/spec/unit/bosh/director/deployment_plan/ip_provider/ip_repo_spec.rbsrc/bosh-director/spec/unit/bosh/director/ip_addr_or_cidr_spec.rbsrc/spec/integration/global_networking/ip_reservations/allocating_dynamic_ips_spec.rb
|
We spent more time analyzing the code logic and decided to go with a completely different approach. Instead of the bidirectional @s4heid, @seidab24, unfortunately, we were still unable to reproduce the issue in a test that would fail with the old code and pass with the new one. A new integration test was still added as per your suggestion, along with unit tests that try to cover the edge cases. Any suggestions and reviews are welcome and appreciated. |

What is this change about?
This PR potentially fixes a bug where BOSH allocates IPs from reserved CIDR ranges, causing CPI failures with "Address is in subnet's reserved address range" (AWS) errors and similar for other providers.
The Bug
We were unable to reproduce the issue. We can only speculate what could be causing it.
The Fix
1. Add
overlaps?method toIpAddrOrCidr— uses integer range comparison, bypassingIPAddr#coerce_otherentirely:2. Prune by last address — only discard a CIDR when its entire range is behind the current position:
3. Capture
blocking_ipbefore mutation — stable reference for the advance calculation:What tests have you run?
Release notes
Fixed: IP allocation now correctly detects reserved CIDR ranges. Previously, when allocating single IPs (
/32) from a subnet with reserved ranges specified as larger blocks (e.g.,/30), the algorithm failed to detect the overlap and attempted to allocate reserved IPs, causing CPI errors.Breaking change?
No. This is a bug fix.
Tag your pair, your PM, and/or team!
@dudejas
@Ivaylogi98