switch envs, resources are shared on multiple envs and are exclusive #1303
Conversation
f67b833 to f6422f2
/run-security-scan
alexcos20
left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This Pull Request introduces a significant architectural change to the Compute-to-Data (C2D) environment configuration. It refactors the system to support multiple, distinct compute environments per Docker engine, providing more granular control over resource allocation, fees, and access policies. The DOCKER_COMPUTE_ENVIRONMENTS configuration now expects an array of environments within each Docker cluster definition. Resource management has been enhanced with per-environment resource tracking (CPU, RAM, Disk), global resource availability checks to prevent over-allocation, and explicit handling for shared-exclusive resources like GPUs. Type definitions (C2D.ts) and configuration schemas (schemas.ts) have been updated to reflect this new hierarchical structure. All example and test configurations are adjusted accordingly. This is a major improvement for C2D flexibility and scalability.
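The nested structure this summary describes might look roughly like the sketch below. Field names (`socketPath`, `storageExpiry`, resource `id`/`total`) are illustrative guesses based on the review comments, not a verbatim copy of the actual DOCKER_COMPUTE_ENVIRONMENTS schema.

```typescript
// Hedged illustration only: one Docker cluster now carries an array of
// environments, each with its own resources, fees/free tier, and durations.
const dockerComputeEnvironments = [
  {
    socketPath: '/var/run/docker.sock',
    environments: [
      {
        id: 'small',
        storageExpiry: 604800,
        maxJobDuration: 3600,
        fees: {}, // per-environment fee structure (elided)
        resources: [
          { id: 'cpu', total: 2 },
          { id: 'ram', total: 4_000_000_000 },
          { id: 'disk', total: 10_000_000_000 }
        ]
      },
      {
        id: 'gpu-large',
        free: { maxJobDuration: 600, maxJobs: 2 },
        resources: [
          { id: 'cpu', total: 8 },
          { id: 'gpu', total: 1 }
        ]
      }
    ]
  }
]

console.log(dockerComputeEnvironments[0].environments.length) // 2
```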
Comments:
• [ERROR][bug] The ComputeEnvironmentFreeOptions interface in C2D.ts (which C2DEnvironmentConfig.free uses) does not define minJobDuration or maxJobDuration. However, the ci.yml, config.json, and various integration tests define minJobDuration and maxJobDuration within the free configuration. Furthermore, compute_engine_docker.ts in the start() method explicitly copies these properties if found in envDef.free. This creates a type and schema inconsistency. Please add minJobDuration?: number and maxJobDuration?: number to ComputeEnvironmentFreeOptions in src/@types/C2D/C2D.ts and ComputeEnvironmentFreeOptionsSchema in src/utils/config/schemas.ts.
• [ERROR][bug] The C2DDockerConfigSchema no longer enforces the presence of a 'disk' resource at the top level, as resources are now per-environment. However, C2DEnvironmentConfigSchema also does not enforce it, and C2DEngineDocker.start() does not automatically add a 'disk' resource to env.resources if not explicitly provided (unlike CPU and RAM). A 'disk' resource is fundamental for almost all compute jobs. This could lead to environments being configured without essential storage. Please consider adding a .refine rule to C2DEnvironmentConfigSchema to ensure a 'disk' resource is always present, or adding default 'disk' resources in C2DEngineDocker.start().
• [WARNING][other] This PR introduces a breaking change to the DOCKER_COMPUTE_ENVIRONMENTS configuration structure. The documentation (e.g., README.md) and release notes should clearly communicate this change, provide migration steps for existing deployments, and explain the benefits of the new multi-environment setup.
• [INFO][performance] The envResourceMap creation for (env.resources || []).map((r) => [r.id, r]) within the _getComputeEnvironmentInfo method is done inside the loop for each env. This map is then used inside the for (const job of jobs) loop. While likely negligible for typical scenarios, creating this map repeatedly might be slightly inefficient if there are many environments or jobs. Could envResourceMap be created once per env when the environments are initialized in C2DEngineDocker.start() and stored on the env object or a separate map for quicker lookup? This is a minor optimization suggestion.
• [INFO][style] Consider adding JSDoc comments to the new processFeesForEnvironment method and updating existing JSDocs for methods like checkIfResourcesAreAvailable and allocateCpus to reflect their new parameters and responsibilities. This will improve code clarity and maintainability.
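The first ERROR above asks for `minJobDuration`/`maxJobDuration` on the free-tier options. A minimal sketch of the suggested shape, with a plain-TypeScript stand-in for the schema-side duration check (the real interface in src/@types/C2D/C2D.ts and the Zod schema in schemas.ts may carry other fields):

```typescript
// Sketch of the suggested addition; existing fields are omitted.
interface ComputeEnvironmentFreeOptions {
  minJobDuration?: number // seconds, suggested new field
  maxJobDuration?: number // seconds, suggested new field
}

// Illustrative helper (not from the codebase): clamp a requested job
// duration into the free tier's [minJobDuration, maxJobDuration] window.
function clampFreeJobDuration(
  requested: number,
  free: ComputeEnvironmentFreeOptions
): number {
  const min = free.minJobDuration ?? 60 // 60 s default mentioned in a later review
  const max = free.maxJobDuration ?? requested
  return Math.min(Math.max(requested, min), max)
}

console.log(clampFreeJobDuration(10, { minJobDuration: 60, maxJobDuration: 3600 })) // 60
```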
/run-security-scan
alexcos20
left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This pull request introduces a significant architectural change to the Compute-to-Data (C2D) component, allowing for multiple, distinct compute environments within a single Docker cluster. The configuration for compute environments is now hierarchical, enabling granular control over resources, fees, access, and job durations for each environment. It also implements global resource availability checks to prevent over-allocation across environments and refactors CPU core allocation to be environment-specific. This is a substantial improvement in flexibility and resource management, but it introduces a breaking change in the configuration structure.
Comments:
• [ERROR][other] This PR introduces a breaking change to the dockerComputeEnvironments configuration structure. Previously, parameters like storageExpiry, maxJobDuration, fees, and resources were directly under the C2DDockerConfig object. Now, they must be nested within an environments array. While the .env.example and config.json files are updated, existing deployments will require manual configuration migration. Please ensure this breaking change is clearly documented in the release notes with migration instructions.
• [INFO][style] Good catch correcting the typo from '1 hours' to '1 hour' in the paymentClaimInterval comment.
• [INFO][performance] The addition of envResourceMap for efficient resource lookup is a good optimization, especially with the introduction of multiple environments. This avoids repeated array iteration.
• [WARNING][bug] The physicalLimits map correctly initializes for 'cpu' and 'ram' based on sysinfo. However, there's no corresponding initialization for 'disk'. While the checkGlobalResourceAvailability method for disk will sum total values from configured environments, it won't necessarily reflect the actual physical disk capacity of the host unless manually configured to match. Is there a plan to automatically detect physical disk capacity or enforce a configuration check to ensure the sum of disk.total across environments does not exceed the physical limit?
• [INFO][style] The ComputeEnvironmentFreeOptionsSchema now includes minJobDuration. This is consistent with the paid environment options. Good to ensure consistency here.
• [INFO][other] The validation rule environments: z.array(C2DEnvironmentConfigSchema).min(1) ensures that at least one compute environment is configured within a Docker cluster. This is a sensible default and prevents misconfigurations.
• [INFO][other] The environment ID fallback to String(envIdx) if not explicitly provided is functional for ensuring uniqueness. For user-friendliness and traceability in logs/monitoring, it might be beneficial to encourage explicit id definitions in the configuration, or generate a more descriptive ID based on other environment properties if id is missing.
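The disk question raised in the WARNING above boils down to comparing the sum of configured `total` values against a detected physical limit. A hedged stand-in for that check (function and field names are illustrative, not the actual checkGlobalResourceAvailability implementation):

```typescript
interface EnvResource { id: string; total: number }
interface EnvConfig { id: string; resources: EnvResource[] }

// Illustrative global-availability check: sum a resource's `total`
// across all configured environments and flag configurations that
// exceed the host's physical capacity for that resource.
function exceedsPhysicalLimit(
  envs: EnvConfig[],
  resourceId: string,
  physicalLimit: number
): boolean {
  const configuredTotal = envs.reduce((sum, env) => {
    const r = env.resources.find((res) => res.id === resourceId)
    return sum + (r ? r.total : 0)
  }, 0)
  return configuredTotal > physicalLimit
}
```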
/run-security-scan
alexcos20
left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This pull request introduces a major architectural refactoring of the compute environment configuration, allowing a single compute node to host multiple distinct compute environments with individual resource definitions, fee structures, and access controls. It moves core environment settings into a nested environments array within the Docker compute configuration. Key improvements include granular per-environment resource management (CPU, RAM, disk), a new global resource availability check to prevent over-allocation, and robust Zod schema validation for the new configuration structure. This is a significant enhancement for compute providers offering varied services.
Comments:
• [INFO][other] The updated example for DOCKER_COMPUTE_ENVIRONMENTS clearly demonstrates the new nested environments array structure. This is a breaking change for existing deployments, so clear migration notes will be essential.
• [INFO][other] The introduction of C2DEnvironmentConfig and the modification of C2DDockerConfig correctly reflect the new multi-environment architecture. This provides much-needed flexibility.
• [INFO][other] The fix from '1 hours' to '1 hour' for paymentClaimInterval comment is a minor but welcome detail correction.
• [INFO][performance] The envResourceMap creation is a good optimization for resource lookups within an environment, improving readability and potentially performance in resource-intensive loops.
• [INFO][other] The new logic distinguishing between shared-exclusive (GPU) and per-env exclusive (CPU, RAM, disk) resource tracking is crucial for accurate multi-environment resource management. This addresses a common challenge in container orchestration.
• [INFO][security] The checkGlobalResourceAvailability method is a critical addition for preventing over-allocation of physical resources across multiple logical compute environments. This enhances the stability and security of the compute node.
• [INFO][other] The processFeesForEnvironment method extracts and centralizes fee processing logic, which improves code organization and maintainability.
• [INFO][other] The start() method's refactoring to iterate over envConfig.environments and dynamically create ComputeEnvironment objects is the core of the multi-environment support. This is a robust implementation of the new design.
• [WARNING][performance] The cpuOffset was removed from the constructor, and CPU allocation logic is now per-environment. While the new per-environment CPU allocation (using envCpuCoresMap) is correct, please confirm that the cpuOffset functionality for multiple physical Docker clusters (if that was ever a use case) hasn't been implicitly removed without replacement. Currently, C2DEngines iterates over c2dClusters, but C2DEngineDocker is created for only a single cluster. If there is a use case for multiple DOCKER clusters on one node that need different CPU affinities, that may need re-evaluation.
• [INFO][performance] Passing envId to allocateCpus and using envCpuCoresMap for environment-specific CPU pinning is a good approach to ensure fair and isolated CPU resource allocation for jobs running in different configured environments.
• [INFO][other] The C2DEnvironmentConfigSchema with its refine clauses provides excellent validation for the new environment configurations. The mandatory disk resource and storageExpiry vs maxJobDuration checks are important for operational stability. The requirement for either fees or free configuration per environment is also sensible.
• [INFO][other] Adding minJobDuration to ComputeEnvironmentFreeOptionsSchema with a default of 60 seconds is a good consistency improvement. This ensures free jobs also have a minimum duration constraint.
• [WARNING][test] In some integration tests, paymentClaimInterval is now set on the DOCKER_COMPUTE_ENVIRONMENTS string at the top level of the Docker config, not within the nested environments array. This is consistent with the C2DDockerConfig definition, but it's important to ensure that this global paymentClaimInterval is respected and applies correctly to all nested environments, or if it should be an environment-specific setting.
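The per-environment CPU pinning discussed in the comments above can be sketched as a map from environment ID to that environment's pool of core indices; `allocateCpus` then hands out cores only from the caller's pool. This is an illustrative sketch, not the actual envCpuCoresMap/allocateCpus code:

```typescript
// Each environment owns a disjoint set of core indices (assumed layout).
const envCpuCoresMap = new Map<string, number[]>([
  ['env-0', [0, 1, 2, 3]],
  ['env-1', [4, 5, 6, 7]]
])

function allocateCpus(envId: string, count: number): number[] {
  const pool = envCpuCoresMap.get(envId)
  if (!pool || pool.length < count) {
    throw new Error(`env ${envId}: not enough free cores`)
  }
  // splice removes the allocated cores from the pool, so concurrent jobs
  // in the same environment cannot double-book them
  return pool.splice(0, count)
}
```

Returning cores on job completion would push them back onto the environment's pool, keeping allocations isolated per environment.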
d4cb256 to a34a4f3
/run-security-scan
alexcos20
left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This pull request introduces a significant architectural change by enabling support for multiple, distinct compute environments within a single C2D Docker cluster. Each environment can now define its own resources (CPU, RAM, disk, GPU), fee structures, access controls, and free tier configurations. Key changes include a revised configuration structure, updated TypeScript types, granular resource management logic distinguishing between shared (GPU) and exclusive (CPU, RAM, disk) resources, and the introduction of global physical resource limits. CPU affinity is now managed per environment.
This is a major feature enhancement that improves the flexibility and scalability of compute offerings.
Comments:
• [INFO][other] The C2DDockerConfig interface has been significantly refactored. Previously, it contained properties like storageExpiry, maxJobDuration, fees, and resources directly. These are now moved into the new C2DEnvironmentConfig interface, and C2DDockerConfig now primarily holds connection details and an array of environments.
This is a breaking change to the configuration structure, which needs to be clearly communicated to users and administrators during upgrade.
• [INFO][other] The resource usage tracking logic has been enhanced to differentiate between gpu resources (shared-exclusive, tracked globally) and other resource types (cpu, ram, disk) which are per-environment exclusive. This is a crucial improvement for supporting heterogeneous environments and ensuring accurate resource allocation across multiple defined compute environments.
• [INFO][other] Introduction of physicalLimits and checkGlobalResourceAvailability is a critical addition. This ensures that even with multiple compute environments, the aggregated resource demands do not exceed the actual physical capacity of the host machine. This adds a robust layer of safety and prevents over-provisioning at a global level.
• [WARNING][performance] In the start() method, if statfsSync(this.getC2DConfig().tempFolder) fails to detect the physical disk size, physicalDiskGB defaults to 0. This could lead to diskResources.total being 0 for all environments, effectively making disk resources unavailable or severely constrained if the detection fails. While the CORE_LOGGER.warn helps, consider if a default fallback (e.g., a minimum reasonable disk size) or a more explicit failure during startup is preferable if disk detection is critical and might fail in some environments.
• [INFO][other] The CPU affinity logic now uses envCpuCoresMap to manage core allocations per environment, rather than a single envCpuCores array. This correctly isolates and assigns CPU cores based on the configuration of each specific compute environment, which is vital for multi-environment support.
• [INFO][style] The C2DDockerConfigSchema now requires environments to be an array with a minimum length of 1 (.min(1)). This implies that a Docker compute cluster must define at least one environment. This is a sensible design choice for clarity and preventing misconfigurations where a cluster might be defined but offers no compute environments.
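The statfsSync fallback suggested in the WARNING above could look like the sketch below: fall back to a conservative default instead of 0 GB when detection fails. `DEFAULT_DISK_GB` and `detectDiskGB` are illustrative names, not from the codebase:

```typescript
import { statfsSync } from 'node:fs'

// Assumed fallback constant; a real deployment might make this configurable.
const DEFAULT_DISK_GB = 10

function detectDiskGB(tempFolder: string): number {
  try {
    const stats = statfsSync(tempFolder)
    const gb = (stats.bsize * stats.blocks) / 1e9
    // Guard against a 0 result, which would make disk unavailable everywhere.
    return gb > 0 ? gb : DEFAULT_DISK_GB
  } catch (err) {
    console.warn(`disk detection failed for ${tempFolder}, using default`)
    return DEFAULT_DISK_GB
  }
}
```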
/run-security-scan
alexcos20
left a comment
AI automated code review (Gemini 3).
Overall risk: medium
Summary:
This pull request introduces a significant refactoring of the compute-to-data (C2D) environment configuration, allowing for multiple compute environments per Docker cluster. Each environment can now define its own resources (CPU, RAM, disk, GPU), fees, and access controls. It also implements global physical resource limits and environment-specific CPU allocation. A new feature to automatically create a benchmark environment is added. Configuration schemas, environment variable parsing, and core compute logic have been updated to support these changes.
Comments:
• [INFO][other] The DOCKER_COMPUTE_ENVIRONMENTS configuration has undergone a major breaking change, now requiring an environments array. This needs thorough documentation for node operators, including migration steps if applicable, to avoid confusion and misconfigurations during upgrades.
• [WARNING][performance] The start method in C2DEngineDocker has grown quite complex, handling physical limit detection, benchmark environment setup, and iteration over multiple environment configurations. Consider breaking it down into smaller, more focused private methods to improve readability and maintainability, e.g. _initializePhysicalLimits() and _createEnvironment(envDef, sysinfo, platform, consumerAddress, supportedChains).
• [INFO][bug] The statfsSync call for disk size is wrapped in a try/catch. If this fails for some reason, the disk physical limit will not be set, potentially leading to incorrect disk resource management if a fallback isn't provided. While a warning is logged, for critical resources like disk, a more robust fallback or an explicit error might be considered if disk detection is essential for operation.
• [WARNING][style] The sepoliaChainId and usdcToken for the benchmark environment are hardcoded here. It would be better practice to define these as constants in src/utils/config/constants.ts or make them configurable via environment variables, similar to BENCHMARK_MONITORING_ADDRESS, for better maintainability and flexibility across different network configurations.
• [INFO][bug] The .refine rule for mandatory 'disk' resource: if (!data.resources) return false causes the schema to fail if the resources array is omitted entirely from an environment configuration. This is likely intended behavior to ensure disk is always specified, but it's worth confirming that environments without an explicit resources array (relying on implicit defaults or auto-detection) are not meant to be valid.
• [INFO][style] The C2DDockerConfigSchema has a .min(1) requirement for the environments array. This ensures that at least one compute environment must be defined for each Docker cluster. This is a good constraint for operational stability.
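The validation rules discussed above (mandatory 'disk' resource, fees-or-free requirement, at least one environment per cluster) can be expressed as a plain-TypeScript stand-in for the Zod refine clauses. Names and shapes here are illustrative, not the actual C2DEnvironmentConfigSchema:

```typescript
interface EnvDef {
  id?: string
  resources?: { id: string }[]
  fees?: object
  free?: object
}

// Illustrative mirror of the schema constraints the review describes.
function validateCluster(environments: EnvDef[]): string[] {
  const errors: string[] = []
  if (environments.length < 1) {
    errors.push('at least one environment is required') // .min(1)
  }
  environments.forEach((env, i) => {
    const hasDisk = env.resources?.some((r) => r.id === 'disk') ?? false
    if (!hasDisk) errors.push(`environment ${i}: missing mandatory 'disk' resource`)
    if (!env.fees && !env.free) errors.push(`environment ${i}: needs 'fees' or 'free'`)
  })
  return errors
}
```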
Fixes #1280.
Changes proposed in this PR:
environment-specific settings.