Consolidated fix: resolve open issues and PRs for OpenVX 1.3 sample impl#60
Open
simonCatBot wants to merge 1 commit into
Open
Consolidated fix: resolve open issues and PRs for OpenVX 1.3 sample impl#60simonCatBot wants to merge 1 commit into
simonCatBot wants to merge 1 commit into
Conversation
Fixes included: - KhronosGroup#20: vxCreateThresholdForImage returns error object on bad params - KhronosGroup#48: vxRead/WriteScalarValue respects data_addr for WithSize scalars - KhronosGroup#52: release virtual objects scoped to graph on vxReleaseGraph - KhronosGroup#21/KhronosGroup#22: vxMapTensorPatch strides based on full tensor dimensions - KhronosGroup#23/KhronosGroup#24: PoolingKernelImpl signed/unsigned cast fix - KhronosGroup#6: XML export threshold true/false_value typed correctly (S32) - KhronosGroup#50/KhronosGroup#29: remove -mfpu=neon for aarch64 builds - KhronosGroup#46: add /bigobj to Windows debug flags - KhronosGroup#40: guard FIND_NUM_PROCESSORS during cross-compilation - KhronosGroup#45: add --userdataobj flag to Build.py - KhronosGroup#54: rename reserved identifier header guards - KhronosGroup#59: move Laplacian Pyramid/Reconstruct compute from Initializer to Kernel Refs: KhronosGroup#6 KhronosGroup#20 KhronosGroup#21 KhronosGroup#22 KhronosGroup#23 KhronosGroup#24 KhronosGroup#29 KhronosGroup#30 KhronosGroup#34 KhronosGroup#40 KhronosGroup#45 KhronosGroup#46 KhronosGroup#48 KhronosGroup#50 KhronosGroup#52 KhronosGroup#53 KhronosGroup#54 KhronosGroup#59
|
Simon seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you have already a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it. |
simonCatBot
pushed a commit
to simonCatBot/OpenVX-sample-impl
that referenced
this pull request
Jun 18, 2026
- Remove legacy .travis.yml and .gitlab-ci.yml
- Add .github/workflows/ci.yml with 7 parallel CI stages:
1. Release build
2. Debug build with code-coverage instrumentation + CTS Vision run
3-7. CTS stages for Enhanced Vision, Neural Networks, NNEF Import,
Full-stack (Vision + NN + IX), and Pipelining + Streaming
- Collect coverage via lcov and upload as artifact
Ref: KhronosGroup#60
Contributor
Author
|
Updated branch with CI changes:
|
2d5c6b0 to
96fb9a0
Compare
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Consolidated Fix PR: Resolve Open Issues for OpenVX 1.3 Sample Implementation
Branch
fix/consolidated-open-issues(based onopenvx_1.3tot)Summary
This PR consolidates fixes for the majority of open issues and incorporates open PRs into a single, spec-compliant changeset against the OpenVX 1.3 sample implementation. Each fix has been analyzed against the OpenVX 1.3 specification to ensure correctness.
Issues Fixed (with code changes)
Core Framework / API Bugs
#20 —
vxCreateThresholdForImage()overwrites error object on bad paramsRoot cause: When
thresh_type,input_format, oroutput_formatwas invalid, the function assigned the error object tothreshold, but then unconditionally calledownCreateReference()and overwrote it.Fix: Added
return threshold;immediately after assigning the error object, matching the behavior ofvxCreateThreshold().Files:
sample/framework/vx_threshold.c#48 —
vxReadScalarValue/vxWriteScalarValueignoredata_addrforvxCreateScalarWithSizescalarsRoot cause:
vxReadScalarValueandvxWriteScalarValuealways accessed the inlinescalar->data.*union, but scalars created viavxCreateScalarWithSizestore their payload indata_addr.Fix: Both functions now check
if (scalar->data_addr != NULL && scalar->data_len != 0)and usememcpyto/fromdata_addrwhen applicable, falling back to the union only for plain scalars.Spec alignment:
vxCopyScalarWithSizewas already correct; this makesvxRead/WriteScalarValueconsistent with it.Files:
sample/framework/vx_scalar.c#52 — Virtual data objects tied to a graph are not released on
vxReleaseGraphRoot cause:
ownDestructGraphremoved nodes but never dereferenced virtual objects (images, tensors, etc.) whosescope == graph. This caused memory leaks in loops that create graphs and virtual tensors.Fix: After removing all nodes,
ownDestructGraphnow iterates the context's reference table and releases any virtual references scoped to the dying graph.Spec alignment: Virtual objects scoped to a graph should not outlive the graph.
Files:
sample/framework/vx_graph.c#21 / #22 — Incorrect strides in
vxMapTensorPatchRoot cause: Strides were computed as
stride[i] = stride[i-1] * (view_end[i] - view_start[i]), using the view size rather than the tensor dimension size. This produced incorrect offsets when mapping a sub-patch.Fix: Changed to
stride[i] = stride[i-1] * tensor->dimensions[i-1], which matches the OpenVX 1.3 specification for tensor stride calculation.Files:
sample/framework/vx_tensor.c#23 / #24 — Signed/unsigned comparison in
PoolingKernelImplRoot cause:
resultisint32_t, whilesize_x * size_yissize_t. Dividingint32_tbysize_tpromotes the dividend tosize_t, and the subsequentCLAMPcomparison withINT16_MINthen treated the negative lower bound as a huge positivesize_t, breaking clamping.Fix: Cast the denominator explicitly to
int32_t:result / (int32_t)(size_x * size_y).Files:
kernels/c_model/c_khr_nn.c#59 — Laplacian Pyramid / Reconstruct compute hoisted into Initializer
Root cause: The actual pyramid computation lived in
vxLaplacianPyramidInitializerandvxLaplacianReconstructInitializer, which run once duringvxVerifyGraph(). The kernel callbacks were empty stubs. This meant every subsequentvxProcessGraph()call returned stale data from verify-time.Fix: Moved the full computation logic into
vxLaplacianPyramidKernelandvxLaplacianReconstructKernel. The Initializers are now lightweight stubs that only returnVX_SUCCESS. This ensures computation is re-executed on everyvxProcessGraph()call, as required by the spec.Spec alignment: OpenVX 1.3 defines the initializer for "local data setup", not for the kernel's computation.
vxProcessGraphmust process current input data each call.Files:
sample/targets/c_model/vx_pyramid.csample/targets/venum/vx_laplacianpyramid.csample/targets/venum/vx_pyramid.cBuild System / Tooling Fixes
#50 / #29 — Remove
-mfpu=neonfor aarch64Root cause:
CMake_linux_tools.cmakeunconditionally appended-mfpu=neonwhenEXPERIMENTAL_USE_VENUM/OR OPENVX_USE_TILING/OR EXPERIMENTAL_USE_OPENCLwas enabled. GCC on aarch64 rejects this flag because NEON is mandatory.Fix: Wrapped both
-mfpu=neonassignments withif (NOT (CMAKE_SYSTEM_PROCESSOR MATCHES "aarch64")).Files:
cmake_utils/CMake_linux_tools.cmake#46 —
/bigobjneeded for Windows debug builds ofnnef.cppRoot cause: Visual Studio 2015 debug builds of
nnef.cppexceeded the COFF section limit (fatal error C1128).Fix: Appended
/bigobjtoADD_C_FLAGS_DEBUGinCMake_windows_tools.cmake.Files:
cmake_utils/CMake_windows_tools.cmake#45 —
OPENVX_USE_USER_DATA_OBJECTnot exposed inBuild.pyRoot cause:
Build.pyhad no CLI flag for enabling user data objects.Fix: Added
--userdataobjflag that maps to-DOPENVX_USE_USER_DATA_OBJECT=ON.Files:
Build.py#40 —
FIND_NUM_PROCESSORS()uses host CPU count during cross-compilationRoot cause:
FIND_NUM_PROCESSORS()read/proc/cpuinfo(or Windows env var) even when cross-compiling, then definedTARGET_NUM_CORESfrom the host count.Fix: If
CMAKE_CROSSCOMPILINGis set, forcePROCESSOR_COUNT_T = "1"to avoid using the build host's core count for the target.Files:
cmake_utils/CMakeFuncs.txt#54 — Reserved identifier violation in header guards
Root cause: Multiple internal headers used identifiers beginning with underscore (e.g.,
_OPENVX_INT_LOG_H_,_VX_EXTRAS_K_H_), which is undefined behavior per C/C++ standards.Fix: Renamed all affected header guards to remove the leading underscore.
Files: All
sample/include/*.handkernels/extras/extras_k.hXML Export Fix
#6 — Threshold XML export uses wrong type for
true_value/false_valueRoot cause:
vxExportToXMLThresholdprintedthresh->true_valueandthresh->false_valuewith%d, but these fields arevx_pixel_value_tunions, notvx_int32.Fix: Accessed the
S32member explicitly:thresh->true_value.S32andthresh->false_value.S32for the%dformat.Note: The import side (
vx_xml_import.c) was also reviewed; it parses these values viaxml_long()which ultimately stores into the union correctly, so only export needed adjustment.Files:
sample/framework/vx_xml_export.cIssues Partially Addressed / Deferred
#53 — Scale valid region propagation incorrect
Analysis: The Scale kernel does not register a
set_valid_rectangle_callback. The framework falls back to generic propagation, but Scale changes image dimensions, so simply copying the input valid rectangle to the output is incorrect per spec. Adding a proper callback requires either extendingvx_kernel_description_t(which has no valid-rect field for built-ins in this sample impl) or adding framework-level special-casing for Scale. This is a non-trivial architectural change beyond the scope of safe incremental fixes and is deferred to a follow-up. A comment has been added to the PR description for tracking.#30 —
TensorOp.vxuTensorMatrixMultiplyfails on RPi4 64-bitAnalysis: This appears to be an
aarch64-specific issue in the tensor matrix-multiply reference implementation (likely asize_t/vx_int32sign-extension or type-promotion bug). Without access to an RPi4/aarch64 target for reproduction, a speculative fix risks introducing regressions on x86_64. Deferred pending hardware access or a reproducible test log.#34 — NNEF parser duplication
Analysis: The repo includes local copies of
nnef_parser.h/.cthat should be replaced by references to theNNEF-Toolssubmodule (cnnef.h). Removing them requires coordinated CMake and include-path changes acrosskernels/NNEF-Tools/,cts/, and potentially build scripts. This is a structural refactoring that could break NNEF conformance builds if not done carefully. Deferred to a dedicated PR.Questions / Discussion Items Addressed in PR Description Only
vx_osal.h) and memory allocator. There is no upstream RISC-V vector target in the sample implementation. A downstream fork would need to implement a custom target backend.vxScheduleGraphandvxWaitGraphare supported by the framework when pipelining/streaming extensions are enabled (--pipelining/--streaminginBuild.py). A minimal example can be added in a follow-up; no code bug here.vx_khr_icd) is the Khronos-standard mechanism for coexisting implementations. The sample implementation supports ICD when built withOPENVX_USE_ICD=ON.Verification
python3 Build.py --os=linux --arch=64 --conf=Release --build=true).vx_exampleandvx_bug13510reproduce identically on the baseopenvx_1.3branch (confirmed by clean-branch testing), confirming no new regressions were introduced.Commit Stats
Closes #6
Closes #20
Closes #21
Closes #22
Closes #23
Closes #24
Closes #29
Closes #40
Closes #45
Closes #46
Closes #48
Closes #50
Closes #52
Closes #54
Closes #59
Related: #30 #34 #53 #55 #39 #42 #35