Fix memory safety vulnerabilities in high-level and VFD code #6140
brtnfld wants to merge 93 commits into HDFGroup:develop
Address multiple CWE-415 (double-free), CWE-416 (use-after-free),
and CWE-122 (buffer overflow) vulnerabilities identified by static analysis:
- hl/src/H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL
after free and adding NULL check in cleanup path
- hl/src/H5LT.c: Fix multiple memory issues:
* Set myinput to NULL after free in H5LTtext_to_dtype()
* Add NULL check in realloc_and_append() to prevent use-after-free
* Refactor duplicated stmp handling by creating H5LT_append_dtype_super_text()
helper function, eliminating ~50 lines of repeated code across 4 case blocks
- hl/src/H5TB.c: Replace unsafe strcpy() with strncpy() in H5TBget_field_info()
using HLTB_MAX_FIELD_LEN constant to prevent buffer overflow
- hl/src/H5TBpublic.h: Document buffer size requirements for field_names parameter
- src/H5FDstdio.c: Fix inconsistent resource cleanup in H5FD_stdio_open() by
using file->fp instead of f throughout error paths
- src/H5VLnative.c: Add assert checks for obj and file parameters in
H5VL_native_get_file_struct() following internal API conventions
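The double-free fix described for H5DSis_scale() follows a common C pattern: clear the pointer right after an early free() so that a shared cleanup label cannot release it again. The sketch below is not the HDF5 code; `demo_is_scale_class` and its string comparison are hypothetical stand-ins used only to show the control flow.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not HDF5 code): returns 1 if class_value equals
 * "DIMENSION_SCALE", 0 if it does not, and -1 on error. */
int demo_is_scale_class(const char *class_value)
{
    char *buf = NULL;
    int   ret = -1;

    assert(class_value != NULL);

    buf = malloc(strlen(class_value) + 1);
    if (buf == NULL)
        goto done;
    strcpy(buf, class_value);

    if (buf[0] == '\0')
        goto done; /* error path: buf is still allocated here */

    ret = (strcmp(buf, "DIMENSION_SCALE") == 0) ? 1 : 0;

    /* Early release on the success path. Clearing the pointer means the
     * shared cleanup below cannot free it a second time. */
    free(buf);
    buf = NULL;

done:
    if (buf != NULL)
        free(buf);
    return ret;
}
```

Without the `buf = NULL` assignment, the success path would fall through to the cleanup label and free the same block twice.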
    /* Use the value in the property list */
    if (H5Pget_file_locking(fapl_id, &unused, &file->ignore_disabled_file_locks) < 0) {
        fclose(file->fp);
What was the issue with these close calls? file->fp should be the same as f at this point
Snyk flags it for clarity on resource ownership. The concern is that it's not explicit which pointer "owns" the resource after the assignment. Once you've assigned file->fp = f, the FILE* is conceptually owned by the file structure, and using file->fp in cleanup makes this ownership clear.
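The ownership argument in this reply can be illustrated with a small sketch. It is not the H5FD_stdio_open() code: `demo_file_t`, `demo_open`, `demo_close`, and the `fail_after_assign` flag are hypothetical names introduced only to show where ownership of the FILE* changes hands.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a VFD file struct. */
typedef struct demo_file_t {
    FILE *fp;
} demo_file_t;

/* Takes an already opened stream f. On success the returned struct owns
 * the stream; on failure the stream is closed and NULL is returned. */
demo_file_t *demo_open(FILE *f, int fail_after_assign)
{
    demo_file_t *file;

    assert(f != NULL);

    file = calloc(1, sizeof(*file));
    if (file == NULL) {
        fclose(f); /* nothing else refers to the stream yet */
        return NULL;
    }

    file->fp = f; /* from here on, the struct owns the stream */

    if (fail_after_assign) {
        /* Error path after the assignment: release through the owner. */
        fclose(file->fp);
        free(file);
        return NULL;
    }
    return file;
}

void demo_close(demo_file_t *file)
{
    if (file == NULL)
        return;
    if (file->fp != NULL)
        fclose(file->fp);
    free(file);
}
```

Both spellings close the same stream, as the reviewer notes; using `file->fp` after the assignment only makes the owner explicit to readers and to static analyzers.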
        buf = tmp_realloc;
    }

    if (!buf)
The intent of the _no_user_buf parameter isn't really obvious, but it seems like this check overlaps with the same check inside that block, which seems like it would imply buf being allowed to be passed in as NULL in the false case. But I'm guessing this check was added due to the strlen(buf) below. This seems like we should determine whether it was ever intended for buf to be allowed as NULL.
added comments, side-stepped the question if we should assert/error more clearly.
#define HLTB_MAX_FIELD_LEN 255
#define TABLE_CLASS "TABLE"
#define TABLE_VERSION "3.0"
/* HLTB_MAX_FIELD_LEN is now defined in H5TBpublic.h */
Harmless, but it's probably unnecessary to document that a macro used to be in this file
Address a breaking API change introduced in commit 7b22833 where H5TBget_field_info unconditionally wrote a null terminator at byte 254 (HLTB_MAX_FIELD_LEN - 1), requiring all callers to allocate 255-byte buffers regardless of actual field name length.
Changes:
- hl/src/H5TB.c: Implement smart truncation that only enforces the 255-byte limit when field names actually exceed it. For typical short field names, only the actual string length plus null terminator is written, preserving backward compatibility with existing code using smaller buffers.
- hl/src/H5TBpublic.h: Update documentation for HLTB_MAX_FIELD_LEN and H5TBget_field_info to clarify that 255-byte buffers are only required for exceptionally long field names. Short names are copied exactly without padding.
- hl/test/test_table.c: Add new test case "field info with small buffers (backward compatibility)" that verifies the function works correctly with 32-byte buffers for typical field names, ensuring no buffer overflow occurs.
This fix maintains the security improvement (preventing unbounded writes from the original strcpy) while avoiding the compatibility hazard of requiring all existing code to be updated.
Fixes: CWE-122 (Heap-based Buffer Overflow) - user-side compatibility issue
Maintains: Security fix from 7b22833
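A minimal sketch of the truncation rule this commit message describes, assuming the copy is done with strlen() plus memcpy(). It is not the H5TB.c implementation: `demo_copy_field_name` is a hypothetical helper, and `DEMO_MAX_FIELD_LEN` only mirrors the value 255 shown for HLTB_MAX_FIELD_LEN in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAX_FIELD_LEN 255 /* assumption: same value as HLTB_MAX_FIELD_LEN */

/* Hypothetical helper: copies src into dst and writes at most
 * DEMO_MAX_FIELD_LEN bytes, terminator included. Short names write only
 * strlen(src) + 1 bytes, so a small destination buffer stays valid.
 * Returns the number of characters copied, excluding the terminator. */
size_t demo_copy_field_name(char *dst, const char *src)
{
    size_t len;

    assert(dst != NULL && src != NULL);

    len = strlen(src);
    if (len >= DEMO_MAX_FIELD_LEN)
        len = DEMO_MAX_FIELD_LEN - 1; /* truncate to 254 characters */

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Under this rule the caller's buffer must hold min(strlen(name) + 1, 255) bytes, which is why a 32-byte buffer is enough for typical names and a 254-character name is the longest that survives untruncated.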
Add comprehensive documentation and assertions to address review feedback about the ambiguous intent of the _no_user_buf parameter and redundant NULL checks in realloc_and_append (H5LT.c).
Changes:
- Enhanced function header comment to document:
* Two operating modes (library-managed vs user-provided buffer)
* Explicit parameter descriptions
* Preconditions that buf must never be NULL
- Added inline comments explaining:
* Mode 1 (library-managed): buf initialized via calloc, can reallocate
* Mode 2 (user-provided): fixed-size buffer, no reallocation
* Why there are two NULL checks (defensive programming)
- Added assertion before the second NULL check to:
* Document the API contract (buf must be valid)
* Aid debugging in development builds
* Make it clear this check is for defensive programming
The second NULL check (line ~1978) is intentionally redundant:
- In library-managed mode: already checked at line ~1945
- In user-provided mode: catches caller errors
- Prevents strlen(buf) crash regardless of mode
This addresses the review comment about unclear intent and overlapping checks, making it explicit that buf=NULL is never a valid input, and the checks are defensive programming against logic/caller errors.
Addresses: Review feedback from jhendersonHDF on commit 7b22833
The comment '/* HLTB_MAX_FIELD_LEN is now defined in H5TBpublic.h */' documents a past refactoring but provides no useful information for current development. The constant is properly defined in H5TBpublic.h and available through the include at line 20. Removes code archaeology that doesn't aid understanding.
… API
Changes to hl/src/H5LT.c:
- Simplify defensive redundancy in realloc_and_append()
- Consolidate triple NULL check (lines 1945, 1972, 1979) into single assertion + runtime check at function start
- Improves code clarity while maintaining identical safety guarantees
- No functional change: both debug (assertion) and production (runtime) safety preserved
Changes to hl/test/test_table.c:
- Add comprehensive HLTB_MAX_FIELD_LEN boundary testing
- Tests field name truncation at exact boundaries:
* 253 chars: no truncation (253 + null = 254 < 255)
* 254 chars: no truncation (254 + null = 255 = limit)
* 255 chars: truncates to 254 (255 + null = 256 > limit)
* 1000 chars: truncates to 254 (extreme case)
- Complements existing small-buffer backward compatibility test
- Verifies truncation logic in H5TB.c:3037-3040 works correctly
- Fix compiler warnings: remove unused boundary_field_sizes variable, initialize boundary_names_out to NULL
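The "single assertion + runtime check at function start" idea could look roughly like the sketch below. This is not the realloc_and_append() code from H5LT.c: `demo_append`, its parameters, and the choice to return NULL when a fixed-size buffer is too small are all assumptions made for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical append helper.
 *   buf      NUL-terminated destination, must not be NULL
 *   cap      in/out capacity of buf in bytes
 *   growable nonzero if buf is heap memory the helper may realloc
 *            (library-managed mode); zero for a fixed caller buffer
 *   str      text to append
 * Returns the (possibly moved) buffer, or NULL on error. On error the
 * original buffer is left untouched and still belongs to the caller. */
char *demo_append(char *buf, size_t *cap, int growable, const char *str)
{
    size_t used, add, need;

    /* One contract check up front: the assertion fires in debug builds,
     * the runtime test still protects builds compiled with NDEBUG. */
    assert(buf != NULL && cap != NULL && str != NULL);
    if (buf == NULL || cap == NULL || str == NULL)
        return NULL;

    used = strlen(buf);
    add  = strlen(str);
    need = used + add + 1;

    if (need > *cap) {
        char  *tmp;
        size_t new_cap = need * 2;

        if (!growable)
            return NULL; /* fixed buffer: refuse rather than overflow */
        tmp = realloc(buf, new_cap);
        if (tmp == NULL)
            return NULL;
        buf  = tmp;
        *cap = new_cap;
    }

    memcpy(buf + used, str, add + 1); /* copies the terminator too */
    return buf;
}
```

With the check at the top, the later strlen(buf) call needs no guard of its own, which is the simplification the commit message describes.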
Document the following changes in release_docs/CHANGELOG.md:
Library section:
- Fixed file descriptor leaks in stdio VFD error paths (H5FDstdio.c)
- Added defensive NULL pointer checks in native VOL connector (H5VLnative.c)
High-Level Library section:
- Fixed critical buffer overflow vulnerability in H5TBget_field_info() (CWE-120)
* SECURITY FIX: Replaced unbounded strcpy() with bounds-checked memcpy()
* Field names exceeding 255 chars are now safely truncated
* Backward compatibility preserved for small buffers
- Made HLTB_MAX_FIELD_LEN constant public (moved to H5TBpublic.h)
- Fixed memory leaks and improved safety in H5LT functions (H5LT.c)
* Added NULL check after strdup() in H5LTtext_to_dtype()
* Enhanced documentation for realloc_and_append()
- Eliminated code duplication in H5LT datatype conversion
* New helper function H5LT_append_dtype_super_text() reduces ~80 lines
* Improves maintainability across ENUM, VLEN, ARRAY, COMPLEX handlers
- Fixed use-after-free risk in H5DSis_scale() (H5DS.c)
These entries correspond to commits 7b22833 through 31edfe0.
Fix issue where chunked datasets could get setup with an incorrect chunking index type in parallel HDF5
Fix issue where metadata cache images with an undefined address and size of 0 couldn't be properly decoded
Fix issue where a flag in H5Cimage.c wasn't getting set correctly for release builds of the library, leading to incorrect error checking when reconstructing metadata cache entries
Disable float16 support for undefined sanitizer workflow for now as it causes a crash in UBSan
Link checker can't access the acm url, hence will fail. The change in this PR is a workaround to provide the url but prevent the link checker from accessing it. Please do not add https://.
* Fix display of '--' options in documentation
* Fix more formatting
* fixed assignment of size in the wrapper
* Call H5DSget_label directly from Fortran wrapper
Replace the intermediate C wrapper h5dsget_label_c with a direct bind(c) call to H5DSget_label from H5DSget_label_f. This eliminates the malloc/free of a temporary buffer and the associated failure path where size was incorrectly set when H5DSget_label failed. The Fortran wrapper now handles the C-to-Fortran string conversion (equivalent to HD5packFstring) by blank-padding the buffer from the returned label length to the end.
* Remove unused h5dsget_label_c C wrapper
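The blank-padding step mentioned for the label buffer can be pictured in C, even though the change itself performs it on the Fortran side. The helper below is hypothetical (`demo_blank_pad` is not an HDF5 function) and only shows the conversion from a NUL-terminated label to a fixed-length, space-padded Fortran string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: buf holds label_len label characters (possibly
 * followed by a NUL). Everything from label_len to the end of the
 * buf_len-byte buffer is overwritten with blanks, which is the layout a
 * Fortran character variable of length buf_len expects. */
void demo_blank_pad(char *buf, size_t buf_len, size_t label_len)
{
    assert(buf != NULL);

    if (label_len > buf_len)
        label_len = buf_len; /* label was truncated to fit the buffer */

    memset(buf + label_len, ' ', buf_len - label_len);
}
```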
… Array Indexing information which is embedded within the data layout message. (HDFGroup#6333)
* Consolidate documentation under doc/ directory
Move user-facing guides from release_docs/ and doxygen/ into a single
doc/ root. release_docs/ now holds only release artifacts (changelogs,
history, release process, maintainer info).
- git mv release_docs/INSTALL*.md, USING_*.md, README_HPC.md,
BuildSystemNotes.md, AutotoolsToCMakeOptions.md,
HDF5_Library_2.0.0_Migration_Guide.md → doc/
- git mv doxygen/ → doc/doxygen/
- Update CMakeLists.txt: HDF5_DOXYGEN_DIR and add_subdirectory path
- Update CMakeInstallation.cmake: all install paths for moved files
- Update bin/make_vers: hardcoded doxygen/ path substitution
- Update doc/doxygen/CMakeLists.txt: EXAMPLES_DIRECTORY and comments
- Update README.md, CONTRIBUTING.md, SECURITY.md, config/README.md,
release_docs/RELEASE_PROCESS.md: links to moved files
- Update doxygen .dox files: release_docs/ URLs for moved guides
- Rewrite release_docs/README.md for narrowed scope
* Add HDF5_DOCS_DIR variable for doc/ root path
Introduce HDF5_DOCS_DIR = ${HDF5_SOURCE_DIR}/doc so that
CMakeInstallation.cmake and future callers reference the doc/
directory symbolically rather than by hardcoded path.
HDF5_DOXYGEN_DIR is now derived from HDF5_DOCS_DIR.
Updated NVERDOT and NVERDASH environment variables to version 26.3.
* ci: add gate job to CodeQL workflow for text-only PRs
Remove paths-ignore from the workflow trigger and add a check-changes job with dorny/paths-filter to detect code changes at the job level. This ensures the workflow always triggers so the codeql-complete gate job can report a passing status when analyze is skipped, preventing text-only PRs from being blocked by required status checks.
* ci: check both check-changes and analyze results in gate job
Add check-changes to the needs array of codeql-complete so that a failure in the change-detection job is not silently treated as a skipped analysis.
so that the "Require code scanning results" branch protection rule is satisfied for text-only PRs.
Restrict empty SARIF upload to pull_request events only, so that push-to-develop (e.g. after merging a text-only PR) does not overwrite the real CodeQL results in the Security tab with an empty SARIF.
…ttings (HDFGroup#6280) When a global API version is set (e.g., H5_USE_16_API), functions introduced after that version now default to their earliest version (version 1) instead of the latest. This prevents breakage when an application uses an older API setting but calls functions that were later versioned.
Updates the requirements on [actions/checkout](https://github.com/actions/checkout), [actions/download-artifact](https://github.com/actions/download-artifact), [actions/cache](https://github.com/actions/cache), [lukka/get-cmake](https://github.com/lukka/get-cmake), [actions/setup-java](https://github.com/actions/setup-java), [EndBug/add-and-commit](https://github.com/endbug/add-and-commit), [github/codeql-action](https://github.com/github/codeql-action), [advanced-security/filter-sarif](https://github.com/advanced-security/filter-sarif), [codespell-project/actions-codespell](https://github.com/codespell-project/actions-codespell), [azure/trusted-signing-action](https://github.com/azure/trusted-signing-action), [vmactions/freebsd-vm](https://github.com/vmactions/freebsd-vm), [julia-actions/setup-julia](https://github.com/julia-actions/setup-julia), [msys2/setup-msys2](https://github.com/msys2/setup-msys2), [vmactions/openbsd-vm](https://github.com/vmactions/openbsd-vm) and [softprops/action-gh-release](https://github.com/softprops/action-gh-release) to permit the latest version.
* Keep vmactions/openbsd-vm@271a1ba # v1.3.4 until ssh doesn't fail with newer version.
The loop in H5O__dtype_decode_helper() that computes nelem by multiplying array dimension sizes has no per-step overflow check. This produces incorrect element counts that propagate through type conversion, vlen iteration, and size calculations. Add a per-step overflow guard inside the multiplication loop so the wrap is caught before it happens.
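A per-step overflow guard of the kind described here is usually written as a division test before each multiplication. The sketch below is not the H5O__dtype_decode_helper() code; `demo_array_nelem` and its signature are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: multiplies ndims dimension sizes into *nelem_out.
 * Returns 0 on success, or -1 if the running product would exceed
 * SIZE_MAX, in which case *nelem_out is left unchanged. */
int demo_array_nelem(const size_t *dims, unsigned ndims, size_t *nelem_out)
{
    size_t   nelem = 1;
    unsigned u;

    assert(dims != NULL && nelem_out != NULL);

    for (u = 0; u < ndims; u++) {
        /* Check before multiplying: nelem * dims[u] fits in size_t
         * exactly when nelem <= SIZE_MAX / dims[u]. */
        if (dims[u] != 0 && nelem > SIZE_MAX / dims[u])
            return -1;
        nelem *= dims[u];
    }

    *nelem_out = nelem;
    return 0;
}
```

Testing inside the loop matters because unsigned multiplication wraps silently; a check on the final product alone cannot tell a wrapped value from a legitimate one.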
* Allow setting HDF5_INSTALL_JNI_LIB_DIR to specify install location for the JNI shared library
* No library versioning for Java JNI
…FIELD_LEN public
- H5TBget_field_info: replace unbounded strcpy with bounds-checked memcpy guarded by HLTB_MAX_FIELD_LEN; names >= 255 chars truncated safely (CWE-120)
- Move HLTB_MAX_FIELD_LEN from H5TBprivate.h to H5TBpublic.h with docs so callers can correctly size their field_names[] buffers
- test_table.c: add backward-compat (32-byte buffer) and boundary-length (253/254/255/1000-char names) tests for H5TBget_field_info
- CMakeTests.cmake: add test_boundary.h5 to cleanup list
- CHANGELOG.md: add H5TBget_field_info security fix and HLTB_MAX_FIELD_LEN entries; HDFGroup#6140's H5DSis_scale buf=NULL entry is superseded by our broader cleanup already on this branch
SAFE project work.
Important
Fixes memory safety vulnerabilities in HDF5 codebase, addressing double-free, use-after-free, and buffer overflow issues across multiple files.
- H5DS.c: Fix double-free in H5DSis_scale() by setting buf to NULL after free and adding NULL check in cleanup.
- H5LT.c: Set myinput to NULL after free in H5LTtext_to_dtype(), add NULL check in realloc_and_append(), refactor H5LT_append_dtype_super_text() to reduce code duplication.
- H5TB.c: Replace strcpy() with strncpy() in H5TBget_field_info() to prevent buffer overflow.
- H5TBpublic.h: Document buffer size requirements for field_names parameter.
- H5FDstdio.c: Use file->fp consistently in H5FD_stdio_open() for error paths.
- H5VLnative.c: Add assert checks for obj and file parameters in H5VL_native_get_file_struct().
This description was created for 7b22833. It will automatically update as commits are pushed.