
Add scientific artifact hosting governance module #113

Open
taherdhanera wants to merge 2 commits into SCIBASE-AI:main from taherdhanera:bounty-fair-artifact-hosting-14

Conversation

@taherdhanera

@taherdhanera taherdhanera commented May 14, 2026

/claim #14

Summary

  • Adds a self-contained scientific-artifact-hosting-governance module for Scientific Data & Code Hosting.
  • Builds deterministic artifact manifests with type detection, SHA-256 content hashes, versions, access status, preview descriptors, and metadata completeness checks.
  • Adds version/diff summaries, executable environment readiness checks, FAIR readiness scoring, JSON-LD/schema.org export payloads, DataCite export payloads, and an audit hash.
  • Includes synthetic sample data, a CLI demo, tests, requirement mapping, and privacy-safe visual demo artifacts.
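As a rough illustration of the manifest bullet above, here is a minimal sketch of a deterministic manifest builder using only Node.js built-ins. The function names (`sha256Hex`, `detectType`, `buildManifest`), the extension table, and the entry fields are assumptions made for this example and are not taken from the module's actual code.

```javascript
// Hypothetical sketch: names and fields are illustrative, not the PR's API.
const crypto = require('node:crypto');
const path = require('node:path');

// Assumed extension-to-type table, for illustration only.
const TYPE_BY_EXTENSION = {
  '.csv': 'dataset',
  '.json': 'dataset',
  '.ipynb': 'notebook',
  '.py': 'code',
  '.js': 'code',
  '.pdf': 'document',
};

// SHA-256 content hash as a lowercase hex string.
function sha256Hex(content) {
  return crypto.createHash('sha256').update(content).digest('hex');
}

// Classify an artifact by file extension, falling back to 'unknown'.
function detectType(fileName) {
  const ext = path.extname(fileName).toLowerCase();
  return TYPE_BY_EXTENSION[ext] || 'unknown';
}

// Build a manifest from in-memory artifacts: [{ path, content, version, access }].
// Entries are sorted by path so the same input set always yields the same manifest.
function buildManifest(artifacts) {
  const entries = artifacts
    .map((a) => ({
      path: a.path,
      type: detectType(a.path),
      sha256: sha256Hex(a.content),
      bytes: Buffer.byteLength(a.content),
      version: a.version,
      access: a.access,
    }))
    .sort((x, y) => (x.path < y.path ? -1 : x.path > y.path ? 1 : 0));
  return { entries };
}

// Synthetic sample input, no real files involved.
const manifest = buildManifest([
  { path: 'results/table.csv', content: 'a,b\n1,2\n', version: '1.0.0', access: 'open' },
  { path: 'analysis/run.py', content: 'print("hi")\n', version: '1.0.0', access: 'open' },
]);
console.log(JSON.stringify(manifest, null, 2));
```

Sorting by path and omitting timestamps keeps the output stable across runs, which is one plausible reading of "deterministic" here.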

Demo

  • Visual demo artifact: scientific-artifact-hosting-governance/docs/demo.svg
  • Video demo artifact: scientific-artifact-hosting-governance/docs/demo.webm
  • CLI demo: node scientific-artifact-hosting-governance/demo.js
  • The demo uses synthetic data only and requires no credentials, live files, package installs, or external services.

Validation

  • node scientific-artifact-hosting-governance/test.js -> passed
  • node scientific-artifact-hosting-governance/demo.js -> produced reviewer-ready JSON with manifest, version diffs, FAIR exports, runtime readiness, and audit hash
  • git diff --check -> passed

Notes

  • Dependency-free Node.js implementation using only built-in modules.
  • No credentials, live research data, cloud storage calls, or external APIs required.
  • Scope is the broad artifact-hosting governance baseline; adjacent preview, retention, model-card, license, redaction, SBOM, and upload-checkpoint slices do not replace this layer.
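For reviewers who want a feel for how an audit hash can be produced with built-in modules only, the following is a minimal sketch. It assumes the hash is a SHA-256 digest over a canonical JSON serialization of the report; `canonicalize` and `auditHash` are hypothetical names, and the module's real approach may differ.

```javascript
// Hypothetical sketch: an audit hash over canonical JSON, built-ins only.
const crypto = require('node:crypto');

// Serialize with object keys sorted recursively so key order cannot change the hash.
function canonicalize(value) {
  if (value === undefined) return 'null';
  if (Array.isArray(value)) {
    return '[' + value.map(canonicalize).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const keys = Object.keys(value)
      .filter((k) => value[k] !== undefined)
      .sort();
    return '{' + keys.map((k) => JSON.stringify(k) + ':' + canonicalize(value[k])).join(',') + '}';
  }
  return JSON.stringify(value);
}

// SHA-256 digest of the canonical form, as lowercase hex.
function auditHash(report) {
  return crypto.createHash('sha256').update(canonicalize(report), 'utf8').digest('hex');
}

// Two reports with the same content but different key order.
const hashA = auditHash({ manifest: { entries: [] }, fairScore: 0.75 });
const hashB = auditHash({ fairScore: 0.75, manifest: { entries: [] } });
console.log(hashA === hashB); // true: key order does not affect the hash
```

Hashing a canonical form rather than raw `JSON.stringify` output means two reports with identical content always produce the same digest, regardless of how the objects were assembled.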

Current status - 2026-05-29

Verified after newer same-issue #14 activity: this PR remains open, non-draft, CLEAN/mergeable, and distinct from KoiosSG PR #410. PR #113 is the artifact-hosting governance baseline; PR #410 is a separate malware/archive quarantine slice.

@taherdhanera
Author

taherdhanera commented May 20, 2026

Review scope note for /claim #14: this PR is intended as the broad artifact-hosting governance baseline, not a single preview, retention, model-card, license, or redaction gate.

It covers deterministic artifact manifests, type detection, SHA-256 hashes, version/access state, metadata-aware previews, version diffs, executable environment readiness, FAIR readiness scoring, JSON-LD/schema.org and DataCite export payloads, audit hashing, synthetic test data, CLI/demo evidence, and the demo video linked above.

Recent #14 submissions such as license compatibility, sensitive-file redaction release gates, model-card/weight lineage, raw-instrument/notebook preview safety, preview cache/version guards, and retention/tombstone ledgers are useful adjacent slices. They do not replace this broader hosting-governance layer for arbitrary scientific artifacts.
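To make the FAIR readiness scoring and metadata completeness items in the scope above more concrete, here is a minimal sketch of one way such a score could be computed. The rubric, the field names (`identifier`, `title`, `access`, `format`, `license`, `creators`), and the `fairReadiness` function are assumptions for illustration. The identifier value is a synthetic placeholder.

```javascript
// Hypothetical rubric: the checks and field names below are assumptions.
const FAIR_CHECKS = {
  findable: (m) => Boolean(m.identifier) && Boolean(m.title),
  accessible: (m) => ['open', 'restricted', 'embargoed'].includes(m.access),
  interoperable: (m) => Boolean(m.format),
  reusable: (m) => Boolean(m.license) && Array.isArray(m.creators) && m.creators.length > 0,
};

// Run every check and report the fraction passed plus the failing principles.
function fairReadiness(metadata) {
  const results = {};
  let passed = 0;
  for (const [name, check] of Object.entries(FAIR_CHECKS)) {
    results[name] = check(metadata);
    if (results[name]) passed += 1;
  }
  const total = Object.keys(FAIR_CHECKS).length;
  const missing = Object.keys(results).filter((name) => !results[name]);
  return { score: passed / total, results, missing };
}

// Synthetic metadata with the license intentionally omitted.
const report = fairReadiness({
  identifier: 'doi:10.1234/example',
  title: 'Synthetic assay results',
  access: 'open',
  format: 'text/csv',
  creators: [{ name: 'Example Author' }],
});
console.log(report.score, report.missing); // score is 0.75; 'reusable' is missing
```

Returning the list of failing principles alongside the score gives a reviewer something actionable rather than a bare number.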

@taherdhanera
Author

Review-ready status for /claim #14: this PR remains open, mergeable, and has no known blocker from my side. Scope is the broad scientific artifact-hosting governance baseline: deterministic manifests, type detection, SHA-256 hashes, version/access state, metadata-aware previews, version diffs, executable-environment readiness, FAIR readiness scoring, JSON-LD/schema.org and DataCite export payloads, audit hashing, tests, and demo evidence. Recent #14 submissions are adjacent slices; this PR remains focused on the hosting-governance layer.

@taherdhanera
Author

Reviewer-ready checkpoint for /claim #14. This PR remains open, non-draft, mergeable/CLEAN, Bounty claim labeled, and the body contains /claim #14. Scope stays on the broad artifact-hosting governance baseline: deterministic manifests, SHA-256 hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

@taherdhanera
Author

Visibility update after PR #410: this existing /claim #14 remains open, non-draft, CLEAN, bounty-labeled, and claim-marked.

Scope remains the scientific artifact hosting governance baseline, separate from the newer malware/archive-bomb quarantine slice. PR #113 covers deterministic manifests, SHA-256 artifact hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

I do not see a contributor-side blocker for review/reward decision on this PR.

@taherdhanera
Author

Status refresh after the latest issue #14 external claim follow-up: PR #113 remains open, non-draft, mergeable, bounty-labeled, and claim-marked for the existing #14 submission. No code changes are needed from my side unless reviewers request revisions.

@taherdhanera
Author

Status refresh after newer same-issue #14 activity.

Re-verified now: this PR is open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

This is distinct from the newer malware/archive-bomb quarantine and path-traversal hardening slice. No implementation changes are needed from my side unless reviewers request revisions.

@taherdhanera
Author

Status refresh after newer same-issue #14 activity.

Re-verified now: this PR is open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

This is distinct from the newer chunk manifest assembly and malware/archive-bomb quarantine slices. No implementation changes are needed from my side unless reviewers request revisions.

@taherdhanera
Author

Status refresh after newer same-issue #14 PR activity, including PR #305. Re-verified now: this PR remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence. This is distinct from dataset schema evolution, chunk manifest assembly, malware/archive-bomb quarantine, and mirror/replica consistency slices.

@taherdhanera
Author

Status refresh after the newer same-issue #14 attempt about artifact embargo access-expiry/takedown: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

This is distinct from artifact embargo/access-expiry/takedown checks, dataset schema evolution, chunk manifest assembly, malware/archive-bomb quarantine, mirror/replica consistency, column sensitivity release gates, and other #14 hosting slices.

@taherdhanera
Author

Status refresh after the newer same-issue PR #446 activity: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

This is distinct from PR #446's artifact embargo access-expiry/takedown guard, dataset schema evolution, chunk manifest assembly, malware/archive-bomb quarantine, mirror/replica consistency, column sensitivity release gates, and other #14 hosting slices.

@taherdhanera
Author

Status refresh after the newer same-issue PR #410 hardening pass: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR scoring, JSON-LD/schema.org and DataCite exports, audit hashing, tests, and demo evidence.

This is distinct from PR #410's artifact malware/archive-bomb quarantine guard and its hardening pass for future-dated scan evidence. It is also distinct from artifact embargo/access-expiry/takedown checks, dataset schema evolution, chunk manifest assembly, mirror/replica consistency, column sensitivity release gates, and other #14 hosting slices. No contributor-side code changes are pending unless reviewers request revisions.

@taherdhanera
Author

Status refresh after the newer same-issue PR #452 activity: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact hosting governance baseline: deterministic manifests, artifact type detection, SHA-256 content hashes, version/diff summaries, access status, preview descriptors, metadata completeness checks, executable environment readiness, FAIR readiness scoring, JSON-LD/schema.org and DataCite exports, and an audit hash.

PR #452 appears to add a separate artifact hosting readiness checker for upload/file/digest/storage/preview/metadata/access-control/sandbox/reproducibility/versioning checks. That is adjacent, but PR #113 is still the prior governance/export/audit baseline for this issue.
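For context on the version/diff summaries named in the scope above, a minimal sketch of comparing two manifest versions by content hash follows. `diffManifests` and the entry shape `{ path, sha256 }` are hypothetical, and the short hash strings are placeholders rather than real digests.

```javascript
// Hypothetical sketch: summarize changes between two manifest versions.
// Each entry is assumed to look like { path, sha256 }.
function diffManifests(previous, next) {
  const before = new Map(previous.map((e) => [e.path, e.sha256]));
  const after = new Map(next.map((e) => [e.path, e.sha256]));
  const added = [];
  const removed = [];
  const changed = [];
  const unchanged = [];

  // Classify every path present in the newer version.
  for (const [p, hash] of after) {
    if (!before.has(p)) added.push(p);
    else if (before.get(p) !== hash) changed.push(p);
    else unchanged.push(p);
  }
  // Anything only in the older version was removed.
  for (const p of before.keys()) {
    if (!after.has(p)) removed.push(p);
  }

  // Sort each list so the summary is stable regardless of input order.
  return {
    added: added.sort(),
    removed: removed.sort(),
    changed: changed.sort(),
    unchanged: unchanged.sort(),
  };
}

// Placeholder hashes ('aaa', 'bbb', ...) stand in for real SHA-256 digests.
const summary = diffManifests(
  [
    { path: 'data.csv', sha256: 'aaa' },
    { path: 'old.txt', sha256: 'bbb' },
    { path: 'run.py', sha256: 'ccc' },
  ],
  [
    { path: 'data.csv', sha256: 'ddd' },
    { path: 'run.py', sha256: 'ccc' },
    { path: 'README.md', sha256: 'eee' },
  ]
);
console.log(summary);
```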

@taherdhanera
Author

Status refresh after the newer same-issue PR #484 activity: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact-hosting governance baseline: deterministic artifact manifests, artifact type detection, SHA-256 content hashes, version/access state, metadata-aware previews, executable-environment readiness, FAIR readiness scoring, JSON-LD/schema.org and DataCite export payloads, audit hashing, tests, and demo evidence.

PR #484 focuses on sandboxed run egress and execution-safety controls. That is adjacent to issue #14, but PR #113 remains the prior hosting-governance, export, and audit baseline. No contributor-side code changes are pending unless reviewers request revisions.

@taherdhanera
Author

Status refresh after newer same-issue #14 activity: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact-hosting governance module: deterministic artifact manifests, type detection, SHA-256 hashes, versions, access status, preview descriptors, metadata completeness checks, version/diff summaries, executable-environment readiness, FAIR/DataCite/schema.org exports, and audit hashes. It is tied to the Pending USD 375 Algora claim: https://algora.io/claims/JNzq3wWobHjK8nad

This remains separate from the newer artifact license and metadata completeness guard attempt. No contributor-side code changes are pending unless reviewers request them.

@taherdhanera
Author

Status refresh after the newer same-issue PR #490 claim registration: PR #113 remains open, non-draft, mergeable/CLEAN, bounty-labeled, and claim-marked for issue #14.

The submitted scope remains the scientific artifact-hosting governance baseline: deterministic artifact manifests, type detection, SHA-256 hashes, versions, access status, preview descriptors, metadata completeness checks, version/diff summaries, executable-environment readiness, FAIR/DataCite/schema.org exports, and audit hashes. It is tied to the Pending USD 375 Algora claim: https://algora.io/claims/JNzq3wWobHjK8nad

PR #490 appears to add a separate artifact license metadata guard for release-compatible licenses, required metadata, persistent identifiers, checksum coverage, sensitive-use flags, and access-control mismatches. That is adjacent Scientific/Engineering Data & Code Hosting work, but PR #113 is still the prior broader artifact-hosting governance baseline for this issue.

@taherdhanera
Author

Visibility refresh after PR #410's newer hardening pass for scanner verdict normalization.

My existing issue #14 submission remains PR #113: #113

Current status re-verified now: PR #113 is open, non-draft, CLEAN/mergeable, bounty-labeled, includes /claim #14, and its Algora claim remains Pending for USD 375: https://algora.io/claims/JNzq3wWobHjK8nad

Scope reminder for review: PR #113 is the broad scientific artifact-hosting governance baseline for Scientific/Engineering Data & Code Hosting. It builds deterministic manifests with type detection, SHA-256 hashes, version/diff summaries, access/preview descriptors, executable-environment readiness checks, FAIR/DataCite/schema.org export payloads, metadata completeness checks, and reviewer-ready audit artifacts.

This remains separate from PR #410's artifact malware quarantine guard and its scanner verdict normalization, ZIP Slip/path traversal, stale-scan, macro/model/notebook quarantine, and malware-specific release hold logic. No contributor-side changes are pending unless maintainers request revisions.
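To illustrate the schema.org export payloads mentioned in the scope reminder above, here is a minimal sketch of mapping artifact metadata to a JSON-LD `Dataset` object. The input shape and the `toSchemaOrgDataset` function are assumptions for this example. The schema.org terms used (`Dataset`, `Person`, `DataDownload`, `encodingFormat`, `contentSize`) are standard vocabulary, but the module's actual payload may include different fields.

```javascript
// Hypothetical sketch: map assumed metadata fields to a schema.org Dataset.
function toSchemaOrgDataset(meta) {
  return {
    '@context': 'https://schema.org',
    '@type': 'Dataset',
    name: meta.title,
    identifier: meta.identifier,
    version: meta.version,
    license: meta.license,
    creator: meta.creators.map((c) => ({ '@type': 'Person', name: c.name })),
    // One DataDownload per hosted file.
    distribution: meta.files.map((f) => ({
      '@type': 'DataDownload',
      name: f.path,
      encodingFormat: f.format,
      contentSize: `${f.bytes} B`,
    })),
  };
}

// Synthetic metadata; the identifier and license URL are placeholders.
const jsonLd = toSchemaOrgDataset({
  title: 'Synthetic assay results',
  identifier: 'doi:10.1234/example',
  version: '1.0.0',
  license: 'https://creativecommons.org/licenses/by/4.0/',
  creators: [{ name: 'Example Author' }],
  files: [{ path: 'results/table.csv', format: 'text/csv', bytes: 8 }],
});
console.log(JSON.stringify(jsonLd, null, 2));
```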

@taherdhanera
Author

Status refresh after newer same-issue #14 PR activity, including KoiosSG PR #410.

This PR #113 remains open, non-draft, CLEAN/mergeable, and claims #14. It is the broad scientific artifact-hosting governance baseline: manifest/version/diff handling, executable environment readiness, FAIR/schema.org/DataCite export readiness, metadata completeness, and deterministic audit evidence.

Non-overlap: PR #410 focuses on malware/archive quarantine. That is a separate release-safety slice and does not replace this manifest/governance baseline. No contributor-side changes are pending unless maintainers request revisions.

@taherdhanera
Author

Visibility refresh after KoiosSG updated same-issue #14 PR #410 later than my last status.

This PR #113 remains open, non-draft, CLEAN/mergeable, bounty-labeled, and claim-marked for issue #14. It is tied to the Pending USD 375 Algora claim: https://algora.io/claims/JNzq3wWobHjK8nad

Scope remains distinct: PR #113 is the broad scientific artifact-hosting governance baseline covering deterministic manifests, type detection, SHA-256 hashes, version/diff summaries, access and preview descriptors, executable-environment readiness, FAIR/DataCite/schema.org exports, metadata completeness, and reviewer-ready audit evidence. PR #410 remains a separate malware/archive quarantine and traversal hardening slice. No contributor-side code changes are pending unless maintainers request revisions.

@taherdhanera
Author

Visibility refresh after KoiosSG updated same-issue #14 PR #410 again with blank archive-link target hardening.

This PR #113 remains open, non-draft, MERGEABLE, bounty-labeled, and claim-marked for issue #14. It is tied to the Pending USD 375 Algora claim: https://algora.io/claims/JNzq3wWobHjK8nad

Scope remains distinct: broad artifact-hosting governance across deterministic manifests, type detection, SHA-256 hashes, version/diff summaries, access/preview descriptors, executable-environment readiness, FAIR/DataCite/schema.org exports, metadata completeness, and reviewer-ready audit evidence. PR #410 remains a separate malware/archive quarantine and traversal/link-target hardening slice. No contributor-side changes are pending unless maintainers request revisions.

@taherdhanera
Author

Visibility refresh after newer same-issue #14 PR activity from @codeaustral-oss / PR #305.

PR #113 remains open, non-draft, CLEAN, bounty-labeled, and claim-marked for issue #14. Its Algora claim remains Pending for USD 375: https://algora.io/claims/JNzq3wWobHjK8nad

Scope remains the scientific artifact hosting governance module. It is separate from PR #305's dataset schema evolution gate. No contributor-side changes are pending unless maintainers request revisions.

@taherdhanera
Author

Visibility refresh after KoiosSG updated same-issue #14 PR #410 with malformed archive entry manifest hardening: #410 (comment)

My existing issue #14 submission remains PR #113: #113

Current status re-verified now:

Scope remains distinct: PR #113 is the broad scientific artifact-hosting governance baseline covering deterministic manifests, type detection, SHA-256 hashes, version/diff summaries, access/preview descriptors, executable-environment readiness, FAIR/DataCite/schema.org exports, metadata completeness, and reviewer-ready audit evidence. PR #410 remains a separate malware/archive quarantine and malformed-archive-manifest hardening slice. No contributor-side changes are pending unless maintainers request revisions.

@taherdhanera
Author

PR-side visibility refresh after newer same-issue #14 activity from @AlonePenguin:

This active #14 submission remains PR #113.

Current status re-verified now:

Scope remains the broad scientific artifact-hosting governance baseline: deterministic manifests, type detection, SHA-256 hashes, version/diff summaries, access/preview descriptors, executable-environment readiness, FAIR/DataCite/schema.org exports, metadata completeness, and reviewer-ready audit evidence. The newer supplementary media accessibility preview guard is a separate preview/accessibility slice. No contributor-side changes are pending for PR #113 unless maintainers request revisions.

@taherdhanera
Author

PR-side visibility refresh after newer same-issue #14 PR activity from @Jorel97:

This PR #113 remains my active #14 reward submission. Current status re-verified now: open, non-draft, mergeable clean, /claim #14 present, and Algora Pending for USD 375: https://algora.io/claims/JNzq3wWobHjK8nad

@taherdhanera
Author

PR-side visibility refresh after newer same-issue #14 PR activity from codeaustral-oss PR #305 and KoiosSG PR #410:

This active #14 submission remains PR #113. Current status re-verified now: open, non-draft, mergeable clean, bounty claim label present, /claim #14 present, and Algora Pending for USD 375: https://algora.io/claims/JNzq3wWobHjK8nad

Scope remains the scientific artifact hosting governance module, distinct from the newer same-issue PR activity listed above.
