diff --git a/README.md b/README.md index 4c0b9e4..708b691 100644 --- a/README.md +++ b/README.md @@ -1,180 +1,433 @@ # OSH OAKRIDGE BUILDNODE -This repository combines all the OSH modules and dependencies to deploy the OSH server and client for ORNL. +This repository packages the OSH server and OSCAR client deployment used for ORNL field and test systems. ## Requirements -- [Java 21.0.10+](https://www.oracle.com/java/technologies/downloads/#java21) -- [Docker engine](https://www.docker.com) -- [Oakridge Build Node Repository](https://github.com/Botts-Innovative-Research/osh-oakridge-buildnode) -- Node v22 - -## Quick Start -1. **Download the latest release** - - Go to the Releases section of the repository and download the latest compiled release archive (for example, `oscar-3.3.5.zip`). -2. **Extract the archive** - - Extract the downloaded ZIP file to a directory of your choice. -3. **Verify Docker Engine** - - Ensure that [Docker engine](https://www.docker.com) is installed and actively running on your host machine. -4. **Launch the system** - - Open a terminal or command prompt in the extracted directory and run the OS-specific launch script: - - **Windows:** Run `launch-all.bat` - - **Linux/macOS:** Run `./launch-all.sh` - - **ARM systems:** Run `./launch-all-arm.sh` - -For a complete guide covering architecture, deployment, configuration, operations, and troubleshooting, please refer to the [OSCAR System Documentation Manual](dist/documentation/OSCAR_System_Documentation_Manual_3.5.md). - -## Installation -Clone the repository and update all submodules recursively + +- Java 21 or newer +- Docker Engine or Docker Desktop, running before launch +- A packaged OSCAR release archive, or a local source checkout for build workflows +- Node v22 only when building from source + +## OSCAR 3.5.1 packaged release quick start + +This section is for operators using the **prebuilt OSCAR 3.5.1 release ZIP**. + +### 1. 
Verify required dependencies + +Windows PowerShell: + +```powershell +java -version +docker version +``` + +Linux: + +```bash +java -version +docker version +``` + +Use **Java 21 or newer**. The launch scripts validate dependencies and will stop early if Java or Docker is missing or too old. + +### 2. Extract the release archive to a fresh directory + +Extract the downloaded ZIP to a fresh working directory. + +Do not launch a new release on top of an older extracted directory. Reusing an old directory can leave behind monitor state, runtime data, logs, or generated config that makes troubleshooting harder. + +### 3. Create the runtime environment file + +For packaged releases, use the environment file that ships with the archive: + +- if the package includes **env.txt**, rename it to **.env** +- if the package includes **env.template**, copy it to **.env** + +Windows PowerShell: + +```powershell +Copy-Item .\env.template .\.env +``` + +Linux: + +```bash +cp env.template .env +``` + +Edit `.env` before first launch and at minimum confirm: + +- `SYSTEM_PROFILE` +- `DB_NAME` +- `DB_USER` +- `DB_PASSWORD` +- `DB_PORT` +- `CONTAINER_NAME` + +Useful optional settings include: + +- `FORCE_RESTART=1` to replace an already-running OSCAR instance +- `ATTACH_TO_EXISTING=1` to monitor an already-running OSCAR instance +- `MAX_WAIT_SECONDS=300` +- `RETRY_MAX=120` +- `RETRY_INTERVAL=2` +- `POSTGIS_READY_DELAY=5` + +### 4. Preferred production start: use `launch-all` sessionless + +For routine production use, prefer the top-level **`launch-all`** scripts and run them **sessionless by default**. This avoids depending on an open SSH session, an RDP window, or a console that might be closed later. + +`launch-all` is the preferred production path because it starts PostGIS and OSCAR with the selected `.env` profile without the extra monitor loop, recurring snapshots, JFR checks, thread dumps, database trend files, and monitor-directory logging. 
This keeps routine startup simpler and avoids collecting detailed profile data when operators do not need an in-depth system profile. + +Prefer these **top-level launchers** over calling `osh-node-oscar/launch.(sh|bat)` directly unless you are debugging the node itself. + +#### Windows production start + +Interactive: + +```bat +launch-all.bat +``` + +Sessionless from PowerShell: + +```powershell +Start-Process cmd.exe ` + -ArgumentList '/c', 'launch-all.bat > launch.out 2>&1' ` + -WorkingDirectory $PWD ` + -WindowStyle Hidden +``` + +#### Linux production start + +Interactive: + +```bash +./launch-all.sh +``` + +Sessionless: + +```bash +nohup ./launch-all.sh > launch.out 2>&1 & +``` + +#### Production auto-start after reboot + +For production systems, configure the machine to start OSCAR automatically after restart. + +##### Windows Task Scheduler + +Create a scheduled task that: + +- runs **whether the user is logged on or not** +- triggers **at startup** +- starts in the extracted OSCAR directory +- launches `launch-all.bat` +- uses a small startup delay if Docker Desktop needs time to initialize +- restarts the task on failure + +A practical action is: + +```text +Program/script: powershell.exe +Arguments: -NoProfile -ExecutionPolicy Bypass -Command "Set-Location 'C:\path\to\oscar-3.5.1'; cmd /c launch-all.bat >> launch.out 2>&1" +``` + +If using Docker Desktop on Windows, make sure Docker Desktop itself is configured to start with Windows before relying on the scheduled OSCAR start. + +##### Linux systemd + +Use a dedicated `systemd` unit so OSCAR starts after Docker is available and restarts automatically if the service fails. 
+ +Example `/etc/systemd/system/oscar.service`: + +```ini +[Unit] +Description=OSCAR launch-all service +After=network-online.target docker.service +Wants=network-online.target docker.service + +[Service] +Type=simple +User=oscar +WorkingDirectory=/home/oscar/oscar-3.5.1 +ExecStart=/bin/bash -lc './launch-all.sh' +Restart=on-failure +RestartSec=10 + +[Install] +WantedBy=multi-user.target +``` + +Then enable it: + +```bash +sudo systemctl daemon-reload +sudo systemctl enable --now oscar.service +``` + +Also ensure Docker starts on boot: + +```bash +sudo systemctl enable docker +``` + +### 5. Validation, troubleshooting, and profiling: use `monitor-oscar` + +For testing, burn-in, side-by-side field evaluation, troubleshooting, and system profiling, start OSCAR with the monitoring wrapper instead of `launch-all`. + +#### Windows monitor start + +Preferred interactive start: + +```powershell +powershell -NoProfile -ExecutionPolicy Bypass -File .\monitor-oscar.ps1 +``` + +Preferred sessionless start: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-File',"$PWD\monitor-oscar.ps1" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\monitor.out" ` + -RedirectStandardError "$PWD\monitor.err" +``` + +If `monitor-oscar.bat` is still present in a package, treat `monitor-oscar.ps1` as the preferred Windows monitor entrypoint. 
+ +#### Linux monitor start + +```bash +chmod +x launch-all.sh osh-node-oscar/launch.sh monitor-oscar.sh check-oscar-status.sh +./monitor-oscar.sh +``` + +Linux sessionless launch: + +```bash +nohup ./monitor-oscar.sh > monitor.out 2>&1 & +``` + +This is the preferred validation and troubleshooting path because it: + +- starts PostGIS and OSCAR using the current launch scripts +- captures memory, thread, JFR, and database snapshots over time +- produces a monitor directory and status report inputs automatically +- gives operators the evidence needed to compare profiles, diagnose startup failures, and confirm that PostgreSQL sessions and JVM threads stabilize + +Once the system is validated and no in-depth profile is needed, switch routine production starts back to `launch-all`. + +### 6. Running-instance handling and duplicate monitor protection + +The launch and monitor scripts detect already-running OSCAR JVMs. + +Default behavior: + +- `launch-all` refuses to start if OSCAR is already running +- `monitor-oscar` refuses to start if OSCAR is already running +- `monitor-oscar` also refuses to start if another `monitor-oscar` wrapper is already active + +Optional behaviors: + +- set `FORCE_RESTART=1` to stop the running OSCAR instance and start fresh +- set `ATTACH_TO_EXISTING=1` when using `monitor-oscar` to monitor the running instance instead of replacing it + +When using `nohup`, Task Scheduler, `Start-Process`, or another sessionless strategy, check these files after launch: + +- `monitor.last-status` +- `monitor.last-error` +- `monitor.out` +- `monitor.err` on Windows PowerShell launches that redirect stderr separately + +If a second monitor start is refused, `monitor.last-status` records a clear failure such as `FAILED duplicate_monitor ...` so the operator can tell why the wrapper exited without staying attached to the terminal. + +### 7. 
Reset and full cleanup guidance + +If you were previously running OSCAR, stop the old deployment before extracting and launching a new copy. + +#### Normal stop + +Windows: + +```bat +stop-all.bat +``` + +Linux: + +```bash +./stop-all.sh +``` + +Also verify no old OSCAR JVM is still running. + +Windows PowerShell: + +```powershell +Get-CimInstance Win32_Process | + Where-Object { + $_.Name -match '^java(\.exe)?$' -and + $_.CommandLine -like '*com.botts.impl.security.SensorHubWrapper*' + } | + Select-Object ProcessId, CommandLine +``` + +Linux: + +```bash +pgrep -af 'com.botts.impl.security.SensorHubWrapper' +``` + +#### If `reset-all` was run but old lanes still appear + +If a user runs the reset script while a monitor wrapper is still active, the monitor can restart OSCAR and old lanes can appear again. In that case, do a full cleanup: + +1. run `stop-all` first so the monitor wrapper and OSCAR JVM are both stopped +2. confirm no `monitor-oscar` wrapper and no `SensorHubWrapper` JVM are still running +3. delete the extracted release directory +4. re-extract the ZIP +5. recreate `.env` +6. relaunch using the preferred sessionless method + +Linux example: + +```bash +./stop-all.sh +cd .. +sudo rm -r oscar-3.5.1 +unzip oscar-3.5.1.zip +cd oscar-3.5.1 +cp env.template .env +nohup ./launch-all.sh > launch.out 2>&1 & +``` + +On Linux, removing the extracted release directory may require `sudo` depending on how files were created during previous runs. + +### 8. Generate a status report after startup + +After the system has been up long enough to settle, generate a one-file report. + +Windows PowerShell: + +```powershell +powershell -ExecutionPolicy Bypass -File .\check-oscar-status.ps1 +``` + +Linux: + +```bash +./check-oscar-status.sh +``` + +### 9. Admin access + +The admin username is typically **admin**. Do **not** assume the packaged password is always `admin`. 
+ +For packaged releases, the initial password should be managed through the packaged secret file or environment-driven password initialization flow. Verify the package contents, then change the password before production use. + +## Building from source + +Clone the repository and update all submodules recursively: ```bash git clone git@github.com:Botts-Innovative-Research/osh-oakridge-buildnode.git --recursive ``` -If you've already cloned without `--recursive`, run: + +If you already cloned without `--recursive`, run: + ```bash cd path/to/osh-oakridge-buildnode git submodule update --init --recursive ``` -## Build + +## Build + Navigate to the project directory: ```bash cd path/to/osh-oakridge-buildnode ``` -Run the build script (macOS/Linux): +Run the build script. + +Linux/macOS: ```bash ./build-all.sh ``` -Run the build script (Windows): - -```bash -./build-all.bat -``` - -After the build completes, it can be located in `build/distributions/` - -## Deploy and Start OSH Node -1. Unzip the distribution using the command line or File Explorer: - - Option 1: Command Line - ```bash - unzip build/distributions/osh-node-oscar-1.0.zip - cd osh-node-oscar-1.0/osh-node-oscar-1.0 - ``` - ```bash - tar -xf build/distributions/osh-node-oscar-1.0.zip - cd osh-node-oscar-1.0/osh-node-oscar-1.0 - ``` - Option 2: Use File Explorer - 1. Navigate to `path/to/osh-oakridge-buildnode/build/distributions/` - 2. Right-click `osh-node-oscar-1.0.zip`. - 3. Select **Extract All..** - 4. Choose your destination, (or leave the default) and extract. -1. Launch the OSH node: - Run the launch script, "launch.sh" for linux/mac and "launch.bat" for windows. -2. Access the OSH Node -- Remote: **[ip-address]:8282/sensorhub/admin** -- Locally: **http://localhost:8282/sensorhub/admin** - -The default credentials to access the OSH Node are admin:admin. This can be changed in the Security section of the admin page. 
- -For documentation on configuring a Lane System on the OSH Admin panel, please refer to the OSCAR Documentation provided in the Google Drive documentation folder. - -## Deploy the Client -After configuring the Lanes on the OSH Admin Panel, you can navigate to the Clients endpoint: -- Remote: **[ip-address]:8282** -- Local: **http://localhost:8282/** - -For documentation on configuring a server on the OSCAR Client refer to the OSCAR Documentation provided in the Google Drive documentation folder. - -# Releasing a New Version - -## Release Checklist -Before releasing, ensure the following on the `dev` branch: -1. Update `version` in `build.gradle` to match the release version (e.g. `"3.2.0"`) -2. Update `deploymentName` in `dist/config/standard/config.json` to `"OSCAR "` (e.g. `"OSCAR 3.2.0"`) -3. Ensure there is no `pgdata` directory in `dist/release/postgis` -4. Verify the build succeeds locally with `./build-all.sh` or `./build-all.bat` - -## Release Steps -1. **Merge `dev` into `main`:** - ```bash - git checkout main - git pull origin main - git merge dev - git push origin main - ``` - Alternatively, create a pull request from `dev` → `main` on GitHub and merge it. - -2. **Tag the release on `main`:** - ```bash - git checkout main - git pull origin main - git tag v # e.g. git tag v3.2.0 - git push origin v - ``` - -3. **The release workflow runs automatically.** It will: - - Validate that the tag is on the `main` branch - - Verify version numbers match the tag in `build.gradle` and `config.json` - - Check that `pgdata` does not exist in the release directory - - Build the project (Gradle + oscar-viewer) - - Package the source code with all submodules included - - Create a GitHub Release with the build artifact and source archive - -# PostgreSQL Configuration -There are some tweaks that can be made to the PostgreSQL configuration to make it perform better. -Below is a list of suggested configuration parameters at varying levels of maximum system RAM. 
- -`shared_buffers` - Should be around 25% of maximum RAM -`effective_cache_size` - Should be around 70-75% of maximum RAM -`work_mem` - 16MB to 64MB. Depends on maximum system memory and size of the load -`maintenance_work_mem` - 512MB to 2GB. Depends on the load, but it's OK to try high numbers - -# Secure Node Over TLS (HTTPS) -In order to secure the OSH node over TLS, you must generate a Java keystore with an SSL certificate. - -Below is the command to generate a keystore with a self-signed certificate. - -`keytool -genkeypair -alias -keyalg RSA -keysize 2048 -validity -keystore .jks -storepass -keypass -dname "CN=, OU=, O=, L=, ST=, C=" -ext "SAN="` - -Then, in your OSH config (`config.json`), or in the Admin Panel under `Network` -> `HTTP Server`, you must specify the key store path, password, key alias, and HTTPS port. - -An example of the `config.json`'s HTTP Server config is shown below: - -```json -{ - "objClass": "org.sensorhub.impl.service.HttpServerConfig", - "httpPort": 8282, - "httpsPort": 8443, - "servletsRootUrl": "/sensorhub", - "authMethod": "BASIC", - "keyStorePath": "osh-keystore.jks", - "keyStorePassword": "changeit", - "keyAlias": "oscar-key", - "trustStorePath": ".keystore/ssl_trust", - "enableCORS": true, - "id": "5cb05c9c-9e08-4fa1-8731-ffaa5846bdc1", - "autoStart": true, - "moduleClass": "org.sensorhub.impl.service.HttpServer", - "name": "HTTP Server" -} -``` - -You can also edit this information in the OSH launch scripts at `osh-node-oscar/launch.(sh|bat)` - -```shell -java -Xms6g -Xmx6g -Xss256k -XX:ReservedCodeCacheSize=512m -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError \ - -Dlogback.configurationFile=./logback.xml \ - -cp "lib/*" \ - -Djava.system.class.loader="org.sensorhub.utils.NativeClassLoader" \ - -Djavax.net.ssl.keyStore="./osh-keystore.jks" \ - -Djavax.net.ssl.keyStorePassword="changeit" \ - -Djavax.net.ssl.trustStore="$SCRIPT_DIR/trustStore.jks" \ - -Djavax.net.ssl.trustStorePassword="changeit" \ - 
-Djava.library.path="./nativelibs" \ - com.botts.impl.security.SensorHubWrapper ./config.json ./db - -``` \ No newline at end of file +Windows: + +```bat +build-all.bat +``` + +After the build completes, the output is written under `build/distributions/`. + +## Source-tree deployment + +If you are testing from a source checkout instead of a packaged release: + +1. create `.env` from `env.template` +2. verify Java 21 and Docker +3. launch with `monitor-oscar` for first-run validation, troubleshooting, or profiling +4. use `check-oscar-status` after the monitored system reaches steady state +5. switch routine production starts to `launch-all` once validation is complete +6. use sessionless launch methods for unattended systems instead of relying on an open terminal + +## MediaMTX for larger camera deployments + +For larger camera counts, place **MediaMTX** in front of the RTSP sources and point OSCAR at the local MediaMTX proxy paths. + +This reduces load on the Java backend because OSCAR no longer has to open, maintain, and recover every remote camera connection directly. MediaMTX absorbs much of the connection churn, buffering, and stream fan-out work, while OSCAR reads from fewer, more stable local proxy endpoints. + +Use `monitor-oscar` while validating the camera profile, then return to `launch-all` for routine production operation. Keep the MediaMTX deployment simple and focused on camera proxying. See `dist/documentation/MediaMTX_OSCAR_camera_proxy_guide.md` for the full guide. + +## PostgreSQL tuning + +The packaged launch scripts size PostgreSQL by `SYSTEM_PROFILE`. 
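As a rough illustration of profile-based sizing, the sketch below maps a `SYSTEM_PROFILE` value to the representative `max_connections` values listed in this section and prints the related settings as `-c name=value` server arguments. It is not the packaged launch scripts' actual code: the function names `pg_max_connections` and `pg_tuning_args` are hypothetical, and mapping "connection and disconnection logging" to `log_connections` and `log_disconnections` is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not taken from the packaged launchers): map a
# SYSTEM_PROFILE name to the representative max_connections value.
pg_max_connections() {
  case "$1" in
    RPI4) echo 75 ;;
    8GB)  echo 125 ;;
    16GB) echo 200 ;;
    32GB) echo 300 ;;
    *)    echo "unknown SYSTEM_PROFILE: $1" >&2; return 1 ;;
  esac
}

# Print the tuning settings as alternating lines of "-c" and "name=value".
# The two logging parameter names are an assumption about what the
# launchers enable; the other values come from this section.
pg_tuning_args() {
  local max
  max="$(pg_max_connections "$1")" || return 1
  printf -- '-c\n%s\n' \
    "max_connections=${max}" \
    "superuser_reserved_connections=10" \
    "idle_session_timeout=600000" \
    "log_connections=on" \
    "log_disconnections=on"
}
```

Running `pg_tuning_args "$SYSTEM_PROFILE"` after sourcing `.env` gives a quick way to compare the expected settings against what `SHOW max_connections;` reports on the running database.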
+ +Representative values: + +- `RPI4` -> max_connections 75 +- `8GB` -> max_connections 125 +- `16GB` -> max_connections 200 +- `32GB` -> max_connections 300 + +The launchers also set: + +- `superuser_reserved_connections=10` +- `idle_session_timeout=600000` +- connection and disconnection logging + +## Secure node over TLS + +To secure the OSH node over TLS, generate a Java keystore with an SSL certificate. + +```text +keytool -genkeypair -alias -keyalg RSA -keysize 2048 -validity -keystore .jks -storepass -keypass -dname "CN=, OU=, O=, L=, ST=, C=" -ext "SAN=" +``` + +Then configure the keystore path, password, alias, and HTTPS port in `config.json` or in the Admin Panel under **Network -> HTTP Server**. + +## Releasing a new version + +### Release checklist + +Before releasing from `dev`: + +1. update `version` in `build.gradle` +2. update `deploymentName` in `dist/config/standard/config.json` +3. ensure `dist/release/postgis/pgdata` is not packaged +4. verify the release ZIP name matches the intended version, such as `oscar-3.5.1.zip` +5. verify the release root directory name also matches the intended version +6. verify `env.template`, release notes, README, and launch documentation all reflect the same version + +### Release steps + +1. merge `dev` into `main` +2. tag the release on `main` +3. push the release tag and allow the workflow to build and publish the release artifacts diff --git a/build-all.bat b/build-all.bat index 2faa7ed..fea7faf 100644 --- a/build-all.bat +++ b/build-all.bat @@ -1,10 +1,30 @@ @echo off +setlocal EnableExtensions -call cd web/oscar-viewer +set "PROJECT_DIR=%~dp0" +set "RELEASE_VERSION=3.5.1" +set "DIST_DIR=%PROJECT_DIR%build\distributions" +set "STANDARD_ZIP=%DIST_DIR%\oscar-%RELEASE_VERSION%.zip" -call npm install -call npm run build +pushd "%PROJECT_DIR%web\oscar-viewer" || exit /b 1 +call npm install || goto :fail +call npm run build || goto :fail +popd -call cd ..\.. 
+pushd "%PROJECT_DIR%" || exit /b 1 +call gradlew.bat build -x test -x osgi || goto :fail +popd -call gradlew build -x test -x osgi +if exist "%DIST_DIR%" ( + powershell -NoProfile -Command ^ + "$dist = '%DIST_DIR%'; $target = '%STANDARD_ZIP%'; $zip = Get-ChildItem -Path $dist -Filter *.zip | Where-Object { $_.FullName -ne $target } | Sort-Object LastWriteTime -Descending | Select-Object -First 1; if ($zip) { Copy-Item -Force $zip.FullName $target; Write-Host ('Standardized release zip: ' + $target) } elseif (Test-Path $target) { Write-Host ('Release zip already available at: ' + $target) } else { Write-Warning ('No distribution zip found under ' + $dist) }" +) else ( + echo Warning: distribution directory not found: "%DIST_DIR%" +) + +exit /b 0 + +:fail +set "EXITCODE=%ERRORLEVEL%" +popd >NUL 2>NUL +exit /b %EXITCODE% diff --git a/build-all.sh b/build-all.sh index 9d635bb..c1ca269 100755 --- a/build-all.sh +++ b/build-all.sh @@ -1,10 +1,31 @@ #!/bin/bash +set -euo pipefail -cd web/oscar-viewer || exit +PROJECT_DIR="$(cd -- "$(dirname -- "$0")" && pwd)" +RELEASE_VERSION="3.5.1" +DIST_DIR="$PROJECT_DIR/build/distributions" +STANDARD_ZIP="$DIST_DIR/oscar-${RELEASE_VERSION}.zip" +echo "Making shell scripts executable..." +find "$PROJECT_DIR" -type f -name "*.sh" -exec chmod +x {} + + +cd "$PROJECT_DIR/web/oscar-viewer" npm install npm run build -cd ../.. || exit +cd "$PROJECT_DIR" +./gradlew build -x test -x osgi -./gradlew build -x test -x osgi \ No newline at end of file +if [ -d "$DIST_DIR" ]; then + GENERATED_ZIP="$(find "$DIST_DIR" -maxdepth 1 -type f -name "*.zip" ! 
-name "oscar-${RELEASE_VERSION}.zip" -printf "%T@ %p\n" | sort -nr | awk 'NR==1 {print $2}')" + if [ -n "$GENERATED_ZIP" ] && [ -f "$GENERATED_ZIP" ]; then + cp -f "$GENERATED_ZIP" "$STANDARD_ZIP" + echo "Standardized release zip: $STANDARD_ZIP" + elif [ -f "$STANDARD_ZIP" ]; then + echo "Release zip already available at: $STANDARD_ZIP" + else + echo "Warning: no distribution zip found under $DIST_DIR" >&2 + fi +else + echo "Warning: distribution directory not found: $DIST_DIR" >&2 +fi diff --git a/build.gradle b/build.gradle index 9ed9610..2174333 100644 --- a/build.gradle +++ b/build.gradle @@ -2,7 +2,7 @@ apply from: gradle.oshCoreDir + '/common.gradle' description = '' allprojects { - version = "3.5.0" + version = "3.5.1" } subprojects { @@ -67,6 +67,11 @@ distributions{ rel { distributionBaseName = 'oscar' contents { + eachFile { + if (it.name.endsWith('.sh')) { + it.mode = 0755 + } + } // OSH NODE into("osh-node-oscar/") { from 'dist/scripts/standard' diff --git a/changelog.md b/changelog.md index 0be2957..e06e33a 100644 --- a/changelog.md +++ b/changelog.md @@ -1,5 +1,21 @@ # OSCAR Build Node Change Log All notable changes to this project will be documented in this file. + +## 3.5.1 2026-05-07 +### Added +- Added `monitor-oscar.ps1` as the preferred Windows monitoring entrypoint for sessionless launches. +- Added documented sessionless launch patterns for Windows and Linux production operation. +- Added documented auto-start guidance for Windows Task Scheduler and Linux `systemd`. + +### Changes +- Updated packaged-release documentation to make sessionless `launch-all` the default production launch path. +- Updated monitoring documentation to explain duplicate-monitor protection and where operators should check `monitor.last-status`, `monitor.last-error`, `monitor.out`, and `monitor.err`. 
+- Updated reset/redeploy guidance to explain that operators may need to stop the monitor wrapper, remove the extracted release directory, re-extract the ZIP, recreate `.env`, and relaunch sessionlessly when old runtime artifacts or stale lanes persist. +- Updated MediaMTX guidance to be more succinct and to explain that MediaMTX reduces load on the Java backend by proxying and stabilizing camera connections before OSCAR consumes them. + +### Fixes +- Improved operational guidance around already-running OSCAR backends versus already-running monitor wrappers so sessionless launches fail more clearly. + ## 3.5.0 2026-04-24 ### Changes - Updated LaneSystem README diff --git a/dist/config/standard/config.template.json b/dist/config/standard/config.template.json new file mode 100644 index 0000000..4475c30 --- /dev/null +++ b/dist/config/standard/config.template.json @@ -0,0 +1,239 @@ +[ + { + "objClass": "org.sensorhub.impl.service.HttpServerConfig", + "httpPort": 8282, + "httpsPort": 0, + "staticDocsRootUrl": "null", + "staticDocsRootDir": "null", + "servletsRootUrl": "/sensorhub", + "authMethod": "BASIC", + "keyStorePath": ".keystore/ssl_keys", + "keyAlias": "jetty", + "trustStorePath": ".keystore/ssl_trust", + "enableCORS": true, + "id": "5cb05c9c-9e08-4fa1-8731-ffaa5846bdc1", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.service.HttpServer", + "name": "HTTP Server" + }, + { + "objClass": "org.sensorhub.impl.security.BasicSecurityRealmConfig", + "users": [ + { + "objClass": "org.sensorhub.impl.security.BasicSecurityRealmConfig$UserConfig", + "id": "admin", + "name": "Administrator", + "password": "__INITIAL_ADMIN_PASSWORD__", + "roles": [ + "admin" + ], + "allow": [ + "fileserver[af72442c-1ce6-4baa-a126-ed41dda26910]" + ], + "deny": [] + }, + { + "objClass": "org.sensorhub.impl.security.BasicSecurityRealmConfig$UserConfig", + "id": "anonymous", + "name": "Anonymous User", + "password": "", + "roles": [ + "anon" + ], + "allow": [], + "deny": [] + } + ], + 
"roles": [ + { + "objClass": "org.sensorhub.impl.security.BasicSecurityRealmConfig$RoleConfig", + "id": "admin", + "allow": [ + "*" + ], + "deny": [] + }, + { + "objClass": "org.sensorhub.impl.security.BasicSecurityRealmConfig$RoleConfig", + "id": "anon", + "allow": [ + "sos[*]/get/*" + ], + "deny": [] + } + ], + "id": "bd112969-8838-4f62-8d10-1edf1baa6669", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.security.BasicSecurityRealm", + "name": "Users" + }, + { + "objClass": "org.sensorhub.ui.AdminUIConfig", + "widgetSet": "org.sensorhub.ui.SensorHubWidgetSet", + "bundleRepoUrls": [], + "customPanels": [], + "customForms": [ + { + "objClass": "org.sensorhub.ui.CustomUIConfig", + "configClass": "com.botts.impl.system.lane.config.LaneOptionsConfig", + "uiClass": "com.botts.ui.oscar.forms.LaneConfigForm" + }, + { + "objClass": "org.sensorhub.ui.CustomUIConfig", + "configClass": "org.sensorhub.api.sensor.PositionConfig.LLALocation", + "uiClass": "com.botts.ui.oscar.forms.SiteDiagramForm" + }, + { + "objClass": "org.sensorhub.ui.CustomUIConfig", + "configClass": "com.botts.impl.service.oscar.OSCARServiceConfig", + "uiClass": "com.botts.ui.oscar.forms.OSCARServiceForm" + }, + { + "objClass": "org.sensorhub.ui.CustomUIConfig", + "configClass": "com.botts.impl.service.oscar.siteinfo.SiteDiagramConfig", + "uiClass": "com.botts.ui.oscar.forms.OSCARServiceForm" + } + ], + "deploymentName": "OSCAR 3.5.0", + "enableLandingPage": false, + "id": "5cb05c9c-9123-4fa1-8731-ffaa51489678", + "autoStart": true, + "moduleClass": "org.sensorhub.ui.AdminUIModule", + "name": "Admin UI" + }, + { + "objClass": "org.sensorhub.impl.service.consys.ConSysApiServiceConfig", + "databaseID": "a445cf15-c2ab-4d92-beab-3241667a8976", + "exposedResources": { + "objClass": "org.sensorhub.impl.datastore.view.ObsSystemDatabaseViewConfig", + "sourceDatabaseId": "a445cf15-c2ab-4d92-beab-3241667a8976" + }, + "customFormats": [], + "security": { + "objClass": 
"org.sensorhub.api.security.SecurityConfig", + "enableAccessControl": false, + "requireAuth": true + }, + "enableTransactional": true, + "maxResponseLimit": 100000, + "defaultLiveTimeout": 600.0, + "uriPrefixMap": [], + "ogcCapabilitiesInfo": { + "objClass": "org.sensorhub.impl.service.ogc.OGCServiceConfig$CapabilitiesInfo", + "serviceProvider": { + "objClass": "org.vast.util.ResponsibleParty", + "voiceNumbers": [], + "faxNumbers": [], + "deliveryPoints": [], + "emails": [], + "hrefPresent": false + } + }, + "enableHttpGET": true, + "enableHttpPOST": true, + "enableSOAP": true, + "endPoint": "/api", + "id": "6697cb4a-2e99-4fee-bba6-d1202d24dea5", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.service.consys.ConSysApiService", + "name": "Connected Systems API Service" + }, + { + "objClass": "com.botts.impl.service.oscar.OSCARServiceConfig", + "videoRetentionConfig": { + "objClass": "com.botts.impl.service.oscar.video.VideoRetentionConfig", + "timeToRetention": 7, + "videoQueryPeriod": 1, + "enableFrameRetention": true, + "frameRetentionCount": 5 + }, + "nodeId": "default", + "databaseID": "a445cf15-c2ab-4d92-beab-3241667a8976", + "statsFrequencyMinutes": 60, + "webIdApiRoot": "https://full-spectrum.sandia.gov/api/v1", + "id": "09422f61-0098-4ac2-bb0e-3f42ec470524", + "autoStart": true, + "moduleClass": "com.botts.impl.service.oscar.OSCARServiceModule", + "name": "OSCAR Service Module" + }, + { + "objClass": "com.botts.impl.service.bucket.BucketServiceConfig", + "security": { + "objClass": "org.sensorhub.api.security.SecurityConfig", + "enableAccessControl": true, + "requireAuth": true + }, + "enableCORS": true, + "initialBuckets": [ + "sitemap", + "reports", + "videos", + "adjudication" + ], + "fileStoreRootDir": "files", + "endPoint": "/buckets", + "id": "51a2a980-cf7a-4e53-9a78-fa8f32e65cbb", + "autoStart": true, + "moduleClass": "com.botts.impl.service.bucket.BucketService", + "name": "Bucket Storage Service" + }, + { + "objClass": 
"org.sensorhub.impl.database.system.SystemDriverDatabaseConfig", + "dbConfig": { + "objClass": "org.sensorhub.impl.datastore.postgis.database.PostgisObsSystemDatabaseConfig", + "url": "localhost:5432", + "dbName": "gis", + "login": "postgres", + "password": "postgres", + "idProviderType": "SEQUENTIAL", + "autoCommitPeriod": 10, + "useBatch": false, + "id": "bfbd6d58-1a4a-40b4-999d-381a1489cbb5", + "autoStart": false, + "moduleClass": "org.sensorhub.impl.datastore.postgis.database.PostgisObsSystemDatabase" + }, + "systemUIDs": [ + "*" + ], + "autoPurgeConfig": [], + "minCommitPeriod": 10000, + "databaseNum": 4, + "id": "a445cf15-c2ab-4d92-beab-3241667a8976", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.database.system.SystemDriverDatabase", + "name": "PostGIS Database" + }, + { + "objClass": "com.botts.impl.service.fileserver.FileServerConfig", + "staticDocsRootUrl": "/", + "staticDocsRootDir": "web", + "securityConfig": { + "objClass": "org.sensorhub.api.security.SecurityConfig", + "enableAccessControl": true, + "requireAuth": true + }, + "id": "af72442c-1ce6-4baa-a126-ed41dda26910", + "autoStart": true, + "moduleClass": "com.botts.impl.service.fileserver.FileServer", + "name": "OSCAR Client" + }, + { + "objClass": "org.sensorhub.impl.service.hivemq.MqttServerConfig", + "configFolder": "hivemq-config", + "dataFolder": "hivemq-data", + "webSocketProxyEndpoint": "/mqtt", + "enableWebSocketProxy": true, + "requireAuth": true, + "id": "0a0d9999-5d16-434a-8bb7-e245b474ba1d", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.service.hivemq.MqttServer", + "name": "MQTT Server (HiveMQ)" + }, + { + "objClass": "org.sensorhub.impl.service.consys.mqtt.ConSysApiMqttServiceConfig", + "id": "e7c35780-dbb6-4481-9fbe-556a7e44045d", + "autoStart": true, + "moduleClass": "org.sensorhub.impl.service.consys.mqtt.ConSysApiMqttService", + "name": "Connected Systems API MQTT Extension" + } +] diff --git a/dist/documentation/MediaMTX_OSCAR_camera_proxy_guide.md 
b/dist/documentation/MediaMTX_OSCAR_camera_proxy_guide.md new file mode 100644 index 0000000..614b467 --- /dev/null +++ b/dist/documentation/MediaMTX_OSCAR_camera_proxy_guide.md @@ -0,0 +1,168 @@ +# Using MediaMTX to Reduce OSCAR Camera Stream Load + +This guide shows the recommended **MediaMTX** pattern for camera-heavy OSCAR deployments. + +The short version is simple: put MediaMTX between OSCAR and the physical cameras, point the OSCAR lane CSV at the MediaMTX RTSP paths, validate the camera profile with `monitor-oscar`, and then use `launch-all` for routine production starts. + +## Why MediaMTX helps + +Without a proxy, OSCAR opens direct RTSP sessions to each camera endpoint referenced by the lanes. At larger scale, that increases: + +- Java-side camera session setup work +- reconnect churn when cameras or networks wobble +- socket and thread churn around repeated stream handling +- load on the physical cameras when many logical lanes reuse the same feeds + +MediaMTX reduces that burden by presenting OSCAR with a smaller set of stable local RTSP paths. The proxy owns the upstream camera connections, while the Java backend talks to local endpoints instead of repeatedly reconnecting to every physical camera. + +## Recommended operating model + +- Use `monitor-oscar` during first-run validation, troubleshooting, burn-in, and camera-profile testing. +- Use `launch-all` for routine production after the camera profile is accepted. +- Keep MediaMTX on the OSCAR host or a nearby LAN host whenever possible. + +## Minimal MediaMTX configuration + +Use a lightweight RTSP-only profile unless you explicitly need other protocols. 
+ +```yaml +api: yes +apiAddress: :9997 +rtmp: no +hls: no +webrtc: no +srt: no +paths: + lane03_cam: + source: "rtsp://:@192.168.8.73/axis-media/media.amp?adjustablelivestream=1&resolution=640x480&videocodec=h264&videokeyframeinterval=15" + sourceOnDemand: yes + sourceProtocol: tcp + + lane04_cam: + source: "rtsp://:@192.168.8.229/axis-media/media.amp?adjustablelivestream=1&resolution=640x480&videocodec=h264&videokeyframeinterval=15" + sourceOnDemand: yes + sourceProtocol: tcp +``` + +### Why these settings are recommended + +- `sourceOnDemand: yes` avoids pulling upstream video when OSCAR is not using the path. +- `sourceProtocol: tcp` is usually the most predictable RTSP transport on real LANs, VPNs, and NATed paths. +- Disabling `rtmp`, `hls`, `webrtc`, and `srt` keeps MediaMTX focused on simple RTSP proxying. +- The API on port `9997` gives you a quick way to verify that the paths are present. + +## OSCAR CSV pattern + +The OSCAR CSV should point camera fields at the MediaMTX host and path, not at the physical camera. + +```csv +Name,UniqueID,AutoStart,Latitude,Longitude,RPMConfigType,RPMHost,RPMPort,AspectAddressStart,AspectAddressEnd,EMLEnabled,EMLCollimated,LaneWidth,CameraType0,CameraHost0,CameraPath0,Codec0,Username0,Password0,CameraType1,CameraHost1,CameraPath1,Codec1,Username1,Password1 +sim-0,simu-0,FALSE,35.89,-84.19,Rapiscan,192.168.8.77,1601,,,FALSE,FALSE,4.820000172,Custom,192.168.8.77:8554,/lane03_cam,,,,Custom,192.168.8.77:8554,/lane04_cam,,, +``` + +### Important fields + +- `RPMHost` and `RPMPort` still point to the SRLS, Rapiscan, or emulator service. +- `CameraType*` should be `Custom` when you want OSCAR to use the host and path directly. +- `CameraHost*` should point to the machine running MediaMTX, usually `:8554`. +- `CameraPath*` should point to the MediaMTX path, such as `/lane03_cam`. +- `Username*` and `Password*` can usually remain blank because MediaMTX authenticates to the physical camera upstream. + +## Quick setup + +### 1. 
Start MediaMTX + +Linux: + +```bash +./mediamtx mediamtx.yml +``` + +Windows PowerShell: + +```powershell +.\mediamtx.exe mediamtx.yml +``` + +### 2. Verify that MediaMTX is listening + +Linux: + +```bash +ss -ltnp | grep -E '8554|9997' +``` + +Windows PowerShell: + +```powershell +Get-NetTCPConnection -LocalPort 8554,9997 -State Listen +``` + +### 3. Verify the API + +Linux or macOS: + +```bash +curl http://127.0.0.1:9997/v3/paths/list +``` + +Windows PowerShell: + +```powershell +Invoke-RestMethod http://127.0.0.1:9997/v3/paths/list +``` + +### 4. Point OSCAR at the proxy + +Update the lane CSV so each camera host and path points to MediaMTX instead of the physical device, then upload the CSV through the **Services** tab in the OSCAR admin page. + +## Camera profile guidance + +For larger systems, keep the streams modest unless you have already validated a heavier profile. A practical starting point is: + +- H.264 +- 640x480 +- 15 fps +- CBR +- 1-second keyframe interval when possible + +Those settings reduce total decode and transport cost while still giving OSCAR useful video. + +## Fast troubleshooting + +### OSCAR cannot open the stream + +Check: + +- MediaMTX is running +- the path name matches exactly +- the lane CSV points to the correct host and path +- port `8554` is reachable from the OSCAR host + +Quick direct test: + +```bash +ffplay rtsp://127.0.0.1:8554/lane03_cam +``` + +### The path exists but does not pull video + +Check: + +- upstream camera IP, credentials, and path +- whether the camera supports the requested stream format +- whether the camera accepts RTSP over TCP + +### Reconnect churn is still high + +MediaMTX reduces Java-side reconnect pressure, but it cannot fix every upstream problem. 
Also check: + +- camera network stability +- encoder configuration +- duplicate consumers +- emulator-side or payload-side issues +- OSCAR thread and reconnect logs from the monitor directory + +## Summary + +MediaMTX reduces the burden on the OSCAR Java backend by replacing many direct camera sessions with a smaller set of local RTSP proxy paths. That usually means less reconnect churn, less socket and thread churn, easier lane reuse, and a simpler camera topology for large test or production systems. diff --git a/dist/documentation/Node_Administration_3.5.1_addendum.md b/dist/documentation/Node_Administration_3.5.1_addendum.md new file mode 100644 index 0000000..3d7c838 --- /dev/null +++ b/dist/documentation/Node_Administration_3.5.1_addendum.md @@ -0,0 +1,32 @@ +# Node Administration 3.5.1 addendum + +The existing **Node Administration** PDF remains useful for Admin UI tasks such as: + +- starting and stopping modules +- adding users and roles +- configuring sensors, storage, and SOS services + +No major Admin UI workflow changes were required for the 3.5.1 launch, monitoring, and packaging updates. 
+ +The operational changes for 3.5.1 are outside that PDF and are now covered in these updated deployment documents: + +- `Release_Notes_3.5.1.md` +- `OSCAR_launch_monitoring_guide.md` +- `MediaMTX_OSCAR_camera_proxy_guide.md` +- `OSCAR_System_Documentation_Manual_3.5.md` + +Use the PDF for Admin Panel behavior, and use the updated deployment documents for: + +- Java 21 and Docker prerequisites +- `.env` setup +- already-running OSCAR handling +- launch-mode selection: `launch-all` for efficient production, `monitor-oscar` for validation, troubleshooting, and system profiling +- sessionless production starts with `launch-all` on Linux and Windows +- automatic production startup after reboot with `systemd` on Linux or Task Scheduler on Windows +- monitoring and status scripts, including duplicate-monitor prevention +- the new Windows PowerShell monitor wrapper `monitor-oscar.ps1` +- the preferred hidden PowerShell `Start-Process` pattern for monitored Windows runs +- fresh-install cleanup of older OSCAR releases +- MediaMTX-assisted camera deployment guidance, now shortened and focused on Java-backend load reduction + +The updated monitor wrappers include single-instance protection so a second live monitor launch is refused until the first one is stopped. Use the monitor wrappers when detailed diagnostic evidence is needed; use `launch-all.sh` or `launch-all.bat` for routine production operation to avoid unnecessary monitoring logs and snapshot artifacts. diff --git a/dist/documentation/OSCAR_System_Documentation_Manual_3.5.md b/dist/documentation/OSCAR_System_Documentation_Manual_3.5.md index aff79d1..d68de18 100644 --- a/dist/documentation/OSCAR_System_Documentation_Manual_3.5.md +++ b/dist/documentation/OSCAR_System_Documentation_Manual_3.5.md @@ -86,7 +86,7 @@ The diagram below condenses the system relationships. It is not a source-code cl # 3. 
Installation and initial startup -Installation is intentionally simple: download a release archive, extract it, ensure Docker is installed and running, and launch the platform with the OS-specific launch-all script (`launch-all.bat` for Windows, `launch-all.sh` for Linux/macOS, or `launch-all-arm.sh` for ARM systems). The database and application come up together in the default deployment path. +Installation is intentionally simple: download a release archive, extract it, ensure Docker is installed and running, and launch the platform with the OS-specific `launch-all` script (`launch-all.bat` for Windows, `launch-all.sh` for Linux/macOS, or `launch-all-arm.sh` for ARM systems). `launch-all` is the preferred efficient production startup path because the database and application come up together without the extra monitoring logs and profile artifacts produced by the monitor wrapper. Use `monitor-oscar` instead for first-run validation, burn-in, troubleshooting, side-by-side evaluation, or system profiling. ## Prerequisites @@ -105,7 +105,11 @@ Installation is intentionally simple: download a release archive, extract it, en > > 2\. Install Docker and verify that the Docker service is running before starting OSCAR. > -> 3\. Run the launch-all script (`launch-all.bat`, `launch-all.sh`, or `launch-all-arm.sh`) for the operating system in use. In the default path, the script starts PostgreSQL locally in Docker and then starts the Java application. +> 3\. For efficient production operation, run the `launch-all` script (`launch-all.bat`, `launch-all.sh`, or `launch-all-arm.sh`) for the operating system in use. In the default path, the script starts PostgreSQL locally in Docker and then starts the Java application. +> +> Sessionless production is the preferred operating mode once the deployment has been validated. On Linux, a common pattern is `nohup ./launch-all.sh > launch.out 2>&1 &`. 
On Windows, the recommended hidden start pattern is a PowerShell `Start-Process` call that runs `launch-all.bat` with stdout and stderr redirected to log files. +> +> For first-run validation, troubleshooting, burn-in, side-by-side evaluation, or system profiling, run `monitor-oscar` instead. The monitor path intentionally creates additional logs, snapshots, JFR checks, thread dumps, and database trend files so operators can evaluate system behavior before or during investigation. Windows monitored runs now also have a PowerShell-first wrapper, `monitor-oscar.ps1`, which is the preferred hidden or redirected entry point for sessionless monitored launches. > > 4\. Open the application on the configured port. Port 8282 is the baseline HTTP application port, and 8443 is a representative HTTPS configuration. > @@ -127,6 +131,60 @@ Installation is intentionally simple: download a release archive, extract it, en +## Sessionless and restart-safe launch patterns + +For normal production use, `launch-all` should be treated as the default startup path, whether the system is started interactively or headlessly. + +### Linux production + +Interactive: + +```bash +./launch-all.sh +``` + +Sessionless: + +```bash +nohup ./launch-all.sh > launch.out 2>&1 & +``` + +For automatic restart after reboot, the recommended pattern is a small `systemd` unit that runs `launch-all.sh` after `docker.service` is available. 
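For sessionless starts that do not go through `systemd`, a wrapper may want to confirm that Docker answers before calling `launch-all.sh`. The following is a minimal sketch, not part of the packaged scripts: `wait_for` is a hypothetical helper name, and the attempt count and interval in the commented wiring are illustrative assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with OSCAR): retry a probe command until it
# succeeds or the attempt budget is used up.
# Usage: wait_for <max_attempts> <interval_seconds> <command> [args...]
wait_for() {
  local max_attempts="$1" interval="$2"
  shift 2
  local attempt
  for ((attempt = 1; attempt <= max_attempts; attempt++)); do
    # The probe's output is discarded; only its exit status matters.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    # Sleep only when another attempt remains.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval"
    fi
  done
  return 1
}

# Example wiring, left commented so sourcing this file has no side effects.
# The values 60 and 2 are assumptions, not documented defaults.
# wait_for 60 2 docker info && nohup ./launch-all.sh > launch.out 2>&1 &
```

`docker info` exits non-zero when the daemon is unreachable, which makes it a reasonable probe here; any other command with a meaningful exit status could be substituted.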
+ +### Windows production + +Interactive: + +```bat +launch-all.bat +``` + +Sessionless from PowerShell: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-Command',"Set-Location -LiteralPath '$PWD'; .\launch-all.bat" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\launch.out" ` + -RedirectStandardError "$PWD\launch.err" +``` + +For automatic restart after reboot, the recommended pattern is a **Task Scheduler** task with an **At startup** trigger that runs `launch-all.bat` from the extracted OSCAR directory after Docker is available. + +### Windows monitored diagnostics + +For monitored validation or troubleshooting, the preferred Windows wrapper is `monitor-oscar.ps1`. + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-File',"$PWD\monitor-oscar.ps1" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\monitor.out" ` + -RedirectStandardError "$PWD\monitor.err" +``` + +This keeps the monitoring window hidden while preserving redirected output for later review. + ## Ports and first-run behaviors | **Area** | **Typical value or behavior** | **Interpretation** | @@ -396,6 +454,16 @@ Default deployment begins to strain as lane counts, event volume, and camera cou A practical target profile for scaled deployments is 640x480 at about 5 frames per second, while 1080p may be acceptable on smaller systems. Camera settings should be chosen with total system scale in mind. For a 50-lane or 100-camera target, the aggregate cost of 1080p video is likely excessive. +## MediaMTX as a scale helper + +When many logical lanes reuse a smaller number of physical camera feeds, MediaMTX can reduce the burden on the OSCAR Java backend by replacing many direct RTSP sessions with a smaller set of stable local proxy paths. 
In practice, that lowers Java-side reconnect churn, reduces repeated camera-side socket activity, and simplifies lane CSV reuse in larger systems. + +## Launch mode selection for operations + +Use `monitor-oscar` when deployment teams need evidence: first-run validation, burn-in testing, troubleshooting, camera-profile comparisons, memory and thread profiling, or PostgreSQL session analysis. The monitor wrapper is intentionally verbose and creates monitor directories, periodic snapshots, thread dumps, JFR status checks, and database trend files. + +Use `launch-all` for efficient production operation once the system profile has been validated. This keeps routine startup simple and avoids collecting detailed monitoring artifacts when no in-depth system profile is required. In production, prefer a sessionless `launch-all` start or an automated restart-safe wrapper such as `systemd` on Linux or Task Scheduler on Windows. + ## What stays where in a split deployment When PostgreSQL is moved off the application host, videos and the other stored files remain on the application server. In practice, that means the application tier continues to own the file system while the database tier handles relational persistence. This is important for storage planning, backups, and role-based file retrieval. @@ -450,7 +518,7 @@ This section captures known gaps and enhancement ideas so they are not confused > 1\. Install Docker and verify that the service is running on the host before launch. > -> 2\. Extract the chosen OSCAR release and start it with the OS-specific launch-all script (`launch-all.bat`, `launch-all.sh`, or `launch-all-arm.sh`). +> 2\. Extract the chosen OSCAR release and start production operation with the OS-specific `launch-all` script (`launch-all.bat`, `launch-all.sh`, or `launch-all-arm.sh`). Use `monitor-oscar` instead when the goal is validation, troubleshooting, burn-in, side-by-side comparison, or system profiling. > > 3\. 
Change the initial admin password using the package-provided settings file and password-initialization script before production use. > diff --git a/dist/documentation/OSCAR_launch_monitoring_guide.md b/dist/documentation/OSCAR_launch_monitoring_guide.md new file mode 100644 index 0000000..cfbe379 --- /dev/null +++ b/dist/documentation/OSCAR_launch_monitoring_guide.md @@ -0,0 +1,761 @@ +# OSCAR Launch, Monitoring, and Status Guide + +This guide explains the purpose and use of the updated `.env`, `launch-all`, `launch`, `monitor-oscar`, and `check-oscar-status` scripts for both Linux and Windows. It covers profiles, dependencies, startup and shutdown flow, cleanup, and how to interpret the monitoring output. + +## 1. What changed and why + +The updated scripts were designed to make OSCAR easier to run, safer on smaller machines, and easier to diagnose when memory or stability problems appear. + +The main improvements are: + +- **Profile-based sizing** so Java and PostgreSQL use settings that fit the machine. +- **Safer defaults** for a 16 GB machine and other profiles. +- **Separation of responsibilities**: + - `.env` holds deployment settings. + - `launch-all` starts PostgreSQL and then launches OSCAR. + - `launch` starts the Java node with the right memory and diagnostics. + - `monitor-oscar` starts OSCAR and continuously collects diagnostic snapshots. + - `check-oscar-status` summarizes all collected data into a single report file. +- **Built-in diagnostics** such as Native Memory Tracking and JFR support. +- **Cleaner testing workflow** so you can compare runs and determine whether memory is stable or leaking. + +--- + +## 2. Which files are involved + +### Shared configuration + +- `.env` + +### Linux + +- `launch-all.sh` +- `osh-node-oscar/launch.sh` +- `monitor-oscar.sh` +- `check-oscar-status.sh` + +### Windows + +- `launch-all.bat` +- `osh-node-oscar\launch.bat` +- `monitor-oscar.bat` +- `monitor-oscar.ps1` +- `check-oscar-status.ps1` + +--- + +## 3. 
What each file does + +## `.env` + +This file is the shared configuration layer. It tells the scripts which profile to use, how to connect to PostgreSQL, and which passwords to pass into the Java process. + +Typical variables: + +```env +SYSTEM_PROFILE=16GB +DB_NAME=gis +DB_USER=postgres +DB_PASSWORD=postgres +DB_PORT=5432 +DB_HOST=localhost +CONTAINER_NAME=oscar-postgis-container +KEYSTORE_PASSWORD=CHANGE_ME +TRUSTSTORE_PASSWORD=CHANGE_ME +JAVACPP_MAX_BYTES= +JAVACPP_MAX_PHYSICAL_BYTES= +JFR_FILENAME= +``` + +Why it is useful: + +- keeps machine-specific configuration out of the launch scripts +- lets you switch between profiles without editing multiple files +- makes Linux and Windows setups consistent + +How to modify it: + +- change `SYSTEM_PROFILE` when moving to a different machine size +- change the DB settings if PostgreSQL is not local +- change the passwords to match your real keystore and truststore values +- optionally override JavaCPP or JFR paths only when needed + +--- + +## `launch-all.sh` / `launch-all.bat` + +This is the top-level launcher. It reads `.env`, starts the PostGIS container with profile-appropriate settings, waits for PostgreSQL to become ready, and then starts OSCAR by calling the node-specific `launch` script. + +Why it is useful: + +- one command starts the full stack +- PostgreSQL settings are tied to the profile +- ensures the DB is available before Java starts +- helps keep tests repeatable + +What it usually does: + +1. reads `.env` +2. maps `SYSTEM_PROFILE` to PostgreSQL settings +3. builds or starts the PostGIS container +4. waits for `pg_isready` +5. enters `osh-node-oscar` +6. 
runs `launch.sh` or `launch.bat` + +How to modify it: + +- change PostgreSQL memory settings if your workload changes +- change container name or port if you need multiple local deployments +- change image/tag names if your Docker workflow changes + +Important note: + +If you change PostgreSQL settings, you need the container to be recreated or restarted in a way that actually applies the new settings. The updated launchers were designed to make that more predictable. + +--- + +## `osh-node-oscar/launch.sh` / `osh-node-oscar/launch.bat` + +This script starts the Java node itself. It reads `.env`, maps the selected profile to Java memory settings, sets up certificates, enables diagnostics, and launches the OSCAR process. + +Why it is useful: + +- central place for Java sizing +- keeps profile logic out of the top-level launcher +- enables Native Memory Tracking so memory problems can be investigated later +- passes JavaCPP limits to help control native memory behavior + +Typical responsibilities: + +- choose `-Xms` and `-Xmx` from `SYSTEM_PROFILE` +- set `-XX:+UnlockDiagnosticVMOptions` +- set `-XX:NativeMemoryTracking=summary` +- set JavaCPP limits such as: + - `-Dorg.bytedeco.javacpp.maxBytes=...` + - `-Dorg.bytedeco.javacpp.maxPhysicalBytes=...` +- set keystore and truststore paths +- start `com.botts.impl.security.SensorHubWrapper` + +How to modify it: + +- adjust profile memory values if testing shows a machine can safely handle more or needs less +- change JavaCPP limits if native memory use is too tight or too loose +- update certificate paths if the install layout changes +- add temporary JVM flags for debugging + +What not to remove: + +- `-XX:+UnlockDiagnosticVMOptions` +- `-XX:NativeMemoryTracking=summary` + +These are required for native memory inspection with `jcmd`. + +--- + +## `monitor-oscar.sh` / `monitor-oscar.bat` / `monitor-oscar.ps1` + +This is the diagnostic runner. 
It starts OSCAR, waits for the JVM to appear, starts JFR, and collects periodic snapshots into a timestamped `oscar-monitor-*` directory. On Windows, `monitor-oscar.ps1` is the preferred PowerShell wrapper for hidden or redirected monitored launches. + +Why it is useful: + +- gives you time-series data instead of one-off guesses +- captures memory, swap or pagefile, threads, and JVM info while OSCAR is running +- makes it easy to compare startup, steady state, and failure periods +- can be used as the primary launch method when you are testing stability + +What it collects over time: + +- JVM PID and command line +- process memory details +- thread counts +- heap information +- native memory summaries +- JFR recordings +- system memory and swap or pagefile state +- launch stdout and stderr logs + +Linux snapshots commonly include: + +- `/proc/<pid>/status` +- `/proc/<pid>/smaps_rollup` +- `pmap -x <pid>` +- `free -h` +- `vmstat` +- `jcmd <pid> VM.native_memory summary` +- `jcmd <pid> GC.heap_info` +- `jcmd <pid> JFR.check` + +Windows snapshots commonly include the nearest equivalents through PowerShell, `tasklist`, `wmic` or CIM, performance counters, and `jcmd`. + +How to modify it: + +- change the snapshot interval if you want more or less detail +- change the match expression if the Java main class changes +- change the JFR size or age limits +- add extra OS-level commands if you want more counters + +--- + +## `check-oscar-status.sh` / `check-oscar-status.ps1` + +This is a reporting script. It reads the latest monitor directory and writes one report file that summarizes the current run.
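Locating the latest monitor directory can be sketched in a few lines of Bash. This is an illustration, not the logic of the shipped status script: `latest_monitor_dir` is a hypothetical function name, and the sketch assumes the directory names carry a sortable timestamp suffix such as `oscar-monitor-20260503-174333`, so that name order matches creation order.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not the shipped status script): print the newest
# oscar-monitor-* directory under a base path, or return 1 if none exists.
latest_monitor_dir() {
  local base="${1:-.}"
  # Assumption: timestamped names sort chronologically, so the last glob
  # match is the newest run. The trailing slash restricts matches to directories.
  local dirs=("$base"/oscar-monitor-*/)
  # Without a match the glob stays literal, so confirm the first entry is real.
  [ -d "${dirs[0]}" ] || return 1
  local newest="${dirs[${#dirs[@]}-1]}"
  # Drop the trailing slash added by the glob pattern.
  printf '%s\n' "${newest%/}"
}
```

A caller might use `dir="$(latest_monitor_dir .)" && ls "$dir"` to inspect the most recent run. Sorting by modification time instead of by name is an equally valid choice when directory names are not guaranteed to sort chronologically.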
+ +Why it is useful: + +- gives you one file to review or share +- compares first and latest snapshots +- shows recent trend lines +- shows whether memory is rising, flattening, or thrashing + +What it includes: + +- live process status +- live JVM state +- live JFR and NMT information +- system memory and swap or pagefile status +- first snapshot summary +- latest snapshot summary +- recent trend table +- log tails +- a quick interpretation section + +How to modify it: + +- change how many recent snapshots are included +- add custom grep or PowerShell parsing for errors you care about +- add application log searches for reconnect loops or parse failures + +--- + +## 4. Choosing the right profile + +`SYSTEM_PROFILE` is the most important setting in `.env`. + +### Recommended meanings + +- `RPI4`: very constrained system +- `8GB`: small development or reduced-workload machine +- `16GB`: reasonable full-node starting point +- `32GB`: larger system with more headroom + +### How to make sure you use the right profile + +Use the profile that matches the **actual machine memory**, not what you hope the workload can handle. + +Linux: + +```bash +free -h +``` + +Windows PowerShell: + +```powershell +Get-CimInstance Win32_ComputerSystem | Select-Object TotalPhysicalMemory +``` + +General rule: + +- use `16GB` only on a machine with about 16 GB RAM +- use `8GB` on smaller test systems +- use `32GB` only when the machine really has that headroom +- when in doubt, choose the **smaller** profile first + +### Why this matters + +The Java heap is not the only consumer of memory. OSCAR also uses: + +- native libraries +- thread stacks +- PostgreSQL +- Docker +- OS page cache +- FFmpeg or JavaCPP native memory if video is enabled + +A machine can fail even when Java heap is not full if native memory and database memory are too large. + +--- + +## 5. Recommended profile behavior + +The modified launchers use conservative sizing. 
A reasonable starting strategy is: + +- smaller `Xms` than `Xmx` +- conservative PostgreSQL settings on shared hosts +- Native Memory Tracking enabled +- JFR started by the monitor rather than by the Java launcher + +If a profile proves stable for your workload, you may increase it carefully. Do not scale memory up just because RAM exists. + +--- + +## 6. Installing dependencies + +## Linux dependencies + +You normally need: + +- Java **JDK**, not just a JRE +- Docker +- Bash +- `jcmd` (comes with the JDK) +- `pg_isready` inside the PostgreSQL container image +- optional helpers: `pmap`, `vmstat`, `free` + +### Ubuntu or Debian example + +```bash +sudo apt update +sudo apt install -y openjdk-21-jdk docker.io procps psmisc +``` + +Optional but useful: + +```bash +sudo apt install -y net-tools sysstat +``` + +### Check whether dependencies are installed on Linux + +```bash +java -version +command -v jcmd +docker --version +bash --version +free -h +vmstat 1 1 +pmap $$ | head +``` + +--- + +## Windows dependencies + +You normally need: + +- Java **JDK**, not only a JRE +- Docker Desktop or a Docker Engine setup that provides `docker` +- PowerShell for the status script +- `jcmd.exe` from the JDK + +### Check whether dependencies are installed on Windows PowerShell + +```powershell +java -version +gcm jcmd +docker --version +$PSVersionTable.PSVersion +``` + +If `gcm jcmd` reports that the command was not found, the JDK `bin` directory is probably not on `PATH`. + +Typical `jcmd.exe` location: + +```text +C:\Program Files\Java\jdk-<version>\bin\jcmd.exe +``` + +--- + +## 7. How to start the program + +Use `launch-all` for normal production after validation. Use `monitor-oscar` when you want detailed diagnostics, a monitor directory, and a one-file status report later.
+ +## Linux normal startup + +Interactive production start: + +```bash +./launch-all.sh +``` + +Preferred sessionless production start: + +```bash +nohup ./launch-all.sh > launch.out 2>&1 & +``` + +## Linux monitored startup + +Interactive monitored start: + +```bash +./monitor-oscar.sh +``` + +Preferred sessionless monitored start: + +```bash +nohup ./monitor-oscar.sh > monitor.out 2>&1 & +``` + +## Windows normal startup + +From the project root, interactive production start: + +```bat +launch-all.bat +``` + +Preferred sessionless production start from PowerShell: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-Command',"Set-Location -LiteralPath '$PWD'; .\launch-all.bat" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\launch.out" ` + -RedirectStandardError "$PWD\launch.err" +``` + +## Windows monitored startup + +Interactive monitored start from PowerShell: + +```powershell +.\monitor-oscar.ps1 +``` + +Preferred sessionless monitored launch: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-File',"$PWD\monitor-oscar.ps1" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\monitor.out" ` + -RedirectStandardError "$PWD\monitor.err" +``` + +`monitor-oscar.bat` remains available for interactive `cmd.exe` use, but `monitor-oscar.ps1` is the preferred Windows wrapper for hidden or redirected monitored runs. + +## Production auto-start after reboot + +### Linux + +Use a small `systemd` unit that runs `launch-all.sh` after Docker is available. 
+ +Example `/etc/systemd/system/oscar-launch-all.service`: + +```ini +[Unit] +Description=OSCAR production launcher +After=network-online.target docker.service +Wants=network-online.target docker.service + +[Service] +Type=oneshot +RemainAfterExit=yes +User=oscar +WorkingDirectory=/home/oscar/oscar-3.5.1 +ExecStart=/bin/bash -lc './launch-all.sh' +ExecStop=/bin/bash -lc './stop-all.sh' +TimeoutStartSec=0 + +[Install] +WantedBy=multi-user.target +``` + +Enable it: + +```bash +sudo systemctl daemon-reload +sudo systemctl enable --now oscar-launch-all.service +``` + +### Windows + +Use **Task Scheduler** with these settings: + +- Trigger: **At startup** +- Run whether user is logged on or not +- Run with highest privileges +- Action: `powershell.exe` +- Arguments: + +```text +-NoProfile -ExecutionPolicy Bypass -Command "Set-Location -LiteralPath 'C:\path\to\oscar-3.5.1'; .\launch-all.bat *> .\launch-startup.log" +``` + +Also make sure Docker Desktop or the Windows Docker service is configured to start automatically before OSCAR is expected to launch. + +## 8. How to stop the program + +## Linux + +If you started with `monitor-oscar.sh` and the script was written to stop the whole stack on signal: + +```bash +kill "$(cat monitor.pid)" +``` + +If needed, stop parts manually. Use the PID reported by `pgrep` in the `kill` command: + +```bash +pgrep -af 'com.botts.impl.security.SensorHubWrapper' +kill <pid> +docker stop oscar-postgis-container +``` + +## Windows + +If the Windows monitor script supports a stop command: + +```bat +monitor-oscar.bat stop +``` + +Otherwise stop the Java process and container manually: + +PowerShell: + +```powershell +Get-Process java | Stop-Process +docker stop oscar-postgis-container +``` + +Be careful if multiple Java processes are running on the machine, because `Get-Process java | Stop-Process` stops all of them. + +--- + +## 9.
How to check that the monitor is working + +## Linux + +```bash +pgrep -af monitor-oscar.sh +pgrep -af 'com.botts.impl.security.SensorHubWrapper' +docker ps --filter name=oscar-postgis-container +ls -td oscar-monitor-* | head -n 1 +tail -f monitor.out +cat monitor.last-status 2>/dev/null || true +cat monitor.last-error 2>/dev/null || true +``` + +## Windows PowerShell + +```powershell +Get-Process java +Get-ChildItem . -Directory oscar-monitor-* | Sort-Object LastWriteTime -Descending | Select-Object -First 1 +Get-Content .\monitor.out -Tail 50 -Wait +Get-Content .\monitor.err -Tail 50 -Wait +Get-Content .\monitor.last-status -ErrorAction SilentlyContinue +Get-Content .\monitor.last-error -ErrorAction SilentlyContinue +``` + +The singleton guard in the monitor wrappers prevents a second live monitor from starting. For sessionless launches, `monitor.last-status`, `monitor.last-error`, `monitor.out`, and `monitor.err` are the fastest way to see whether the wrapper attached successfully, refused a duplicate launch, or exited because OSCAR was already running. + +## 10. How to generate a one-file status report + +## Linux + +```bash +./check-oscar-status.sh +``` + +This produces a file like: + +```text +oscar-status-20260504-000101.txt +``` + +## Windows + +```powershell +powershell -ExecutionPolicy Bypass -File .\check-oscar-status.ps1 +``` + +This produces a similar one-file report. + +--- + +## 11. 
How to analyze the data + +The most important sections are: + +- `LIVE JVM /proc STATUS` or the Windows live process section +- `LIVE JVM NATIVE MEMORY SUMMARY` +- `LIVE JVM GC HEAP INFO` +- `RECENT TREND` +- `vmstat` on Linux or pagefile/commit counters on Windows +- application log tails + +### Healthy pattern + +A healthy run usually looks like this: + +- RSS rises during startup, then flattens +- heap usage rises and falls with normal GC activity +- NMT committed memory stays in a narrow band +- thread count stabilizes +- swap or pagefile use may rise some but stops growing +- system still has plenty of available memory +- there is little or no sustained swap-in and swap-out pressure + +### Suspicious pattern + +A suspicious run usually looks like this: + +- RSS rises hour after hour without flattening +- VmSwap or pagefile keeps increasing steadily +- NMT committed memory keeps rising steadily +- thread count keeps climbing +- logs show reconnect loops and memory rises after each one +- system available memory keeps shrinking +- OS starts heavy paging or swapping activity + +### Linux-specific interpretation tips + +- `VmRSS`: resident memory in RAM +- `VmSwap`: memory for that process currently swapped out +- `vmstat si/so`: swap in and swap out activity +- `GC.heap_info`: whether Java heap is actually pressured +- `VM.native_memory summary`: whether JVM-managed native memory is rising + +### Windows-specific interpretation tips + +Watch these especially: + +- process working set +- private bytes or commit size +- system commit charge versus commit limit +- pagefile usage +- `jcmd VM.native_memory summary` + +### Important distinction + +A process can fail from **native memory exhaustion** even when Java heap is not full. That was the original reason these scripts were added. + +--- + +## 12. How to tell whether there is a leak + +Do not judge from the first hour alone. Startup always causes growth. 
+ +Suggested checkpoints: + +- **30 to 60 minutes**: look for obvious runaway behavior +- **2 to 4 hours**: see whether memory is leveling off +- **12 to 24 hours**: determine whether the process is stable or slowly drifting + +What proves stability: + +- recent trend lines become narrow and flat +- NMT committed memory stays near one range +- threads stay near one range +- swap or pagefile stops rising + +What suggests a leak: + +- all trend lines keep climbing across many hours +- the slope stays positive even after warmup +- memory jumps after every reconnect or retry cycle and never comes down + +--- + +## 13. When to delete files + +The monitor and status scripts generate files that can grow over time. + +### Files you can delete safely after a run is complete + +- old `oscar-status-*.txt` reports +- old `monitor.out` +- old monitor directories such as `oscar-monitor-20260503-174333` +- old JFR files that you no longer need + +### Files you should keep while investigating a problem + +- the monitor directory for the run you care about +- its `launch.stdout.log` and `launch.stderr.log` +- any `*.jfr` files +- any JVM crash logs such as `hs_err_pid*.log` + +### Good cleanup practice + +Delete old monitor directories only after: + +- the run has been reviewed +- any useful JFR files have been copied somewhere safe +- you no longer need to compare against older runs + +Linux cleanup example: + +```bash +rm -rf oscar-monitor-20260503-174333 +rm -f oscar-status-*.txt +``` + +Windows PowerShell cleanup example: + +```powershell +Remove-Item .\oscar-monitor-20260503-174333 -Recurse -Force +Remove-Item .\oscar-status-*.txt +``` + +--- + +## 14. How to modify the scripts safely + +When changing the scripts, change one category at a time: + +1. profile memory sizes +2. PostgreSQL memory settings +3. monitoring interval +4. extra diagnostics +5. 
container or path settings + +After each change, run a monitored test and compare the new `oscar-status-*.txt` report against an older stable run. + +Do not change everything at once or you will not know what helped. + +--- + +## 15. Recommended workflow + +For a new machine: + +1. put the correct `.env` in place +2. verify dependencies +3. verify the chosen `SYSTEM_PROFILE` +4. start with `monitor-oscar` +5. let it run at least 2 to 4 hours +6. generate a status report +7. check whether RSS, swap or pagefile, NMT committed, and threads plateau +8. only then decide whether to raise or lower memory settings + +For production confidence: + +1. run monitored overnight +2. generate a final status report +3. verify that recent trend lines are flat +4. archive one known-good monitor directory and status report for comparison + +--- + +## 16. Common mistakes to avoid + +- using a profile larger than the machine really supports +- assuming Java heap is the only memory that matters +- removing NMT flags from the Java launcher +- starting JFR twice from both the launcher and the monitor without intending to +- judging a leak from startup-only growth +- deleting monitor directories before reviewing them +- forgetting that repeated reconnects can be a logic problem even when memory looks stable + +--- + +## 17. 
Bottom line + +The updated scripts give you a repeatable way to: + +- choose the right memory profile +- start OSCAR consistently +- capture memory diagnostics during the run +- summarize results into one report file +- distinguish between startup growth, stable operation, and a real leak + +For day-to-day use, the most important steps are: + +- set the correct `SYSTEM_PROFILE` +- start with `monitor-oscar` when testing +- use `check-oscar-status` to review the run +- keep the diagnostic files until you know the run is healthy diff --git a/dist/documentation/Release_Notes_3.5.1.md b/dist/documentation/Release_Notes_3.5.1.md new file mode 100644 index 0000000..058d4b0 --- /dev/null +++ b/dist/documentation/Release_Notes_3.5.1.md @@ -0,0 +1,616 @@ +# Release Notes + +## Overview + +OSCAR **3.5.1** improves deployment stability, observability, and scalability for larger multi-lane systems. This release focuses on: + +* reducing memory pressure +* preventing PostgreSQL connection exhaustion +* improving runtime diagnostics +* simplifying deployment on Linux and Windows +* improving support for MediaMTX-based camera proxy deployments +* making launch and monitoring behavior safer when OSCAR is already running +* improving first-run dependency and startup validation + +These changes were validated against a high-load configuration monitoring **50 radiation portal monitors and 100 camera streams**. + +This is a **prebuilt release**. Users should **unzip OSCAR 3.5.1 into a fresh directory**. Use the included **monitoring script** for first-run validation, burn-in, troubleshooting, and system profiling. Use **`launch-all`** for efficient production operation after validation, because it avoids unnecessary detailed monitoring logs and snapshot artifacts when an in-depth system profile is not required. 
+ +--- + +## Before you start + +### Required dependencies + +Install these before running OSCAR 3.5.1: + +* **OpenJDK 21** +* **Docker** + +The packaged release archive is expected to be named **`oscar-3.5.1.zip`**. + +### Recommended deployment model + +For production operation: + +* unzip the release into a **new clean folder** +* rename `env.txt` to `.env` if needed +* select the correct system profile in `.env` +* use **MediaMTX** for camera-heavy deployments when appropriate +* start OSCAR with **`launch-all`** after validation so the system runs without unnecessary monitoring log and snapshot generation + +For testing, side-by-side field evaluation, first-run validation, troubleshooting, and system profiling: + +* start OSCAR with the **sessionless monitoring launch** when possible +* use the **check/status script** to review memory, thread, and PostgreSQL behavior +* use the new reset scripts when you need to clear a previous local test install before switching releases + +--- + +## What is new + +### Profile-based system sizing + +Deployment now supports profile-based resource tuning instead of using one fixed memory configuration for every machine. + +Supported profiles: + +* `RPI4` +* `8GB` +* `16GB` +* `32GB` + +These profiles allow the JVM and PostgreSQL configuration to be matched to the host hardware through the `.env` file and updated launch scripts. 
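Because a mistyped profile name is easy to miss, a quick pre-launch check of `.env` can help. This is a minimal sketch under stated assumptions: the `SYSTEM_PROFILE` key and the four profile names come from this documentation, while the helper function names are hypothetical and the parsing here is independent of however the launch scripts actually read the file.

```bash
#!/usr/bin/env bash
# Read the last SYSTEM_PROFILE assignment from an env file.
# Accepts an optional leading "export" and strips surrounding quotes.
read_system_profile() {
  local env_file="${1:-.env}"
  grep -E '^[[:space:]]*(export[[:space:]]+)?SYSTEM_PROFILE=' "$env_file" \
    | tail -n 1 \
    | sed -E 's/^[^=]*=//; s/[[:space:]]+$//' \
    | tr -d "\"'"
}

# Succeed only for the profile names listed in the release notes.
is_supported_profile() {
  case "$1" in
    RPI4|8GB|16GB|32GB) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use (commented out so nothing runs on load):
#   profile="$(read_system_profile .env)"
#   is_supported_profile "$profile" || echo "Unexpected SYSTEM_PROFILE: '$profile'" >&2
```

The check only confirms that the value is one of the documented names; whether that profile suits the host's actual memory still has to be judged from a monitored run.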
+ +### Updated launch flow + +Launch scripts were updated for both Linux and Windows so they can: + +* load the selected system profile +* size Java heap appropriately for the machine +* size PostgreSQL more appropriately for the machine +* start the PostGIS container with tuned settings +* provide a more consistent startup path across environments +* check for required dependencies before launch +* stop or refuse duplicate OSCAR launches based on script settings +* avoid hard failure on optional runtime paths such as `nativelibs` or extra trusted-certificate drop-ins when they are not present + +### Safer process handling + +The launch and monitoring scripts now better handle cases where OSCAR is already running. + +Improvements include: + +* detection of already running OSCAR processes +* clearer behavior when a prior instance is found +* support for stopping and relaunching cleanly when configured to do so +* monitor behavior aligned with launch behavior end to end +* reduced risk of duplicate Java processes and conflicting monitor sessions +* explicit single-instance protection for the Linux and Windows monitor wrappers + +This makes startup behavior safer during testing, upgrades, and repeated field launches. + +### Dependency and environment validation + +Deployment scripts now better validate startup prerequisites and packaged paths before launch. + +Improvements include: + +* dependency checks for **Java 21** and **Docker** +* clearer startup errors when required tools are missing +* improved trust store handling on Windows +* better validation of expected runtime directories and packaged files +* updated environment template support for launch and monitor behavior + +These changes make prebuilt deployment more reliable, especially on fresh Windows systems. + +The current launcher checks now distinguish between required dependencies and optional runtime extras. Missing required tools such as Java or Docker still stop startup. 
Missing optional paths such as `nativelibs` or `trusted_certificates` no longer stop startup by themselves. + +### PostgreSQL tuning improvements + +PostgreSQL startup settings were updated to better support larger deployments. + +Improvements include: + +* increased connection limits by profile +* reduced per-connection memory pressure +* reserved superuser or admin connection slots +* idle session timeout support +* connection and disconnection logging for diagnostics + +For the 16 GB profile, PostgreSQL was raised from the earlier 100-connection ceiling to a higher-capacity configuration, resolving the immediate `too many clients already` failure mode during large-scale operation. + +### Hikari connection pool fix + +The main cause of database session over-allocation was identified and corrected in the PostGIS datastore connection manager. + +#### Root cause + +* each Hikari pool was configured with `maximumPoolSize(20)` +* `minimumIdle` was not set +* Hikari therefore defaulted `minimumIdle` to the same value as `maximumPoolSize` +* with multiple pools active, the system held a very large number of idle PostgreSQL sessions open at all times + +#### Fix + +* reduced per-pool size +* explicitly set `minimumIdle(0)` +* shortened idle timeout behavior +* preserved sufficient active connection capacity while eliminating unnecessary idle connection hoarding + +#### Result observed in testing + +* PostgreSQL steady-state sessions dropped from about **186** to about **21** +* idle JDBC sessions dropped from about **180** to about **15** +* database headroom increased substantially +* the immediate Postgres connection saturation problem was eliminated + +This is the most important backend stability improvement in this release. + +### Monitoring and status scripts + +New monitoring and status-check scripts were added for both Linux and Windows. + +Windows now also includes `monitor-oscar.ps1` as the preferred PowerShell wrapper for sessionless monitored launches. 
It works cleanly with `Start-Process` redirection, keeps the window hidden, and pairs well with the existing `monitor.last-status` and `monitor.last-error` files for headless troubleshooting. + +The monitor wrappers now also include a singleton guard so a second `monitor-oscar` launch is refused while another monitor is already active. This prevents duplicate snapshot loops, duplicate JFR starts, and confusing status output during sessionless operation. The wrappers now also update `monitor.last-status` and `monitor.last-error`, which makes it much easier to understand why a sessionless launch exited without staying attached to a terminal window. + +These scripts can now: + +* launch OSCAR under monitoring +* support sessionless launch for validation, troubleshooting, burn-in, and profiling runs +* expose a PowerShell-first monitored entry point on Windows through `monitor-oscar.ps1` +* capture JVM memory status +* capture native memory tracking summaries +* capture JFR status +* capture OS memory and swap usage +* capture PostgreSQL session counts and saturation state +* capture database activity detail +* produce a single-file health and status report for rapid review + +### Reset and shutdown scripts + +The deployment scripts now also support cleaner teardown between test installs. 
+ +These updates include: + +* `stop-all` scripts that try to stop the monitor first, then continue with direct fallback shutdown +* `reset-all` scripts that stop OSCAR processes, remove the PostGIS container and volumes, and clear local runtime state for clean retesting +* better support for side-by-side installation testing on the same host + +### Improved database diagnostics + +Monitoring now includes PostgreSQL visibility such as: + +* `max_connections` +* `superuser_reserved_connections` +* total active sessions +* session state counts +* connection trend logging over time +* recent PostgreSQL log activity + +This makes it much easier to distinguish between: + +* memory pressure +* connection pool over-allocation +* true connection leaks +* normal steady-state pool behavior + +### MediaMTX deployment guidance + +Documentation was added for using **MediaMTX** as a local RTSP proxy layer to reduce the resource burden of handling many camera streams directly in OSCAR. + +This supports a deployment model where: + +* a smaller number of upstream camera streams are proxied locally +* multiple lanes can reuse proxied feeds +* OSCAR connects to stable local endpoints instead of managing a large number of direct camera connections +* the Java backend spends less effort on direct RTSP session setup, reconnect churn, and repeated camera-side socket activity + +This architecture is recommended for larger systems because it reduces camera-related reconnect burden and lowers Java-side camera handling overhead. Validate camera-heavy profiles with `monitor-oscar`, then use `launch-all` for efficient production starts once the profile is accepted. 
+ +### Documentation updates + +Documentation was added or expanded for: + +* `.env` usage +* launch scripts +* monitoring scripts +* check or status scripts +* profile selection +* dependency installation and verification +* startup and shutdown procedures +* data interpretation and troubleshooting +* MediaMTX camera proxy setup +* already-running instance handling +* environment template settings for restart and attach behavior + +--- + +## Problems addressed + +### Memory pressure from fixed JVM sizing + +Previous deployments used a one-size-fits-all Java memory model. On smaller or moderately sized machines, this could reserve too much memory for the JVM and reduce operating system and PostgreSQL headroom, increasing swap or pagefile pressure. + +### PostgreSQL connection exhaustion + +Large deployments were exhausting PostgreSQL connection capacity because multiple Hikari pools were keeping too many idle connections open. This caused: + +* `too many clients already` +* Hikari connection timeouts +* database degradation under load + +### Duplicate launch and monitoring confusion + +Repeated test starts could leave users uncertain whether OSCAR was already running, whether a second Java process had been created, or whether the monitor had attached to the correct instance. A related gap was that the backend launchers had duplicate-start protection, but the monitor wrapper itself could still be started twice. + +The updated scripts address this by making existing-instance behavior more explicit and consistent, and by refusing a second live monitor session. + +### Limited visibility into failure mode + +Earlier logs were often noisy or incomplete during failures. 
The new monitoring and status scripts make it easier to determine whether the bottleneck is: + + * Java heap + * native memory + * swap usage + * PostgreSQL session pressure + * query activity + * startup or reconnect churn + + --- + + ## Behavior observed after the fixes + + After applying the connection pool fix and updated deployment tuning: + + * PostgreSQL sessions dropped from about **186** to about **21** in testing + * idle JDBC sessions dropped from about **180** to about **15** + * JVM swap usage dropped to **0** in the improved test run + * the system remained stable during startup and warm-up + * database headroom improved dramatically + + --- + + ## If you previously ran OSCAR + + If this machine was already running an older OSCAR release, do **not** install OSCAR 3.5.1 over the top of the old directory. + + Before starting OSCAR 3.5.1, stop and remove the older deployment components: + + * stop the old PostGIS container + * remove the old PostGIS container + * remove the old Docker network used by the previous OSCAR deployment, **if one exists** + * stop any old OSCAR Java process that is still running + * delete the old `oscar-3.5.0` directory + * unzip OSCAR **3.5.1** into a fresh folder + + ### Linux cleanup example + + Stop and remove the old PostGIS container: + + ```bash + docker stop oscar-postgis-container 2>/dev/null || true + docker rm oscar-postgis-container 2>/dev/null || true + ``` + + If the previous deployment created a dedicated Docker network, remove it after the container is gone: + + ```bash + docker network ls + docker network rm <network-name> + ``` + + Stop any running OSCAR Java process if needed: + + ```bash + pkill -f 'com.botts.impl.security.SensorHubWrapper' || true + ``` + + Remove the previous OSCAR folder: + + ```bash + rm -rf ~/oscar-3.5.0 + ``` + + ### Windows cleanup example + + Stop and remove the old PostGIS container: + + ```powershell + docker stop oscar-postgis-container + docker rm oscar-postgis-container + ``` + + If the previous deployment created a dedicated Docker network, 
remove it after the container is gone: + + ```powershell + docker network ls + docker network rm <network-name> + ``` + + Stop any running OSCAR Java process if needed: + + ```powershell + Get-CimInstance Win32_Process | + Where-Object { + $_.Name -match '^java(\.exe)?$' -and + $_.CommandLine -like '*com.botts.impl.security.SensorHubWrapper*' + } | + ForEach-Object { Stop-Process -Id $_.ProcessId -Force } + ``` + + Delete the previous OSCAR folder: + + ```powershell + Remove-Item -Recurse -Force .\oscar-3.5.0 + ``` + + If you are unsure whether a dedicated OSCAR Docker network exists, list networks first and remove only the one associated with the old OSCAR deployment. + + --- + + ## Fresh install workflow for OSCAR 3.5.1 + + ### Step 1: unzip the release + + Extract OSCAR **3.5.1** into a new folder. + + The packaged release archive is expected to be named `oscar-3.5.1.zip`. + + Example: + + ```text + oscar-3.5.1/ + ``` + + ### Step 2: confirm dependencies + + Make sure the machine has: + + * **OpenJDK 21** + * **Docker** + + ### Step 3: configure the environment file + + The release may include the environment file as: + + ```text + env.txt + ``` + + Rename it to: + + ```text + .env + ``` + + For Linux packaged builds, the `*.sh` files in the archive are now packaged as executable. If your unzip tool strips permissions, restore them with `chmod +x *.sh osh-node-oscar/*.sh` before launching. + + Then edit the file and select the correct hardware profile: + + * `RPI4` + * `8GB` + * `16GB` + * `32GB` + + The environment template also supports launch and monitoring behavior such as restart and attach settings. + + ### Step 4: choose the launch path + + For efficient production operation after validation, start OSCAR with `launch-all`. 
+ +#### Linux production + +Interactive: + +```bash +./launch-all.sh +``` + +Preferred sessionless production start: + +```bash +nohup ./launch-all.sh > launch.out 2>&1 & +``` + +#### Windows production + +Interactive: + +```bat +launch-all.bat +``` + +Preferred sessionless production start from PowerShell: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-Command',"Set-Location -LiteralPath '$PWD'; .\launch-all.bat" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\launch.out" ` + -RedirectStandardError "$PWD\launch.err" +``` + +#### Automated production start after restart + +Linux operators should normally use a `systemd` unit that runs `launch-all.sh` after Docker is available. Windows operators should normally use **Task Scheduler** with an **At startup** trigger that runs `launch-all.bat` from the OSCAR directory with highest privileges. This keeps the default production path on `launch-all` while avoiding reliance on an open terminal or SSH session. + +Use the monitoring script when diagnostics are needed for first-run validation, burn-in, troubleshooting, side-by-side comparison, or system profiling. + +#### Linux monitored validation or profiling + +Preferred sessionless pattern: + +```bash +nohup ./monitor-oscar.sh > monitor.out 2>&1 & +``` + +Attached interactive pattern: + +```bash +./monitor-oscar.sh +``` + +#### Windows monitored validation or profiling + +Interactive: + +```powershell +.\monitor-oscar.ps1 +``` + +Preferred sessionless PowerShell pattern: + +```powershell +Start-Process powershell.exe ` + -ArgumentList '-NoProfile','-ExecutionPolicy','Bypass','-File',"$PWD\monitor-oscar.ps1" ` + -WindowStyle Hidden ` + -RedirectStandardOutput "$PWD\monitor.out" ` + -RedirectStandardError "$PWD\monitor.err" +``` + +`monitor-oscar.bat` remains available for interactive `cmd.exe` use, but `monitor-oscar.ps1` is the preferred Windows wrapper for headless monitored runs. 
+ +The monitoring path intentionally produces additional logs, monitor directories, snapshots, JFR checks, thread dumps, and database trend files. Use it when that evidence is valuable; otherwise use `launch-all` for routine production. + +### Step 5: check performance with the included status script + +After startup, and again after the system has been running for a while, generate a status report. + +#### Linux + +```bash +./check-oscar-status.sh +``` + +#### Windows + +```powershell +powershell -ExecutionPolicy Bypass -File .\check-oscar-status.ps1 +``` + +This report helps verify that memory, swap, and PostgreSQL usage remain healthy. + +--- + +## Recommended field test workflow + +For testing and side-by-side field deployment, users should: + +1. stop and remove any old OSCAR PostGIS container +2. remove the old OSCAR Docker network **if one exists** +3. stop any old OSCAR Java process if one is still running +4. delete the old `oscar-3.5.0` folder +5. unzip OSCAR **3.5.1** into a fresh folder +6. install or verify **OpenJDK 21** and **Docker** +7. rename `env.txt` to `.env` if needed +8. select the correct profile in `.env` +9. configure and use **MediaMTX** for camera-heavy systems +10. start OSCAR with the **sessionless monitoring launch** when collecting validation or profile evidence +11. use the check or status script to compare system behavior and performance +12. 
use the reset script when you need to remove the local OSCAR runtime state before testing another package on the same machine + +This is the preferred workflow for: + +* first-time deployment on a machine +* side-by-side comparison with another build +* validating memory behavior +* validating PostgreSQL behavior +* validating MediaMTX camera proxy performance + +--- + +## Included updates + +### Linux + +* `.env`-based configuration +* `launch-all.sh` +* `launch.sh` +* `monitor-oscar.sh` +* `check-oscar-status.sh` + +### Windows + +* `.env`-based configuration +* `launch-all.bat` +* `launch.bat` +* `monitor-oscar.bat` +* `monitor-oscar.ps1` +* `check-oscar-status.ps1` + +--- + +## Recommended operating model + +### Deployment + +* select the correct hardware profile in `.env` +* use the updated launch scripts +* use **`launch-all`** for efficient production operation after validation +* use a sessionless `launch-all` start for routine production when you do not need deep diagnostics +* use `systemd` on Linux or Task Scheduler on Windows for automatic production startup after reboot +* use the **sessionless monitoring launch** for initial validation, burn-in, side-by-side evaluation, troubleshooting, and profiling +* use the attached monitoring launch only for interactive troubleshooting +* let the scripts manage already-running instances instead of manually launching duplicates +* use MediaMTX where many camera streams are involved +* review generated status reports during early burn-in testing + +### Validation after upgrade + +After upgrading, confirm that: + +* PostgreSQL sessions plateau well below the configured connection limit +* swap usage remains low or zero +* JVM RSS stabilizes after startup +* thread count does not continuously climb over long runs +* database status reports do not show saturation errors + +--- + +## Known issues still under observation + +These changes significantly improve stability, but a few items are still worth monitoring: + +* 
`RapiscanSensor` parse errors such as `For input string: "000NaN"` +* repeated MQTT `Broken pipe` errors +* high thread counts in some runs +* reconnect churn on certain devices or services + +These do not appear to be the primary cause of the major stability issue addressed in this release, but they remain candidates for future cleanup. + +--- + +## Upgrade notes + +1. Stop and remove any previous OSCAR PostGIS container. +2. Remove the previous OSCAR Docker network **if one exists**. +3. Stop any previous OSCAR Java process that is still running. +4. Delete the old `oscar-3.5.0` directory. +5. Unzip OSCAR **3.5.1** into a fresh directory. +6. Install **OpenJDK 21** and **Docker**. +7. Rename `env.txt` to `.env` if needed. +8. Edit `.env` and select the correct hardware profile. +9. For camera-heavy deployments, configure MediaMTX. +10. Start the system with **`monitor-oscar`** when collecting validation, troubleshooting, or profile evidence. +11. Start the system with **`launch-all`** for efficient production operation after validation. +12. Use the check or status script after monitored startup and again after runtime burn-in. + +--- + +## Summary + +This release materially improves OSCAR behavior on larger systems by: + +* matching resource use to host hardware +* reducing unnecessary database connection retention +* improving monitoring and diagnostics +* increasing deployment consistency across Linux and Windows +* supporting MediaMTX-based camera proxy architectures +* validating dependencies and packaged startup requirements earlier +* handling already-running OSCAR instances more safely + +The biggest backend improvement is the correction of oversized Hikari idle pooling, which reduced PostgreSQL session usage from approximately **186** to approximately **21** in testing. 
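The session figures quoted in this release can be spot-checked by hand against a running deployment. The following is a rough sketch, not the packaged status script: it reuses the `docker exec` plus `psql -At` pattern, assumes the container is named `oscar-postgis-container` as in the cleanup examples, assumes `DB_USER`, `DB_NAME`, and `DB_PASSWORD` are exported from `.env`, and uses hypothetical function names.

```bash
#!/usr/bin/env bash
# Run one SQL statement inside the PostGIS container and print the bare result.
# DB_PASSWORD is read from the environment (assumed to match the .env value).
psql_scalar() {
  local container="$1" db_user="$2" db_name="$3" sql="$4"
  docker exec -e "PGPASSWORD=${DB_PASSWORD:-}" "$container" \
    psql -U "$db_user" -d "$db_name" -At -c "$sql"
}

# Integer percentage of the connection limit in use (rounded down).
session_usage_percent() {
  local used="$1" max="$2"
  echo $(( used * 100 / max ))
}

# Example use (commented out so nothing runs on load):
#   used="$(psql_scalar oscar-postgis-container "$DB_USER" "$DB_NAME" \
#     "SELECT count(*) FROM pg_stat_activity WHERE backend_type = 'client backend';")"
#   max="$(psql_scalar oscar-postgis-container "$DB_USER" "$DB_NAME" 'SHOW max_connections;')"
#   echo "sessions: ${used} / ${max} ($(session_usage_percent "$used" "$max")%)"
```

Counting only `client backend` rows keeps PostgreSQL's own background workers out of the total, which makes the number comparable with `max_connections`.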
diff --git a/dist/documentation/Standard_PostgreSQL_Setup.md b/dist/documentation/Standard_PostgreSQL_Setup.md new file mode 100644 index 0000000..c6f2a10 --- /dev/null +++ b/dist/documentation/Standard_PostgreSQL_Setup.md @@ -0,0 +1,101 @@ +# Standard PostgreSQL Database Setup + +If you are deploying OSCAR with a standard, standalone PostgreSQL database (rather than the default Dockerized option), follow these steps to initialize and configure the database properly. + +## Prerequisites + +- PostgreSQL (version 16 recommended, matching the Docker image) +- PostGIS extensions installed on the database server + +## Step 1: Create the Database + +Connect to your PostgreSQL instance as a superuser (e.g., `postgres`) and create the `gis` database: + +```sql +CREATE DATABASE gis; +``` + +## Step 2: Configure System Parameters + +Set the required `max_connections` limit: + +```sql +ALTER SYSTEM SET max_connections = 1024; +``` + +_Note: You will need to reload or restart the PostgreSQL service for system-level parameter changes to take effect._ + +## Step 3: Enable PostGIS and Required Extensions + +Connect to the newly created `gis` database. If using `psql`, you can do this by running: + +```sql +\connect gis; +``` + +Then, run the following SQL commands to enable the necessary extensions required by the OSCAR system: + +```sql +CREATE EXTENSION IF NOT EXISTS pg_trgm; +CREATE EXTENSION IF NOT EXISTS btree_gist; +CREATE EXTENSION IF NOT EXISTS btree_gin; +CREATE EXTENSION IF NOT EXISTS fuzzystrmatch; +CREATE EXTENSION IF NOT EXISTS postgis; +CREATE EXTENSION IF NOT EXISTS postgis_tiger_geocoder; +CREATE EXTENSION IF NOT EXISTS postgis_topology; +``` + +## Step 4: Configure User Credentials and Access + +Ensure your database is accessible to the OSCAR application: + +1. Configure appropriate host-based authentication in your `pg_hba.conf` file to allow the OSCAR server to connect. +2. 
Ensure the user connecting to the database has sufficient privileges on the `gis` database. + +## Step 5: Configure OSCAR Database Connection + +Once the database is set up, you must configure OSCAR to connect to it. This can be done in one of two ways: + +### Option A: Edit `config.json` Directly (Pre-launch) + +Before starting the OSCAR application, you can edit the `dist/config/standard/config.json` file. Locate the configuration module for `SystemDriverDatabaseConfig` containing the `PostgisObsSystemDatabaseConfig` and update the connection details. + +Find the block that looks similar to this: + +```json +{ + "objClass": "org.sensorhub.impl.database.system.SystemDriverDatabaseConfig", + "dbConfig": { + "objClass": "org.sensorhub.impl.datastore.postgis.database.PostgisObsSystemDatabaseConfig", + "url": "localhost:5432", + "dbName": "gis", + "login": "postgres", + "password": "postgres", + "idProviderType": "SEQUENTIAL", + "autoCommitPeriod": 10, + "useBatch": false, + "id": "bfbd6d58-1a4a-40b4-999d-381a1489cbb5", + "autoStart": false, + "moduleClass": "org.sensorhub.impl.datastore.postgis.database.PostgisObsSystemDatabase" + }, + // ... other fields + "name": "PostGIS Database" +} +``` + +Update the following fields to match your standalone database configuration: + +- `url`: The hostname or IP address of your PostgreSQL server and the port (e.g., `db.example.com:5432`). +- `dbName`: The database name (should be `gis` if you followed Step 1). +- `login`: The username for the database. +- `password`: The password for the database user. + +### Option B: Use the OSCAR Admin Panel GUI (Post-launch) + +If OSCAR is already running (and potentially failing to connect to its default database), you can update the settings through the web administration interface: + +1. Log in to the OSCAR Admin Panel (e.g., `http://localhost:8282/sensorhub/admin`). +2. Navigate to the **Databases** tab. +3. Click on the **PostGIS Database** module. +4. 
Update the **URL**, **Database Name**, **Login**, and **Password** fields with your standalone database details. +5. Save the configuration and restart the module. diff --git a/dist/release/check-oscar-status.ps1 b/dist/release/check-oscar-status.ps1 new file mode 100644 index 0000000..6bc7781 --- /dev/null +++ b/dist/release/check-oscar-status.ps1 @@ -0,0 +1,677 @@ +param( + [string]$BaseDirectory = $PSScriptRoot, + [string]$MonitorDirectory +) + +$ErrorActionPreference = "SilentlyContinue" + +function Add-Line { + param( + [System.Collections.Generic.List[string]]$Lines, + [string]$Text = "" + ) + $Lines.Add($Text) | Out-Null +} + +function Add-Block { + param( + [System.Collections.Generic.List[string]]$Lines, + [string]$Text + ) + + if ([string]::IsNullOrWhiteSpace($Text)) { + $Lines.Add("") | Out-Null + return + } + + $normalized = $Text -replace "`r`n", "`n" + foreach ($line in ($normalized -split "`n")) { + $Lines.Add($line) | Out-Null + } +} + +function Load-DotEnv { + param([string]$Path) + + $map = @{} + if (-not (Test-Path -LiteralPath $Path)) { + return $map + } + + foreach ($rawLine in Get-Content -LiteralPath $Path) { + $line = $rawLine.Trim() + if ([string]::IsNullOrWhiteSpace($line)) { continue } + if ($line.StartsWith("#")) { continue } + + if ($line.StartsWith("export ")) { + $line = $line.Substring(7).Trim() + } + + $idx = $line.IndexOf("=") + if ($idx -lt 1) { continue } + + $key = $line.Substring(0, $idx).Trim() + $value = $line.Substring($idx + 1) + + if (($value.StartsWith('"') -and $value.EndsWith('"')) -or ($value.StartsWith("'") -and $value.EndsWith("'"))) { + if ($value.Length -ge 2) { + $value = $value.Substring(1, $value.Length - 2) + } + } + + $map[$key] = $value + } + + return $map +} + +function Get-ActiveMonitorDirectory { + param([string]$BaseDir) + + $activePath = Join-Path $BaseDir ".monitor-active-dir" + if (-not (Test-Path -LiteralPath $activePath)) { + return $null + } + + $candidate = (Get-Content -LiteralPath $activePath 
-TotalCount 1 | Out-String).Trim() + if ([string]::IsNullOrWhiteSpace($candidate)) { + return $null + } + + if (Test-Path -LiteralPath $candidate) { + return Get-Item -LiteralPath $candidate + } + + return $null +} + +function Get-LatestMonitorDirectory { + param([string]$BaseDir) + + $dirs = Get-ChildItem -LiteralPath $BaseDir -Directory | + Where-Object { $_.Name -like "oscar-monitor-*" } | + Sort-Object Name -Descending + + return ($dirs | Select-Object -First 1) +} + +function Get-OscarJavaProcesses { + $procs = Get-CimInstance Win32_Process | + Where-Object { + $_.Name -match '^(java|javaw)(\.exe)?$' -and + $null -ne $_.CommandLine -and + $_.CommandLine -match 'com\.botts\.impl\.security\.SensorHubWrapper' + } | + Sort-Object ProcessId + + return @($procs) +} + +function Resolve-ToolPath { + param([string]$Name) + + $cmd = Get-Command $Name -ErrorAction SilentlyContinue + if ($cmd -and $cmd.Source) { + return $cmd.Source + } + + $whereExe = Get-Command where.exe -ErrorAction SilentlyContinue + if ($whereExe) { + try { + $resolved = & $whereExe.Source $Name 2>$null | Select-Object -First 1 + if ($resolved -and (Test-Path -LiteralPath $resolved)) { + return $resolved + } + } + catch { + } + } + + return $null +} + +function Resolve-JcmdPath { + $jcmd = Resolve-ToolPath -Name "jcmd.exe" + if ($jcmd) { + return $jcmd + } + + $jcmd = Resolve-ToolPath -Name "jcmd" + if ($jcmd) { + return $jcmd + } + + if ($env:JAVA_HOME) { + $candidate = Join-Path $env:JAVA_HOME "bin\jcmd.exe" + if (Test-Path -LiteralPath $candidate) { + return $candidate + } + } + + $javaCmd = Resolve-ToolPath -Name "java.exe" + if (-not $javaCmd) { + $javaCmd = Resolve-ToolPath -Name "java" + } + + if ($javaCmd) { + $javaDir = Split-Path -Parent $javaCmd + $candidate = Join-Path $javaDir "jcmd.exe" + if (Test-Path -LiteralPath $candidate) { + return $candidate + } + } + + return $null +} + +function Invoke-ExternalCapture { + param( + [string]$FilePath, + [string[]]$Arguments + ) + + 
$script:LastExternalExitCode = 0 + + if ([string]::IsNullOrWhiteSpace($FilePath)) { + $script:LastExternalExitCode = 1 + return "Tool path is empty." + } + + if (-not (Test-Path -LiteralPath $FilePath)) { + $script:LastExternalExitCode = 1 + return "Tool not found: $FilePath" + } + + try { + $result = & $FilePath @Arguments 2>&1 | Out-String -Width 4096 + $exitCode = $LASTEXITCODE + if ($null -eq $exitCode) { $exitCode = 0 } + $script:LastExternalExitCode = $exitCode + + $trimmed = $result.TrimEnd() + if ([string]::IsNullOrWhiteSpace($trimmed) -and $exitCode -ne 0) { + return "Command failed with exit code $exitCode and returned no output." + } + + return $trimmed + } + catch { + $script:LastExternalExitCode = 1 + return ($_ | Out-String).TrimEnd() + } +} + +function Get-DockerContainerRecord { + param( + [string]$DockerExe, + [string]$ContainerName + ) + + if (-not $DockerExe) { + return $null + } + + $raw = Invoke-ExternalCapture -FilePath $DockerExe -Arguments @( + "ps", "-a", + "--format", "{{.ID}}|{{.Image}}|{{.Status}}|{{.Names}}|{{.Ports}}|{{.Command}}" + ) + + if ($script:LastExternalExitCode -ne 0) { + return @{ + Error = $raw + } + } + + $lines = @($raw -split "`r?`n" | Where-Object { $_.Trim().Length -gt 0 }) + foreach ($line in $lines) { + $parts = $line.Split("|") + if ($parts.Count -ge 6 -and $parts[3] -eq $ContainerName) { + return @{ + Id = $parts[0] + Image = $parts[1] + Status = $parts[2] + Name = $parts[3] + Ports = $parts[4] + Command = $parts[5] + } + } + } + + return $null +} + +function Get-DockerTableText { + param($ContainerRecord) + + if ($null -eq $ContainerRecord) { + return "Container not found." 
+ } + + if ($ContainerRecord.ContainsKey("Error")) { + return $ContainerRecord.Error + } + + return @" +CONTAINER ID IMAGE STATUS PORTS NAMES +$($ContainerRecord.Id) $($ContainerRecord.Image) $($ContainerRecord.Status) $($ContainerRecord.Ports) $($ContainerRecord.Name) +"@.TrimEnd() +} + +function Invoke-PsqlInContainer { + param( + [string]$DockerExe, + [string]$ContainerName, + [string]$DbUser, + [string]$DbName, + [string]$DbPassword, + [string]$Sql + ) + + return Invoke-ExternalCapture -FilePath $DockerExe -Arguments @( + "exec", + "-e", "PGPASSWORD=$DbPassword", + $ContainerName, + "psql", + "-U", $DbUser, + "-d", $DbName, + "-At", + "-c", $Sql + ) +} + +function Get-LaunchTail { + param( + [string]$MonitorDir, + [string]$FileName, + [int]$Tail = 50 + ) + + if ([string]::IsNullOrWhiteSpace($MonitorDir)) { return "" } + + $path = Join-Path $MonitorDir $FileName + if (-not (Test-Path -LiteralPath $path)) { + return "" + } + + return (Get-Content -LiteralPath $path -Tail $Tail | Out-String -Width 4096).TrimEnd() +} + +function Run-JcmdSection { + param( + [string]$JcmdExe, + [string]$TargetPid, + [string[]]$JcmdArgs + ) + + if (-not $TargetPid -or $TargetPid -notmatch '^\d+$') { + return "No live OSCAR JVM found." + } + + if (-not $JcmdExe) { + return "jcmd.exe not found." 
+ } + + if (-not (Test-Path -LiteralPath $JcmdExe)) { + return "jcmd.exe path does not exist: $JcmdExe" + } + + $stdoutFile = [System.IO.Path]::GetTempFileName() + $stderrFile = [System.IO.Path]::GetTempFileName() + + try { + $proc = Start-Process ` + -FilePath $JcmdExe ` + -ArgumentList (@($TargetPid) + $JcmdArgs) ` + -NoNewWindow ` + -Wait ` + -PassThru ` + -RedirectStandardOutput $stdoutFile ` + -RedirectStandardError $stderrFile + + $stdout = "" + $stderr = "" + + if (Test-Path -LiteralPath $stdoutFile) { + $stdout = Get-Content -LiteralPath $stdoutFile -Raw + } + + if (Test-Path -LiteralPath $stderrFile) { + $stderr = Get-Content -LiteralPath $stderrFile -Raw + } + + $output = ($stdout + $stderr).TrimEnd() + $exitCode = $proc.ExitCode + + if ($exitCode -ne 0) { + if ([string]::IsNullOrWhiteSpace($output)) { + return "jcmd failed with exit code $exitCode using: $JcmdExe $TargetPid $($JcmdArgs -join ' ')" + } + return $output + } + + if ([string]::IsNullOrWhiteSpace($output)) { + return "jcmd returned no output using: $JcmdExe $TargetPid $($JcmdArgs -join ' ')" + } + + return $output + } + catch { + return ($_ | Out-String).TrimEnd() + } + finally { + Remove-Item -LiteralPath $stdoutFile -Force -ErrorAction SilentlyContinue + Remove-Item -LiteralPath $stderrFile -Force -ErrorAction SilentlyContinue + } +} + +$envMap = Load-DotEnv -Path (Join-Path $BaseDirectory ".env") + +$containerName = if ($envMap.ContainsKey("CONTAINER_NAME")) { $envMap["CONTAINER_NAME"] } else { "oscar-postgis-container" } +$dbUser = if ($envMap.ContainsKey("DB_USER")) { $envMap["DB_USER"] } else { "postgres" } +$dbName = if ($envMap.ContainsKey("DB_NAME")) { $envMap["DB_NAME"] } else { "gis" } +$dbPassword = if ($envMap.ContainsKey("DB_PASSWORD")) { $envMap["DB_PASSWORD"] } else { "postgres" } + +$monitorDirItem = $null +if (-not [string]::IsNullOrWhiteSpace($MonitorDirectory)) { + if (Test-Path -LiteralPath $MonitorDirectory) { + $monitorDirItem = Get-Item -LiteralPath $MonitorDirectory + } +} 
+else { + 
$monitorDirItem = Get-ActiveMonitorDirectory -BaseDir $BaseDirectory + if ($null -eq $monitorDirItem) { + $monitorDirItem = Get-LatestMonitorDirectory -BaseDir $BaseDirectory + } +} + +$monitorDir = if ($null -ne $monitorDirItem) { $monitorDirItem.FullName } else { "" } + +$timestamp = Get-Date -Format "yyyyMMdd-HHmmss" +$outputFile = Join-Path $BaseDirectory "oscar-status-$timestamp.txt" + +$pidFromMonitor = "" +if ($monitorDir) { + $pidPath = Join-Path $monitorDir "jvm-pid.txt" + if (Test-Path -LiteralPath $pidPath) { + $pidFromMonitor = (Get-Content -LiteralPath $pidPath -TotalCount 1 | Out-String).Trim() + } +} + +$oscarJava = Get-OscarJavaProcesses +$liveProc = $null + +if ($pidFromMonitor -match '^\d+$') { + $liveProc = $oscarJava | Where-Object { $_.ProcessId -eq [int]$pidFromMonitor } | Select-Object -First 1 +} +if ($null -eq $liveProc) { + $liveProc = $oscarJava | Select-Object -First 1 +} + +$livePid = if ($null -ne $liveProc) { [string]$liveProc.ProcessId } else { "" } + +$dockerExe = Resolve-ToolPath -Name "docker" +$jcmdExe = Resolve-JcmdPath + +$dockerContainer = Get-DockerContainerRecord -DockerExe $dockerExe -ContainerName $containerName +$dockerTableText = Get-DockerTableText -ContainerRecord $dockerContainer + +$containerRunning = $false +if ($dockerContainer -and -not $dockerContainer.ContainsKey("Error")) { + if ($dockerContainer.Status -like "Up*") { + $containerRunning = $true + } +} + +$osInfo = Get-CimInstance Win32_OperatingSystem | + Select-Object TotalVisibleMemorySize, FreePhysicalMemory, TotalVirtualMemorySize, FreeVirtualMemory +$osInfoText = ($osInfo | Format-List | Out-String -Width 4096).TrimEnd() + +$counterText = "" +try { + $counters = Get-Counter '\Memory\Committed Bytes','\Memory\Commit Limit','\Paging File(_Total)\% Usage' + $counterText = ($counters | Out-String -Width 4096).TrimEnd() +} +catch { + $counterText = "Could not read performance counters." 
+} + +$liveJvmText = "" +if ($livePid -match '^\d+$') { + $liveJvmText = (Get-Process -Id ([int]$livePid) | + Select-Object Id, ProcessName, Threads, VirtualMemorySize64, WorkingSet64, PrivateMemorySize64, CPU, StartTime | + Format-List | Out-String -Width 4096).TrimEnd() +} +else { + $liveJvmText = "No live OSCAR JVM found." +} + +$jfrText = "" +$heapText = "" +$nmtText = "" + +if ($livePid -match '^\d+$' -and $jcmdExe -and (Test-Path -LiteralPath $jcmdExe)) { + try { + $jfrText = (& $jcmdExe $livePid JFR.check 2>&1 | Out-String -Width 4096).TrimEnd() + if ([string]::IsNullOrWhiteSpace($jfrText)) { + $jfrText = "jcmd returned no output for JFR.check" + } + } + catch { + $jfrText = ($_ | Out-String).TrimEnd() + } + + try { + $heapText = (& $jcmdExe $livePid GC.heap_info 2>&1 | Out-String -Width 4096).TrimEnd() + if ([string]::IsNullOrWhiteSpace($heapText)) { + $heapText = "jcmd returned no output for GC.heap_info" + } + } + catch { + $heapText = ($_ | Out-String).TrimEnd() + } + + try { + $nmtText = (& $jcmdExe $livePid VM.native_memory summary 2>&1 | Out-String -Width 4096).TrimEnd() + if ([string]::IsNullOrWhiteSpace($nmtText)) { + $nmtText = "jcmd returned no output for VM.native_memory summary" + } + } + catch { + $nmtText = ($_ | Out-String).TrimEnd() + } +} +else { + $jfrText = "jcmd.exe not found or no live OSCAR JVM found." + $heapText = $jfrText + $nmtText = $jfrText +} + +$dbMetaText = "" +$dbByStateText = "" +$dbByAppText = "" +$dbErrorText = "" + +$maxConnections = "" +$superuserReservedConnections = "" +$usableClientSlots = "" +$totalSessions = "" +$activeSessions = "" +$idleSessions = "" +$idleInTransaction = "" + +if (-not $dockerExe) { + $dbErrorText = "docker.exe not found in PATH." +} +elseif ($null -eq $dockerContainer) { + $dbErrorText = "Container '$containerName' not found." 
+} +elseif ($dockerContainer.ContainsKey("Error")) { + $dbErrorText = $dockerContainer.Error +} +elseif (-not $containerRunning) { + $dbErrorText = "Container '$containerName' is present but not running. Status: $($dockerContainer.Status)" +} +else { + $dbMetaSql = "select current_setting('max_connections'), current_setting('superuser_reserved_connections'), (current_setting('max_connections')::int - current_setting('superuser_reserved_connections')::int), count(*), count(*) filter (where state = 'active'), count(*) filter (where state = 'idle'), count(*) filter (where state = 'idle in transaction') from pg_stat_activity;" + $dbByStateSql = "select coalesce(state,'') || '|' || count(*)::text from pg_stat_activity group by state order by 1;" + $dbByAppSql = "select coalesce(application_name,'') || '|' || coalesce(usename,'') || '|' || coalesce(client_addr::text,'') || '|' || coalesce(state,'') || '|' || count(*)::text from pg_stat_activity group by application_name, usename, client_addr, state order by application_name, usename, client_addr, state;" + + $dbMetaText = Invoke-PsqlInContainer -DockerExe $dockerExe -ContainerName $containerName -DbUser $dbUser -DbName $dbName -DbPassword $dbPassword -Sql $dbMetaSql + $dbByStateText = Invoke-PsqlInContainer -DockerExe $dockerExe -ContainerName $containerName -DbUser $dbUser -DbName $dbName -DbPassword $dbPassword -Sql $dbByStateSql + $dbByAppText = Invoke-PsqlInContainer -DockerExe $dockerExe -ContainerName $containerName -DbUser $dbUser -DbName $dbName -DbPassword $dbPassword -Sql $dbByAppSql + + $metaLine = ($dbMetaText -split "`r?`n" | Where-Object { $_.Trim().Length -gt 0 } | Select-Object -First 1) + if ($metaLine -and $metaLine.Contains("|")) { + $parts = $metaLine.Split("|") + if ($parts.Count -ge 7) { + $maxConnections = $parts[0] + $superuserReservedConnections = $parts[1] + $usableClientSlots = $parts[2] + $totalSessions = $parts[3] + $activeSessions = $parts[4] + $idleSessions = $parts[5] + $idleInTransaction 
= $parts[6] + } + } + else { + if ([string]::IsNullOrWhiteSpace($dbMetaText)) { + $dbErrorText = "psql returned no DB metadata output." + } + else { + $dbErrorText = $dbMetaText + } + } +} + +$snapshotDirs = @() +if ($monitorDir -and (Test-Path -LiteralPath $monitorDir)) { + $snapshotDirs = Get-ChildItem -LiteralPath $monitorDir -Directory | + Where-Object { $_.Name -match '^\d{8}-\d{6}$' } | + Sort-Object Name +} + +$firstSnapshot = if ($snapshotDirs.Count -gt 0) { $snapshotDirs[0].FullName } else { "" } +$latestSnapshot = if ($snapshotDirs.Count -gt 0) { $snapshotDirs[-1].FullName } else { "" } + +$recentSnapshotLines = @() +if ($snapshotDirs.Count -gt 0) { + $recentSnapshotLines = $snapshotDirs | + Select-Object -Last ([Math]::Min(20, $snapshotDirs.Count)) | + ForEach-Object { $_.Name } +} + +$launchStdoutTail = Get-LaunchTail -MonitorDir $monitorDir -FileName "launch.stdout.log" -Tail 50 +$launchStderrTail = Get-LaunchTail -MonitorDir $monitorDir -FileName "launch.stderr.log" -Tail 50 + +$dockerLogsTail = "" +if ($dockerExe -and $containerRunning) { + $dockerLogsTail = Invoke-ExternalCapture -FilePath $dockerExe -Arguments @("logs", "--tail", "100", $containerName) +} + +$procTableText = "" +if ($liveProc) { + $procTableText = ($liveProc | + Select-Object ProcessId, Name, CommandLine | + Format-Table -AutoSize | Out-String -Width 4096).TrimEnd() +} +else { + $procTableText = "No live OSCAR Java process found." 
+} + +$lines = New-Object 'System.Collections.Generic.List[string]' + +Add-Line $lines "OSCAR STATUS REPORT" +Add-Line $lines ("Generated: " + (Get-Date).ToString("o")) +Add-Line $lines ("Base directory: " + $BaseDirectory) +Add-Line $lines ("Monitor directory: " + $(if ($monitorDir) { $monitorDir } else { "" })) +Add-Line $lines ("Output file: " + $outputFile) +Add-Line $lines "" + +Add-Line $lines "=== PROCESS STATUS ===" +Add-Line $lines ("PID from monitor: " + $pidFromMonitor) +Add-Line $lines ("Live OSCAR PID: " + $livePid) +Add-Line $lines "" +Add-Block $lines $procTableText +Add-Line $lines "" +Add-Block $lines $dockerTableText +Add-Line $lines "" + +Add-Line $lines "=== SYSTEM MEMORY AND PAGEFILE ===" +Add-Line $lines "" +Add-Block $lines $osInfoText +Add-Line $lines "" +Add-Block $lines $counterText +Add-Line $lines "" + +Add-Line $lines "=== LIVE JVM PROCESS ===" +Add-Line $lines "" +Add-Block $lines $liveJvmText +Add-Line $lines "" + +Add-Line $lines "=== LIVE JVM JFR STATUS ===" +Add-Block $lines $jfrText +Add-Line $lines "" + +Add-Line $lines "=== LIVE JVM GC HEAP INFO ===" +Add-Block $lines $heapText +Add-Line $lines "" + +Add-Line $lines "=== LIVE JVM NATIVE MEMORY SUMMARY ===" +Add-Block $lines $nmtText +Add-Line $lines "" + +Add-Line $lines "=== LIVE POSTGRES STATUS ===" +Add-Line $lines ("max_connections: " + $maxConnections) +Add-Line $lines ("superuser_reserved_connections: " + $superuserReservedConnections) +Add-Line $lines ("usable_client_slots: " + $usableClientSlots) +Add-Line $lines ("total_sessions: " + $totalSessions) +Add-Line $lines ("active: " + $activeSessions) +Add-Line $lines ("idle: " + $idleSessions) +Add-Line $lines ("idle in transaction: " + $idleInTransaction) +Add-Line $lines "" +Add-Line $lines "--- db-by-state ---" +Add-Block $lines $dbByStateText +Add-Line $lines "" +Add-Line $lines "--- db-by-app ---" +Add-Block $lines $dbByAppText +Add-Line $lines "" +Add-Line $lines "--- db-error ---" +Add-Block $lines $dbErrorText 
+Add-Line $lines "" + +Add-Line $lines "=== SNAPSHOT STATUS ===" +Add-Line $lines ("Snapshot count: " + $snapshotDirs.Count) +Add-Line $lines ("First snapshot: " + $firstSnapshot) +Add-Line $lines ("Latest snapshot: " + $latestSnapshot) +Add-Line $lines "" + +Add-Line $lines "=== RECENT SNAPSHOTS (LAST 20) ===" +foreach ($snap in $recentSnapshotLines) { + Add-Line $lines $snap +} +Add-Line $lines "" + +Add-Line $lines "=== LOG TAILS ===" +Add-Line $lines "--- launch.stdout.log (last 50 lines) ---" +Add-Block $lines $launchStdoutTail +Add-Line $lines "" +Add-Line $lines "--- launch.stderr.log (last 50 lines) ---" +Add-Block $lines $launchStderrTail +Add-Line $lines "" +Add-Line $lines "--- postgres docker logs (last captured 100 lines) ---" +Add-Block $lines $dockerLogsTail +Add-Line $lines "" + +Add-Line $lines "=== QUICK READ ===" +Add-Line $lines ("Live JVM PID: " + $livePid) +Add-Line $lines ("Snapshots captured: " + $snapshotDirs.Count) +if ($totalSessions) { Add-Line $lines ("DB total sessions: " + $totalSessions) } +if ($usableClientSlots) { Add-Line $lines ("DB usable client slots: " + $usableClientSlots) } +Add-Line $lines "Interpretation guide:" +Add-Line $lines "- Healthy memory: process memory and JVM native memory plateau." +Add-Line $lines "- Healthy DB: total sessions rise at startup and then plateau well below usable client slots." +Add-Line $lines "- Suspicious DB: total sessions keep climbing, idle sessions pile up, or db-error shows query failures." 
+ +$reportText = ($lines -join "`r`n") +Set-Content -LiteralPath $outputFile -Value $reportText -Encoding UTF8 +Write-Output $reportText \ No newline at end of file diff --git a/dist/release/check-oscar-status.sh b/dist/release/check-oscar-status.sh new file mode 100755 index 0000000..885bca8 --- /dev/null +++ b/dist/release/check-oscar-status.sh @@ -0,0 +1,225 @@ +#!/bin/bash +set -euo pipefail + +BASE_DIR="${1:-.}" +LATEST_DIR="${2:-}" +OUT_FILE="${3:-}" + +cd "$BASE_DIR" + +if [ -z "$LATEST_DIR" ]; then + LATEST_DIR="$(ls -td oscar-monitor-* 2>/dev/null | head -n 1 || true)" +fi + +if [ -z "$LATEST_DIR" ] || [ ! -d "$LATEST_DIR" ]; then + echo "Error: no oscar-monitor-* directory found." + exit 1 +fi + +if [ -z "$OUT_FILE" ]; then + OUT_FILE="oscar-status-$(date +%Y%m%d-%H%M%S).txt" +fi + +FIRST_SNAP="$(find "$LATEST_DIR" -maxdepth 1 -mindepth 1 -type d | sort | head -n 1 || true)" +LAST_SNAP="$(find "$LATEST_DIR" -maxdepth 1 -mindepth 1 -type d | sort | tail -n 1 || true)" +PID="" +[ -f "$LATEST_DIR/jvm-pid.txt" ] && PID="$(cat "$LATEST_DIR/jvm-pid.txt" 2>/dev/null || true)" +LIVE_PID="$(pgrep -f 'com.botts.impl.security.SensorHubWrapper' | head -n 1 || true)" + +extract_db_metric() { + local file="$1" default="$2" + if [ -f "$file" ]; then + tr -d '[:space:]' < "$file" | tail -n 1 + else + echo "$default" + fi +} + +calc_slots() { + local max="$1" reserved="$2" + if [[ "$max" =~ ^[0-9]+$ ]] && [[ "$reserved" =~ ^[0-9]+$ ]]; then + echo $((max - reserved)) + fi +} + +{ + echo "OSCAR STATUS REPORT" + echo "Generated: $(date -Is)" + echo "Base directory: $(pwd)" + echo "Monitor directory: $LATEST_DIR" + echo "Output file: $OUT_FILE" + echo + + echo "=== PROCESS STATUS ===" + echo "PID from monitor: ${PID:-}" + echo "Live OSCAR PID: ${LIVE_PID:-}" + echo + pgrep -af monitor-oscar.sh || true + pgrep -af 'com.botts.impl.security.SensorHubWrapper' || true + echo + docker ps --filter name=oscar-postgis-container || true + echo + + echo "=== SYSTEM MEMORY ===" + free 
-h || true + echo + echo "--- vmstat (5 samples) ---" + vmstat 1 5 || true + echo + + if [ -n "${LIVE_PID:-}" ] && [ -r "/proc/$LIVE_PID/status" ]; then + echo "=== LIVE JVM /proc STATUS ===" + grep -E 'Name|State|VmSize|VmRSS|VmSwap|Threads' "/proc/$LIVE_PID/status" || true + echo + fi + + if [ -n "${LIVE_PID:-}" ] && [ -r "/proc/$LIVE_PID/smaps_rollup" ]; then + echo "=== LIVE JVM SMAPS ROLLUP ===" + cat "/proc/$LIVE_PID/smaps_rollup" || true + echo + fi + + if [ -n "${LIVE_PID:-}" ] && command -v jcmd >/dev/null 2>&1; then + echo "=== LIVE JVM JFR STATUS ===" + jcmd "$LIVE_PID" JFR.check || true + echo + echo "=== LIVE JVM GC HEAP INFO ===" + jcmd "$LIVE_PID" GC.heap_info || true + echo + echo "=== LIVE JVM NATIVE MEMORY SUMMARY ===" + jcmd "$LIVE_PID" VM.native_memory summary || true + echo + fi + + echo "=== LIVE POSTGRES STATUS ===" + if docker ps --format '{{.Names}}' | grep -Eq '^oscar-postgis-container$'; then + if [ -f "$LAST_SNAP/db-max-connections.txt" ]; then + echo "max_connections: $(extract_db_metric "$LAST_SNAP/db-max-connections.txt" n/a)" + fi + if [ -f "$LAST_SNAP/db-superuser-reserved-connections.txt" ]; then + echo "superuser_reserved_connections: $(extract_db_metric "$LAST_SNAP/db-superuser-reserved-connections.txt" n/a)" + fi + if [ -f "$LAST_SNAP/db-total-sessions.txt" ]; then + echo "total_sessions: $(extract_db_metric "$LAST_SNAP/db-total-sessions.txt" n/a)" + fi + if [ -f "$LAST_SNAP/db-by-state.txt" ]; then + echo + echo "--- db-by-state ---" + cat "$LAST_SNAP/db-by-state.txt" || true + fi + if [ -f "$LAST_SNAP/db-by-app.txt" ]; then + echo + echo "--- db-by-app ---" + cat "$LAST_SNAP/db-by-app.txt" || true + fi + if [ -f "$LAST_SNAP/db-activity-detail.txt" ]; then + echo + echo "--- db-activity-detail (first 40 lines) ---" + head -n 40 "$LAST_SNAP/db-activity-detail.txt" || true + fi + if [ -f "$LAST_SNAP/db-error.txt" ]; then + echo + echo "--- db-error ---" + cat "$LAST_SNAP/db-error.txt" || true + fi + else + echo "Postgres 
container is not running." + fi + echo + + echo "=== FIRST SNAPSHOT SUMMARY ===" + echo "First snapshot: ${FIRST_SNAP:-}" + if [ -n "${FIRST_SNAP:-}" ] && [ -f "$FIRST_SNAP/proc-status.txt" ]; then + grep -E 'VmRSS|VmSwap|Threads' "$FIRST_SNAP/proc-status.txt" || true + fi + if [ -n "${FIRST_SNAP:-}" ] && [ -f "$FIRST_SNAP/nmt-summary.txt" ]; then + grep '^Total:' "$FIRST_SNAP/nmt-summary.txt" || true + fi + if [ -n "${FIRST_SNAP:-}" ] && [ -f "$FIRST_SNAP/db-total-sessions.txt" ]; then + echo "db total sessions: $(extract_db_metric "$FIRST_SNAP/db-total-sessions.txt" n/a)" + fi + echo + + echo "=== LATEST SNAPSHOT SUMMARY ===" + echo "Latest snapshot: ${LAST_SNAP:-}" + if [ -n "${LAST_SNAP:-}" ] && [ -f "$LAST_SNAP/proc-status.txt" ]; then + grep -E 'VmRSS|VmSwap|Threads' "$LAST_SNAP/proc-status.txt" || true + fi + if [ -n "${LAST_SNAP:-}" ] && [ -f "$LAST_SNAP/nmt-summary.txt" ]; then + grep '^Total:' "$LAST_SNAP/nmt-summary.txt" || true + fi + if [ -n "${LAST_SNAP:-}" ] && [ -f "$LAST_SNAP/db-total-sessions.txt" ]; then + echo "db total sessions: $(extract_db_metric "$LAST_SNAP/db-total-sessions.txt" n/a)" + fi + echo + + echo "=== RECENT TREND (LAST 20 SNAPSHOTS) ===" + for d in $(find "$LATEST_DIR" -maxdepth 1 -mindepth 1 -type d | sort | tail -n 20); do + printf "%s " "$(basename "$d")" + [ -f "$d/proc-status.txt" ] && grep -E 'VmRSS|VmSwap|Threads' "$d/proc-status.txt" | tr '\n' ' ' + [ -f "$d/nmt-summary.txt" ] && grep '^Total:' "$d/nmt-summary.txt" | tr '\n' ' ' + if [ -f "$d/db-total-sessions.txt" ]; then + printf "db_total=%s " "$(extract_db_metric "$d/db-total-sessions.txt" n/a)" + fi + if [ -f "$d/db-by-state.txt" ]; then + printf "db_active=%s " "$(awk -F'|' '$1=="active" {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + printf "db_idle=%s " "$(awk -F'|' '$1=="idle" {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + printf "db_idle_tx=%s " "$(awk -F'|' '$1=="idle in transaction" {gsub(/^[ 
\t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + fi + if [ -f "$d/db-error.txt" ] && [ -s "$d/db-error.txt" ]; then + printf "db_error=yes " + fi + echo + done + echo + + if [ -f "$LATEST_DIR/db-connection-trend.csv" ]; then + echo "=== DB CONNECTION TREND CSV (LAST 40 LINES) ===" + tail -n 40 "$LATEST_DIR/db-connection-trend.csv" || true + echo + fi + + echo "=== LOG TAILS ===" + [ -f "$LATEST_DIR/launch.stdout.log" ] && { echo '--- launch.stdout.log (last 50 lines) ---'; tail -n 50 "$LATEST_DIR/launch.stdout.log"; echo; } + [ -f "$LATEST_DIR/launch.stderr.log" ] && { echo '--- launch.stderr.log (last 50 lines) ---'; tail -n 50 "$LATEST_DIR/launch.stderr.log"; echo; } + [ -f "$LAST_SNAP/docker-logs-tail.txt" ] && { echo '--- postgres docker logs (last captured 100 lines) ---'; tail -n 100 "$LAST_SNAP/docker-logs-tail.txt"; echo; } + + echo "=== QUICK READ ===" + FIRST_RSS=""; LAST_RSS=""; FIRST_SWAP=""; LAST_SWAP=""; FIRST_THREADS=""; LAST_THREADS="" + FIRST_DB_TOTAL=""; LAST_DB_TOTAL=""; FIRST_MAX=""; LAST_MAX=""; FIRST_RESERVED=""; LAST_RESERVED="" + + if [ -n "${FIRST_SNAP:-}" ] && [ -f "$FIRST_SNAP/proc-status.txt" ]; then + FIRST_RSS="$(grep '^VmRSS:' "$FIRST_SNAP/proc-status.txt" | awk '{print $2 " " $3}' || true)" + FIRST_SWAP="$(grep '^VmSwap:' "$FIRST_SNAP/proc-status.txt" | awk '{print $2 " " $3}' || true)" + FIRST_THREADS="$(grep '^Threads:' "$FIRST_SNAP/proc-status.txt" | awk '{print $2}' || true)" + fi + if [ -n "${LAST_SNAP:-}" ] && [ -f "$LAST_SNAP/proc-status.txt" ]; then + LAST_RSS="$(grep '^VmRSS:' "$LAST_SNAP/proc-status.txt" | awk '{print $2 " " $3}' || true)" + LAST_SWAP="$(grep '^VmSwap:' "$LAST_SNAP/proc-status.txt" | awk '{print $2 " " $3}' || true)" + LAST_THREADS="$(grep '^Threads:' "$LAST_SNAP/proc-status.txt" | awk '{print $2}' || true)" + fi + [ -n "${FIRST_SNAP:-}" ] && FIRST_DB_TOTAL="$(extract_db_metric "$FIRST_SNAP/db-total-sessions.txt" n/a)" + [ -n "${LAST_SNAP:-}" ] && LAST_DB_TOTAL="$(extract_db_metric 
"$LAST_SNAP/db-total-sessions.txt" n/a)" + [ -n "${FIRST_SNAP:-}" ] && FIRST_MAX="$(extract_db_metric "$FIRST_SNAP/db-max-connections.txt" n/a)" + [ -n "${LAST_SNAP:-}" ] && LAST_MAX="$(extract_db_metric "$LAST_SNAP/db-max-connections.txt" n/a)" + [ -n "${FIRST_SNAP:-}" ] && FIRST_RESERVED="$(extract_db_metric "$FIRST_SNAP/db-superuser-reserved-connections.txt" n/a)" + [ -n "${LAST_SNAP:-}" ] && LAST_RESERVED="$(extract_db_metric "$LAST_SNAP/db-superuser-reserved-connections.txt" n/a)" + + echo "First RSS: ${FIRST_RSS:-n/a}" + echo "Latest RSS: ${LAST_RSS:-n/a}" + echo "First VmSwap: ${FIRST_SWAP:-n/a}" + echo "Latest VmSwap: ${LAST_SWAP:-n/a}" + echo "First Threads: ${FIRST_THREADS:-n/a}" + echo "Latest Threads:${LAST_THREADS:-n/a}" + echo "First DB total sessions: ${FIRST_DB_TOTAL:-n/a}" + echo "Latest DB total sessions: ${LAST_DB_TOTAL:-n/a}" + echo "First DB usable client slots: $(calc_slots "$FIRST_MAX" "$FIRST_RESERVED")" + echo "Latest DB usable client slots: $(calc_slots "$LAST_MAX" "$LAST_RESERVED")" + echo + echo "Interpretation guide:" + echo "- Healthy memory: RSS, VmSwap, and thread count rise at startup and then flatten." + echo "- Healthy DB: total sessions rise at startup and then plateau well below usable client slots." + echo "- Suspicious DB: total sessions keep climbing, idle sessions pile up, or db_error shows too many clients already." 
+} > "$OUT_FILE" + +echo "Wrote report to: $OUT_FILE" diff --git a/dist/release/env.template b/dist/release/env.template new file mode 100644 index 0000000..6d3a356 --- /dev/null +++ b/dist/release/env.template @@ -0,0 +1,55 @@ +# --- SYSTEM PROFILE --- +# Options: RPI4, 8GB, 16GB, 32GB +SYSTEM_PROFILE=16GB + +# --- DATABASE SETTINGS --- +DB_NAME=gis +DB_USER=postgres +DB_PASSWORD=postgres +DB_PORT=5432 +DB_HOST=localhost +CONTAINER_NAME=oscar-postgis-container + +# --- SECURITY --- +# Replace before production use +KEYSTORE_PASSWORD=atakatak +TRUSTSTORE_PASSWORD=changeit + +# Optional: +# If set, launch scripts can use this instead of profile defaults/helper files. +# INITIAL_ADMIN_PASSWORD=admin + +# --- PROCESS HANDLING --- +# 0 = refuse to start if OSCAR is already running +# 1 = stop the running OSCAR instance and start fresh +FORCE_RESTART=0 + +# --- MONITOR BEHAVIOR --- +# 0 = refuse to attach if OSCAR is already running +# 1 = attach monitoring to an already running OSCAR instance +ATTACH_TO_EXISTING=0 + +# Maximum time to wait for OSCAR JVM startup in monitor scripts +MAX_WAIT_SECONDS=300 + +# --- POSTGIS STARTUP / READINESS --- +# Number of readiness retries for PostGIS startup checks +RETRY_MAX=120 + +# Seconds between readiness retries +RETRY_INTERVAL=2 + +# Extra delay after PostGIS reports ready +POSTGIS_READY_DELAY=60 + +# --- OPTIONAL MEMORY / DIAGNOSTICS OVERRIDES --- +# Leave blank to use profile defaults from launch scripts +JAVACPP_MAX_BYTES= +JAVACPP_MAX_PHYSICAL_BYTES= +JFR_FILENAME= + +# --- OPTIONAL ARM BUILD OVERRIDES --- +# Only needed when using ARM-specific launch/build paths +# POSTGIS_IMAGE_NAME=oscar-postgis-arm +# POSTGIS_DOCKERFILE=Dockerfile-arm64 +# POSTGIS_PLATFORM=linux/arm64 \ No newline at end of file diff --git a/dist/release/launch-all-arm.sh b/dist/release/launch-all-arm.sh index 96c1cc5..cacf672 100755 --- a/dist/release/launch-all-arm.sh +++ b/dist/release/launch-all-arm.sh @@ -1,76 +1,278 @@ -#!/bin/bash - 
-HOST=localhost -DB_NAME=gis -DB_USER=postgres -RETRY_MAX=20 -RETRY_INTERVAL=5 -PROJECT_DIR="$(pwd)" # Store the original directory -CONTAINER_NAME=oscar-postgis-container - -#sudo docker rm -f "$CONTAINER_NAME" 2>/dev/null || true - -# Create pgdata directory if needed -if [ ! -d "${PROJECT_DIR}/pgdata" ]; then - echo "Creating pgdata folder..." - mkdir -p "${PROJECT_DIR}/pgdata" -fi +#!/usr/bin/env bash +set -euo pipefail + +SOURCE="${BASH_SOURCE[0]}" +while [ -h "$SOURCE" ]; do + SOURCE_DIR="$(CDPATH= cd -P "$(dirname "$SOURCE")" >/dev/null 2>&1 && pwd)" + SOURCE="$(readlink "$SOURCE")" + case "$SOURCE" in + /*) ;; + *) SOURCE="${SOURCE_DIR}/${SOURCE}" ;; + esac +done +PROJECT_DIR="$(CDPATH= cd -P "$(dirname "$SOURCE")" >/dev/null 2>&1 && pwd)" +ENV_FILE="$PROJECT_DIR/.env" +MATCH_EXPR='com.botts.impl.security.SensorHubWrapper' +FORCE_RESTART="${FORCE_RESTART:-0}" +RETRY_MAX="${RETRY_MAX:-120}" +RETRY_INTERVAL="${RETRY_INTERVAL:-2}" +POSTGIS_READY_DELAY="${POSTGIS_READY_DELAY:-5}" +IMAGE_NAME="${POSTGIS_IMAGE_NAME:-${IMAGE_NAME:-oscar-postgis-arm}}" +POSTGIS_DOCKERFILE="${POSTGIS_DOCKERFILE:-Dockerfile-arm64}" +POSTGIS_PLATFORM="${POSTGIS_PLATFORM:-linux/arm64}" + +load_env() { + local env_file="$1" + while IFS= read -r line || [ -n "$line" ]; do + case "$line" in + ""|"#"*) continue ;; + export\ *) line="${line#export }" ;; + esac + local name="${line%%=*}" + local value="${line#*=}" + value="${value%$'\r'}" + export "${name}=${value}" + done < "$env_file" +} + +require_cmd() { + local cmd="$1" + if ! 
command -v "$cmd" >/dev/null 2>&1; then + echo "Error: required command not found: $cmd" + exit 1 + fi +} + +find_existing_oscar_pids() { + pgrep -f "$MATCH_EXPR" || true +} + +stop_existing_oscar() { + local pids="$1" + if [ -z "$pids" ]; then + return 0 + fi + + echo "Stopping existing OSCAR instance(s): $pids" + kill $pids 2>/dev/null || true + + local waited=0 + while [ "$waited" -lt 15 ]; do + sleep 1 + waited=$((waited + 1)) + if [ -z "$(find_existing_oscar_pids)" ]; then + return 0 + fi + done -# Check Docker -if ! command -v docker >/dev/null 2>&1; then - echo "Error: Docker is not installed. Please install Docker first." + echo "Existing OSCAR instance still running after graceful stop. Forcing stop." + kill -9 $pids 2>/dev/null || true + sleep 1 + + if [ -n "$(find_existing_oscar_pids)" ]; then + echo "Error: unable to stop the existing OSCAR instance." + exit 1 + fi +} + +check_existing_oscar() { + local pids + pids="$(find_existing_oscar_pids)" + + if [ -z "$pids" ]; then + return 0 + fi + + if [ "$FORCE_RESTART" = "1" ]; then + echo "Existing OSCAR instance found with PID(s): $pids. Replacing because FORCE_RESTART=1." + stop_existing_oscar "$pids" + return 0 + fi + + echo "OSCAR is already running with PID(s): $pids." + echo "Stop the running instance first, or set FORCE_RESTART=1 to replace it." exit 1 -fi +} -echo "Building PostGIS Docker image..." +require_number() { + local name="$1" + local value="${!name:-}" + case "$value" in + ''|*[!0-9]*) + echo "Error: ${name} must be a number, got '${value}'." + exit 1 + ;; + esac +} -cd postgis || { echo "Error: postgis directory not found"; exit 1; } +ensure_project_layout() { + if [ ! -d "$PROJECT_DIR/postgis" ]; then + echo "Error: postgis directory not found in $PROJECT_DIR" + exit 1 + fi -# Build PostGIS -sudo docker build . \ - --file=Dockerfile-arm64 \ - --tag=oscar-postgis-arm + if [ ! 
-f "$PROJECT_DIR/postgis/$POSTGIS_DOCKERFILE" ]; then + echo "Error: $POSTGIS_DOCKERFILE not found in $PROJECT_DIR/postgis" + exit 1 + fi -echo "Starting PostGIS container..." -echo "PROJECT_DIR is set to: ${PROJECT_DIR}" + if [ ! -d "$PROJECT_DIR/osh-node-oscar" ]; then + echo "Error: osh-node-oscar directory not found in $PROJECT_DIR" + exit 1 + fi -if docker ps -a --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then - # The container exists - if docker ps --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then - echo "Container already running: ${CONTAINER_NAME}" - else - echo "Starting existing container: ${CONTAINER_NAME}" - docker start "${CONTAINER_NAME}" + if [ ! -f "$PROJECT_DIR/osh-node-oscar/launch.sh" ]; then + echo "Error: launch.sh not found in $PROJECT_DIR/osh-node-oscar" + exit 1 fi + + mkdir -p "$PROJECT_DIR/pgdata" +} + +if [ -f "$ENV_FILE" ]; then + load_env "$ENV_FILE" else - echo "Creating new container: ${CONTAINER_NAME}" - docker run \ - --name $CONTAINER_NAME \ - -e POSTGRES_DB=$DB_NAME \ - -e POSTGRES_USER=$DB_USER \ - -e POSTGRES_PASSWORD=postgres \ - -e DATADIR=/var/lib/postgresql/data \ - -p 5432:5432 \ - -v "$(pwd)/pgdata:/var/lib/postgresql/data" \ - -d \ - oscar-postgis-arm || { echo "Failed to start PostGIS container"; exit 1; } + echo "Warning: .env file not found in $PROJECT_DIR" + if [ -f "$PROJECT_DIR/env.template" ]; then + echo "Warning: using built-in defaults. Copy env.template to .env to customize settings." + else + echo "Warning: using built-in defaults." 
+ fi +fi + +SYSTEM_PROFILE="${SYSTEM_PROFILE:-8GB}" +CONTAINER_NAME="${CONTAINER_NAME:-oscar-postgis-container}" +DB_NAME="${DB_NAME:-gis}" +DB_USER="${DB_USER:-postgres}" +DB_PASSWORD="${DB_PASSWORD:-postgres}" +DB_PORT="${DB_PORT:-5432}" +DB_HOST="${DB_HOST:-localhost}" +# Re-resolve the image name so a POSTGIS_IMAGE_NAME set in .env takes effect +IMAGE_NAME="${POSTGIS_IMAGE_NAME:-$IMAGE_NAME}" +export SYSTEM_PROFILE CONTAINER_NAME DB_NAME DB_USER DB_PASSWORD DB_PORT DB_HOST +export RETRY_MAX RETRY_INTERVAL POSTGIS_READY_DELAY IMAGE_NAME POSTGIS_DOCKERFILE POSTGIS_PLATFORM + +require_cmd docker +if ! docker info >/dev/null 2>&1; then + echo "Error: Docker is installed, but the Docker daemon is not running." + exit 1 fi -# Wait for PostgreSQL/PostGIS to become ready -echo "Waiting for PostGIS ARM64 (PostgreSQL) to be ready..." +check_existing_oscar +ensure_project_layout +require_number DB_PORT +require_number RETRY_MAX +require_number RETRY_INTERVAL +require_number POSTGIS_READY_DELAY + +# Upper-case with tr so the script also runs on bash 3.2 (the macOS default), which lacks ${VAR^^} +case "$(printf '%s' "$SYSTEM_PROFILE" | tr '[:lower:]' '[:upper:]')" in + RPI4) + SYSTEM_PROFILE="RPI4" + PG_SHARED="256MB" + PG_CACHE="1GB" + PG_WORK_MEM="2MB" + PG_MAINT="64MB" + PG_MAX_CONN="75" + ;; + 8GB) + SYSTEM_PROFILE="8GB" + PG_SHARED="512MB" + PG_CACHE="2GB" + PG_WORK_MEM="4MB" + PG_MAINT="128MB" + PG_MAX_CONN="125" + ;; + 16GB) + SYSTEM_PROFILE="16GB" + PG_SHARED="1GB" + PG_CACHE="4GB" + PG_WORK_MEM="8MB" + PG_MAINT="256MB" + PG_MAX_CONN="200" + ;; + 32GB) + SYSTEM_PROFILE="32GB" + PG_SHARED="2GB" + PG_CACHE="8GB" + PG_WORK_MEM="16MB" + PG_MAINT="512MB" + PG_MAX_CONN="300" + ;; + *) + echo "Unknown profile '${SYSTEM_PROFILE}', using 8GB defaults." + SYSTEM_PROFILE="8GB" + PG_SHARED="512MB" + PG_CACHE="2GB" + PG_WORK_MEM="4MB" + PG_MAINT="128MB" + PG_MAX_CONN="125" + ;; +esac + +echo "Building PostGIS Docker image for Apple Silicon / ARM64..." 
+( + cd "$PROJECT_DIR/postgis" + if [ -n "$POSTGIS_PLATFORM" ]; then + docker build --platform "$POSTGIS_PLATFORM" . --file="$POSTGIS_DOCKERFILE" --tag="$IMAGE_NAME" + else + docker build . 
--file="$POSTGIS_DOCKERFILE" --tag="$IMAGE_NAME" + fi +) + +echo "Preparing PostGIS container for profile: $SYSTEM_PROFILE" +echo " Image: $IMAGE_NAME" +echo " Dockerfile: $POSTGIS_DOCKERFILE" +echo " Port: ${DB_PORT}:5432" +echo " Data: $PROJECT_DIR/pgdata" + +if docker container inspect "$CONTAINER_NAME" >/dev/null 2>&1; then + echo "Removing existing container '$CONTAINER_NAME' so updated settings take effect..." + docker rm -f "$CONTAINER_NAME" >/dev/null +fi -RETRY_COUNT=0 -export PGPASSWORD=postgres # Needed for pg_isready with password +echo "Creating new PostGIS container..." +docker run \ + --name "$CONTAINER_NAME" \ + -e POSTGRES_DB="$DB_NAME" \ + -e POSTGRES_USER="$DB_USER" \ + -e POSTGRES_PASSWORD="$DB_PASSWORD" \ + -e DATADIR=/var/lib/postgresql/data \ + -p "${DB_PORT}:5432" \ + -v "$PROJECT_DIR/pgdata:/var/lib/postgresql/data" \ + -d \ + "$IMAGE_NAME" \ + -c shared_buffers="$PG_SHARED" \ + -c effective_cache_size="$PG_CACHE" \ + -c work_mem="$PG_WORK_MEM" \ + -c maintenance_work_mem="$PG_MAINT" \ + -c max_connections="$PG_MAX_CONN" \ + -c superuser_reserved_connections=10 \ + -c idle_session_timeout=600000 \ + -c log_connections=on \ + -c log_disconnections=on \ + -c wal_buffers=16MB \ + -c random_page_cost=1.1 \ + -c effective_io_concurrency=200 -until docker exec "$CONTAINER_NAME" pg_isready -U "$DB_USER" -d "$DB_NAME" > /dev/null 2>&1; do - echo "PostGIS not ready yet, retrying..." - sleep "${RETRY_INTERVAL}" +echo "Waiting for PostGIS ARM64 to be ready..." +export PGPASSWORD="$DB_PASSWORD" +retry_count=0 +until docker exec "$CONTAINER_NAME" pg_isready -U "$DB_USER" -d "$DB_NAME" >/dev/null 2>&1; do + retry_count=$((retry_count + 1)) + if [ "$retry_count" -ge "$RETRY_MAX" ]; then + echo "Error: PostGIS did not become ready after $((RETRY_MAX * RETRY_INTERVAL)) seconds." + echo "Last container logs:" + docker logs --tail 50 "$CONTAINER_NAME" || true + exit 1 + fi + echo "PostGIS not ready yet, retrying..." 
+ sleep "$RETRY_INTERVAL" done -echo "PostGIS (PostgreSQL) is ready! Please wait for OpenSensorHub to start..." +echo "PostGIS is ready. Starting OpenSensorHub..." +sleep "$POSTGIS_READY_DELAY" -sleep 10 +cd "$PROJECT_DIR/osh-node-oscar" +if [ ! -x ./launch.sh ]; then + chmod +x ./launch.sh +fi -# Launch osh-node-oscar -cd "$PROJECT_DIR/osh-node-oscar" || { echo "Error: osh-node-oscar not found"; exit 1; } -./launch.sh \ No newline at end of file +exec ./launch.sh diff --git a/dist/release/launch-all.bat b/dist/release/launch-all.bat index 5c93081..4b1abd5 100755 --- a/dist/release/launch-all.bat +++ b/dist/release/launch-all.bat @@ -1,117 +1,226 @@ -@echo off -setlocal enabledelayedexpansion - -REM ==== CONFIG ==== -set HOST=localhost -set PORT=5432 -set DB_NAME=gis -set USER=postgres -set RETRY_MAX=20 -set RETRY_INTERVAL=5 -set PROJECT_DIR=%cd% -set CONTAINER_NAME=oscar-postgis-container -set IMAGE_NAME=oscar-postgis - -echo PROJECT_DIR is: %PROJECT_DIR% - -where docker >nul 2>&1 -if %errorlevel% neq 0 ( - echo ERROR: Docker is not installed or not in PATH. - exit /b 1 -) - -if not exist "%PROJECT_DIR%\pgdata" ( - echo Creating pgdata directory... - mkdir "%PROJECT_DIR%\pgdata" -) - -echo Building PostGIS Docker image... -pushd postgis -docker build . -f Dockerfile -t %IMAGE_NAME% -if %errorlevel% neq 0 ( - echo ERROR: Docker build failed. - exit /b 1 -) -popd - -echo Starting PostGIS container... 
- -for /f "tokens=*" %%i in ('docker ps -a --format "{{.Names}}"') do ( - if "%%i"=="%CONTAINER_NAME%" ( - set CONTAINER_EXISTS=1 - ) -) - -for /f "tokens=*" %%i in ('docker ps --format "{{.Names}}"') do ( - if "%%i"=="%CONTAINER_NAME%" ( - set CONTAINER_RUNNING=1 - ) -) - -if defined CONTAINER_EXISTS ( - if defined CONTAINER_RUNNING ( - echo Container already running: %CONTAINER_NAME% - ) else ( - echo Starting existing container: %CONTAINER_NAME% - docker start %CONTAINER_NAME% - ) -) else ( - echo Creating new container: %CONTAINER_NAME% - docker run ^ - --name %CONTAINER_NAME% ^ - -e POSTGRES_DB=%DB_NAME% ^ - -e POSTGRES_USER=%USER% ^ - -e POSTGRES_PASSWORD=postgres ^ - -p %PORT%:5432 ^ - -v "%PROJECT_DIR%\pgdata:/var/lib/postgresql/data" ^ - -d ^ - %IMAGE_NAME% - - if %errorlevel% neq 0 ( - echo ERROR: Failed to start PostGIS container. - exit /b 1 - ) -) - -echo Waiting for PostGIS database to become ready... - -set RETRY_COUNT=0 - -:wait_loop -docker exec %CONTAINER_NAME% pg_isready -U %USER% -d %DB_NAME% >nul 2>&1 -if %errorlevel% equ 0 ( - echo Received OK from PostGIS. Please wait for initialization... - goto after_wait -) - -echo PostGIS not ready yet, retrying... -set /a RETRY_COUNT+=1 - -if %RETRY_COUNT% geq %RETRY_MAX% ( - echo ERROR: PostGIS did not become ready in time. - exit /b 1 -) - -timeout /t %RETRY_INTERVAL% >nul -goto wait_loop - -:after_wait - -timeout /t 10 >nul - -echo PostGIS database is ready! - -cd "%PROJECT_DIR%\osh-node-oscar" -if %errorlevel% neq 0 ( - echo ERROR: osh-node-oscar directory not found. - exit /b 1 -) - -if exist launch.bat ( - call launch.bat -) else ( - echo WARNING: launch.bat not found. Trying launch.sh through Git Bash... 
- bash launch.sh -) - -endlocal +@echo off +setlocal EnableExtensions EnableDelayedExpansion + +set "SCRIPT_DIR=%~dp0" +if "%SCRIPT_DIR:~-1%"=="\" set "SCRIPT_DIR=%SCRIPT_DIR:~0,-1%" + +set "NODE_DIR=%SCRIPT_DIR%\osh-node-oscar" +set "POSTGIS_DIR=%SCRIPT_DIR%\postgis" +set "ENV_FILE=%SCRIPT_DIR%\.env" + +if exist "%ENV_FILE%" call :load_env "%ENV_FILE%" + +if not defined SYSTEM_PROFILE set "SYSTEM_PROFILE=8GB" +if not defined DB_NAME set "DB_NAME=gis" +if not defined DB_USER set "DB_USER=postgres" +if not defined DB_PASSWORD set "DB_PASSWORD=postgres" +if not defined DB_PORT set "DB_PORT=5432" +if not defined CONTAINER_NAME set "CONTAINER_NAME=oscar-postgis-container" +if not defined POSTGIS_IMAGE_NAME set "POSTGIS_IMAGE_NAME=oscar-postgis" +if not defined POSTGIS_DOCKERFILE set "POSTGIS_DOCKERFILE=Dockerfile" +if not defined FORCE_RESTART set "FORCE_RESTART=0" +if not defined RETRY_MAX set "RETRY_MAX=120" +if not defined RETRY_INTERVAL set "RETRY_INTERVAL=2" +if not defined POSTGIS_READY_DELAY set "POSTGIS_READY_DELAY=5" + +set "PGDATA_DIR=%SCRIPT_DIR%\pgdata" + +if not exist "%POSTGIS_DIR%" ( + echo ERROR: Missing PostGIS directory: "%POSTGIS_DIR%" + exit /b 1 +) + +if not exist "%POSTGIS_DIR%\%POSTGIS_DOCKERFILE%" ( + echo ERROR: Missing PostGIS Dockerfile: "%POSTGIS_DIR%\%POSTGIS_DOCKERFILE%" + exit /b 1 +) + +if not exist "%POSTGIS_DIR%\init-extensions.sql" ( + echo ERROR: Missing PostGIS init script: "%POSTGIS_DIR%\init-extensions.sql" + exit /b 1 +) + +if not exist "%NODE_DIR%\launch.bat" ( + echo ERROR: Missing node launcher: "%NODE_DIR%\launch.bat" + exit /b 1 +) + +where docker >nul 2>nul +if errorlevel 1 ( + echo ERROR: Docker was not found in PATH. + exit /b 1 +) + +docker version >nul 2>nul +if errorlevel 1 ( + echo ERROR: Docker is installed but not responding. + exit /b 1 +) + +where java >nul 2>nul +if errorlevel 1 ( + echo ERROR: Java was not found in PATH. 
+ exit /b 1 +) + +call :check_existing_oscar +if defined OSCAR_PID ( + if /I "%FORCE_RESTART%"=="1" ( + echo OSCAR is already running with PID !OSCAR_PID!. FORCE_RESTART=1, stopping it first... + call :stop_existing_oscar + call :wait_for_oscar_stop 60 + call :check_existing_oscar + if defined OSCAR_PID ( + echo ERROR: OSCAR is still running with PID !OSCAR_PID! after stop attempt. + exit /b 1 + ) + ) else ( + echo ERROR: OSCAR is already running with PID !OSCAR_PID!. + echo Set FORCE_RESTART=1 in .env to replace the running instance. + exit /b 1 + ) +) + +if /I "%SYSTEM_PROFILE%"=="RPI4" ( + set "PG_MAX_CONNECTIONS=75" + set "PG_SHARED_BUFFERS=256MB" + set "PG_EFFECTIVE_CACHE_SIZE=1024MB" + set "PG_WORK_MEM=2MB" + set "PG_MAINTENANCE_WORK_MEM=64MB" +) else if /I "%SYSTEM_PROFILE%"=="8GB" ( + set "PG_MAX_CONNECTIONS=125" + set "PG_SHARED_BUFFERS=1024MB" + set "PG_EFFECTIVE_CACHE_SIZE=3072MB" + set "PG_WORK_MEM=4MB" + set "PG_MAINTENANCE_WORK_MEM=128MB" +) else if /I "%SYSTEM_PROFILE%"=="16GB" ( + set "PG_MAX_CONNECTIONS=200" + set "PG_SHARED_BUFFERS=2048MB" + set "PG_EFFECTIVE_CACHE_SIZE=6144MB" + set "PG_WORK_MEM=4MB" + set "PG_MAINTENANCE_WORK_MEM=256MB" +) else if /I "%SYSTEM_PROFILE%"=="32GB" ( + set "PG_MAX_CONNECTIONS=300" + set "PG_SHARED_BUFFERS=4096MB" + set "PG_EFFECTIVE_CACHE_SIZE=12288MB" + set "PG_WORK_MEM=8MB" + set "PG_MAINTENANCE_WORK_MEM=512MB" +) else ( + echo WARNING: Unknown SYSTEM_PROFILE "%SYSTEM_PROFILE%". Using 8GB defaults. + set "PG_MAX_CONNECTIONS=125" + set "PG_SHARED_BUFFERS=1024MB" + set "PG_EFFECTIVE_CACHE_SIZE=3072MB" + set "PG_WORK_MEM=4MB" + set "PG_MAINTENANCE_WORK_MEM=128MB" +) + +if not exist "%PGDATA_DIR%" mkdir "%PGDATA_DIR%" + +echo Building PostGIS Docker image... +docker build -t "%POSTGIS_IMAGE_NAME%" -f "%POSTGIS_DIR%\%POSTGIS_DOCKERFILE%" "%POSTGIS_DIR%" +if errorlevel 1 ( + echo ERROR: Failed to build PostGIS Docker image. 
+ exit /b 1 +) + +echo Preparing PostGIS container for profile: %SYSTEM_PROFILE% +echo Image: %POSTGIS_IMAGE_NAME% +echo Port: %DB_PORT%:5432 +echo Data: %PGDATA_DIR% + +docker ps -a --format "{{.Names}}" | findstr /I /X "%CONTAINER_NAME%" >nul +if not errorlevel 1 ( + echo Removing existing container "%CONTAINER_NAME%" so updated settings take effect... + docker rm -f "%CONTAINER_NAME%" >nul 2>nul +) + +echo Creating new container... +docker run -d ^ + --name "%CONTAINER_NAME%" ^ + -p %DB_PORT%:5432 ^ + -e POSTGRES_DB=%DB_NAME% ^ + -e POSTGRES_USER=%DB_USER% ^ + -e POSTGRES_PASSWORD=%DB_PASSWORD% ^ + -v "%PGDATA_DIR%:/var/lib/postgresql/data" ^ + "%POSTGIS_IMAGE_NAME%" ^ + -c max_connections=%PG_MAX_CONNECTIONS% ^ + -c superuser_reserved_connections=10 ^ + -c shared_buffers=%PG_SHARED_BUFFERS% ^ + -c effective_cache_size=%PG_EFFECTIVE_CACHE_SIZE% ^ + -c work_mem=%PG_WORK_MEM% ^ + -c maintenance_work_mem=%PG_MAINTENANCE_WORK_MEM% ^ + -c idle_session_timeout=600000 ^ + -c log_connections=on ^ + -c log_disconnections=on +if errorlevel 1 ( + echo ERROR: Failed to start PostGIS container. + exit /b 1 +) + +echo Waiting for PostGIS to be ready... +set /a WAIT_COUNT=0 + +:wait_for_postgis +docker exec "%CONTAINER_NAME%" pg_isready -U "%DB_USER%" -d "%DB_NAME%" >nul 2>nul +if not errorlevel 1 goto postgis_ready + +set /a WAIT_COUNT+=1 +if !WAIT_COUNT! GEQ %RETRY_MAX% ( + echo ERROR: PostGIS did not become ready in time. + docker logs "%CONTAINER_NAME%" + exit /b 1 +) + +timeout /t %RETRY_INTERVAL% /nobreak >nul +goto wait_for_postgis + +:postgis_ready +echo PostGIS is ready. 
+if %POSTGIS_READY_DELAY% GTR 0 timeout /t %POSTGIS_READY_DELAY% /nobreak >nul + +pushd "%NODE_DIR%" +call launch.bat +set "NODE_EXIT=%ERRORLEVEL%" +popd + +endlocal & exit /b %NODE_EXIT% + +:check_existing_oscar +set "OSCAR_PID=" +for /f "usebackq delims=" %%P in (` + powershell -NoProfile -ExecutionPolicy Bypass -Command "$procs = Get-CimInstance Win32_Process; foreach ($proc in $procs) { if ($proc.Name -match '^(java|javaw)(\.exe)?$' -and $null -ne $proc.CommandLine -and $proc.CommandLine -like '*com.botts.impl.security.SensorHubWrapper*') { [Console]::Write($proc.ProcessId); break } }" 2^>nul +`) do set "OSCAR_PID=%%P" +exit /b 0 + +:stop_existing_oscar +if not defined OSCAR_PID exit /b 0 +powershell -NoProfile -ExecutionPolicy Bypass -Command "try { Stop-Process -Id %OSCAR_PID% -Force -ErrorAction Stop; exit 0 } catch { exit 1 }" >nul 2>nul +exit /b 0 + +:wait_for_oscar_stop +set "WAIT_LIMIT=%~1" +if not defined WAIT_LIMIT set "WAIT_LIMIT=60" +set /a WAITED=0 + +:wait_for_oscar_stop_loop +call :check_existing_oscar +if not defined OSCAR_PID exit /b 0 +if !WAITED! 
GEQ %WAIT_LIMIT% exit /b 0 +timeout /t 1 /nobreak >nul +set /a WAITED+=1 +goto wait_for_oscar_stop_loop + +:load_env +for /f "usebackq tokens=1,* delims==" %%A in ("%~1") do ( + set "ENV_NAME=%%A" + set "ENV_VALUE=%%B" + call :set_env_var +) +exit /b 0 + +:set_env_var +if not defined ENV_NAME exit /b 0 +if "%ENV_NAME:~0,1%"=="#" exit /b 0 +if /I "%ENV_NAME:~0,7%"=="export " set "ENV_NAME=%ENV_NAME:~7%" +set "%ENV_NAME%=%ENV_VALUE%" +exit /b 0 \ No newline at end of file diff --git a/dist/release/launch-all.sh b/dist/release/launch-all.sh index 5716c7c..7dd5bb4 100755 --- a/dist/release/launch-all.sh +++ b/dist/release/launch-all.sh @@ -1,78 +1,261 @@ -#!/bin/bash - -HOST="localhost" -PORT="5432" -DB_NAME="gis" -DB_USER="postgres" -RETRY_MAX=20 -RETRY_INTERVAL=5 -PROJECT_DIR="$(pwd)" # Store the original directory -CONTAINER_NAME="oscar-postgis-container" - -#docker rm -f "$CONTAINER_NAME" 2>/dev/null || true - -# Create pgdata directory if needed -if [ ! -d "${PROJECT_DIR}/pgdata" ]; then - echo "Creating pgdata folder..." - mkdir -p "${PROJECT_DIR}/pgdata" -fi +#!/usr/bin/env bash +set -euo pipefail + +PROJECT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)" +ENV_FILE="$PROJECT_DIR/.env" +MATCH_EXPR='com.botts.impl.security.SensorHubWrapper' +FORCE_RESTART="${FORCE_RESTART:-0}" +RETRY_MAX="${RETRY_MAX:-120}" +RETRY_INTERVAL="${RETRY_INTERVAL:-2}" +POSTGIS_READY_DELAY="${POSTGIS_READY_DELAY:-5}" +IMAGE_NAME="${POSTGIS_IMAGE_NAME:-${IMAGE_NAME:-oscar-postgis}}" +POSTGIS_DOCKERFILE="${POSTGIS_DOCKERFILE:-Dockerfile}" + +load_env() { + local env_file="$1" + while IFS= read -r line || [ -n "$line" ]; do + case "$line" in + ""|"#"*) continue ;; + export\ *) line="${line#export }" ;; + esac + local name="${line%%=*}" + local value="${line#*=}" + value="${value%$'\r'}" + export "${name}=${value}" + done < "$env_file" +} + +require_cmd() { + local cmd="$1" + if ! 
command -v "$cmd" >/dev/null 2>&1; then + echo "Error: required command not found: $cmd" + exit 1 + fi +} + +find_existing_oscar_pids() { + pgrep -f "$MATCH_EXPR" || true +} + +stop_existing_oscar() { + local pids="$1" + if [ -z "$pids" ]; then + return 0 + fi + + echo "Stopping existing OSCAR instance(s): $pids" + kill $pids 2>/dev/null || true + + local waited=0 + while [ "$waited" -lt 15 ]; do + sleep 1 + waited=$((waited + 1)) + if [ -z "$(find_existing_oscar_pids)" ]; then + return 0 + fi + done -# Check Docker -if ! command -v docker >/dev/null 2>&1; then - echo "Error: Docker is not installed. Please install Docker first." + echo "Existing OSCAR instance still running after graceful stop. Forcing stop." + kill -9 $pids 2>/dev/null || true + sleep 1 + + if [ -n "$(find_existing_oscar_pids)" ]; then + echo "Error: unable to stop the existing OSCAR instance." + exit 1 + fi +} + +check_existing_oscar() { + local pids + pids="$(find_existing_oscar_pids)" + + if [ -z "$pids" ]; then + return 0 + fi + + if [ "$FORCE_RESTART" = "1" ]; then + echo "Existing OSCAR instance found with PID(s): $pids. Replacing because FORCE_RESTART=1." + stop_existing_oscar "$pids" + return 0 + fi + + echo "OSCAR is already running with PID(s): $pids." + echo "Stop the running instance first, or set FORCE_RESTART=1 to replace it." exit 1 -fi +} -echo "Building PostGIS Docker image..." +require_number() { + local name="$1" + local value="${!name:-}" + case "$value" in + ''|*[!0-9]*) + echo "Error: ${name} must be a number, got '${value}'." + exit 1 + ;; + esac +} -cd postgis || { echo "Error: postgis directory not found"; exit 1; } +ensure_project_layout() { + if [ ! -d "$PROJECT_DIR/postgis" ]; then + echo "Error: postgis directory not found in $PROJECT_DIR" + exit 1 + fi -# Build PostGIS -docker build . \ - --file=Dockerfile \ - --tag=oscar-postgis + if [ ! 
-f "$PROJECT_DIR/postgis/$POSTGIS_DOCKERFILE" ]; then + echo "Error: $POSTGIS_DOCKERFILE not found in $PROJECT_DIR/postgis" + exit 1 + fi -echo "Starting PostGIS container..." + if [ ! -d "$PROJECT_DIR/osh-node-oscar" ]; then + echo "Error: osh-node-oscar directory not found in $PROJECT_DIR" + exit 1 + fi + if [ ! -f "$PROJECT_DIR/osh-node-oscar/launch.sh" ]; then + echo "Error: launch.sh not found in $PROJECT_DIR/osh-node-oscar" + exit 1 + fi -echo "PROJECT_DIR is set to: ${PROJECT_DIR}" + mkdir -p "$PROJECT_DIR/pgdata" +} -if docker ps -a --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then - # The container exists - if docker ps --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then - echo "Container already running: ${CONTAINER_NAME}" +if [ -f "$ENV_FILE" ]; then + load_env "$ENV_FILE" +else + echo "Warning: .env file not found in $PROJECT_DIR" + if [ -f "$PROJECT_DIR/env.template" ]; then + echo "Warning: using built-in defaults. Copy env.template to .env to customize settings." else - echo "Starting existing container: ${CONTAINER_NAME}" - docker start "${CONTAINER_NAME}" + echo "Warning: using built-in defaults." fi -else - echo "Creating new container: ${CONTAINER_NAME}" - docker run \ - --name "$CONTAINER_NAME" \ - -e POSTGRES_DB="$DB_NAME" \ - -e POSTGRES_USER="$DB_USER" \ - -e POSTGRES_PASSWORD="postgres" \ - -p $PORT:5432 \ - -v "${PROJECT_DIR}/pgdata:/var/lib/postgresql/data" \ - -d \ - oscar-postgis || { echo "Failed to start PostGIS container"; exit 1; } fi -# Wait for PostgreSQL/PostGIS to become ready -echo "Waiting for PostGIS (PostgreSQL) to be ready..." 
+SYSTEM_PROFILE="${SYSTEM_PROFILE:-8GB}" +CONTAINER_NAME="${CONTAINER_NAME:-oscar-postgis-container}" +DB_NAME="${DB_NAME:-gis}" +DB_USER="${DB_USER:-postgres}" +DB_PASSWORD="${DB_PASSWORD:-postgres}" +DB_PORT="${DB_PORT:-5432}" +DB_HOST="${DB_HOST:-localhost}" +export SYSTEM_PROFILE CONTAINER_NAME DB_NAME DB_USER DB_PASSWORD DB_PORT DB_HOST +export RETRY_MAX RETRY_INTERVAL POSTGIS_READY_DELAY IMAGE_NAME POSTGIS_DOCKERFILE + +require_cmd docker +if ! docker info >/dev/null 2>&1; then + echo "Error: Docker is installed, but the Docker daemon is not running." + exit 1 +fi -RETRY_COUNT=0 -export PGPASSWORD=postgres # Needed for pg_isready with password +check_existing_oscar +ensure_project_layout +require_number DB_PORT +require_number RETRY_MAX +require_number RETRY_INTERVAL +require_number POSTGIS_READY_DELAY -until docker exec "$CONTAINER_NAME" pg_isready -U "$DB_USER" -d "$DB_NAME" > /dev/null 2>&1; do - echo "PostGIS not ready yet, retrying..." - sleep "${RETRY_INTERVAL}" +case "${SYSTEM_PROFILE^^}" in + RPI4) + SYSTEM_PROFILE="RPI4" + PG_SHARED="256MB" + PG_CACHE="1GB" + PG_WORK_MEM="2MB" + PG_MAINT="64MB" + PG_MAX_CONN="75" + ;; + 8GB) + SYSTEM_PROFILE="8GB" + PG_SHARED="512MB" + PG_CACHE="2GB" + PG_WORK_MEM="4MB" + PG_MAINT="128MB" + PG_MAX_CONN="125" + ;; + 16GB) + SYSTEM_PROFILE="16GB" + PG_SHARED="1GB" + PG_CACHE="4GB" + PG_WORK_MEM="8MB" + PG_MAINT="256MB" + PG_MAX_CONN="200" + ;; + 32GB) + SYSTEM_PROFILE="32GB" + PG_SHARED="2GB" + PG_CACHE="8GB" + PG_WORK_MEM="16MB" + PG_MAINT="512MB" + PG_MAX_CONN="300" + ;; + *) + echo "Unknown profile '${SYSTEM_PROFILE}', using 8GB defaults." + SYSTEM_PROFILE="8GB" + PG_SHARED="512MB" + PG_CACHE="2GB" + PG_WORK_MEM="4MB" + PG_MAINT="128MB" + PG_MAX_CONN="125" + ;; +esac + +echo "Building PostGIS Docker image..." +( + cd "$PROJECT_DIR/postgis" + docker build . 
--file="$POSTGIS_DOCKERFILE" --tag="$IMAGE_NAME" +) + +echo "Preparing PostGIS container for profile: $SYSTEM_PROFILE" +echo " Image: $IMAGE_NAME" +echo " Port: ${DB_PORT}:5432" +echo " Data: $PROJECT_DIR/pgdata" + +if docker container inspect "$CONTAINER_NAME" >/dev/null 2>&1; then + echo "Removing existing container '$CONTAINER_NAME' so updated settings take effect..." + docker rm -f "$CONTAINER_NAME" >/dev/null +fi + +echo "Creating new container..." +docker run \ + --name "$CONTAINER_NAME" \ + -e POSTGRES_DB="$DB_NAME" \ + -e POSTGRES_USER="$DB_USER" \ + -e POSTGRES_PASSWORD="$DB_PASSWORD" \ + -p "${DB_PORT}:5432" \ + -v "$PROJECT_DIR/pgdata:/var/lib/postgresql/data" \ + -d \ + "$IMAGE_NAME" \ + -c shared_buffers="$PG_SHARED" \ + -c effective_cache_size="$PG_CACHE" \ + -c work_mem="$PG_WORK_MEM" \ + -c maintenance_work_mem="$PG_MAINT" \ + -c max_connections="$PG_MAX_CONN" \ + -c superuser_reserved_connections=10 \ + -c idle_session_timeout=600000 \ + -c log_connections=on \ + -c log_disconnections=on \ + -c wal_buffers=16MB \ + -c random_page_cost=1.1 \ + -c effective_io_concurrency=200 + +echo "Waiting for PostGIS to be ready..." +export PGPASSWORD="$DB_PASSWORD" +retry_count=0 +until docker exec "$CONTAINER_NAME" pg_isready -U "$DB_USER" -d "$DB_NAME" >/dev/null 2>&1; do + retry_count=$((retry_count + 1)) + if [ "$retry_count" -ge "$RETRY_MAX" ]; then + echo "Error: PostGIS did not become ready after $((RETRY_MAX * RETRY_INTERVAL)) seconds." + echo "Last container logs:" + docker logs --tail 50 "$CONTAINER_NAME" || true + exit 1 + fi + sleep "$RETRY_INTERVAL" done -echo "PostGIS (PostgreSQL) is ready! Please wait for OpenSensorHub to start..." +echo "PostGIS is ready." +sleep "$POSTGIS_READY_DELAY" -sleep 10 +cd "$PROJECT_DIR/osh-node-oscar" +if [ ! 
-x ./launch.sh ]; then + chmod +x ./launch.sh +fi -# Launch osh-node-oscar -cd "$PROJECT_DIR/osh-node-oscar" || { echo "Error: osh-node-oscar not found"; exit 1; } -./launch.sh \ No newline at end of file +exec ./launch.sh diff --git a/dist/release/monitor-oscar.bat b/dist/release/monitor-oscar.bat new file mode 100644 index 0000000..4fd2538 --- /dev/null +++ b/dist/release/monitor-oscar.bat @@ -0,0 +1,3 @@ +@echo off +powershell -NoProfile -ExecutionPolicy Bypass -File "%~dp0monitor-oscar.ps1" +exit /b %ERRORLEVEL% \ No newline at end of file diff --git a/dist/release/monitor-oscar.ps1 b/dist/release/monitor-oscar.ps1 new file mode 100644 index 0000000..e71fa6b --- /dev/null +++ b/dist/release/monitor-oscar.ps1 @@ -0,0 +1,290 @@ +param( + [string]$AttachToExisting, + [string]$ForceRestart +) + +Set-StrictMode -Version Latest +$ErrorActionPreference = 'Stop' + +$script:BaseDir = Split-Path -Parent $PSCommandPath +$script:MonitorDir = Join-Path $script:BaseDir ("oscar-monitor-{0}" -f (Get-Date -Format 'yyyyMMdd-HHmmss')) +$script:StatusFile = Join-Path $script:MonitorDir 'monitor-status.txt' +$script:HeartbeatFile = Join-Path $script:BaseDir 'monitor.heartbeat' +$script:BackendPidFile = Join-Path $script:BaseDir 'oscar.pid' +$script:CurrentMonitorFile = Join-Path $script:BaseDir 'current-monitor-dir.txt' +$script:MonitorLockDir = Join-Path $script:BaseDir '.monitor-lock' +$script:MonitorLockInfo = Join-Path $script:MonitorLockDir 'owner.json' +$script:LockAcquired = $false + +New-Item -ItemType Directory -Force -Path $script:MonitorDir | Out-Null +Set-Content -Path $script:CurrentMonitorFile -Value $script:MonitorDir -Encoding ASCII + +function Write-Status { + param([string]$Message) + + $ts = Get-Date -Format 'yyyy-MM-dd HH:mm:ss' + $line = "$ts $Message" + Write-Host $Message + Add-Content -Path $script:StatusFile -Value $line -Encoding UTF8 +} + +function Load-DotEnv { + param([string]$Path) + + if (-not (Test-Path $Path)) { + return + } + + foreach ($rawLine in 
Get-Content -Path $Path) { + $line = $rawLine.Trim() + if ([string]::IsNullOrWhiteSpace($line)) { continue } + if ($line.StartsWith('#')) { continue } + + $idx = $line.IndexOf('=') + if ($idx -lt 1) { continue } + + $name = $line.Substring(0, $idx).Trim() + $value = $line.Substring($idx + 1) + + if ( + ($value.Length -ge 2) -and + ( + ($value.StartsWith('"') -and $value.EndsWith('"')) -or + ($value.StartsWith("'") -and $value.EndsWith("'")) + ) + ) { + $value = $value.Substring(1, $value.Length - 2) + } + + [Environment]::SetEnvironmentVariable($name, $value, 'Process') + } +} + +function Convert-ToFlag { + param([string]$Value) + + if ([string]::IsNullOrWhiteSpace($Value)) { + return $false + } + + switch -Regex ($Value.Trim()) { + '^(?i:1|true|yes|y|on)$' { return $true } + default { return $false } + } +} + +function Get-BackendProcess { + $proc = Get-CimInstance Win32_Process | Where-Object { + $_.Name -match '^java(w)?\.exe$' -and + $_.CommandLine -match 'SensorHubWrapper' + } | Select-Object -First 1 + + return $proc +} + +function Get-ProcessStartTimeUtcString { + param([int]$PidToCheck) + + try { + return (Get-Process -Id $PidToCheck -ErrorAction Stop).StartTime.ToUniversalTime().ToString('o') + } + catch { + return $null + } +} + +function Update-Heartbeat { + param( + [string]$State, + [Nullable[int]]$BackendPid = $null + ) + + $lines = @( + "timestamp=$((Get-Date).ToString('o'))" + "state=$State" + "monitor_pid=$PID" + "monitor_dir=$script:MonitorDir" + ) + + if ($null -ne $BackendPid) { + $lines += "backend_pid=$BackendPid" + } + + Set-Content -Path $script:HeartbeatFile -Value $lines -Encoding ASCII +} + +function Release-MonitorLock { + if (Test-Path $script:MonitorLockDir) { + Remove-Item -Path $script:MonitorLockDir -Recurse -Force -ErrorAction SilentlyContinue + } +} + +function Acquire-MonitorLock { + for ($attempt = 1; $attempt -le 2; $attempt++) { + try { + New-Item -ItemType Directory -Path $script:MonitorLockDir -ErrorAction Stop | Out-Null + + $owner =
[ordered]@{ + pid = $PID + acquiredUtc = (Get-Date).ToUniversalTime().ToString('o') + processName = (Get-Process -Id $PID).ProcessName + processStart = (Get-Process -Id $PID).StartTime.ToUniversalTime().ToString('o') + scriptPath = $PSCommandPath + hostName = $env:COMPUTERNAME + } + + $owner | ConvertTo-Json | Set-Content -Path $script:MonitorLockInfo -Encoding UTF8 + $script:LockAcquired = $true + + return @{ + Acquired = $true + ExistingPid = $null + } + } + catch { + $existingPid = $null + $alive = $false + + if (Test-Path $script:MonitorLockInfo) { + try { + $info = Get-Content -Path $script:MonitorLockInfo -Raw | ConvertFrom-Json + $existingPid = [int]$info.pid + $currentStart = Get-ProcessStartTimeUtcString -Pid $existingPid + if ($currentStart -and $currentStart -eq $info.processStart) { + $alive = $true + } + } + catch { + $alive = $false + } + } + + if ($alive) { + return @{ + Acquired = $false + ExistingPid = $existingPid + } + } + + Remove-Item -Path $script:MonitorLockDir -Recurse -Force -ErrorAction SilentlyContinue + Start-Sleep -Milliseconds 200 + } + } + + throw "Could not acquire monitor lock at $script:MonitorLockDir" +} + +function Invoke-Monitor { + Load-DotEnv -Path (Join-Path $script:BaseDir '.env') + + if ([string]::IsNullOrWhiteSpace($AttachToExisting)) { + $AttachToExisting = $env:ATTACH_TO_EXISTING + } + + if ([string]::IsNullOrWhiteSpace($ForceRestart)) { + $ForceRestart = $env:FORCE_RESTART + } + + $attach = Convert-ToFlag $AttachToExisting + $force = Convert-ToFlag $ForceRestart + + Write-Status "Monitor output: $script:MonitorDir" + + $lock = Acquire-MonitorLock + if (-not $lock.Acquired) { + Write-Status "Monitor script already running with PID $($lock.ExistingPid)." + Write-Status "Exiting without starting a second monitor." 
+ return 200 + } + + Update-Heartbeat -State 'startup' + + $backend = Get-BackendProcess + + if ($null -ne $backend) { + $backendPid = [int]$backend.ProcessId + + if ($force) { + Write-Status "Existing OSCAR backend found with PID $backendPid. FORCE_RESTART=1, stopping it first..." + & (Join-Path $script:BaseDir 'stop-all.bat') + $stopRc = $LASTEXITCODE + Start-Sleep -Seconds 5 + + $backend = Get-BackendProcess + if ($null -ne $backend) { + throw "OSCAR backend still running with PID $($backend.ProcessId) after stop-all.bat." + } + + Write-Status "Previous OSCAR backend stopped." + } + elseif (-not $attach) { + Write-Status "OSCAR is already running with PID $backendPid." + Write-Status "Set ATTACH_TO_EXISTING=1 to monitor it, or FORCE_RESTART=1 to replace it." + return 201 + } + else { + Write-Status "Attaching to existing OSCAR backend PID $backendPid..." + } + } + + if ($null -eq $backend) { + Write-Status "Launching OSCAR stack..." + & (Join-Path $script:BaseDir 'launch-all.bat') + $launchRc = $LASTEXITCODE + + if ($launchRc -ne 0) { + throw "launch-all.bat failed with exit code $launchRc." + } + + $backend = $null + for ($i = 1; $i -le 60; $i++) { + Start-Sleep -Seconds 1 + $backend = Get-BackendProcess + if ($null -ne $backend) { + break + } + } + + if ($null -eq $backend) { + throw "Timed out waiting for OSCAR backend to appear." + } + } + + $backendPid = [int]$backend.ProcessId + Set-Content -Path $script:BackendPidFile -Value $backendPid -Encoding ASCII + + Write-Status "Monitoring OSCAR backend PID $backendPid..." + + while ($true) { + Update-Heartbeat -State 'running' -BackendPid $backendPid + Start-Sleep -Seconds 30 + + $backend = Get-BackendProcess + if ($null -eq $backend) { + Write-Status "OSCAR backend is no longer running." 
+ Update-Heartbeat -State 'stopped' + return 0 + } + + $backendPid = [int]$backend.ProcessId + Set-Content -Path $script:BackendPidFile -Value $backendPid -Encoding ASCII + } +} + +$exitCode = 0 + +try { + $exitCode = Invoke-Monitor +} +catch { + Write-Status "ERROR: $($_.Exception.Message)" + Update-Heartbeat -State 'error' + $exitCode = 500 +} +finally { + if ($script:LockAcquired) { + Release-MonitorLock + } +} + +exit $exitCode \ No newline at end of file diff --git a/dist/release/monitor-oscar.sh b/dist/release/monitor-oscar.sh new file mode 100755 index 0000000..c06254b --- /dev/null +++ b/dist/release/monitor-oscar.sh @@ -0,0 +1,538 @@ +#!/bin/bash +set -euo pipefail + +SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)" +PROJECT_DIR="${PROJECT_DIR:-$SCRIPT_DIR}" +LAUNCH_CMD="${LAUNCH_CMD:-$PROJECT_DIR/launch-all.sh}" +MATCH_EXPR="${MATCH_EXPR:-com.botts.impl.security.SensorHubWrapper}" +INTERVAL="${INTERVAL:-60}" +MAX_WAIT_SECONDS="${MAX_WAIT_SECONDS:-300}" +OUT_DIR="${OUT_DIR:-$PROJECT_DIR/oscar-monitor-$(date +%Y%m%d-%H%M%S)}" +JFR_NAME="${JFR_NAME:-oscar}" +JFR_MAX_AGE="${JFR_MAX_AGE:-4h}" +JFR_MAX_SIZE="${JFR_MAX_SIZE:-1g}" +ENV_FILE="${ENV_FILE:-$PROJECT_DIR/.env}" +ATTACH_TO_EXISTING="${ATTACH_TO_EXISTING:-0}" +FORCE_RESTART="${FORCE_RESTART:-0}" + +STATE_DIR="$PROJECT_DIR/.monitor-state" +MONITOR_LOCK_DIR="$STATE_DIR/lock" +MONITOR_PID_FILE="$STATE_DIR/monitor.pid" +ACTIVE_MONITOR_FILE="$STATE_DIR/active-monitor-dir.txt" +STATUS_FILE="$PROJECT_DIR/monitor.last-status" +ERROR_FILE="$PROJECT_DIR/monitor.last-error" + +CONTAINER_NAME="oscar-postgis-container" +DB_NAME="gis" +DB_USER="postgres" +DB_PASSWORD="postgres" +DB_CSV="" + +LAUNCH_PID="" +PID="" +STOPPING=0 +USE_EXISTING=0 +MONITOR_LOCK_OWNED=0 +FINAL_STATUS_WRITTEN=0 + +log() { + printf '%s %s\n' "$(date -Is)" "$*" +} + +write_status() { + printf '%s %s\n' "$(date -Is)" "$*" > "$STATUS_FILE" +} + +write_error() { + printf '%s %s\n' "$(date -Is)" "$*" > "$ERROR_FILE" +} + 
+clear_error() { + : > "$ERROR_FILE" +} + +finalize_status() { + write_status "$*" + FINAL_STATUS_WRITTEN=1 +} + +require_cmd() { + local cmd="$1" + if ! command -v "$cmd" >/dev/null 2>&1; then + echo "Error: required command not found: $cmd" >&2 + write_error "Missing required command: $cmd" + finalize_status "FAILED missing_dependency command=$cmd" + exit 1 + fi +} + +get_java_major() { + java -version 2>&1 | awk -F'"' '/version/ { split($2, v, "."); print v[1]; exit }' +} + +check_dependencies() { + require_cmd bash + require_cmd java + require_cmd docker + require_cmd pgrep + + local java_major + java_major="$(get_java_major || true)" + if [[ -z "$java_major" || ! "$java_major" =~ ^[0-9]+$ || "$java_major" -lt 21 ]]; then + echo "Error: Java 21 or newer is required to run OSCAR monitoring." >&2 + write_error "Java 21 or newer is required to run OSCAR monitoring." + finalize_status "FAILED java_too_old" + exit 1 + fi + + if ! command -v jcmd >/dev/null 2>&1; then + log "Warning: jcmd not found. JFR/NMT snapshots will be skipped." 
+ fi +} + +read_monitor_pid() { + if [ -f "$MONITOR_PID_FILE" ]; then + tr -d '[:space:]' < "$MONITOR_PID_FILE" + fi +} + +is_monitor_pid_running() { + local pid="$1" + [ -n "$pid" ] || return 1 + kill -0 "$pid" 2>/dev/null || return 1 + ps -p "$pid" -o args= 2>/dev/null | grep -Fq "monitor-oscar.sh" +} + +remove_monitor_state() { + rm -f "$MONITOR_PID_FILE" "$ACTIVE_MONITOR_FILE" + if [ -d "$MONITOR_LOCK_DIR" ]; then + rmdir "$MONITOR_LOCK_DIR" 2>/dev/null || rm -rf "$MONITOR_LOCK_DIR" 2>/dev/null || true + fi +} + +release_monitor_lock() { + if [ "$MONITOR_LOCK_OWNED" -eq 1 ]; then + local current_pid="" + current_pid="$(read_monitor_pid || true)" + if [ "$current_pid" = "$$" ] || [ -z "$current_pid" ]; then + remove_monitor_state + fi + MONITOR_LOCK_OWNED=0 + fi +} + +refuse_existing_monitor() { + local existing_pid="$1" + local existing_dir="" + if [ -f "$ACTIVE_MONITOR_FILE" ]; then + existing_dir="$(cat "$ACTIVE_MONITOR_FILE" 2>/dev/null || true)" + fi + + echo "Error: Another monitor-oscar.sh instance is already running with PID $existing_pid." >&2 + if [ -n "$existing_dir" ]; then + echo "Active monitor output: $existing_dir" >&2 + fi + echo "Run ./stop-all.sh or ./monitor-oscar.sh stop before starting another monitor." >&2 + + if [ -n "$existing_dir" ]; then + write_error "Duplicate monitor start refused. Existing monitor PID=$existing_pid output=$existing_dir" + finalize_status "FAILED duplicate_monitor existing_pid=$existing_pid output=$existing_dir" + else + write_error "Duplicate monitor start refused. 
Existing monitor PID=$existing_pid" + finalize_status "FAILED duplicate_monitor existing_pid=$existing_pid" + fi + exit 1 +} + +preflight_existing_monitor() { + local existing_pid="" + existing_pid="$(read_monitor_pid || true)" + if [ -n "$existing_pid" ] && is_monitor_pid_running "$existing_pid"; then + refuse_existing_monitor "$existing_pid" + fi + if [ -n "$existing_pid" ] || [ -d "$MONITOR_LOCK_DIR" ] || [ -f "$ACTIVE_MONITOR_FILE" ]; then + log "Removing stale OSCAR monitor state." + remove_monitor_state + fi +} + +acquire_monitor_lock() { + local existing_pid="" + mkdir -p "$STATE_DIR" + + if mkdir "$MONITOR_LOCK_DIR" 2>/dev/null; then + echo "$$" > "$MONITOR_PID_FILE" + MONITOR_LOCK_OWNED=1 + return 0 + fi + + sleep 1 + existing_pid="$(read_monitor_pid || true)" + if [ -n "$existing_pid" ] && is_monitor_pid_running "$existing_pid"; then + refuse_existing_monitor "$existing_pid" + fi + + log "Removing stale OSCAR monitor lock state." + remove_monitor_state + if ! mkdir "$MONITOR_LOCK_DIR" 2>/dev/null; then + echo "Error: Could not acquire OSCAR monitor lock at $MONITOR_LOCK_DIR" >&2 + write_error "Could not acquire OSCAR monitor lock at $MONITOR_LOCK_DIR" + finalize_status "FAILED lock_acquire path=$MONITOR_LOCK_DIR" + exit 1 + fi + + echo "$$" > "$MONITOR_PID_FILE" + MONITOR_LOCK_OWNED=1 +} + +find_existing_oscar_pid() { + pgrep -f "$MATCH_EXPR" | head -n 1 || true +} + +find_all_existing_oscar_pids() { + pgrep -f "$MATCH_EXPR" || true +} + +stop_existing_oscar() { + local pids="$1" + if [ -z "$pids" ]; then + return 0 + fi + + log "Stopping existing OSCAR instance(s): $pids" + kill $pids 2>/dev/null || true + + local waited=0 + while [ "$waited" -lt 15 ]; do + sleep 1 + waited=$((waited + 1)) + if [ -z "$(find_all_existing_oscar_pids)" ]; then + return 0 + fi + done + + log "Force killing existing OSCAR instance(s): $pids" + kill -9 $pids 2>/dev/null || true + sleep 1 + + if [ -n "$(find_all_existing_oscar_pids)" ]; then + echo "Error: unable to stop 
existing OSCAR instance(s)." >&2 + write_error "Unable to stop existing OSCAR instance(s): $pids" + finalize_status "FAILED existing_oscar_stop pids=$pids" + exit 1 + fi +} + +run_db_query() { + local sql="$1" + docker exec -e PGPASSWORD="$DB_PASSWORD" "$CONTAINER_NAME" \ + psql -U "$DB_USER" -d "$DB_NAME" -At -c "$sql" +} + +collect_db_snapshot() { + local d="$1" + local ts failed total active idle idle_tx max_conn super_reserved + ts="$(date -Is)" + failed=0 + + if ! docker ps --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then + echo "Container ${CONTAINER_NAME} not running" > "$d/db-error.txt" + echo "$ts,,,,,,,1" >> "$DB_CSV" + return 0 + fi + + if run_db_query "show max_connections;" > "$d/db-max-connections.txt" 2> "$d/db-error.txt"; then + run_db_query "show superuser_reserved_connections;" > "$d/db-superuser-reserved-connections.txt" 2>> "$d/db-error.txt" || failed=1 + run_db_query "select count(*) from pg_stat_activity;" > "$d/db-total-sessions.txt" 2>> "$d/db-error.txt" || failed=1 + run_db_query "select coalesce(state,''), count(*) from pg_stat_activity group by state order by count(*) desc;" > "$d/db-by-state.txt" 2>> "$d/db-error.txt" || failed=1 + run_db_query "select coalesce(application_name,''), coalesce(usename,''), coalesce(client_addr::text,''), coalesce(state,''), count(*) from pg_stat_activity group by application_name, usename, client_addr, state order by count(*) desc limit 20;" > "$d/db-by-app.txt" 2>> "$d/db-error.txt" || failed=1 + run_db_query "select pid, usename, application_name, client_addr, state, backend_start, xact_start, query_start, wait_event_type, wait_event, left(query,120) from pg_stat_activity order by backend_start;" > "$d/db-activity-detail.txt" 2>> "$d/db-error.txt" || failed=1 + else + failed=1 + fi + + max_conn="" + super_reserved="" + total="" + active="" + idle="" + idle_tx="" + + [ -f "$d/db-max-connections.txt" ] && max_conn="$(tr -d '[:space:]' < "$d/db-max-connections.txt" | tail -n 1)" + [ -f 
"$d/db-superuser-reserved-connections.txt" ] && super_reserved="$(tr -d '[:space:]' < "$d/db-superuser-reserved-connections.txt" | tail -n 1)" + [ -f "$d/db-total-sessions.txt" ] && total="$(tr -d '[:space:]' < "$d/db-total-sessions.txt" | tail -n 1)" + if [ -f "$d/db-by-state.txt" ]; then + active="$(awk -F'|' '$1=="active" {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + idle="$(awk -F'|' '$1=="idle" {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + idle_tx="$(awk -F'|' '$1=="idle in transaction" {gsub(/^[ \t]+|[ \t]+$/, "", $2); print $2}' "$d/db-by-state.txt" | tail -n 1)" + fi + echo "$ts,${total:-},${active:-0},${idle:-0},${idle_tx:-0},${max_conn:-},${super_reserved:-},$failed" >> "$DB_CSV" +} + +dump_once() { + if [ -z "${PID:-}" ] || ! kill -0 "$PID" 2>/dev/null; then + return 0 + fi + + local ts d + ts="$(date +%Y%m%d-%H%M%S)" + d="$OUT_DIR/$ts" + mkdir -p "$d" + + log "Collecting snapshot at $ts for PID $PID" + + ps -p "$PID" -o pid,ppid,user,%cpu,%mem,vsz,rss,etimes,cmd > "$d/ps.txt" 2>&1 || true + [ -r "/proc/$PID/status" ] && cat "/proc/$PID/status" > "$d/proc-status.txt" 2>&1 || true + [ -r "/proc/$PID/smaps_rollup" ] && cat "/proc/$PID/smaps_rollup" > "$d/smaps_rollup.txt" 2>&1 || true + + command -v pmap >/dev/null 2>&1 && pmap -x "$PID" > "$d/pmap-x.txt" 2>&1 || true + command -v free >/dev/null 2>&1 && free -h > "$d/free.txt" 2>&1 || true + [ -r /proc/meminfo ] && cat /proc/meminfo > "$d/meminfo.txt" 2>&1 || true + [ -r /proc/swaps ] && cat /proc/swaps > "$d/swaps.txt" 2>&1 || true + vmstat 1 5 > "$d/vmstat.txt" 2>&1 || true + + if command -v jcmd >/dev/null 2>&1; then + jcmd "$PID" VM.native_memory summary > "$d/nmt-summary.txt" 2>&1 || true + jcmd "$PID" GC.heap_info > "$d/gc-heap-info.txt" 2>&1 || true + jcmd "$PID" Thread.print > "$d/thread-print.txt" 2>&1 || true + jcmd "$PID" JFR.check > "$d/jfr-check.txt" 2>&1 || true + fi + + docker ps --filter "name=$CONTAINER_NAME" > 
"$d/docker-ps.txt" 2>&1 || true + docker logs --tail 100 "$CONTAINER_NAME" > "$d/docker-logs-tail.txt" 2>&1 || true + collect_db_snapshot "$d" +} + +final_dump() { + if [ -n "${PID:-}" ] && kill -0 "$PID" 2>/dev/null; then + dump_once + if command -v jcmd >/dev/null 2>&1; then + jcmd "$PID" JFR.dump name="$JFR_NAME" filename="$OUT_DIR/${JFR_NAME}-final.jfr" \ + > "$OUT_DIR/jfr-dump-final.txt" 2>&1 || true + fi + fi +} + +stop_stack() { + if [ "$STOPPING" -eq 1 ]; then + return 0 + fi + STOPPING=1 + + log "Stopping OSCAR stack..." + final_dump + + if [ -n "${PID:-}" ] && kill -0 "$PID" 2>/dev/null; then + log "Stopping JVM PID $PID" + kill "$PID" 2>/dev/null || true + for _ in 1 2 3 4 5 6 7 8 9 10; do + if ! kill -0 "$PID" 2>/dev/null; then + break + fi + sleep 1 + done + if kill -0 "$PID" 2>/dev/null; then + log "Force killing JVM PID $PID" + kill -9 "$PID" 2>/dev/null || true + fi + fi + + if [ -n "${LAUNCH_PID:-}" ] && kill -0 "$LAUNCH_PID" 2>/dev/null; then + log "Stopping launcher PID $LAUNCH_PID" + kill "$LAUNCH_PID" 2>/dev/null || true + fi + + if docker ps --format '{{.Names}}' | grep -Eq "^${CONTAINER_NAME}$"; then + log "Stopping container ${CONTAINER_NAME}" + docker stop "$CONTAINER_NAME" > "$OUT_DIR/docker-stop.txt" 2>&1 || true + fi +} + +on_signal() { + log "Received stop signal" + write_status "STOPPING signal_received monitor_pid=$$ output=$OUT_DIR" + stop_stack + finalize_status "STOPPED signal monitor_pid=$$ output=$OUT_DIR" + exit 0 +} + +on_exit() { + local ec="$?" 
+ final_dump + if [ "$FINAL_STATUS_WRITTEN" -eq 0 ]; then + if [ "$ec" -eq 0 ]; then + if [ "$STOPPING" -eq 1 ]; then + finalize_status "STOPPED monitor_pid=$$ output=$OUT_DIR" + elif [ -n "${PID:-}" ]; then + finalize_status "EXITED jvm_pid=$PID monitor_pid=$$ output=$OUT_DIR" + else + finalize_status "EXITED monitor_pid=$$ output=$OUT_DIR" + fi + else + finalize_status "FAILED exit_code=$ec monitor_pid=$$ output=$OUT_DIR" + fi + fi + release_monitor_lock +} + +if [ "${1:-}" = "stop" ]; then + mkdir -p "$STATE_DIR" + monitor_pid="$(read_monitor_pid || true)" + active_dir="" + if [ -f "$ACTIVE_MONITOR_FILE" ]; then + active_dir="$(cat "$ACTIVE_MONITOR_FILE" 2>/dev/null || true)" + fi + + if [ -n "$monitor_pid" ] && is_monitor_pid_running "$monitor_pid"; then + write_status "STOP_REQUESTED monitor_pid=$monitor_pid output=$active_dir" + clear_error + kill "$monitor_pid" 2>/dev/null || true + echo "OSCAR monitor stop requested for PID $monitor_pid." + exit 0 + fi + + remove_monitor_state + write_status "STOP_REQUESTED no_active_monitor" + clear_error + echo "OSCAR monitor is not running." + exit 0 +fi + +trap on_signal INT TERM +trap on_exit EXIT + +mkdir -p "$STATE_DIR" +write_status "STARTING monitor_pid=$$ output=$OUT_DIR" +clear_error + +if [ -f "$ENV_FILE" ]; then + set -a + . 
"$ENV_FILE" + set +a + CONTAINER_NAME="${CONTAINER_NAME:-oscar-postgis-container}" + DB_NAME="${DB_NAME:-gis}" + DB_USER="${DB_USER:-postgres}" + DB_PASSWORD="${DB_PASSWORD:-postgres}" +fi + +DB_CSV="$OUT_DIR/db-connection-trend.csv" + +check_dependencies +preflight_existing_monitor +acquire_monitor_lock +mkdir -p "$OUT_DIR" +echo "$OUT_DIR" > "$ACTIVE_MONITOR_FILE" + +DB_CSV="$OUT_DIR/db-connection-trend.csv" +echo 'timestamp,total_sessions,active,idle,idle_in_transaction,max_connections,superuser_reserved_connections,failed_psql' > "$DB_CSV" + +log "Monitor output: $OUT_DIR" +log "Launch command: $LAUNCH_CMD" +log "JVM match: $MATCH_EXPR" +log "Container name: $CONTAINER_NAME" +log "Database: $DB_NAME user=$DB_USER" +write_status "RUNNING monitor_pid=$$ output=$OUT_DIR" + +if [ ! -x "$LAUNCH_CMD" ] && [ "$ATTACH_TO_EXISTING" != "1" ]; then + echo "Error: launch command is not executable: $LAUNCH_CMD" >&2 + write_error "Launch command is not executable: $LAUNCH_CMD" + finalize_status "FAILED launch_not_executable path=$LAUNCH_CMD" + exit 1 +fi + +existing_pids="$(find_all_existing_oscar_pids)" +if [ -n "$existing_pids" ]; then + if [ "$ATTACH_TO_EXISTING" = "1" ]; then + PID="$(printf '%s\n' "$existing_pids" | head -n 1)" + USE_EXISTING=1 + log "Attaching monitor to existing OSCAR PID $PID" + clear_error + write_status "RUNNING attached monitor_pid=$$ jvm_pid=$PID output=$OUT_DIR" + elif [ "$FORCE_RESTART" = "1" ]; then + log "Existing OSCAR instance found: $existing_pids" + stop_existing_oscar "$existing_pids" + else + echo "OSCAR is already running with PID(s): $existing_pids" >&2 + echo "Set ATTACH_TO_EXISTING=1 to monitor the running instance, or FORCE_RESTART=1 to replace it." >&2 + write_error "OSCAR is already running with PID(s): $existing_pids" + finalize_status "FAILED oscar_already_running pids=$existing_pids" + exit 1 + fi +fi + +if [ "$USE_EXISTING" = "0" ]; then + log "Starting OSCAR..." 
+ write_status "WAITING_FOR_JVM monitor_pid=$$ output=$OUT_DIR" + "$LAUNCH_CMD" > "$OUT_DIR/launch.stdout.log" 2> "$OUT_DIR/launch.stderr.log" & + LAUNCH_PID=$! + echo "$LAUNCH_PID" > "$OUT_DIR/launcher-pid.txt" + + waited=0 + while true; do + PID="$(find_existing_oscar_pid)" + if [ -n "$PID" ]; then + break + fi + if ! kill -0 "$LAUNCH_PID" 2>/dev/null; then + write_error "Launch process exited before OSCAR JVM appeared. Check $OUT_DIR/launch.stdout.log and $OUT_DIR/launch.stderr.log" + finalize_status "FAILED launch_exited_before_jvm output=$OUT_DIR" + exit 1 + fi + if [ "$waited" -ge "$MAX_WAIT_SECONDS" ]; then + log "Timed out waiting for JVM after ${MAX_WAIT_SECONDS}s" + write_error "Timed out waiting for JVM after ${MAX_WAIT_SECONDS}s. Check $OUT_DIR/launch.stdout.log and $OUT_DIR/launch.stderr.log" + finalize_status "FAILED wait_for_jvm_timeout output=$OUT_DIR" + exit 1 + fi + sleep 2 + waited=$((waited + 2)) + done +else + : > "$OUT_DIR/launch.stdout.log" + : > "$OUT_DIR/launch.stderr.log" +fi + +log "Found JVM PID: $PID" +echo "$PID" > "$OUT_DIR/jvm-pid.txt" +write_status "RUNNING monitor_pid=$$ jvm_pid=$PID output=$OUT_DIR" +clear_error + +{ + echo "Timestamp: $(date -Is)" + echo "Monitor PID: $$" + echo "Launcher PID: ${LAUNCH_PID:-}" + echo "JVM PID: $PID" + echo + echo "Command line:" + tr '\0' ' ' < "/proc/$PID/cmdline" + echo +} > "$OUT_DIR/process-info.txt" + +if command -v jcmd >/dev/null 2>&1; then + log "Starting JFR on PID $PID" + jcmd "$PID" JFR.start \ + name="$JFR_NAME" \ + settings=profile \ + disk=true \ + maxage="$JFR_MAX_AGE" \ + maxsize="$JFR_MAX_SIZE" \ + filename="$OUT_DIR/${JFR_NAME}.jfr" \ + > "$OUT_DIR/jfr-start.txt" 2>&1 || true + + jcmd "$PID" VM.native_memory baseline \ + > "$OUT_DIR/nmt-baseline.txt" 2>&1 || true +fi + +dump_once + +while kill -0 "$PID" 2>/dev/null; do + sleep "$INTERVAL" + dump_once + write_status "RUNNING monitor_pid=$$ jvm_pid=$PID output=$OUT_DIR" +done + +log "JVM exited." 
+if [ -n "${LAUNCH_PID:-}" ]; then + wait "$LAUNCH_PID" || true +fi +finalize_status "EXITED jvm_pid=$PID monitor_pid=$$ output=$OUT_DIR" diff --git a/dist/release/reset-all.bat b/dist/release/reset-all.bat new file mode 100755 index 0000000..6488a6d --- /dev/null +++ b/dist/release/reset-all.bat @@ -0,0 +1,99 @@ +@echo off +setlocal EnableExtensions EnableDelayedExpansion + +set "SCRIPT_DIR=%~dp0" +if "%SCRIPT_DIR:~-1%"=="\" set "SCRIPT_DIR=%SCRIPT_DIR:~0,-1%" + +set "ENV_FILE=%SCRIPT_DIR%\.env" +if exist "%ENV_FILE%" call :load_env "%ENV_FILE%" + +if not defined CONTAINER_NAME set "CONTAINER_NAME=oscar-postgis-container" + +set "PGDATA_DIR=%SCRIPT_DIR%\pgdata" +set "NODE_DIR=%SCRIPT_DIR%\osh-node-oscar" +set "DB_DIR=%NODE_DIR%\db" +set "FILES_DIR=%NODE_DIR%\files" +set "CONFIG_JSON=%NODE_DIR%\config.json" +set "CONFIG_TEMPLATE=%NODE_DIR%\config.template.json" +set "SECRET_FILE=%NODE_DIR%\.s" + +echo Requesting monitor shutdown... +if exist "%SCRIPT_DIR%\monitor-oscar.bat" ( + call "%SCRIPT_DIR%\monitor-oscar.bat" stop >nul 2>nul +) + +echo Stopping OSCAR Java processes... +for /f "usebackq delims=" %%P in (` + powershell -NoProfile -ExecutionPolicy Bypass -Command "$procs = Get-CimInstance Win32_Process; foreach ($proc in $procs) { if ($proc.Name -match '^(java|javaw)(\.exe)?$' -and $null -ne $proc.CommandLine -and $proc.CommandLine -like '*com.botts.impl.security.SensorHubWrapper*') { $proc.ProcessId } }" 2^>nul +`) do ( + powershell -NoProfile -ExecutionPolicy Bypass -Command "try { Stop-Process -Id %%P -Force -ErrorAction Stop } catch {}" >nul 2>nul +) + +echo Removing container: %CONTAINER_NAME%... 
+docker rm -f -v "%CONTAINER_NAME%" >nul 2>nul + +if exist "%PGDATA_DIR%" ( + echo Removing Postgres data directory: %PGDATA_DIR% + rmdir /s /q "%PGDATA_DIR%" +) else ( + echo Postgres data directory not found: %PGDATA_DIR% +) + +if exist "%DB_DIR%" ( + echo Removing OSCAR runtime DB directory: %DB_DIR% + rmdir /s /q "%DB_DIR%" +) else ( + echo OSCAR runtime DB directory not found: %DB_DIR% +) + +if exist "%FILES_DIR%" ( + echo Removing OSCAR files directory: %FILES_DIR% + rmdir /s /q "%FILES_DIR%" +) else ( + echo OSCAR files directory not found: %FILES_DIR% +) + +if exist "%CONFIG_TEMPLATE%" ( + echo Restoring config.json from template: %CONFIG_TEMPLATE% + copy /y "%CONFIG_TEMPLATE%" "%CONFIG_JSON%" >nul +) else ( + if exist "%CONFIG_JSON%" ( + echo WARNING: config.template.json not found. Resetting admin password placeholder in existing config.json. + powershell -NoProfile -ExecutionPolicy Bypass -Command ^ + "$path = '%CONFIG_JSON%';" ^ + "$json = Get-Content -LiteralPath $path -Raw;" ^ + "$pattern = '(\"id\"\s*:\s*\"admin\"[\s\S]*?\"password\"\s*:\s*)\"[^\"]*\"';" ^ + "$updated = [regex]::Replace($json, $pattern, '$1\"__INITIAL_ADMIN_PASSWORD__\"', 1);" ^ + "Set-Content -LiteralPath $path -Value $updated -NoNewline" + ) else ( + echo OSCAR config not found: %CONFIG_JSON% + ) +) + +echo Restoring initial admin secret file: %SECRET_FILE% +> "%SECRET_FILE%" echo oscar + +del "%SCRIPT_DIR%\.monitor-active-dir" >nul 2>nul + +echo. +echo Reset complete. +echo Next launch should initialize the default login as admin / oscar. 
+if exist "%~dp0.monitor-lock" rmdir /s /q "%~dp0.monitor-lock" >nul 2>nul +del /q "%~dp0monitor.heartbeat" "%~dp0oscar.pid" "%~dp0current-monitor-dir.txt" >nul 2>nul + +exit /b 0 + +:load_env +for /f "usebackq tokens=1,* delims==" %%A in ("%~1") do ( + set "ENV_NAME=%%A" + set "ENV_VALUE=%%B" + call :set_env_var +) +exit /b 0 + +:set_env_var +if not defined ENV_NAME exit /b 0 +if "%ENV_NAME:~0,1%"=="#" exit /b 0 +if /I "%ENV_NAME:~0,7%"=="export " set "ENV_NAME=%ENV_NAME:~7%" +set "%ENV_NAME%=%ENV_VALUE%" +exit /b 0 \ No newline at end of file diff --git a/dist/release/reset-all.sh b/dist/release/reset-all.sh new file mode 100755 index 0000000..ebc3877 --- /dev/null +++ b/dist/release/reset-all.sh @@ -0,0 +1,78 @@ +#!/usr/bin/env bash +set -u + +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" + +ENV_FILE="${SCRIPT_DIR}/.env" +if [[ -f "${ENV_FILE}" ]]; then + while IFS='=' read -r key value || [[ -n "${key}" ]]; do + [[ -z "${key}" ]] && continue + [[ "${key:0:1}" == "#" ]] && continue + if [[ "${key:0:7}" == "export " ]]; then + key="${key:7}" + fi + export "${key}=${value}" + done < "${ENV_FILE}" +fi + +CONTAINER_NAME="${CONTAINER_NAME:-oscar-postgis-container}" + +PGDATA_DIR="${SCRIPT_DIR}/pgdata" +NODE_DIR="${SCRIPT_DIR}/osh-node-oscar" +DB_DIR="${NODE_DIR}/db" +FILES_DIR="${NODE_DIR}/files" +CONFIG_JSON="${NODE_DIR}/config.json" +CONFIG_TEMPLATE="${NODE_DIR}/config.template.json" +SECRET_FILE="${NODE_DIR}/.s" + +echo "Requesting monitor shutdown..." +if [[ -x "${SCRIPT_DIR}/monitor-oscar.sh" ]]; then + "${SCRIPT_DIR}/monitor-oscar.sh" stop >/dev/null 2>&1 || true +fi + +echo "Stopping OSCAR Java processes..." +pgrep -af 'com\.botts\.impl\.security\.SensorHubWrapper' >/dev/null 2>&1 && \ + pkill -f 'com\.botts\.impl\.security\.SensorHubWrapper' >/dev/null 2>&1 || true + +echo "Removing container: ${CONTAINER_NAME}..." 
+docker rm -f -v "${CONTAINER_NAME}" >/dev/null 2>&1 || true + +if [[ -d "${PGDATA_DIR}" ]]; then + echo "Removing Postgres data directory: ${PGDATA_DIR}" + rm -rf "${PGDATA_DIR}" +else + echo "Postgres data directory not found: ${PGDATA_DIR}" +fi + +if [[ -d "${DB_DIR}" ]]; then + echo "Removing OSCAR runtime DB directory: ${DB_DIR}" + rm -rf "${DB_DIR}" +else + echo "OSCAR runtime DB directory not found: ${DB_DIR}" +fi + +if [[ -d "${FILES_DIR}" ]]; then + echo "Removing OSCAR files directory: ${FILES_DIR}" + rm -rf "${FILES_DIR}" +else + echo "OSCAR files directory not found: ${FILES_DIR}" +fi + +if [[ -f "${CONFIG_TEMPLATE}" ]]; then + echo "Restoring config.json from template: ${CONFIG_TEMPLATE}" + cp -f "${CONFIG_TEMPLATE}" "${CONFIG_JSON}" +elif [[ -f "${CONFIG_JSON}" ]]; then + echo "WARNING: config.template.json not found. Resetting admin password placeholder in existing config.json." + perl -0pi -e 's/("id"\s*:\s*"admin"[\s\S]*?"password"\s*:\s*)"(?:[^"\\]|\\.)*"/$1"__INITIAL_ADMIN_PASSWORD__"/s' "${CONFIG_JSON}" +else + echo "OSCAR config not found: ${CONFIG_JSON}" +fi + +echo "Restoring initial admin secret file: ${SECRET_FILE}" +printf 'oscar\n' > "${SECRET_FILE}" + +rm -f "${SCRIPT_DIR}/.monitor-active-dir" + +echo +echo "Reset complete." +echo "Next launch should initialize the default login as admin / oscar." \ No newline at end of file diff --git a/dist/release/stop-all.bat b/dist/release/stop-all.bat old mode 100755 new mode 100644 index 9bbd92f..3f0093a --- a/dist/release/stop-all.bat +++ b/dist/release/stop-all.bat @@ -1,23 +1,48 @@ -@echo off -set CONTAINER_NAME=oscar-postgis-container -set SENSORHUB_NAME=com.botts.impl.security.SensorHubWrapper - -echo Stopping container: %CONTAINER_NAME%... - -docker stop %CONTAINER_NAME% - -echo. -echo Stopping SensorHubWrapper Java Process... 
- -FOR /F "tokens=1" %%A IN ('wmic process where "CommandLine like '%%%SENSORHUB_NAME%%%' and name='java.exe'" get ProcessId ^| findstr /R "[0-9]"') DO ( - echo Stopping SensorHubWrapper with PID %%A... - taskkill /PID %%A /F - echo SensorHubWrapper stopped. - goto :DoneJava -) - -echo SensorHubWrapper process not found. - -:DoneJava -echo. -echo Done. +@echo off +setlocal EnableExtensions + +set "SCRIPT_DIR=%~dp0" +if "%SCRIPT_DIR:~-1%"=="\" set "SCRIPT_DIR=%SCRIPT_DIR:~0,-1%" +set "CONTAINER_NAME=oscar-postgis-container" +set "SENSORHUB_NAME=com.botts.impl.security.SensorHubWrapper" + +if exist "%SCRIPT_DIR%\.env" ( + for /f "usebackq tokens=* delims=" %%L in ("%SCRIPT_DIR%\.env") do ( + set "LINE=%%L" + call :parse_env_line + ) +) + +echo Requesting monitor stop if active... +if exist "%SCRIPT_DIR%\monitor-oscar.bat" ( + call "%SCRIPT_DIR%\monitor-oscar.bat" stop >nul 2>nul + timeout /t 5 /nobreak >nul +) + +echo. +echo Stopping container: %CONTAINER_NAME%... +docker stop %CONTAINER_NAME% >nul 2>nul +if errorlevel 1 ( + echo Container not found or already stopped. +) else ( + echo Container stop requested. +) + +echo. +echo Stopping SensorHubWrapper Java process... +powershell -NoProfile -ExecutionPolicy Bypass -Command "Get-CimInstance Win32_Process | Where-Object { $_.Name -match '^(java|javaw)(\.exe)?$' -and $null -ne $_.CommandLine -and $_.CommandLine -like '*%SENSORHUB_NAME%*' } | ForEach-Object { try { Stop-Process -Id $_.ProcessId -Force -ErrorAction Stop; Write-Output ('Stopped PID ' + $_.ProcessId) } catch {} }" + +echo. +echo Done. 
+if exist "%~dp0.monitor-lock" rmdir /s /q "%~dp0.monitor-lock" >nul 2>nul +del /q "%~dp0monitor.heartbeat" "%~dp0oscar.pid" "%~dp0current-monitor-dir.txt" >nul 2>nul + +exit /b 0 + +:parse_env_line +if not defined LINE exit /b 0 +if "%LINE:~0,1%"=="#" exit /b 0 +for /f "tokens=1,* delims==" %%A in ("%LINE%") do ( + if /I "%%A"=="CONTAINER_NAME" set "CONTAINER_NAME=%%B" +) +exit /b 0 \ No newline at end of file diff --git a/dist/release/stop-all.sh b/dist/release/stop-all.sh index 522477d..f80be77 100755 --- a/dist/release/stop-all.sh +++ b/dist/release/stop-all.sh @@ -1,41 +1,53 @@ #!/bin/bash +set -euo pipefail +SCRIPT_DIR="$(cd -- "$(dirname -- "${BASH_SOURCE[0]}")" && pwd)" +ENV_FILE="${ENV_FILE:-$SCRIPT_DIR/.env}" +MONITOR_SCRIPT="$SCRIPT_DIR/monitor-oscar.sh" +STATE_DIR="$SCRIPT_DIR/.monitor-state" +LOCK_DIR="$STATE_DIR/lock" CONTAINER_NAME="oscar-postgis-container" SENSORHUB_NAME="com.botts.impl.security.SensorHubWrapper" -echo "Stopping container: $CONTAINER_NAME..." +if [ -f "$ENV_FILE" ]; then + set -a + . "$ENV_FILE" + set +a + CONTAINER_NAME="${CONTAINER_NAME:-oscar-postgis-container}" +fi + +echo "Requesting monitor stop if active..." +if [ -x "$MONITOR_SCRIPT" ]; then + "$MONITOR_SCRIPT" stop || true + for _ in 1 2 3 4 5 6 7 8 9 10; do + if [ ! -d "$LOCK_DIR" ]; then + break + fi + sleep 1 + done +fi -# Stop Docker container if it exists +echo +printf 'Stopping container: %s...\n' "$CONTAINER_NAME" if docker ps -a --format '{{.Names}}' | grep -q "^${CONTAINER_NAME}$"; then - echo "Container exists. Stopping..." - docker stop "$CONTAINER_NAME" - echo "Container stopped." + docker stop "$CONTAINER_NAME" >/dev/null 2>&1 || true + echo "Container stop requested." else - echo "Container not found. Nothing to stop." + echo "Container not found." fi echo echo "Stopping SensorHubWrapper Java process..." 
- -PID="" - -# --- Option 1: Use jps if available --- -if command -v jps >/dev/null 2>&1; then - PID=$(jps -l | grep "$SENSORHUB_NAME" | awk '{print $1}') -fi - -# --- Option 2: fallback to pgrep if PID not found --- -if [ -z "$PID" ]; then - if command -v pgrep >/dev/null 2>&1; then - PID=$(pgrep -f "$SENSORHUB_NAME") +PIDS="$(pgrep -f "$SENSORHUB_NAME" || true)" +if [ -n "$PIDS" ]; then + echo "Stopping SensorHubWrapper with PID(s): $PIDS" + kill $PIDS 2>/dev/null || true + sleep 3 + REMAINING="$(pgrep -f "$SENSORHUB_NAME" || true)" + if [ -n "$REMAINING" ]; then + echo "Force killing remaining PID(s): $REMAINING" + kill -9 $REMAINING 2>/dev/null || true fi -fi - -# --- Kill process if found --- -if [ -n "$PID" ]; then - echo "Stopping SensorHubWrapper with PID(s): $PID" - kill -9 $PID - echo "SensorHubWrapper stopped." else echo "SensorHubWrapper process not found." fi diff --git a/dist/scripts/standard/launch.bat b/dist/scripts/standard/launch.bat index 4b50578..689f64e 100755 --- a/dist/scripts/standard/launch.bat +++ b/dist/scripts/standard/launch.bat @@ -1,41 +1,195 @@ -@echo off -setlocal enabledelayedexpansion - - -REM Make sure all the necessary certificates are trusted by the system. 
-CALL %~dp0load_trusted_certs.bat - -set KEYSTORE=.\osh-keystore.p12 -set KEYSTORE_TYPE=PKCS12 -set KEYSTORE_PASSWORD=atakatak - -set TRUSTSTORE=.\truststore.jks -set TRUSTSTORE_TYPE=JKS -set TRUSTSTORE_PASSWORD=changeit - -set INITIAL_ADMIN_PASSWORD_FILE=.\.s - - -REM Check if INITIAL_ADMIN_PASSWORD_FILE and INITIAL_ADMIN_PASSWORD are empty -REM Set default password if neither is provided -if "%INITIAL_ADMIN_PASSWORD_FILE%"=="" if "%INITIAL_ADMIN_PASSWORD%"=="" ( - set INITIAL_ADMIN_PASSWORD=admin -) - -REM Call the next batch script to handle setting the initial admin password -CALL "%SCRIPT_DIR%set-initial-admin-password.bat" - -REM Start the node -java -Xms6g -Xmx6g -Xss256k -XX:ReservedCodeCacheSize=512m -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError ^ - -Dlogback.configurationFile=./logback.xml ^ - -cp "lib/*" ^ - -Djava.system.class.loader="org.sensorhub.utils.NativeClassLoader" ^ - -Djavax.net.ssl.keyStore="./osh-keystore.p12" ^ - -Djavax.net.ssl.keyStorePassword="atakatak" ^ - -Djavax.net.ssl.trustStore="%~dp0trustStore.jks" ^ - -Djavax.net.ssl.trustStorePassword="changeit" ^ - -Djava.library.path="./nativelibs" ^ - com.botts.impl.security.SensorHubWrapper config.json db - - -endlocal +@echo off +setlocal EnableExtensions EnableDelayedExpansion + +set "SCRIPT_DIR=%~dp0" +if "%SCRIPT_DIR:~-1%"=="\" set "SCRIPT_DIR=%SCRIPT_DIR:~0,-1%" + +set "ENV_FILE=" +if exist "%SCRIPT_DIR%\.env" ( + set "ENV_FILE=%SCRIPT_DIR%\.env" +) else if exist "%SCRIPT_DIR%\..\.env" ( + set "ENV_FILE=%SCRIPT_DIR%\..\.env" +) + +if defined ENV_FILE call :load_env "%ENV_FILE%" + +if not defined SYSTEM_PROFILE set "SYSTEM_PROFILE=8GB" +if not defined FORCE_RESTART set "FORCE_RESTART=0" + +where java >nul 2>nul +if errorlevel 1 ( + echo ERROR: Java was not found in PATH. + exit /b 1 +) + +where keytool >nul 2>nul +if errorlevel 1 ( + echo ERROR: keytool was not found in PATH. 
+ exit /b 1 +) + +if not exist "%SCRIPT_DIR%\lib" ( + echo ERROR: Missing library directory: "%SCRIPT_DIR%\lib" + exit /b 1 +) + +if not exist "%SCRIPT_DIR%\config.json" ( + echo ERROR: Missing config file: "%SCRIPT_DIR%\config.json" + exit /b 1 +) + +if not exist "%SCRIPT_DIR%\load_trusted_certs.bat" ( + echo ERROR: Missing trusted-certs helper: "%SCRIPT_DIR%\load_trusted_certs.bat" + exit /b 1 +) + +if not exist "%SCRIPT_DIR%\set-initial-admin-password.bat" ( + echo ERROR: Missing admin-password helper: "%SCRIPT_DIR%\set-initial-admin-password.bat" + exit /b 1 +) + +call :check_existing_oscar +if defined OSCAR_PID ( + if /I "%FORCE_RESTART%"=="1" ( + echo OSCAR is already running with PID !OSCAR_PID!. FORCE_RESTART=1, stopping it first... + call :stop_existing_oscar + call :wait_for_oscar_stop 60 + call :check_existing_oscar + if defined OSCAR_PID ( + echo ERROR: OSCAR is still running with PID !OSCAR_PID! after stop attempt. + exit /b 1 + ) + ) else ( + echo OSCAR is already running with PID !OSCAR_PID!. + echo Run stop-all.bat first, or set FORCE_RESTART=1 to replace the existing OSCAR process. + exit /b 1 + ) +) + +if /I "%SYSTEM_PROFILE%"=="RPI4" ( + set "JAVA_XMS=512m" + set "JAVA_XMX=1536m" + set "JAVACPP_MAX_BYTES_DEFAULT=512m" + set "JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT=2g" +) else if /I "%SYSTEM_PROFILE%"=="8GB" ( + set "JAVA_XMS=1g" + set "JAVA_XMX=2g" + set "JAVACPP_MAX_BYTES_DEFAULT=1g" + set "JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT=4g" +) else if /I "%SYSTEM_PROFILE%"=="16GB" ( + set "JAVA_XMS=1g" + set "JAVA_XMX=3g" + set "JAVACPP_MAX_BYTES_DEFAULT=2g" + set "JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT=8g" +) else if /I "%SYSTEM_PROFILE%"=="32GB" ( + set "JAVA_XMS=2g" + set "JAVA_XMX=6g" + set "JAVACPP_MAX_BYTES_DEFAULT=4g" + set "JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT=16g" +) else ( + echo WARNING: Unknown SYSTEM_PROFILE "%SYSTEM_PROFILE%". Using 8GB defaults. 
+ set "JAVA_XMS=1g" + set "JAVA_XMX=2g" + set "JAVACPP_MAX_BYTES_DEFAULT=1g" + set "JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT=4g" +) + +if not defined JAVACPP_MAX_BYTES set "JAVACPP_MAX_BYTES=%JAVACPP_MAX_BYTES_DEFAULT%" +if not defined JAVACPP_MAX_PHYSICAL_BYTES set "JAVACPP_MAX_PHYSICAL_BYTES=%JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT%" +if not defined JFR_FILENAME set "JFR_FILENAME=%SCRIPT_DIR%\oscar.jfr" + +echo Starting OSH Node with Profile: %SYSTEM_PROFILE% +echo Heap: %JAVA_XMS% / %JAVA_XMX% +echo JavaCPP maxBytes: %JAVACPP_MAX_BYTES% +echo JavaCPP maxPhysicalBytes: %JAVACPP_MAX_PHYSICAL_BYTES% +echo JFR file: %JFR_FILENAME% + +call "%SCRIPT_DIR%\load_trusted_certs.bat" +if errorlevel 1 exit /b %ERRORLEVEL% + +set "KEYSTORE=%SCRIPT_DIR%\osh-keystore.p12" +set "KEYSTORE_TYPE=PKCS12" +if not defined KEYSTORE_PASSWORD set "KEYSTORE_PASSWORD=atakatak" + +set "TRUSTSTORE=%SCRIPT_DIR%\truststore.jks" +set "TRUSTSTORE_TYPE=JKS" +if not defined TRUSTSTORE_PASSWORD set "TRUSTSTORE_PASSWORD=changeit" + +set "INITIAL_ADMIN_PASSWORD_FILE=%SCRIPT_DIR%\.s" +if not exist "%INITIAL_ADMIN_PASSWORD_FILE%" if not defined INITIAL_ADMIN_PASSWORD set "INITIAL_ADMIN_PASSWORD=admin" + +call "%SCRIPT_DIR%\set-initial-admin-password.bat" +if errorlevel 1 exit /b %ERRORLEVEL% + +set "JAVA_LIBRARY_OPT=" +if exist "%SCRIPT_DIR%\nativelibs" ( + set "JAVA_LIBRARY_OPT=-Djava.library.path=%SCRIPT_DIR%\nativelibs" +) else ( + echo WARNING: Optional native library directory not found: "%SCRIPT_DIR%\nativelibs" +) + +java ^ + -Xms%JAVA_XMS% ^ + -Xmx%JAVA_XMX% ^ + -Xss256k ^ + -XX:ReservedCodeCacheSize=256m ^ + -XX:+UseG1GC ^ + -XX:+HeapDumpOnOutOfMemoryError ^ + -XX:+UnlockDiagnosticVMOptions ^ + -XX:NativeMemoryTracking=summary ^ + "-Dorg.bytedeco.javacpp.maxBytes=%JAVACPP_MAX_BYTES%" ^ + "-Dorg.bytedeco.javacpp.maxPhysicalBytes=%JAVACPP_MAX_PHYSICAL_BYTES%" ^ + -Dorg.bytedeco.javacpp.maxRetries=2 ^ + -Dorg.bytedeco.javacpp.mxbean=true ^ + "-Dlogback.configurationFile=%SCRIPT_DIR%\logback.xml" ^ + -cp 
"%SCRIPT_DIR%\lib\*" ^ + "-Djava.system.class.loader=org.sensorhub.utils.NativeClassLoader" ^ + "-Djavax.net.ssl.keyStore=%KEYSTORE%" ^ + "-Djavax.net.ssl.keyStorePassword=%KEYSTORE_PASSWORD%" ^ + "-Djavax.net.ssl.trustStore=%TRUSTSTORE%" ^ + "-Djavax.net.ssl.trustStorePassword=%TRUSTSTORE_PASSWORD%" ^ + !JAVA_LIBRARY_OPT! ^ + com.botts.impl.security.SensorHubWrapper "%SCRIPT_DIR%\config.json" "%SCRIPT_DIR%\db" + +set "JAVA_EXIT_CODE=%ERRORLEVEL%" +endlocal & exit /b %JAVA_EXIT_CODE% + +:check_existing_oscar +set "OSCAR_PID=" +for /f "usebackq delims=" %%P in (` + powershell -NoProfile -ExecutionPolicy Bypass -Command "$procs = Get-CimInstance Win32_Process; foreach ($proc in $procs) { if ($proc.Name -match '^(java|javaw)(\.exe)?$' -and $null -ne $proc.CommandLine -and $proc.CommandLine -like '*com.botts.impl.security.SensorHubWrapper*') { [Console]::Write($proc.ProcessId); break } }" 2^>nul +`) do set "OSCAR_PID=%%P" +exit /b 0 + +:stop_existing_oscar +if not defined OSCAR_PID exit /b 0 +powershell -NoProfile -ExecutionPolicy Bypass -Command "try { Stop-Process -Id %OSCAR_PID% -Force -ErrorAction Stop } catch {}" >nul 2>nul +exit /b 0 + +:wait_for_oscar_stop +set "WAIT_LIMIT=%~1" +if not defined WAIT_LIMIT set "WAIT_LIMIT=60" +set /a WAITED=0 + +:wait_for_oscar_stop_loop +call :check_existing_oscar +if not defined OSCAR_PID exit /b 0 +if !WAITED! 
GEQ %WAIT_LIMIT% exit /b 0
+timeout /t 1 /nobreak >nul
+set /a WAITED+=1
+goto wait_for_oscar_stop_loop
+
+:load_env
+for /f "usebackq tokens=1,* delims==" %%A in ("%~1") do (
+    set "ENV_NAME=%%A"
+    set "ENV_VALUE=%%B"
+    call :set_env_var
+)
+exit /b 0
+
+:set_env_var
+if not defined ENV_NAME exit /b 0
+if "%ENV_NAME:~0,1%"=="#" exit /b 0
+if /I "%ENV_NAME:~0,7%"=="export " set "ENV_NAME=%ENV_NAME:~7%"
+set "%ENV_NAME%=%ENV_VALUE%"
+exit /b 0
\ No newline at end of file
diff --git a/dist/scripts/standard/launch.sh b/dist/scripts/standard/launch.sh
index f9d4094..7aec8f2 100755
--- a/dist/scripts/standard/launch.sh
+++ b/dist/scripts/standard/launch.sh
@@ -1,37 +1,232 @@
-#!/bin/bash
+#!/usr/bin/env bash
+set -euo pipefail

-# Make sure all the necessary certificates are trusted by the system.
-SCRIPT_DIR=$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )
-"$SCRIPT_DIR/load_trusted_certs.sh"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+MATCH_EXPR='com.botts.impl.security.SensorHubWrapper'

-    export KEYSTORE="./osh-keystore.p12"
-    export KEYSTORE_TYPE=PKCS12
-    export KEYSTORE_PASSWORD="atakatak"
+load_env() {
+    local env_file="$1"
+    while IFS= read -r line || [ -n "$line" ]; do
+        case "$line" in
+            ""|"#"*) continue ;;
+            export\ *) line="${line#export }" ;;
+        esac
+        local name="${line%%=*}"
+        local value="${line#*=}"
+        value="${value%$'\r'}"
+        export "${name}=${value}"
+    done < "$env_file"
+}

-    export TRUSTSTORE="./truststore.jks"
-    export TRUSTSTORE_TYPE=JKS
-    export TRUSTSTORE_PASSWORD="changeit"
-    export INITIAL_ADMIN_PASSWORD_FILE="./.s"
+require_cmd() {
+    local cmd="$1"
+    if ! command -v "$cmd" >/dev/null 2>&1; then
+        echo "Error: required command not found: $cmd"
+        exit 1
+    fi
+}
+get_java_major() {
+    java -version 2>&1 | awk -F'"' '/version/ { split($2, v, "."); print v[1]; exit }'
+}

-# After copying the default configuration file, also look to see if they
-# specified what they want the initial admin user's password to be, either
-# as a secret file or by providing it as an environment variable.
-if [ -z "$INITIAL_ADMIN_PASSWORD_FILE" ] && [ -z "$INITIAL_ADMIN_PASSWORD" ]; then
-    export INITIAL_ADMIN_PASSWORD=admin
+check_dependencies() {
+    require_cmd bash
+    require_cmd java
+    require_cmd keytool
+
+    local java_major
+    java_major="$(get_java_major || true)"
+    if [[ -z "$java_major" || ! "$java_major" =~ ^[0-9]+$ ]]; then
+        echo "Error: could not determine Java version. Java 21 or newer is required."
+        exit 1
+    fi
+    if [ "$java_major" -lt 21 ]; then
+        echo "Error: Java 21 or newer is required. Found Java $java_major."
+        exit 1
+    fi
+}
+
+find_existing_oscar_pids() {
+    pgrep -f "$MATCH_EXPR" || true
+}
+
+stop_existing_oscar() {
+    local pids="$1"
+    if [ -z "$pids" ]; then
+        return 0
+    fi
+
+    echo "Stopping existing OSCAR instance(s): $pids"
+    kill $pids 2>/dev/null || true
+
+    local waited=0
+    while [ "$waited" -lt 15 ]; do
+        sleep 1
+        waited=$((waited + 1))
+        if [ -z "$(find_existing_oscar_pids)" ]; then
+            return 0
+        fi
+    done
+
+    echo "Existing OSCAR instance still running after graceful stop. Forcing stop."
+    kill -9 $pids 2>/dev/null || true
+    sleep 1
+
+    if [ -n "$(find_existing_oscar_pids)" ]; then
+        echo "Error: unable to stop the existing OSCAR instance."
+        exit 1
+    fi
+}
+
+check_existing_oscar() {
+    local pids
+    pids="$(find_existing_oscar_pids)"
+
+    if [ -z "$pids" ]; then
+        return 0
+    fi
+
+    if [ "${FORCE_RESTART:-0}" = "1" ]; then
+        echo "Existing OSCAR instance found with PID(s): $pids. Replacing because FORCE_RESTART=1."
+        stop_existing_oscar "$pids"
+        return 0
+    fi
+
+    echo "OSCAR is already running with PID(s): $pids."
+    echo "Run stop-all.sh first, or set FORCE_RESTART=1 to replace the existing OSCAR process."
+    exit 1
+}
+
+ensure_runtime_paths() {
+    if [ ! -f "$SCRIPT_DIR/config.json" ]; then
+        echo "Error: missing config file: $SCRIPT_DIR/config.json"
+        exit 1
+    fi
+
+    if [ ! -d "$SCRIPT_DIR/lib" ]; then
+        echo "Error: missing library directory: $SCRIPT_DIR/lib"
+        exit 1
+    fi
+
+    if [ ! -f "$SCRIPT_DIR/load_trusted_certs.sh" ]; then
+        echo "Error: load_trusted_certs.sh not found in $SCRIPT_DIR"
+        exit 1
+    fi
+
+    if [ ! -f "$SCRIPT_DIR/set-initial-admin-password.sh" ]; then
+        echo "Error: set-initial-admin-password.sh not found in $SCRIPT_DIR"
+        exit 1
+    fi
+
+    mkdir -p "$SCRIPT_DIR/db"
+}
+
+ENV_FILE=""
+if [ -f "$SCRIPT_DIR/.env" ]; then
+    ENV_FILE="$SCRIPT_DIR/.env"
+elif [ -f "$SCRIPT_DIR/../.env" ]; then
+    ENV_FILE="$SCRIPT_DIR/../.env"
+fi
+
+if [ -n "$ENV_FILE" ]; then
+    load_env "$ENV_FILE"
+fi
+
+check_dependencies
+check_existing_oscar
+ensure_runtime_paths
+
+SYSTEM_PROFILE="${SYSTEM_PROFILE:-8GB}"
+
+case "$SYSTEM_PROFILE" in
+    RPI4)
+        JAVA_XMS="512m"
+        JAVA_XMX="1536m"
+        JAVACPP_MAX_BYTES_DEFAULT="512m"
+        JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT="2g"
+        ;;
+    8GB)
+        JAVA_XMS="1g"
+        JAVA_XMX="2g"
+        JAVACPP_MAX_BYTES_DEFAULT="1g"
+        JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT="4g"
+        ;;
+    16GB)
+        JAVA_XMS="1g"
+        JAVA_XMX="3g"
+        JAVACPP_MAX_BYTES_DEFAULT="2g"
+        JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT="8g"
+        ;;
+    32GB)
+        JAVA_XMS="2g"
+        JAVA_XMX="6g"
+        JAVACPP_MAX_BYTES_DEFAULT="4g"
+        JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT="16g"
+        ;;
+    *)
+        echo "Unknown profile '$SYSTEM_PROFILE', using 8GB defaults."
+        JAVA_XMS="1g"
+        JAVA_XMX="2g"
+        JAVACPP_MAX_BYTES_DEFAULT="1g"
+        JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT="4g"
+        ;;
+esac
+
+: "${JAVACPP_MAX_BYTES:=$JAVACPP_MAX_BYTES_DEFAULT}"
+: "${JAVACPP_MAX_PHYSICAL_BYTES:=$JAVACPP_MAX_PHYSICAL_BYTES_DEFAULT}"
+: "${JFR_FILENAME:=$SCRIPT_DIR/oscar.jfr}"
+
+mkdir -p "$(dirname "$JFR_FILENAME")"
+
+export KEYSTORE="${KEYSTORE:-$SCRIPT_DIR/osh-keystore.p12}"
+export KEYSTORE_TYPE="${KEYSTORE_TYPE:-PKCS12}"
+export KEYSTORE_PASSWORD="${KEYSTORE_PASSWORD:-atakatak}"
+
+export TRUSTSTORE="${TRUSTSTORE:-$SCRIPT_DIR/trustStore.jks}"
+export TRUSTSTORE_TYPE="${TRUSTSTORE_TYPE:-JKS}"
+export TRUSTSTORE_PASSWORD="${TRUSTSTORE_PASSWORD:-changeit}"
+
+export INITIAL_ADMIN_PASSWORD_FILE="${INITIAL_ADMIN_PASSWORD_FILE:-$SCRIPT_DIR/.s}"
+if [ ! -f "$INITIAL_ADMIN_PASSWORD_FILE" ] && [ -z "${INITIAL_ADMIN_PASSWORD:-}" ]; then
+    export INITIAL_ADMIN_PASSWORD="oscar"
+fi
+
+if [ -z "${HOME:-}" ] && [ -n "${USER:-}" ]; then
+    export HOME="/home/${USER}"
+fi
+
+JAVA_LIBRARY_PATH_ARG=()
+if [ -d "$SCRIPT_DIR/nativelibs" ]; then
+    JAVA_LIBRARY_PATH_ARG=("-Djava.library.path=$SCRIPT_DIR/nativelibs")
+else
+    echo "Warning: optional native library directory not found: $SCRIPT_DIR/nativelibs"
 fi
-"$SCRIPT_DIR/set-initial-admin-password.sh"
+echo "Starting OSH Node with Profile: $SYSTEM_PROFILE"
+echo " Heap: $JAVA_XMS / $JAVA_XMX"
+echo " JavaCPP maxBytes: $JAVACPP_MAX_BYTES"
+echo " JavaCPP maxPhysicalBytes: $JAVACPP_MAX_PHYSICAL_BYTES"
+echo " JFR file: $JFR_FILENAME"
+bash "$SCRIPT_DIR/load_trusted_certs.sh"
+bash "$SCRIPT_DIR/set-initial-admin-password.sh"

-# Start the node
-java -Xms6g -Xmx6g -Xss256k -XX:ReservedCodeCacheSize=512m -XX:+UseG1GC -XX:+HeapDumpOnOutOfMemoryError \
-    -Dlogback.configurationFile=./logback.xml \
-    -cp "lib/*" \
-    -Djava.system.class.loader="org.sensorhub.utils.NativeClassLoader" \
-    -Djavax.net.ssl.keyStore="./osh-keystore.p12" \
-    -Djavax.net.ssl.keyStorePassword="atakatak" \
-    -Djavax.net.ssl.trustStore="$SCRIPT_DIR/trustStore.jks" \
-    -Djavax.net.ssl.trustStorePassword="changeit" \
-    -Djava.library.path="./nativelibs" \
-    com.botts.impl.security.SensorHubWrapper ./config.json ./db
+exec java \
+    -Xms"$JAVA_XMS" \
+    -Xmx"$JAVA_XMX" \
+    -Xss256k \
+    -XX:ReservedCodeCacheSize=256m \
+    -XX:+UseG1GC \
+    -XX:+HeapDumpOnOutOfMemoryError \
+    -XX:+UnlockDiagnosticVMOptions \
+    -XX:NativeMemoryTracking=summary \
+    "-Dorg.bytedeco.javacpp.maxBytes=$JAVACPP_MAX_BYTES" \
+    "-Dorg.bytedeco.javacpp.maxPhysicalBytes=$JAVACPP_MAX_PHYSICAL_BYTES" \
+    -Dorg.bytedeco.javacpp.maxRetries=2 \
+    -Dorg.bytedeco.javacpp.mxbean=true \
+    "-Dlogback.configurationFile=$SCRIPT_DIR/logback.xml" \
+    -cp "$SCRIPT_DIR/lib/*" \
+    "-Djava.system.class.loader=org.sensorhub.utils.NativeClassLoader" \
+    "${JAVA_LIBRARY_PATH_ARG[@]}" \
+    com.botts.impl.security.SensorHubWrapper "$SCRIPT_DIR/config.json" "$SCRIPT_DIR/db"
diff --git a/dist/scripts/standard/load_trusted_certs.bat b/dist/scripts/standard/load_trusted_certs.bat
index 5357164..247bc3c 100755
--- a/dist/scripts/standard/load_trusted_certs.bat
+++ b/dist/scripts/standard/load_trusted_certs.bat
@@ -1,64 +1,99 @@
-@echo off
-setlocal
-
-echo Building Java trust store...
-
-REM Default password for the sytem trust store is "changeit". Edit this next
-REM line if it's something different in your Java installation.
-set "STOREPASS=changeit"
-
-REM Get the path of this script.
-set "SCRIPTDIR=%~dp0"
-
-REM Get the path where we'll build the new trust store.
-set "NEWTRUSTSTORE=%SCRIPTDIR%trustStore.jks"
-
-REM To find the location of the system trust store, we start by finding the
-REM path to "java.exe".
-for /f "tokens=* usebackq" %%j in (`where java`) do (set "JAVA=%%~dpj" & goto :next )
-:next
-REM Then we back up a directory and look in lib\security.
-set "CACERTS=%JAVA%..\lib\security\cacerts"
-
-REM Now make a copy of that default system trust store into this directory,
-REM where we'll add our stuff to it.
-copy /y "%CACERTS%" "%NEWTRUSTSTORE%" >NUL
-
-REM Get the full path to where our certs are.
-set CERTDIR=%SCRIPTDIR%trusted_certificates
-
-REM Now for each .cer, .pem, and .crt file in our cert dir, check to see if we
-REM need to add it to the system trust store.
-for %%c in ( %CERTDIR%\*.cer %CERTDIR%\*.pem %CERTDIR%\*.crt ) do (
-    call :check_certificate %%c
-)
-
-goto :end_of_script
-
-REM The next few lines define a function that checks whether a certificate
-REM is already loaded in the system store. If so, it does nothing. If not, it
-REM attempts to load it in. Note that the alias of the certificate is
-REM calculated as the base file name (without path or extension).
-REM NOTE: As currently written, this is performing an unnecessary check, since
-REM we're guaranteed that none of the certificates will ever be present in the
-REM original file.
-
-:check_certificate
-set ALIAS=%~n1
-REM Check for existence. ERRORLEVEL is set to 0 if it's found, and something
-REM else otherwise.
-keytool -list -keystore "%NEWTRUSTSTORE%" -storepass "%STOREPASS%" -alias "%ALIAS%" >NUL 2>NUL
-if not "%ERRORLEVEL%" == "0" (
-    echo Importing "%ALIAS%" from "%1"
-    keytool -importcert -keystore "%NEWTRUSTSTORE%" -noprompt -storepass "%STOREPASS%" -alias "%ALIAS%" -file "%1"
-) else (
-    echo Certificate with alias "%ALIAS%" already exists. Skipping.
-)
-REM Return to caller.
-exit /b 0
-
-:end_of_script
-
-echo Done.
-
-endlocal
+@echo off
+setlocal EnableExtensions EnableDelayedExpansion
+
+echo Building Java trust store...
+
+set "STOREPASS=changeit"
+set "SCRIPTDIR=%~dp0"
+set "NEWTRUSTSTORE=%SCRIPTDIR%truststore.jks"
+set "CERTDIR=%SCRIPTDIR%trusted_certificates"
+set "CACERTS="
+set "JAVA_HOME_DETECTED="
+
+rem First try JAVA_HOME if already set
+if defined JAVA_HOME (
+    if exist "%JAVA_HOME%\conf\security\cacerts" set "CACERTS=%JAVA_HOME%\conf\security\cacerts"
+    if not defined CACERTS if exist "%JAVA_HOME%\lib\security\cacerts" set "CACERTS=%JAVA_HOME%\lib\security\cacerts"
+)
+
+rem If that did not work, ask Java itself for java.home
+if not defined CACERTS (
+    for /f "tokens=1,* delims==" %%A in ('java -XshowSettings:properties -version 2^>^&1 ^| findstr /c:"java.home ="') do (
+        set "JAVA_HOME_DETECTED=%%B"
+    )
+)
+
+rem Trim leading spaces
+if defined JAVA_HOME_DETECTED (
+    for /f "tokens=* delims= " %%H in ("!JAVA_HOME_DETECTED!") do set "JAVA_HOME_DETECTED=%%H"
+)
+
+rem Try common cacerts locations under detected java.home
+if not defined CACERTS if defined JAVA_HOME_DETECTED (
+    if exist "!JAVA_HOME_DETECTED!\conf\security\cacerts" set "CACERTS=!JAVA_HOME_DETECTED!\conf\security\cacerts"
+    if not defined CACERTS if exist "!JAVA_HOME_DETECTED!\lib\security\cacerts" set "CACERTS=!JAVA_HOME_DETECTED!\lib\security\cacerts"
+)
+
+if not defined CACERTS (
+    echo Error: could not locate Java cacerts.
+    if defined JAVA_HOME echo JAVA_HOME="%JAVA_HOME%"
+    if defined JAVA_HOME_DETECTED echo java.home="!JAVA_HOME_DETECTED!"
+    endlocal & exit /b 1
+)
+
+if not exist "%CACERTS%" (
+    echo Error: Java cacerts path does not exist: "%CACERTS%"
+    endlocal & exit /b 1
+)
+
+echo Using Java cacerts: "%CACERTS%"
+
+copy /y "%CACERTS%" "%NEWTRUSTSTORE%" >nul
+if errorlevel 1 (
+    echo Error: failed to create "%NEWTRUSTSTORE%"
+    endlocal & exit /b 1
+)
+
+if not exist "%CERTDIR%" (
+    echo Trusted certificates directory not found: "%CERTDIR%"
+    echo Using copied default trust store only.
+    echo Done.
+    endlocal & exit /b 0
+)
+
+set "FOUND_CERT=0"
+for %%c in ("%CERTDIR%\*.cer" "%CERTDIR%\*.pem" "%CERTDIR%\*.crt") do (
+    if exist "%%~fc" (
+        set "FOUND_CERT=1"
+        call :check_certificate "%%~fc"
+        if errorlevel 1 (
+            endlocal & exit /b 1
+        )
+    )
+)
+
+if "%FOUND_CERT%"=="0" (
+    echo No certificate files found in "%CERTDIR%".
+)
+
+echo Done.
+endlocal & exit /b 0
+
+:check_certificate
+setlocal
+set "CERTFILE=%~1"
+set "ALIAS=%~n1"
+
+keytool -list -keystore "%NEWTRUSTSTORE%" -storepass "%STOREPASS%" -alias "%ALIAS%" >nul 2>nul
+if not "%ERRORLEVEL%"=="0" (
+    echo Importing "%ALIAS%" from "%CERTFILE%"
+    keytool -importcert -keystore "%NEWTRUSTSTORE%" -noprompt -storepass "%STOREPASS%" -alias "%ALIAS%" -file "%CERTFILE%" >nul
+    if errorlevel 1 (
+        echo Error: failed to import "%ALIAS%" from "%CERTFILE%"
+        endlocal & exit /b 1
+    )
+) else (
+    echo Certificate with alias "%ALIAS%" already exists. Skipping.
+)
+
+endlocal & exit /b 0
\ No newline at end of file
diff --git a/include/osh-addons b/include/osh-addons
index 8b0aabd..5d2215d 160000
--- a/include/osh-addons
+++ b/include/osh-addons
@@ -1 +1 @@
-Subproject commit 8b0aabd5a74aa375a5424dde516897e67abb82d7
+Subproject commit 5d2215d2668f75d7e3b730711ddccd0edfdc35bc
diff --git a/include/osh-oakridge-modules b/include/osh-oakridge-modules
index aaefbb7..54fe70b 160000
--- a/include/osh-oakridge-modules
+++ b/include/osh-oakridge-modules
@@ -1 +1 @@
-Subproject commit aaefbb79cd651265f04ee2337428c8168c24d256
+Subproject commit 54fe70bf4e39e68950ec9bdc56fc33d48fceaf91
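
Review note: the `.env` parsing that `load_env` performs in `dist/scripts/standard/launch.sh` can be exercised on its own. The sketch below is a minimal standalone version, not the shipped script. The sample file contents are hypothetical, and it differs from the diff in one small way: it strips a trailing carriage return from the whole line before matching, so blank CRLF lines are skipped as well.

```bash
#!/usr/bin/env bash
# Standalone sketch of the load_env approach from launch.sh.
# Assumption: the sample values below are made up for illustration only.
set -euo pipefail

load_env() {
    local env_file="$1" line name value
    # "|| [ -n "$line" ]" keeps a final line that has no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        line="${line%$'\r'}"                      # tolerate CRLF files edited on Windows
        case "$line" in
            ""|"#"*) continue ;;                  # skip blank lines and comments
            export\ *) line="${line#export }" ;;  # accept "export NAME=value"
        esac
        name="${line%%=*}"                        # text before the first "="
        value="${line#*=}"                        # text after the first "="
        export "${name}=${value}"
    done < "$env_file"
}

sample="$(mktemp)"
printf '# comment\r\nexport SYSTEM_PROFILE=16GB\r\nDB_PORT=5432\r\nDB_PASSWORD=a=b' > "$sample"
load_env "$sample"
rm -f "$sample"

echo "$SYSTEM_PROFILE $DB_PORT $DB_PASSWORD"   # prints: 16GB 5432 a=b
```

Splitting on the first `=` only is what lets a value such as `a=b` survive intact. Values are taken literally, so surrounding quotes in the `.env` file would become part of the value.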