Golddigger is a C++17 machine-learning project for training and running short-horizon XAUUSD price predictors with dlib.
It currently supports:
- offline model training from candle CSV files
- hyperparameter tuning for SVR (C, epsilon-insensitivity, gamma)
- daily patching of rolling Dukascopy candle CSV files
- live prediction from Dukascopy CLI candle data
- live prediction from Alpha Vantage FX_INTRADAY
The project builds four binaries in the project root:
- trainer
- predictor
- tuner.bin
- patcher.bin
Golddigger builds an SVR model on top of engineered candle features such as:
- recent log returns
- RSI
- ADX
- relative ATR
- relative SMA20 / SMA50
- SMA spread
- cyclical hour-of-day and day-of-week features
The regression target is a signed best-horizon move measured from the current candle close:
- m15: best move reached within the next 6 candles
- h1: best move reached within the next 4 candles
- d1: best move reached within the next 4 candles
Features are still derived from candle closes. Labels use the future horizon's highest high and lowest low, then keep whichever log-return has the larger absolute magnitude. This makes the model learn entry points that lead into the strongest subsequent expansion rather than only the very next candle close.
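The label rule above can be sketched as follows. This is a minimal illustration, not the project's actual code: the Candle struct, the function name bestHorizonMove, and the tie-break in favor of the upward move are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical candle record; field names are illustrative only.
struct Candle {
    double open;
    double high;
    double low;
    double close;
};

// Signed best-horizon move for the candle at `index`, looking `horizon` candles ahead.
// Returns std::nullopt when fewer than `horizon` future candles exist.
std::optional<double> bestHorizonMove(const std::vector<Candle>& candles,
                                      std::size_t index, std::size_t horizon) {
    assert(horizon > 0);
    if (index >= candles.size() || candles.size() - index - 1 < horizon) {
        return std::nullopt;
    }
    const double entry = candles[index].close;
    assert(entry > 0.0);

    // Highest high and lowest low over the future window.
    double highest = candles[index + 1].high;
    double lowest = candles[index + 1].low;
    for (std::size_t k = 2; k <= horizon; ++k) {
        highest = std::max(highest, candles[index + k].high);
        lowest = std::min(lowest, candles[index + k].low);
    }

    // Keep whichever log-return has the larger absolute magnitude.
    // Assumption: ties go to the upward move.
    const double up = std::log(highest / entry);
    const double down = std::log(lowest / entry);
    return std::fabs(up) >= std::fabs(down) ? up : down;
}
```

With a horizon of 6 on m15 data, a candle is labeled with the larger of the two excursions over the following 90 minutes, keeping its sign.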
The training flow is:
- tune C, epsilon-insensitivity, and gamma with tuner.bin
- save the tuned parameters to Models/tuner_<timeframe>.dat
- run trainer to build a model
- run predictor to fetch live candles and emit predictions
trainer automatically loads the matching tuner_*.dat file when one exists for the training dataset timeframe.
Build/runtime dependencies used by the current codebase:
- CMake
- a C++17 compiler
- dlib
- nlohmann-json
- libcurl
- Node.js and npx if you want to use the Dukascopy provider
- an Alpha Vantage API key if you want to use the Alpha Vantage provider
This project also depends on local code from your HotBits repo:
- cBaseWorker_V2.h
- cHTTPClient.cpp
- cHTTPClient.h
By default, CMake expects those files under:
../Repository/Software/ThreadComponent
If your checkout lives somewhere else, pass the path explicitly:
cmake -S . -B build -DTHREAD_COMPONENT_DIR=/path/to/Repository/Software/ThreadComponent
On macOS, the current project configuration already looks for Homebrew installs of nlohmann-json. A typical setup is:
brew install nlohmann-json dlib curl
Configure and build:
cmake -S . -B build
cmake --build build
Executables are written to the project root:
./trainer
./predictor
./tuner.bin
./patcher.bin
The repo currently includes a sample training file:
Data/xauusd-m15-bid-2024-01-01-2026-03-11.csv
The rolling live-data files patched by the updater use the current naming convention:
Data/xauusd-m15-bid.csv
Data/xauusd-h1-bid.csv
Data/xauusd-d1-bid.csv
CSV format:
timestamp,open,high,low,close,volume
Typical output files:
- tuned hyperparameters: Models/tuner_m15.dat, Models/tuner_h1.dat, Models/tuner_d1.dat
- trained models: Models/gold_digger_m15.dat, Models/gold_digger_h1.dat, etc.
If you pass custom output paths, those are used instead.
tuner.bin tunes SVR hyperparameters using dlib::find_max_global and dlib::cross_validate_regression_trainer.
Default behavior:
- uses Data/xauusd-m15-bid-2024-01-01-2026-03-11.csv
- writes Models/tuner_m15.dat
- uses 50 max optimizer calls
- tunes on up to 5000 evenly spaced samples using 3 folds by default
- derives the epsilon-insensitivity search range automatically from the label distribution
- derives the gamma search range automatically from normalized sample distances
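One way to pick an evenly spaced tuning subset is shown below. It is a sketch under assumptions: the function name evenlySpacedIndices and the integer-stride rule are illustrative, and the tuner's actual selection may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pick at most `maxSamples` indices spread evenly across [0, total).
// If the dataset is already small enough, every index is kept.
std::vector<std::size_t> evenlySpacedIndices(std::size_t total, std::size_t maxSamples) {
    assert(maxSamples > 0);
    std::vector<std::size_t> picked;
    if (total <= maxSamples) {
        picked.reserve(total);
        for (std::size_t i = 0; i < total; ++i) {
            picked.push_back(i);
        }
        return picked;
    }
    picked.reserve(maxSamples);
    for (std::size_t i = 0; i < maxSamples; ++i) {
        // Integer stride: index i maps to floor(i * total / maxSamples),
        // which is strictly increasing because total > maxSamples.
        picked.push_back(i * total / maxSamples);
    }
    return picked;
}
```

Sampling this way keeps the subset spread across the whole date range instead of concentrating on the oldest or newest candles.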
Usage:
./tuner.bin
./tuner.bin --max-calls 25 Data/xauusd-h1.csv
./tuner.bin --max-calls 50 --progress-seconds 60 Data/xauusd-m15-bid.csv
./tuner.bin --max-calls 25 --tuning-samples 8000 --folds 4 Data/xauusd-h1.csv
./tuner.bin --epsilon-range auto Data/xauusd-m15-bid.csv
./tuner.bin --epsilon-range 0.00005:0.003 Data/xauusd-m15-bid.csv
./tuner.bin --gamma-range auto Data/xauusd-m15-bid.csv
./tuner.bin --gamma-range 0.0001:3 Data/xauusd-m15-bid.csv
./tuner.bin Data/xauusd-h1.csv=Models/tuner_h1.dat
./tuner.bin Data/xauusd-m15.csv=Models/tuner_m15.dat Data/xauusd-h1.csv=Models/tuner_h1.dat
Useful flags:
- --max-calls N
- --progress-seconds N
- --tuning-samples N
- --folds N
- --epsilon-range auto|MIN:MAX
- --gamma-range auto|MIN:MAX
Argument format:
- data.csv: result path is inferred as Models/tuner_<timeframe>.dat
- data.csv=output.dat: uses the explicit output file
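The two argument forms could be handled along these lines. This is only a sketch: TunerJob, guessTimeframe, and parseJobArgument are hypothetical names, and recognizing the timeframe from a "-m15", "-h1", or "-d1" token in the file name is an assumption about how the inference works.

```cpp
#include <cassert>
#include <initializer_list>
#include <optional>
#include <string>

// Hypothetical result type: one dataset and the file its tuning result goes to.
struct TunerJob {
    std::string dataPath;
    std::string outputPath;
};

// Assumption: the timeframe appears in the file name as "-m15", "-h1" or "-d1".
std::string guessTimeframe(const std::string& dataPath) {
    for (const char* timeframe : {"m15", "h1", "d1"}) {
        if (dataPath.find(std::string("-") + timeframe) != std::string::npos) {
            return timeframe;
        }
    }
    return "";
}

// Accepts "data.csv" (output inferred) or "data.csv=output.dat" (output explicit).
std::optional<TunerJob> parseJobArgument(const std::string& argument) {
    const auto equals = argument.find('=');
    if (equals != std::string::npos) {
        TunerJob job{argument.substr(0, equals), argument.substr(equals + 1)};
        if (job.dataPath.empty() || job.outputPath.empty()) {
            return std::nullopt;
        }
        return job;
    }
    const std::string timeframe = guessTimeframe(argument);
    if (timeframe.empty()) {
        return std::nullopt;  // no explicit output and no recognizable timeframe
    }
    return TunerJob{argument, "Models/tuner_" + timeframe + ".dat"};
}
```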
Notes:
- tuning jobs run on worker threads, so progress can be printed while cross-validation is still running
- the tuner now keeps solver tolerance fixed and tunes epsilon as the SVR epsilon-insensitivity parameter
- by default, tuning uses an evenly spaced subset of the generated samples so large m15 datasets do not spend hours inside the first cross-validation call
- by default, the tuner derives a sensible epsilon-insensitivity search range from the dataset's return distribution
- by default, the tuner derives a sensible gamma search range from normalized sample distances
- --epsilon-range MIN:MAX lets you override that auto-derived range manually
- --gamma-range MIN:MAX lets you override the auto-derived gamma range manually
- Ctrl+C requests a graceful stop and waits for the current evaluation boundary
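A value of the form auto|MIN:MAX could be parsed roughly like this. SearchRange and parseSearchRange are hypothetical names, and rejecting non-positive or inverted bounds is an assumption, not documented tuner behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <optional>
#include <string>

// Hypothetical parsed form of "--epsilon-range" / "--gamma-range".
struct SearchRange {
    bool automatic = true;  // true means "derive the range from the data"
    double min = 0.0;
    double max = 0.0;
};

// Parses "auto" or "MIN:MAX". Returns std::nullopt for malformed input.
std::optional<SearchRange> parseSearchRange(const std::string& text) {
    if (text == "auto") {
        return SearchRange{};
    }
    const auto colon = text.find(':');
    if (colon == std::string::npos) {
        return std::nullopt;
    }
    const std::string minText = text.substr(0, colon);
    const std::string maxText = text.substr(colon + 1);
    try {
        std::size_t usedMin = 0;
        std::size_t usedMax = 0;
        const double lower = std::stod(minText, &usedMin);
        const double upper = std::stod(maxText, &usedMax);
        // Reject trailing garbage such as "0.1x".
        if (usedMin != minText.size() || usedMax != maxText.size()) {
            return std::nullopt;
        }
        // Assumption: both bounds must be positive and ordered.
        if (!(lower > 0.0) || !(lower < upper)) {
            return std::nullopt;
        }
        return SearchRange{false, lower, upper};
    } catch (const std::exception&) {
        return std::nullopt;  // std::stod found no number or the value overflowed
    }
}
```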
trainer trains a model from one or more CSV datasets.
Default behavior:
- uses Data/xauusd-m15-bid-2024-01-01-2026-03-11.csv
- writes Models/gold_digger_m15.dat
Usage:
./trainer
./trainer Data/xauusd-h1.csv
./trainer Data/xauusd-h1.csv=Models/gold_digger_h1.dat
./trainer Data/xauusd-m15.csv=Models/gold_digger_m15.dat Data/xauusd-h1.csv=Models/gold_digger_h1.dat
Notes:
- training jobs are launched as worker threads
- duplicate model output paths are rejected
- if Models/tuner_<timeframe>.dat exists, the trainer loads it automatically
- if no tuning file exists, trainer falls back to built-in defaults
- Ctrl+C requests a graceful stop and waits for worker threads to exit
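Rejecting duplicate model output paths before any worker starts can be as simple as the check below. The helper name firstDuplicateOutput is hypothetical, and comparing paths as plain strings (no normalization of ./ prefixes or symlinks) is a simplifying assumption.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the first output path that appears more than once, if any.
// Paths are compared as plain strings in this sketch.
std::optional<std::string> firstDuplicateOutput(const std::vector<std::string>& outputPaths) {
    std::unordered_set<std::string> seen;
    for (const std::string& path : outputPaths) {
        // insert() reports false in .second when the path was already present.
        if (!seen.insert(path).second) {
            return path;
        }
    }
    return std::nullopt;
}
```

Running the check up front means two jobs never write to the same model file concurrently.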
predictor loads a trained model, fetches live candles from a provider, and keeps track of the next missing candle timestamp. After each newly completed candle becomes available, it predicts the signed best move over the configured future horizon.
Supported providers:
- dukascopy
- alphavantage
General usage:
./predictor --model Models/gold_digger_m15.dat --instrument xauusd --timeframe m15
Useful flags:
- --provider dukascopy|alphavantage
- --model PATH
- --instrument SYMBOL
- --timeframe m15|h1|d1
- --poll-seconds N
- --availability-delay-seconds N
- --max-predictions N
Provider-specific flags:
- Dukascopy: --dukascopy-command "npx dukascopy-node"
- Alpha Vantage: --alpha-vantage-api-key KEY, --alpha-vantage-base-url URL
The Dukascopy provider shells out to:
npx dukascopy-node
Example:
./predictor \
--provider dukascopy \
--model Models/gold_digger_m15.dat \
--instrument xauusd \
--timeframe m15 \
--dukascopy-command "npx dukascopy-node"
Optional environment override:
export DUKASCOPY_NODE_COMMAND="npx dukascopy-node"
Alpha Vantage example:
export ALPHAVANTAGE_API_KEY=your_api_key
./predictor \
--provider alphavantage \
--model Models/gold_digger_m15.dat \
--instrument xauusd \
--timeframe m15
You can also pass the key explicitly:
./predictor \
--provider alphavantage \
--alpha-vantage-api-key your_api_key \
--model Models/gold_digger_m15.dat \
--instrument xauusd \
--timeframe m15
Current Alpha Vantage limitations:
- implemented against FX_INTRADAY
- supports m15 and h1
- d1 is not supported through the current Alpha Vantage provider
- volume is not provided by the API, so predictor uses 0.0 for volume on that provider
The predictor intentionally avoids requesting the still-forming candle.
It uses:
- candle timeframe
- an availability delay
- polling
So if a provider usually lags one or more minutes after candle close, use:
./predictor --availability-delay-seconds 120
Default availability delay is 60 seconds.
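The timing rule can be illustrated with a small calculation. This sketch assumes that candle timestamps mark the candle open and that all times are UTC seconds since the Unix epoch; the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Earliest UTC time (in seconds) at which the candle opening at `candleOpen`
// is expected to be complete and published by the provider.
std::int64_t earliestFetchTime(std::int64_t candleOpen,
                               std::int64_t timeframeSeconds,
                               std::int64_t availabilityDelaySeconds) {
    assert(timeframeSeconds > 0 && availabilityDelaySeconds >= 0);
    // The candle closes one timeframe after it opens; the delay covers provider lag.
    return candleOpen + timeframeSeconds + availabilityDelaySeconds;
}

// Seconds left to wait before requesting that candle; 0 means "fetch now".
std::int64_t secondsUntilFetch(std::int64_t now,
                               std::int64_t candleOpen,
                               std::int64_t timeframeSeconds,
                               std::int64_t availabilityDelaySeconds) {
    const std::int64_t ready =
        earliestFetchTime(candleOpen, timeframeSeconds, availabilityDelaySeconds);
    return std::max<std::int64_t>(0, ready - now);
}
```

For m15 (900 seconds) with the default 60-second delay, a candle that opens at 12:00:00 UTC would first be requested at 12:16:00 UTC.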
patcher.bin fetches one or more completed UTC trading days of Dukascopy candles and merges them into the rolling CSV files in Data/.
Default behavior:
- uses Dukascopy CLI via npx dukascopy-node
- patches m15, h1, and d1
- targets the previous UTC calendar day
- writes into Data/xauusd-m15-bid.csv, Data/xauusd-h1-bid.csv, and Data/xauusd-d1-bid.csv
Usage:
./patcher.bin
./patcher.bin --date 2026-03-31
./patcher.bin --from 2026-04-01 --to 2026-04-06
./patcher.bin --from "2026-03-31 00:00" --to "2026-04-07 23:45" --timeframes m15
./patcher.bin --timeframes m15,h1
./patcher.bin --dukascopy-command "npx dukascopy-node" --data-dir Data
Useful flags:
- --date YYYY-MM-DD
- --from YYYY-MM-DD[ HH:MM[:SS]]
- --to YYYY-MM-DD[ HH:MM[:SS]]
- --instrument SYMBOL
- --data-dir DIR
- --timeframes m15,h1,d1
- --dukascopy-command CMD
Notes:
- requests are formatted in UTC/GMT for Dukascopy
- duplicate timestamps are avoided when patching
- if a timestamp already exists in a CSV, the downloaded candle replaces that row
- --from and --to patch an inclusive UTC date range with one Dukascopy request per timeframe across the whole window, which is useful for catching up missed days without tripping over weekends or holidays
- --from and --to also accept exact UTC datetimes; when a time is included, --to is treated as the last candle timestamp you want included
- date flags accept both YYYY-MM-DD / DD-MM-YYYY and YYYY-MM-DD HH:MM[:SS] / DD-MM-YYYY HH:MM[:SS]
- this is designed as a one-shot updater, so you can schedule it externally for 01:00 GMT
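The replace-on-duplicate merge described in the notes can be sketched with an ordered map keyed by timestamp. CandleRow and mergeCandles are hypothetical names, and storing the timestamp as a 64-bit integer is an assumption about the CSV encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical in-memory form of one CSV row.
struct CandleRow {
    std::int64_t timestamp;  // assumed numeric and sortable
    double open;
    double high;
    double low;
    double close;
    double volume;
};

// Merges downloaded candles into the existing rows.
// A downloaded candle replaces any existing row with the same timestamp,
// and the result is sorted by timestamp with no duplicates.
std::vector<CandleRow> mergeCandles(const std::vector<CandleRow>& existing,
                                    const std::vector<CandleRow>& downloaded) {
    std::map<std::int64_t, CandleRow> byTimestamp;
    for (const CandleRow& row : existing) {
        byTimestamp.insert_or_assign(row.timestamp, row);
    }
    for (const CandleRow& row : downloaded) {
        byTimestamp.insert_or_assign(row.timestamp, row);  // downloaded data wins
    }
    std::vector<CandleRow> merged;
    merged.reserve(byTimestamp.size());
    for (const auto& entry : byTimestamp) {
        merged.push_back(entry.second);
    }
    return merged;
}
```

Because the merge is idempotent, re-running the patcher for a day that was already patched leaves the file unchanged apart from refreshed rows.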
./tuner.bin --max-calls 50 Data/xauusd-m15-bid-2024-01-01-2026-03-11.csv
./trainer Data/xauusd-m15-bid-2024-01-01-2026-03-11.csv=Models/gold_digger_m15.dat
./predictor \
--provider dukascopy \
--model Models/gold_digger_m15.dat \
--instrument xauusd \
--timeframe m15
./patcher.bin
.
├── CMakeLists.txt
├── Data/
├── DataUpdate/
├── Indicators/
├── MarketData/
├── Models/
├── Prediction/
├── Training/
├── Utils/
├── patcher.cpp
├── predictor.cpp
├── trainer.cpp
└── tuner.cpp
- The predictor prints an action signal based on predicted price change vs estimated spread.
- The project uses the current working directory for relative paths like Data/... and Models/....
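As an illustration of comparing a predicted move against the spread, a signal rule might look like the sketch below. Everything here is an assumption: the Action names, the decideAction function, and the rule of acting only when the predicted price change exceeds the estimated spread are not taken from the predictor's code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical action labels.
enum class Action { Buy, Sell, Hold };

// Converts a predicted log-return into a price change and compares it with the spread.
// `lastClose` and `estimatedSpread` are in price units.
Action decideAction(double lastClose, double predictedLogReturn, double estimatedSpread) {
    assert(lastClose > 0.0 && estimatedSpread >= 0.0);
    // expm1(r) = exp(r) - 1, accurate for the small returns typical of short horizons.
    const double predictedChange = lastClose * std::expm1(predictedLogReturn);
    if (predictedChange > estimatedSpread) {
        return Action::Buy;
    }
    if (predictedChange < -estimatedSpread) {
        return Action::Sell;
    }
    return Action::Hold;  // the expected move does not cover the spread
}
```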
This project is licensed under the GNU General Public License v3.0.
See LICENSE for the full text.
Copyright (C) 2026 Hotbits
Third-party dependencies such as dlib and nlohmann-json remain under their own respective licenses.