3 changes: 3 additions & 0 deletions changelog.d/9766_data_dir_cli_env.feature.md
@@ -0,0 +1,3 @@
Added a `--data-dir` command-line flag (and corresponding `VECTOR_DATA_DIR` environment variable) to override the `data_dir` global configuration option. When set, it takes precedence over any `data_dir` value in the configuration file. This is useful for keeping deployment-specific paths out of the configuration file, for example when validating a configuration in a CI environment where the configured `data_dir` may not exist.

authors: xfocus3
10 changes: 10 additions & 0 deletions src/app.rs
@@ -85,6 +85,7 @@ impl ApplicationConfig {
&config_paths,
watcher_conf,
opts.require_healthy,
opts.data_dir.clone(),
opts.allow_empty_config,
!opts.disable_env_var_interpolation,
graceful_shutdown_duration,
@@ -555,10 +556,12 @@ pub fn build_runtime(threads: Option<usize>, thread_name: &str) -> Result<Runtim
    Ok(rt_builder.build().expect("Unable to create async runtime"))
}

#[allow(clippy::too_many_arguments)]
pub async fn load_configs(
    config_paths: &[ConfigPath],
    watcher_conf: Option<config::watcher::WatcherConfig>,
    require_healthy: Option<bool>,
    data_dir: Option<PathBuf>,
    allow_empty_config: bool,
    interpolate_env: bool,
    graceful_shutdown_duration: Option<Duration>,
@@ -648,6 +651,13 @@ pub async fn load_configs(
        info!("Health checks are disabled.");
    }
    config.healthchecks.set_require_healthy(require_healthy);
    if let Some(data_dir) = data_dir {
        debug!(
            message = "Overriding data_dir from command line.",
            ?data_dir
        );
        config.global.data_dir = Some(data_dir);
Comment on lines +654 to +659

P2: Preserve `data_dir` override during reloads

This override is applied only to the initial config returned by `load_configs`. Reloads in `handle_signal` call `config::load_from_paths_with_provider_and_secrets` directly, and `TopologyController::reload` rejects changes to global options. With `--watch-config` or SIGHUP and a CLI/env `data_dir` that differs from the file (or the default), the reloaded config loses this assignment and differs from the running global options, so the reload fails with `GlobalOptionsChanged` instead of continuing to use the requested deployment-level data directory.
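
One possible shape for a fix, as a minimal sketch rather than the actual Vector code: capture the CLI/env overrides once and re-apply them to every freshly loaded config, including on reload, before the global-options comparison runs. `Config`, `GlobalOptions`, `CliOverrides`, `load_from_file`, and `reload` below are simplified, hypothetical stand-ins, not Vector's real types or functions.

```rust
use std::path::PathBuf;

// Hypothetical, simplified stand-ins for Vector's config types.
#[derive(Debug, Clone, PartialEq, Default)]
struct GlobalOptions {
    data_dir: Option<PathBuf>,
}

#[derive(Debug, Clone, PartialEq, Default)]
struct Config {
    global: GlobalOptions,
}

/// Deployment-level overrides captured once at startup (assumed name).
#[derive(Debug, Clone, Default)]
struct CliOverrides {
    data_dir: Option<PathBuf>,
}

impl CliOverrides {
    /// Applies the overrides to a freshly loaded config.
    fn apply(&self, config: &mut Config) {
        if let Some(data_dir) = &self.data_dir {
            config.global.data_dir = Some(data_dir.clone());
        }
    }
}

/// Hypothetical loader that pretends the file sets `data_dir`.
fn load_from_file(file_data_dir: &str) -> Config {
    Config {
        global: GlobalOptions {
            data_dir: Some(PathBuf::from(file_data_dir)),
        },
    }
}

/// Mimics a reload that rejects changes to global options, but re-applies
/// the overrides first so the comparison sees the effective values.
fn reload(
    running: &Config,
    mut reloaded: Config,
    overrides: &CliOverrides,
) -> Result<Config, String> {
    overrides.apply(&mut reloaded);
    if reloaded.global != running.global {
        return Err("global options changed".to_string());
    }
    Ok(reloaded)
}
```

With this shape, the initial load and the reload path share one `apply` call, so the two cannot drift apart when further overrides are added.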

Comment on lines +654 to +659

P2: Apply `data_dir` before merging configs

The override happens only after `load_from_paths_with_provider_and_secrets` has fully built the config, but `GlobalOptions::merge` rejects two config files that set different non-default `data_dir` values during that build. In a split configuration where `--data-dir`/`VECTOR_DATA_DIR` is supplied specifically to take precedence, startup still fails with `conflicting values for 'data_dir' found` before this assignment runs, so the deployment-level override cannot actually supersede all config-file values.
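
To illustrate the ordering problem, here is a rough sketch under stated assumptions: a simplified `GlobalOptions` whose `merge` rejects conflicting `data_dir` values, plus a hypothetical `merge_all` helper that discards file-level `data_dir` values whenever an override is present, so the conflict check never fires for them. The real `GlobalOptions::merge` handles many more fields and may differ in detail.

```rust
use std::path::PathBuf;

// Hypothetical, simplified model of global options; not Vector's actual type.
#[derive(Debug, Clone, PartialEq, Default)]
struct GlobalOptions {
    data_dir: Option<PathBuf>,
}

impl GlobalOptions {
    /// Merges two option sets, rejecting conflicting `data_dir` values.
    fn merge(&self, other: &GlobalOptions) -> Result<GlobalOptions, String> {
        let data_dir = match (&self.data_dir, &other.data_dir) {
            (Some(a), Some(b)) if a != b => {
                return Err("conflicting values for 'data_dir' found".to_string());
            }
            (Some(a), _) => Some(a.clone()),
            (None, b) => b.clone(),
        };
        Ok(GlobalOptions { data_dir })
    }
}

/// Merges per-file options (assumed helper). When an override is supplied,
/// file-level `data_dir` values are dropped before merging and the override
/// is set on the merged result.
fn merge_all(
    files: &[GlobalOptions],
    data_dir_override: Option<&PathBuf>,
) -> Result<GlobalOptions, String> {
    let mut merged = GlobalOptions::default();
    for file in files {
        let mut file = file.clone();
        if data_dir_override.is_some() {
            file.data_dir = None;
        }
        merged = merged.merge(&file)?;
    }
    if let Some(dir) = data_dir_override {
        merged.data_dir = Some(dir.clone());
    }
    Ok(merged)
}
```

Without an override, two files with different `data_dir` values still fail to merge, which keeps the existing safety check intact for users who do not pass the flag.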

    }
    config.graceful_shutdown_duration = graceful_shutdown_duration;

    Ok(config)
30 changes: 30 additions & 0 deletions src/cli.rs
@@ -125,6 +125,15 @@ pub struct RootOpts {
    #[arg(short, long, env = "VECTOR_REQUIRE_HEALTHY")]
    pub require_healthy: Option<bool>,

    /// The directory used for persisting Vector state data.
    ///
    /// This overrides the `data_dir` global option set in the configuration file. It is
    /// useful for keeping deployment-specific paths out of the configuration file, for
    /// example when validating a configuration in a CI environment where the configured
    /// `data_dir` may not exist.
    #[arg(long, env = "VECTOR_DATA_DIR")]
    pub data_dir: Option<PathBuf>,
Comment on lines +134 to +135

P2: Apply `data_dir` override to `validate`

This option is added only to `RootOpts`, but the `validate` subcommand is executed with its own `validate::Opts`, and `SubCommand::execute` passes only those subcommand opts into `validate::validate`. As a result, `VECTOR_DATA_DIR=/tmp vector validate ...` (and `vector --data-dir /tmp validate ...`) parses the value but never applies it before `validate_config`/`create_tmp_directory`, so the documented CI validation case still uses the config file's `data_dir` and can fail on the nonexistent path.
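
A small sketch of the precedence the validate path would need, assuming the override is threaded through to it: the CLI/env value is consulted first and the configured value is used only as a fallback. `effective_data_dir` and `validate_data_dir` are hypothetical helpers for illustration and do not correspond to functions in `validate.rs`.

```rust
use std::path::{Path, PathBuf};

/// Picks the directory to check: the CLI/env override wins over the
/// value from the configuration file.
fn effective_data_dir<'a>(
    override_dir: Option<&'a Path>,
    configured: Option<&'a Path>,
) -> Option<&'a Path> {
    override_dir.or(configured)
}

/// Hypothetical stand-in for the data-directory check done during validation.
fn validate_data_dir(
    override_dir: Option<&Path>,
    configured: Option<&Path>,
) -> Result<PathBuf, String> {
    match effective_data_dir(override_dir, configured) {
        Some(dir) if dir.is_dir() => Ok(dir.to_path_buf()),
        Some(dir) => Err(format!("data_dir {:?} does not exist", dir)),
        None => Err("no data_dir configured".to_string()),
    }
}
```

In the CI case described above, the configured path is missing but the override points at an existing directory, so the check passes only when the override is actually consulted.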

Comment on lines +134 to +135

P2: Mark `data_dir` as a global arg

Because this is a top-level `RootOpts` argument without `global = true`, clap accepts it only before a subcommand, so the advertised `vector validate --data-dir /tmp vector.yaml` form is rejected as an unexpected argument before any override can happen. If this flag is meant to work with subcommands in the position shown in the PR description, it needs to be global or defined on those subcommands.

    /// Number of threads to use for processing (default is number of available cores)
    #[arg(short, long, env = "VECTOR_THREADS")]
    pub threads: Option<usize>,
@@ -424,3 +433,24 @@ pub fn handle_config_errors(errors: Vec<String>) -> exitcode::ExitCode {

    exitcode::CONFIG
}

#[cfg(test)]
mod tests {
    use std::path::PathBuf;

    use clap::Parser;

    use super::RootOpts;

    #[test]
    fn data_dir_defaults_to_none() {
        let opts = RootOpts::try_parse_from(["vector"]).unwrap();
        assert_eq!(opts.data_dir, None);
    }

    #[test]
    fn data_dir_parsed_from_flag() {
        let opts = RootOpts::try_parse_from(["vector", "--data-dir", "/tmp/vector"]).unwrap();
        assert_eq!(opts.data_dir, Some(PathBuf::from("/tmp/vector")));
    }
}