GCS split uploads fail with "invalid URL, scheme is not http" since opendal 0.56 (reqwest built without TLS) pr #6477

@palkakrzysiek

Describe the bug

Indexing to a GCS (gs://) backend fails: the indexer cannot upload splits and every
request to https://storage.googleapis.com errors with:

error sending request for url (https://storage.googleapis.com/<bucket>/...):
client error (Connect): invalid URL, scheme is not http

Root cause is a dependency/feature regression, not configuration. quickwit/Cargo.toml declares:

opendal = { version = "0.56", default-features = false }   # line 365

opendal 0.56's default feature set includes reqwest-rustls-tls
(opendal-core/reqwest-rustls-tls → reqwest/rustls), so default-features = false removes
opendal's only TLS backend. This used to be harmless: opendal 0.55 depended on reqwest 0.12, the
same version line as the workspace reqwest (quickwit/Cargo.toml:231, built with rustls-tls), so Cargo
feature unification gave opendal a TLS-capable client anyway. Since #6432 bumped opendal to 0.56,
opendal depends on reqwest 0.13, a semver-incompatible version that does not unify with reqwest 0.12.
opendal's reqwest 0.13 is therefore compiled with no TLS backend (only the stream feature),
and a TLS-less reqwest uses hyper-util's plain HttpConnector, which rejects any https:// URL with
invalid URL, scheme is not http (before any network I/O).
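The unification argument can be checked with a toy model. The sketch below is not Cargo's resolver: it only groups feature requests by semver-compatibility bucket and unions them. The Request type, the helper names, and the assumption that opendal 0.55 also requested only the stream feature are illustrative and not taken from either codebase.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// (major, minor) of a reqwest requirement; patch is irrelevant here.
type Version = (u64, u64);

/// One dependency edge onto reqwest: which version and which features it asks for.
struct Request {
    version: Version,
    features: &'static [&'static str],
}

/// Semver-compatibility bucket. For 0.x crates the minor number is the
/// breaking component, so 0.12 and 0.13 land in different buckets.
fn compat_key(version: Version) -> Version {
    if version.0 == 0 { version } else { (version.0, 0) }
}

/// Union the requested features per bucket (a simplified stand-in for
/// Cargo's feature unification).
fn unify(requests: &[Request]) -> BTreeMap<Version, BTreeSet<&'static str>> {
    let mut out: BTreeMap<Version, BTreeSet<&'static str>> = BTreeMap::new();
    for req in requests {
        out.entry(compat_key(req.version))
            .or_default()
            .extend(req.features.iter().copied());
    }
    out
}

/// True if any feature in the set names a TLS backend.
fn has_tls(features: &BTreeSet<&'static str>) -> bool {
    features
        .iter()
        .any(|f| f.contains("rustls") || f.contains("native-tls"))
}

/// The two edges onto reqwest: the workspace's own, and opendal's.
fn graph(opendal_reqwest: Version) -> Vec<Request> {
    vec![
        // Workspace reqwest, as declared in the reproduction's Cargo.toml.
        Request { version: (0, 12), features: &["json", "rustls-tls"] },
        // opendal with default-features = false (assumed: stream only).
        Request { version: opendal_reqwest, features: &["stream"] },
    ]
}

fn main() {
    for opendal_reqwest in [(0, 12), (0, 13)] {
        let unified = unify(&graph(opendal_reqwest));
        let tls = has_tls(&unified[&compat_key(opendal_reqwest)]);
        println!(
            "opendal on reqwest {}.{}: TLS available = {}",
            opendal_reqwest.0, opendal_reqwest.1, tls
        );
    }
}
```

In this model the 0.12 case shares one bucket with the workspace and inherits rustls-tls, while the 0.13 case gets its own bucket containing only stream.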

Only the GCS backend is affected (quickwit-storage/Cargo.toml:84:
gcs = ["dep:opendal", "opendal/services-gcs"] enables no TLS feature). S3 uses the AWS SDK; Azure's
azure_storage_blobs sets enable_reqwest_rustls.

Regression introduced by #6432 "Upgrade Rust and dependencies" (opendal 0.55 → 0.56).

Steps to reproduce (if applicable)

  1. Build quickwit from main (or run quickwit/quickwit:edge) — built with the default
    opendal = { default-features = false } and the gcs feature.
  2. Create an index whose storage is GCS, i.e. index_uri: gs://<bucket>/<index> (or a node
    default_index_root_uri on gs://).
  3. Ingest a few documents and wait for a commit so the indexer publishes a split.
  4. The split upload fails; the indexer logs ... Connect: invalid URL, scheme is not http.

Minimal standalone reproduction (no cluster, no credentials — pins the root cause to the
dependency graph). It mirrors quickwit's exact declarations and goes red on the buggy graph, green
once opendal's TLS feature is restored:

# Cargo.toml
[package]
name = "tls-repro"
version = "0.0.0"
edition = "2021"

[dependencies]
# Mirror quickwit's workspace declarations:
reqwest = { version = "0.12", default-features = false, features = ["json", "rustls-tls"] }
opendal = { version = "0.56", default-features = false, features = ["services-http"] }
tokio   = { version = "1", features = ["macros", "rt-multi-thread"] }

// tests/repro.rs
use opendal::{services::Http, Operator};

#[tokio::test]
async fn opendal_can_make_https_requests() {
    let op = Operator::new(Http::default().endpoint("https://storage.googleapis.com"))
        .unwrap()
        .finish();
    if let Err(err) = op.read("robots.txt").await {
        let msg = format!("{err:?}");
        assert!(
            !msg.contains("invalid URL, scheme is not http"),
            "opendal's reqwest has no TLS backend: {msg}"
        );
    }
}

Running cargo test fails with client error (Connect): invalid URL, scheme is not http.
Running cargo tree -e features -i reqwest@0.13 shows that opendal's reqwest carries only the
stream feature, with no rustls or native-tls.
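The cargo tree check could be turned into a guard, for example in CI. The sketch below scans text for lines of the form `reqwest feature "name"` and reports whether any TLS feature is present. The two sample strings are hand-written approximations of the cargo tree layout, not captured output, and the version numbers and helper names in them are hypothetical.

```rust
/// Extract feature names from lines shaped like `... <krate> feature "name"`.
fn enabled_features<'a>(tree_output: &'a str, krate: &str) -> Vec<&'a str> {
    let marker = format!("{krate} feature \"");
    tree_output
        .lines()
        .filter_map(|line| {
            let start = line.find(&marker)? + marker.len();
            let rest = &line[start..];
            let end = rest.find('"')?;
            Some(&rest[..end])
        })
        .collect()
}

/// True if any listed feature names a TLS backend.
fn has_tls_feature(features: &[&str]) -> bool {
    features
        .iter()
        .any(|f| f.contains("rustls") || f.contains("native-tls"))
}

// Illustrative samples only; real output has more lines and real versions.
const BROKEN: &str = r#"reqwest v0.13.0
├── reqwest feature "stream"
│   └── opendal-core v0.56.0
"#;

const FIXED: &str = r#"reqwest v0.13.0
├── reqwest feature "rustls"
│   └── opendal-core v0.56.0
├── reqwest feature "stream"
│   └── opendal-core v0.56.0
"#;

fn main() {
    for (label, sample) in [("broken", BROKEN), ("fixed", FIXED)] {
        let features = enabled_features(sample, "reqwest");
        println!("{label}: {features:?}, TLS = {}", has_tls_feature(&features));
    }
}
```

In practice the input would come from piping the real cargo tree output into such a check instead of the constants used here.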

Expected behavior

GCS split uploads succeed over HTTPS (as they did before the opendal 0.55 → 0.56 bump).

Restoring opendal's TLS backend fixes it:

opendal = { version = "0.56", default-features = false, features = ["reqwest-rustls-tls"] }

After this, opendal's reqwest 0.13 pulls in hyper-rustls/rustls/tokio-rustls, the standalone
test makes a real HTTPS round-trip (gets a normal 404 NoSuchBucket from GCS instead of the scheme
error), and split uploads work again.
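The before/after difference can be pictured with a toy model of the connector's scheme check. This is a simplified sketch, not hyper-util's code: the Outcome type and connect function are hypothetical, and only the "invalid URL, scheme is not http" message is taken from the error reported above.

```rust
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Rejected locally, before any socket is opened.
    SchemeRejected(&'static str),
    /// Handed to the network layer; the server decides the response.
    Dispatched,
}

/// Model of a connector that only accepts https when a TLS backend is compiled in.
fn connect(url: &str, tls_enabled: bool) -> Outcome {
    let scheme = url.split("://").next().unwrap_or("");
    match (scheme, tls_enabled) {
        ("http", _) | ("https", true) => Outcome::Dispatched,
        ("https", false) => Outcome::SchemeRejected("invalid URL, scheme is not http"),
        // Hypothetical message for anything else; not modeled on a real library.
        _ => Outcome::SchemeRejected("unsupported scheme"),
    }
}

fn main() {
    let url = "https://storage.googleapis.com/my-bucket/split";
    // Without TLS: fails before any network I/O.
    println!("no TLS:   {:?}", connect(url, false));
    // With reqwest-rustls-tls restored: the request reaches the server.
    println!("with TLS: {:?}", connect(url, true));
}
```

The point of the model is that the failure is decided from the URL scheme alone, which is why the standalone test needs neither credentials nor a reachable bucket.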

Configuration:

  1. Output of quickwit --version

    Quickwit 0.8.0
    

    Reproduced from source on main and on the published quickwit/quickwit:edge image (built from
    every main merge), i.e. any build after #6432 "Upgrade Rust and dependencies".

  2. The index_config.yaml

    version: 0.8
    index_id: example-gcs-index
    # The trigger is the gs:// (GCS) storage backend; the doc mapping is irrelevant.
    index_uri: gs://my-bucket/example-gcs-index
    doc_mapping:
      field_mappings:
        - name: timestamp
          type: datetime
          input_formats: [unix_timestamp]
          fast: true
        - name: message
          type: text
      timestamp_field: timestamp
    indexing_settings:
      commit_timeout_secs: 30
    search_settings:
      default_search_fields: [message]

Labels: bug