Commit 1ffda2c

Document unified host support and auto-discovery in README
Completes the port of databricks/databricks-sdk-go#1641 and databricks/databricks-sdk-py#1358 by aligning README with what Go and Python published.

Changes:
- Add "Unified host support" entry + "GCP native authentication" entry to the auth "In this section" TOC.
- List GCP as the third step of the default auth order, and explain that auth methods auto-skip when their required config is absent. Show how to force a method via DatabricksConfig.setAuthType.
- Add a "Unified host support" section with a .databrickscfg example and equivalent Java code (WorkspaceClient + AccountClient sharing one profile, plus overriding workspace_id).
- Extend the native-auth attribute table with workspace_id and discovery_url, mark account_id / workspace_id / discovery_url as auto-discoverable, and note that auto-discovery never overwrites explicit values.

The Go/Python PRs also document a "cloud" config field; Java does not (yet) expose one on DatabricksConfig, so it is omitted here.

Co-authored-by: Isaac
1 parent 3e81d21 commit 1ffda2c

1 file changed

Lines changed: 66 additions & 9 deletions

README.md

@@ -87,8 +87,10 @@ workspace. // press <TAB> for autocompletion
8787
### In this section
8888

8989
- [Default authentication flow](#default-authentication-flow)
90+
- [Unified host support](#unified-host-support)
9091
- [Databricks native authentication](#databricks-native-authentication)
9192
- [Azure native authentication](#azure-native-authentication)
93+
- [Google Cloud Platform native authentication](#google-cloud-platform-native-authentication)
9294
- [Overriding .databrickscfg](#overriding-databrickscfg)
9395
- [Additional authentication configuration options](#additional-authentication-configuration-options)
9496

@@ -98,9 +100,25 @@ If you run the [Databricks Terraform Provider](https://registry.terraform.io/pro
98100

99101
1. [Databricks native authentication](#databricks-native-authentication)
100102
2. [Azure native authentication](#azure-native-authentication)
101-
3. If the SDK is unsuccessful at this point, it returns an authentication error and stops running.
103+
3. [Google Cloud Platform native authentication](#google-cloud-platform-native-authentication)
104+
4. If the SDK is unsuccessful at this point, it returns an authentication error and stops running.
102105

103-
You can instruct the Databricks SDK for Java to use a specific authentication method by instantiating the `DatabricksConfig` class and setting the `auth_type` as described in the following sections.
106+
Each authentication method requires specific configuration attributes (e.g., `token` for PAT auth, `azureClientId` for Azure service principal auth). The SDK automatically detects the cloud provider and skips authentication methods whose required configuration attributes are not present. For example, Azure-specific methods such as `azure-cli` are skipped when connecting to an AWS or GCP workspace, and GCP-specific methods are skipped when connecting to an AWS or Azure workspace.
107+
108+
To force a specific authentication method instead of relying on auto-detection, set the `authType` on `DatabricksConfig`:
109+
110+
```java
111+
import com.databricks.sdk.WorkspaceClient;
112+
import com.databricks.sdk.core.DatabricksConfig;
113+
...
114+
// Force Azure CLI authentication — skip all other methods
115+
DatabricksConfig config = new DatabricksConfig()
116+
.setHost("https://mycompany.databricks.com")
117+
.setAuthType("azure-cli");
118+
WorkspaceClient workspace = new WorkspaceClient(config);
119+
```
120+
121+
This is useful when your environment has credentials for multiple authentication methods and you want to ensure a specific one is used, or when auto-detection selects the wrong method.
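The ordering and skipping behavior described above can be pictured with a small, self-contained model. This is only a sketch and not the SDK's implementation: the class `AuthChainSketch`, the `Method` type, and the required-attribute lists (including `azureResourceId` and `googleCredentials`) are hypothetical placeholders chosen for illustration. It assumes that forcing an auth type restricts the chain to that single method.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative model of an ordered auth chain. Not the SDK's actual code. */
public class AuthChainSketch {

  /** Hypothetical: one auth method plus the attributes it needs to run. */
  static final class Method {
    final String name;
    final List<String> requiredAttributes;

    Method(String name, String... requiredAttributes) {
      this.name = name;
      this.requiredAttributes = Arrays.asList(requiredAttributes);
    }

    /** True only if every required attribute has a non-empty value. */
    boolean isConfigured(Map<String, String> config) {
      for (String attribute : requiredAttributes) {
        String value = config.get(attribute);
        if (value == null || value.isEmpty()) {
          return false;
        }
      }
      return true;
    }
  }

  // Order follows the default flow: Databricks native, then Azure, then GCP.
  // The attribute lists are simplified assumptions, not the real requirements.
  static final List<Method> DEFAULT_ORDER = Arrays.asList(
      new Method("pat", "host", "token"),
      new Method("azure-cli", "host", "azureResourceId"),
      new Method("google-credentials", "host", "googleCredentials"));

  /**
   * Returns the first usable method. When forcedAuthType is non-null, every
   * other method is ignored, so a missing attribute yields an empty result
   * (the point at which a real client would report an authentication error).
   */
  static Optional<String> select(Map<String, String> config, String forcedAuthType) {
    for (Method method : DEFAULT_ORDER) {
      if (forcedAuthType != null && !forcedAuthType.equals(method.name)) {
        continue; // a forced auth type excludes all other methods
      }
      if (method.isConfigured(config)) {
        return Optional.of(method.name);
      }
      // otherwise skip: required attributes are absent
    }
    return Optional.empty();
  }

  public static void main(String[] args) {
    Map<String, String> config = new HashMap<>();
    config.put("host", "https://mycompany.databricks.com");
    config.put("token", "example-token");

    System.out.println(select(config, null));        // auto-detection
    System.out.println(select(config, "azure-cli")); // forced method
  }
}
```

In this model, forcing `azure-cli` while only a `token` is configured produces no match instead of silently falling back to PAT, which is the reason to force a method in the first place.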
104122

105123
For each authentication method, the SDK searches for compatible authentication credentials in the following locations, in the following order. Once the SDK finds a compatible set of credentials that it can use, it stops searching:
106124

@@ -115,6 +133,41 @@ For each authentication method, the SDK searches for compatible authentication c
115133

116134
Depending on the Databricks authentication method, the SDK uses the following information. Presented are the `WorkspaceClient` and `AccountClient` arguments (which have corresponding `.databrickscfg` file fields), their descriptions, and any corresponding environment variables.
117135

136+
### Unified host support
137+
138+
Certain Databricks host types support both account-level and workspace-level API operations from a single endpoint. When using such a unified host, a single configuration profile can be used to create both `WorkspaceClient` and `AccountClient` instances without changing the `host`.
139+
140+
For this to work, the following conditions must be met:
141+
142+
1. The host must support unified operations.
143+
2. Both `account_id` and `workspace_id` must be available — either set explicitly in the configuration or auto-discovered.
144+
145+
When both values are present, the SDK uses `workspace_id` to route workspace-level requests and `account_id` to route account-level requests, all through the same host.
146+
147+
```ini
148+
# .databrickscfg
149+
[unified]
150+
host = https://mycompany.databricks.com
151+
account_id = 00000000-0000-0000-0000-000000000000
152+
workspace_id = 1234567890
153+
```
154+
155+
```java
156+
import com.databricks.sdk.AccountClient;
157+
import com.databricks.sdk.WorkspaceClient;
158+
import com.databricks.sdk.core.DatabricksConfig;
159+
...
160+
// Both clients share the same host and profile
161+
WorkspaceClient workspace = new WorkspaceClient(new DatabricksConfig().setProfile("unified"));
162+
AccountClient account = new AccountClient(new DatabricksConfig().setProfile("unified"));
163+
164+
// A WorkspaceClient for a different workspace under the same host and account
165+
WorkspaceClient otherWorkspace = new WorkspaceClient(
166+
new DatabricksConfig().setProfile("unified").setWorkspaceId("2345678901"));
167+
```
168+
169+
If the host supports it, `account_id` and `workspace_id` may be auto-discovered, reducing the required explicit configuration.
170+
118171
### Databricks native authentication
119172

120173
By default, the Databricks SDK for Java initially tries [Databricks token authentication](https://docs.databricks.com/dev-tools/api/latest/authentication.html) (`auth_type='pat'` argument). If the SDK is unsuccessful, it then tries Workload Identity Federation (WIF). See [Supported WIF](https://docs.databricks.com/aws/en/dev-tools/auth/oauth-federation-provider) for the supported JWT token providers.
@@ -123,13 +176,17 @@ By default, the Databricks SDK for Java initially tries [Databricks token authen
123176
- For Databricks OIDC authentication, you must provide the `host`, `client_id` and `token_audience` _(optional)_ either directly, through the corresponding environment variables, or in your `.databrickscfg` configuration file.
124177
- For Azure DevOps OIDC authentication, the `token_audience` is irrelevant as the audience is always set to `api://AzureADTokenExchange`. Also, the `System.AccessToken` pipeline variable required for OIDC request must be exposed as the `SYSTEM_ACCESSTOKEN` environment variable, following [Pipeline variables](https://learn.microsoft.com/en-us/azure/devops/pipelines/build/variables?view=azure-devops&tabs=yaml#systemaccesstoken)
125178

126-
| Argument | Description | Environment variable |
127-
|--------------|-------------|-------------------|
128-
| `host` | _(String)_ The Databricks host URL for either the Databricks workspace endpoint or the Databricks accounts endpoint. | `DATABRICKS_HOST` |
129-
| `account_id` | _(String)_ The Databricks account ID for the Databricks accounts endpoint. Only has effect when `Host` is either `https://accounts.cloud.databricks.com/` _(AWS)_, `https://accounts.azuredatabricks.net/` _(Azure)_, or `https://accounts.gcp.databricks.com/` _(GCP)_. | `DATABRICKS_ACCOUNT_ID` |
130-
| `token` | _(String)_ The Databricks personal access token (PAT) _(AWS, Azure, and GCP)_ or Azure Active Directory (Azure AD) token _(Azure)_. | `DATABRICKS_TOKEN` |
131-
| `client_id` | _(String)_ The Databricks Service Principal Application ID. | `DATABRICKS_CLIENT_ID` |
132-
| `token_audience` | _(String)_ When using Workload Identity Federation, the audience to specify when fetching an ID token from the ID token supplier. | `TOKEN_AUDIENCE` |
179+
During initialization, the SDK automatically resolves missing configuration fields (`account_id`, `workspace_id`, and `discovery_url`) from the host metadata endpoint. Explicitly provided values take precedence and are never overwritten. If auto-discovery fails, the SDK proceeds with only the explicitly configured values, so setting these fields explicitly is recommended.
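The precedence rule above (explicit values win, discovered values only fill gaps) can be sketched as a simple merge. This is an illustrative model under stated assumptions, not the SDK's code: `DiscoveryMergeSketch` and `resolve` are hypothetical names, configuration is modeled as plain string maps, and a failed discovery is represented by `null`.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative model of "auto-discovery never overwrites explicit values". */
public class DiscoveryMergeSketch {

  // The three fields the text describes as auto-discoverable.
  static final String[] DISCOVERABLE = {"account_id", "workspace_id", "discovery_url"};

  /**
   * Merges discovered metadata into the explicit configuration.
   * A null `discovered` map models a failed discovery call (an assumption
   * of this sketch); in that case only the explicit values are returned.
   */
  static Map<String, String> resolve(
      Map<String, String> explicit, Map<String, String> discovered) {
    Map<String, String> resolved = new HashMap<>(explicit);
    if (discovered == null) {
      return resolved; // discovery failed: keep explicit configuration only
    }
    for (String key : DISCOVERABLE) {
      String value = discovered.get(key);
      if (value != null) {
        resolved.putIfAbsent(key, value); // fill gaps, never overwrite
      }
    }
    return resolved;
  }

  public static void main(String[] args) {
    Map<String, String> explicit = new HashMap<>();
    explicit.put("host", "https://mycompany.databricks.com");
    explicit.put("workspace_id", "1234567890");

    Map<String, String> discovered = new HashMap<>();
    discovered.put("workspace_id", "2345678901");
    discovered.put("account_id", "00000000-0000-0000-0000-000000000000");

    // workspace_id keeps the explicit value; account_id is filled in.
    System.out.println(resolve(explicit, discovered));
  }
}
```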
180+
181+
| Argument | Description | Environment variable |
182+
|------------------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|---------------------------|
183+
| `host` | _(String)_ The Databricks host URL for either the Databricks workspace endpoint or the Databricks accounts endpoint. | `DATABRICKS_HOST` |
184+
| `account_id` | _(String)_ The Databricks account ID for the Databricks accounts endpoint. Auto-discovered if not provided. Has effect on hosts that serve account-level APIs. | `DATABRICKS_ACCOUNT_ID` |
185+
| `workspace_id` | _(String)_ The Databricks workspace ID for the Databricks workspace endpoint. Auto-discovered if not provided. | `DATABRICKS_WORKSPACE_ID` |
186+
| `discovery_url` | _(String)_ The OpenID Connect discovery URL. Auto-discovered if not provided. When set, OIDC endpoints are fetched directly from this URL instead of using the default host-based well-known endpoint logic. | `DATABRICKS_DISCOVERY_URL`|
187+
| `token` | _(String)_ The Databricks personal access token (PAT) _(AWS, Azure, and GCP)_ or Azure Active Directory (Azure AD) token _(Azure)_. | `DATABRICKS_TOKEN` |
188+
| `client_id` | _(String)_ The Databricks Service Principal Application ID. | `DATABRICKS_CLIENT_ID` |
189+
| `token_audience` | _(String)_ When using Workload Identity Federation, the audience to specify when fetching an ID token from the ID token supplier. | `TOKEN_AUDIENCE` |
133190

134191
For example, to use Databricks token authentication:
135192
