connect.external.databricks

Databricks SDK integration.

Databricks SDK credentials implementations which support interacting with Posit OAuth integrations on Connect.

Notes

These APIs are provided as a convenience and are subject to breaking changes: https://github.com/databricks/databricks-sdk-py#interface-stability

Attributes

Name Description
CredentialsProvider
POSIT_LOCAL_CLIENT_CREDENTIALS_AUTH_TYPE
POSIT_OAUTH_INTEGRATION_AUTH_TYPE

Classes

Name Description
CredentialsStrategy Maintain compatibility with the Databricks SQL/SDK client libraries.
PositContentCredentialsProvider CredentialsProvider implementation which initiates a credential exchange using a content-session-token.
PositContentCredentialsStrategy CredentialsStrategy implementation which supports interacting with Service Account OAuth integrations on Connect.
PositCredentialsProvider CredentialsProvider implementation which initiates a credential exchange using a user-session-token.
PositCredentialsStrategy CredentialsStrategy implementation which supports interacting with Viewer OAuth integrations on Connect.
PositLocalContentCredentialsProvider CredentialsProvider implementation which provides a fallback for local development using a client credentials flow.
PositLocalContentCredentialsStrategy CredentialsStrategy implementation which supports local development using OAuth M2M authentication against Databricks.

CredentialsStrategy

connect.external.databricks.CredentialsStrategy()

Maintain compatibility with the Databricks SQL/SDK client libraries.

See Also

  • https://github.com/databricks/databricks-sql-python/blob/v3.3.0/src/databricks/sql/auth/authenticators.py#L19-L33
  • https://github.com/databricks/databricks-sdk-py/blob/v0.29.0/databricks/sdk/credentials_provider.py#L44-L54

Methods

Name Description
auth_type
auth_type
connect.external.databricks.CredentialsStrategy.auth_type()

PositContentCredentialsProvider

connect.external.databricks.PositContentCredentialsProvider(self, client)

CredentialsProvider implementation which initiates a credential exchange using a content-session-token.

The content-session-token is provided by Connect through the environment variable CONNECT_CONTENT_SESSION_TOKEN.
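
The provider reads this variable itself, so application code does not normally need to. As a minimal sketch, assuming you only want to check whether a content-session-token is present (the helper name `read_content_session_token` is hypothetical and not part of the SDK):

```python
import os
from typing import Mapping, Optional

# Environment variable name taken from the description above.
CONTENT_SESSION_TOKEN_VAR = "CONNECT_CONTENT_SESSION_TOKEN"


def read_content_session_token(
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Return the content-session-token, or None when it is unset or empty.

    Hypothetical helper for illustration only; it is not part of the SDK.
    """
    token = environ.get(CONTENT_SESSION_TOKEN_VAR)
    return token or None
```

Passing a plain dictionary instead of `os.environ` makes the helper easy to exercise in tests without touching the real environment.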

See Also

  • https://github.com/posit-dev/posit-sdk-py/blob/main/src/posit/connect/oauth/oauth.py

PositContentCredentialsStrategy

connect.external.databricks.PositContentCredentialsStrategy(
    self,
    local_strategy,
    client=None
)

CredentialsStrategy implementation which supports interacting with Service Account OAuth integrations on Connect.

This strategy is a callable class: it returns a PositContentCredentialsProvider when hosted on Connect, and delegates to its local_strategy otherwise.
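
The hosted-versus-local dispatch can be pictured with the following sketch. It is an illustration under stated assumptions, not the SDK's implementation: `DispatchingStrategy`, `hosted_strategy`, and `on_connect` are hypothetical names, and how the real class detects that it is running on Connect is not described here.

```python
from typing import Callable, Dict

# A credentials provider takes no arguments and returns HTTP headers.
HeaderFactory = Callable[[], Dict[str, str]]
# A strategy is called with a config and returns a credentials provider.
Strategy = Callable[..., HeaderFactory]


class DispatchingStrategy:
    """Hypothetical illustration of a strategy with a local fallback."""

    def __init__(
        self,
        hosted_strategy: Strategy,
        local_strategy: Strategy,
        on_connect: Callable[[], bool],
    ) -> None:
        self._hosted_strategy = hosted_strategy
        self._local_strategy = local_strategy
        # Assumption: the caller supplies the "am I hosted?" check.
        self._on_connect = on_connect

    def __call__(self, *args, **kwargs) -> HeaderFactory:
        # Use the hosted provider on Connect, the local strategy elsewhere.
        if self._on_connect():
            return self._hosted_strategy(*args, **kwargs)
        return self._local_strategy(*args, **kwargs)


def hosted(cfg):
    return lambda: {"Authorization": "Bearer hosted-token"}


def local(cfg):
    return lambda: {"Authorization": "Bearer local-token"}


on_server = DispatchingStrategy(hosted, local, on_connect=lambda: True)
on_laptop = DispatchingStrategy(hosted, local, on_connect=lambda: False)
```

Because both branches return the same kind of callable, the code that consumes the strategy does not need to know which environment it is running in.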

Examples

NOTE: in the example below, the PositContentCredentialsStrategy can be initialized anywhere that the Python process can read environment variables.

from posit.connect.external.databricks import PositContentCredentialsStrategy

import pandas as pd
from databricks import sql
from databricks.sdk.core import ApiClient, Config, databricks_cli
from databricks.sdk.service.iam import CurrentUserAPI

DATABRICKS_HOST = "<REDACTED>"
DATABRICKS_HOST_URL = f"https://{DATABRICKS_HOST}"
SQL_HTTP_PATH = "<REDACTED>"

# NOTE: currently the databricks_cli local strategy only supports auth code OAuth flows.
# https://github.com/databricks/cli/issues/1939
#
# This means that the databricks_cli supports local development using the developer's
# databricks credentials, but not the credentials for a service principal.
# To fall back to service principal credentials in local development, use
# `PositLocalContentCredentialsStrategy` as a drop-in replacement.
posit_strategy = PositContentCredentialsStrategy(local_strategy=databricks_cli)

cfg = Config(host=DATABRICKS_HOST_URL, credentials_strategy=posit_strategy)

databricks_user_info = CurrentUserAPI(ApiClient(cfg)).me()
print(f"Hello, {databricks_user_info.display_name}!")

query = "SELECT * FROM samples.nyctaxi.trips LIMIT 10;"
with sql.connect(
    server_hostname=DATABRICKS_HOST,
    http_path=SQL_HTTP_PATH,
    credentials_provider=posit_strategy.sql_credentials_provider(cfg),
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query)
        rows = cursor.fetchall()
        print(pd.DataFrame([row.asDict() for row in rows]))

Methods

Name Description
auth_type
sql_credentials_provider The SQL connector attempts to call the credentials provider without any arguments.
auth_type
connect.external.databricks.PositContentCredentialsStrategy.auth_type()
sql_credentials_provider
connect.external.databricks.PositContentCredentialsStrategy.sql_credentials_provider(
    *args,
    **kwargs
)

The SQL connector attempts to call the credentials provider without any arguments.

The SQL client’s ExternalAuthProvider is not compatible with the SDK’s implementation of CredentialsProvider, so this method creates a no-argument lambda that wraps the arguments supplied by the real caller. This makes it possible to pass in the Databricks Config object, which most of the SDK’s CredentialsProvider implementations require, at the point where sql.connect is called.

See Also

  • https://github.com/databricks/databricks-sql-python/issues/148#issuecomment-2271561365
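
The wrapping technique described for sql_credentials_provider can be sketched as follows. This is a simplified illustration, not the SDK's code: `SketchStrategy` is a hypothetical stand-in, and a plain dictionary stands in for the Databricks Config object.

```python
from typing import Callable, Dict

# A header factory takes no arguments and returns HTTP headers.
HeaderFactory = Callable[[], Dict[str, str]]


class SketchStrategy:
    """Hypothetical stand-in for an SDK-style credentials strategy."""

    def __call__(self, cfg: dict) -> HeaderFactory:
        # SDK-style call: requires a config and returns a header factory.
        token = cfg["token"]
        return lambda: {"Authorization": f"Bearer {token}"}

    def sql_credentials_provider(
        self, *args, **kwargs
    ) -> Callable[[], HeaderFactory]:
        # Capture the caller's arguments now. The SQL connector later
        # invokes the returned callable with no arguments at all.
        return lambda: self(*args, **kwargs)


strategy = SketchStrategy()
provider = strategy.sql_credentials_provider({"token": "example-token"})

header_factory = provider()  # what the SQL connector does: no arguments
headers = header_factory()   # the headers attached to each request
```

The closure defers the call that needs the config until the connector asks for credentials, while still presenting the zero-argument signature the connector expects.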

PositCredentialsProvider

connect.external.databricks.PositCredentialsProvider(
    self,
    client,
    user_session_token
)

CredentialsProvider implementation which initiates a credential exchange using a user-session-token.

The user-session-token is provided by Connect through the HTTP session header Posit-Connect-User-Session-Token.

See Also

  • https://github.com/posit-dev/posit-sdk-py/blob/main/src/posit/connect/oauth/oauth.py

PositCredentialsStrategy

connect.external.databricks.PositCredentialsStrategy(
    self,
    local_strategy,
    client=None,
    user_session_token=None
)

CredentialsStrategy implementation which supports interacting with Viewer OAuth integrations on Connect.

This strategy is a callable class: it returns a PositCredentialsProvider when hosted on Connect, and delegates to its local_strategy otherwise.

Examples

NOTE: In the example below, the PositCredentialsStrategy must be initialized within the context of the Shiny server function, which provides access to the HTTP session headers.

import pandas as pd
from databricks import sql
from databricks.sdk.core import ApiClient, Config, databricks_cli
from databricks.sdk.service.iam import CurrentUserAPI
from posit.connect.external.databricks import PositCredentialsStrategy
from shiny import App, Inputs, Outputs, Session, render, ui

DATABRICKS_HOST = "<REDACTED>"
DATABRICKS_HOST_URL = f"https://{DATABRICKS_HOST}"
SQL_HTTP_PATH = "<REDACTED>"

app_ui = ui.page_fluid(ui.output_text("text"), ui.output_data_frame("result"))


def server(i: Inputs, o: Outputs, session: Session):
    # HTTP session headers are available in this context.
    session_token = session.http_conn.headers.get("Posit-Connect-User-Session-Token")
    posit_strategy = PositCredentialsStrategy(
        local_strategy=databricks_cli, user_session_token=session_token
    )
    cfg = Config(host=DATABRICKS_HOST_URL, credentials_strategy=posit_strategy)

    @render.data_frame
    def result():
        query = "SELECT * FROM samples.nyctaxi.trips LIMIT 10;"

        with sql.connect(
            server_hostname=DATABRICKS_HOST,
            http_path=SQL_HTTP_PATH,
            credentials_provider=posit_strategy.sql_credentials_provider(cfg),
        ) as connection:
            with connection.cursor() as cursor:
                cursor.execute(query)
                rows = cursor.fetchall()
                df = pd.DataFrame(rows, columns=[col[0] for col in cursor.description])
                return df

    @render.text
    def text():
        databricks_user_info = CurrentUserAPI(ApiClient(cfg)).me()
        return f"Hello, {databricks_user_info.display_name}!"


app = App(app_ui, server)

Methods

Name Description
auth_type
sql_credentials_provider The SQL connector attempts to call the credentials provider without any arguments.
auth_type
connect.external.databricks.PositCredentialsStrategy.auth_type()
sql_credentials_provider
connect.external.databricks.PositCredentialsStrategy.sql_credentials_provider(
    *args,
    **kwargs
)

The SQL connector attempts to call the credentials provider without any arguments.

The SQL client’s ExternalAuthProvider is not compatible with the SDK’s implementation of CredentialsProvider, so this method creates a no-argument lambda that wraps the arguments supplied by the real caller. This makes it possible to pass in the Databricks Config object, which most of the SDK’s CredentialsProvider implementations require, at the point where sql.connect is called.

See Also

  • https://github.com/databricks/databricks-sql-python/issues/148#issuecomment-2271561365

PositLocalContentCredentialsProvider

connect.external.databricks.PositLocalContentCredentialsProvider(
    self,
    token_endpoint_url,
    client_id,
    client_secret
)

CredentialsProvider implementation which provides a fallback for local development using a client credentials flow.

An open issue against the Databricks CLI prevents it from returning service principal access tokens: https://github.com/databricks/cli/issues/1939

Until the CLI issue is resolved, this CredentialsProvider implements the approach described in the Databricks documentation for manually generating a workspace-level access token using OAuth M2M authentication. Once it has acquired an access token, it returns it as a Bearer authorization header like other CredentialsProvider implementations.

See Also

  • https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#manually-generate-a-workspace-level-access-token
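
The manual OAuth M2M exchange referenced above amounts to a client-credentials POST to the token endpoint. The sketch below only builds the request and shows how a resulting token becomes a Bearer header. It is an illustration under assumptions rather than the provider's actual code: the function names are hypothetical, and the `all-apis` scope and HTTP Basic authentication follow the linked Databricks documentation.

```python
import base64
import urllib.parse
import urllib.request
from typing import Dict


def build_token_request(
    token_endpoint_url: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Build (but do not send) a client-credentials token request."""
    # Form-encoded body requesting a workspace-level token.
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode("ascii")
    # The service principal authenticates with HTTP Basic credentials.
    basic = base64.b64encode(
        f"{client_id}:{client_secret}".encode("utf-8")
    ).decode("ascii")
    return urllib.request.Request(
        token_endpoint_url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
    )


def bearer_header(access_token: str) -> Dict[str, str]:
    """Turn an access token into the header a CredentialsProvider returns."""
    return {"Authorization": f"Bearer {access_token}"}


# Placeholder endpoint and credentials, for illustration only.
request = build_token_request(
    "https://example.cloud.databricks.com/oidc/v1/token",
    "example-client-id",
    "example-client-secret",
)
```

Sending the request with `urllib.request.urlopen(request)` and reading the `access_token` field from the JSON response would complete the exchange. That step is left out here so the sketch runs without network access or real credentials.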

PositLocalContentCredentialsStrategy

connect.external.databricks.PositLocalContentCredentialsStrategy(
    self,
    token_endpoint_url,
    client_id,
    client_secret
)

CredentialsStrategy implementation which supports local development using OAuth M2M authentication against Databricks.

An open issue against the Databricks CLI prevents it from returning service principal access tokens: https://github.com/databricks/cli/issues/1939

Until the CLI issue is resolved, this CredentialsStrategy provides a drop-in replacement as a local_strategy that can be used to develop applications which target Service Account OAuth integrations on Connect.

Examples

In the example below, the PositContentCredentialsStrategy can be initialized anywhere that the Python process can read environment variables.

CLIENT_ID and CLIENT_SECRET are credentials associated with the Databricks service principal.

from posit.connect.external.databricks import (
    PositContentCredentialsStrategy,
    PositLocalContentCredentialsStrategy,
)

import pandas as pd
from databricks import sql
from databricks.sdk.core import ApiClient, Config
from databricks.sdk.service.iam import CurrentUserAPI

DATABRICKS_HOST = "<REDACTED>"
DATABRICKS_HOST_URL = f"https://{DATABRICKS_HOST}"
SQL_HTTP_PATH = "<REDACTED>"
TOKEN_ENDPOINT_URL = f"https://{DATABRICKS_HOST}/oidc/v1/token"

CLIENT_ID = "<REDACTED>"
CLIENT_SECRET = "<REDACTED>"

# Rather than relying on the Databricks CLI as a local strategy, we use
# PositLocalContentCredentialsStrategy as a drop-in replacement.
# Can be replaced with the Databricks CLI implementation when
# https://github.com/databricks/cli/issues/1939 is resolved.
local_strategy = PositLocalContentCredentialsStrategy(
    token_endpoint_url=TOKEN_ENDPOINT_URL,
    client_id=CLIENT_ID,
    client_secret=CLIENT_SECRET,
)

posit_strategy = PositContentCredentialsStrategy(local_strategy=local_strategy)

cfg = Config(host=DATABRICKS_HOST_URL, credentials_strategy=posit_strategy)

databricks_user_info = CurrentUserAPI(ApiClient(cfg)).me()
print(f"Hello, {databricks_user_info.display_name}!")

query = "SELECT * FROM samples.nyctaxi.trips LIMIT 10;"
with sql.connect(
    server_hostname=DATABRICKS_HOST,
    http_path=SQL_HTTP_PATH,
    credentials_provider=posit_strategy.sql_credentials_provider(cfg),
) as connection:
    with connection.cursor() as cursor:
        cursor.execute(query)
        rows = cursor.fetchall()
        print(pd.DataFrame([row.asDict() for row in rows]))

See Also

  • https://docs.databricks.com/en/dev-tools/auth/oauth-m2m.html#manually-generate-a-workspace-level-access-token

Methods

Name Description
auth_type
sql_credentials_provider
auth_type
connect.external.databricks.PositLocalContentCredentialsStrategy.auth_type()
sql_credentials_provider
connect.external.databricks.PositLocalContentCredentialsStrategy.sql_credentials_provider(
    *args,
    **kwargs
)