- Split the connector into two separate packages: `databricks-sql-connector` and `databricks-sqlalchemy`. The `databricks-sql-connector` package contains the core functionality of the connector, while the `databricks-sqlalchemy` package contains the SQLAlchemy dialect for the connector.
- Pyarrow dependency is now optional in `databricks-sql-connector`. Users who need Arrow support should install `pyarrow` explicitly.
- Relaxed the number of HTTP retry attempts (#486 by @jprakash-db)
- Fix: Incorrect number of rows fetched in inline results when fetching results with FETCH_NEXT orientation (#479 by @jprakash-db)
- Updated the doc to specify native parameters are not supported in PUT operation (#477 by @jprakash-db)
- Relax `pyarrow` and `numpy` pin (#452 by @arredond)
- Feature: Support for async execute has been added (#463 by @jprakash-db)
- Updated the HTTP retry logic to be similar to the other Databricks drivers (#467 by @jprakash-db)
- Support encryption headers in the cloud fetch request (#460 by @jackyhu-db)
- Create a non-pyarrow flow to handle small results for the column set (#440 by @jprakash-db)
- Fix: On non-retryable error, ensure PySQL includes useful information in error (#447 by @shivam2680)
- Unpin pandas to support v2.2.2 (#416 by @kfollesdal)
- Make OAuth the default authenticator if no authentication setting is provided (#419 by @jackyhu-db)
- Fix (regression): use SSL options with HTTPS connection pool (#425 by @kravets-levko)
- Don't retry requests that fail with HTTP code 401 (#408 by @Hodnebo)
- Remove username/password (aka "basic") auth option (#409 by @jackyhu-db)
- Refactor CloudFetch handler to fix numerous issues with it (#405 by @kravets-levko)
- Add option to disable SSL verification for CloudFetch links (#414 by @kravets-levko)
Databricks-managed passwords reached end of life on July 10, 2024. Therefore, Basic auth support was removed from the library. See https://docs.databricks.com/en/security/auth-authz/password-deprecation.html
The existing option `_tls_no_verify=True` of `sql.connect(...)` will now also disable SSL cert verification (but not the SSL itself) for CloudFetch links. This option should be used as a workaround only, when other ways to fix SSL certificate errors didn't work.
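A minimal sketch of how the `_tls_no_verify` workaround might be wired into the keyword arguments for `sql.connect(...)`. The helper name `build_connect_kwargs` and all hostname, path, and token values are hypothetical placeholders; only `_tls_no_verify` and the three basic connection arguments come from the connector itself.

```python
from typing import Any, Dict


def build_connect_kwargs(
    server_hostname: str,
    http_path: str,
    access_token: str,
    skip_cert_verification: bool = False,
) -> Dict[str, Any]:
    """Hypothetical helper: collect keyword arguments for sql.connect(...)."""
    kwargs: Dict[str, Any] = {
        "server_hostname": server_hostname,
        "http_path": http_path,
        "access_token": access_token,
    }
    if skip_cert_verification:
        # Workaround only: certificate verification is skipped, TLS stays on.
        # Per the note above, this now also applies to CloudFetch links.
        kwargs["_tls_no_verify"] = True
    return kwargs


kwargs = build_connect_kwargs(
    "example.cloud.databricks.com",  # placeholder hostname
    "/sql/1.0/warehouses/abc123",    # placeholder HTTP path
    "dapi-placeholder-token",        # placeholder token
    skip_cert_verification=True,
)
print(sorted(kwargs))

# With the connector installed, the dict would be passed along as:
#   from databricks import sql
#   with sql.connect(**kwargs) as connection:
#       ...
```

Keeping the flag behind an explicit opt-in argument makes it harder to leave certificate verification disabled by accident.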
- Update proxy authentication (#354 by @amir-haroun)
- Relax `pyarrow` pin (#389 by @dhirschfeld)
- Fix error logging in OAuth manager (#269 by @susodapop)
- SQLAlchemy: enable delta.feature.allowColumnDefaults for all tables (#343 by @dhirschfeld)
- Update `thrift` dependency (#397 by @m1n0)
- Remove broken cookie code (#379)
- Small typing fixes (#382, #384 thanks @wyattscarpenter)
- Don't retry requests that fail with code 403 (#373)
- Assume a default retry-after for 429/503 (#371)
- Fix boolean literals (#357)
- Revert retry-after behavior to be exponential backoff (#349)
- Support Databricks OAuth on Azure (#351)
- Support Databricks OAuth on GCP (#338)
- Revised docstrings and examples for OAuth (#339)
- Redact the URL query parameters from the urllib3.connectionpool logs (#341)
- SQLAlchemy dialect now supports table and column comments (thanks @cbornet!)
- Fix: SQLAlchemy dialect now correctly reflects TINYINT types (thanks @TimTheinAtTabs!)
- Fix: `server_hostname` URIs that included `https://` would raise an exception
- Other: pinned to `pandas<=2.1` and `urllib3>=1.26` to avoid runtime errors in dbt-databricks (#330)
- Other: updated docstring comment about default parameterization approach (#287)
- Other: added tests for reading complex types and revised docstrings and type hints (#293)
- Fix: SQLAlchemy dialect raised DeprecationWarning due to `dbapi` classmethod (#294)
- Fix: SQLAlchemy dialect could not reflect TIMESTAMP_NTZ columns (#296)
- Remove support for Python 3.7
- Add support for native parameterized SQL queries. Requires DBR 14.2 and above. See docs/parameters.md for more info.
- Completely rewritten SQLAlchemy dialect
  - Adds support for SQLAlchemy >= 2.0 and drops support for SQLAlchemy 1.x
  - Full e2e test coverage of all supported features
  - Detailed usage notes in `README.sqlalchemy.md`
  - Adds support for:
    - New types: `TIME`, `TIMESTAMP`, `TIMESTAMP_NTZ`, `TINYINT`
    - `Numeric` type scale and precision, like `Numeric(10,2)`
    - Reading and writing `PrimaryKeyConstraint` and `ForeignKeyConstraint`
    - Reading and writing composite keys
    - Reading and writing from views
    - Writing `Identity` to tables (i.e. autoincrementing primary keys)
    - `LIMIT` and `OFFSET` for paging through results
    - Caching metadata calls
- Enable cloud fetch by default. To disable, set `use_cloud_fetch=False` when building `databricks.sql.client`.
- Add integration tests for Databricks UC Volumes ingestion queries
- Retries:
  - Add `_retry_max_redirects` config
  - Set `_enable_v3_retries=True` and warn if users override it
- Security: bump minimum pyarrow version to 14.0.1 (CVE-2023-47248)
- Fix: Connections failed when urllib3~=1.0.0 is installed (#206)
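Several retry entries in this changelog (no retries on 401 or 403, a default retry-after for 429/503, exponential backoff) can be pictured with a small decision function. This is a hedged sketch, not the connector's `DatabricksRetryPolicy`: the function name, every constant, and the treatment of other status codes are assumptions made for illustration.

```python
from typing import Optional

# All values below are illustrative assumptions, not the connector's defaults.
NON_RETRYABLE_CODES = {401, 403}
RETRY_AFTER_CODES = {429, 503}
DEFAULT_RETRY_AFTER_SECONDS = 1.0
BACKOFF_BASE_SECONDS = 1.0
BACKOFF_CAP_SECONDS = 60.0


def retry_delay(
    status_code: int, attempt: int, retry_after: Optional[str] = None
) -> Optional[float]:
    """Return the seconds to wait before retrying, or None to give up."""
    if status_code in NON_RETRYABLE_CODES:
        # Authentication and authorization failures will not fix themselves.
        return None
    if status_code in RETRY_AFTER_CODES:
        # Honor the server's Retry-After value, else fall back to a default.
        if retry_after is not None:
            return float(retry_after)
        return DEFAULT_RETRY_AFTER_SECONDS
    # Assumption: any other failure backs off exponentially, up to a cap.
    return min(BACKOFF_BASE_SECONDS * (2 ** attempt), BACKOFF_CAP_SECONDS)


print(retry_delay(403, attempt=0))
print(retry_delay(429, attempt=0, retry_after="5"))
print(retry_delay(503, attempt=0))
print(retry_delay(502, attempt=3))
```

Returning `None` for the give-up case keeps the caller's loop simple: it sleeps when it gets a number and raises when it gets nothing.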
Note: this release was yanked from PyPI on 13 September 2023 due to compatibility issues with environments where urllib3<=2.0.0 was installed. The log changes are incorporated into version 2.9.3 and greater.
- Other: Add `examples/v3_retries_query_execute.py` (#199)
- Other: suppress log message when `_enable_v3_retries` is not `True` (#199)
- Other: make this connector backwards compatible with `urllib3>=1.0.0` (#197)
Note: this release was yanked from PyPI on 13 September 2023 due to compatibility issues with environments where urllib3<=2.0.0 was installed.
- Other: Explicitly pin urllib3 to ^2.0.0 (#191)
- Replace retry handling with DatabricksRetryPolicy. This is disabled by default. To enable, set `_enable_v3_retries=True` when creating `databricks.sql.client` (#182)
- Other: Fix typo in README quick start example (#186)
- Other: Add autospec to Client mocks and tidy up `make_request` (#188)
- Add support for Cloud Fetch. Disabled by default. Set `use_cloud_fetch=True` when building `databricks.sql.client` to enable it (#146, #151, #154)
- SQLAlchemy has_table function now honours schema= argument and adds catalog= argument (#174)
- SQLAlchemy set non_native_boolean_check_constraint False as it's not supported by Databricks (#120)
- Fix: Revised SQLAlchemy dialect and examples for compatibility with SQLAlchemy==1.3.x (#173)
- Fix: oauth would fail if expired credentials appeared in ~/.netrc (#122)
- Fix: Python HTTP proxies were broken after switch to urllib3 (#158)
- Other: remove unused import in SQLAlchemy dialect
- Other: Relax pandas dependency constraint to allow ^2.0.0 (#164)
- Other: Connector now logs operation handle guids as hexadecimal instead of bytes (#170)
- Other: test_socket_timeout_user_defined e2e test was broken (#144)
- Fix: connector raised exception when calling close() on a closed Thrift session
- Improve e2e test development ergonomics
- Redact logged thrift responses by default
- Add support for OAuth on Databricks Azure
- Fix: Retry GetOperationStatus requests for http errors
- Fix: http.client would raise a BadStatusLine exception in some cases
- Add support for HTTP 1.1 connections (connection pools)
- Add a default socket timeout for thrift RPCs
- Fix: SQLAlchemy adapter could not reflect TIMESTAMP or DATETIME columns
- Other: Relax pandas and alembic dependency specifications
- Other: Relax sqlalchemy required version as it was unnecessarily strict.
- Add support for External Auth providers
- Fix: Python HTTP proxies were broken
- Other: All Thrift requests that timeout during connection will be automatically retried
- Less strict numpy and pyarrow dependencies
- Update examples in README to use security best practices
- Update docstring for client.execute() for clarity
- Improve compatibility when installed alongside other Databricks namespace Python packages
- Add SQLAlchemy dialect
- Support staging ingestion commands for DBR 12+
- Support custom oauth client id and redirect port
- Fix: Add none check on _oauth_persistence in DatabricksOAuthProvider
- Add support for Python 3.11
- Bump thrift version to address https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2020-13949
- Add support for lz4 compression
- Introduce experimental OAuth support while Bring Your Own IDP is in Public Preview on AWS
- Add functional examples
- Fix: closing a connection now closes any open cursors from that connection at the server
- Other: Add project links to pyproject.toml (helpful for visitors from PyPI)
- Add support for Python 3.10
- Add unit test matrix for supported Python versions
Huge thanks to @dbaxa for contributing this change!
- Add retry logic for `GetOperationStatus` requests that fail with an `OSError`
- Reorganised code to use Poetry for dependency management.
- Better exception handling in automatic connection close
- Fixed Pandas dependency in setup.cfg to be >= 1.2.0
- Initial stable release of V2
- Added better support for complex types, so that in Databricks runtime 10.3+, Arrays, Maps and Structs will get deserialized as lists, lists of tuples and dicts, respectively.
- Changed the name of the metadata arg to http_headers
- Change import of collections.Iterable to collections.abc.Iterable to make the library compatible with Python 3.10
- Fixed bug with .tables method so that .tables works as expected with Unity-Catalog enabled endpoints
- Fix packaging issue (dependencies were not being installed properly)
- Fetching timestamp results will now return aware instead of naive timestamps
- The client will now default to using simplified error messages
- Initial beta release of V2. V2 is an internal re-write of large parts of the connector to use Databricks edge features. All public APIs from V1 remain.
- Added Unity Catalog support (pass `catalog` and/or `schema` keyword arguments to the `.connect` method to select the initial catalog and schema)
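The switch from naive to aware timestamps noted above can affect code that compares fetched values with locally built `datetime` objects. The sketch below uses only the standard library to show the difference; the sample values and the `is_aware` helper are illustrative assumptions and do not come from the connector.

```python
from datetime import datetime, timezone

# A naive timestamp carries no timezone; an aware one does.
naive = datetime(2022, 1, 1, 12, 0, 0)
aware = datetime(2022, 1, 1, 12, 0, 0, tzinfo=timezone.utc)


def is_aware(ts: datetime) -> bool:
    """Return True when the datetime has a usable UTC offset."""
    return ts.tzinfo is not None and ts.utcoffset() is not None


# Ordering comparisons between naive and aware values raise TypeError.
try:
    naive < aware
    comparable = True
except TypeError:
    comparable = False

# One way to compare safely: attach an explicit timezone to the naive value.
localized = naive.replace(tzinfo=timezone.utc)

print(is_aware(naive), is_aware(aware), comparable, localized == aware)
```

Code written against naive results may therefore need an explicit timezone on its own values before comparing them with fetched timestamps.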
Note: The code for versions prior to `v2.0.0b` is not contained in this repository. The below entries are included for reference only.
- Add operations for retrieving metadata
- Add the ability to access columns by name on result rows
- Add the ability to provide configuration settings on connect
- Improved logging and error messages.
- Add retries for 429 and 503 HTTP responses.
- (Bug fix) Increased Thrift requirement from 0.10.0 to 0.13.0 as 0.10.0 was in fact incompatible
- (Bug fix) Fixed error message after query execution failed: SQLSTATE and error message were misplaced
- Public Preview release, Experimental tag removed
- minor updates in internal build/packaging
- no functional changes
- initial (Experimental) release of pyhive-forked connector
- Python DBAPI 2.0 (PEP-0249), thrift based
- see docs for more info: https://docs.databricks.com/dev-tools/python-sql-connector.html
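Because the connector follows Python DB-API 2.0 (PEP 249), the connect, cursor, execute, fetch pattern is the same as for any compliant driver. The sketch below uses the standard library's `sqlite3` module purely as a stand-in so that it runs without a Databricks workspace; the query and column names are made up for illustration.

```python
import sqlite3
from contextlib import closing

# sqlite3 is a stand-in here: it implements the same PEP 249 interface.
with closing(sqlite3.connect(":memory:")) as connection:
    with closing(connection.cursor()) as cursor:
        cursor.execute("SELECT 1 AS one, 'a' AS letter")
        # cursor.description holds one 7-item sequence per result column;
        # the first item is the column name.
        column_names = [column[0] for column in cursor.description]
        rows = cursor.fetchall()

print(column_names, rows)
```

With the connector installed, the same cursor calls would be made on a connection obtained from `sql.connect(...)` instead of `sqlite3.connect(...)`.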