Handling DATETIME2 compatibility issue in Databricks during Hyperscale type alignment
In our project, we transform data coming from source systems (DB2, via CDC or snapshots) in Azure Databricks and store it temporarily in Delta Lake. We later load this data into Azure SQL Hyperscale. To align with Hyperscale's expected schema, we convert source data types in Databricks (e.g., string → `DATETIME2`, decimal → `numeric`).
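For context, this is roughly how we align types in the Delta staging layer today (a minimal PySpark sketch; `raw_df`, the column names, and the paths are placeholders, not our real pipeline):

```python
from pyspark.sql import functions as F
from pyspark.sql.types import DecimalType

# raw_df is the frame read from the DB2 CDC feed / snapshot (placeholder).
staged_df = (
    raw_df
    # The source delivers timestamps as strings; since Spark has no DATETIME2,
    # we parse them into Spark's TimestampType (format string is illustrative).
    .withColumn("event_ts", F.to_timestamp("event_ts", "yyyy-MM-dd HH:mm:ss.SSSSSS"))
    # Align decimals with the NUMERIC(18,4) column Hyperscale expects.
    .withColumn("amount", F.col("amount").cast(DecimalType(18, 4)))
)

staged_df.write.format("delta").mode("overwrite").save("/mnt/staging/orders")
```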
However, we are facing a compatibility issue: Databricks does not support certain SQL Server-specific data types such as `DATETIME2`. When we attempt to cast or store such values in Delta format, we encounter errors because `DATETIME2` is not natively supported in Spark.
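To illustrate the failure mode: `DATETIME2` is not a recognized Spark SQL type, so an attempt like the following (table and column names are illustrative) is rejected at parse/analysis time rather than executed:

```python
# Fails in Databricks: DATETIME2 is a SQL Server type, not a Spark SQL type.
spark.sql("SELECT CAST(event_ts AS DATETIME2) FROM staged_orders")
```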
Can you guide us on the following:

1. What is the recommended approach for handling SQL Server types like `DATETIME2` inside Databricks?
2. Should we store such values in Spark-compatible types like `TimestampType` in Delta and only cast to `DATETIME2` at the point of writing into Hyperscale? (A rough sketch of this idea follows the list.)
3. In doing so, is there any risk of data truncation or precision mismatch between Spark's `TimestampType` and SQL `DATETIME2`?
4. Is there any best-practice guidance for type casting and schema enforcement in the intermediate layers between source and Hyperscale? (See the second sketch below.)
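For question 2, this is roughly what we have in mind, assuming the generic Spark JDBC writer (server, database, table, and secret names are placeholders; note `createTableColumnTypes` only takes effect when Spark itself creates the target table):

```python
(
    spark.read.format("delta").load("/mnt/staging/orders")
    .write.format("jdbc")
    .option("url", "jdbc:sqlserver://<server>.database.windows.net:1433;database=<db>")
    .option("dbtable", "dbo.orders")
    # Spark timestamps carry microsecond precision, so DATETIME2(6) is a
    # lossless target; DATETIME2(7) would only pad a trailing zero digit.
    .option("createTableColumnTypes", "event_ts DATETIME2(6), amount NUMERIC(18,4)")
    .option("user", dbutils.secrets.get("scope", "sql-user"))
    .option("password", dbutils.secrets.get("scope", "sql-password"))
    .mode("append")
    .save()
)
```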
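And for question 4, this is the kind of schema enforcement we currently apply in the intermediate layer (schema and paths are illustrative):

```python
from pyspark.sql.types import (
    StructType, StructField, StringType, TimestampType, DecimalType,
)

expected_schema = StructType([
    StructField("order_id", StringType(), nullable=False),
    StructField("event_ts", TimestampType(), nullable=True),
    StructField("amount", DecimalType(18, 4), nullable=True),
])

# Enforce on read so malformed input surfaces here, not at Hyperscale load time.
df = spark.read.schema(expected_schema).parquet("/mnt/landing/orders")

# Delta enforces the existing table schema on append by default; a mismatched
# frame raises an AnalysisException instead of silently changing columns.
df.write.format("delta").mode("append").save("/mnt/staging/orders")
```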