Issue with Oracle Connector Ver. 2.0 in ADF writing to Parquet files

Kelman David 20 Reputation points
2025-06-06T01:44:17.1966667+00:00

As recommended, I have updated the Oracle connectors in our ADF linked services from version 1.0 to 2.0, since version 1.0 will be deprecated by 31 July 2025.

Since switching to version 2.0, we have been getting errors when saving files in Parquet format. Below is the error we get in ADF:

"Operation on target Load Source to Inbound failed: ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.ArrayIndexOutOfBoundsException:255

total entry:1

com.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java:107)

.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'"

Switching the sink format to CSV does NOT throw this error. Switching back to version 1.0 also does not throw the error.

To debug, I'm using a very small Oracle table with just 3 rows, which still results in this error.

Oracle schema of this table is as follows:

Name                   Null?     Type
---------------------- --------- -------------
UNIQUE_REC_ID          NOT NULL  NUMBER
WEBMANIFEST_DIRECTION  NOT NULL  VARCHAR2(50)
ADDED_DATE             NOT NULL  TIMESTAMP(6)

Is this a known issue that will be resolved soon? I would ideally not have to change our sink format to CSV or another format.


4 answers

  1. Maryia T 6 Reputation points
    2025-06-09T19:53:56.6766667+00:00

    We have observed the same behavior after switching to the Oracle v2 connector. After some investigation, it seems that setting supportV1DataTypes to true at the Oracle linked service level has solved the issue for us.

    We will do additional testing, but at least the end-to-end pipeline is no longer failing, without any changes at the table or query level.
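
    For reference, here is a minimal, illustrative sketch of where that flag sits in the linked service JSON. Only supportV1DataTypes comes from the workaround above; the surrounding property names and placeholder values follow the general ADF linked service shape and may not match your setup exactly, so compare it with the JSON that your own ADF authoring UI generates.

    {
        "name": "OracleLinkedService",
        "properties": {
            "type": "Oracle",
            "version": "2.0",
            "typeProperties": {
                "server": "<host>:<port>/<service name>",
                "authenticationType": "Basic",
                "username": "<user name>",
                "password": { "type": "SecureString", "value": "<password>" },
                "supportV1DataTypes": true
            }
        }
    }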

    1 person found this answer helpful.

  2. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.



  3. Amira Bedhiafi 32,756 Reputation points Volunteer Moderator
    2025-06-09T13:52:28.4566667+00:00

    Hello David!

    Thank you for posting on Microsoft Learn.

    What you are dealing with is a known issue in ADF, related to how Oracle NUMBER columns without explicit precision and scale are handled during conversion to Parquet, which has stricter requirements on schema typing, particularly for decimal fields.

    When Oracle NUMBER columns do not have an explicitly defined precision and scale, the Java-based Parquet writer may fall back to incorrect or unsafe defaults, especially when trying to fit large or undefined-scale decimals into Parquet's required DECIMAL(precision, scale) format.

    In your source query (inside the ADF copy activity), you can cast each NUMBER column to a defined precision and scale:

    SELECT
      CAST(UNIQUE_REC_ID AS NUMBER(18,0)) AS UNIQUE_REC_ID,
      WEBMANIFEST_DIRECTION,
      ADDED_DATE
    FROM your_table
    

    Alternatively, you can use a Data Flow with a derived column to cast the columns to fixed types.


  4. Shraddha Pore 445 Reputation points Microsoft External Staff Moderator
    2025-06-12T13:30:59.07+00:00

    Hi Kelman David, thank you for raising this concern. I understand your reasoning and your frustration.

    • Yes, the problems you’re running into with Oracle NUMBER fields and Parquet output are tied specifically to the newer v2.0 Oracle connector in Azure Data Factory (ADF). The older v1.0 connector was more forgiving when it came to handling Oracle's flexible NUMBER data types. It often allowed the data to flow through without requiring strict precision or casting, even when writing to Parquet.
    • With v2.0, Microsoft has introduced stricter type enforcement. This means that when a column doesn't have a clearly defined precision or scale in Oracle, it can trigger errors during the write to Parquet, because Parquet requires that decimals conform to a fixed precision (maximum 38 digits). These stricter checks are part of broader changes aimed at improving data integrity and compatibility with secure standards such as TLS 1.3. So yes, the issues you're now seeing didn't exist with v1.0 and are new to v2.0.

    Why would Microsoft release a connector that breaks functionality?

    That's a valid concern, and many teams have asked the same. Microsoft's goal with v2.0 wasn't to break things; it was to modernize the connector by aligning it with newer platform and security standards. This includes better performance, improved handling of secure connections, and a more consistent mapping of data types. Unfortunately, this also meant tightening the rules around how data types like NUMBER are handled, which has introduced breaking changes for existing pipelines, especially those that rely on implicit conversions when writing to formats like Parquet.

    Could you clarify, please, what the v1.0 connector being deprecated actually means?

    Microsoft has officially announced that Oracle connector v1.0 will no longer receive feature updates after July 31, 2025, and will be fully unsupported by October 31, 2025. After that point, the connector is not just unsupported; it may actually be removed entirely, meaning your pipelines that depend on it could fail or be blocked from running (see "Connector release stages and timelines" in the Azure Data Factory documentation on Microsoft Learn).

    So, "deprecation" here means:
    • No more updates or fixes after July 2025.
    • No guarantee it will keep working after October 2025.
    • Microsoft might remove it completely, especially if security issues arise.

    Since manually rewriting SQL queries to cast every NUMBER column isn’t realistic for 150+ tables, here are some practical alternatives:

    • Use a more flexible file format temporarily: Instead of writing directly to Parquet, consider using Avro or even CSV as a staging format. These formats are more tolerant of Oracle’s flexible number types and can later be converted to Parquet with a dedicated transformation step (e.g., in Synapse, Data Flow, or Databricks).
    • Automate the casting logic: You could query Oracle's metadata (ALL_TAB_COLUMNS) to identify NUMBER columns that lack precision, and automatically inject CAST(... AS NUMBER(18,0)) into your generated queries. This can be scripted with Python or another tool integrated into your pipeline configuration logic; a sketch of the metadata query is shown below.
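
    As a rough illustration of that approach, the Oracle SQL below builds a casted SELECT list for one table from ALL_TAB_COLUMNS. The schema and table names and the NUMBER(18,0) target are placeholders to adapt; treat it as a sketch to run per table (for example from a script or a Lookup activity), not a drop-in solution.

    -- Build a SELECT list that casts unconstrained NUMBER columns to NUMBER(18,0)
    -- and leaves every other column untouched.
    SELECT LISTAGG(
             CASE
               WHEN data_type = 'NUMBER' AND data_precision IS NULL
                 THEN 'CAST(' || column_name || ' AS NUMBER(18,0)) AS ' || column_name
               ELSE column_name
             END,
             ', '
           ) WITHIN GROUP (ORDER BY column_id) AS select_list
    FROM   all_tab_columns
    WHERE  owner = 'YOUR_SCHEMA'
      AND  table_name = 'YOUR_TABLE';

    The resulting string can then be concatenated into "SELECT <select_list> FROM YOUR_SCHEMA.YOUR_TABLE" and used as the copy activity's source query.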

    Sharing the link for [feedback] so that it gets highlighted in the Microsoft forum.

    Please do not forget to click "Accept the Answer" and "Yes" wherever the information provided helps you; this can be beneficial to other community members.

    If you have any other questions or are still running into issues, let me know in the comments and I would be happy to help.

