Issue with Oracle Connector Ver. 2.0 in ADF writing to Parquet files

Kelman David 20 Reputation points
2025-06-06T01:44:17.1966667+00:00

As recommended, I have updated the Oracle connectors in our ADF linked services from version 1.0 to 2.0, since version 1.0 will be deprecated by 31 July 2025.

Since switching to version 2.0, we have been getting errors when saving files in Parquet format. Below is the error we get in ADF:

"Operation on target Load Source to Inbound failed: ErrorCode=ParquetJavaInvocationException,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=An error occurred when invoking java, message: java.lang.ArrayIndexOutOfBoundsException:255

total entry:1

com.microsoft.datatransfer.bridge.parquet.ParquetWriterBuilderBridge.addDecimalColumn(ParquetWriterBuilderBridge.java:107)

.,Source=Microsoft.DataTransfer.Richfile.ParquetTransferPlugin,''Type=Microsoft.DataTransfer.Richfile.JniExt.JavaBridgeException,Message=,Source=Microsoft.DataTransfer.Richfile.HiveOrcBridge,'"

Switching the sink format to CSV does NOT throw this error. Switching back to version 1.0 also does not throw the error.

To debug, I'm using a very small Oracle table with just 3 rows, which still results in this error.

Oracle schema of this table is as follows:

Name                   Null?     Type
---------------------- --------- -------------
UNIQUE_REC_ID          NOT NULL  NUMBER
WEBMANIFEST_DIRECTION  NOT NULL  VARCHAR2(50)
ADDED_DATE             NOT NULL  TIMESTAMP(6)

Is this a known issue that will be resolved soon? I would ideally not have to change our sink format to CSV or another format.


4 answers

  1. Maryia T 6 Reputation points
    2025-06-09T19:53:56.6766667+00:00

    We have observed the same behavior after switching to the Oracle v2 connector. After some investigation, it seems that setting supportV1DataTypes to true at the Oracle linked service level has solved the issue for us.

    We will do additional testing, but at least the end-to-end pipeline is no longer failing, without any changes at the table or query level.
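
    For reference, here is a minimal, illustrative sketch of where that flag sits in the linked service JSON. Only supportV1DataTypes comes from the workaround above; the surrounding property names and placeholder values follow the general ADF linked service shape and may not match your setup exactly, so compare it with the JSON that your own ADF authoring UI generates.

    {
        "name": "OracleLinkedService",
        "properties": {
            "type": "Oracle",
            "version": "2.0",
            "typeProperties": {
                "server": "<host>:<port>/<service name>",
                "authenticationType": "Basic",
                "username": "<user name>",
                "password": { "type": "SecureString", "value": "<password>" },
                "supportV1DataTypes": true
            }
        }
    }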

    1 person found this answer helpful.

  2. Deleted

    This answer has been deleted due to a violation of our Code of Conduct. The answer was manually reported or identified through automated detection before action was taken. Please refer to our Code of Conduct for more information.



  3. Amira Bedhiafi 32,756 Reputation points Volunteer Moderator
    2025-06-09T13:52:28.4566667+00:00

    Hello David!

    Thank you for posting on Microsoft Learn.

    What you are dealing with is a known issue in ADF, related to how Oracle NUMBER columns without explicit precision and scale are handled during conversion to Parquet, which has stricter requirements on schema typing, particularly for decimal fields.

    When Oracle NUMBER columns do not have an explicitly defined precision and scale, the Java-based Parquet writer may fall back to incorrect or unsafe defaults, especially when trying to fit large or undefined-scale decimals into Parquet's required DECIMAL(precision, scale) format.

    In your source query (inside the ADF copy activity), you can cast each NUMBER column to a defined precision and scale:

    SELECT
      CAST(UNIQUE_REC_ID AS NUMBER(18,0)) AS UNIQUE_REC_ID,
      WEBMANIFEST_DIRECTION,
      ADDED_DATE
    FROM your_table
    

    Alternatively, you can use a Data Flow with a derived column to cast the columns to fixed types.


  4. Shraddha Pore 445 Reputation points Microsoft External Staff Moderator
    2025-06-12T13:30:59.07+00:00

    Hi Kelman David, thank you for raising this concern. I understand your reasoning and your frustration.

    • Yes, the problems you’re running into with Oracle NUMBER fields and Parquet output are tied specifically to the newer v2.0 Oracle connector in Azure Data Factory (ADF). The older v1.0 connector was more forgiving when it came to handling Oracle's flexible NUMBER data types. It often allowed the data to flow through without requiring strict precision or casting, even when writing to Parquet.
    • With v2.0, Microsoft has introduced stricter type enforcement. This means that when a column doesn't have a clearly defined precision or scale in Oracle, it can trigger errors during the write to Parquet, because Parquet requires that decimals conform to a fixed precision (maximum 38 digits). These stricter checks are part of broader changes aimed at improving data integrity and compatibility with secure standards such as TLS 1.3. So yes, the issues you're now seeing didn't exist with v1.0 and are new to v2.0.

    Why would Microsoft release a connector that breaks functionality?

    That's a valid concern, and many teams have asked the same. Microsoft's goal with v2.0 wasn't to break things; it was to modernize the connector by aligning it with newer platform and security standards. This includes better performance, improved handling of secure connections, and a more consistent mapping of data types. Unfortunately, this also meant tightening the rules around how data types like NUMBER are handled, which has introduced breaking changes for existing pipelines, especially those that rely on implicit conversions when writing to formats like Parquet.

    Could you clarify, please, what the v1.0 connector being deprecated actually means?

    Microsoft has officially announced that Oracle connector v1.0 will no longer receive feature updates after July 31, 2025, and will be fully unsupported by October 31, 2025. After that point, the connector is not just unsupported; it may actually be removed entirely, meaning your pipelines that depend on it could fail or be blocked from running (see "Connector release stages and timelines" in the Azure Data Factory documentation on Microsoft Learn).

    So, "deprecation" here means:
    • No more updates or fixes after July 2025.
    • No guarantee it will keep working after October 2025.
    • Microsoft might remove it completely, especially if security issues arise.

    Since manually rewriting SQL queries to cast every NUMBER column isn’t realistic for 150+ tables, here are some practical alternatives:

    • Use a more flexible file format temporarily: Instead of writing directly to Parquet, consider using Avro or even CSV as a staging format. These formats are more tolerant of Oracle’s flexible number types and can later be converted to Parquet with a dedicated transformation step (e.g., in Synapse, Data Flow, or Databricks).
    • Automate the casting logic: You could query Oracle's metadata (ALL_TAB_COLUMNS) to identify NUMBER columns that lack precision, and automatically inject CAST(... AS NUMBER(18,0)) into your generated queries. This can be scripted with Python or another tool integrated into your pipeline configuration logic; a sketch of the metadata query is shown below.
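
    As a rough illustration of that approach, the Oracle SQL below builds a casted SELECT list for one table from ALL_TAB_COLUMNS. The schema and table names and the NUMBER(18,0) target are placeholders to adapt; treat it as a sketch to run per table (for example from a script or a Lookup activity), not a drop-in solution.

    -- Build a SELECT list that casts unconstrained NUMBER columns to NUMBER(18,0)
    -- and leaves every other column untouched.
    SELECT LISTAGG(
             CASE
               WHEN data_type = 'NUMBER' AND data_precision IS NULL
                 THEN 'CAST(' || column_name || ' AS NUMBER(18,0)) AS ' || column_name
               ELSE column_name
             END,
             ', '
           ) WITHIN GROUP (ORDER BY column_id) AS select_list
    FROM   all_tab_columns
    WHERE  owner = 'YOUR_SCHEMA'
      AND  table_name = 'YOUR_TABLE';

    The resulting string can then be concatenated into "SELECT <select_list> FROM YOUR_SCHEMA.YOUR_TABLE" and used as the copy activity's source query.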

    Sharing the link for [feedback] so that it gets highlighted in the Microsoft forum.

    Please do not forget to click "Accept the Answer" and "Yes" wherever the information provided helps you; this can be beneficial to other community members.

    If you have any other questions or are still running into issues, let me know in the comments and I would be happy to help.

