Note
Access to this page requires authorization. You can try signing in or changing directories.
Access to this page requires authorization. You can try changing directories.
This article shows you how to add a PostgreSQL Database Change Data Capture (CDC) source to an eventstream.
The PostgreSQL Database Change Data Capture (CDC) source connector for Microsoft Fabric event streams allows you to capture a snapshot of the current data in a PostgreSQL database. Currently, PostgreSQL Database Change Data Capture (CDC) is supported from the following services where the databases can be accessed publicly:
- Azure Database for PostgreSQL
- Amazon RDS for PostgreSQL
- Amazon Aurora PostgreSQL
- Google Cloud SQL for PostgreSQL
Once the PostgreSQL Database CDC source is added to the eventstream, it captures row-level changes to the specified tables. These changes can then be processed in real-time and sent to different destinations for further analysis.
Note
This source is not supported in the following regions of your workspace capacity: West US3, Switzerland West.
Prerequisites
Access to a workspace in the Fabric capacity license mode (or) the Trial license mode with Contributor or higher permissions.
Registered user access in the PostgreSQL database.
Your PostgreSQL database must be publicly accessible and not be behind a firewall or secured in a virtual network.
CDC enabled in the PostgreSQL database and tables.
If you have Azure Database for PostgreSQL, follow the steps in the next section to enable CDC. For detailed information, see Logical replication and logical decoding - Azure Database for PostgreSQL - Flexible Server.
For other PostgreSQL databases, see Debezium connector for PostgreSQL :: Debezium Documentation.
If you don't have an eventstream, create an eventstream.
Enable CDC in your PostgreSQL Database
This section uses Azure Database for PostgreSQL as an example.
To enable CDC in your Azure Database for PostgreSQL Flexible Server, follow these steps:
On your Azure Database for PostgreSQL Flexible Server page in the Azure portal, select Server parameters in the navigation menu.
On the Server parameters page:
- Set wal_level to logical.
- Update the max_worker_processes to at least 16.
Save the changes and restart the server.
Confirm that your Azure Database for PostgreSQL Flexible Server instance allows public network traffic.
Grant the admin user replication permissions by running the following SQL statement. If you want to use other user account to connect your PostgreSQL DB to fetch CDC, ensure the user is the table owner.
ALTER ROLE <admin_user_or_table_owner_user> WITH REPLICATION;
Launch the Select a data source wizard
If you haven't added any source to your eventstream yet, select Use external source tile.
If you're adding the source to an already published eventstream, switch to Edit mode, select Add source on the ribbon, and then select External sources.
On the Select a data source page, search for and select Connect on the PostgreSQL DB (CDC) tile.
Configure and connect to PostgreSQL Database CDC
On the Connect page, select New connection.
In the Connection settings section, enter the following information.
Server: The server address of your PostgreSQL database, for example my-pgsql-server.postgres.database.azure.com.
Database: The database name, for example my_database.
Connection name: Enter a name for the connection.
Authentication kind, Select Basic and enter your Username and Password for the database.
Note
Currently, Fabric event streams support only Basic authentication.
Select Connect to complete the connection settings.
Port: Enter the port number of your server. Default value is 5432. If your selected cloud connection is configured in Manage connections and gateways, ensure that the port number matches the one set there. If they don't match, the port number in cloud connection in Manage connections and gateways take precedence.
You can choose between two options when capturing changes from database tables:
- All tables: Capture changes from every table in the database.
- Enter table name(s): Allows you to specify a subset of tables using a comma-separated list. You may use either: full table identifiers in the format
schemaName.tableName
or valid regular expressions. Examples: dbo.test.*
: Select all tables whose names start withtest
in thedbo
schema.dbo\.(test1|test2)
: Selectdbo.test1
anddbo.test2
.
You can combine both formats in the list. The total character limit for the entire entry is 102,400 characters.
Slot name (optional): Enter the name of the PostgreSQL logical decoding slot that was created for streaming changes from a particular plug-in for a particular database/schema. The server uses this slot to stream events to Eventstream streaming connector. It must contain only lowercase letters, numbers, and underscores.
- If not specified, a GUID is used to create the slot, requiring the appropriate database permissions.
- If a specified slot name exists, the connector uses it directly.
You may expand Advanced settings to access additional configuration options for the PostgreSQL Database CDC source:
Publication name: Specifies the name of the PostgreSQL logical replication publication to use. This must match an existing publication in the database, or it will be created automatically depending on the autocreate mode. Default value:
dbz_publication
.Note
The connector user must have superuser permissions to create the publication. It's recommended to create the publication manually before starting the connector for the first time to avoid permission-related issues.
Publication auto-create mode: Controls whether and how the publication is automatically created. Options include:
Filtered
(default): If the specified publication doesn't exist, the connector creates one that includes only the selected tables (as specified in the table include list).AllTables
: If the specified publication exists, the connector uses it. If it doesn't exist, the connector creates one that includes all tables in the database.Disabled
: The connector doesn't create a publication. If the specified publication is missing, the connector throws an exception and stops. In this case, the publication must be manually created in the database.
For more information, see the Debezium documentation on publication autocreate mode
Decimal handling mode: Specifies how the connector handles PostgreSQL
DECIMAL
andNUMERIC
column values:Precise
: Represents values using exact decimal types (for example, JavaBigDecimal
) to ensure full precision and accuracy in data representation.Double
: Converts values to double-precision floating-point numbers. This improves usability and performance but may result in a loss of precision.String
: Encodes values as formatted strings. This makes them easy to consume in downstream systems but loses semantic information about the original numeric type.
On the Review + connect page, review the summary, and then select Add.
Note
The maximum number of sources and destinations for one eventstream is 11.
View updated eventstream
You can see the PostgreSQL Database CDC source added to your eventstream in Edit mode.
To implement this newly added PostgreSQL DB CDC source, select Publish. After you complete these steps, your PostgreSQL DB CDC source is available for visualization in the Live view.
Related content
Other connectors: