Caveats
Only Iceberg V2 tables are supported
This connector writes data to Iceberg tables using the V2 specification. To optimize write performance, delete events are recorded in delete files, avoiding costly data file rewrites. This significantly improves write performance but can degrade read performance, especially in upsert mode. In append mode, this trade-off does not apply.
No automatic schema evolution
Full schema evolution, such as converting between incompatible data types, is not currently supported. However, schema expansion, including adding new fields or widening existing field data types, is supported. To enable this behavior, set the debezium.sink.iceberg.allow-field-addition configuration property to true.
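As a minimal sketch, assuming the connector is configured through application.properties (the file named in the AWS S3 credentials section), the setting could look like this:

```properties
# Allow schema expansion: new fields and widened field types are applied to the table.
debezium.sink.iceberg.allow-field-addition=true
```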
Replicating specific tables
By default, the Debezium connector replicates all tables in the database, which can result in unnecessary load. To avoid replicating tables you don't need, set the debezium.source.table.include.list property to the exact tables to replicate. This streamlines your data pipeline and reduces overhead. For more details on this configuration, refer to the Debezium server source documentation.
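For illustration, a fragment restricting replication to two tables might look like the following. The table names are hypothetical, and the exact identifier format (for example schema.table or database.table) depends on the source connector in use, so check the Debezium source documentation for your database.

```properties
# Hypothetical table names; replace with the fully-qualified tables you want to replicate.
debezium.source.table.include.list=inventory.customers,inventory.orders
```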
AWS S3 credentials
You can set up AWS credentials in the following ways:
- Option 1: use debezium.sink.iceberg.fs.s3a.access.key and debezium.sink.iceberg.fs.s3a.secret.key in application.properties
- Option 2: inject credentials into the environment variables AWS_ACCESS_KEY and AWS_SECRET_ACCESS_KEY
- Option 3: set HADOOP_HOME in the environment, then add the s3a configuration to core-site.xml. More information can be found here.
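A minimal sketch of Option 1 is shown below. The values are placeholders, not real credentials, and should be replaced with your own keys.

```properties
# Placeholder values; substitute your own AWS credentials.
debezium.sink.iceberg.fs.s3a.access.key=<your-access-key>
debezium.sink.iceberg.fs.s3a.secret.key=<your-secret-key>
```

Keeping secrets out of version-controlled files is usually preferable, which is one reason to consider the environment-variable option instead.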