Caveats
Equality Delete Performance
This connector writes data to Iceberg tables using the V2 specification. To optimize write performance, delete events are recorded in delete files, avoiding costly data file rewrites. While this approach significantly improves write performance, it can impact read performance, especially in upsert
mode. However, in append
mode, this performance trade-off is not applicable.
To optimize read performance end user must run periodic table maintenance job. do data compaction and rewrite the delete
filed. This is especially critical for upsert
mode.
Schema Evolution (Only Schema Expansion is supported)
Full schema evolution, such as converting incompatible data types, is not currently supported. However, schema
expansion, including adding new fields or expanding existing field data types, is supported. To enable this behavior,
set the
debezium.sink.iceberg.allow-field-addition
configuration property to true
.
To seamlessly absorb schema changes, consider using this setting. All nested data will be stored in a variant fields.
# Store nested data in variant fields
debezium.sink.iceberg.nested-as-variant=true
# Ensure event flattening is disabled (flattening is the default behavior)
debezium.transforms=,
Specific tables replication
By default, the Debezium connector will replicate all the tables in the database, resulting in unnecessary load. To avoid replicating tables you don't need, configure the debezium.source.table.include.list
property to specify the exact tables to replicate. This will streamline your data pipeline and reduce the overhead. For more details on this configuration, refer to the Debezium server source documentation.
AWS S3 credentials
You can setup aws credentials in the following ways:
- Option 1: use
debezium.sink.iceberg.fs.s3a.access.key
anddebezium.sink.iceberg.fs.s3a.secret.key
inapplication.properties
- Option 2: inject credentials to environment variables
AWS_ACCESS_KEY
andAWS_SECRET_ACCESS_KEY
- Option 3: setup proper
HADOOP_HOME
env then add s3a configuration intocore-site.xml
, more information can be found here.