Iceberg Releases | Apache Iceberg 0.12.1 Released

Apache Iceberg 0.12.1 was released on November 8, 2021.

Apache Iceberg is an open table format for large analytic datasets. Iceberg provides a high-performance SQL table format for multiple compute engines, including Spark, Trino, PrestoDB, Flink, and Hive.

Apache Kyuubi (Incubating), which is built on Apache Spark, supports all three of the major data lake table formats, including Iceberg.

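The spark3-runtime Jar provided in this release supports only Spark 3.0 and 3.1. The next major release will bring Spark 3.2 support; in the meantime, you can try Spark 3.2 support early by using the nightly builds from the master branch.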
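As a rough illustration of the version constraint above, the following sketch builds Spark configuration entries that pull in the 0.12.1 spark3-runtime Jar and rejects Spark versions outside 3.0 and 3.1. The helper name `iceberg_spark_conf` is hypothetical, and the choice of a Hive-backed session catalog is an assumption; adjust the catalog settings to your environment.

```python
# Maven coordinate of the Iceberg Spark 3 runtime for this release.
ICEBERG_RUNTIME = "org.apache.iceberg:iceberg-spark3-runtime:0.12.1"

# Spark minor versions that the 0.12.1 spark3-runtime Jar supports.
SUPPORTED_SPARK = ("3.0", "3.1")


def iceberg_spark_conf(spark_version: str) -> dict:
    """Return Spark conf entries for Iceberg 0.12.1 (hypothetical helper).

    Raises ValueError when the Spark version is not 3.0.x or 3.1.x.
    """
    major_minor = ".".join(spark_version.split(".")[:2])
    if major_minor not in SUPPORTED_SPARK:
        raise ValueError(
            f"Spark {spark_version} is not supported by {ICEBERG_RUNTIME}"
        )
    return {
        "spark.jars.packages": ICEBERG_RUNTIME,
        "spark.sql.extensions": (
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
        ),
        # Assumption: wrap the built-in session catalog and back it with Hive.
        "spark.sql.catalog.spark_catalog": (
            "org.apache.iceberg.spark.SparkSessionCatalog"
        ),
        "spark.sql.catalog.spark_catalog.type": "hive",
    }


if __name__ == "__main__":
    for key, value in iceberg_spark_conf("3.1.2").items():
        print(f"--conf {key}={value}")
```

Each returned entry can be passed to `spark-submit` or `spark-sql` as a `--conf key=value` argument, or set on a `SparkSession` builder.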

Apache Iceberg 0.12.1 includes the following major bug fixes and improvements:

#3264 fixes validation failures that occurred after snapshot expiration when writing Flink CDC streams to Iceberg tables.

#3264 fixes reading projected map columns from Parquet files written before Parquet 1.11.1.

#3195 allows validating that commits that produce row-level deltas don't conflict with concurrently added files. This ensures that users can maintain serializable isolation for update and delete operations, including merge operations.

#3199 allows validating that commits that overwrite files don't conflict with concurrently added files. This ensures that users can maintain serializable isolation for overwrite operations.

#3135 fixes equality-deletes using DATE, TIMESTAMP, and TIME types.

#3078 prevents the JDBC catalog from overwriting the jdbc.user property if any property called user exists in the environment.

#3035 fixes drop namespace calls with the DynamoDB catalog.

#3273 fixes importing Avro files via add_files by correctly setting the number of records.

#3332 fixes importing ORC files with float or double columns in add_files.
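For readers unfamiliar with the add_files procedure mentioned in the last two items, here is a minimal sketch of how such a call might be assembled for Spark SQL. The helper `build_add_files_call`, the catalog name `spark_catalog`, the table `db.events`, and the path `/data/events_orc` are all illustrative assumptions; only the `CALL ... system.add_files(table => ..., source_table => ...)` form comes from the Iceberg Spark procedures.

```python
def build_add_files_call(catalog: str, table: str, file_format: str, path: str) -> str:
    """Build a Spark SQL CALL statement for Iceberg's add_files procedure.

    Hypothetical helper: it only formats the statement and does not run it.
    """
    if file_format not in {"avro", "orc", "parquet"}:
        raise ValueError(f"unsupported file format: {file_format}")
    # A directory of files is referenced as `format`.`path` in source_table.
    source = f"`{file_format}`.`{path}`"
    return (
        f"CALL {catalog}.system.add_files("
        f"table => '{table}', "
        f"source_table => '{source}')"
    )


# Example values are placeholders, not taken from the release notes.
stmt = build_add_files_call("spark_catalog", "db.events", "orc", "/data/events_orc")
print(stmt)
```

With an active SparkSession that has the Iceberg SQL extensions enabled, the resulting string could be passed to `spark.sql(stmt)`. The target table must already exist as an Iceberg table.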