Review the list of Iceberg issues that are resolved in Cloudera Runtime 7.3.2, its service packs and cumulative hotfixes.
Cloudera Runtime 7.3.2
Cloudera Runtime 7.3.2 resolves Iceberg issues and incorporates fixes
from the service packs and cumulative hotfixes from 7.3.1.100 through 7.3.1.706. For a
comprehensive record of all fixes in Cloudera Runtime 7.3.1.x, see
Fixed Issues.
- CDPD-78686: Iceberg tables created in 7.2.17 are not captured in
Atlas when using 7.2.18 or 7.3.1 Atlas server
- 7.3.2
- This issue occurred due to incompatibility between Data
Services, Hive, and Impala hooks in 7.2.17 and the Atlas server in 7.2.18 and 7.3.1.
This fix resolves the compatibility issue and Iceberg tables created in 7.2.17 are
now correctly captured and displayed in the Atlas UI in later versions.
- CDPD-97171: Concurrency issues between compaction and concurrent
write operations
- 7.3.2
- This fix resolves an issue where compaction conflicted
with concurrent write operations, causing data corruption. Concurrency handling is
improved to ensure stable operations and data consistency.
- Apache Jira: HIVE-29437
- CDPD-74040: Dropping Iceberg table with complex type containing
timestamp fails
- 7.3.2
- This fix resolves an issue where dropping an Iceberg
table with complex types such as array, map, or struct containing timestamp fields
failed due to unsupported Hive timestamp type handling. The types are now converted
properly, so the table is dropped successfully.
- CDPD-89402: Event processing invalidates Iceberg tables due to
reload failures
- 7.3.2
- This fix resolves an issue where
CatalogServiceCatalog.reloadTableIfExists() resulted in a
ClassCastException during event processing, invalidating Iceberg
tables and triggering full table reloads instead of incremental loading.
- Apache Jira: IMPALA-14358
- CDPD-72383: COUNT(*) optimization returns incorrect results in
UNION queries on Iceberg V2 tables
- 7.3.2
- A faulty COUNT(*) query optimization caused incorrect
results in UNION queries on Iceberg V2 tables that have delete files, leading to
inconsistent query outputs.
The fix corrects the optimization logic so that UNION queries return accurate
results, including after the table data changes.
- Apache Jira: IMPALA-13756, IMPALA-13249
- CDPD-81076: LEFT ANTI JOIN fails on Iceberg V2 tables with delete files
- 7.3.2
- Queries using a LEFT ANTI JOIN fail with an
AnalysisException if the right-side table is an Iceberg V2 table
containing delete files. For example, consider the following query:
SELECT * FROM table_a a
LEFT ANTI JOIN iceberg_v2_table b
ON a.id = b.id;
The error Illegal column/field reference 'b.input_file_name' of semi-/anti-joined
table 'b' is displayed because semi-joined tuples need to be explicitly made visible
for paths pointing inside them to be resolvable.
The fix updates the IcebergScanPlanner to ensure that the tuple containing the virtual
fields is made visible when it is semi-joined.
- Apache Jira: IMPALA-13888
- CDPD-78427: Enable MERGE statement for Iceberg tables with
equality deletes
- 7.3.2
- This patch fixes an issue that caused MERGE statements to fail on Iceberg
tables that use equality deletes. The failure occurred
because the delete expression calculation was missing the data sequence number, even
though the underlying data description included it. This mismatch caused row
evaluation to fail.
The fix ensures the data sequence number is correctly
included in the result expressions, allowing MERGE operations to
complete successfully on these tables.
- Apache Jira: IMPALA-13674
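To illustrate why the data sequence number matters here: under the Iceberg table specification, an equality delete applies only to data written with a strictly lower data sequence number. The sketch below shows that rule in isolation. It is not Impala's MERGE implementation; the `Row` and `EqualityDelete` classes are hypothetical, and partition matching is ignored for brevity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Row:
    """Hypothetical data row tagged with its file's data sequence number."""
    id: int
    data_sequence_number: int


@dataclass(frozen=True)
class EqualityDelete:
    """Hypothetical equality delete on the `id` column."""
    id: int
    data_sequence_number: int


def is_deleted(row, deletes):
    # A delete removes a row only if the key matches AND the row was
    # written before the delete (strictly lower sequence number).
    return any(
        d.id == row.id and row.data_sequence_number < d.data_sequence_number
        for d in deletes
    )


deletes = [EqualityDelete(id=1, data_sequence_number=2)]
old_row = Row(id=1, data_sequence_number=1)  # written before the delete
new_row = Row(id=1, data_sequence_number=3)  # re-inserted after the delete
```

Without the sequence number, `old_row` and `new_row` would be indistinguishable to the delete check, which is why the value has to be carried through to the expressions that evaluate each row.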
- CDPD-77773: Tolerate missing data files during Iceberg table
loading
- 7.3.2
- This fix addresses an issue where an Iceberg table would fail to
load completely if any of its data files were missing from the file system. This
TableLoadingException left the table in an incomplete state, blocking
all operations on it.
Impala now tolerates missing data files during the table loading
process. An exception is thrown only if a query subsequently attempts to read one
of the specific files that is missing.
This change allows other operations that do not depend on the missing data
(such as ROLLBACK, DROP PARTITION, or SELECT statements on valid partitions) to
execute successfully.
- Apache Jira: IMPALA-13654
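The change for CDPD-77773 amounts to moving the file-existence check from load time to read time. The following is a minimal sketch of that idea only, not Impala's code; the `Table` class, its `read_file` method, and the file paths are hypothetical.

```python
class MissingDataFileError(Exception):
    """Raised only when a query touches a file that is absent."""


class Table:
    """Hypothetical stand-in for a loaded Iceberg table."""

    def __init__(self, manifest_files, existing_files):
        # Loading records every file listed in the metadata and merely
        # notes which ones are absent; it does not fail.
        self.files = list(manifest_files)
        self.missing = {f for f in manifest_files if f not in existing_files}

    def read_file(self, path):
        # The error surfaces here, at read time, and only for the
        # specific file that is missing.
        if path in self.missing:
            raise MissingDataFileError(path)
        return f"rows from {path}"


table = Table(
    manifest_files=["p=1/a.parquet", "p=2/b.parquet"],
    existing_files={"p=1/a.parquet"},
)
```

In this sketch, reading `p=1/a.parquet` succeeds even though `p=2/b.parquet` is gone, which mirrors how queries on valid partitions keep working.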
- CDPD-78508: Skip reloading Iceberg tables when metadata JSON
file is the same
- 7.3.2
- This patch optimizes metadata handling for Iceberg tables,
particularly those that are updated frequently.
Previously, if an event processor was
lagging, Impala might receive numerous update events for the same table (for example,
100 events). Impala would attempt to reload the table 100 times, even if the table's
state was already up to date after processing the first event.
With this fix,
Impala compares the path of the incoming metadata JSON file with the one that is
currently loaded. If the metadata file location is the same, Impala skips the reload
because the loaded table is already up to date. This significantly reduces
unnecessary metadata processing.
- Apache Jira: IMPALA-13718
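The comparison described for CDPD-78508 can be sketched as follows. This is an illustrative model rather than Impala's catalog code; the `CachedTable` class, the `on_update_event` method, and the metadata paths are assumptions made for the example.

```python
class CachedTable:
    """Hypothetical cached table that tracks its loaded metadata file."""

    def __init__(self, metadata_location):
        self.metadata_location = metadata_location
        self.reload_count = 0

    def on_update_event(self, incoming_metadata_location):
        """Reload only if the event points at a different metadata file."""
        if incoming_metadata_location == self.metadata_location:
            return False  # same metadata JSON file: skip the reload
        self.metadata_location = incoming_metadata_location
        self.reload_count += 1
        return True


table = CachedTable("/warehouse/tbl/metadata/v1.metadata.json")
# A lagging event processor delivers 100 events that all resolve to
# the same, newer metadata file.
events = ["/warehouse/tbl/metadata/v2.metadata.json"] * 100
results = [table.on_update_event(e) for e in events]
```

Only the first event triggers a reload in this model; the remaining 99 are skipped because the loaded location already matches.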
- CDPD-82415: TABLESAMPLE clause of the COMPUTE STATS statement has no effect on Iceberg tables
- 7.3.2
- This fix resolves a regression introduced by IMPALA-13737. For example, before the fix
the following query scanned the entire Iceberg table to calculate statistics, whereas it
should use only about 10% of the data:
COMPUTE STATS t TABLESAMPLE SYSTEM(10);
This fix introduces proper table sampling logic for Iceberg tables, which
can be used for COMPUTE STATS. The sampling algorithm previously
located in IcebergScanNode.getFilesSample() has moved to
FeIcebergTable.Utils.getFilesSample().
- Apache Jira: IMPALA-14014
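As a rough illustration of what file-level sampling for statistics can look like, the sketch below picks random files until their combined size reaches the requested percentage of the table. It is a generic example and is not the algorithm in FeIcebergTable.Utils.getFilesSample(); the `sample_files` function, its parameters, and the file names are all hypothetical.

```python
import random


def sample_files(file_sizes, percent, seed=0):
    """Pick random files until about `percent` of the total bytes is covered."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    target_bytes = sum(file_sizes.values()) * percent / 100
    paths = sorted(file_sizes)
    # A fixed seed keeps the sample repeatable across runs.
    random.Random(seed).shuffle(paths)
    picked, picked_bytes = [], 0
    for path in paths:
        if picked_bytes >= target_bytes:
            break
        picked.append(path)
        picked_bytes += file_sizes[path]
    return picked


# 50 equally sized files; a 10% sample should cover 5 of them.
files = {f"data-{i:03d}.parquet": 100 for i in range(50)}
sample = sample_files(files, percent=10)
```

With equally sized files the sample contains exactly a tenth of them; with uneven sizes the selection overshoots the target by at most one file.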
- CDPD-85228: IllegalStateException with Iceberg table with DELETE
- 7.3.2
- Running a query on an Iceberg table fails with an
IllegalStateException error in the following scenario:
- The Iceberg table has delete files for every data file (no data files without
delete files) AND
- An anti-join operation is performed on the result of the Iceberg delete operation
(IcebergDeleteNode or HashJoinNode)
This fix resolves the issue by setting the TableRefIds of the
node corresponding to the Iceberg delete operation (IcebergDeleteNode or HashJoinNode)
to only the table reference associated with the data files, excluding the delete
files.
- Apache Jira: IMPALA-14154
- CDPD-87405: Error unnesting arrays in Iceberg tables with DELETE
files
- 7.3.2
- Unnesting a nested array (a 2D array) from an Iceberg table failed with the
following error when the table contained delete files for some, but not all, of its
data files:
Filtering an unnested collection that comes from a UNION [ALL] is not supported yet.
Reading an Iceberg table with this mix of data files and delete files creates a
UNION ALL node in the query execution plan. The system had a check
that explicitly blocked any filtering on an unnested array that comes from such a node.
This fix relaxes the validation check, allowing the operation to proceed if all UNION
operands share the same tuple IDs. This ensures the query can successfully unnest the
array.
- Apache Jira: IMPALA-14185
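The relaxed check for CDPD-87405 can be pictured as a simple comparison across the UNION operands. The sketch below is a simplified model, not the planner's actual validation code; the `can_filter_unnested_collection` function and the integer tuple IDs are hypothetical.

```python
def can_filter_unnested_collection(union_operand_tuple_ids):
    """Return True when every UNION operand exposes the same tuple IDs."""
    # Order within an operand is treated as irrelevant in this sketch.
    distinct = {frozenset(ids) for ids in union_operand_tuple_ids}
    return len(distinct) <= 1


# Both operands read the same table, so their tuple IDs match.
same_source = [[1, 2], [1, 2]]
# Operands from unrelated sources keep the original restriction.
different_sources = [[1, 2], [3, 4]]
```

In this model the first case is allowed to proceed, while the second is still rejected, which matches the intent of relaxing the check only for operands that share tuple IDs.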