Iceberg-related known issues in Cloudera Data Warehouse Private Cloud

This topic describes the Iceberg-related known issues in Cloudera Data Warehouse Private Cloud.

Known issues identified in 1.5.4

No new known issues identified in 1.5.4.

Known issues identified in 1.5.2

CDPD-59413: Unable to view Iceberg table metadata in Atlas
You may see the following exception in the Atlas application logs when you create an Iceberg table from the Cloudera Data Warehouse data service associated with a Cloudera Private Cloud Base 7.1.8 or 7.1.7 SP2 cluster: Type ENTITY with name iceberg_table does not exist. This happens because the Atlas server on Cloudera Private Cloud Base 7.1.8 and 7.1.7 SP2 does not include the functionality required to support Iceberg tables. This does not affect creating, querying, or modifying Iceberg tables using Cloudera Data Warehouse, nor does it affect creating policies in Ranger.

On Cloudera Private Cloud Base 7.1.9, Iceberg table entities are not created in Atlas. You can ignore the following error if it appears in the Atlas application logs: ERROR - [NotificationHookConsumer thread-1:] ~ graph rollback due to exception (GraphTransactionInterceptor:200) org.apache.atlas.exception.AtlasBaseException: invalid relationshipDef: hive_table_storagedesc: end type 1: hive_storagedesc, end type 2: iceberg_table
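When reviewing Atlas application logs, it can help to separate the two messages quoted above from everything else. The following is a minimal Python sketch of that idea, not a Cloudera tool: the function names are hypothetical, the match is a plain substring test on the message text quoted in this topic, and you must supply the log lines yourself because the log file location is not stated here.

```python
# Message fragments quoted in this topic (CDPD-59413).
KNOWN_ICEBERG_ATLAS_MESSAGES = (
    "Type ENTITY with name iceberg_table does not exist",
    "invalid relationshipDef: hive_table_storagedesc: "
    "end type 1: hive_storagedesc, end type 2: iceberg_table",
)


def is_known_iceberg_atlas_message(line):
    """Return True if the log line contains one of the messages quoted in this topic."""
    return any(fragment in line for fragment in KNOWN_ICEBERG_ATLAS_MESSAGES)


def split_log_lines(lines):
    """Split log lines into (known Iceberg-related messages, all other lines)."""
    known, other = [], []
    for line in lines:
        if is_known_iceberg_atlas_message(line):
            known.append(line)
        else:
            other.append(line)
    return known, other
```

Lines returned in the second list are not covered by this known issue and still need normal investigation.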

If you are on Cloudera Private Cloud Base 7.1.7 SP2 or 7.1.8, you can manually add the Iceberg model file 1130-iceberg_table_model.json to the /opt/cloudera/parcels/CDH/lib/atlas/models/1000-Hadoop directory as follows:
  1. SSH into the Atlas server host as an administrator.
  2. Change to the following directory:
    cd /opt/cloudera/parcels/CDH/lib/atlas/models/1000-Hadoop
  3. Create a file called 1130-iceberg_table_model.json with the following content:
    {
      "enumDefs": [],
      "structDefs": [],
      "classificationDefs": [],
      "entityDefs": [
        {
          "name": "iceberg_table",
          "superTypes": [
            "hive_table"
          ],
          "serviceType": "hive",
          "typeVersion": "1.0",
          "attributeDefs": [
            {
              "name": "partitionSpec",
              "typeName": "array<string>",
              "cardinality": "SET",
              "isIndexable": false,
              "isOptional": true,
              "isUnique": false
            }
          ]
        },
        {
          "name": "iceberg_column",
          "superTypes": [
            "hive_column"
          ],
          "serviceType": "hive",
          "typeVersion": "1.0"
        }
      ],
      "relationshipDefs": [
        {
          "name": "iceberg_table_columns",
          "serviceType": "hive",
          "typeVersion": "1.0",
          "relationshipCategory": "COMPOSITION",
          "relationshipLabel": "__iceberg_table.columns",
          "endDef1": {
            "type": "iceberg_table",
            "name": "columns",
            "isContainer": true,
            "cardinality": "SET",
            "isLegacyAttribute": true
          },
          "endDef2": {
            "type": "iceberg_column",
            "name": "table",
            "isContainer": false,
            "cardinality": "SINGLE",
            "isLegacyAttribute": true
          },
          "propagateTags": "NONE"
        }
      ]
    }
  4. Save the file and exit.
  5. Restart the Atlas service using Cloudera Manager.
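
A copy-and-paste error in the model file is easy to miss until Atlas is restarted. As a hedged illustration, the Python sketch below checks that the file you created parses as JSON and contains the top-level keys and entity names shown in the steps above. The function names are hypothetical, and passing these checks only shows that the file is structurally similar to the listing; it does not guarantee that Atlas accepts the model.

```python
import json

# Top-level keys present in the model file shown in the steps above.
EXPECTED_LIST_KEYS = (
    "enumDefs",
    "structDefs",
    "classificationDefs",
    "entityDefs",
    "relationshipDefs",
)
# Entity types defined by the model file shown in the steps above.
EXPECTED_ENTITY_NAMES = ("iceberg_table", "iceberg_column")


def check_iceberg_model(text):
    """Return a list of problems found in the model JSON text (empty if none)."""
    try:
        model = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(model, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    for key in EXPECTED_LIST_KEYS:
        if not isinstance(model.get(key), list):
            problems.append(f"missing or non-list key: {key}")

    entity_defs = model.get("entityDefs")
    if not isinstance(entity_defs, list):
        entity_defs = []
    # Collect the names of all well-formed entity definitions.
    found = {
        d["name"]
        for d in entity_defs
        if isinstance(d, dict) and isinstance(d.get("name"), str)
    }
    for name in EXPECTED_ENTITY_NAMES:
        if name not in found:
            problems.append(f"entityDefs has no entry named {name}")
    return problems


def check_iceberg_model_file(path):
    """Read the file at path and run check_iceberg_model on its contents."""
    with open(path, encoding="utf-8") as handle:
        return check_iceberg_model(handle.read())
```

For example, calling check_iceberg_model_file("1130-iceberg_table_model.json") from the models directory returns an empty list when the file passes these checks.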

Technical Service Bulletins

TSB 2024-745: Impala returns incorrect results for Iceberg V2 tables when optimized operator is being used in Cloudera Data Warehouse
Cloudera Data Warehouse customers who use Apache Impala (Impala) to read Apache Iceberg (Iceberg) V2 tables can get incorrect query results when the optimized V2 operator is used. The optimized V2 operator is enabled by default in the affected versions. The issue affects only Iceberg V2 tables that have position delete files.
Knowledge article

For the latest update on this issue, see the corresponding Knowledge Article: TSB 2024-745: Impala returns incorrect results for Iceberg V2 tables when optimized operator is being used in Cloudera Data Warehouse.
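
The conditions described in this bulletin can be written as a simple predicate. The Python sketch below is only an illustration of that logic, with hypothetical function and parameter names. It assumes you have already obtained the table's format version and the content codes of its files by some other means, and it uses the content codes from the Iceberg table specification (0 for data files, 1 for position delete files, 2 for equality delete files). It does not check whether your release is one of the affected versions; refer to the Knowledge Article for that.

```python
# Content code for position delete files in the Iceberg table specification.
POSITION_DELETES = 1


def matches_tsb_2024_745_conditions(format_version, file_content_codes,
                                    optimized_operator_enabled=True):
    """Return True if a table meets the conditions described in TSB 2024-745.

    format_version: Iceberg table format version (1 or 2).
    file_content_codes: iterable of content codes for the table's files.
    optimized_operator_enabled: whether the optimized V2 operator is in use
        (the bulletin states it is enabled by default in affected versions).
    """
    has_position_deletes = any(
        code == POSITION_DELETES for code in file_content_codes
    )
    return (
        format_version == 2
        and optimized_operator_enabled
        and has_position_deletes
    )
```

A V2 table with only data files, or any V1 table, does not meet these conditions.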