What's new in Cloudera Data Warehouse Public Cloud
Review the new features introduced in this release of Cloudera Data Warehouse service on Cloudera Public Cloud.
- Cloudera Data Warehouse features
- Cloudera Data Warehouse on Azure environments
- Cloudera Data Warehouse on AWS environments
- Iceberg
- Hue
- Technical Preview features
- Behavior changes
What's new in Cloudera Data Warehouse Public Cloud
- General availability of Virtual Warehouse and Database Catalog workload version selections
- The Cloudera Data Warehouse UI now lists the workload versions that are compatible with your cluster, so that you can select one during cluster installation. The Database Catalog list contains versions compatible with your Kubernetes version and your cluster environment (DWX version). The Virtual Warehouse list contains versions compatible with your Kubernetes version, your cluster environment (DWX version), and your Database Catalog version.
- General availability of Impala workload-aware autoscaling
- Workload-aware autoscaling allocates Impala Virtual Warehouse resources based on the workload that is running. You choose multiple executor group set sizes based on your workload requirements, instead of the single fixed executor group size of the previous autoscaling implementation. This feature is now generally available. See Workload Aware Auto-Scaling in Impala.
- Improved Impala Autoscaling Dashboard
- You can now use the new Impala Autoscaling Dashboard to monitor Impala autoscaling in a warehouse that uses either workload-aware autoscaling or regular autoscaling. To access the dashboard, open the Web UI tab on the Virtual Warehouse Details page and click the Impala Autoscaler Web UI option. See About the Impala Autoscaling Dashboard.
- Ability to forward Prometheus metrics from Cloudera Data Warehouse to an external endpoint
- In this release, you can configure Prometheus in Cloudera Data Warehouse to push its metrics to an external endpoint, such as Prometheus, Grafana, Thanos, or some other endpoint. See Forwarding Prometheus metrics from Cloudera Data Warehouse to an endpoint.
- Automatically backing up and restoring Cloudera Data Warehouse
- This release adds more automation to the backup and restore procedures for AWS and Azure environments and clarifies the documentation of the automatic, semi-automatic, and manual procedures.
To get the supported Kubernetes version for this release, back up your old AWS or Azure environment and start a new environment using the restore process. The backup and restore feature saves your environment parameters, making it possible to recreate your environment with the same settings, URL, and connection strings you used in your previous environment.
- Ability to configure Impala Statestore high availability
- You can now configure high availability for Impala Statestore pods in a Virtual Warehouse, with active and passive modes ensuring continuity and reliability during failovers. See Configuring Impala Statestore high availability.
- Downloading the UDF development package from Cloudera Data Warehouse UI
- You can now download the Impala UDF development package directly from the Cloudera Data Warehouse UI. See Building and deploying UDFs.
- PostgreSQL replaces SQLite database for Grafana in Cloudera Data Warehouse Public Cloud
- The file-based SQLite database for Grafana has been replaced with a PostgreSQL database, which provides a more robust experience. You must deactivate and reactivate your environment in Cloudera Data Warehouse to use this feature.
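The workload version selection described at the top of this list applies nested compatibility checks: a Database Catalog version must match the Kubernetes and DWX versions, and a Virtual Warehouse version must additionally match the chosen Database Catalog version. The following Python sketch illustrates that filtering logic only. The `WorkloadVersion` class, the version names, and the compatibility data are hypothetical and are not taken from Cloudera Data Warehouse; the UI performs this selection for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkloadVersion:
    """Hypothetical record of what a workload version is compatible with."""
    name: str
    kubernetes: frozenset                # compatible Kubernetes versions
    dwx: frozenset                       # compatible cluster environment (DWX) versions
    catalogs: frozenset = frozenset()    # compatible Database Catalog versions (Virtual Warehouse only)


def catalog_choices(versions, kubernetes, dwx):
    """Database Catalog versions matching the Kubernetes and DWX versions."""
    return [v.name for v in versions
            if kubernetes in v.kubernetes and dwx in v.dwx]


def warehouse_choices(versions, kubernetes, dwx, catalog):
    """Virtual Warehouse versions that also match the Database Catalog version."""
    return [v.name for v in versions
            if kubernetes in v.kubernetes and dwx in v.dwx and catalog in v.catalogs]


# Made-up compatibility data, for illustration only.
CATALOGS = [
    WorkloadVersion("catalog-A", frozenset({"1.28"}), frozenset({"dwx-old"})),
    WorkloadVersion("catalog-B", frozenset({"1.28", "1.29"}), frozenset({"dwx-new"})),
]
WAREHOUSES = [
    WorkloadVersion("vw-A", frozenset({"1.29"}), frozenset({"dwx-new"}), frozenset({"catalog-A"})),
    WorkloadVersion("vw-B", frozenset({"1.29"}), frozenset({"dwx-new"}), frozenset({"catalog-B"})),
]

print(catalog_choices(CATALOGS, "1.29", "dwx-new"))
print(warehouse_choices(WAREHOUSES, "1.29", "dwx-new", "catalog-B"))
```

The Virtual Warehouse filter reuses the Database Catalog conditions and adds one more, which mirrors how the second list in the UI is a narrower view of the same compatibility information.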
What's new in Cloudera Data Warehouse on Azure environments
- Azure AKS 1.29 upgrade
- Cloudera supports the Azure Kubernetes Service (AKS) version 1.29. In 1.9.1-b233 (released July 26, 2024), when you activate an environment, Cloudera Data Warehouse automatically provisions AKS 1.29. To upgrade to AKS 1.29 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse. To avoid compatibility issues between Cloudera Data Warehouse and AKS, upgrade to version 1.29.
- Addition of new Azure instance types
- This release offers the selection of the Standard_E16pds_v5 Azure Virtual Machine, an AKS Ampere® Altra® Arm-based instance type for an Impala Virtual Warehouse. For more information about using the instance type, see Activating an Azure environment from Cloudera Data Warehouse.
- Cloudera Data Warehouse provisions Azure Database for PostgreSQL - Flexible Server
- Starting with this release, Cloudera Data Warehouse provisions Azure Database for PostgreSQL - Flexible Server instead of Azure Database for PostgreSQL - Single Server. See Enabling a private Cloudera Data Warehouse environment.
What's new in Cloudera Data Warehouse on AWS environments
- Amazon EKS 1.29 upgrade
- Cloudera supports the Amazon Elastic Kubernetes Service (EKS) version 1.29. In 1.9.1-b233 (released July 26, 2024), when you activate an environment, Cloudera Data Warehouse automatically provisions EKS 1.29. To upgrade to EKS 1.29 from an earlier version of Cloudera Data Warehouse, you must back up and restore Cloudera Data Warehouse. To avoid compatibility issues between Cloudera Data Warehouse and EKS, upgrade to version 1.29. See Upgrading Amazon Kubernetes Service (EKS).
- Note about the impact of AWS RDS root certificate rotation in 2024
- A Cloudera Data Warehouse Cluster RDS does not use certificate verification for connections to the Cloudera Data Warehouse. Therefore, you are not directly impacted by certificate expiration for your Cloudera Data Warehouse Cluster RDS. You can choose either to clear the warnings or to rotate the certificate.
To rotate the certificate for the Cloudera Data Warehouse Cluster RDS, follow the steps outlined by AWS in Rotating your SSL/TLS certificate to update the certificate. There should be no impact on Cloudera Data Warehouse because the Cloudera Data Warehouse Cluster RDS does not need to be restarted: Postgres RDS has SupportsCertificateRotationWithoutRestart=true.
For the Datalake RDS, follow the instructions shared by the Datalake account team to update the certificate. There may be some impact on Cloudera Data Warehouse while the Datalake restarts, such as query failures or delays, because services such as Ranger, Knox, and FreeIPA might be unavailable during this period.
- Addition of new AWS instance types
- This release offers the selection of the r6gd.4xlarge and r7gd.4xlarge Arm-based instance types for an Impala Virtual Warehouse. For more information about using these instance types, see Activating an AWS environment from Cloudera Data Warehouse.
- Ability to use envelope encryption for EKS secrets
- EKS secrets are now envelope-encrypted by default using the Cloudera Data Warehouse KMS key. See Encrypt Kubernetes secrets with AWS KMS on existing clusters.
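The RDS certificate rotation note above relies on the SupportsCertificateRotationWithoutRestart attribute that AWS reports for each database engine version. If you want to confirm that value for your own engine version, one possible approach is to inspect the output of the RDS DescribeDBEngineVersions API. The Python sketch below assumes a response shaped like that API's output; the helper name `engines_requiring_restart` and the sample version labels are hypothetical, and the sample values are made up for illustration.

```python
def engines_requiring_restart(response):
    """Return the engine versions whose certificate rotation needs a restart.

    `response` is expected to look like the RDS DescribeDBEngineVersions
    output: a dict with a "DBEngineVersions" list. A missing attribute is
    treated conservatively as "restart required".
    """
    return [
        version["EngineVersion"]
        for version in response.get("DBEngineVersions", [])
        if not version.get("SupportsCertificateRotationWithoutRestart", False)
    ]


# Illustrative sample only; the version labels and values are not real data.
# With boto3 installed and AWS credentials configured, a real response could
# be obtained with:
#   boto3.client("rds").describe_db_engine_versions(Engine="postgres")
sample_response = {
    "DBEngineVersions": [
        {"Engine": "postgres", "EngineVersion": "version-a",
         "SupportsCertificateRotationWithoutRestart": True},
        {"Engine": "postgres", "EngineVersion": "version-b",
         "SupportsCertificateRotationWithoutRestart": False},
    ]
}

print(engines_requiring_restart(sample_response))
```

An empty result for your engine version is consistent with the statement that the Cloudera Data Warehouse Cluster RDS does not need a restart during rotation.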
What's new in Iceberg on Cloudera Data Warehouse Public Cloud
- Cloudera support for Iceberg version 1.4.3
- The Apache Iceberg component has been upgraded from 1.3.0 to 1.4.3.
- Support for Iceberg data compaction
- You can compact Iceberg tables and optimize them for read operations from Hive and Impala. Compaction is an essential table maintenance activity that creates a new snapshot, which contains the table content in a compact form. See Iceberg data compaction.
- SQL support for querying Iceberg metadata tables
- Apache Iceberg stores extensive metadata for its tables. From Hive and Impala, you can query the metadata tables as you would query a regular table. For example, you can use projections, joins, filters, and so on. See Query metadata tables feature.
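As an illustration of the metadata table queries described above, the following Python sketch builds SQL strings that address a metadata table with the `database.table.metadata_table` naming form. The database and table names (`sales_db`, `orders`) and the helper `metadata_query` are hypothetical. The set of metadata table names reflects commonly documented Apache Iceberg metadata tables and is an assumption here; check the linked documentation for what your Hive or Impala version supports.

```python
# Commonly documented Iceberg metadata tables (assumption: availability
# depends on the engine and its version).
METADATA_TABLES = {"snapshots", "history", "manifests", "files",
                   "partitions", "refs", "entries"}


def metadata_query(database, table, metadata_table, columns=("*",), where=None):
    """Build a SELECT statement against an Iceberg metadata table."""
    if metadata_table not in METADATA_TABLES:
        raise ValueError(f"unknown metadata table: {metadata_table}")
    sql = f"SELECT {', '.join(columns)} FROM {database}.{table}.{metadata_table}"
    if where:
        sql += f" WHERE {where}"
    return sql


# Projection and filter on the snapshots metadata table.
append_snapshots = metadata_query(
    "sales_db", "orders", "snapshots",
    columns=("snapshot_id", "committed_at", "operation"),
    where="operation = 'append'",
)

# A join between two metadata tables, written out by hand.
history_with_operation = (
    "SELECT h.made_current_at, s.operation "
    "FROM sales_db.orders.history h "
    "JOIN sales_db.orders.snapshots s ON h.snapshot_id = s.snapshot_id"
)

print(append_snapshots)
print(history_with_operation)
```

The generated strings can be pasted into a Hive or Impala session, for example in Hue, or passed to whichever SQL client you already use.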
What's new in Hue on Cloudera Data Warehouse Public Cloud
- General availability (GA) of the SQL AI Assistant
- Hue leverages the power of Large Language Models (LLMs) to help you generate SQL queries from natural language prompts. It also provides options to optimize, explain, and fix queries, promoting efficient and accurate practices for accessing and manipulating data. You can use several AI services and models, such as OpenAI’s GPT service, Amazon Bedrock, and Azure’s OpenAI service, to run the Hue SQL AI Assistant.
- To learn more about the supported models and services, limitations, and what data is shared with the LLMs, see About the SQL AI Assistant in Cloudera Data Warehouse.
- To set up and enable the SQL AI Assistant, see About setting up the SQL AI Assistant in Cloudera Data Warehouse.
- To see how to generate, edit, explain, optimize, and fix queries, see Starting the SQL AI Assistant in Hue.
- Introduction of task server in Hue and significant improvement in the file upload functionality
- A new Task Server page has been added to the Hue web interface. The Hue task server enables the following functionalities:
- It improves the file-upload experience, allowing you to upload multiple files of up to 5 GB each in parallel.
- It helps you to schedule tasks to clean up Hue documents and the /tmp directory, improving cluster maintenance experience and performance.