Cloudera AI on premises 1.5.4 SP2

Review the features, fixes, and known issues in the Cloudera AI 1.5.4 Service Pack 2 release.

Apache Parquet CVE-2025-30065

On April 1, 2025, a critical vulnerability in the parquet-avro module of Apache Parquet (CVE-2025-30065, CVSS score 10.0) was announced.

Cloudera has determined the list of affected products, and is issuing this TSB to provide details of remediation for affected versions.

Upgraded versions are being released for all currently affected supported releases of Cloudera products. Customers using older versions are advised to upgrade to a supported release that has the remediation, once it becomes available.

Vulnerability Details

Exploiting this vulnerability is only possible by modifying the accepted schema used for translating Parquet files and subsequently submitting a specially crafted malicious file.

CVE-2025-30065 | Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous versions allows bad actors to execute arbitrary code.

CVE: NVD - CVE-2025-30065

Severity (Critical): CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H
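The vector string above packs each base metric into a short key:value pair. As a reading aid, the following minimal Python sketch splits the string and expands the metric abbreviations to their CVSS v4.0 base metric names. The function name parse_cvss_vector is hypothetical, and the value codes (such as N, L, H) are left unexpanded because their meaning depends on the metric.

```python
# Metric abbreviations of the CVSS v4.0 base metric group.
METRIC_NAMES = {
    "AV": "Attack Vector",
    "AC": "Attack Complexity",
    "AT": "Attack Requirements",
    "PR": "Privileges Required",
    "UI": "User Interaction",
    "VC": "Vulnerable System Confidentiality",
    "VI": "Vulnerable System Integrity",
    "VA": "Vulnerable System Availability",
    "SC": "Subsequent System Confidentiality",
    "SI": "Subsequent System Integrity",
    "SA": "Subsequent System Availability",
}


def parse_cvss_vector(vector: str) -> dict:
    """Split a CVSS vector string into {metric name: value code}.

    Hypothetical helper for illustration; it does not compute a score.
    """
    prefix, *pairs = vector.split("/")
    if not prefix.startswith("CVSS:"):
        raise ValueError("not a CVSS vector: " + vector)
    parsed = {"version": prefix.split(":", 1)[1]}
    for pair in pairs:
        key, value = pair.split(":", 1)
        # Unknown keys are kept under their abbreviation.
        parsed[METRIC_NAMES.get(key, key)] = value
    return parsed


VECTOR = "CVSS:4.0/AV:N/AC:L/AT:N/PR:N/UI:N/VC:H/VI:H/VA:H/SC:H/SI:H/SA:H"
for name, value in parse_cvss_vector(VECTOR).items():
    print(f"{name}: {value}")
```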

Impact

Schema parsing in the parquet-avro module of Apache Parquet 1.15.0 and previous versions allows bad actors to execute arbitrary code. Attackers may be able to modify unexpected objects or data that was assumed to be safe from modification. Deserialized data or code could be modified without using the provided accessor functions, or unexpected functions could be invoked.

Deserialization vulnerabilities most commonly lead to undefined behavior, such as memory modification or remote code execution.

Releases affected

Cloudera Data Services on premises
  • All versions

Mitigation

Until Cloudera releases a product version that includes the Apache Parquet vulnerability fix, continue to use the mitigations listed below:

Customers with their own FIM Solution:

  1. Utilize a File Integrity Monitoring (FIM) solution. This allows administrators to monitor files at the filesystem level and receive alerts on any unexpected or suspicious activity in the schema configuration.
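To illustrate what file integrity monitoring does at its core, the following minimal Python sketch records SHA-256 digests of monitored files and reports files that were later modified, added, or removed. It is not a substitute for a FIM product, which also provides alerting, tamper protection, and audit trails. The function names and the idea of passing schema file paths explicitly are assumptions made for this example.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_baseline(paths) -> dict:
    """Record a digest for every monitored file that currently exists."""
    return {str(p): sha256_of(Path(p)) for p in paths if Path(p).is_file()}


def detect_changes(baseline: dict, paths) -> dict:
    """Compare the current state to the baseline; return {path: change type}."""
    current = build_baseline(paths)
    changes = {}
    for path in set(baseline) | set(current):
        if path not in current:
            changes[path] = "missing"
        elif path not in baseline:
            changes[path] = "added"
        elif baseline[path] != current[path]:
            changes[path] = "modified"
    return changes
```

In this sketch, an administrator would call build_baseline once on the schema configuration files in a known-good state, store the result securely, and run detect_changes on a schedule, treating any non-empty result as an event to investigate.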

General advisory:

  1. Use network segmentation and traffic monitoring with a device capable of deep packet inspection, such as a network firewall or web application firewall, to inspect all traffic sent to the affected endpoints.
  2. Configure alerts for any suspicious or unexpected activity. You may also configure sample analysis parameters to include:

    • Parquet file format “magic bytes” = PAR1
    • Connections from sending hosts that are not expected source IP ranges.
  3. Be cautious with Parquet files from unknown or untrusted sources. If possible, do not process files with uncertain origins or that can be ingested from outside the organization.
  4. Ensure that only authorized users have access to endpoints that ingest Parquet files.
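The two sample analysis parameters in the general advisory can be sketched in Python as follows. This is only an illustration of the matching logic, assuming the inspecting device can expose the leading bytes of a payload and the sender's IP address; the function names and the example address ranges are hypothetical, and a real firewall or web application firewall would express the same checks in its own rule language.

```python
import ipaddress

# Parquet files begin (and also end) with the 4-byte magic "PAR1".
PARQUET_MAGIC = b"PAR1"


def starts_with_parquet_magic(payload: bytes) -> bool:
    """True if the payload begins with the Parquet magic bytes."""
    return payload.startswith(PARQUET_MAGIC)


def is_expected_source(source_ip: str, allowed_cidrs) -> bool:
    """True if source_ip falls inside any of the expected CIDR ranges."""
    address = ipaddress.ip_address(source_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


def should_alert(payload: bytes, source_ip: str, allowed_cidrs) -> bool:
    """Flag Parquet payloads sent by hosts outside the expected ranges."""
    return starts_with_parquet_magic(payload) and not is_expected_source(
        source_ip, allowed_cidrs
    )


# Example ranges only; replace with the ranges expected in your network.
EXPECTED_RANGES = ["10.0.0.0/8", "192.168.0.0/16"]
print(should_alert(b"PAR1...", "203.0.113.7", EXPECTED_RANGES))
```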

For the latest update on this issue, see the corresponding Knowledge article: Cloudera Customer Advisory 2025-847: Cloudera's remediation actions for Apache Parquet CVE-2025-30065

What's new in 1.5.4 SP2

Cloudera on premises 1.5.4 SP2 includes the following features for Cloudera AI.

Introducing auto scaling performance-critical components

Some performance-critical components in Cloudera AI now scale automatically based on demand, improving the scalability and reliability of the product. To support this feature, the recommended resource requirements for Cloudera AI have been increased by 16 cores.

Fixed issues in 1.5.4 SP2

This section lists issues fixed in this release for Cloudera AI on premises.

DSE-40909: Disabled 'Run experiments' function does not work as expected

When the Administrator disabled 'running experiments' in Cloudera AI Workbench > Site Administration > Settings, the user was still able to see the Experiments menu item in the global- and project-level navigation.

The required conditions are now implemented, so the UI hides the disabled function.

DSE-38645: spark.yarn.jars are not set properly in Cloudera AI on premises Spark Pushdown mode

In Spark pushdown mode, the spark.yarn.jars parameter specifies which jars from the driver are transported to the YARN executors. When this parameter is not set, the YARN executors use the Spark jars from the base cluster, provided by Cloudera Distribution of Spark. This can lead to a version mismatch between the Spark driver (which uses Cloudera Data Engineering-provided jars) and the executors.

Now, the spark.yarn.jars parameter is set to use jars from the Cloudera AI driver (Cloudera AI session) to keep the versions matched between the Spark driver and the executors.

DSE-39798: API v1 stop session endpoint shall perform authentication check

Previously, users could stop sessions under projects that they were not authorized to access by using the session’s universally unique identifier (UUID). This issue is now resolved.

DSE-41431: Register engine stopped status

When the reconciler is overloaded with a large number of events, the deleted status is still propagated to ensure that engines do not remain in 'Stopping' status.

DSE-41424: Better handling for data connection validation errors

In Cloudera AI on premises 1.5.4 SP1, Cloudera AI Workbench failed to start under the following conditions:

  • If the HIVE_ON_TEZ service was not present or the following configurations were missing:

    • hiveserver2_load_balancer

    • hive.server2.transport.mode

    • Kerberos_princ_name

  • If the IMPALA service was not present on your Base cluster or the following configurations were missing:

    • Hs2_port or hs2_http_port

    • Kerberos_princ_name

This issue has been fixed.
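The conditions above can be pictured as a validation pass that collects warnings instead of aborting startup. The following Python sketch is an illustration only and does not reflect the product's actual implementation: the function name, the dictionary layout, and the warning texts are assumptions, and the setting names are copied as listed above.

```python
# Required settings per service, copied from the list above. Each inner tuple
# is a group of alternatives: at least one name in the group must be set.
REQUIRED = {
    "HIVE_ON_TEZ": [
        ("hiveserver2_load_balancer",),
        ("hive.server2.transport.mode",),
        ("Kerberos_princ_name",),
    ],
    "IMPALA": [
        ("Hs2_port", "hs2_http_port"),
        ("Kerberos_princ_name",),
    ],
}


def validation_warnings(services: dict) -> list:
    """Return readable warnings rather than raising on missing settings.

    `services` maps a service name to its configuration dictionary
    (a hypothetical layout chosen for this example).
    """
    warnings = []
    for service, groups in REQUIRED.items():
        config = services.get(service)
        if config is None:
            warnings.append(f"{service}: service not present")
            continue
        for group in groups:
            # A group is satisfied when any alternative has a non-empty value.
            if not any(config.get(name) for name in group):
                warnings.append(f"{service}: missing {' or '.join(group)}")
    return warnings


print(validation_warnings({"IMPALA": {"hs2_http_port": 28000}}))
```

Collecting warnings this way lets a caller skip only the affected data connection while the rest of the startup continues.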

DSE-41218: Restrict secrets and ingress access from User Service Account role

User access to Kubernetes secrets and ingresses in their own user namespace has been removed.

DSE-34314: Cloudera Base AutoTLS property set incorrectly

The correction to Cloudera Base AutoTLS configurations resulted in the atlas.kafka.ssl.truststore.location value being set incorrectly.

This issue has been fixed.

Technical Service Bulletins

TSB 2025-822 Cloudera AI Workbench Web Service Crashes after Upgrading to Cloudera Data Services 1.5.4 SP1

For the latest update on this issue, see the corresponding Knowledge article: Cloudera Customer Advisory 2025-822: Cloudera AI Workbench Web Service Crashes after Upgrading to Cloudera Data Services 1.5.4 SP1.

Known issues in 1.5.4 SP2

You might run into some known issues while using Cloudera AI on premises.

DSE-42079: Cloudera AI Workbenches are not compatible with the Unified timezone feature

When you enable the Unified timezone feature, the Cloudera Embedded Container Service cluster time zone is synchronized with the Cloudera Manager Base time zone, and Cloudera AI sessions fail to launch with Exit Code 34. Timestamp discrepancies with workloads are also displayed.

Workaround:

If you use Cloudera AI Workbenches, disable the Unified Timezone feature by following the instructions in Cloudera Embedded Container Service unified time zone.

DSE-41757: Python workloads running multiple-line comments or strings might fail

Python workloads running multiple-line comments or strings might fail to run when using the Workbench Editor.
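For reference, the following short Python snippet contains the kinds of constructs this issue concerns: a multiple-line module docstring (often used as a block comment), a multiple-line string assignment, and a multiple-line function docstring. The exact conditions that trigger the failure are not specified here, so this is only a sample for checking a given editor, and all names in it are hypothetical.

```python
"""
A module-level docstring that spans
several lines, often used as a block comment.
"""

# A string literal that spans more than one line.
GREETING = """Hello,
world"""


def line_count(text: str) -> int:
    '''Count the lines in text.

    This docstring also spans multiple lines.
    '''
    return len(text.splitlines())


print(line_count(GREETING))
```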

Workaround:

Run the code using the PBJ Workbench Editor.

DSE-42509: Creating a project does not work on NTP/Airgap proxy when using private repository with SSH key and SSH URL

When creating a project on NTP/Airgap proxy, using a private repository with an SSH key and an SSH URL, an error message is displayed that the project cannot be created.

No workaround is available.

DSE-42510: Fetching Cloudera or Huggingface AMP catalog is failing on NTP/Airgap proxy

When fetching the Cloudera or Huggingface Cloudera Accelerators for Machine Learning Project (AMP) catalog on NTP/Airgap proxy, the following error message is displayed: Error fetching AMP catalog source URL.

No workaround is available.