Fixed Issues in YARN and YARN Queue Manager

This section lists the issues and maintenance items for YARN and YARN Queue Manager that are fixed in Cloudera Runtime 7.3.2 and its associated service packs.

Cloudera Runtime 7.3.2

Cloudera Runtime 7.3.2 resolves YARN and YARN Queue Manager issues and incorporates fixes from the service packs and cumulative hotfixes from 7.3.1.100 through 7.3.1.700. For a comprehensive record of all fixes in Cloudera Runtime 7.3.1.x, see Fixed Issues.

CDPD-49702: Error in NodeManager when executing /var/lib/yarn-ce/bin/container-executor
7.3.2
Previously, a job failure occurred because NodeManager returned a No such file or directory error when attempting to run the /var/lib/yarn-ce/bin/container-executor program. This issue is now resolved and NodeManager is now marked as unhealthy and shut down if it is unable to run the program.
Apache Jira: YARN-11709
COMPX-13401: CapacityScheduler UI queue filter does not work as expected when submitting applications with leaf queue's name
7.3.2
Previously, when applications were submitted to YARN using only a leaf queue name (for example, default or custom) instead of the full queue path (for example, root.default), the ResourceManager (RM) and CapacityScheduler UI displayed and filtered applications inconsistently. This led to confusion, because the same queue could be displayed under different names, and applications were not visible under the expected queue filter in the UI.

This issue is now resolved and RM now returns the full queue path regardless of whether the application was submitted with a leaf queue name or a full queue path.

Apache Jira: YARN-11538
COMPX-14637: Missing permissions on NodeManager local directories on startup
7.3.2

Previously, the NodeManager created its required local directories on startup with the correct 755 permissions only if they did not already exist.

If an administrator created these directories with incorrect permissions, or if the permissions were altered after the NodeManager started, the NodeManager failed to reset them. This lack of permission enforcement caused container failures. This issue is now resolved.

Apache Jira: YARN-11703
COMPX-17261: NodeManager disk health status oscillation with MB-based limits
7.3.2
Previously, the NodeManager disk health status could oscillate rapidly between good and full states when using the yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb property because it relied on a single threshold. This issue is now fixed. To prevent this oscillation, the optional yarn.nodemanager.disk-health-checker.min-free-space-per-disk-watermark-high-mb configuration property is now available. This setting allows administrators to specify the minimum free space, in megabytes, required for a previously bad disk to be marked as good again. If this value is not set or is lower than the yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb value, it defaults to the same value as the existing minimum free space configuration. This update applies to both the yarn.nodemanager.local-dirs and yarn.nodemanager.log-dirs directories, providing more granular control over disk health checks.
Apache Jira: YARN-9914
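
The two-threshold behavior described above can be sketched as follows. This is a minimal illustration, not the NodeManager implementation: the function name, its parameters, and the exact boundary comparisons (>= versus >) are assumptions made for the example.

```python
from typing import Optional


def next_disk_state(currently_good: bool,
                    free_mb: int,
                    min_free_mb: int,
                    high_watermark_mb: Optional[int] = None) -> bool:
    """Return True if the disk should be treated as good after this check.

    Hypothetical helper: min_free_mb stands in for
    min-free-space-per-disk-mb and high_watermark_mb for
    min-free-space-per-disk-watermark-high-mb.
    """
    # An unset watermark, or one lower than the minimum, falls back to the
    # minimum free space value, as described in the release note.
    if high_watermark_mb is None or high_watermark_mb < min_free_mb:
        high_watermark_mb = min_free_mb

    if currently_good:
        # A good disk stays good until it drops below the minimum.
        return free_mb >= min_free_mb
    # A previously bad disk must recover to the higher watermark.
    return free_mb >= high_watermark_mb
```

With a minimum of 1000 MB and a high watermark of 2000 MB, a disk that was marked full stays full at 1500 MB free and is marked good again only once it reaches 2000 MB, which removes the rapid flipping around a single threshold.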
COMPX-18004: Metrics missing in the RM UI2
7.3.2
Previously, the ResourceManager (RM) UI2 did not display metrics with the same granularity as RM UI (UI1). This made analyzing and debugging the scheduler behavior difficult, often requiring the retrieval of information from UI1. This issue is now resolved and the missing metrics are now available in RM UI2.
COMPX-18545: Setting maximum-application-lifetime using AQCv2 templates does not apply on the first submitted application
7.3.2
Previously, a maximum-application-lifetime property set using the AQC v2 templates was not applied to the first submitted application, only to subsequent ones. This issue is now resolved.
Apache Jira: YARN-11708
COMPX-18909: NodeManager marked as unhealthy if an application is terminated
7.3.2
By design, NodeManagers are marked unhealthy if an unrecoverable configuration error occurs, such as a missing container-executor script. Previously, a false positive occurred if an application was terminated just before one of its containers tried to access the localizer syslog file. This caused an IOException, and the NodeManager was incorrectly marked unhealthy. This issue is now resolved. The error checking is now more specific, which prevents these false positive unhealthy markings.
Apache Jira: YARN-11753
COMPX-20253: Missing Secure Sockets Layer (SSL) cipher inclusion list support
7.3.2
Previously, support for configuring an inclusion list of SSL ciphers for HttpServer2 and SSLFactory was missing. This issue is now resolved. When an inclusion list is set, only the listed ciphers are allowed, and any cipher present in both the inclusion and exclusion lists is excluded.
Apache Jira: HADOOP-19546
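
The precedence rule between the two lists can be illustrated with a short sketch. The function and the cipher names below are hypothetical and do not reflect the HttpServer2 or SSLFactory code. The handling of an unset inclusion list (all supported ciphers minus the excluded ones) is an assumption.

```python
def effective_ciphers(supported, include=None, exclude=None):
    """Return the ciphers that remain enabled, preserving input order.

    Hypothetical helper illustrating the rule: an inclusion list
    restricts the allowed set, and exclusion always wins on overlap.
    """
    include_set = set(include) if include else None
    exclude_set = set(exclude or ())

    enabled = []
    for cipher in supported:
        # With an inclusion list set, only listed ciphers are allowed.
        if include_set is not None and cipher not in include_set:
            continue
        # A cipher present in both lists is excluded.
        if cipher in exclude_set:
            continue
        enabled.append(cipher)
    return enabled
```

For example, with supported ciphers A, B, and C, an inclusion list of A and B, and an exclusion list containing B, only A remains enabled.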
COMPX-21537: Upgraded the Jersey version
7.3.2
Upgraded the Jersey framework from version 1.19 to 2.46 to enhance security, improve performance, and ensure compatibility with modern standards and dependencies.
COMPX-23154: Unnecessary direct dependency on Jersey 1 and JSR311 APIs for HTTP headers
7.3.2
Previously, HTTP header handling relied on the javax.ws.rs.core.HttpHeaders class, which created a direct dependency on the Jersey 1 and JSR311 APIs. This issue is now resolved. All usages of javax.ws.rs.core.HttpHeaders are replaced with internal constants or alternative classes, which removes the direct dependency.
Apache Jira: HADOOP-19077
COMPX-23191: Null Pointer Exception in Delegation Token Renewer causes all subsequent applications to fail
7.3.2
Previously, any uncaught exception in DelegationTokenRenewer.RenewalTimerTask#run caused all subsequent YARN applications to fail with a java.lang.IllegalStateException: Timer already cancelled exception. This issue is now resolved, and such failures are prevented.
COMPX-24259: Incorrect permissions on NodeManager local directories cause container failures
7.3.2
Previously, the NodeManager created the configured local directories with 755 permissions on startup only if the directories did not exist. If these permissions were changed after startup, or if an administrator created the directories with incorrect permissions before starting YARN, the NodeManager did not reset the permissions, resulting in container failures. This issue is now resolved.
COMPX-21461: The queuemanager_includedCipherSuites property fails when using comma as a separator
7.3.2
Previously, the queuemanager_includedCipherSuites property in Queue Manager only supported colon (:) as a separator for cipher suites. When comma (,) was introduced as an additional separator, configurations using commas caused failures in property parsing.

This issue is now resolved by updating the property parsing logic to accept both colon and comma as valid separators.
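
A parser that accepts both separators could look like the following sketch. It is an illustration only: the function name is hypothetical, the cipher suite names in the usage note are examples, and trimming whitespace and skipping empty entries are assumptions rather than documented Queue Manager behavior.

```python
import re


def parse_cipher_suites(value: str) -> list:
    """Split a cipher suite property value on colons or commas.

    Hypothetical helper: surrounding whitespace is trimmed and
    empty entries are dropped.
    """
    parts = re.split(r"[:,]", value)
    return [part.strip() for part in parts if part.strip()]
```

Under these assumptions, "TLS_AES_128_GCM_SHA256:TLS_AES_256_GCM_SHA384" and "TLS_AES_128_GCM_SHA256,TLS_AES_256_GCM_SHA384" both produce the same two-element list.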

COMPX-22209: Missing centralized Apache HttpComponents libraries
7.3.2
Previously, the cpx component did not use the centralized Apache HttpComponents libraries (httpcore, httpclient, httpcore5, httpclient5, httpcore5-h2). This issue is now resolved. The component now uses these centralized libraries to align with internal standards and incorporate the latest fixes.
COMPX-22213: Missing centralized Bouncy Castle (org.bouncycastle) libraries
7.3.2
Previously, the cpx component did not use the centralized Bouncy Castle org.bouncycastle library versions defined in CDPD (bcprov-jdk18on, bcpkix-jdk18on, and bcutil-jdk18on updated from 1.78 to 1.78.1). This issue is now resolved. The component now uses these centralized libraries to align with internal dependency standards and incorporate the latest security and bug fixes.
COMPX-23423, COMPX-22836: Apache Commons Lang upgraded to 3.18.0
7.3.2
The Apache Commons Lang package is now upgraded to version 3.18.0 in Queue Manager.
Fixed CVE-2025-48924
7.3.2
A few Hadoop API endpoints are now removed from Cloudera Runtime 7.3.2. This change is a result of the fix for CVE-2025-48924 and YARN's migration to commons-lang3. For more information, see https://docs.cloudera.com/cdp-private-cloud-base/7.3.2/private-release-notes/topics/rt-pvc-api-compat-changes-hadoop.html

Apache Jira: YARN-10772