Review the fixed issues and resolved maintenance items for YARN and YARN Queue Manager
addressed in Cloudera Runtime 7.3.2 and its associated service
packs.
Cloudera Runtime 7.3.2
Cloudera Runtime 7.3.2 resolves YARN and YARN Queue Manager issues and
incorporates fixes from the service packs and cumulative hotfixes from 7.3.1.100 through
7.3.1.700. For a comprehensive record of all fixes in Cloudera Runtime
7.3.1.x, see Fixed Issues.
- CDPD-49702: Error in NodeManager when executing
/var/lib/yarn-ce/bin/container-executor
- 7.3.2
- Previously, a job failed because the NodeManager returned a No such file or directory error when attempting to run the
/var/lib/yarn-ce/bin/container-executor program. This issue is now
resolved: the NodeManager is now marked as unhealthy and shut down if it is unable to run
the program.
- Apache Jira:
YARN-11709
- COMPX-13401: CapacityScheduler UI queue filter does not work as expected when
submitting applications with leaf queue's name
- 7.3.2
- Previously, when submitting applications to YARN using only a leaf queue name, for
example,
default or custom, instead of the full queue
path, for example, root.default, the ResourceManager (RM) and
CapacityScheduler UI inconsistently displayed or filtered applications. This led to
confusion, as the same queue could be displayed under different names, and applications
were not visible under the expected queue filter in the UI. This issue is now resolved:
the RM now returns the full queue path regardless of whether the application was
submitted with a leaf queue name or a full queue path.
- Apache Jira:
YARN-11538
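As an illustration, an application can be submitted with either the leaf queue name or the full queue path; after this fix, the RM reports the full path in both cases. The class and JAR names below are hypothetical placeholders:

```
# Leaf queue name only -- previously displayed inconsistently in the UI
spark-submit --queue default --class com.example.App app.jar

# Full queue path -- after the fix, both submissions are reported as root.default
spark-submit --queue root.default --class com.example.App app.jar
```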
- COMPX-14637: Missing permissions on NodeManagers local directories on startup
- 7.3.2
-
Previously, the NodeManager created its required local directories on startup with
the correct 755 permissions only if they did not already exist.
If an administrator created these directories with incorrect permissions, or if the
permissions were altered after the NodeManager started, the NodeManager failed to
reset them. This lack of permission enforcement caused container failures. This issue
is now resolved.
- Apache Jira:
YARN-11703
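The permission enforcement described above can be illustrated with ordinary shell tools. This is a minimal sketch using a hypothetical stand-in directory; substitute your configured yarn.nodemanager.local-dirs value:

```shell
# Hypothetical stand-in for a yarn.nodemanager.local-dirs entry
dir=/tmp/yarn-local-demo
mkdir -p "$dir"
chmod 700 "$dir"    # simulate permissions altered after creation
chmod 755 "$dir"    # what the NodeManager now re-applies on startup
stat -c '%a' "$dir" # prints 755 on Linux
```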
- COMPX-17261: NodeManager disk health status oscillation with MB-based limits
- 7.3.2
- Previously, the NodeManager disk health status could oscillate rapidly between
good and full states when using the
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
property because it relied on a single threshold. This issue is now fixed. To prevent
this oscillation, the
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-watermark-high-mb
optional configuration property is now available. This setting allows administrators to
specify the minimum free space in megabytes required for a previously bad disk to be
marked as good again. If this value is not set or is lower than the
yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb
value, it defaults to the same value as the existing minimum free space configuration.
This update applies to both yarn.nodemanager.local-dirs and
yarn.nodemanager.log-dirs directories, providing more granular control over
disk health checks.
- Apache Jira:
YARN-9914
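A minimal yarn-site.xml sketch of the two thresholds; the 2048 and 3072 values are illustrative, not defaults:

```xml
<configuration>
  <!-- Disk is marked full when free space drops below 2 GB -->
  <property>
    <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-mb</name>
    <value>2048</value>
  </property>
  <!-- Disk is marked good again only once free space recovers above 3 GB -->
  <property>
    <name>yarn.nodemanager.disk-health-checker.min-free-space-per-disk-watermark-high-mb</name>
    <value>3072</value>
  </property>
</configuration>
```

Keeping the high watermark above the minimum creates a hysteresis band, so a disk hovering near the limit no longer flips between good and full.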
- COMPX-18004: Metrics missing in the RM UI2
- 7.3.2
- The ResourceManager (RM) UI2 did not display metrics with the same granularity as
RM UI (UI1). This made analyzing and debugging scheduler behavior difficult, often
requiring the retrieval of information from UI1. This issue is now resolved and the
missing metrics are now available in RM UI2.
- Apache Jira:
YARN-11755
- COMPX-18545: Setting
maximum-application-lifetime using AQCv2
templates does not apply on the first submitted application
- 7.3.2
- Setting the
maximum-application-lifetime property using the AQC v2
templates did not apply to the first submitted application but was applied to the
subsequent ones. This issue is now resolved.
- Apache Jira:
YARN-11708
- COMPX-18909: NodeManager marked as unhealthy if an application is terminated
- 7.3.2
- By design, NodeManagers are marked unhealthy if an unrecoverable configuration error
occurs, for example, when the
container-executor
script is missing. Previously, a false positive marking occurred if an
application was terminated just before one of its containers tried to access the
localizer syslog file. This caused an IOException, and the NodeManager was incorrectly
marked unhealthy. This issue is now resolved: the error checking is more specific,
preventing these false positive unhealthy markings.
- Apache Jira:
YARN-11753
- COMPX-21537: Upgraded the Jersey version
- 7.3.2
- Upgraded the Jersey framework from version 1.19 to 2.46 to fix
CVE-2017-1000028.
- COMPX-23191: Null Pointer Exception in Delegation Token Renewer causes all subsequent
applications to fail
- 7.3.2
- Previously, any uncaught exception in
DelegationTokenRenewer.RenewalTimerTask#run caused all subsequent
YARN applications to fail with a java.lang.IllegalStateException: Timer already cancelled exception. This
issue is now resolved, and such failures are prevented.
- Apache Jira: YARN-11384
- COMPX-24259: Incorrect permissions on NodeManager local directories cause container
failures
- 7.3.2
- Previously, the NodeManager created the configured local directories with 755 permissions on
startup only if the directories did not exist. If these permissions were changed
after startup, or if an administrator created the directories with incorrect permissions
before starting YARN, the NodeManager did not reset them, resulting in
container failures. This issue is now resolved.
- Apache Jira: YARN-11703
- COMPX-21461: The
queuemanager_includedCipherSuites property fails
when using comma as a separator
- 7.3.2
- Previously, the
queuemanager_includedCipherSuites property in Queue
Manager only supported colon (:) as a separator for cipher suites. When
comma (,) was introduced as an additional separator, configurations
using commas caused failures in property parsing. This issue is now resolved by
updating the property parsing logic to accept both colon and comma as valid
separators.
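For example, with this fix either separator form of the Queue Manager property is accepted; the cipher suites below are standard TLS suite names, shown for illustration only:

```
# Colon-separated (previously the only supported form)
queuemanager_includedCipherSuites=TLS_AES_256_GCM_SHA384:TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384

# Comma-separated (now also accepted)
queuemanager_includedCipherSuites=TLS_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384
```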
- COMPX-22209: Missing centralised Apache HttpComponents libraries
- 7.3.2
- Previously, the
cpx component did not use the centralized Apache
HttpComponents libraries (httpcore, httpclient,
httpcore5, httpclient5,
httpcore5-h2). This issue is now resolved. The component now uses
these centralized libraries to align with internal standards and incorporate the
latest fixes.
- COMPX-22213: Missing centralised Bouncy Castle (
org.bouncycastle)
libraries
- 7.3.2
- Previously, the
cpx component did not use the centralized Bouncy
Castle org.bouncycastle library versions defined in CDPD
(bcprov-jdk18on, bcpkix-jdk18on, and
bcutil-jdk18on updated from 1.78 to 1.78.1). This issue is now
resolved. The component now uses these centralized libraries to align with internal
dependency standards and incorporate the latest security and bug fixes.
- COMPX-23423: Apache Commons Lang upgraded to 3.18.0
- 7.3.2
- The Apache Commons Lang package is now upgraded to version 3.18.0 in Queue
Manager.
- CDPD-91280: Tez is unable to start on 7.3.2.0 FIPS clusters
- 7.3.2
- Previously, Tez startup failed on 7.3.2.0 FIPS clusters, caused by a conflict between
Cloudera Manager, which set
hadoop.security.secret-manager.key-generator.algorithm to
HmacSHA256, and Tez's older, hardcoded use of the
HmacSHA1 algorithm. This issue is now resolved by upgrading Tez to
version 0.10.5 and updating its secret managers to dynamically respect the algorithm
configured by Hadoop at runtime. This change prevents the recurring DIGEST-MD5: digest response format violation errors.
- Fixed CVE-2025-48924
- 7.3.2
- A few Hadoop API endpoints are now removed from Cloudera Runtime
7.3.2. This change is a result of the fix for CVE-2025-48924 and YARN's migration to Commons Lang 3. For more information,
see https://docs.cloudera.com/cdp-private-cloud-base/7.3.2/private-release-notes/topics/rt-pvc-api-compat-changes-hadoop.html
-
Apache Jira: YARN-10772