What's New
Major features and updates for the Cloudera AI data service.
March 28, 2025
Release notes and fixed issues for version 2.0.47-b365.
Fixed Issues
Cloudera AI Workbench
- The issue of sessions and pods getting stuck in the Stopping state has been resolved. (DSE-42144)
- Pods in an Error or Stuck state within Cloudera AI Workbenches are now properly garbage-collected. (DSE-43549)
- Reduced the frequency of initialization failures for user workloads that launch immediately after node autoscaling. (DSE-43311)
Cloudera AI Platform
- Previously, users with MLAdmin roles were assigned the MLUser role during the first sync, and their permissions were corrected only in subsequent syncs or when they logged in. This issue is now resolved. (DSE-42775)
March 02, 2025
Release notes and fixed issues for version 2.0.47-b360.
Fixed Issues
Cloudera AI Workbench
- Previously, when users tried to create a session, the ssh: This private key is passphrase protected error was displayed. This issue is now resolved. (DSE-426980)
February 26, 2025
Release notes and fixed issues for version 2.0.47-b359.
New Features / Improvements
Cloudera AI Platform
- We have improved the synchronization efficiency and ease of use of the user management and team management auto-synchronization features. The major updates include:
- Auto-synchronization is enabled by default: Users and teams are now synchronized automatically, with the synchronization interval set to 12 hours.
- User management service: User management is now handled by a new service, reducing overhead on the web pod. The service also prevents multiple synchronization operations from running in parallel.
- Logging: Detailed logging has been added for failure cases.
- Synchronization trigger sequence: The team synchronization now internally triggers user synchronization to pull the most recent user details from the Cloudera control plane.
These improvements are aimed at optimizing performance and streamlining the synchronization process for users and teams. (DSE-37941)
- We have added support for setting the maximum input/output operations per second (IOPS) and throughput for root volumes attached to worker nodes from the UI while provisioning a workbench. Note that this is supported only on AWS. For more details on how to maximize the IOPS and throughput of root volumes, see Provisioning Cloudera AI Workbenches. (DSE-42075)
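The auto-synchronization behavior described above (a 12-hour default interval, no parallel synchronization runs, and team synchronization triggering user synchronization first) can be illustrated with a minimal sketch. This is not Cloudera's implementation: the `SyncService` class, its method names, and the in-memory log are hypothetical and exist only to show the pattern.

```python
import threading

# 12-hour default interval, as stated in the release note.
SYNC_INTERVAL_SECONDS = 12 * 60 * 60


class SyncService:
    """Hypothetical stand-in for the user management service."""

    def __init__(self):
        # A single lock ensures only one synchronization runs at a time.
        self._lock = threading.Lock()
        self.log = []

    def _sync_users(self):
        # Placeholder: a real service would pull user details
        # from the control plane here.
        self.log.append("users")

    def _sync_teams(self):
        # Placeholder for the team synchronization step.
        self.log.append("teams")

    def run_team_sync(self):
        """Run a team sync; return False if another sync is in progress."""
        if not self._lock.acquire(blocking=False):
            return False  # skip instead of running in parallel
        try:
            self._sync_users()  # team sync first refreshes user details
            self._sync_teams()
            return True
        except Exception as exc:
            # Record failure details before re-raising.
            self.log.append(f"failed: {exc}")
            raise
        finally:
            self._lock.release()
```

A scheduler could call `run_team_sync()` every `SYNC_INTERVAL_SECONDS`. The non-blocking lock means an overlapping request is skipped rather than queued, which is one plausible way to avoid parallel runs.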
Cloudera AI Registry
- You can now specify subnets for load balancers when creating the AI Registry. (DSE-42156)
- We have enhanced the security of the AI Registry's search capability. (DSE-41740)
Cloudera AI Inference service
- We have improved the UI usability of the Hugging Face import feature by adding a tooltip example. (DSE-41926)
Fixed Issues
Cloudera AI Workbench
- We have increased the Grafana pod's default memory and CPU to prevent out-of-memory (OOM) errors. (DSE-39525)
- We have increased the gRPC Operator timeout to two minutes to prevent errors encountered with 150 concurrent sessions. (DSE-36922)
- We have removed nonessential calls to the usage API to resolve slowness during new workload creation under heavy load in a workbench. (DSE-42231)
Cloudera AI Platform
- We have optimized the Suspend timeout during periods of high network latency. (DSE-42055)
- Previously, restoring a workbench with a very large Elastic File System (EFS) drive failed because of a session timeout. This issue is now resolved. (DSE-42171)
Cloudera AI Registry
- We have fixed an issue that prevented model registration to the AI Registry from within a workbench. (DSE-42360)
- We have fixed a page token issue that prevented users from viewing AI Registry models on subsequent pages within the workbench. (DSE-42379)
- We have fixed an incorrect error message displayed in the UI when deleting AI Registry models from within a workbench. (DSE-42379)
- Error visibility has been improved during AI Registry backup. (DSE-42163)
Cloudera AI Inference service
- We have fixed an issue that prevented the TPOT (Time per Output Token) and TTFT (Time to First Token) charts from rendering for Hugging Face models. (DSE-42192)
ML Runtimes
- Previously, non-administrator users were unable to add new Runtimes to the Runtime Catalog. This issue is now resolved. (DSE-42298)