- CDPD-97246: Avro schema literal logging at INFO level
- 7.3.2
- Previously, the Avro deserializer logged the schema literal at
the INFO level.
- This issue is now resolved by changing the log level to
DEBUG.
Apache Jira: HIVE-22606
- CDPD-96649: Incorrect aggregate statistics when direct SQL batch
retrieval is enabled
- 7.3.2
- Previously, when
hive.metastore.direct.sql.batch.size config was greater than 0,
the system failed to merge column statistics correctly if the number of partitions or
columns exceeded that batch size.
- This issue is now resolved by ensuring the statistics are
properly merged during the retrieval process, preventing redundant entries.
Apache Jira: HIVE-29203
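
The batching behaviour described in CDPD-96649 can be sketched in Python as follows. This is not the metastore code: the function names, the statistics fields (num_nulls, low, high), the merge rules, and the convention that a batch size of 0 or less means "no batching" are all assumptions made for illustration. The point is that partial aggregates from each batch are merged into a single entry per column, so the result does not depend on the batch size.

```python
from typing import Dict, Iterable, List

# Hypothetical statistics record for one column.
Stats = Dict[str, int]


def merge_stats(a: Stats, b: Stats) -> Stats:
    """Combine two partial aggregates for the same column."""
    return {
        "num_nulls": a["num_nulls"] + b["num_nulls"],
        "low": min(a["low"], b["low"]),
        "high": max(a["high"], b["high"]),
    }


def batches(items: List[str], size: int) -> Iterable[List[str]]:
    """Yield slices of at most `size` items (assumption: size <= 0 means one batch)."""
    if size <= 0:
        yield items
        return
    for start in range(0, len(items), size):
        yield items[start:start + size]


def aggregate(part_stats: Dict[str, Dict[str, Stats]], batch_size: int) -> Dict[str, Stats]:
    """Aggregate column statistics batch by batch, keeping one entry per column."""
    merged: Dict[str, Stats] = {}
    for batch in batches(sorted(part_stats), batch_size):
        for part in batch:
            for col, stats in part_stats[part].items():
                # Merge into the existing entry instead of adding a second one.
                merged[col] = merge_stats(merged[col], stats) if col in merged else dict(stats)
    return merged


parts = {
    "p=1": {"c": {"num_nulls": 1, "low": 5, "high": 9}},
    "p=2": {"c": {"num_nulls": 0, "low": 2, "high": 7}},
    "p=3": {"c": {"num_nulls": 4, "low": 6, "high": 12}},
}
# Three partitions with a batch size of 2 force a merge across two batches.
print(aggregate(parts, 2) == aggregate(parts, 0))  # True
```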
- CDPD-95834: Hive Metastore backend database schema out of
date
- 7.3.2
- Previously, the Hive Metastore (HMS) backend database schema
files required updates for compatibility with the latest version.
- This issue is now resolved by updating the schema files to
version 7.3.2.0 to ensure a successful upgrade.
- CDPD-95681: Hive Beeline connection failure due to SSL
certificate hostname mismatch
- 7.3.2
- Previously, the Beeline client failed to establish a connection
to HiveServer2 in ZooKeeper High Availability (HA) setups.
- This issue is now resolved by ensuring that the SSL certificates
correctly match the DNS hostnames, allowing the secure connection to be verified and
established.
- CDPD-93766: Missing results during anti-join conversion
- 7.3.2
- Previously, queries involving specific join conditions returned
empty results instead of the expected data when anti-join conversion was enabled.
- This issue is now resolved by preventing anti-join conversion in
these specific scenarios to ensure query result accuracy.
Apache Jira: HIVE-29175
- CDPD-93756: SHOW COMPACTIONS output filtering
- 7.3.2
- Previously, the SHOW COMPACTIONS output
included all historical information, which caused the display to become unwieldy when
many partitions or history lines existed.
- This issue is resolved by adding the ability to filter the
command output by database, table, partition, and compaction type or state.
Apache Jira: HIVE-13353
- CDPD-93617: Duplicate records during minor compaction
- 7.3.2
- Previously, when the Hive Metastore (HMS) crashed, active
compaction jobs were incorrectly reset.
- This issue is now fixed by updating the compactor cleaner to
address duplicate directories.
Apache Jira: HIVE-29210
- CDPD-93432: Incorrect results for anti-join queries
- 7.3.2
- Previously, the HiveAntiJoin rule incorrectly replaced IS NULL
filters on nullable columns, which resulted in missing records in the query output.
- This issue is now fixed by improving the logic within the
HiveAntiSemiJoinRule to ensure accurate query plans.
Apache Jira: HIVE-29176
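
The following Python sketch shows why an IS NULL filter on a nullable column cannot simply be replaced by an anti-join, as described in CDPD-93432. It uses SQLite from the Python standard library as a stand-in for Hive, and the table and column names are hypothetical. A left row whose match has a NULL in the filtered column passes the IS NULL filter but is dropped by the anti-join form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE l (id INTEGER);
    CREATE TABLE r (id INTEGER, v INTEGER);   -- r.v is nullable
    INSERT INTO l VALUES (1), (2), (3);
    INSERT INTO r VALUES (1, 10), (2, NULL);  -- id 2 matches, but v is NULL
""")

# Outer join followed by an IS NULL filter on a nullable, non-key column.
outer_join = conn.execute("""
    SELECT l.id FROM l LEFT JOIN r ON l.id = r.id
    WHERE r.v IS NULL ORDER BY l.id
""").fetchall()

# Anti-join form: keep only left rows that have no match at all.
anti_join = conn.execute("""
    SELECT l.id FROM l
    WHERE NOT EXISTS (SELECT 1 FROM r WHERE r.id = l.id)
    ORDER BY l.id
""").fetchall()

print(outer_join)  # [(2,), (3,)]
print(anti_join)   # [(3,)]
```

Row 2 is the record that goes missing if the filter is rewritten as an anti-join.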
- CDPD-93431: Incorrect results for n-way joins
- 7.3.2
- Previously, queries produced incorrect results when an n-way
join contained a combination of both anti and outer joins.
- This issue is now resolved by extending the CommonJoinOperator
to properly support anti joins when they are used alongside outer joins in n-way join
operations.
Apache Jira: HIVE-29290
- CDPD-93428: Manifest files appearing in read queries
- 7.3.2
- Previously, temporary directories containing direct insert
manifest files used a prefix that allowed them to be included in concurrent read
queries.
- This issue is now resolved by hiding the direct insert manifest
directory from read queries, ensuring only valid data files are processed.
Apache Jira: HIVE-29297
- CDPD-93425: Data loss during minor compaction
- 7.3.2
- Previously, query-based minor compaction incorrectly used the
minimum open write ID as a lower bound for selecting data files.
- This issue is now fixed by removing the incorrect check against
the minimum open write ID, allowing the high-watermark to correctly define the
compaction range.
Apache Jira: HIVE-29272
- CDPD-92923 / CDPD-92586: Memory leak in Hive Metastore REST Catalog
- 7.3.2
- Previously, the Hive Metastore (HMS) REST Catalog leaked
memory during ALTER TABLE authorization checks. Each check created
new catalog instances and handler pools with JMX enabled, which prevented the system
from reclaiming memory.
- This issue is now fixed by reusing existing catalog instances
and disabling JMX for the handler pool to allow the system to recover resources.
- CDPD-92499: Hive LDAP authentication and ZooKeeper connections
- 7.3.2
- Previously, Hive Lightweight Directory Access Protocol (LDAP)
authentication failed when connecting to a SASL-enforced ZooKeeper instance.
- This issue is now resolved by updating the service components to
support simultaneous authentication configurations and ensuring correct credential
handling during service discovery.
Apache Jira: HIVE-29138
- CDPD-92478: Timestamp processing in MetaStoreUtils
- 7.3.2
- Previously, Hive Metastore utilities used local time zone
settings to convert between timestamps and strings.
- This issue is now fixed by using UTC time zone and the
java.time.Instant class to process timestamps, which ensures that
time points are represented accurately regardless of local time zone rules.
Apache Jira: HIVE-28337
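
The release note for CDPD-92478 refers to java.time.Instant. The Python sketch below is only an analogue of the same idea, using datetime with an explicit UTC time zone. The format string and function names are assumptions and do not come from MetaStoreUtils.

```python
from datetime import datetime, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format, for illustration only


def seconds_to_string(epoch_seconds: int) -> str:
    """Render an epoch value in UTC, independent of the host time zone."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(FORMAT)


def string_to_seconds(text: str) -> int:
    """Parse a UTC timestamp string back to epoch seconds."""
    parsed = datetime.strptime(text, FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())


print(seconds_to_string(0))                                 # 1970-01-01 00:00:00
print(string_to_seconds(seconds_to_string(1_700_000_000)))  # 1700000000
```

Because both directions pin the zone to UTC, the round trip gives the same value on any host.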
- CDPD-91415: WebHCat and Python script compatibility with Python 3
- 7.3.2
- Previously, hcat.py and various Python
scripts used in q files contained syntax that was incompatible with Python 3.
- This issue is now resolved by updating the Python scripts to use
Python 3-compatible syntax.
Apache Jira: HIVE-25817
- CDPD-90670: Incorrect results for queries with multiple lateral view operations
- 7.3.2
- Previously, queries that used two or more LATERAL
VIEW explode operations along with a WHERE clause
returned incorrect results when Cost-Based Optimization (CBO) was enabled.
- This issue is now resolved by updating the logic to correctly
identify separate table aliases for lateral view columns, ensuring that filters are
applied accurately.
Apache Jira: HIVE-29084
- CDPD-90303: Incorrect results from a CASE expression
- 7.3.2
- A query that used a CASE expression to conditionally return
values produced an incorrect result. The query plan incorrectly folded the CASE
statement into a COALESCE function, which led to a logic error that filtered
out some of the expected results.
- This issue is addressed by adding a stricter check when
converting CASE expressions into COALESCE during query optimization.
Apache Jira: HIVE-24902
- CDPD-89462: Performance degradation for wide tables in DirectSqlUpdatePart
- 7.3.2
- Previously, updating or inserting partition statistics for
tables with a high number of columns and partitions was slow.
- This issue is now resolved by improving the hashing logic to
ensure faster data retrieval and insertion, even for tables with thousands of columns
and partitions.
Apache Jira: HIVE-29165
- CDPD-88987: Improved performance for adding columns to partitioned tables
- 7.3.2
- Previously, metadata operations for adding columns to tables
with a high number of partitions and columns were slow because the system utilized a
less optimized implementation for partition updates.
- This issue is addressed by implementing a more efficient batch
processing method for partition updates, which utilizes optimized metadata queries to
improve performance for tables with many partitions.
Apache Jira: HIVE-28956
- CDPD-88981: Performance degradation during column addition with cascade
- 7.3.2
- Previously, adding columns to a table using the
CASCADE command resulted in slower performance after optimizations
for metadata storage were enabled.
- This issue is now resolved. The fix includes an optimized method
to reuse column descriptors across partitions, which restores performance levels during
the column addition process.
Apache Jira: HIVE-29042
- CDPD-88166: Query failure during JDBC filter optimization
- 7.3.2
- Previously, certain queries failed with a class-cast error
during the optimization phase.
- This issue is now resolved by updating the query optimization
rules to ensure that relational operators are correctly identified and processed.
Apache Jira: HIVE-25356
- CDPD-87266: Query failure during Tez execution
- 7.3.2
- Previously, Hive queries failed during execution after an
upgrade. This resulted in a vertex failure and prevented queries from completing
successfully.
- This issue is now resolved by updating the execution engine to
correctly instantiate internal query split generators during the initialization
process.
- CDPD-84149: MariaDB connector recognition failure
- 7.3.2
- Previously, Hive failed to recognize the MariaDB connector
even when the driver was present.
- This issue is now fixed.
- CDPD-83461: Query failure when using stack function with union operations
- 7.3.2
- Previously, queries utilizing the STACK
function in combination with UNION operations failed with an internal
error during the compilation phase.
- This issue is now resolved by updating the query optimization
logic to correctly handle the stack function during union operations, preventing the
internal processing error.
Apache Jira: HIVE-29029
- CDPD-83334: Improving performance for alter partition operations
- 7.3.2
- When altering partitions, the system used Java Data Objects
(JDO) updates, which required fetching all fields of old partitions and produced
redundant queries, leading to slower performance.
- This issue is resolved by implementing direct SQL for altering
partitions.
Apache Jira: HIVE-27530
- Direct SQL failure during partition alterations
- 7.3.2
- Previously, direct SQL for partition alterations failed in
certain database environments due to Character Large Object (CLOB) casting errors and
missing boolean type conversion checks.
- This issue is resolved by updating the direct SQL logic to
handle CLOB types correctly and ensuring proper boolean type conversions during batch
updates.
Apache Jira: HIVE-28271
- CDPD-80146: Group by alias query failures
- 7.3.2
- Previously, the hive.runtime.dialect.enable property was enabled by default, which caused the hive.groupby.position.alias property to be ignored.
- This issue is resolved by setting the hive.runtime.dialect.enable property to false by default. Hive now correctly respects the hive.groupby.position.alias configuration.
- CDPD-79144: Incorrect schema version in Hive schema initialization script for MySQL
- 7.3.2
- Previously, cluster creation with 7.3.1 failed due to an incorrect
schema version in the
CDH_VERSION table.
- The issue was addressed by correcting the schema version in the Hive schema initialization script, ensuring successful cluster creation.
- CDPD-78337: Merge task not invoked for external CTAS queries on object stores
- 7.3.2
- Previously, the merge task was not invoked for external
Create Table As Select (CTAS) queries when using S3 or other object
stores.
- This issue is now resolved by ensuring the merge task is correctly invoked after optimization for external CTAS queries.
Apache Jira: HIVE-27536
- CDPD-78329: HiveServer2 runs out of memory with multiple parallel queries with fetch task
- 7.3.2
- Previously, HiveServer2 could run out of memory when multiple parallel queries used fetch task caching (hive.fetch.task.caching=true). This caused queries to fail and HiveServer2 to crash.
- The issue was addressed by reducing the default value of hive.fetch.task.conversion.threshold from 1 GB to 200 MB, preventing excessive memory usage and improving stability.
- CDPD-77869: Iceberg table data written to HDFS instead of S3 in RAZ-enabled clusters
- 7.3.2
- Previously, when you configured a cluster with Ranger Authorization Service (RAZ) and updated configurations to use S3, data for Iceberg tables was unexpectedly written to the HDFS external table location. This occurred because Iceberg tables, which are treated as external tables, defaulted to the database LOCATION property that still pointed to HDFS if the database was created prior to the S3 switch.
- This issue is now fixed by ensuring that external tables correctly align with the intended S3 storage paths. You can now update the database location to point to the S3 bucket to ensure all future external tables default to the cloud storage.
- CDPD-75665: Importing a table generates a DDL with an
incorrect location
- 7.3.2
- Previously, when you created a table using the
IMPORT command, the table's partitions could point to an
incorrect location. For example, an imported external table could have its partitions
located under the managed warehouse directory. This violates the separation between
metastore.warehouse.external.dir and metastore.warehouse.dir, which are intended to host
different types of tables.
Apache Jira: HIVE-28580
- Hive query execution failure due to AM
container exit on lost node with Exit code -100
- 7.3.2
- Hive query failed when ApplicationMaster container was lost
- Previously, when running a Hive query, a failed ApplicationMaster
(AM) container did not trigger a DAG retry and caused the query execution to fail if the
failure message included diagnostic information with a line break.
- This issue is now resolved by automatically re-executing the DAG
if the AM fails.
Apache Jira: HIVE-28093
- CDPD-74539: MariaDB falls back to MySQL in Hive
- 7.3.2
- Hive downstream had errors in supporting MariaDB.
- The issue was addressed by making MariaDB automatically fall back to MySQL.
- CDPD-66731: Hive Metastore query failure during Zero Downtime Upgrade
- 7.3.2
- Previously, during a Zero Downtime Upgrade (ZDU) from version 7.2.17 to 7.2.18, long-running queries such as INSERT INTO statements failed with a
MetaException.
- This issue is now fixed by ensuring that transaction blocks are correctly managed during the statistics update task, preventing the "current transaction is aborted" error.
- CDPD-60770: Passwords with special characters fail to connect
with Beeline
- 7.3.2
- When you used a password containing special characters like #,
^, or ; in a JDBC URL for a Beeline connection, the connection failed with a 401
error. This happened because Beeline did not correctly interpret these special
characters in the password.
- This issue is resolved by introducing a new method to reparse
the password from the original JDBC URL, allowing Beeline to correctly handle and
authenticate passwords containing special characters.
Apache Jira: HIVE-28805
- CDPD-58428: Bucket Map Join hangs when source vertex parallelism changes
- 7.3.2
- Previously, a Bucket Map Join could hang if the parallelism of a source vertex was modified by automated reducer parallelism.
- This issue is now fixed by disabling automated reducer parallelism for vertices that serve as a source for a Bucket Map Join.
Apache Jira: HIVE-27078
- CDPD-58428: Incorrect results in map-side Sort-Merge Bucket
Join with different bucket sizes
- 7.3.2
- Previously, map-side Sort-Merge Bucket (SMB) Joins returned incorrect results when joining two tables with different bucket counts (for example, joining a table with two buckets to a table with three buckets).
- This issue is now fixed by implementing a new routing
algorithm that ensures bucket N of a small table is correctly mapped to bucket M of a
large table based on the greatest common divisor of their bucket sizes.
Apache Jira: HIVE-27357
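
The role of the greatest common divisor in CDPD-58428 can be illustrated with a short Python sketch. It assumes a non-negative hash and a bucket id equal to the hash modulo the bucket count. The function names are hypothetical, and the actual routing algorithm in HIVE-27357 may differ. Under that assumption, a bucket of the small table and a bucket of the large table can hold the same key only if their ids agree modulo the GCD of the two bucket counts.

```python
from math import gcd


def can_share_keys(small_bucket: int, small_count: int,
                   big_bucket: int, big_count: int) -> bool:
    """True if some hash value lands in both buckets.

    A hash h goes to bucket h % count, so both bucket ids must agree
    modulo gcd(small_count, big_count).
    """
    g = gcd(small_count, big_count)
    return small_bucket % g == big_bucket % g


def brute_force(small_bucket: int, small_count: int,
                big_bucket: int, big_count: int) -> bool:
    """Check the same property by enumerating a full period of hash values."""
    return any(h % small_count == small_bucket and h % big_count == big_bucket
               for h in range(small_count * big_count))


# Tables with 4 and 6 buckets: the GCD is 2, so only same-parity buckets pair up.
pairs = [(n, m) for n in range(4) for m in range(6) if can_share_keys(n, 4, m, 6)]
print(len(pairs))  # 12
```

With coprime bucket counts such as two and three, the GCD is 1 and every bucket pair has to be considered.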
- CDPD-50060: Configurable filter for partition metadata properties in Hive Metastore
- 7.3.2
- Previously, Hive Metastore (HMS) API calls failed with a
TTransportException (MaxMessageSize reached) when processing tables with large partition metadata.
- This issue is now fixed by providing a configurable filter that excludes unnecessary properties from
listPartitions API responses. This change reduces the metadata payload size, prevents connection timeouts, and improves the performance of metadata operations.
Apache Jira: HIVE-27114
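
As a rough sketch of the kind of filtering described in CDPD-50060, the Python below drops partition parameters whose keys match an exclude pattern before the response is built. The function name, the use of regular expressions, and the sample parameter keys are assumptions for illustration. The real configuration property and its pattern syntax are not shown here.

```python
import re
from typing import Dict, List


def filter_params(params: Dict[str, str], exclude_patterns: List[str]) -> Dict[str, str]:
    """Drop every parameter whose key fully matches one of the exclude patterns."""
    compiled = [re.compile(p) for p in exclude_patterns]
    return {k: v for k, v in params.items()
            if not any(rx.fullmatch(k) for rx in compiled)}


# Hypothetical partition parameters; the large entry stands in for bulky metadata.
partition_params = {
    "numRows": "1000",
    "totalSize": "52428800",
    "impala_intermediate_stats_chunk0": "x" * 10_000,
}
slim = filter_params(partition_params, [r"impala_intermediate_stats_.*"])
print(sorted(slim))  # ['numRows', 'totalSize']
```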
- CDPD-44551: Avro table import or download fails with ODBC driver due to missing property
- 7.3.2
- Previously, the absence of
metastore.storage.schema.reader.impl caused Avro table import or
download failures in Cloudera Runtime 7.1.7 when using
the ODBC driver.
- The issue was addressed by setting metastore.storage.schema.reader.impl to
org.apache.hadoop.hive.metastore.SerDeStorageSchemaReader by default.
Apache Jira: HIVE-26952
- CDPD-92208: Query failure when selecting data from views with bracketed definitions
- 7.3.2
- Previously, a SELECT query against a view failed with a
SemanticException if the view was created with specific column names and a definition enclosed in brackets.
- This issue is now fixed by ensuring that the compiler does not add extra brackets if the view definition is already enclosed.
Apache Jira: HIVE-26493
- CDPD-83530: Task commits were allowed despite an exception being
thrown in the Tez processor
- 7.3.2
- Previously, a communication failure between the coordinator and executor
caused a running task to terminate, and
ReduceRecordProcessor.init() threw a
java.lang.InterruptedException. Despite this exception, the process
still allowed the task to be committed and generated a commit manifest.
- This issue is now resolved. The fix ensures that outputs are not committed if an exception is
thrown in the Tez processor.
Apache Jira: HIVE-28962
- CDPD-89414: Incorrect results for window functions with IGNORE
NULLS
- 7.3.2
- When you used the FIRST_VALUE and LAST_VALUE window functions
with the IGNORE NULLS clause while vectorization was enabled, the results were
incorrect. This occurred because the vectorized execution engine did not properly handle
the IGNORE NULLS setting for these functions.
- This issue is addressed by modifying the vectorized processing
for FIRST_VALUE and LAST_VALUE to correctly respect the IGNORE NULLS clause, ensuring
the same results are produced whether vectorization is enabled or disabled.
Apache Jira: HIVE-29122
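
The expected semantics behind CDPD-89414 can be written down as a small Python reference model. It is a sketch of what IGNORE NULLS means for a single window frame, with None standing in for NULL. It is not the vectorized Hive implementation, and the function names are hypothetical.

```python
from typing import List, Optional, Sequence


def first_value(frame: Sequence[Optional[int]], ignore_nulls: bool) -> Optional[int]:
    """First value of the frame; with ignore_nulls, the first non-null one."""
    for value in frame:
        if value is not None or not ignore_nulls:
            return value
    return None  # empty frame, or only NULLs with ignore_nulls set


def last_value(frame: Sequence[Optional[int]], ignore_nulls: bool) -> Optional[int]:
    """Last value of the frame; with ignore_nulls, the last non-null one."""
    return first_value(list(reversed(frame)), ignore_nulls)


frame: List[Optional[int]] = [None, 3, 7, None]
print(first_value(frame, ignore_nulls=False))  # None
print(first_value(frame, ignore_nulls=True))   # 3
print(last_value(frame, ignore_nulls=True))    # 7
```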
- CDPD-85600: Select queries with ORDER BY fail due to compression
error
- 7.3.2
- When you ran a Hive SELECT query with an
ORDER BY clause, it failed with a
java.io.IOException and java.lang.UnsatisfiedLinkError
related to the zlib decompressor.
- The issue was addressed by ensuring the zlib native library is
correctly loaded.
Apache Jira: HIVE-28805
- CDPD-90301: Stack overflow error from queries with OR and MIN
filters
- 7.3.2
- Previously, queries caused a stack overflow error when they contained
multiple OR conditions on the same expression, such as
MINUTE(date_) = 2 OR
MINUTE(date_) = 10.
- This issue is addressed by modifying the
HivePointLookupOptimizerRule to keep the original order of expressions and to check if a
merge can be performed before creating a new expression.
Apache Jira: HIVE-29208
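
The idea in CDPD-90301 of preserving expression order and merging only when a merge is possible can be sketched in Python. This is a simplified model, not the Calcite-based HivePointLookupOptimizerRule. It assumes each disjunct is an equality between an expression and a literal, and that the goal is to collapse repeated equalities on the same expression into one IN list. The function name and the tuple representation are hypothetical.

```python
from typing import Dict, List, Tuple

# A disjunct is modelled as (expression_text, literal), e.g. ("MINUTE(date_)", 2).
Disjunct = Tuple[str, int]


def merge_or(disjuncts: List[Disjunct]) -> List[str]:
    """Group equality disjuncts by expression, preserving first-seen order."""
    grouped: Dict[str, List[int]] = {}
    for expr, literal in disjuncts:          # flat loop, no recursion
        values = grouped.setdefault(expr, [])
        if literal not in values:
            values.append(literal)
    result = []
    for expr, values in grouped.items():     # dicts keep insertion order
        if len(values) > 1:                  # build an IN list only when merging helps
            result.append(f"{expr} IN ({', '.join(map(str, values))})")
        else:
            result.append(f"{expr} = {values[0]}")
    return result


print(merge_or([("MINUTE(date_)", 2), ("MINUTE(date_)", 10)]))
# ['MINUTE(date_) IN (2, 10)']
```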
- DWX-20754: Invalid column reference in lateral view queries
- 7.3.2
- Previously, the virtual column
BLOCK__OFFSET__INSIDE__FILE
could not be referenced correctly in queries using lateral views, resulting in the
error: FAILED: SemanticException Line 0:-1 Invalid column reference 'BLOCK__OFFSET__INSIDE__FILE'.
- This issue is now resolved.
Apache Jira: HIVE-28938