Advanced Hive configuration parameters for Hive ACID table replication policies

You can configure the following additional Hive service configuration parameters as necessary.

  1. Set the event time-to-live (TTL) parameters to 7 days in the Hive-on-Tez service.
    1. Go to the Cloudera Manager > Clusters > Hive-on-Tez service > Configuration page.
    2. Search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml property.
    3. Enter the hive.repl.event.db.listener.timetolive parameter, and set its value to 7. The unit for the parameter is days.
    4. Enter the hive.metastore.event.db.listener.timetolive parameter, and set its value to 7. The unit for the parameter is days.
      You can adjust the event TTL through the hive.metastore.event.db.listener.timetolive parameter based on the target cluster size, data ingestion rate, and other settings in the target cluster.
    5. Enter the hive.repl.cm.retain parameter, and set its value to 7d (seven days).

      You can adjust the hive.repl.cm.retain parameter depending on the available HDFS storage, data ingestion rate, data deletion rate, and other settings on the target cluster.

  2. Set the event time-to-live (TTL) parameters to 7 days in the Hive service.
    1. Go to the Cloudera Manager > Clusters > Hive service > Configuration page.
    2. Search for the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml property.
    3. Enter the hive.metastore.event.db.listener.timetolive parameter, and set its value to 7. The unit for the parameter is days.
    4. Enter the hive.repl.event.db.listener.timetolive parameter, and set its value to 7. The unit for the parameter is days.
    5. Enter the hive.repl.cm.retain parameter, and set its value to 7d (seven days).
  3. Configure the metastore.scheduled.queries.execution.timeout parameter to 600 seconds.
  4. Configure the metastore.housekeeping.threads.on parameter to true.
  5. Restart the services after you configure the parameters.
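
The safety valve entries from steps 1 and 2 can be sketched as hive-site.xml property elements. The following Python snippet is only an illustration: the names HIVE_SITE_OVERRIDES, OTHER_SETTINGS, and render_safety_valve are hypothetical, the values are copied from the steps above, and the steps do not state where the two metastore.* parameters are entered, so they are listed as plain name/value pairs rather than as hive-site.xml entries.

```python
import xml.etree.ElementTree as ET

# Values copied from steps 1 and 2; adjust them for your cluster.
HIVE_SITE_OVERRIDES = {
    "hive.repl.event.db.listener.timetolive": "7",
    "hive.metastore.event.db.listener.timetolive": "7",
    "hive.repl.cm.retain": "7d",
}

# Values from steps 3 and 4. The steps do not name the configuration
# snippet that holds them, so they are kept separate here (assumption).
OTHER_SETTINGS = {
    "metastore.scheduled.queries.execution.timeout": "600",
    "metastore.housekeeping.threads.on": "true",
}


def render_safety_valve(overrides):
    """Render name/value pairs as hive-site.xml <property> elements."""
    blocks = []
    for name, value in overrides.items():
        prop = ET.Element("property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
        blocks.append(ET.tostring(prop, encoding="unicode"))
    return "\n".join(blocks)


snippet = render_safety_valve(HIVE_SITE_OVERRIDES)
print(snippet)
for name, value in OTHER_SETTINGS.items():
    print(f"{name} = {value}")
```

The printed property elements are one possible way to express the same name/value pairs that the steps enter in the Hive Service Advanced Configuration Snippet (Safety Valve) for hive-site.xml. Keeping the values in one dictionary makes it easier to apply identical settings to both the Hive-on-Tez and Hive services.
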
  • After the hive.repl.cm.retain period expires, the files are purged from the cmrootdir. If the files are purged before the next replication run, the FAILED: Execution Error, return code 20016 from org.apache.hadoop.hive.ql.exec.ReplCopyTask. File is missing from both source and cm path message appears.

    To recover from this state, re-bootstrap the replication policy. For more information, see How to re-bootstrap Hive ACID replication?.

  • After an event’s TTL expires, the event is removed from the metastore. In this scenario, the replication policy job shows a FAILED_ADMIN state, and the Notification events are missing in the meta store error appears.

    To recover from this state, re-bootstrap the database on the source cluster. For more information, see Why are notification events missing in the metastore?.
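
The two failure states above can be told apart by their message text. The following Python sketch maps a failure message to the recovery step described above. It is a hedged illustration only: the function name suggest_recovery is hypothetical, and the matching relies on the message fragments quoted in this section, which might be worded differently in your logs.

```python
# Message fragments quoted in this section, mapped to the documented recovery step.
RECOVERY_STEPS = {
    "File is missing from both source and cm path":
        "re-bootstrap the replication policy",
    "Notification events are missing in the meta store":
        "re-bootstrap the database on the source cluster",
}


def suggest_recovery(error_text):
    """Return the documented recovery step for a failure message, or None."""
    for fragment, step in RECOVERY_STEPS.items():
        if fragment in error_text:
            return step
    return None  # Unknown message: no recovery step is described here.


message = (
    "FAILED: Execution Error, return code 20016 from "
    "org.apache.hadoop.hive.ql.exec.ReplCopyTask. "
    "File is missing from both source and cm path"
)
print(suggest_recovery(message))
```

A message that matches neither fragment returns None, so a caller can fall back to manual investigation instead of guessing a recovery action.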