Default Managed Tables
In CDP, managed tables are by default transactional tables with the insert_only
property. Be aware of the new default behavior for file system modifications on
managed tables in CDP, and of the methods for switching back to the old behavior.
New Default Behavior
- You can no longer perform file system modifications (adding or removing files) on a managed table in CDP. The directory structure of transactional tables differs from that of non-transactional tables, and any out-of-band files added to the table directory may or may not be picked up by Hive and Impala.
- The insert_only transactional tables cannot currently be altered in Impala. The ALTER TABLE statement on a transactional table currently displays an error.
- Impala does not currently support compaction on transactional tables. Use Hive to compact the tables.
- The SELECT, INSERT, INSERT OVERWRITE, and TRUNCATE statements are supported on the insert-only transactional tables.
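The supported statements listed above can be sketched as follows. This is a minimal illustration only: the table name sales_txn and its columns are hypothetical, and it assumes the default CDP behavior described in this section, where a newly created managed table is insert-only transactional. The final statement is Hive syntax and is meant to be run from Hive, since Impala does not perform compaction.

```sql
-- Impala: with the CDP defaults, this managed table is created as an
-- insert-only transactional table. Table and column names are hypothetical.
CREATE TABLE sales_txn (id INT, note STRING) STORED AS PARQUET;

-- Statements that this section lists as supported on insert-only tables.
INSERT INTO sales_txn VALUES (1, 'first'), (2, 'second');
SELECT id, note FROM sales_txn;
INSERT OVERWRITE TABLE sales_txn VALUES (3, 'replacement');
TRUNCATE TABLE sales_txn;

-- Hive (not Impala): request a major compaction of the table.
ALTER TABLE sales_txn COMPACT 'major';
```

Data goes in through INSERT statements rather than by copying files into the table directory, because out-of-band files may not be picked up.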
Steps to switch to the CDH behavior:
- If you do not want transactional tables, set the DEFAULT_TRANSACTIONAL_TYPE query option to NONE so that any newly created managed tables are not transactional by default.
- External tables do not drop the data files when the table is dropped. To purge the data along with the table when the table is dropped, add external.table.purge = true in the table properties. When external.table.purge is set to true, the data is removed when the DROP TABLE statement is executed.
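As a rough sketch, the two steps above might look like this in an Impala session. The table names legacy_style and staging_ext and their columns are hypothetical and used only for illustration. The query option and the table property are the ones named in this section.

```sql
-- Impala: make managed tables created later in this session non-transactional.
SET DEFAULT_TRANSACTIONAL_TYPE=NONE;
CREATE TABLE legacy_style (id INT, note STRING) STORED AS PARQUET;

-- External table (hypothetical name) that purges its data files on drop.
CREATE EXTERNAL TABLE staging_ext (id INT, note STRING)
STORED AS PARQUET
TBLPROPERTIES ('external.table.purge'='true');

-- With external.table.purge set to true, the data files are removed
-- together with the table definition.
DROP TABLE staging_ext;
```

The SET statement affects only the current session. Tables that already exist keep the transactional type they were created with.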
