Before migrating from Navigator to Apache Atlas, review the migration paths. You must
extract, transform, and import the content from Navigator to Apache Atlas. After the
migration is completed, services start producing metadata for Atlas and audit information
for Apache Ranger.
There are two main paths that describe a Navigator-to-Atlas migration scenario:
Upgrading Cloudera Manager to Cloudera 7.0.0 and upgrading all your CDH
clusters to Cloudera Runtime. In this case, you can stop
Cloudera Navigator after migrating its content to Atlas.
Upgrading Cloudera Manager to Cloudera 7.0.0 but managing some or all of your
existing CDH clusters as CDH 5.x or 6.x. In this case, the Cloudera cluster running Cloudera Navigator
continues to extract metadata and audit information from existing CDH clusters
and runs Atlas and Ranger to support metadata and audit extraction from new or
potential new Cloudera Runtime clusters.
In both the scenarios, you shall complete the upgrade of Cloudera Manager first. While Cloudera Manager is being upgraded, Navigator pauses
collection of metadata and audit information from cluster activities. After the upgrade
is complete, Navigator processes the queued metadata and audit information.
In the timeline diagrams that follow, the blue color indicates steps and because you
trigger the steps manually, you can control their timing.
The migration of Navigator content to Atlas occurs during the upgrade from CDH to Cloudera. The migration involves three phases:
Extracting metadata from Navigator
The Atlas installation
includes a script (cnav.sh) that calls Navigator APIs to extract all technical
and business metadata from Navigator. The process takes about 4 minutes per one
million Navigator entities. The script compresses the result and writes it to
the local file system on the host where the Atlas server is installed. Plan for
about 100 MB for every one million Navigator entities; lower requirements for
larger numbers of entities.
Transforming the Navigator metadata into a form that Atlas can consume. Including
time and resources. The Atlas installation includes a script (nav2atlas.sh)
that converts the extracted content and again compresses it and writes it to the
local file system. This process takes about 1.5 minutes per million Navigator
entities. The script compresses the results and writes it to the local file system
on the host where the Atlas server is installed. Plan for about 100 to 150 MB for
every million Navigator entities; higher end of the range for larger numbers of
entities.
Importing the transformed metadata into Atlas.
After the Cloudera upgrade completes, Atlas starts in
"migration mode," where it waits to find the transformed data file and does not
collect metadata from cluster services. When the transformation is complete,
Atlas begins importing the content, creating equivalent Atlas entities for each
Navigator entity. This process takes about 35 minutes for a million Navigator
entities, counting only the entities that are migrated into Atlas.
To make sure you do not miss metadata for cluster operations, provide time after the Cloudera Manager upgrade and before the CDH upgrade for Navigator, to
process all the metadata produced by CDH service operations. See Navigator Extraction Timing for more
information.
You can start extracting metadata from Navigator as soon as the Cloudera parcel is deployed on the cluster. After Cloudera is started, Navigator no longer collects
metadata or audit information from the services on that cluster; instead services
produce metadata for Atlas and audit information for Ranger.
Migration from Navigator to Atlas can be run only in non-HA mode
Migration import works only with a single Atlas instance.
If Atlas has been set up in HA mode before migration, you must remove the additional
instances of Atlas, so that Atlas service has only one instance.
Later, start Atlas in the migration mode and complete the migration. Perform the
necessary checks to verify if the data has been imported correctly.
Restart Atlas in non-migration mode.
If you have Atlas setup in HA mode, retain only one instance and remove the
others.
Ensure that the ZIP files generated as an output from the Nav2Atlas conversion
are placed at the same location where the Atlas node is present.