Improving performance in Schema Registry
Adding Knox as a load balancer can increase the performance of Schema Registry.
You can increase the performance of Schema Registry by adding Apache Knox, the Cloudera load balancer, to your Schema Registry configuration. This increase in performance offsets the minor decrease in performance caused by the lack of server-side caching.
By default, Schema Registry does not have server-side caching. Each time a client sends a query, the client reads the data directly from the database.
The lack of server-side caching creates a minor performance impact. However, it enables Schema Registry to be stateless and be distributed over multiple hosts. All hosts use the same database. This enables a client to insert a schema on host A and query the same schema on host B.
You can store JAR files similarly in HDFS and retrieve a file even if one of the hosts is down.
To make Schema Registry highly available in this setup, include a load balancer on top of the services. Clients communicate through the load balancer. The load balancer then determines which host to send the requests to. If one host is down, the request is sent to another host. In CDP, the load balancer is Apache Knox.

Integrating Schema Registry with Knox
Follow these steps to add Knox load balancer to your Schema Registry configuration:
- In Cloudera Manager, navigate to .
- Add the following property to the
Knox Gateway Advanced Configuration Snippet (Safety Valve) for conf/cdp-resources.xml
configuration:```xml <name>providerConfigs:sso</name> <value> <role>ha#ha.name=HaProvider#ha.param.SCHEMA-REGISTRY= enabled=true;maxFailoverAttempts=3;failoverSleep=1000; maxRetryAttempts=300;retrySleep=1000 </value> ```
- Click Save Changes, and restart the Knox service.
- Check if the changes are reflected in the cdp-proxy.xml file.
- To verify if the Knox load balancer works fine, disable one Schema Registry host and check if you can still access Schema Registry through Knox.