To provide high availability and load balancing for HiveServer2, Hive offers dynamic service discovery, in which multiple HiveServer2 instances register themselves with ZooKeeper. Instead of connecting to a specific HiveServer2 directly, clients connect to ZooKeeper, look up the registered instances, and are routed to a randomly selected one.
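For example:
The command below connects directly to the HiveServer2 instance on machineA:
- beeline -u "jdbc:hive2://machineA:10000"
The command below connects through the ZooKeeper ensemble instead and is routed to one of the instances registered under the namespace hiveserver2-mob-batch:
- beeline -u "jdbc:hive2://machineA:2181,machineB:2181,machineC:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2-mob-batch?tez.queue.name=myyarnqueue"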
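The discovery URL above follows a fixed pattern: the ZooKeeper quorum, the discovery mode and namespace as session variables, and any Hive configuration after the "?". A minimal sketch of assembling it in a shell script is shown here; build_discovery_url is a hypothetical helper introduced for illustration and is not part of Hive or Beeline.

```shell
# Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.
# build_discovery_url is a hypothetical helper, not part of Hive or Beeline.
#   $1 = ZooKeeper quorum (comma-separated host:port list)
#   $2 = ZooKeeper namespace the HiveServer2 instances register under
#   $3 = optional Hive configuration list, placed after "?"
build_discovery_url() (
  quorum=$1
  namespace=$2
  hive_conf=${3:-}
  url="jdbc:hive2://${quorum}/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=${namespace}"
  if [ -n "$hive_conf" ]; then
    url="${url}?${hive_conf}"
  fi
  printf '%s\n' "$url"
)

url=$(build_discovery_url \
  "machineA:2181,machineB:2181,machineC:2181" \
  "hiveserver2-mob-batch" \
  "tez.queue.name=myyarnqueue")
echo "$url"
# To connect, pass the result to Beeline: beeline -u "$url"
```

Keeping the quorum and namespace in variables makes it easier to point the same script at a different cluster or namespace.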
- Open the ZooKeeper command line interface:
- zookeeper-client
- Connect to the ZooKeeper servers:
- connect machineA:2181,machineB:2181,machineC:2181
- Create the znode for the namespace:
- create /hiveserver2-mob-batch
- Manually register each HiveServer2 (HS2) instance with ZooKeeper under the namespace:
- create /hiveserver2-mob-batch/serverUri=machineA:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000082
- create /hiveserver2-mob-batch/serverUri=machineB:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000081
- create /hiveserver2-mob-batch/serverUri=machineC:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000051
- Verify the namespace by listing its children:
- ls /hiveserver2-mob-batch
- [serverUri=machineC:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000051, serverUri=machineA:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000082, serverUri=machineB:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000081]
- To deregister a single instance manually, delete its znode:
- delete /hiveserver2-mob-batch/serverUri=machineC:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000051
- Alternatively, deregister all HiveServer2 instances of a given version from ZooKeeper:
- hive --service hiveserver2 --deregister <version_number>
Configuration Requirements
1. Set hive.zookeeper.quorum to the ZooKeeper ensemble (a comma-separated list of the host:port pairs of the ZooKeeper servers in the cluster).
2. Set hive.zookeeper.session.timeout so that the session between HiveServer2’s ZooKeeper client and ZooKeeper is closed if no heartbeat is received within the timeout period.
3. Set hive.server2.support.dynamic.service.discovery to true
4. Set hive.server2.zookeeper.namespace to the value that you want to use as the root namespace on ZooKeeper. The default value is hiveserver2.
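These properties normally live in hive-site.xml. As a rough sketch, the same four settings can also be passed on the command line with --hiveconf when starting HiveServer2. The helper below only prints the resulting command for review; hs2_start_command is a hypothetical name, and the timeout of 120000 (milliseconds) is an illustrative value, not a recommendation.

```shell
# Print a HiveServer2 start command carrying the four discovery settings.
# hs2_start_command is a hypothetical helper; it does not start anything.
#   $1 = ZooKeeper quorum, $2 = namespace, $3 = session timeout in milliseconds
hs2_start_command() (
  quorum=$1
  namespace=$2
  timeout_ms=$3
  printf 'hive --service hiveserver2'
  # printf repeats the format once per argument, emitting one --hiveconf each.
  printf ' --hiveconf %s' \
    "hive.zookeeper.quorum=${quorum}" \
    "hive.zookeeper.session.timeout=${timeout_ms}" \
    "hive.server2.support.dynamic.service.discovery=true" \
    "hive.server2.zookeeper.namespace=${namespace}"
  printf '\n'
)

# The timeout below is an assumed example value.
cmd=$(hs2_start_command \
  "machineA:2181,machineB:2181,machineC:2181" \
  "hiveserver2-mob-batch" \
  120000)
echo "$cmd"
```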
To pick a registered HiveServer2 instance yourself (for example, from a script), follow these steps:
- Open zookeeper-client
- connect machineA:2181,machineB:2181,machineC:2181
- ls /hiveserver2-mob-batch
- Parse the HS2 URLs, which appear between serverUri= and ;
- Shuffle the URLs randomly - shuf
- Pick the first URL from the shuffled list - head -1
- Concatenate the strings to form the JDBC URL - jdbc:hive2:// ...
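The parsing and selection steps above can be sketched as a short shell pipeline. This is only an illustration: it assumes the output of ls /hiveserver2-mob-batch has already been captured in a variable (the sample listing from earlier is used here), and parse_hs2_uris and pick_jdbc_url are hypothetical helper names.

```shell
# Print every host:port found between "serverUri=" and ";", one per line.
parse_hs2_uris() (
  grep -o 'serverUri=[^;]*' | cut -d= -f2
)

# Shuffle the parsed host:port values, take the first one, and wrap it in a
# direct JDBC URL. Returns non-zero if no serverUri entry is found.
pick_jdbc_url() (
  hostport=$(parse_hs2_uris | shuf | head -1)
  [ -n "$hostport" ] || return 1
  printf 'jdbc:hive2://%s\n' "$hostport"
)

# Sample listing, copied from the ls output shown earlier; in practice this
# would be captured from zookeeper-client.
znodes='[serverUri=machineC:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000051, serverUri=machineA:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000082, serverUri=machineB:10000;version=3.1.3000.7.1.2.0-96;sequence=0000000081]'

printf '%s\n' "$znodes" | pick_jdbc_url
```

Each run prints a direct URL for one of the three registered instances, which can then be passed to beeline -u.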