We have seen that Spark connects to the Hive Metastore, and that the connection sometimes takes too long because the Metastore is slow.
We have also seen that there is no automatic load balancing between Hive Metastores.
Solution -
Out of all the available Hive Metastore instances, we can test for a working one externally in a script, then pass its URL when invoking the Spark job, as follows -
spark-sql --conf "spark.hadoop.hive.metastore.uris=thrift://myMetaIp:9083" -e "show databases"
Note - The same --conf option with the spark.hadoop. prefix can also be used to set other Hive Metastore or Hadoop properties externally from Spark.
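The external check described above could be sketched roughly as follows. This is only a minimal sketch, not a definitive implementation: the function names probe_metastore and pick_metastore and the host names metaHost1 and metaHost2 are hypothetical, and it assumes bash and the GNU timeout command are available. It also only checks that the Thrift port accepts a TCP connection within 3 seconds, which does not prove the Metastore itself answers quickly.

```shell
#!/usr/bin/env bash

# Return 0 if a TCP connection to host:port opens within 3 seconds.
# Assumption: bash's /dev/tcp pseudo-device and GNU `timeout` are available.
probe_metastore() {
  local host="$1" port="$2"
  timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}

# Print the thrift URI of the first reachable candidate (given as host:port).
# Returns 1 if none of the candidates respond.
pick_metastore() {
  local candidate host port
  for candidate in "$@"; do
    host="${candidate%%:*}"   # text before the first colon
    port="${candidate##*:}"   # text after the last colon
    if probe_metastore "$host" "$port"; then
      echo "thrift://${host}:${port}"
      return 0
    fi
  done
  return 1
}

# Example usage (host names are placeholders):
#   uri=$(pick_metastore metaHost1:9083 metaHost2:9083) &&
#     spark-sql --conf "spark.hadoop.hive.metastore.uris=${uri}" -e "show databases"
```

Candidates are tried in the order given, so listing the preferred Metastore first keeps it as the default and the others act as fallbacks.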