Skip to main content

Posts

Showing posts from May, 2021

Spark-JDBC connection with Oracle Fails - java.sql.SQLSyntaxErrorException: ORA-00903: invalid table name

  While connecting Spark with Oracle JDBC, one may observe exception like below -  spark.read.format("jdbc"). option("url", "jdbc:oracle:thin:@//oraclehost:1521/servicename"). option("dbtable", "mytable"). option("user", "myuser").option("driver", "oracle.jdbc.driver.OracleDriver") option("password", "mypassword"). load().write.parquet("/data/out") java.sql.SQLSyntaxErrorException: ORA-00903: invalid table name at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:447) at oracle.jdbc.driver.T4CTTIoer.processError(T4CTTIoer.java:396) at oracle.jdbc.driver.T4C8Oall.processError(T4C8Oall.java:951) at oracle.jdbc.driver.T4CTTIfun.receive(T4CTTIfun.java:513) at oracle.jdbc.driver.T4CTTIfun.doRPC(T4CTTIfun.java:227) at oracle.jdbc.driver.T4C8Oall.doOALL(T4C8Oall.java:531) at oracle.jdbc.driver.T4CPreparedStatement.doOall8(T4CPreparedStatement.java:208) ...

Hadoop Jobs - javax.security.sasl.SaslException: GSS initiate failed

  If you see below exception while running Big Data Jobs - Spark, MR, Tez, etc. WARNING: YARN_OPTS has been replaced by HADOOP_OPTS. Using value of YARN_OPTS. javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt) javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)] Solution - The simplest way is to generate kerberos Keytab & initialize same in a user session before running job -  Ex ...

TEZ - How to enable Fetch Task instead of MapReduce Job for simple query in Hive

  Certain simple Hive queries can utilize fetch task, which can avoid the overhead of starting MapReduce job. hive.fetch.task.conversion This parameter controls which kind of simple query can be converted to a single fetch task. Value "none" is added in Hive 0.14 to disable this feature Value "minimal" means SELECT *, FILTER on partition columns (WHERE and HAVING clauses), LIMIT only. hive.fetch.task.conversion.threshold This parameter controls input threshold (in bytes) for applying hive.fetch.task.conversion.