org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;

Caused by: org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;

at org.apache.spark.sql.execution.command.DDLUtils$.verifyNotReadPath(ddl.scala:906)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:192)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:134)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis.apply(DataSourceStrategy.scala:134)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis.apply(DataSourceStrategy.scala

Reason - With HDP Upgrade to 2.6.3 , we upgraded from Spark 2.2 to Spark 2.3 which resulted in exception above. This error can be seen for jobs where-in job is -

reading and writing from same table or path
SCD Logic Jobs

Solution -

Set --conf "spark.sql.hive.convertMetastoreOrc=false"
or, update the job such that it writes data to a temporary table. Then reads from temporary table and insert it into final table.

QueryDB

Search This Blog

org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;

Comments

Post a Comment

Popular posts

Hive Parse JSON with Array Columns and Explode it in to Multiple rows.

Scala Spark building Jar leads java.lang.StackOverflowError

Read from a hive table and write back to it using spark sql

AWS EMR Spark – Much Larger Executors are Created than Requested