Caused by: org.apache.spark.sql.AnalysisException: Cannot overwrite a path that is also being read from.;
at org.apache.spark.sql.execution.command.DDLUtils$.verifyNotReadPath(ddl.scala:906)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:192)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis$$anonfun$apply$1.applyOrElse(DataSourceStrategy.scala:134)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$2.apply(TreeNode.scala:267)
at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:70)
at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:266)
at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:256)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis.apply(DataSourceStrategy.scala:134)
at org.apache.spark.sql.execution.datasources.DataSourceAnalysis.apply(DataSourceStrategy.scala
Reason - With the HDP upgrade to 2.6.3, Spark was upgraded from 2.2 to 2.3, which results in the exception above. The error shows up in jobs that:
- read from and write to the same table or path
- implement SCD (Slowly Changing Dimension) logic
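For example, a job that reads a Hive ORC table and overwrites that same table (and therefore the same underlying path) in one pass now fails Spark's read-path check. A minimal sketch, assuming a hypothetical table db.events and an existing Hive-enabled SparkSession:

    // Read and overwrite the same table in one job - this triggers the
    // "Cannot overwrite a path that is also being read from" check in Spark 2.3.
    import org.apache.spark.sql.functions.col

    // spark: an existing SparkSession with Hive support
    val df = spark.table("db.events")          // read path: db.events
    df.filter(col("active") === true)
      .write
      .mode("overwrite")
      .insertInto("db.events")                 // write path == read path -> AnalysisException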
Solution -
- Set --conf "spark.sql.hive.convertMetastoreOrc=false", or
- update the job so that it first writes the data to a temporary (staging) table, then reads from that staging table and inserts the result into the final table (see the sketch below).
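A minimal sketch of the staging-table approach, assuming hypothetical tables db.events (final) and db.events_stage (staging); the actual SCD/transformation logic would go where the filter is:

    import org.apache.spark.sql.SparkSession
    import org.apache.spark.sql.functions.col

    val spark = SparkSession.builder()
      .appName("scd-staging-workaround")
      .enableHiveSupport()
      .getOrCreate()

    // Apply the transformation while reading from the source table.
    val updated = spark.table("db.events")
      .filter(col("event_date") >= "2018-01-01")

    // Step 1: persist the result to a temporary staging table.
    updated.write.mode("overwrite").saveAsTable("db.events_stage")

    // Step 2: re-read from the staging table and overwrite the final table.
    // The read path is now the staging table, so the overwrite no longer
    // conflicts with the path being read.
    spark.table("db.events_stage")
      .write
      .mode("overwrite")
      .insertInto("db.events")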