When you create a Spark DataFrame, one or more columns can have nullable = false in the schema. This means those columns cannot hold null values. When a null value is assigned to such a column, Spark throws an exception like the following:

    2/7/2023 3:16:00 ERROR Executor: Exception in task 0.0 in stage 4.0 (TID 6)
    java.lang.RuntimeException: Error while encoding: java.lang.RuntimeException:
    The 0th field 'colA' of input row cannot be null.

To avoid this error, we need to update the DataFrame's schema to set nullable = true. One way to do that is a when/otherwise clause:

    .withColumn("col_name", when(col("col_name").isNotNull, col("col_name")).otherwise(lit(null)))

Because the otherwise branch can produce null, Spark marks the resulting column as nullable. Another way is to create a custom method, called on a DataFrame, that returns a new DataFrame with a modified schema:

    import org.apache.spark
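One possible shape for such a custom method is sketched below. This is my own illustration, not code from the original post: the object and method names (`NullableHelper`, `setAllNullable`) are hypothetical, and the sketch assumes Spark 2.x or later, where `DataFrame.sparkSession` and the `createDataFrame(rdd, schema)` overload are available. It rebuilds the schema with every top-level field marked nullable, then re-creates the DataFrame from the same rows.

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.types.{StructField, StructType}

// Hypothetical helper object; names are illustrative, not from the original post.
object NullableHelper {

  // Returns a new DataFrame whose schema marks every top-level column
  // nullable (or non-nullable, if nullable = false is passed).
  // The row data is unchanged; only the schema metadata differs.
  // Note: nested StructType fields are not rewritten by this sketch.
  def setAllNullable(df: DataFrame, nullable: Boolean = true): DataFrame = {
    val newSchema = StructType(df.schema.map {
      case StructField(name, dataType, _, metadata) =>
        StructField(name, dataType, nullable, metadata)
    })
    // Rebuild the DataFrame from the same rows under the relaxed schema.
    df.sparkSession.createDataFrame(df.rdd, newSchema)
  }
}
```

Usage would look like `val relaxed = NullableHelper.setAllNullable(df)`; afterwards `relaxed.schema.fields.forall(_.nullable)` holds, so assigning null to any column no longer trips the encoder check. Unlike the when/otherwise trick, this changes all columns in one call without touching the data.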