Skip to main content

Posts

Showing posts from July, 2021

Spark Job Failure reading empty gz file, Exception- java.io.EOFException: Unexpected end of input stream

  Spark Job fails to read data from Table which has empty/ corrupt / 0 size .GZ Files with exception as below. Exception -  Caused by: java.io.EOFException: Unexpected end of input stream at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:165) at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105) at java.io.InputStream.read(InputStream.java:101) at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:182) at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:218) Solution - Remove such 0 size GZ files, or Set following property - --conf spark.sql.files.ignoreCorruptFiles=true