Spark Job Failure Reading Empty .gz File: java.io.EOFException: Unexpected end of input stream
A Spark job fails to read data from a table that contains empty (0-size) or corrupt .gz files, with the exception shown below.
Exception -
Caused by: java.io.EOFException: Unexpected end of input stream
at org.apache.hadoop.io.compress.DecompressorStream.decompress(DecompressorStream.java:165)
at org.apache.hadoop.io.compress.DecompressorStream.read(DecompressorStream.java:105)
at java.io.InputStream.read(InputStream.java:101)
at org.apache.hadoop.util.LineReader.fillBuffer(LineReader.java:182)
at org.apache.hadoop.util.LineReader.readDefaultLine(LineReader.java:218)
Solution -
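- Remove the 0-size or corrupt .gz files from the table location, or
- Set the following property when submitting the job, so that Spark skips files it cannot read instead of failing -
- --conf spark.sql.files.ignoreCorruptFiles=true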
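For the first option, the offending files have to be found before they can be removed. The sketch below is one possible way to do that in plain Python, assuming the table's files are reachable through a local or mounted path (for files on HDFS you would need to list them with HDFS tooling instead). The function name find_bad_gz_files is hypothetical and is not part of Spark or Hadoop. It flags any .gz file that is 0 bytes or that cannot be decompressed to the end.

```python
import gzip
import os
import zlib


def find_bad_gz_files(root_dir):
    """Return sorted paths of .gz files under root_dir that are empty or unreadable.

    Hypothetical helper: assumes root_dir is a locally accessible directory.
    """
    bad = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for name in filenames:
            if not name.lower().endswith(".gz"):
                continue
            path = os.path.join(dirpath, name)
            # A 0-byte file has no gzip header at all, so check size explicitly.
            if os.path.getsize(path) == 0:
                bad.append(path)
                continue
            try:
                # Decompress the whole stream in chunks; truncated or corrupt
                # files raise before the end is reached.
                with gzip.open(path, "rb") as fh:
                    while fh.read(1024 * 1024):
                        pass
            except (OSError, EOFError, zlib.error):
                bad.append(path)
    return sorted(bad)
```

Review the returned list before deleting anything, since removing a file also removes whatever readable rows it may still contain.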