Too Large Frame error

Description: When a single shuffle block grows beyond 2 GB, the largest block size Spark can handle, the following error occurs.

org.apache.spark.shuffle.FetchFailedException: Too large frame: XXXXXXXXXX
    at org.apache.spark.storage.ShuffleBlockFetcherIterator.throwFetchFailedException(ShuffleBlockFetcherIterator.scala:513)
    at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:444)
Caused by: java.lang.IllegalArgumentException: Too large frame: XXXXXXXXXX
    at org.spark_project.guava.base.Preconditions.checkArgument(Preconditions.java:119)
    at org.apache.spark.network.util.TransportFrameDecoder.decodeNext(TransportFrameDecoder.java:133)

Solutions that may work:

Set spark.sql.shuffle.partitions to a higher value, so that the shuffle data is split into more, smaller blocks.
Identify the DataFrame that is causing the issue.
After the DataFrame is identified, repartition it into more partitions by using df.repartition().
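As a rough sketch of how these steps might look in PySpark: the helper names below (suggest_partitions, apply_partitioning) and the 128 MiB target per partition are assumptions made for illustration, not Spark defaults or anything from the original error report. The spark and df arguments are expected to be an existing SparkSession and the DataFrame identified as the culprit.

```python
import math

# Assumption: ~128 MiB per shuffle partition is a common rule of thumb,
# not a Spark default. Tune it for your cluster.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_partitions(total_shuffle_bytes, target_bytes=TARGET_PARTITION_BYTES):
    """Return a partition count that keeps the average partition near target_bytes."""
    if total_shuffle_bytes <= 0:
        return 1
    return max(1, math.ceil(total_shuffle_bytes / target_bytes))


def apply_partitioning(spark, df, num_partitions):
    """Raise the shuffle partition count and repartition the given DataFrame.

    spark: an existing SparkSession (hypothetical caller-supplied object).
    df:    the DataFrame identified as producing the oversized shuffle block.
    """
    spark.conf.set("spark.sql.shuffle.partitions", str(num_partitions))
    return df.repartition(num_partitions)


# Example: roughly 500 GiB of shuffle data at ~128 MiB per partition.
print(suggest_partitions(500 * 1024**3))  # 4000
```

Note that a higher partition count only lowers the average partition size. If one key dominates the data, a single partition can still end up too large.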

A possible cause of this problem is data skew.
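One way to check for skew is to compare the largest partition with the average one, and to look at which key values occur most often. The following is a minimal sketch: skew_ratio and heaviest_keys are hypothetical helper names, and df and key_col stand for your own DataFrame and the column used in the join or aggregation.

```python
def skew_ratio(partition_sizes):
    """Largest partition size divided by the mean size; 1.0 means perfectly even."""
    sizes = list(partition_sizes)
    if not sizes or sum(sizes) == 0:
        return 1.0
    return max(sizes) / (sum(sizes) / len(sizes))


def heaviest_keys(df, key_col, n=10):
    """Return the n most frequent values of key_col in a Spark DataFrame."""
    return df.groupBy(key_col).count().orderBy("count", ascending=False).limit(n)


# Evenly spread rows versus one partition holding almost everything.
print(skew_ratio([100, 100, 100, 100]))  # 1.0
print(skew_ratio([10, 10, 10, 970]))     # 3.88
```

Per-partition row counts for skew_ratio could be obtained, for example, by grouping on spark_partition_id() from pyspark.sql.functions and counting. A ratio well above 1, or a handful of keys with far more rows than the rest, suggests that skew rather than overall data volume is behind the oversized block.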
