  1. Apache Hudi
  2. HUDI-1529

Spark-SQL driver runs out of memory when metadata table is enabled


Details

    Description

      While testing a large dataset (around 1.2 TB of data in around 20k files), we noticed that the Spark driver always ran out of memory when running queries with the metadata table enabled. The OOM happened on every query, even one touching a single partition, and occurred in the split generation phase, before any tasks started executing.

      Analysis of the heap dump showed that the input format was generating millions of splits for every single file. Tracing the code path further, we found the root cause: the metadata-enabled code path ignores blockSize when returning FileStatus objects and sets it to 0. Spark itself does not set any value for the property:

      mapreduce.input.fileinputformat.split.minsize
      

      As a result, minSize ends up being 1, and with a block size of 0 the input format generates splits of 1 byte each, because of the logic here:

      https://github.com/apache/hadoop/blob/trunk/hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapred/FileInputFormat.java#L417
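
To make the arithmetic concrete, here is a minimal sketch of the split-size formula used by Hadoop's mapred FileInputFormat, which takes the larger of minSize and the smaller of goalSize and blockSize. The class name SplitSizeSketch and the sample sizes are illustrative assumptions, not code from Hudi or Hadoop.

```java
public class SplitSizeSketch {

    // Same shape as the formula in Hadoop's mapred FileInputFormat:
    // the split size is capped by the block size and floored by minSize.
    static long computeSplitSize(long goalSize, long minSize, long blockSize) {
        return Math.max(minSize, Math.min(goalSize, blockSize));
    }

    public static void main(String[] args) {
        // Illustrative values (assumptions): goalSize is roughly the total input size
        // when only one split is requested, and minSize is 1 because
        // mapreduce.input.fileinputformat.split.minsize is not set.
        long goalSize = 1_200_000_000_000L;
        long minSize = 1L;

        long normalBlockSize = 128L * 1024 * 1024; // a typical 128 MB block
        long zeroBlockSize = 0L;                   // what the metadata code path reported

        System.out.println(computeSplitSize(goalSize, minSize, normalBlockSize)); // 134217728
        System.out.println(computeSplitSize(goalSize, minSize, zeroBlockSize));   // 1
    }
}
```

With a block size of 0, min(goalSize, blockSize) is 0, so the only thing keeping the split size positive is minSize, which is 1.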

      This results in an enormous number of file split objects being created (roughly one per byte of input), causing the driver to run out of memory.
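
As a rough illustration of the scale, the sketch below counts how many splits one file produces for a given split size, following the loop in Hadoop's mapred FileInputFormat (which keeps cutting splits while the remaining bytes exceed 1.1 times the split size). The class name SplitCountSketch and the 60 MB file size are assumptions; the latter is simply 1.2 TB divided evenly across 20k files.

```java
public class SplitCountSketch {

    // Hadoop allows the last split to be up to 10% larger than splitSize.
    private static final double SPLIT_SLOP = 1.1;

    // Counts the splits generated for one non-empty, splittable file.
    static long countSplits(long fileLength, long splitSize) {
        long count = 0;
        long bytesRemaining = fileLength;
        while (((double) bytesRemaining) / splitSize > SPLIT_SLOP) {
            count++;
            bytesRemaining -= splitSize;
        }
        if (bytesRemaining != 0) {
            count++; // the final, possibly smaller, split
        }
        return count;
    }

    public static void main(String[] args) {
        long fileLength = 60_000_000L; // assumed average file size: ~1.2 TB / 20k files

        // Split size with a normal 128 MB block size versus the 1-byte split size
        // that results when the reported block size is 0.
        System.out.println(countSplits(fileLength, 128L * 1024 * 1024));
        System.out.println(countSplits(fileLength, 1L));
    }
}
```

Under these assumptions a single file goes from one split to tens of millions, which is consistent with the millions of splits per file seen in the heap dump.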

          People

            Udit Mehrotra (uditme)