Spark / SPARK-2156

When the size of serialized results for one partition is slightly smaller than 10MB (the default akka.frameSize), the execution blocks


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.9.1, 1.0.0
    • Fix Version/s: 0.9.2, 1.0.1, 1.1.0
    • Component/s: Spark Core
    • Labels: None
    • Environment: AWS EC2, 1 master and 2 slaves, instance type r3.2xlarge

    Description

      I ran some experiments with the frameSize set to around 10MB.

      1) spark.akka.frameSize = 10
      If the serialized result of any one partition is very close to 10MB, say 9.97MB, the execution blocks without any exception or warning. The worker finishes the task and sends the serialized result, then throws an exception saying that the Hadoop IPC client connection has stopped (visible after changing the logging to debug level). However, the master never receives the result and the program just hangs.
      If the sizes for all partitions are below some threshold between 9.96MB and 9.97MB, the program works fine.
      2) spark.akka.frameSize = 9
      When the partition size is just slightly smaller than 9MB, it fails as well.
      This behavior is not exactly what SPARK-1112 is about.
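
The thresholds reported above imply a small amount of headroom between the serialized result and the frame limit. The following is a minimal sketch of that arithmetic in plain Python, not Spark code. It assumes that "MB" in the report means 2**20 bytes; the helper names and the `margin_bytes` parameter are hypothetical and only serve the illustration.

```python
MIB = 1024 * 1024  # assumption: "MB" in the report means 2**20 bytes


def frame_size_bytes(frame_size_mb):
    """Convert a spark.akka.frameSize value (given in MB) to bytes."""
    return frame_size_mb * MIB


def headroom_bytes(frame_size_mb, result_size_mb):
    """Bytes left between a serialized result and the frame limit."""
    return frame_size_bytes(frame_size_mb) - round(result_size_mb * MIB)


def partitions_at_risk(frame_size_mb, result_sizes_bytes, margin_bytes):
    """Indices of partitions whose serialized result lies within
    margin_bytes of the frame limit, or above it.

    margin_bytes is a hypothetical safety margin chosen by the caller.
    """
    limit = frame_size_bytes(frame_size_mb)
    return [
        i
        for i, size in enumerate(result_sizes_bytes)
        if size > limit - margin_bytes
    ]


if __name__ == "__main__":
    # Headroom at the two sizes quoted in the report (frameSize = 10).
    print(headroom_bytes(10, 9.96))  # size that still worked
    print(headroom_bytes(10, 9.97))  # size that hung
```

Under these assumptions, the hang appears once the result comes within roughly 31 to 42 KB of the 10MB limit, which suggests the limit applies to the whole message, including some overhead, and not to the serialized result alone.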


          People

            Assignee: Xiangrui Meng (mengxr)
            Reporter: Chen Jin (xiaocai)
            Votes: 0
            Watchers: 6


              Time Tracking

                Original Estimate: 504h
                Remaining Estimate: 504h
                Time Spent: Not Specified
