SPARK-1145

Memory mapping with many small blocks can cause JVM allocation failures


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.9.0
    • Fix Version/s: 0.9.2, 1.0.0
    • Component/s: Spark Core
    • Labels: None

    Description

      During a shuffle, each block or block segment is memory-mapped from its file on disk. When the segments are very small and there are a large number of them, the memory maps can start failing and eventually the JVM terminates. It's not clear exactly what's happening, but it appears that when the JVM terminates, about 265MB of virtual address space is used by memory-mapped files. This doesn't seem to be affected at all by `-XX:MaxDirectMemorySize`; AFAIK that option only gives the JVM its own self-imposed limit and doesn't stop the process from running into OS limits.
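
      For reference, a paraphrased sketch of the failing path: DiskStore.getBytes boils down to a per-segment FileChannel.map call, roughly like the following (not the exact source). Note that the MappedByteBuffers returned by FileChannel.map are not allocated via ByteBuffer.allocateDirect, which is why `-XX:MaxDirectMemorySize` has no effect on them.

      import java.io.RandomAccessFile
      import java.nio.MappedByteBuffer
      import java.nio.channels.FileChannel.MapMode

      // Paraphrased sketch: each shuffle block segment gets its own read-only
      // mapping in the JVM's virtual address space.
      def mapSegment(path: String, offset: Long, length: Long): MappedByteBuffer = {
        val channel = new RandomAccessFile(path, "r").getChannel
        try {
          channel.map(MapMode.READ_ONLY, offset, length)  // throws IOException("Map failed") when mmap fails
        } finally {
          channel.close()                                 // the mapping outlives the channel
        }
      }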

      At the time of the JVM failure, overall OS memory appears to become scarce, so it's possible that per-file overheads for the memory maps are adding up here. One such overhead is that memory mapping happens at page granularity, so if blocks are very small, each mapping is padded out to a full page.

      In the particular case where I saw this, the JVM was running 4 reducers, each of which was trying to access about 30,000 blocks for a total of 120,000 concurrent reads. At about 65,000 open files it crapped out. In this case each file was about 1000 bytes.
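
      Back-of-the-envelope (my arithmetic, not a measurement): with 4 KiB pages, each 1000-byte segment still costs a full page of address space plus a kernel VMA entry, so ~65,000 maps comes to roughly 65,000 × 4096 ≈ 266 MB, which lines up with the ~265 MB of mapped address space seen at failure. That ~65,000 figure is also suspiciously close to Linux's default vm.max_map_count of 65530, though I haven't confirmed that's the limit being hit here. The pattern is easy to reproduce outside Spark with a standalone sketch like this (hypothetical file names, not Spark code):

      import java.io.{File, RandomAccessFile}
      import java.nio.MappedByteBuffer
      import java.nio.channels.FileChannel.MapMode
      import scala.collection.mutable.ArrayBuffer

      // Hypothetical reproduction: create and map tens of thousands of ~1 KB files
      // read-only, holding every mapping, to mimic reducers fetching tiny shuffle
      // segments. Each mapping pins a full page of address space and a kernel VMA.
      object ManySmallMaps {
        def main(args: Array[String]): Unit = {
          val dir = new File(System.getProperty("java.io.tmpdir"), "tiny-blocks")
          dir.mkdirs()
          val maps = ArrayBuffer.empty[MappedByteBuffer]
          var i = 0
          while (true) {
            val raf = new RandomAccessFile(new File(dir, s"block_$i"), "rw")
            try {
              raf.write(new Array[Byte](1000))                // ~1000-byte "shuffle block"
              maps += raf.getChannel.map(MapMode.READ_ONLY, 0, raf.length())
            } finally {
              raf.close()                                     // the mapping survives the close
            }
            i += 1
            if (i % 10000 == 0) println(s"mapped $i files")   // eventually dies with "Map failed"
          }
        }
      }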

      Users should really be coalescing or using fewer reducers if they have 1000-byte shuffle files, but I expect this to happen nonetheless. My proposal was that if the file is smaller than a few pages, we should just read it into a Java buffer and not bother to memory-map it. Memory mapping huge numbers of small files in the JVM is neither recommended nor good for performance, AFAIK.
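
      A sketch of that proposal (the threshold name and value here are illustrative, not necessarily what was committed): segments smaller than a couple of pages are read straight into a plain ByteBuffer, and only larger segments are memory-mapped.

      import java.io.RandomAccessFile
      import java.nio.ByteBuffer
      import java.nio.channels.FileChannel.MapMode

      // Illustrative cutoff: a couple of 4 KiB pages. Below this, mapping buys
      // nothing and each map just burns a page of address space plus a kernel VMA.
      val minMemoryMapBytes = 2 * 4096L

      def getSegmentBytes(path: String, offset: Long, length: Long): ByteBuffer = {
        val channel = new RandomAccessFile(path, "r").getChannel
        try {
          if (length < minMemoryMapBytes) {
            // Small segment: read it into a heap buffer, no mmap involved.
            val buf = ByteBuffer.allocate(length.toInt)
            channel.position(offset)
            while (buf.hasRemaining) {
              if (channel.read(buf) == -1) {
                throw new java.io.IOException(s"unexpected EOF in $path at offset $offset")
              }
            }
            buf.flip()
            buf
          } else {
            // Large segment: memory mapping is still the cheaper option.
            channel.map(MapMode.READ_ONLY, offset, length)
          }
        } finally {
          channel.close()
        }
      }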

      Below is the stack trace:

      14/02/27 08:32:35 ERROR storage.BlockManagerWorker: Exception handling buffer message
      java.io.IOException: Map failed
        at sun.nio.ch.FileChannelImpl.map(FileChannelImpl.java:888)
        at org.apache.spark.storage.DiskStore.getBytes(DiskStore.scala:89)
        at org.apache.spark.storage.BlockManager.getLocalBytes(BlockManager.scala:285)
        at org.apache.spark.storage.BlockManagerWorker.getBlock(BlockManagerWorker.scala:90)
        at org.apache.spark.storage.BlockManagerWorker.processBlockMessage(BlockManagerWorker.scala:69)
        at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
        at org.apache.spark.storage.BlockManagerWorker$$anonfun$2.apply(BlockManagerWorker.scala:44)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)
        at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
        at org.apache.spark.storage.BlockMessageArray.foreach(BlockMessageArray.scala:28)
        at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
        at org.apache.spark.storage.BlockMessageArray.map(BlockMessageArray.scala:28)
        at org.apache.spark.storage.BlockManagerWorker.onBlockMessageReceive(BlockManagerWorker.scala:44)
        at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
        at org.apache.spark.storage.BlockManagerWorker$$anonfun$1.apply(BlockManagerWorker.scala:34)
        at org.apache.spark.network.ConnectionManager.org$apache$spark$network$ConnectionManager$$handleMessage(ConnectionManager.scala:512)
        at org.apache.spark.network.ConnectionManager$$anon$8.run(ConnectionManager.scala:478)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
      

      And the JVM error log had a bunch of entries like this:

      7f4b48f89000-7f4b48f8a000 r--s 00000000 ca:30 1622077901                 /mnt4/spark/spark-local-20140227020022-227c/26/shuffle_0_22312_38
      7f4b48f8a000-7f4b48f8b000 r--s 00000000 ca:20 545892715                  /mnt3/spark/spark-local-20140227020022-5ef5/3a/shuffle_0_26808_20
      7f4b48f8b000-7f4b48f8c000 r--s 00000000 ca:50 1622480741                 /mnt2/spark/spark-local-20140227020022-315b/1c/shuffle_0_29013_19
      7f4b48f8c000-7f4b48f8d000 r--s 00000000 ca:30 10082610                   /mnt4/spark/spark-local-20140227020022-227c/3b/shuffle_0_28002_9
      7f4b48f8d000-7f4b48f8e000 r--s 00000000 ca:50 1622268539                 /mnt2/spark/spark-local-20140227020022-315b/3e/shuffle_0_23983_17
      7f4b48f8e000-7f4b48f8f000 r--s 00000000 ca:50 1083068239                 /mnt2/spark/spark-local-20140227020022-315b/37/shuffle_0_25505_22
      7f4b48f8f000-7f4b48f90000 r--s 00000000 ca:30 9921006                    /mnt4/spark/spark-local-20140227020022-227c/31/shuffle_0_24072_95
      7f4b48f90000-7f4b48f91000 r--s 00000000 ca:50 10441349                   /mnt2/spark/spark-local-20140227020022-315b/20/shuffle_0_27409_47
      7f4b48f91000-7f4b48f92000 r--s 00000000 ca:50 10406042                   /mnt2/spark/spark-local-20140227020022-315b/0e/shuffle_0_26481_84
      7f4b48f92000-7f4b48f93000 r--s 00000000 ca:50 1622268192                 /mnt2/spark/spark-local-20140227020022-315b/14/shuffle_0_23818_92
      7f4b48f93000-7f4b48f94000 r--s 00000000 ca:50 1082957628                 /mnt2/spark/spark-local-20140227020022-315b/09/shuffle_0_22824_45
      7f4b48f94000-7f4b48f95000 r--s 00000000 ca:20 1082199965                 /mnt3/spark/spark-local-20140227020022-5ef5/00/shuffle_0_1429_13
      7f4b48f95000-7f4b48f96000 r--s 00000000 ca:20 10940995                   /mnt3/spark/spark-local-20140227020022-5ef5/38/shuffle_0_28705_44
      7f4b48f96000-7f4b48f97000 r--s 00000000 ca:10 17456971                   /mnt/spark/spark-local-20140227020022-b372/28/shuffle_0_23055_72
      7f4b48f97000-7f4b48f98000 r--s 00000000 ca:30 9853895                    /mnt4/spark/spark-local-20140227020022-227c/08/shuffle_0_22797_42
      7f4b48f98000-7f4b48f99000 r--s 00000000 ca:20 1622089728                 /mnt3/spark/spark-local-20140227020022-5ef5/27/shuffle_0_24017_97
      7f4b48f99000-7f4b48f9a000 r--s 00000000 ca:50 1082937570                 /mnt2/spark/spark-local-20140227020022-315b/24/shuffle_0_22291_38
      7f4b48f9a000-7f4b48f9b000 r--s 00000000 ca:30 10056604                   /mnt4/spark/spark-local-20140227020022-227c/2f/shuffle_0_27408_59
      

      Attachments

        Activity


          People

            Assignee: Patrick Wendell (pwendell)
            Reporter: Patrick Wendell (pwendell)
            Votes: 0
            Watchers: 4
