SPARK-2650: Caching tables larger than memory causes OOMs


Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.0.0, 1.0.1
    • Fix Version/s: 1.1.0
    • Component/s: SQL
    • Labels: None

    Description

      The logic for setting up the initial column buffers in Spark SQL differs from Shark's, and I'm seeing OOMs when caching tables that are larger than available memory (tables that Shark cached without problems).
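
      For context, a minimal sketch of the kind of job that hits this, assuming a table larger than the memory available for caching (the path, table name, and schema below are hypothetical; the APIs are Spark 1.0.x-era):

        import org.apache.spark.{SparkConf, SparkContext}
        import org.apache.spark.sql.SQLContext

        case class Record(key: Int, value: String)

        object CacheOomRepro {
          def main(args: Array[String]): Unit = {
            val sc = new SparkContext(new SparkConf().setAppName("CacheOomRepro"))
            val sqlContext = new SQLContext(sc)
            import sqlContext.createSchemaRDD  // implicit RDD -> SchemaRDD conversion

            // Assume this file is larger than the memory available for caching.
            val records = sc.textFile("hdfs:///data/big_table").map { line =>
              val cols = line.split('\t')
              Record(cols(0).toInt, cols(1))
            }
            records.registerAsTable("big_table")

            // Building the in-memory column buffers for the cached table is where
            // executors run out of memory (Shark handled the same data fine).
            sqlContext.cacheTable("big_table")
            sqlContext.sql("SELECT COUNT(*) FROM big_table").collect()
          }
        }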

      Two suspicious things: initialSize is always set to 0, so we always fall back to the default, and the default looks like it was copied from code like 10 * 1024 * 1024... but in Spark SQL it's 10 * 102 * 1024 (i.e., roughly 1 MB instead of 10 MB).
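
      A sketch of the suspected shape of that logic (the names here are my own reconstruction, not the actual Spark SQL source): a zero initialSize always selects the default, and the default constant differs from the likely intended 10 * 1024 * 1024:

        import java.nio.ByteBuffer

        // Hypothetical reconstruction of the column-buffer sizing described above.
        object ColumnBufferSizing {
          // Suspected typo: 10 * 102 * 1024 = 1,044,480 bytes (~1 MB),
          // while 10 * 1024 * 1024 = 10,485,760 bytes (10 MB) was likely intended.
          val DEFAULT_INITIAL_SIZE = 10 * 102 * 1024

          def initialBuffer(initialSize: Int): ByteBuffer = {
            // Callers currently always pass initialSize = 0, so every column
            // builder starts from the (mistyped) default rather than a size
            // estimated from the data.
            val size = if (initialSize == 0) DEFAULT_INITIAL_SIZE else initialSize
            ByteBuffer.allocate(size)
          }
        }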

People

    • Assignee: Michael Armbrust (marmbrus)
    • Reporter: Michael Armbrust (marmbrus)
    • Votes: 0
    • Watchers: 3
