  1. Spark
  2. SPARK-10399

Off Heap Memory Access for non-JVM libraries (C++)

Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Won't Fix
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: Spark Core
    • Labels: None

    Description

      Summary
      Provide direct off-heap memory access to an external non-JVM program, such as a C++ library, running within the Spark JVM/executor process. As Spark moves toward storing all of its data in off-heap memory, it makes sense to provide access points to that memory for non-JVM programs.


      Assumptions

      • No copies will be made during calls into the non-JVM library
      • Access to non-JVM libraries will be accomplished via JNI
      • A generic JNI interface will be created so that developers do not need to deal with raw JNI calls
      • C++ will be the initial target non-JVM use case
      • Memory management will remain on the JVM/Spark side
      • The C++ API will resemble DataFrames as much as feasible and will NOT require expert knowledge of JNI
      • Data organization and layout will support complex types (multi-type, nested, etc.)
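
To make the zero-copy and JVM-owned-memory assumptions concrete, here is a minimal C++ sketch of a non-owning column view. `ColumnView` and `sumColumn` are hypothetical names used only for illustration; they are not part of Spark or of this proposal. The sketch assumes a flat column of fixed-width values whose address is suitably aligned for the element type, and it does not attempt the nested or multi-type layouts listed above.

```cpp
#include <cstddef>
#include <stdexcept>

// Hypothetical non-owning view over a column of fixed-width values.
// It never allocates, copies, or frees memory: ownership stays with the
// caller (in the proposal, the JVM/Spark side).
// Assumption: `address` is suitably aligned for T.
template <typename T>
class ColumnView {
public:
    ColumnView(void* address, std::size_t byteLength)
        : data_(static_cast<T*>(address)), size_(byteLength / sizeof(T)) {
        if (address == nullptr) {
            throw std::invalid_argument("null buffer address");
        }
    }

    // Number of whole T elements that fit in the buffer.
    std::size_t size() const { return size_; }

    // Bounds-checked access that reads and writes the shared memory directly.
    T& at(std::size_t i) const {
        if (i >= size_) {
            throw std::out_of_range("row index out of range");
        }
        return data_[i];
    }

    T* begin() const { return data_; }
    T* end() const { return data_ + size_; }

private:
    T* data_;           // borrowed pointer, never deleted here
    std::size_t size_;  // element count, not byte count
};

// Example kernel: sums a column in place without copying the data.
inline double sumColumn(const ColumnView<double>& col) {
    double total = 0.0;
    for (double v : col) {
        total += v;
    }
    return total;
}
```

Because the view only borrows the pointer, it must not outlive the buffer it wraps; in a JNI setting that would mean keeping the Java buffer reachable for the duration of the native call.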

      Design

      • Initially Spark JVM -> non-JVM will be supported
      • Creating an embedded JVM with Spark running from a non-JVM program is initially out of scope

      Technical

      • GetDirectBufferAddress is the JNI call used to obtain the address of a direct byte buffer's memory without copying it
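
As a rough sketch of how `GetDirectBufferAddress` might be used from C++, the example below separates a plain C++ kernel from the JNI glue. The Java class `example.NativeOps`, its `sum` method, and the `sumDoubles` helper are hypothetical names, not part of Spark. The sketch assumes the Java side filled a direct `ByteBuffer` with doubles in native byte order. The `__has_include` guard (C++17) is only there so the kernel still builds where JDK headers are unavailable.

```cpp
#include <cstddef>

// Pure C++ kernel: no JNI types, so it can be built and tested on its own.
double sumDoubles(const double* values, std::size_t count) {
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

#if defined(__has_include)
#if __has_include(<jni.h>)
#include <jni.h>

// JNI entry point for a hypothetical Java declaration:
//   package example;
//   class NativeOps { static native double sum(java.nio.ByteBuffer buf); }
extern "C" JNIEXPORT jdouble JNICALL
Java_example_NativeOps_sum(JNIEnv* env, jclass, jobject buffer) {
    // NULL if `buffer` is not a direct buffer or direct access is unsupported.
    void* address = env->GetDirectBufferAddress(buffer);
    // Capacity in elements of the buffer type (bytes for a ByteBuffer);
    // -1 if the capacity cannot be determined.
    jlong capacityBytes = env->GetDirectBufferCapacity(buffer);
    if (address == nullptr || capacityBytes < 0) {
        return 0.0;  // a real binding would likely throw a Java exception
    }
    // The memory is read in place; nothing is copied out of the JVM buffer.
    std::size_t count = static_cast<std::size_t>(capacityBytes) / sizeof(double);
    return sumDoubles(static_cast<const double*>(address), count);
}

#endif
#endif
```

Note that a `ByteBuffer` is big-endian by default, so the Java side would need `order(ByteOrder.nativeOrder())` before writing values for the native code to interpret them correctly.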

          People

            Assignee: Unassigned
            Reporter: Paul Weiss (paulweiss)
            Votes: 13
            Watchers: 41
