TinkerPop / TINKERPOP-925

Use persisted SparkContext to persist an RDD across Spark jobs.


    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 3.0.2-incubating
    • Fix Version/s: 3.1.0-incubating
    • Component/s: hadoop
    • Labels:
      None

      Description

If a provider is using Spark, they are currently forced to use HDFS to store intermediate RDD data. However, if they plan to use that data in a GraphComputer "job chain," they should be able to look up a .cache()'d RDD by name.

Create an inputGraphRDD.name and an outputGraphRDD.name so that the configuration references SparkContext.getPersistentRDDs().
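A configuration along these lines could then reference persisted RDDs by name. This is a hypothetical sketch: the property keys below are derived from the ticket's wording (inputGraphRDD.name / outputGraphRDD.name), not from a released API, and the RDD names are illustrative.

```properties
# Hypothetical Hadoop Gremlin configuration sketch (keys follow the ticket's wording).
# First job in the chain: persist its output RDD under a name instead of writing to HDFS.
gremlin.hadoop.outputGraphRDD.name=job1-graph

# Second job in the chain: resolve its input via SparkContext.getPersistentRDDs()
# by matching the RDD name, rather than reading intermediate data from HDFS.
gremlin.hadoop.inputGraphRDD.name=job1-graph
```

This presumes the SparkContext itself is kept alive across the chained GraphComputer jobs; a persisted RDD is only discoverable through getPersistentRDDs() on the same context that cached it.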


            People

            • Assignee: Marko A. Rodriguez (okram)
            • Reporter: Marko A. Rodriguez (okram)
            • Votes: 0
            • Watchers: 3
