Geode › GEODE-9 Spark Integration › GEODE-120

RDD.saveToGemfire() cannot handle a big dataset (1M entries per partition)


Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.0.0-incubating
    • Fix Version/s: 1.0.0-incubating.M1
    • Component/s: core, extensions
    • Labels: None

    Description

      The connector uses a single region.putAll() call to save each RDD partition, but putAll() does not handle big datasets well (such as 1M records). The dataset needs to be split into smaller chunks, with putAll() invoked once per chunk.
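      The chunking described above can be sketched as follows. This is a minimal illustration, not the connector's actual code: `store` and the local `putAll` are stand-ins for a GemFire Region, and the chunk size of 10,000 is an arbitrary assumption.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class ChunkedPutAll {
    // Hypothetical stand-in for a GemFire Region; putAll(map) copies entries in.
    static final Map<Integer, String> store = new HashMap<>();

    static void putAll(Map<Integer, String> chunk) {
        store.putAll(chunk);
    }

    // Drain the partition's entries in fixed-size batches, invoking
    // putAll once per batch instead of once over the whole partition.
    static void saveChunked(Iterator<Map.Entry<Integer, String>> entries, int chunkSize) {
        Map<Integer, String> chunk = new HashMap<>();
        while (entries.hasNext()) {
            Map.Entry<Integer, String> e = entries.next();
            chunk.put(e.getKey(), e.getValue());
            if (chunk.size() >= chunkSize) {
                putAll(chunk);
                chunk = new HashMap<>();
            }
        }
        if (!chunk.isEmpty()) {
            putAll(chunk); // flush the final partial batch
        }
    }

    public static void main(String[] args) {
        // Simulated partition of 25 entries, saved in batches of 10.
        Map<Integer, String> partition = new HashMap<>();
        for (int i = 1; i <= 25; i++) {
            partition.put(i, "v" + i);
        }
        saveChunked(partition.entrySet().iterator(), 10);
        System.out.println(store.size()); // prints 25
    }
}
```

      Bounding each putAll() to a fixed batch size keeps the per-call payload small regardless of partition size, which is the essence of the fix.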

          People

            Assignee: Qihong Chen
            Reporter: Qihong Chen
            Votes: 0
            Watchers: 2

            Dates

              Created:
              Updated:
              Resolved:

              Time Tracking

                Estimated: 48h
                Remaining: 48h
                Logged: Not Specified