Uploaded image for project: 'Pig'
  1. Pig
  2. PIG-4059 Pig on Spark
  3. PIG-4237

Error when there is a bag inside an RDD

    XMLWordPrintableJSON

    Details

    • Type: Sub-task
    • Status: Closed
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: spark-branch
    • Component/s: spark
    • Labels:
    • Patch Info:
      Patch Available

      Description

      Bags cannot be sent to an RDD, as it produces a SelfSpillBag$MemoryLimits not Serializable exception. This results in an error for almost every operation performed after grouping tuples.

      This error is fixed after making transient the protected MemoryLimit memLimit attribute inside org.apache.pig.data.SelfSpillBag, but I do not know the impact of this change.

        Attachments

        1. PIG-4237-1.diff
          12 kB
          Praveen Rachabattuni

          Issue Links

            Activity

              People

              • Assignee:
                Carlos Balduz Carlos Balduz
                Reporter:
                Carlos Balduz Carlos Balduz
              • Votes:
                0 Vote for this issue
                Watchers:
                2 Start watching this issue

                Dates

                • Created:
                  Updated:
                  Resolved: