PIG-3986

PigSplit to support multiple split class

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.14.0
    • Component/s: None
    • Labels: None

    Description

      Currently, one PigSplit wraps one or more input splits, and Pig assigns one PigSplit to one mapper. However, when PigSplit serializes the split class name, it expects all wrapped input splits to be of the same class, so it serializes the class name only once, taken from the first split (see code snippet at the end).

      To let a PigSplit wrap input splits of more than one class, we can serialize each split along with its own class name. Each split can then be deserialized and restored with the correct class. The LoadFunc would, of course, need to dispatch each input split to the appropriate record reader.
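
      The proposal above can be illustrated with a minimal, self-contained sketch. It is not the code from the attached patches and it does not use Pig or Hadoop classes: DemoSplit is a hypothetical stand-in for a Writable input split, and FileLikeSplit, TableLikeSplit, writeSplits, readSplits, and readerFor are names invented for this illustration. The only point shown is that writing a class name in front of every split lets a mixed list be restored, after which a loader could choose a reader per split.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MultiClassSplitDemo {

    /** Hypothetical stand-in for a Writable input split (not a Hadoop type). */
    public interface DemoSplit {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    /** Hypothetical split type describing a byte range of a file. */
    public static class FileLikeSplit implements DemoSplit {
        public String path = "";
        public long length;
        public FileLikeSplit() {}
        public FileLikeSplit(String path, long length) { this.path = path; this.length = length; }
        public void write(DataOutput out) throws IOException { out.writeUTF(path); out.writeLong(length); }
        public void readFields(DataInput in) throws IOException { path = in.readUTF(); length = in.readLong(); }
    }

    /** Hypothetical split type describing a region of a table. */
    public static class TableLikeSplit implements DemoSplit {
        public String table = "";
        public int region;
        public TableLikeSplit() {}
        public TableLikeSplit(String table, int region) { this.table = table; this.region = region; }
        public void write(DataOutput out) throws IOException { out.writeUTF(table); out.writeInt(region); }
        public void readFields(DataInput in) throws IOException { table = in.readUTF(); region = in.readInt(); }
    }

    /** Writes the split count, then each split preceded by its own class name. */
    public static byte[] writeSplits(List<DemoSplit> splits) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(bytes);
        out.writeInt(splits.size());
        for (DemoSplit split : splits) {
            out.writeUTF(split.getClass().getName()); // per-split class name
            split.write(out);
        }
        out.flush();
        return bytes.toByteArray();
    }

    /** Reads the splits back, instantiating each one from its recorded class name. */
    public static List<DemoSplit> readSplits(byte[] data)
            throws IOException, ReflectiveOperationException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        int count = in.readInt();
        List<DemoSplit> splits = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            String className = in.readUTF();
            DemoSplit split = (DemoSplit) Class.forName(className)
                    .getDeclaredConstructor().newInstance();
            split.readFields(in);
            splits.add(split);
        }
        return splits;
    }

    /** Hypothetical dispatch step: picks a reader label based on the split's class. */
    public static String readerFor(DemoSplit split) {
        if (split instanceof FileLikeSplit) return "file-reader";
        if (split instanceof TableLikeSplit) return "table-reader";
        throw new IllegalArgumentException("No reader for " + split.getClass().getName());
    }

    public static void main(String[] args) throws Exception {
        List<DemoSplit> mixed = Arrays.asList(
                new FileLikeSplit("/data/part-0", 1024L),
                new TableLikeSplit("events", 3));
        for (DemoSplit split : readSplits(writeSplits(mixed))) {
            System.out.println(split.getClass().getSimpleName() + " -> " + readerFor(split));
        }
    }
}
```

      Writing the class name once per split costs a few extra bytes per split, but it removes the assumption that every wrapped split shares the class of the first one.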

      Attachments

        1. PIG-3986-2.patch
          12 kB
          Cheolsoo Park
        2. PIG-3986.patch.txt
          12 kB
          Tongjie Chen

          People

            Assignee: Tongjie Chen
            Reporter: Tongjie Chen
            Votes: 0
            Watchers: 3
