PIG-55

Allow user control over split creation

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.0.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

      Description

      I have a dataset in HDFS that is stored as one file per column, and I'd like to access it from Pig. I can't use LoadFunc to get at the data, because it gives the loader access to only a single input stream at a time. To handle this use case, I've broken the existing split creation code out into a few classes and interfaces, and allowed user-specified load functions to be used in place of the existing code.
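
To illustrate the idea, here is a minimal sketch of what user-controlled split creation could look like for a file-per-column dataset. Every name in it (`ColumnSplitSketch`, `SplitCreator`, `ColumnSplit`, `RowRangeSplitCreator`) is hypothetical and made up for this example; it is not the API from the attached patches or from Pig itself, and it assumes the total row count is known up front. Each split covers the same row range in all column files, so one task can read several inputs side by side.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class ColumnSplitSketch {

    /** Hypothetical unit of work: the same row range in every column file. */
    public static final class ColumnSplit {
        public final List<String> columnFiles;
        public final long startRow;
        public final long rowCount;

        public ColumnSplit(List<String> columnFiles, long startRow, long rowCount) {
            this.columnFiles = Collections.unmodifiableList(new ArrayList<>(columnFiles));
            this.startRow = startRow;
            this.rowCount = rowCount;
        }
    }

    /** Hypothetical stand-in for a user-supplied replacement of the built-in split logic. */
    public interface SplitCreator {
        List<ColumnSplit> createSplits(List<String> columnFiles, long totalRows, long rowsPerSplit);
    }

    /** Cuts the dataset into fixed-size row ranges; the last range may be shorter. */
    public static final class RowRangeSplitCreator implements SplitCreator {
        @Override
        public List<ColumnSplit> createSplits(List<String> columnFiles, long totalRows, long rowsPerSplit) {
            if (rowsPerSplit <= 0) {
                throw new IllegalArgumentException("rowsPerSplit must be positive");
            }
            List<ColumnSplit> splits = new ArrayList<>();
            for (long start = 0; start < totalRows; start += rowsPerSplit) {
                long count = Math.min(rowsPerSplit, totalRows - start);
                splits.add(new ColumnSplit(columnFiles, start, count));
            }
            return splits;
        }
    }

    public static void main(String[] args) {
        // Illustrative file names only; they do not refer to a real dataset.
        List<String> files = Arrays.asList("users/name.col", "users/age.col");
        for (ColumnSplit s : new RowRangeSplitCreator().createSplits(files, 10, 4)) {
            System.out.println(s.columnFiles + " rows " + s.startRow + ".." + (s.startRow + s.rowCount - 1));
        }
    }
}
```

The point of the sketch is that a split carries a list of inputs rather than a single stream, which is what a per-column layout needs and what a loader limited to one input stream cannot express.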

      1. replaceable_PigSplit.diff
        41 kB
        Charlie Groves
      2. replaceable_PigSplit_v2.diff
        34 kB
        Charlie Groves
      3. pig_chunker_split.patch
        48 kB
        Charlie Groves
      4. pig_chunker_split_v2.patch
        51 kB
        Charlie Groves
      5. pig_chunker_split_v3.patch
        61 kB
        Charlie Groves
      6. pig_chunker_split_v4.patch
        67 kB
        Charlie Groves
      7. pig_chunker_split_v5.patch
        68 kB
        Charlie Groves
      8. pig_chunker_split_v6.patch
        68 kB
        Charlie Groves
      9. pig_chunker_split_v7.patch
        71 kB
        Charlie Groves

          People

          • Assignee: Charlie Groves
          • Reporter: Charlie Groves
          • Votes: 0
          • Watchers: 1
