Pig / PIG-55

Allow user control over split creation

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.0.0
    • Fix Version/s: 0.1.0
    • Component/s: None
    • Labels: None

    Description

      I have a dataset in HDFS that is stored as one file per column, and I'd like to access it from Pig. I can't use LoadFunc to get at the data, because it only gives the loader access to a single input stream at a time. To handle this use case, I've broken the existing split creation code out into a few classes and interfaces, and allowed user-specified load functions to be used in place of the existing code.
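To make the idea concrete, here is a minimal sketch of what a user-replaceable split creator for column-per-file data might look like. It is not taken from the attached patches: `SplitCreator`, `RowRangeSplit`, and `FixedRowsSplitCreator` are hypothetical names, and the sketch assumes every column file holds the same number of rows, so that one split can name a row range across all of the files rather than a single input stream.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnarSplitSketch {

    /** Hypothetical unit of work: a row range that spans every column file. */
    public static final class RowRangeSplit {
        public final List<String> columnFiles;
        public final long firstRow;
        public final long rowCount;

        public RowRangeSplit(List<String> columnFiles, long firstRow, long rowCount) {
            this.columnFiles = columnFiles;
            this.firstRow = firstRow;
            this.rowCount = rowCount;
        }
    }

    /** Hypothetical hook a user could implement to control split creation. */
    public interface SplitCreator {
        List<RowRangeSplit> createSplits(List<String> columnFiles, long totalRows);
    }

    /** Example policy (an assumption): a fixed number of rows per split. */
    public static final class FixedRowsSplitCreator implements SplitCreator {
        private final long rowsPerSplit;

        public FixedRowsSplitCreator(long rowsPerSplit) {
            if (rowsPerSplit <= 0) {
                throw new IllegalArgumentException("rowsPerSplit must be positive");
            }
            this.rowsPerSplit = rowsPerSplit;
        }

        @Override
        public List<RowRangeSplit> createSplits(List<String> columnFiles, long totalRows) {
            List<RowRangeSplit> splits = new ArrayList<>();
            for (long start = 0; start < totalRows; start += rowsPerSplit) {
                // The last split may be shorter than rowsPerSplit.
                long count = Math.min(rowsPerSplit, totalRows - start);
                splits.add(new RowRangeSplit(columnFiles, start, count));
            }
            return splits;
        }
    }

    /** Convenience wrapper so callers need not name the policy class. */
    public static List<RowRangeSplit> createSplits(
            List<String> columnFiles, long totalRows, long rowsPerSplit) {
        return new FixedRowsSplitCreator(rowsPerSplit).createSplits(columnFiles, totalRows);
    }

    public static void main(String[] args) {
        // File names are made up for illustration.
        List<String> files = Arrays.asList("data/name.col", "data/age.col");
        for (RowRangeSplit s : createSplits(files, 10, 4)) {
            System.out.println(s.columnFiles + " rows " + s.firstRow
                    + ".." + (s.firstRow + s.rowCount - 1));
        }
    }
}
```

The point of the sketch is only that each split carries all of the column files, which a loader limited to one input stream cannot express. How the attached patches actually divide the work may differ.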

      Attachments

        1. replaceable_PigSplit.diff
          41 kB
          Charlie Groves
        2. replaceable_PigSplit_v2.diff
          34 kB
          Charlie Groves
        3. pig_chunker_split.patch
          48 kB
          Charlie Groves
        4. pig_chunker_split_v2.patch
          51 kB
          Charlie Groves
        5. pig_chunker_split_v3.patch
          61 kB
          Charlie Groves
        6. pig_chunker_split_v4.patch
          67 kB
          Charlie Groves
        7. pig_chunker_split_v5.patch
          68 kB
          Charlie Groves
        8. pig_chunker_split_v6.patch
          68 kB
          Charlie Groves
        9. pig_chunker_split_v7.patch
          71 kB
          Charlie Groves

          People

            Assignee: Charlie Groves (groves)
            Reporter: Charlie Groves (groves)
            Votes: 0
            Watchers: 1
