  1. Pig
  2. PIG-1188

Padding nulls to the input tuple according to input schema

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.9.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      If a load statement specifies a schema, Pig truncates extra fields or pads with nulls so that each loaded tuple has exactly the number of fields declared in the load statement.

      Description

      Currently, the number of fields in an input tuple is determined by the data. When a schema is given, the input tuples should be generated according to the schema: extra fields are dropped and missing fields are padded with nulls. Here is one example:

      Pig script:

      a = load '1.txt' as (a0, a1);
      dump a;
      

      Input file:

      1       2
      1       2       3
      1
      

      Current result:

      (1,2)
      (1,2,3)
      (1)
      

      Desired result:

      (1,2)
      (1,2)
      (1, null)
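
The desired behavior can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the code from the attached patches: the class name `SchemaPadding` and the method `conform` are hypothetical, and plain `java.util.List` stands in for Pig's own tuple type.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Pig: forces a row to the schema's field count.
final class SchemaPadding {
    private SchemaPadding() {}

    // Returns a new list with exactly schemaSize entries:
    // extra fields are dropped, missing fields are filled with null.
    static <T> List<T> conform(List<T> fields, int schemaSize) {
        List<T> out = new ArrayList<>(schemaSize);
        for (int i = 0; i < schemaSize; i++) {
            out.add(i < fields.size() ? fields.get(i) : null);
        }
        return out;
    }
}
```

With a two-field schema such as `(a0, a1)`, calling `conform` on the three input rows above would leave the first row unchanged, drop the third field of the second row, and append one null to the last row.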
      

        Attachments

        1. PIG-1188-1.patch
          3 kB
          Daniel Dai
        2. PIG-1188-2.patch
          20 kB
          Daniel Dai

              People

              • Assignee:
                daijy Daniel Dai
                Reporter:
                daijy Daniel Dai