Pig / PIG-1188

Padding nulls to the input tuple according to input schema

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.6.0
    • Fix Version/s: 0.9.0
    • Component/s: impl
    • Labels:
      None
    • Hadoop Flags:
      Reviewed
    • Release Note:
      If a load statement specifies a schema, Pig will truncate extra fields or pad with nulls so that the loaded data has exactly the number of fields specified in the load statement.

      Description

      Currently, the number of fields in an input tuple is determined by the data. When the load statement declares a schema, Pig should instead generate input tuples that match the schema, truncating extra fields and padding with nulls where necessary. Here is one example:

      Pig script:

      a = load '1.txt' as (a0, a1);
      dump a;
      

      Input file:

      1       2
      1       2       3
      1
      

      Current result:

      (1,2)
      (1,2,3)
      (1)
      

      Desired result:

      (1,2)
      (1,2)
      (1, null)
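
      The desired behavior can be sketched outside Pig as a small normalization step. The following Python is only an illustration of the truncate-or-pad rule, not Pig's actual loader code; the names fit_to_schema and load_lines are hypothetical, and the tab delimiter is an assumption based on the input file above.

```python
from typing import List, Optional, Tuple


def fit_to_schema(fields: List[str], num_fields: int) -> List[Optional[str]]:
    """Return exactly num_fields entries: drop extras, pad missing ones with None."""
    trimmed = fields[:num_fields]                      # truncate extra fields
    padding = [None] * (num_fields - len(trimmed))     # empty when nothing is missing
    return trimmed + padding


def load_lines(lines: List[str], num_fields: int,
               delimiter: str = "\t") -> List[Tuple[Optional[str], ...]]:
    """Split each line on the delimiter (assumed tab) and fit it to the schema width."""
    return [tuple(fit_to_schema(line.split(delimiter), num_fields)) for line in lines]


# The three input rows from the example, loaded against a two-field schema (a0, a1).
rows = load_lines(["1\t2", "1\t2\t3", "1"], num_fields=2)
for row in rows:
    print(row)
```

      Here None stands in for a Pig null: the three-field row loses its third field and the one-field row gains a trailing null, so every tuple has exactly two fields.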
      
      1. PIG-1188-2.patch
        20 kB
        Daniel Dai
      2. PIG-1188-1.patch
        3 kB
        Daniel Dai

            People

            • Assignee:
              Daniel Dai
              Reporter:
              Daniel Dai
            • Votes:
              0
              Watchers:
              2
