  1. Pig
  2. PIG-1536

use same logic for merging inner schemas in "default union" and "union onschema"

    Details

    • Type: Task
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.9.0
    • Component/s: None
    • Labels:
      None
    • Release Note:
      The behavior of union and union onschema converges after this patch:
      1. A union of relations with different numbers of fields results in a null schema (default union only):
      A: (a1:long, a2:long)
      B: (b1:long, b2:long, b3:long)
      A union B: null

      2. A union of columns with incompatible types results in a bytearray column:
      A: (a1:long, a2:long)
      B: (b1:(b11:long, b12:long), b2:long)
      A union B: (a1:bytearray, a2:long)

      3. A union of columns with compatible types escalates to the higher-priority type. The priority, from highest to lowest, is chararray -> double -> float -> long -> int -> bytearray:
      A: (a1:int, a2:double, a3:int)
      B: (b1:float, b2:chararray, b3:bytearray)
      A union B: (a1:float, a2:chararray, a3:int)

      4. A union of complex columns with different inner types results in an empty complex type:
      A: (a1:(a11:long, a12:int), a2:{(a21:chararray, a22:int)})
      B: (b1:(b11:int, b12:int), b2:{(b21:int, b22:int)})
      A union B: (a1:(), a2:{()})

      5. Each field of the unioned relation always takes its alias from the first relation.
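The five rules in the release note can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the patch: the names `PRIORITY`, `merge_type`, and `union_schema` are hypothetical, a schema is modelled as a list of `(alias, type)` pairs, and complex types are modelled as `("tuple", fields)` or `("bag", fields)`. Two details are assumed rather than stated in the note: a tuple merged with a bag is treated as incompatible, and two inner schemas count as the same when their field types agree position by position.

```python
# Priority from the release note, highest first.
PRIORITY = ["chararray", "double", "float", "long", "int", "bytearray"]


def is_complex(t):
    # Complex types are modelled as ("tuple", fields) or ("bag", fields).
    return isinstance(t, tuple)


def merge_type(left, right):
    """Merge two field types following rules 2 to 4 of the release note."""
    if is_complex(left) and is_complex(right):
        if left[0] != right[0]:
            return "bytearray"  # assumption: tuple vs bag is incompatible
        left_types = [t for _, t in left[1]]
        right_types = [t for _, t in right[1]]
        # Assumption: inner schemas match when their types agree by position.
        inner = left[1] if left_types == right_types else []
        return (left[0], inner)  # rule 4: empty inner schema on mismatch
    if is_complex(left) or is_complex(right):
        return "bytearray"  # rule 2: simple vs complex is incompatible
    # Rule 3: keep whichever type comes first in the priority list.
    return min(left, right, key=PRIORITY.index)


def union_schema(a, b):
    """Merge two relation schemas given as lists of (alias, type) pairs."""
    if len(a) != len(b):
        return None  # rule 1: different sizes give a null schema
    # Rule 5: aliases come from the first relation.
    return [(alias, merge_type(t, u)) for (alias, t), (_, u) in zip(a, b)]


# Example 3 from the release note.
a = [("a1", "int"), ("a2", "double"), ("a3", "int")]
b = [("b1", "float"), ("b2", "chararray"), ("b3", "bytearray")]
print(union_schema(a, b))
# [('a1', 'float'), ('a2', 'chararray'), ('a3', 'int')]
```

Keeping the priority as an ordered list means rule 3 reduces to picking the type with the smaller index, which reproduces the `int` with `bytearray` to `int` case from the example.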

      Description

      We should consider using the same logic for merging inner schemas in both types of union.

      A 'default union' merges the two inner schemas of bags/tuples by position if the number of fields is the same and the corresponding types are compatible.

      A 'union onschema' considers tuples/bags with different inner schemas to be incompatible types.
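The difference described above, as it stood before the patch, can be sketched as two acceptance checks. This is a minimal illustration, not Pig's implementation: the function names are hypothetical, an inner schema is reduced to a list of type names with aliases omitted, and `numeric_compatible` is an assumed compatibility rule chosen only to make the example concrete.

```python
def default_union_accepts(left, right, compatible):
    """'default union': same field count and compatible types by position."""
    return len(left) == len(right) and all(
        compatible(l, r) for l, r in zip(left, right)
    )


def onschema_union_accepts(left, right):
    """'union onschema': any difference in the inner schema is incompatible."""
    return left == right


NUMERIC = {"int", "long", "float", "double"}


def numeric_compatible(l, r):
    # Hypothetical compatibility rule used only for this illustration.
    return l == r or {l, r} <= NUMERIC


left = ["long", "int"]   # inner schema of a tuple in the first relation
right = ["int", "int"]   # inner schema of the matching tuple in the second
print(default_union_accepts(left, right, numeric_compatible))  # True
print(onschema_union_accepts(left, right))  # False
```

The same pair of inner schemas is accepted by one check and rejected by the other, which is the inconsistency this issue asks to remove.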

        Attachments

        1. PIG-1536-1.patch
          31 kB
          Daniel Dai
        2. PIG-1536-2.patch
          32 kB
          Daniel Dai
        3. PIG-1536-3.patch
          31 kB
          Daniel Dai
        4. PIG-1536-4.patch
          31 kB
          Daniel Dai

            People

            • Assignee:
              daijy Daniel Dai
              Reporter:
              thejas Thejas M Nair
            • Votes:
              0
              Watchers:
              0

              Dates

              • Created:
                Updated:
                Resolved: