Hive / HIVE-7898

HCatStorer should ignore namespaces generated by Pig

    Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 0.13.1
    • Fix Version/s: None
    • Component/s: HCatalog
    • Labels:
      None

      Description

      Currently, Pig aliases must exactly match the names of HCat columns for HCatStorer to succeed. However, several Pig operations prepend a namespace to the alias to disambiguate fields (e.g. after a group with field b, you might have A::b). In that case, even if the fields are in the right order and the alias without its namespace matches the column name, the store fails because HCatStorer tries to match the long form of the alias, although the namespace is extraneous information here. Note that multiple namespaces can be prepended (e.g. A::B::C::d).

      A workaround is to re-alias every field explicitly:
      FOREACH relation GENERATE field1 AS field1, field2 AS field2, etc.
      This quickly becomes tedious and bloated for tables with many fields.

      Changing this would normally require care around columns that are literally named, for example, `A::b`, which became possible in Hive 0.13. However, a separate function call accepts Pig aliases only if they follow the old rules for Hive column names. As such, a direct change (always stripping the namespace, rather than attempting to match either namespace::alias or just alias) maintains compatibility for now.
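
      As an illustration of the direct change described above, here is a minimal sketch of namespace stripping. It is not taken from HIVE-7898.1.patch; the class name PigAliasNames and both method names are hypothetical, and it assumes the namespace separator is always the literal `::`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of HCatalog or the attached patch.
final class PigAliasNames {
    // Assumption: Pig separates namespaces from the field alias with "::".
    private static final String NAMESPACE_SEPARATOR = "::";

    private PigAliasNames() {}

    /** Returns the alias with all leading namespaces removed, e.g. "A::B::C::d" becomes "d". */
    static String stripNamespace(String alias) {
        int idx = alias.lastIndexOf(NAMESPACE_SEPARATOR);
        // No separator present: the alias is already in its short form.
        return idx < 0 ? alias : alias.substring(idx + NAMESPACE_SEPARATOR.length());
    }

    /** Strips namespaces from every alias while preserving field order. */
    static List<String> stripAll(List<String> aliases) {
        List<String> shortNames = new ArrayList<>(aliases.size());
        for (String alias : aliases) {
            shortNames.add(stripNamespace(alias));
        }
        return shortNames;
    }
}
```

      With this sketch, both `A::b` and `A::B::C::d` reduce to their short forms before being compared against HCat column names. One caveat: two fields such as `A::b` and `B::b` would collapse to the same short name, so a real change would likely need to detect such collisions.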

        Attachments

        1. HIVE-7898.1.patch
          14 kB
          Justin Leet

            People

            • Assignee: Unassigned
            • Reporter: Justin Leet (justinleet)
            • Votes: 1
            • Watchers: 2
