Sqoop / SQOOP-3438

Sqoop import with --create-hcatalog-table for ORC does not work with Hive 3 because the created table is a transactional (ACID) table

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Critical
    • Resolution: Fixed
    • Affects Version/s: 1.4.7
    • Fix Version/s: 1.5.0
    • Component/s: hive-integration
    • Labels:
      None

      Description

      PROBLEM: Running a sqoop import command with the option --create-hcatalog-table does not work with Hive 3, for the following reasons:
      1. When --create-hcatalog-table is used, the table is created as a managed ACID table.
      2. HCatalog does not support transactional or bucketed tables.

      As a result, customers who need an ORC-based table cannot use Sqoop to create it, so any existing code that relies on Sqoop to create these tables will fail.

      The current workaround is a two-step process:
      1. Create the ORC table in Hive with the EXTERNAL keyword and set transactional to false.
      2. Then use the sqoop command to load the data into the ORC table.
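
The two workaround steps can be sketched as follows. This is a minimal illustration, not the ticket author's script: the database, table, column names, and JDBC URL are hypothetical placeholders, and the script only builds and prints the Hive DDL and the sqoop argument list rather than running them.

```python
import shlex

# Hypothetical names, used only for illustration.
HIVE_DB = "default"
HIVE_TABLE = "orders_orc"
JDBC_URL = "jdbc:mysql://db.example.com/sales"
SOURCE_TABLE = "orders"


def external_orc_ddl(db, table, columns):
    """Step 1: DDL for an external, non-transactional ORC table."""
    cols = ", ".join(f"`{name}` {hive_type}" for name, hive_type in columns)
    return (
        f"CREATE EXTERNAL TABLE `{db}`.`{table}` ({cols}) "
        "STORED AS ORC TBLPROPERTIES ('transactional'='false')"
    )


def sqoop_load_argv(jdbc_url, source_table, db, table):
    """Step 2: load into the existing table (no --create-hcatalog-table)."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", source_table,
        "--hcatalog-database", db,
        "--hcatalog-table", table,
    ]


ddl = external_orc_ddl(HIVE_DB, HIVE_TABLE, [("id", "INT"), ("name", "STRING")])
load_cmd = sqoop_load_argv(JDBC_URL, SOURCE_TABLE, HIVE_DB, HIVE_TABLE)

print(ddl)                   # run this in Hive first
print(shlex.join(load_cmd))  # then run this from a shell
```

Because the table already exists when Sqoop runs, the import does not need --create-hcatalog-table, which is what avoids the managed ACID table.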

      The request is to add an extra argument to the sqoop command line to specify that the table is external (example: --hcatalog-external-table), so that we can use the option --hcatalog-storage-stanza "stored as orc tblproperties (\"transactional\"=\"false\")".
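
A single-step import using the requested argument might look like the sketch below. It assumes the flag is named --hcatalog-external-table, which the ticket gives only as an example; the JDBC URL and table names are hypothetical. The script builds and prints the command instead of executing it, and shows that the storage stanza needs no backslash escapes when passed as a single argument.

```python
import shlex

# Storage stanza quoted from the ticket, with the shell escapes removed.
STORAGE_STANZA = 'stored as orc tblproperties ("transactional"="false")'


def proposed_import_argv(jdbc_url, source_table, db, table):
    """Argument list for a one-step import that creates an external ORC table."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", source_table,
        "--hcatalog-database", db,
        "--hcatalog-table", table,
        "--create-hcatalog-table",
        "--hcatalog-external-table",  # flag name as proposed in this ticket
        "--hcatalog-storage-stanza", STORAGE_STANZA,
    ]


# Hypothetical connection and table names, for illustration only.
argv = proposed_import_argv(
    "jdbc:mysql://db.example.com/sales", "orders", "default", "orders_orc"
)
command_line = shlex.join(argv)  # shell-safe quoting of every argument
print(command_line)
```

Building the command as a list and quoting it with shlex.join keeps the inner double quotes of the stanza intact without hand-written escapes.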


      Thank you Mahesh Balakrishnan for your findings. This ticket is created based on your work.

    People

    • Assignee: Denes Bodo (dionusos)
    • Reporter: Denes Bodo (dionusos)
    • Votes: 0
    • Watchers: 4

                Dates

                • Created:
                  Updated:
                  Resolved:

    Time Tracking

    • Estimated: Not Specified
    • Remaining: 0h
    • Logged: 3h 20m