SQOOP-443

Running a Sqoop Hive import more than once fails because the output directory is kept

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 1.4.0-incubating, 1.4.1-incubating
    • Fix Version/s: 1.4.2
    • Component/s: None
    • Labels:
      None

      Description

      Hive does not always remove the input directory after running the "LOAD DATA" command. That input directory is the directory Sqoop wrote its imported data to. Because it is left behind, running the same Sqoop command a second time fails with the exception "org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory $table already exists".

      The problem can be worked around by removing the directory manually, but that puts an unnecessary burden on users. It also complicates running saved jobs, because an additional script has to be executed.
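
      The manual workaround described above could be scripted roughly as follows. This is only a sketch and is not part of the attached patch: the helper names, the JDBC URL, and the table name are hypothetical, and it assumes the `hadoop` and `sqoop` commands are on the PATH. Older Hadoop releases spell the recursive delete as `hadoop fs -rmr` rather than `hadoop fs -rm -r`.

```python
import subprocess
from typing import Callable, List


def build_cleanup_command(target_dir: str) -> List[str]:
    """Command that recursively removes the leftover HDFS directory."""
    # Assumption: a Hadoop release that accepts `-rm -r`; older ones use `-rmr`.
    return ["hadoop", "fs", "-rm", "-r", target_dir]


def build_import_command(connect_url: str, table: str, target_dir: str) -> List[str]:
    """Command that imports one table and loads it into Hive."""
    return [
        "sqoop", "import",
        "--connect", connect_url,
        "--table", table,
        "--target-dir", target_dir,
        "--hive-import",
    ]


def rerun_hive_import(
    connect_url: str,
    table: str,
    target_dir: str,
    runner: Callable[[List[str]], int] = subprocess.call,
) -> int:
    """Remove the stale directory, then run the import; return Sqoop's exit code."""
    # The cleanup exit code is ignored on purpose: the directory may not exist
    # when Hive did remove it after the previous run.
    runner(build_cleanup_command(target_dir))
    return runner(build_import_command(connect_url, table, target_dir))
```

      A saved job would need a wrapper like this around every execution, for example `rerun_hive_import("jdbc:mysql://db.example.com/shop", "orders", "orders")`, which is the extra script step this issue asks to make unnecessary.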

      1. SQOOP-443.patch
        3 kB
        Jarek Jarcec Cecho
      2. SQOOP-443.patch
        3 kB
        Jarek Jarcec Cecho

        Activity

        Jarek Jarcec Cecho created issue -
        Jarek Jarcec Cecho made changes -
        Field Original Value New Value
        Attachment SQOOP-443.patch [ 12515106 ]
        Jarek Jarcec Cecho made changes -
        Status Open [ 1 ] Patch Available [ 10002 ]
        Jarek Jarcec Cecho made changes -
        Attachment SQOOP-443.patch [ 12525721 ]
        Kathleen Ting made changes -
        Status Patch Available [ 10002 ] Resolved [ 5 ]
        Fix Version/s 1.4.2-incubating [ 12320141 ]
        Resolution Fixed [ 1 ]

          People

          • Assignee:
            Jarek Jarcec Cecho
            Reporter:
            Jarek Jarcec Cecho
          • Votes:
            0
            Watchers:
            0

