SQOOP-1281: Support of glob paths during export

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.3
    • Fix Version/s: 1.4.7
    • Component/s: None
    • Labels:
      None

      Description

      The current export mechanism in Sqoop does not support globs in its input directory parameter. It treats wildcard characters as literal path components, so the lookup of that literal path fails. Sqoop then assumes the input format is unknown and falls back to a text-based processor instead of the one for the actual type (such as Avro).
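
      The difference between the literal lookup and a glob expansion described above can be sketched as follows. This is only an illustration on the local file system using java.nio, assuming a POSIX-style file system where `*` is a legal file name character. It is not the Sqoop code or the attached patch, and the class name, helper names, and file names are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration; not part of Sqoop.
class GlobLookupSketch {

    /** Literal lookup: the wildcard is treated as an ordinary path component. */
    static boolean literalExists(Path baseDir, String pattern) {
        return Files.exists(baseDir.resolve(pattern));
    }

    /** Glob lookup: the pattern is expanded against the entries of baseDir. */
    static List<String> expandGlob(Path baseDir, String pattern) {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(baseDir, pattern)) {
            for (Path p : stream) {
                names.add(p.getFileName().toString());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(names); // directory order is unspecified, so sort for stable output
        return names;
    }

    /** Creates a temporary directory that mimics an export input directory. */
    static Path createSampleDir() {
        try {
            Path dir = Files.createTempDirectory("export-input");
            Files.createFile(dir.resolve("part-00000.avro"));
            Files.createFile(dir.resolve("part-00001.avro"));
            Files.createFile(dir.resolve("_SUCCESS"));
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path dir = createSampleDir();
        // No file is literally named "part-*.avro", so the literal lookup finds nothing.
        System.out.println("literal lookup: " + literalExists(dir, "part-*.avro"));
        // Expanding the same string as a glob finds the Avro part files.
        System.out.println("glob expansion: " + expandGlob(dir, "part-*.avro"));
    }
}
```

      Once the glob has been expanded to real files, their type can be inspected, which is the step the literal lookup never reaches.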

        Activity

        Clément MATHIEU added a comment -

        I wrote this patch a while ago for our internal needs. It adds support for glob paths in input directories.

        It has been tested and used with Sqoop 1.4.5 on top of CDH 4.7, 5.3, and 5.4.

        I quickly went through the test suite to see how I could test this change but did not find a trivial way to do it. I believe I would have to fiddle with getTablePath / TestExport, but I'm not sure where that will take me. Any advice is welcome.

        Jarek Jarcec Cecho added a comment -

        Switching to Patch Available status so that it shows up in the review queue. Please also upload the patch to Review Board.

        Clément MATHIEU added a comment -

        https://reviews.apache.org/r/38372/

        Jarek Jarcec Cecho added a comment -

        Hi Clément MATHIEU,
        my apologies for the late review on this one. The patch overall looks good to me. I would add a simple test rather than changing the existing ones. E.g. rather than changing the shared structures, I would create a test that calls a custom --target-dir and be done with it.

        Clément MATHIEU added a comment -

        New patch (same code, but adds a unit test)

        Clément MATHIEU added a comment -

        Jarek Jarcec Cecho, thanks for the hint. I just updated the patch to include a unit test (Review Board has been updated too).

        ASF subversion and git services added a comment -

        Commit 02e36db2b8deee01ae08a493369097b6812a164e in sqoop's branch refs/heads/trunk from Jarek Jarcec Cecho
        [ https://git-wip-us.apache.org/repos/asf?p=sqoop.git;h=02e36db ]

        SQOOP-1281: Support of glob paths during export

        (Clément MAHTIEU via Jarek Jarcec Cecho)

        Jarek Jarcec Cecho added a comment -

        Thank you for your contribution!

        Hudson added a comment -

        FAILURE: Integrated in Sqoop-hadoop20 #1019 (See https://builds.apache.org/job/Sqoop-hadoop20/1019/)
        SQOOP-1281: Support of glob paths during export (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=02e36db2b8deee01ae08a493369097b6812a164e)

        • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
        • src/test/com/cloudera/sqoop/TestAvroExport.java
        Hudson added a comment -

        FAILURE: Integrated in Sqoop-hadoop200 #1026 (See https://builds.apache.org/job/Sqoop-hadoop200/1026/)
        SQOOP-1281: Support of glob paths during export (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=02e36db2b8deee01ae08a493369097b6812a164e)

        • src/test/com/cloudera/sqoop/TestAvroExport.java
        • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
        Hudson added a comment -

        FAILURE: Integrated in Sqoop-hadoop100 #986 (See https://builds.apache.org/job/Sqoop-hadoop100/986/)
        SQOOP-1281: Support of glob paths during export (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=02e36db2b8deee01ae08a493369097b6812a164e)

        • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java
        • src/test/com/cloudera/sqoop/TestAvroExport.java
        Hudson added a comment -

        FAILURE: Integrated in Sqoop-hadoop23 #1222 (See https://builds.apache.org/job/Sqoop-hadoop23/1222/)
        SQOOP-1281: Support of glob paths during export (jarcec: https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=02e36db2b8deee01ae08a493369097b6812a164e)

        • src/test/com/cloudera/sqoop/TestAvroExport.java
        • src/java/org/apache/sqoop/mapreduce/ExportJobBase.java

          People

          • Assignee: Clément MATHIEU
          • Reporter: Viji
          • Votes: 1
          • Watchers: 7