SQOOP-788

Sqoop2: Import sometimes duplicates some data

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 1.99.1
    • Component/s: None
    • Labels: None

      Description

      On my test data set of 408957 unique rows, the import always produces 408957 rows. However, when I count the unique lines in the output, I usually get a smaller number, for example 408056 (901 fewer).

      Because the total number of rows matches, I suspect that we sometimes read the same value twice. I'm not sure why yet.
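
The check described above (total line count versus unique line count) can be sketched in Java as follows. This is a minimal illustration, not the tooling actually used for the report; the class name `DuplicateCheck`, the method `countDuplicates`, and the sample rows are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class DuplicateCheck {
    /** Returns how many lines repeat an earlier line (total minus unique). */
    static int countDuplicates(List<String> lines) {
        Set<String> unique = new HashSet<>(lines);
        return lines.size() - unique.size();
    }

    public static void main(String[] args) {
        // Hypothetical imported output: row "2,b" appears twice and "3,c" is missing,
        // so the total still looks right while the unique count is lower.
        List<String> imported = Arrays.asList("1,a", "2,b", "2,b", "4,d");
        System.out.println("total=" + imported.size()
                + " duplicates=" + countDuplicates(imported));
    }
}
```

A non-zero duplicate count with a matching total suggests that some rows were written twice in place of others, which is the symptom reported here.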

      Attachment: bugSQOOP-788.patch (3 kB, Jarek Jarcec Cecho)

          Activity

          Hari Shreedharan added a comment -

          The issue would affect export as well, since objects are reused there too.
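
As an illustration of how object reuse could lead to this symptom, the Java sketch below buffers references to a single mutable record while a producer keeps overwriting it. It is a generic example under stated assumptions: the `Row` class and the `bufferShared` and `bufferFresh` methods are hypothetical and are not taken from Sqoop or from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;

class ReuseDemo {
    /** Hypothetical mutable record, standing in for a reused row object. */
    static final class Row {
        int id;
    }

    /** Buffers references to one shared Row, so every entry shows the last write. */
    static List<Integer> bufferShared(int n) {
        Row shared = new Row();
        List<Row> buffer = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            shared.id = i;      // producer overwrites the same instance
            buffer.add(shared); // consumer keeps a reference, not a copy
        }
        return ids(buffer);
    }

    /** Allocates a new Row per record, so buffered entries stay distinct. */
    static List<Integer> bufferFresh(int n) {
        List<Row> buffer = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            Row row = new Row();
            row.id = i;
            buffer.add(row);
        }
        return ids(buffer);
    }

    private static List<Integer> ids(List<Row> rows) {
        List<Integer> out = new ArrayList<>();
        for (Row r : rows) {
            out.add(r.id);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(bufferShared(3)); // prints [3, 3, 3]
        System.out.println(bufferFresh(3));  // prints [1, 2, 3]
    }
}
```

In both cases the number of buffered records equals the number read, but with the shared instance some values appear more than once and others are lost, which would match a correct total with fewer unique rows.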

          Bilung Lee added a comment -

          Thanks. Patch is committed.

          Hudson added a comment -

          Integrated in Sqoop2-hadoop200 #14 (See https://builds.apache.org/job/Sqoop2-hadoop200/14/)
          SQOOP-788 Import sometimes duplicate some data (Revision d9465bba216372f053ba9c652b8758f5941b3ead)

          Result = SUCCESS
          blee : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d9465bba216372f053ba9c652b8758f5941b3ead
          Files :

          • connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportExtractor.java
          • execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextExportExtractor.java
          • execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceExportExtractor.java
          Hudson added a comment -

          Integrated in Sqoop2-hadoop100 #14 (See https://builds.apache.org/job/Sqoop2-hadoop100/14/)
          SQOOP-788 Import sometimes duplicate some data (Revision d9465bba216372f053ba9c652b8758f5941b3ead)

          Result = UNSTABLE
          blee : https://git-wip-us.apache.org/repos/asf?p=sqoop.git&a=commit&h=d9465bba216372f053ba9c652b8758f5941b3ead
          Files :

          • execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsTextExportExtractor.java
          • connector/connector-generic-jdbc/src/main/java/org/apache/sqoop/connector/jdbc/GenericJdbcImportExtractor.java
          • execution/mapreduce/src/main/java/org/apache/sqoop/job/etl/HdfsSequenceExportExtractor.java

            People

            • Assignee: Jarek Jarcec Cecho
            • Reporter: Jarek Jarcec Cecho
            • Votes: 0
            • Watchers: 4
