SQOOP-474

Split-by specification incorrectly triggers bounding value query

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.4.1-incubating
    • Fix Version/s: 1.4.2
    • Component/s: build, connectors/generic
    • Labels:
      None

      Description

      To reproduce this, run an import using a query with number of mappers set to 1 and a split-by specification. For example:

      $ sqoop import --connect jdbc:mysql://localhost/hadoopguide --query 'SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS' --split-by AID --target-dir /user/kateting/test1 --m=1
      

      This import will output the following:

      12/04/02 13:29:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(AID), MAX(AID) FROM (SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE  (1 = 1) ) AS t1
      

      The bounding value query embeds the original query as a subquery. In DB2, an embedded query fails when the 'with ur' syntax is used. In Informix, it fails if the installed version doesn't support embedded queries. Without the 'with ur' syntax, the boundary query is harmless. The boundary query is triggered by the split-by specification, but specifying split-by is redundant when the number of mappers is 1, so the boundary query serves no purpose in that case.
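
      The behavior described above can be sketched as a small guard: build the bounding query only when the import is actually split across more than one mapper. This is a minimal illustration, not the attached patch and not Sqoop's API. The class BoundingQuerySketch and the methods needsBoundingQuery and buildBoundingQuery are hypothetical names, and the query string only mirrors the shape of the BoundingValsQuery log line shown earlier.

```java
// Hypothetical sketch: names are illustrative and are NOT part of Sqoop.
final class BoundingQuerySketch {

    /** A bounding query is only useful when the work is split across several mappers. */
    static boolean needsBoundingQuery(int numMappers, String splitByColumn) {
        return numMappers > 1 && splitByColumn != null && !splitByColumn.isEmpty();
    }

    /** Wraps the free-form query as a subquery, in the shape of the logged BoundingValsQuery. */
    static String buildBoundingQuery(String query, String splitByColumn) {
        // Substitute the $CONDITIONS placeholder with an always-true predicate.
        String resolved = query.replace("$CONDITIONS", "(1 = 1)");
        return "SELECT MIN(" + splitByColumn + "), MAX(" + splitByColumn + ") FROM ("
                + resolved + ") AS t1";
    }

    public static void main(String[] args) {
        String query = "SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS";
        for (int mappers : new int[] {1, 4}) {
            if (needsBoundingQuery(mappers, "AID")) {
                System.out.println(mappers + " mappers: " + buildBoundingQuery(query, "AID"));
            } else {
                System.out.println(mappers + " mapper(s): bounding query skipped");
            }
        }
    }
}
```

      With one mapper the guard skips the bounding query entirely, so the embedded query that DB2 rejects is never generated. With several mappers the query is built as before.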

      1. SQOOP-474.patch
        0.7 kB
        Kathleen Ting
      2. SQOOP-474-1.patch
        0.5 kB
        Kathleen Ting


          People

          • Assignee: Kathleen Ting
          • Reporter: Kathleen Ting
