Uploaded image for project: 'Sqoop (Retired)'
  1. Sqoop (Retired)
  2. SQOOP-474

Split-by specification incorrectly triggers bounding value query

    XMLWordPrintableJSON

Details

    • Bug
    • Status: Resolved
    • Major
    • Resolution: Fixed
    • 1.4.1-incubating
    • 1.4.2
    • build, connectors/generic
    • None

    Description

      To reproduce this, run an import using a query with number of mappers set to 1 and a split-by specification. For example:

      $ sqoop import --connect jdbc:mysql://localhost/hadoopguide --query 'SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE $CONDITIONS' --split-by AID --target-dir /user/kateting/test1 --m=1
      

      This import will output the following:

      12/04/02 13:29:59 INFO db.DataDrivenDBInputFormat: BoundingValsQuery: SELECT MIN(AID), MAX(AID) FROM (SELECT A.*, B.* FROM A JOIN B ON (A.AID = B.BID) WHERE  (1 = 1) ) AS t1
      

      An embedded query fails in DB2 when using the 'with ur' syntax. This also fails for Informix if the version of Informix doesn't support embedded queries. The issue is the 'with ur' syntax, without which, the boundary query is harmless. The boundary query is being triggered because of the split-by specification. However specifying split-by is redundant given that the number of mappers is 1.

      Attachments

        1. SQOOP-474-1.patch
          0.5 kB
          Kathleen Ting
        2. SQOOP-474.patch
          0.7 kB
          Kathleen Ting

        Activity

          People

            kathleen Kathleen Ting
            kathleen Kathleen Ting
            Votes:
            0 Vote for this issue
            Watchers:
            0 Start watching this issue

            Dates

              Created:
              Updated:
              Resolved: