  1. Sqoop
  2. SQOOP-1154

Sqoop2: Text partitioner might miss or include edge values

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.99.2
    • Fix Version/s: 1.99.3
    • Component/s: None
    • Labels: None

      Description

      Attached is a test case that uses the Varchar partitioner to generate 3, 5, 10 and 13 partitions for the interval "Breezy Badger" to "Warty Warthog". In all cases the generated partitions look like the following:

      'Bree' <= VCCOL AND VCCOL < SOME_VALUE
      SOME_VALUE <= VCCOL AND VCCOL <= 'Wart'
      

      As 'Warty Warthog' > 'Wart', the last value will never be imported. Similarly, as 'Bree' < 'Breezy Badger', additional values that lie outside the requested interval might be imported as well (for example 'Breedy Budget'). I think the varchar partitioner must use the interval boundaries without any truncation, for example:

      'Breezy Badger' <= VCCOL AND VCCOL < SOME_VALUE
      SOME_VALUE <= VCCOL AND VCCOL <= 'Warty Warthog'
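
      To illustrate the effect, here is a minimal Python sketch of the range checks described above. It is not the Sqoop partitioner code: the helper names (make_ranges, covered, to_sql) and the split point 'M' are made up for illustration, and Python's code-point string ordering is assumed to stand in for the database collation, which may differ.

      ```python
      def make_ranges(lower, upper, split_points):
          """Return (low, high, high_inclusive) tuples covering [lower, upper].

          Only the last range has an inclusive upper bound, mirroring the
          conditions shown in the description.
          """
          bounds = [lower, *split_points, upper]
          last = len(bounds) - 2
          return [(bounds[i], bounds[i + 1], i == last) for i in range(len(bounds) - 1)]


      def covered(value, ranges):
          """True if value falls inside at least one range."""
          return any(
              lo <= value and (value <= hi if incl else value < hi)
              for lo, hi, incl in ranges
          )


      def to_sql(column, ranges):
          """Render the ranges as WHERE fragments (no quote escaping; sketch only)."""
          return [
              f"'{lo}' <= {column} AND {column} {'<=' if incl else '<'} '{hi}'"
              for lo, hi, incl in ranges
          ]


      split_points = ["M"]  # hypothetical intermediate boundary (SOME_VALUE)

      # Boundaries truncated to a short prefix, as in the reported behaviour.
      truncated = make_ranges("Bree", "Wart", split_points)
      # Boundaries kept exactly as the real minimum and maximum.
      exact = make_ranges("Breezy Badger", "Warty Warthog", split_points)

      print(covered("Warty Warthog", truncated))  # False: the maximum is missed
      print(covered("Breedy Budget", truncated))  # True: an out-of-range value is matched
      print(covered("Warty Warthog", exact))      # True
      print(covered("Breedy Budget", exact))      # False
      print(to_sql("VCCOL", exact))
      ```

      With the truncated boundaries the maximum value falls outside every partition while a value below the minimum is matched; with the exact boundaries both cases behave as expected.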
      

        Attachments

        1. bugSQOOP-1154.patch
          5 kB
          Jarek Jarcec Cecho
        2. import_test_case.patch
          2 kB
          Jarek Jarcec Cecho

              People

              • Assignee: Jarek Jarcec Cecho (jarcec)
              • Reporter: Jarek Jarcec Cecho (jarcec)
              • Votes: 0
              • Watchers: 5
