Sqoop (Retired) / SQOOP-1154

Sqoop2: Text partitioner might miss or include edge values

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 1.99.2
    • Fix Version/s: 1.99.3
    • Component/s: None
    • Labels: None

    Description

      Attached is a test case that uses the Varchar partitioner to generate 3, 5, 10, and 13 partitions for the interval "Breezy Badger" to "Warty Warthog". In all cases the generated partitions look like the following:

      'Bree' <= VCCOL AND VCCOL < SOME_VALUE
      SOME_VALUE <= VCCOL AND VCCOL <= 'Wart'
      

      Because 'Warty Warthog' > 'Wart', the last value will never be imported. Similarly, because 'Bree' < 'Breezy Badger', values outside the interval might be imported as well (for example 'Breedy Budget'). I think the varchar partitioner must use the interval boundaries without any truncation, for example:

      'Breezy Badger' <= VCCOL AND VCCOL < SOME_VALUE
      SOME_VALUE <= VCCOL AND VCCOL <= 'Warty Warthog'
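
      The effect of the truncated boundaries can be checked with a small sketch. This is not the Sqoop partitioner code; the class and method names are hypothetical, and it assumes the database compares strings the same way Java's String.compareTo does (a real collation may differ). It tests whether a value falls inside the overall closed interval covered by the generated partitions:

```java
// Hypothetical illustration only; not taken from the Sqoop code base.
class PartitionBoundsSketch {

    // True when lower <= value <= upper under String.compareTo ordering
    // (assumed here to match the database's string comparison).
    static boolean inClosedRange(String value, String lower, String upper) {
        return lower.compareTo(value) <= 0 && value.compareTo(upper) <= 0;
    }

    public static void main(String[] args) {
        String min = "Breezy Badger";
        String max = "Warty Warthog";

        // Truncated boundaries, as reported in this issue.
        System.out.println(inClosedRange(max, "Bree", "Wart"));             // false: max is missed
        System.out.println(inClosedRange("Breedy Budget", "Bree", "Wart")); // true: extra value included

        // Untruncated boundaries, as proposed in this issue.
        System.out.println(inClosedRange(max, min, max));                   // true
        System.out.println(inClosedRange("Breedy Budget", min, max));       // false
    }
}
```

      With the truncated bounds the maximum value falls outside the covered range while a value below the real minimum falls inside it; with the full boundary strings both cases behave as expected.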
      

      Attachments

        1. import_test_case.patch
          2 kB
          Jarek Jarcec Cecho
        2. bugSQOOP-1154.patch
          5 kB
          Jarek Jarcec Cecho

            People

              Assignee: Jarek Jarcec Cecho
              Reporter: Jarek Jarcec Cecho
              Votes: 0
              Watchers: 5
