[SQOOP-2664] Duplicate records found when split-by column is of type char(n) - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: 1.4.5
Fix Version/s: None
Component/s: sqoop2-jdbc-connector
Labels:
- import
- split-by
- sqoop
Environment:

Hortonworks: 2.2.4.2
Sqoop: 1.4.5
MS SQL Server: 2008 R2

External issue URL:
https://issues.apache.org/jira/browse/SQOOP-2536

Description

Hi,

While working with Sqoop, we found that records are duplicated during import when the split-by column is of type char.

We understand that, ideally, a NOT NULL integral column should be chosen as the split-by column, but in our case all integral columns contain null values. There is an open bug (SQOOP-2536) for that issue.

However, Sqoop supports using a char column as the split-by column, and doing so gives unexpected results. We are therefore raising this bug.
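
The report does not state a root cause. One possible mechanism, suggested only by the title of the related issue SQOOP-3263, is that split boundaries are computed under one string ordering while the database evaluates the range predicates under another. The sketch below is a hypothetical illustration, not Sqoop code: the rows, the boundaries, and the helper names `rows_in_split` and `run_import` are all made up, and the case-insensitive comparison is an assumption about how the database might compare strings.

```python
# Hypothetical illustration only: not Sqoop code, and not a confirmed root cause.

def rows_in_split(rows, lo, hi, last, key):
    """Rows with lo <= row < hi (<= hi for the last split), compared via `key`."""
    selected = []
    for row in rows:
        k = key(row)
        upper_ok = k <= key(hi) if last else k < key(hi)
        if key(lo) <= k and upper_ok:
            selected.append(row)
    return selected


def run_import(rows, bounds, key):
    """Concatenate the rows fetched by every split range."""
    imported = []
    for i in range(len(bounds) - 1):
        last = i == len(bounds) - 2
        imported += rows_in_split(rows, bounds[i], bounds[i + 1], last, key)
    return imported


rows = ["alpha", "delta", "Kilo", "yankee"]   # made-up split-by column values
bounds = ["A", "M", "c", "z"]                 # boundaries picked in code-point order

# Ranges evaluated with the same ordering used to pick the boundaries.
binary = run_import(rows, bounds, key=lambda s: s)

# Ranges evaluated case-insensitively (assumed database behaviour).
folded = run_import(rows, bounds, key=lambda s: s.lower())

print(len(binary), len(folded))
```

Under the code-point ordering the three ranges are disjoint and every row is fetched once. Under the case-insensitive ordering the first and last ranges overlap, so "delta" and "Kilo" are fetched by two splits each.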

Issue Links

is related to: SQOOP-3263 Duplicate rows found when split-by column is of textual type due to different charset difference of sqoop and hadoop (Open)

People

Assignee: Unassigned
Reporter: Dhaval Modi
Votes: 0
Watchers: 2

Dates

Created: 05/Nov/15 23:08
Updated: 26/Nov/17 16:11