Details
- Type: Bug
- Status: Open
- Priority: Major
- Resolution: Unresolved
Description
Calling HBase's Put.add() on an unchanged (previously inserted) row/value causes data duplication: a new version of the cell is stored, and only its timestamp differs from the existing one.
hbase(main):030:0> get "dump_HKFAS.sales_order", "1", {COLUMN => "mysql:created_at", VERSIONS => 4}
COLUMN              CELL
 mysql:created_at   timestamp=1358853505756, value=2011-12-21 18:07:38.0
 mysql:created_at   timestamp=1358790515451, value=2011-12-21 18:07:38.0
2 row(s) in 0.0040 seconds
Today's Sqoop run:
hbase(main):031:0> Date.new(1358853505756).toString()
=> "Tue Jan 22 11:18:25 UTC 2013"
Yesterday's Sqoop run:
hbase(main):032:0> Date.new(1358790515451).toString()
=> "Mon Jan 21 17:48:35 UTC 2013"
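The two cell timestamps are epoch milliseconds, so they can be cross-checked outside the HBase shell. The following is a minimal sketch in Python; the helper names `millis_to_utc` and `format_like_java_date` are made up for this illustration and are not part of HBase or Sqoop.

```python
from datetime import datetime, timezone


def millis_to_utc(ms):
    """Convert epoch milliseconds to a timezone-aware UTC datetime.

    The millisecond fraction is dropped, matching the second-level
    resolution of the shell output above.
    """
    return datetime.fromtimestamp(ms // 1000, tz=timezone.utc)


def format_like_java_date(ms):
    """Format the timestamp in the same layout as java.util.Date.toString()."""
    return millis_to_utc(ms).strftime("%a %b %d %H:%M:%S UTC %Y")


# Timestamps copied from the `get` output in this report.
today_run = 1358853505756
yesterday_run = 1358790515451

print(format_like_java_date(today_run))      # Tue Jan 22 11:18:25 UTC 2013
print(format_like_java_date(yesterday_run))  # Mon Jan 21 17:48:35 UTC 2013

# Gap between the two versions, in hours (one per Sqoop run).
print((today_run - yesterday_run) / 3_600_000)
```

Both versions carry the same value and differ only in their timestamps, which match the two Sqoop runs.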
I verified that this is the intended behavior on the server side, according to HBASE-7645.
Instead, I would expect a rerun of Sqoop not to create a new version of every row in the HBase tables, but to update only the fields that have changed.
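The expected behavior amounts to a compare-before-write step. The sketch below models it with a toy in-memory store in Python; `ToyVersionedStore` and its methods are hypothetical names for this illustration, not the HBase client API and not how Sqoop is implemented. It assumes writes arrive with increasing timestamps.

```python
from collections import defaultdict


class ToyVersionedStore:
    """Toy in-memory stand-in for a versioned cell store (NOT the HBase API)."""

    def __init__(self):
        # (row, column) -> list of (timestamp, value), newest first.
        self._cells = defaultdict(list)

    def put(self, row, column, value, timestamp):
        """Always record a new version, even when the value is unchanged.

        This mirrors the behavior reported above.
        """
        self._cells[(row, column)].insert(0, (timestamp, value))

    def latest(self, row, column):
        """Return the newest value of the cell, or None if never written."""
        versions = self._cells.get((row, column))
        return versions[0][1] if versions else None

    def put_if_changed(self, row, column, value, timestamp):
        """Write only when the value differs from the newest stored one.

        Returns True if a new version was written, False if it was skipped.
        """
        if self.latest(row, column) == value:
            return False  # unchanged: skip the write, no new version
        self.put(row, column, value, timestamp)
        return True

    def versions(self, row, column):
        """Return a copy of all stored (timestamp, value) pairs, newest first."""
        return list(self._cells.get((row, column), []))
```

With this model, two plain `put` calls with the same value leave two versions of the cell, as in the `get` output above, while two `put_if_changed` calls leave one. The trade-off is an extra read per cell before each write.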
Issue Links
- is related to: HBASE-7645 put without timestamp duplicates the record/row (Closed)