HBase / HBASE-5564

Bulkload is discarding duplicate records

Details

    • Bug
    • Status: Closed
    • Major
    • Resolution: Fixed
    • 0.95.2
    • 0.95.0
    • mapreduce
    • HBase 0.92

    • Reviewed
    • 1) Provision for using the existing timestamp (HBASE_TS_KEY)
      2) Bug fix to use same timestamp across mappers.
    • bulkload, mapreduce, importtsv

    Description

      Duplicate records are discarded when they exist in the same input file, and more specifically when they fall in the same split.
      Duplicate records are retained when they come from different splits.
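
      The behaviour described above can be illustrated with a small toy model. This is only a sketch, not the ImportTsv code: it assumes that a cell is identified by row, column, and timestamp, and that every record in one split is stamped with a single timestamp while another split gets a different one. The names DuplicateCellSketch, ToyStore, and loadSplit are hypothetical and exist only for this illustration.

```java
import java.util.Map;
import java.util.TreeMap;

public class DuplicateCellSketch {

    /** Toy store (hypothetical): one entry per (row, column, timestamp) coordinate. */
    static final class ToyStore {
        private final Map<String, String> cells = new TreeMap<>();

        void put(String row, String column, long timestamp, String value) {
            // A second put with identical coordinates overwrites the first one.
            cells.put(row + "/" + column + "/" + timestamp, value);
        }

        int cellCount() {
            return cells.size();
        }
    }

    /** Assumption: every record of one split is written with the same timestamp. */
    static void loadSplit(ToyStore store, String[][] records, long splitTimestamp) {
        for (String[] r : records) {
            store.put(r[0], r[1], splitTimestamp, r[2]);
        }
    }

    public static void main(String[] args) {
        String[] rec = {"row1", "d:c1", "v"};

        // Both duplicates in one split: same timestamp, so they collapse into one cell.
        ToyStore sameSplit = new ToyStore();
        loadSplit(sameSplit, new String[][] {rec, rec}, 1000L);
        System.out.println(sameSplit.cellCount()); // prints 1

        // Duplicates in different splits, stamped separately: two versions survive.
        ToyStore differentSplits = new ToyStore();
        loadSplit(differentSplits, new String[][] {rec}, 1000L);
        loadSplit(differentSplits, new String[][] {rec}, 1001L);
        System.out.println(differentSplits.cellCount()); // prints 2
    }
}
```

      Under the same assumptions, stamping every split with one shared timestamp makes the second case print 1 as well, which gives consistent handling of duplicates regardless of how the input is split.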

      Version under test: HBase 0.92

      Attachments

        1. 5564.lint
          10 kB
          Ted Yu
        2. 5564v5.txt
          19 kB
          Michael Stack
        3. HBASE-5564_1.patch
          16 kB
          ramkrishna.s.vasudevan
        4. HBASE-5564_trunk.1.patch
          14 kB
          Laxman
        5. HBASE-5564_trunk.1.patch
          14 kB
          Laxman
        6. HBASE-5564_trunk.2.patch
          13 kB
          Laxman
        7. HBASE-5564_trunk.3.patch
          14 kB
          Laxman
        8. HBASE-5564_trunk.4_final.patch
          18 kB
          Laxman
        9. HBASE-5564_trunk.patch
          14 kB
          Laxman
        10. HBASE-5564.patch
          16 kB
          ramkrishna.s.vasudevan

          People

            lakshman Laxman
            lakshman Laxman
            Votes: 0
            Watchers: 12

            Dates

              Created:
              Updated:
              Resolved: