HBase / HBASE-12419

"Partial cell read caused by EOF" ERRORs on replication source during replication


Details

    • Type: Bug
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.98.7
    • Fix Version/s: 0.98.8, 0.99.2
    • Component/s: None
    • Labels: None

    Description

      We are seeing exceptions like these on the replication sources when replication is active:

      2014-11-04 01:20:19,738 ERROR [regionserver8120-EventThread.replicationSource,1] codec.BaseDecoder:
      Partial cell read caused by EOF: java.io.IOException: Premature EOF from inputStream
      

      HBase 0.98.8-SNAPSHOT, Hadoop 2.4.1.

      This happens both with and without short-circuit reads enabled on the source cluster.
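
To illustrate how a "partial read caused by EOF" can occur in general, here is a minimal, self-contained sketch. It is not HBase's codec.BaseDecoder; the class name PartialReadSketch, the record layout (4-byte length followed by payload), and the scenario (a reader reaching the end of data that was cut off mid-record, as could happen when a file is still being written) are all assumptions made for illustration, not a diagnosis confirmed in this report.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the HBase decoder.
final class PartialReadSketch {

    /** Complete records decoded so far, plus whether the input ended mid-record. */
    static final class Result {
        final List<byte[]> records = new ArrayList<>();
        boolean partialTail = false;
    }

    /** Encodes each payload as a 4-byte big-endian length followed by the payload bytes. */
    static byte[] encode(byte[]... payloads) {
        int size = 0;
        for (byte[] p : payloads) {
            size += Integer.BYTES + p.length;
        }
        ByteBuffer buf = ByteBuffer.allocate(size);
        for (byte[] p : payloads) {
            buf.putInt(p.length).put(p);
        }
        return buf.array();
    }

    /** Decodes records until the input is exhausted or ends in the middle of a record. */
    static Result decode(byte[] data) {
        Result result = new Result();
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
        try {
            while (in.available() > 0) {
                int len = in.readInt();          // throws EOFException if fewer than 4 bytes remain
                byte[] payload = new byte[len];
                in.readFully(payload);           // throws EOFException if the payload is cut short
                result.records.add(payload);
            }
        } catch (EOFException e) {
            // Input ended mid-record: keep the complete records, flag the partial tail.
            result.partialTail = true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        byte[] full = encode(
                "row1".getBytes(StandardCharsets.UTF_8),
                "row2".getBytes(StandardCharsets.UTF_8),
                "row3".getBytes(StandardCharsets.UTF_8));
        // Drop the last two bytes to simulate a reader catching up with an unfinished write.
        byte[] truncated = Arrays.copyOf(full, full.length - 2);
        Result r = decode(truncated);
        // Prints: 2 complete, partial tail: true
        System.out.println(r.records.size() + " complete, partial tail: " + r.partialTail);
    }
}
```

In this sketch the two complete records are still returned and only the unfinished tail is flagged, which mirrors the observation in this report that the messages appear without any data loss.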

      I'm able to reproduce this reliably:

      1. Set up two clusters. A single slave node per cluster is enough.
      2. Enable replication in the configuration.
      3. Use LoadTestTool -init_only on both clusters
      4. On source cluster via shell: alter 'cluster_test', {NAME=>'test_cf',REPLICATION_SCOPE=>1}
      5. On source cluster via shell: add_peer 'remote:port:/hbase'
      6. On source cluster, LoadTestTool -skip_init -write 1:1024:10 -num_keys 1000000
      7. Wait for LoadTestTool to complete
      8. Use the shell to verify 1M rows are in 'cluster_test' on the target cluster.

      All 1M rows replicate without data loss, but each run logs 5-15 "Partial cell read caused by EOF" messages from codec.BaseDecoder at ERROR level on the replication source.
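
To count these messages after a run, a small helper along the following lines could be used. This is a sketch under stated assumptions: the class name PartialReadLogCounter is hypothetical, the marker string is taken from the log excerpt in this report, and how the region server log lines are obtained (file path, rotation) is left to the reader.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; not part of HBase.
final class PartialReadLogCounter {

    /** Message text as it appears in the log excerpt in this report. */
    static final String MARKER = "Partial cell read caused by EOF";

    /** Counts the log lines that contain the marker text. */
    static long countPartialReads(List<String> lines) {
        return lines.stream().filter(line -> line.contains(MARKER)).count();
    }

    public static void main(String[] args) {
        // Sample input: the line quoted in this report plus a placeholder line.
        List<String> sample = Arrays.asList(
                "2014-11-04 01:20:19,738 ERROR [regionserver8120-EventThread.replicationSource,1] "
                        + "codec.BaseDecoder: Partial cell read caused by EOF: "
                        + "java.io.IOException: Premature EOF from inputStream",
                "(placeholder for an unrelated log line)");
        // Prints: 1
        System.out.println(countPartialReads(sample));
    }
}
```

For a real log, the lines could be loaded with java.nio.file.Files.readAllLines and passed to countPartialReads.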

      Attachments

        1. HBASE-12419.patch
          1 kB
          Andrew Kyle Purtell
        2. TestReplicationIngest.patch
          7 kB
          Andrew Kyle Purtell

          People

            Assignee: apurtell Andrew Kyle Purtell
            Reporter: apurtell Andrew Kyle Purtell
            Votes: 0
            Watchers: 8
