[HBASE-10958] [dataloss] Bulk loading with seqids can prevent some log entries from being replayed - ASF JIRA

Details

Type: Bug
Status: Closed
Priority: Blocker
Resolution: Fixed
Affects Version/s: 0.96.2, 0.98.1, 0.94.18
Fix Version/s: 0.99.0, 0.98.2, 0.96.3, 0.94.20
Component/s: None
Labels: None
Hadoop Flags: Reviewed
Release Note:
Bulk loading with sequence IDs, an option in late 0.94 releases and the default since 0.96.0, will now trigger a flush in each region that loads an HFile (if there is data that needs to be flushed).

Description

We found an issue where bulk loads that assign sequence ids (HBASE-6630) cause data loss; it is triggered when replaying recovered edits. We're nicknaming this issue Blindspot.

The problem is that the sequence id given to a bulk loaded file is higher than those of the edits in the region's memstore. When replaying recovered edits, an edit is skipped if its sequence id is lower than the highest sequence id found in the store files. The assumption behind this rule is that any edit with a sequence id lower than the highest one in the store files must already have been flushed. That assumption does not hold for bulk loaded files, since we now have an HFile with a sequence id higher than those of unflushed edits.

The log recovery code takes this into account by ignoring bulk loaded files when it looks for that highest sequence id, but this "bulk loaded status" is lost on compaction. Once a bulk loaded file has been compacted, the edits in the logs that have a sequence id lower than that file's are put in a blind spot and are skipped during replay.
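The skip rule described above can be illustrated with a small, self-contained model. This is only a sketch: `ToyStoreFile`, `maxFlushedSeqId` and `isSkippedOnReplay` are hypothetical names made up for illustration and are not HBase classes or methods, and the inclusive comparison is an assumption of the sketch.

```java
import java.util.Arrays;
import java.util.List;

final class ReplaySkipSketch {
    /** Hypothetical stand-in for a store file: only its max sequence id and bulk-load marker. */
    static final class ToyStoreFile {
        final long maxSeqId;
        final boolean bulkLoaded;

        ToyStoreFile(long maxSeqId, boolean bulkLoaded) {
            this.maxSeqId = maxSeqId;
            this.bulkLoaded = bulkLoaded;
        }
    }

    /** Highest sequence id among files NOT marked bulk loaded, or -1 if there is none. */
    static long maxFlushedSeqId(List<ToyStoreFile> files) {
        long max = -1;
        for (ToyStoreFile f : files) {
            if (!f.bulkLoaded) {
                max = Math.max(max, f.maxSeqId);
            }
        }
        return max;
    }

    /** An edit is skipped when it is presumed flushed (inclusive bound is an assumption). */
    static boolean isSkippedOnReplay(long editSeqId, List<ToyStoreFile> files) {
        return editSeqId <= maxFlushedSeqId(files);
    }

    public static void main(String[] args) {
        long unflushedEdit = 1;

        // Two bulk loaded files, still marked as such: they are ignored by the rule.
        List<ToyStoreFile> beforeCompaction =
            Arrays.asList(new ToyStoreFile(2, true), new ToyStoreFile(3, true));
        // After compaction: one file with seqid 3 that no longer carries the marker.
        List<ToyStoreFile> afterCompaction =
            Arrays.asList(new ToyStoreFile(3, false));

        System.out.println(isSkippedOnReplay(unflushedEdit, beforeCompaction)); // false
        System.out.println(isSkippedOnReplay(unflushedEdit, afterCompaction));  // true
    }
}
```

In this model the unflushed edit is replayed as long as the files keep their bulk-load marker, and falls into the blind spot as soon as compaction produces an unmarked file with a higher sequence id.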

Here's the easiest way to recreate this issue:

1. Create an empty table.
2. Put one row in it (let's say it gets seqid 1).
3. Bulk load one file (it gets seqid 2). I used ImportTsv and set hbase.mapreduce.bulkload.assign.sequenceNumbers.
4. Bulk load a second file the same way (it gets seqid 3).
5. Major compact the table (the new file has seqid 3 and isn't considered bulk loaded).
6. Kill the region server that holds the table's region.
7. Scan the table once the region is made available again. The first row, at seqid 1, will be missing since the HFile with seqid 3 makes us believe that everything that came before it was flushed.
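
The steps above can be walked through with a toy in-memory model, shown here as a rough sketch rather than actual HBase code. Every class and method in it (`ToyRegion`, `ToyFile`, `ToyEdit`, `firstRowSurvives`) is hypothetical. The `flushBeforeBulkLoad` switch models the behavior described in the release note, where a bulk load first flushes any pending memstore data.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class BlindspotRepro {
    /** Toy HFile: its rows, its highest seqid, and whether it is still marked bulk loaded. */
    static final class ToyFile {
        final Map<String, String> rows;
        final long maxSeqId;
        final boolean bulkLoaded;
        ToyFile(Map<String, String> rows, long maxSeqId, boolean bulkLoaded) {
            this.rows = rows; this.maxSeqId = maxSeqId; this.bulkLoaded = bulkLoaded;
        }
    }

    /** Toy log entry. */
    static final class ToyEdit {
        final long seqId; final String row; final String value;
        ToyEdit(long seqId, String row, String value) {
            this.seqId = seqId; this.row = row; this.value = value;
        }
    }

    static final class ToyRegion {
        final boolean flushBeforeBulkLoad;
        long nextSeqId = 1;
        long memstoreMaxSeqId = -1;
        Map<String, String> memstore = new TreeMap<>();
        final List<ToyFile> files = new ArrayList<>();
        final List<ToyEdit> log = new ArrayList<>();

        ToyRegion(boolean flushBeforeBulkLoad) { this.flushBeforeBulkLoad = flushBeforeBulkLoad; }

        void put(String row, String value) {
            long seqId = nextSeqId++;
            log.add(new ToyEdit(seqId, row, value)); // logged first, then kept in memory
            memstore.put(row, value);
            memstoreMaxSeqId = seqId;
        }

        void flush() {
            if (memstore.isEmpty()) return;
            files.add(new ToyFile(memstore, memstoreMaxSeqId, false));
            memstore = new TreeMap<>();
        }

        void bulkLoad(Map<String, String> rows) {
            if (flushBeforeBulkLoad) flush();
            files.add(new ToyFile(new TreeMap<>(rows), nextSeqId++, true));
        }

        void majorCompact() {
            if (files.isEmpty()) return;
            Map<String, String> merged = new TreeMap<>();
            long maxSeqId = -1;
            for (ToyFile f : files) {
                merged.putAll(f.rows);
                maxSeqId = Math.max(maxSeqId, f.maxSeqId);
            }
            files.clear();
            files.add(new ToyFile(merged, maxSeqId, false)); // bulk-load marker is lost here
        }

        void crashAndRecover() {
            memstore = new TreeMap<>(); // in-memory data is gone
            long flushedUpTo = -1;
            for (ToyFile f : files) {
                if (!f.bulkLoaded) flushedUpTo = Math.max(flushedUpTo, f.maxSeqId);
            }
            for (ToyEdit e : log) {
                if (e.seqId > flushedUpTo) memstore.put(e.row, e.value); // replay the rest
            }
        }

        boolean contains(String row) {
            if (memstore.containsKey(row)) return true;
            for (ToyFile f : files) {
                if (f.rows.containsKey(row)) return true;
            }
            return false;
        }
    }

    /** Runs the reproduction steps and reports whether the first row is still readable. */
    static boolean firstRowSurvives(boolean flushBeforeBulkLoad) {
        ToyRegion region = new ToyRegion(flushBeforeBulkLoad);
        region.put("row1", "v");              // seqid 1, memstore only
        region.bulkLoad(Map.of("row2", "v")); // seqid 2
        region.bulkLoad(Map.of("row3", "v")); // seqid 3
        region.majorCompact();
        region.crashAndRecover();
        return region.contains("row1");
    }

    public static void main(String[] args) {
        System.out.println(firstRowSurvives(false)); // false: row1 is lost
        System.out.println(firstRowSurvives(true));  // true: row1 was flushed before the load
    }
}
```

With the flush in place, the first row is already in a store file before the bulk loaded file receives its higher sequence id, so it no longer matters that the row's log entry is skipped on replay.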

Attachments

- HBASE-10958.patch (16 kB, 12/Apr/14 00:12, Jean-Daniel Cryans)
- HBASE-10958-0.94.patch (31 kB, 18/Apr/14 00:40, Jean-Daniel Cryans)
- HBASE-10958-less-intrusive-hack-0.96.patch (0.8 kB, 11/Apr/14 00:02, Jean-Daniel Cryans)
- HBASE-10958-quick-hack-0.96.patch (10 kB, 10/Apr/14 23:45, Jean-Daniel Cryans)
- HBASE-10958-v2.patch (29 kB, 15/Apr/14 20:20, Jean-Daniel Cryans)
- HBASE-10958-v3.patch (31 kB, 18/Apr/14 00:40, Jean-Daniel Cryans)

Issue Links

is blocked by: HBASE-11008 Align bulk load, flush, and compact to require Action.CREATE (Closed)

People

Assignee: Jean-Daniel Cryans

Reporter: Jean-Daniel Cryans

Votes: 2

Watchers: 21

Dates

Created: 10/Apr/14 18:08

Updated: 11/Jun/14 02:53

Resolved: 30/Apr/14 22:09