
Key: HBASE-1155
Type: Bug
Status: Resolved
Resolution: Won't Fix
Priority: Major
Assignee: stack
Reporter: Jim Kellerman
Votes: 0
Watchers: 0
Hadoop HBase

Verify that FSDataoutputStream.sync() works

Created: 26/Jan/09 07:03 PM   Updated: 09/Jun/09 04:57 AM
Component/s: master, regionserver
Affects Version/s: 0.19.0
Fix Version/s: 0.21.0

Time Tracking:
Not Specified

File Attachments:
patch.txt (Text File, licensed for inclusion in ASF works), 2009-02-25 11:47 PM, Jim Kellerman, 7 kB
Issue Links:
Blocker
 

Resolution Date: 09/Jun/09 04:57 AM


Description
In order to guarantee that an HLog sync() flushes the data to the HDFS, we will need to invoke FSDataOutputStream.sync() per HADOOP-4379.

Currently, there is no access to the underlying FSDataOutputStream from SequenceFile.Writer, as it is a package private member.
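
To illustrate the access problem described above, here is a minimal sketch of a log writer that keeps its own reference to the underlying stream so that it can expose a sync() call. It is a local-file analogy only: `SyncableLogWriter` is a hypothetical class, and `java.io.FileDescriptor.sync()` stands in for `FSDataOutputStream.sync()`. It is not the HBase HLog, the Hadoop SequenceFile.Writer, or the patch attached to this issue.

```java
import java.io.DataOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical stand-in for a log writer (assumption, not HBase or Hadoop code). */
class SyncableLogWriter implements AutoCloseable {
    // The writer keeps the raw stream reachable so sync() can get at it,
    // which is the access that a package-private member would prevent.
    private final FileOutputStream raw;
    private final DataOutputStream out;

    SyncableLogWriter(Path file) throws IOException {
        this.raw = new FileOutputStream(file.toFile());
        this.out = new DataOutputStream(raw);
    }

    /** Appends one length-prefixed record. */
    void append(byte[] record) throws IOException {
        out.writeInt(record.length);
        out.write(record);
    }

    /** Pushes written bytes down to the local device (analogy for an HDFS sync). */
    void sync() throws IOException {
        out.flush();
        raw.getFD().sync();
    }

    @Override
    public void close() throws IOException {
        out.close();
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("hlog-sketch", ".dat");
        try (SyncableLogWriter writer = new SyncableLogWriter(file)) {
            writer.append("edit-1".getBytes(StandardCharsets.UTF_8));
            writer.sync();
            System.out.println("bytes on disk after sync: " + Files.size(file));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

The point of the sketch is only the shape of the change: the writer must hold, or be given access to, the stream it wraps before it can offer a sync that reaches storage.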



Jim Kellerman made changes - 26/Jan/09 07:03 PM
Field Original Value New Value
Link This issue is blocked by HADOOP-4379 [ HADOOP-4379 ]
Jim Kellerman added a comment - 28/Jan/09 12:39 AM - edited
The latest patch for HADOOP-4379 combined with HADOOP-5027 seems to solve the problems that we have seen. As for Doug Judd's problem with getting the length of the file, that is not an issue for HBase, as we do not look at the length of the file.

We need more testing to confirm.


Jim Kellerman made changes - 05/Feb/09 08:02 AM
Description (original value):
In order to guarantee that an HLog sync() flushes the data to the HDFS, we will need to invoke FSDataOutputStream.sync() per HADOOP-4379.

Currently, there is no access to the underlying FSDataOutputStream from SequenceFile.Writer, as it is a package private member.

Waiting on HADOOP-4379 to see how this plays out.

Description (new value):
In order to guarantee that an HLog sync() flushes the data to the HDFS, we will need to invoke FSDataOutputStream.sync() per HADOOP-4379.

Currently, there is no access to the underlying FSDataOutputStream from SequenceFile.Writer, as it is a package private member.

Summary (original value): HLog flush does not invoke FSDataOutputStream.flush()
Summary (new value): Verify that FSDataoutputStream.sync() works
Jim Kellerman added a comment - 05/Feb/09 08:05 AM
Simple testing with the test programs that I attached to HADOOP-4379 seems to indicate that the patch for 4379 works. However, we need more testing in the HBase environment to verify that the patch is sufficient.

stack added a comment - 06/Feb/09 08:06 AM
We have a sequence file local to hbase. We can just change our copy?

Jim Kellerman added a comment - 06/Feb/09 04:13 PM
@Stack

Yes, there are not many changes. Will work on this as soon as I finish up what I am currently working on.


Jim Kellerman added a comment - 11/Feb/09 01:18 AM
Each record is approximately 1024 bytes.
One block is either 1,048,576 bytes (1MB) or 67,108,864 bytes (64MB).

A 1MB block holds 1,002 records.
1,026,048 bytes of records written, overhead is 22,528 bytes
overhead is 22.48 bytes/record

Expected overhead for 64MB is 1,441,792 bytes (64 x 22,528).
Expected number of records for 64MB is 64,128 (64 x 1,002).

A 64MB block holds 64,157 records.
65,696,768 bytes of records written, overhead is 1,412,096 bytes
overhead is 22.01 bytes/record

So overhead is ~22-23 bytes/record.
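
The overhead figures above can be reproduced with a few lines of arithmetic. The sketch below assumes every record is exactly 1024 bytes (the comment says "approximately") and that overhead is simply the block size minus the record bytes; `BlockOverhead` and its method names are made up for illustration.

```java
import java.util.Locale;

/** Reproduces the per-record overhead arithmetic (illustrative names, assumed 1024-byte records). */
class BlockOverhead {
    static final long RECORD_BYTES = 1024;

    /** Bytes in a full block that are not record payload. */
    static long overheadBytes(long blockBytes, long records) {
        return blockBytes - records * RECORD_BYTES;
    }

    /** Average overhead per record in a full block. */
    static double overheadPerRecord(long blockBytes, long records) {
        return (double) overheadBytes(blockBytes, records) / records;
    }

    public static void main(String[] args) {
        long oneMb = 1L << 20;        // 1,048,576 bytes
        long sixtyFourMb = 64L << 20; // 67,108,864 bytes

        // 1MB block with 1,002 records: 22,528 bytes of overhead, 22.48 bytes/record
        System.out.println(overheadBytes(oneMb, 1002));
        System.out.println(String.format(Locale.ROOT, "%.2f", overheadPerRecord(oneMb, 1002)));

        // 64MB block with 64,157 records: 1,412,096 bytes of overhead, 22.01 bytes/record
        System.out.println(overheadBytes(sixtyFourMb, 64157));
        System.out.println(String.format(Locale.ROOT, "%.2f", overheadPerRecord(sixtyFourMb, 64157)));
    }
}
```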

========================================

Without the patch, the best we can do is read up to the end of the last
full block. If we write 1024 records into 1MB blocks, we can read 1,002
records (~ the number of records in a block).

If we write 70,000 records into 64MB blocks, we can read 64,157
records back.

If less than a block is written, we get back nothing. We only get up
to the last full block.

========================================

With the patch, 1MB block size and no syncs:

  • Writing 1024 records, none are recovered
  • Writing 1200 records, 1188 are recovered
  • Writing 1500 records, 1499 are recovered
  • Writing 1000 records, 994 are recovered

There seems to be a problem with writing about 1024 records to a file
with a 1MB block size if there are no syncs. Writing more than 1024
records works (e.g., of 1500 records, 1499 are recoverable), as does
writing fewer (e.g., of 1000 records, 994 are recoverable; of 900
records, 870 are recoverable). So the problem appears to be specific
to writing close to 1MB of data into a 1MB block with no syncs.

========================================

With the patch, it appears that the block size is irrelevant and it is
possible to read up to the last sync for 64MB blocks.

With a 64MB block size:

  • If the sync rate is 1, it is possible to read every record written.
  • With a sync rate of 100, it is possible to read up to the last multiple of 100.
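
The 64MB observations amount to "everything up to the last sync is readable". A small sketch of that rule follows, assuming a sync is issued after every `syncRate` records and nothing after the last sync survives; `SyncRecovery` is an illustrative name, and the sketch does not model the lease problems seen with 1MB blocks.

```java
/** Models "readable up to the last sync" (assumed rule, illustrative names). */
class SyncRecovery {
    /** Number of records readable if only data up to the last sync survives. */
    static long recoverableRecords(long written, long syncRate) {
        if (syncRate <= 0) {
            throw new IllegalArgumentException("syncRate must be positive");
        }
        // Integer division drops the records written after the last sync.
        return (written / syncRate) * syncRate;
    }

    public static void main(String[] args) {
        System.out.println(recoverableRecords(1024, 1));   // every record: 1024
        System.out.println(recoverableRecords(1024, 100)); // last multiple of 100: 1000
    }
}
```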

With a 1MB block size:

  • Cancelling the writer's lease seems to take a lot longer.
  • Sometimes it seems never to recover the lease (e.g., write 1024 records, sync every 100 writes, 1MB block size).

More testing to do: try writing close to 64MB with a 64MB block size and see if it experiences the non-recoverability that writing ~1MB with 1MB block size does.


Repository Revision Date User Message
ASF #743191 Wed Feb 11 01:21:16 UTC 2009 jimk HBASE-1155 Verify that FSDataoutputStream.sync() works
Files Changed
MODIFY /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/src/java/org/apache/hadoop/hbase/io/SequenceFile.java
DEL /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.0-core.jar
DEL /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.0-test.jar
ADD /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.1-dev-core.jar
ADD /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/src/test/org/apache/hadoop/hbase/io/Writer.java
ADD /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/src/test/org/apache/hadoop/hbase/io/Reader.java
ADD /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.1-dev-test.jar

stack added a comment - 11/Feb/09 06:51 AM
Looks great. Do you think it works when lots of concurrent edits are written?

Jim Kellerman added a comment - 11/Feb/09 08:52 PM
To clarify Stack's question, he said:

> ----Original Message----
> From: Michael Stack
> Sent: Wednesday, February 11, 2009 12:42 PM
> To: Jim Kellerman (POWERSET)
> Subject: RE: [jira] Commented: (HBASE-1155) Verify that
> FSDataoutputStream.sync() works
>
> On a loaded cluster do appends persist or get their knickers in a twist?
> St.Ack

The answer to this question is TBD. I have yet to test how it works in a loaded cluster. So far, I have only verified that it works in a simple test. More to come soon...


Jim Kellerman added a comment - 25/Feb/09 11:47 PM
Patch that uses new APIs to recover the file lease and read from the last log file being written by the region server.

It does work, but slowly. As noted in HADOOP-4379, it takes almost an hour to recover the file lease when the clusters are loaded.

2009-02-25 21:39:16,843 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Splitting 3 of 3: hdfs:/x.y.com:8100/hbase/log_10.76.44.139_1235597506284_8020/hlog.dat.1235597820662
2009-02-25 21:39:16,847 DEBUG org.apache.hadoop.hbase.regionserver.HLog: Triggering lease recovery.
...
2009-02-25 22:37:12,755 INFO org.apache.hadoop.hbase.regionserver.HLog: log file splitting completed for hdfs://x.y.com:8100/hbase/log_10.76.44.139_1235597506284_8020
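
For reference, the gap between the "Triggering lease recovery" line and the "log file splitting completed" line can be computed from the two timestamps. This is only a sketch: it assumes both lines are in the same time zone and that the timestamp layout is `yyyy-MM-dd HH:mm:ss,SSS`; `LeaseRecoveryElapsed` is an illustrative name.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Computes the elapsed time between two log timestamps (illustrative helper). */
class LeaseRecoveryElapsed {
    // Timestamp layout assumed from the log excerpt above.
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    static Duration elapsed(String start, String end) {
        return Duration.between(
                LocalDateTime.parse(start, LOG_TS),
                LocalDateTime.parse(end, LOG_TS));
    }

    public static void main(String[] args) {
        Duration d = elapsed("2009-02-25 21:39:16,847", "2009-02-25 22:37:12,755");
        // Prints "57 min 55 s", i.e. almost an hour.
        System.out.println(d.toMinutes() + " min " + (d.getSeconds() % 60) + " s");
    }
}
```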

Jim Kellerman made changes - 25/Feb/09 11:47 PM
Attachment patch.txt [ 12400976 ]
stack added a comment - 26/Feb/09 05:01 AM
An hour is unacceptable, don't you think? Regions can't be off-line an hour. Is there a timeout we can adjust in hadoop?

Repository Revision Date User Message
ASF #748261 Thu Feb 26 18:25:19 UTC 2009 jimk HBASE-1155 Changes for HLog
Files Changed
MODIFY /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/src/java/org/apache/hadoop/hbase/regionserver/HLog.java
MODIFY /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.1-dev-core.jar
MODIFY /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/lib/hadoop-0.19.1-dev-test.jar

Jim Kellerman added a comment - 26/Feb/09 06:30 PM
Yes, an hour is way too long. I asked in HADOOP-4379 if there is a way to speed it up.

Repository Revision Date User Message
ASF #750583 Thu Mar 05 21:01:16 UTC 2009 jimk HBASE-1155 - commit patched Hadoop jars, add patches to SequenceFile and HLog
Files Changed
DEL /hadoop/hbase/branches/0.19/lib/hadoop-0.19.1-test.jar
ADD /hadoop/hbase/branches/0.19/lib/hadoop-0.19.1-dev-core.jar
ADD /hadoop/hbase/branches/0.19/lib/hadoop-0.19.1-dev-test.jar
MODIFY /hadoop/hbase/branches/0.19/src/java/org/apache/hadoop/hbase/io/SequenceFile.java
MODIFY /hadoop/hbase/branches/0.19/src/java/org/apache/hadoop/hbase/regionserver/HLog.java
DEL /hadoop/hbase/branches/0.19/lib/hadoop-0.19.1-core.jar

Repository Revision Date User Message
ASF #750749 Fri Mar 06 02:05:04 UTC 2009 jimk HBASE-1155, this time to the correct branch.
Files Changed
MODIFY /hadoop/hbase/branches/trunk_on_hadoop-0.19.1-dev_with_hadoop-4379/src/java/org/apache/hadoop/hbase/regionserver/HLog.java

Jim Kellerman added a comment - 06/Mar/09 06:51 PM
Moving out of 0.19.1 because it is unlikely we will get a patch for HADOOP-4379 soon enough.

Jim Kellerman made changes - 06/Mar/09 06:51 PM
Fix Version/s 0.19.1 [ 12313591 ]
Fix Version/s 0.19.2 [ 12313688 ]
stack added a comment - 28/Apr/09 04:47 PM
Moving out of 0.20.0. It doesn't work in hadoop 0.20.0. Hopefully 0.21.0.

stack made changes - 28/Apr/09 04:47 PM
Fix Version/s 0.19.2 [ 12313688 ]
Fix Version/s 0.20.0 [ 12313474 ]
Fix Version/s 0.21.0 [ 12313607 ]
stack made changes - 20/May/09 06:27 PM
Assignee Jim Kellerman [ jimk ] stack [ stack ]
stack added a comment - 20/May/09 06:28 PM
Have been testing latest patches in HADOOP-4379. See notes there on current state.

stack added a comment - 09/Jun/09 04:57 AM
Resolving this issue as Won't Fix. It is being dealt with over in HBASE-1470.

stack made changes - 09/Jun/09 04:57 AM
Resolution Won't Fix [ 2 ]
Status Open [ 1 ] Resolved [ 5 ]