|
[
Permlink
| « Hide
]
Raghu Angadi added a comment - 30/Sep/08 06:31 PM
This affects 0.18 as well. For 18, the fix could be as simple as just throwing a 'not supported' exception.
There is another bug in LocalFileSystem:
When creating a file with LocalFileSystem.create(Path f, FsPermission permission, boolean overwrite, int bufferSize, short replication, long blockSize, Progressable progress), no checksum will be created. Created
The complexity of append exists when the last checksum chunk is partial. In this case, I plan to first read the partial chunk out, then truncate the data file to exclude the last chunk and truncate the crc file to exclude the last 4 bytes. Afterwards write the partial chunk again and data to be append.
Since LocalFileSystem is a ChecksumFileSystem, I plan to implement the above algorithm in ChecksumFileSystem so it is general to any checksum file system. For any checksum file system to support append, its raw file system needs to implement a TruncatableFileSystem interface that supports a truncate method: FSDataOutputStream truncate(Path file, long len); This method sets the give file's length be the given len and returns a FSDataOutputStream, whose current position is equal to the given len. Specifically what's in my mind is to introduce the following interface:
TruncatableFileSystem.java interface TruncatableFileSystem { // set the file length to be the given len and the current position to be the given len FSDataOutputStream truncate(Path file, long len); RawLocalFileSystem will implement TruncatableFileSystem. In ChecksumFileSystem.append, if the raw file system is not a TruncatableFileSystem, throw an exception indicating that the append is not supported. Otherwise, read the last partial trunk, truncate both the data file and crc file, rewrite the last partial trunk. truncate implies that if the first append() fails, user can actually lose data that existed in the file before append().
Raghu's comment makes sense. How about we truncating only the crc file not the data file? So data is not at the risk of losing.
Another option is that we could have a positional write interface instead of having a truncate interface. Something like:
interface PositionalWritableFileSystem { // Return an output stream with the given pos as its current position; next write will write to the current position FSOutputStream append(File file, long pos); } The "append" design (
As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated.
An alternative would be to re-write the entire CRC file when truncate is not possible. Should we add truncate() to FileSystem.java instead of to a new interface? Why does the truncate method return an FSDataOutputStream? Perhaps it could return a boolean, indicating whether it was successful. Then the default implementation could simply return false. Or it could return void and the default would throw UnsupportedOperationException. > The "append" design (
ChecksumFileSystem is managed only on the client. It won't work for multiple appenders, just like it does not work for multiple writers now. Note that DFSOutputStream is not a ChecksumFileSystem. Not sure if this is related to > Note that DFSOutputStream is not a ChecksumFileSystem.
I meant to say DFS is not a ChecksumFileSystem. The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs. But if this is OK with everybody, it is fine with me too.
truncate does seem like a common-enough file operation to put in FileSystem.java.
> If truncate returns an FSDataOutput, next write does not need to open it for appending first. My intuition is to make this two separate operations: truncate, and open-for-append. Do you have a strong reason to connect them? > As Raghu points out and you acknowledged on the list, only the CRC file needs to be truncated.
Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this? > The reason that I did not propose adding truncate() to FileSystem.java is that I did not make dramatic change to FileSystem APIs. There will be a lot of changes. Are we going to apply all these changes to 0.18 and onwards? > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?
We could seek to the right position and start writing from there. If that is not allowed, we could copy the crc file. For this jira, we can implement this for LocalFS (and any simpler Filesystems), the default implementation at ChecksumFS could be to throw unsupported exception. > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?
With ChecksumFileSystem, lots of failures can result in data-checksum inconsistency. I don't think this makes the situation particularly worse. Perhaps we could change the checksum file's format so that it includes a trailing element that includes the file's length or somesuch, to avoid such false-positives. > There will be a lot of changes [ ... ] More than the addition of a single method to FileSystem.java? > Even only the CRC file is truncated, fail in append could result in data-checksum inconsistency. How could we fix this?
Append is an operation that changes the content of file. So it has a potential risk that may lead to data-crc inconsistency. This is a problem for append & write. I do not plan to get it solved in this issue. > My intuition is to make this two separate operations: truncate, and open-for-append. Does the community feel comfortable for adding the method truncate to FileSystem? +1 on introducing FileSystem.truncate(Path, offset)
-1 on introducing FileSystem.truncate in 0.19.
I think that this patch should make ChecksumFileSystem.append throw UnsupportedOperationException. If someone wants to make it work in 0.20 that is great. But I don't think we should push it into 0.19. Owen> -1 on introducing FileSystem.truncate in 0.19.
I agree. I didn't notice earlier that this was targeted for 0.19. Append for the local fs is a new feature, and should go in 0.20. Truncate for any fs is clearly a new feature. This patch marks "append" as "not supported" in ChecksumFileSystem. As a result, append is not supported in LocalFileSystem.
+1 patch looks good.
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12392037/LFSAppend.patch against trunk revision 704261. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 Eclipse classpath. The patch retains Eclipse classpath integrity. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3453/testReport/ This message is automatically generated. |
||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||