Description
Looks like ChecksumFileSystem fails to read a file when bytesPerChecksum is larger than io.file.buffer.size. The defaults for bytesPerChecksum and io.file.buffer.size are 512 and 4096 respectively, so the default config might not hit the problem.
I noticed this problem when I was testing block level CRCs with different configs.
How to reproduce with the latest trunk:
Copy a text file larger than 512 bytes to DFS: bin/hadoop fs -copyFromLocal ~/tmp/x.txt x.txt
Then set io.file.buffer.size to something smaller than 512 (say 53) and try to read the file:
bin/hadoop dfs -cat x.txt
This will print only the first 53 characters.
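For reference, the buffer size for the second step can be lowered in the site config (I put it in conf/hadoop-site.xml; 53 is just an arbitrary value below 512):

<property>
  <name>io.file.buffer.size</name>
  <value>53</value>
</property>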
The following code at ChecksumFileSystem.java:163 (and the comment in it) seems suspect, though I am not sure whether more changes are required:
public int read(byte b[], int off, int len) throws IOException {
  // make sure that it ends at a checksum boundary
  long curPos = getPos();
  long endPos = len+curPos/bytesPerSum*bytesPerSum;
  return readBuffer(b, off, (int)(endPos-curPos));
}
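My reading of why this goes wrong (a sketch of the arithmetic only, assuming the cat path issues reads of io.file.buffer.size bytes): since + binds more loosely than / and *, endPos is len plus curPos rounded down to a checksum boundary. With bytesPerSum = 512 and 53-byte reads, the first read at curPos = 0 returns 53 bytes, but the second read at curPos = 53 computes endPos = 53 + (53/512)*512 = 53, so readBuffer is asked for 0 bytes and the caller sees what looks like end of stream:

// Standalone demo of the length computation in the quoted read();
// the bytesPerSum, len and curPos values are taken from the repro above.
public class ReadLenDemo {
  public static void main(String[] args) {
    long bytesPerSum = 512;  // io.bytes.per.checksum default
    int len = 53;            // io.file.buffer.size from the repro
    long curPos = 53;        // position after the first 53-byte read
    long endPos = len + curPos / bytesPerSum * bytesPerSum;
    // prints "requested length = 0", so no bytes past the first 53 are read
    System.out.println("requested length = " + (endPos - curPos));
  }
}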
Issue Links
- is part of HADOOP-1470: Rework FSInputChecker and FSOutputSummer to support checksum code sharing between ChecksumFileSystem and block level crc dfs (Closed)