HADOOP-79: listFiles optimization

Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Fix Version/s: 0.1.0

    Description

      In FSDirectory.getListing(), consider the line:
      listing[i] = new DFSFileInfo(curName, cur.computeFileLength(), cur.computeContentsLength(), isDir(curName));

      1. computeContentsLength() itself calls computeFileLength(), so the file
      length is computed twice.
      2. isDir() searches for the INode (starting from rootDir) that was already
      obtained just two lines above; note that the tree is locked by that time.

      I propose a simple optimization for this; see the attachment. A simplified
      sketch of the idea appears after point 3 below.

      3. A related question: why does DFSFileInfo need two separate fields, len
      for the file length and contentsLen for the directory contents size? These
      fields look mutually exclusive, so we could use just one, interpreting it
      one way or the other depending on the value of isDir.
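
      Below is a minimal, self-contained sketch of the shape of this change. The
      INode and DFSFileInfo classes here are simplified stand-ins rather than the
      real Hadoop classes, and the actual change is in the attached
      DFSFileInfo.patch, which may differ. The sketch computes the length once
      per child, asks isDir() of the INode already in hand, and uses a single
      length field for both cases:

      import java.util.List;

      class ListingSketch {
          // Simplified stand-in for INode; children is null for a plain file.
          static class INode {
              String name;
              List<INode> children;
              long blocksLength;             // sum of block sizes for a file

              boolean isDir() { return children != null; }

              // File length for a file; total contents size for a directory.
              long computeLength() {
                  if (!isDir()) return blocksLength;
                  long total = 0;
                  for (INode child : children) total += child.computeLength();
                  return total;
              }
          }

          // Stand-in DFSFileInfo with one length field, interpreted as file
          // length or directory contents size depending on isDir (point 3).
          static class DFSFileInfo {
              final String name;
              final long length;
              final boolean isDir;
              DFSFileInfo(String name, long length, boolean isDir) {
                  this.name = name; this.length = length; this.isDir = isDir;
              }
          }

          // Optimized listing loop: the length is computed once per child, and
          // isDir() is answered by the node we already hold instead of
          // re-resolving the name from rootDir (points 1 and 2).
          static DFSFileInfo[] getListing(INode dir) {
              DFSFileInfo[] listing = new DFSFileInfo[dir.children.size()];
              int i = 0;
              for (INode cur : dir.children) {
                  listing[i++] = new DFSFileInfo(cur.name, cur.computeLength(), cur.isDir());
              }
              return listing;
          }
      }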

      Attachments

        1. DFSFileInfo.patch (3 kB, Konstantin Shvachko)

          People

            Assignee: Konstantin Shvachko (shv)
            Reporter: Konstantin Shvachko (shv)
