Key: HADOOP-4647
Type: Bug
Status: Closed
Resolution: Fixed
Priority: Major
Assignee: Tsz Wo (Nicholas), SZE
Reporter: Tsz Wo (Nicholas), SZE
Votes: 0
Watchers: 1
Hadoop Common

NamenodeFsck creates a new DFSClient but never closes it

Created: 13/Nov/08 12:20 AM   Updated: 08/Jul/09 04:43 PM
Component/s: None
Affects Version/s: None
Fix Version/s: 0.18.3

Time Tracking:
Not Specified

File Attachments:
4647_20081113.patch       2008-11-13 06:19 PM   Tsz Wo (Nicholas), SZE   1 kB
4647_20081113b.patch      2008-11-13 10:34 PM   Tsz Wo (Nicholas), SZE   0.7 kB
4647_20081118.patch       2008-11-19 12:18 AM   Tsz Wo (Nicholas), SZE   2 kB
4647_20081118_0.18.patch  2008-11-21 11:59 PM   Tsz Wo (Nicholas), SZE   2 kB

Hadoop Flags: Reviewed
Resolution Date: 22/Nov/08 12:10 AM


Description:
In NamenodeFsck.lostFoundMove(FileStatus file, LocatedBlocks blocks), a new DFSClient is created but never closed.

Comments:
Tsz Wo (Nicholas), SZE added a comment - 13/Nov/08 06:19 PM
4647_20081113.patch: added try-finally for closing DFSClient.

Hairong Kuang added a comment - 13/Nov/08 10:30 PM
I would prefer not to change the signature of lostFoundMove; instead, do the try-catch inside lostFoundMove.

Tsz Wo (Nicholas), SZE added a comment - 13/Nov/08 10:34 PM
4647_20081113b.patch: incorporated Hairong's comment.
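
For illustration only, here is a minimal sketch of the shape discussed above: the client is created inside the method and closed in a finally block, so the method signature does not change. This is not the attached patch. FakeClient is a hypothetical stand-in for DFSClient, and the method names and String parameter are simplified assumptions so that the example runs without Hadoop on the classpath.

```java
import java.io.Closeable;
import java.io.IOException;

public class FsckCloseSketch {
    /** Hypothetical stand-in for DFSClient; counts instances that are still open. */
    static class FakeClient implements Closeable {
        static int open = 0;

        FakeClient() { open++; }

        /** Pretend to move a file; fails on an empty path to exercise the error path. */
        void move(String path) throws IOException {
            if (path.isEmpty()) {
                throw new IOException("empty path");
            }
        }

        @Override
        public void close() { open--; }
    }

    /** Leaky shape: a client is created on every call but never closed. */
    static void lostFoundMoveLeaky(String path) throws IOException {
        FakeClient dfs = new FakeClient();
        dfs.move(path);
    }

    /** Fixed shape: same signature, and the client is closed even if move() throws. */
    static void lostFoundMoveFixed(String path) throws IOException {
        FakeClient dfs = new FakeClient();
        try {
            dfs.move(path);
        } finally {
            dfs.close();
        }
    }

    public static void main(String[] args) throws IOException {
        lostFoundMoveLeaky("/a");
        System.out.println("open after leaky call: " + FakeClient.open);

        FakeClient.open = 0;
        try {
            lostFoundMoveFixed("");   // move() throws, finally still closes the client
        } catch (IOException expected) {
            // the failure is expected here; only the open count matters
        }
        System.out.println("open after fixed call: " + FakeClient.open);
    }
}
```

In the sketch, the leaky variant leaves one client open per call, while the fixed variant releases it on both the normal and the exceptional path.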

Hairong Kuang added a comment - 13/Nov/08 10:57 PM
+1

dhruba borthakur added a comment - 17/Nov/08 07:46 AM
Does this mean that the NameNode might leak a few socket descriptors every time an fsck is invoked?

Hadoop QA added a comment - 17/Nov/08 01:00 PM
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12393900/4647_20081113b.patch
against trunk revision 714107.

+1 @author. The patch does not contain any @author tags.

-1 tests included. The patch doesn't appear to include any new or modified tests.
Please justify why no tests are needed for this patch.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 Eclipse classpath. The patch retains Eclipse classpath integrity.

-1 core tests. The patch failed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3596/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3596/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3596/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3596/console

This message is automatically generated.


Tsz Wo (Nicholas), SZE added a comment - 17/Nov/08 06:34 PM
> Does this mean that the NameNode might leak a few socket descriptors every time an fsck is invoked?

This is probably the case. Also, it leaks a thread each time in 0.17. All of this leaking happens only when running fsck with -move.


Tsz Wo (Nicholas), SZE added a comment - 17/Nov/08 07:00 PM
In build #3596, TestFsck failed at line 77, DFSTestUtil.waitReplication(...), during the test setup. Fsck was not involved yet. The log contains many messages like the following:
[junit] 2008-11-17 11:19:37,661 INFO  FSNamesystem.audit (FSNamesystem.java:logAuditEvent(107)) - ugi=hudson,hudson	ip=/127.0.0.1	cmd=open	src=/srcdat/57758981436956897	dst=null	perm=null
[junit] File /srcdat/57758981436956897 has replication factor 4
[junit] Waiting for replication factor to drain

It seems that the file 57758981436956897 somehow has replication factor 4 and won't drain back to 3.


Tsz Wo (Nicholas), SZE added a comment - 19/Nov/08 12:18 AM
4647_20081118.patch: fixed the test by setting a short block report interval.

Hadoop QA added a comment - 21/Nov/08 11:41 PM
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12394205/4647_20081118.patch
against trunk revision 719651.

+1 @author. The patch does not contain any @author tags.

+1 tests included. The patch appears to include 3 new or modified tests.

+1 javadoc. The javadoc tool did not generate any warning messages.

+1 javac. The applied patch does not increase the total number of javac compiler warnings.

+1 findbugs. The patch does not introduce any new Findbugs warnings.

+1 Eclipse classpath. The patch retains Eclipse classpath integrity.

+1 core tests. The patch passed core unit tests.

+1 contrib tests. The patch passed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3622/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3622/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3622/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/3622/console

This message is automatically generated.


Tsz Wo (Nicholas), SZE added a comment - 21/Nov/08 11:59 PM
4647_20081118_0.18.patch: for 0.18

Tsz Wo (Nicholas), SZE added a comment - 22/Nov/08 12:10 AM
I just committed this.

Hudson added a comment - 22/Nov/08 04:28 PM
Integrated in Hadoop-trunk #668 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/668/)
HADOOP-4647. NamenodeFsck should close the DFSClient it has created. (szetszwo)