Hadoop Common
  1. Hadoop Common
  2. HADOOP-6180

Namenode slowed down when many files with same filename were moved to Trash

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.21.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      This fix changes the naming of files moved to Trash.
      Before files with the same name were appended with a running number 1,2,3.. and so on. Now it will have the current time in milliseconds attached to the name.

      1. COMMON-124.patch
        8 kB
        Boris Shkolnik

        Activity

        Hide
        Robert Chansler added a comment -

        Editorial pass over all release notes prior to publication of 0.21.

        Show
        Robert Chansler added a comment - Editorial pass over all release notes prior to publication of 0.21.
        Hide
        Hudson added a comment -

        Integrated in Hadoop-Common-trunk #48 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/48/)
        . NameNode slowed down when many files with same filename were moved to Trash. Contributed by Boris Shkolnik.

        Show
        Hudson added a comment - Integrated in Hadoop-Common-trunk #48 (See http://hudson.zones.apache.org/hudson/job/Hadoop-Common-trunk/48/ ) . NameNode slowed down when many files with same filename were moved to Trash. Contributed by Boris Shkolnik.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        After updated hadoop-core-0.21.0-dev.jar and hadoop-core-test-0.21.0-dev.jar on hdfs, TestHDFSTrash failed. Could someone take a look?

        ------------- Standard Error -----------------
        rmr: Cannot move "hdfs://localhost:57940/user/tsz" to the trash, as it contains  
        the trash
        ------------- ---------------- ---------------
        
        Testcase: testTrash took 1.451 sec
                Caused an ERROR
        null
        java.lang.NullPointerException
                at org.apache.hadoop.fs.TestTrash.countSameDeletedFiles(TestTrash.java:77
        5)
                at org.apache.hadoop.fs.TestTrash.trashShell(TestTrash.java:363)
                at org.apache.hadoop.hdfs.TestHDFSTrash.testTrash(TestHDFSTrash.java:54)
                at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
                at junit.extensions.TestSetup$1.protect(TestSetup.java:23)
                at junit.extensions.TestSetup.run(TestSetup.java:27)
        
        Testcase: testNonDefaultFS took 0.262 sec
        
        Show
        Tsz Wo Nicholas Sze added a comment - After updated hadoop-core-0.21.0-dev.jar and hadoop-core-test-0.21.0-dev.jar on hdfs, TestHDFSTrash failed. Could someone take a look? ------------- Standard Error ----------------- rmr: Cannot move "hdfs://localhost:57940/user/tsz" to the trash, as it contains the trash ------------- ---------------- --------------- Testcase: testTrash took 1.451 sec Caused an ERROR null java.lang.NullPointerException at org.apache.hadoop.fs.TestTrash.countSameDeletedFiles(TestTrash.java:77 5) at org.apache.hadoop.fs.TestTrash.trashShell(TestTrash.java:363) at org.apache.hadoop.hdfs.TestHDFSTrash.testTrash(TestHDFSTrash.java:54) at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24) at junit.extensions.TestSetup$1.protect(TestSetup.java:23) at junit.extensions.TestSetup.run(TestSetup.java:27) Testcase: testNonDefaultFS took 0.262 sec
        Hide
        Hairong Kuang added a comment -

        I've just committed this. Thanks Boris!

        Show
        Hairong Kuang added a comment - I've just committed this. Thanks Boris!
        Hide
        Hairong Kuang added a comment -

        I was about to commit this patch, but realized that this issue should have belonged to hadoop project common.

        Show
        Hairong Kuang added a comment - I was about to commit this patch, but realized that this issue should have belonged to hadoop project common.
        Hide
        Boris Shkolnik added a comment -

        ant test-patch
        +1 overall.
        [exec]
        [exec] +1 @author. The patch does not contain any @author tags.
        [exec]
        [exec] +1 tests included. The patch appears to include 3 new or modified tests.
        [exec]
        [exec] +1 javadoc. The javadoc tool did not generate any warning messages.
        [exec]
        [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
        [exec]
        [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
        [exec]
        [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
        [exec]

        ant test

        run-test-core:
        [mkdir] Created dir: /home/borya/work/hadoop/hadoop-common/build/test/data
        [mkdir] Created dir: /home/borya/work/hadoop/hadoop-common/build/test/logs
        [copy] Copying 1 file to /home/borya/work/hadoop/hadoop-common/build/test/extraconf
        [junit] Running org.apache.hadoop.cli.TestCLI
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.292 sec
        [junit] Running org.apache.hadoop.conf.TestConfiguration
        [junit] Tests run: 13, Failures: 0, Errors: 0, Time elapsed: 1.198 sec
        [junit] Running org.apache.hadoop.conf.TestConfigurationSubclass
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.419 sec
        [junit] Running org.apache.hadoop.conf.TestGetInstances
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.381 sec
        [junit] Running org.apache.hadoop.filecache.TestDistributedCache
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.198 sec
        [junit] Running org.apache.hadoop.fs.TestBlockLocation
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.229 sec
        [junit] Running org.apache.hadoop.fs.TestChecksumFileSystem
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.561 sec
        [junit] Running org.apache.hadoop.fs.TestDFVariations
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.375 sec
        [junit] Running org.apache.hadoop.fs.TestDU
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.497 sec
        [junit] Running org.apache.hadoop.fs.TestGetFileBlockLocations
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.278 sec
        [junit] Running org.apache.hadoop.fs.TestGlobExpander
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.055 sec
        [junit] Running org.apache.hadoop.fs.TestLocalDirAllocator
        [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.106 sec
        [junit] Running org.apache.hadoop.fs.TestLocalFileSystem
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.096 sec
        [junit] Running org.apache.hadoop.fs.TestLocalFileSystemPermission
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.584 sec
        [junit] Running org.apache.hadoop.fs.TestPath
        [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.084 sec
        [junit] Running org.apache.hadoop.fs.TestTrash
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.506 sec
        [junit] Running org.apache.hadoop.fs.TestTruncatedInputBug
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.531 sec
        [junit] Running org.apache.hadoop.fs.kfs.TestKosmosFileSystem
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.824 sec
        [junit] Running org.apache.hadoop.fs.permission.TestFsPermission
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.1 sec
        [junit] Running org.apache.hadoop.fs.s3.TestINode
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.076 sec
        [junit] Running org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract
        [junit] Tests run: 29, Failures: 0, Errors: 0, Time elapsed: 2.437 sec
        [junit] Running org.apache.hadoop.fs.s3.TestS3Credentials
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.195 sec
        [junit] Running org.apache.hadoop.fs.s3.TestS3FileSystem
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.237 sec
        [junit] Running org.apache.hadoop.fs.s3native.TestInMemoryNativeS3FileSystemContract
        [junit] Tests run: 35, Failures: 0, Errors: 0, Time elapsed: 2.031 sec
        [junit] Running org.apache.hadoop.http.TestGlobalFilter
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.535 sec
        [junit] Running org.apache.hadoop.http.TestServletFilter
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.826 sec
        [junit] Running org.apache.hadoop.io.TestArrayFile
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 6.627 sec
        [junit] Running org.apache.hadoop.io.TestArrayWritable
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.213 sec
        [junit] Running org.apache.hadoop.io.TestBloomMapFile
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.766 sec
        [junit] Running org.apache.hadoop.io.TestBytesWritable
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.186 sec
        [junit] Running org.apache.hadoop.io.TestDefaultStringifier
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.341 sec
        [junit] Running org.apache.hadoop.io.TestEnumSetWritable
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.2 sec
        [junit] Running org.apache.hadoop.io.TestGenericWritable
        [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.527 sec
        [junit] Running org.apache.hadoop.io.TestMD5Hash
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.088 sec
        [junit] Running org.apache.hadoop.io.TestMapFile
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.861 sec
        [junit] Running org.apache.hadoop.io.TestMapWritable
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.289 sec
        [junit] Running org.apache.hadoop.io.TestSequenceFileSerialization
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.672 sec
        [junit] Running org.apache.hadoop.io.TestSetFile
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.44 sec
        [junit] Running org.apache.hadoop.io.TestSortedMapWritable
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.263 sec
        [junit] Running org.apache.hadoop.io.TestText
        [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 1.092 sec
        [junit] Running org.apache.hadoop.io.TestTextNonUTF8
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.046 sec
        [junit] Running org.apache.hadoop.io.TestUTF8
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.223 sec
        [junit] Running org.apache.hadoop.io.TestVersionedWritable
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.075 sec
        [junit] Running org.apache.hadoop.io.TestWritable
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.079 sec
        [junit] Running org.apache.hadoop.io.TestWritableName
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.231 sec
        [junit] Running org.apache.hadoop.io.TestWritableUtils
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.036 sec
        [junit] Running org.apache.hadoop.io.compress.TestCodec
        [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 20.625 sec
        [junit] Running org.apache.hadoop.io.compress.TestCodecFactory
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.453 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFile
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.756 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileByteArrays
        [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 6.188 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileComparators
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.555 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays
        [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 6.104 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsByteArrays
        [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 0.207 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams
        [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 0.201 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsByteArrays
        [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 3.298 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsJClassComparatorByteArrays
        [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 3.378 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsStreams
        [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 3.224 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSeek
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.468 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSeqFileComparison
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.683 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSplit
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.542 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileStreams
        [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 3.893 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestTFileUnsortedByteArrays
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.875 sec
        [junit] Running org.apache.hadoop.io.file.tfile.TestVLong
        [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 4.215 sec
        [junit] Running org.apache.hadoop.io.retry.TestRetryProxy
        [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.417 sec
        [junit] Running org.apache.hadoop.io.serializer.TestWritableSerialization
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.264 sec
        [junit] Running org.apache.hadoop.io.serializer.avro.TestAvroSerialization
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.677 sec
        [junit] Running org.apache.hadoop.ipc.TestIPC
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 15.459 sec
        [junit] Running org.apache.hadoop.ipc.TestIPCServerResponder
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.274 sec
        [junit] Running org.apache.hadoop.ipc.TestRPC
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 34.832 sec
        [junit] Running org.apache.hadoop.log.TestLogLevel
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.407 sec
        [junit] Running org.apache.hadoop.metrics.TestMetricsServlet
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.142 sec
        [junit] Running org.apache.hadoop.metrics.spi.TestOutputRecord
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.048 sec
        [junit] Running org.apache.hadoop.net.TestDNS
        [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.107 sec
        [junit] Running org.apache.hadoop.net.TestScriptBasedMapping
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.489 sec
        [junit] Running org.apache.hadoop.net.TestSocketIOWithTimeout
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.091 sec
        [junit] Running org.apache.hadoop.record.TestBuffer
        [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.069 sec
        [junit] Running org.apache.hadoop.record.TestRecordIO
        [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.209 sec
        [junit] Running org.apache.hadoop.record.TestRecordVersioning
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.143 sec
        [junit] Running org.apache.hadoop.security.TestAccessControlList
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.047 sec
        [junit] Running org.apache.hadoop.security.TestAccessToken
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.029 sec
        [junit] Running org.apache.hadoop.security.TestUnixUserGroupInformation
        [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.534 sec
        [junit] Running org.apache.hadoop.security.authorize.TestConfiguredPolicy
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.401 sec
        [junit] Running org.apache.hadoop.util.TestCyclicIteration
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.049 sec
        [junit] Running org.apache.hadoop.util.TestGenericsUtil
        [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.247 sec
        [junit] Running org.apache.hadoop.util.TestIndexedSort
        [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.097 sec
        [junit] Running org.apache.hadoop.util.TestProcfsBasedProcessTree
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 7.739 sec
        [junit] Running org.apache.hadoop.util.TestPureJavaCrc32
        [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.372 sec
        [junit] Running org.apache.hadoop.util.TestShell
        [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4.265 sec
        [junit] Running org.apache.hadoop.util.TestStringUtils
        [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.071 sec

        Show
        Boris Shkolnik added a comment - ant test-patch +1 overall. [exec] [exec] +1 @author. The patch does not contain any @author tags. [exec] [exec] +1 tests included. The patch appears to include 3 new or modified tests. [exec] [exec] +1 javadoc. The javadoc tool did not generate any warning messages. [exec] [exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings. [exec] [exec] +1 findbugs. The patch does not introduce any new Findbugs warnings. [exec] [exec] +1 release audit. The applied patch does not increase the total number of release audit warnings. [exec] ant test run-test-core: [mkdir] Created dir: /home/borya/work/hadoop/hadoop-common/build/test/data [mkdir] Created dir: /home/borya/work/hadoop/hadoop-common/build/test/logs [copy] Copying 1 file to /home/borya/work/hadoop/hadoop-common/build/test/extraconf [junit] Running org.apache.hadoop.cli.TestCLI [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.292 sec [junit] Running org.apache.hadoop.conf.TestConfiguration [junit] Tests run: 13, Failures: 0, Errors: 0, Time elapsed: 1.198 sec [junit] Running org.apache.hadoop.conf.TestConfigurationSubclass [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.419 sec [junit] Running org.apache.hadoop.conf.TestGetInstances [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.381 sec [junit] Running org.apache.hadoop.filecache.TestDistributedCache [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.198 sec [junit] Running org.apache.hadoop.fs.TestBlockLocation [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.229 sec [junit] Running org.apache.hadoop.fs.TestChecksumFileSystem [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.561 sec [junit] Running org.apache.hadoop.fs.TestDFVariations [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.375 sec [junit] Running org.apache.hadoop.fs.TestDU [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 5.497 sec [junit] Running org.apache.hadoop.fs.TestGetFileBlockLocations [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 1.278 sec [junit] Running org.apache.hadoop.fs.TestGlobExpander [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.055 sec [junit] Running org.apache.hadoop.fs.TestLocalDirAllocator [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 1.106 sec [junit] Running org.apache.hadoop.fs.TestLocalFileSystem [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 1.096 sec [junit] Running org.apache.hadoop.fs.TestLocalFileSystemPermission [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.584 sec [junit] Running org.apache.hadoop.fs.TestPath [junit] Tests run: 8, Failures: 0, Errors: 0, Time elapsed: 0.084 sec [junit] Running org.apache.hadoop.fs.TestTrash [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.506 sec [junit] Running org.apache.hadoop.fs.TestTruncatedInputBug [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.531 sec [junit] Running org.apache.hadoop.fs.kfs.TestKosmosFileSystem [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.824 sec [junit] Running org.apache.hadoop.fs.permission.TestFsPermission [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.1 sec [junit] Running org.apache.hadoop.fs.s3.TestINode [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.076 sec [junit] Running org.apache.hadoop.fs.s3.TestInMemoryS3FileSystemContract [junit] Tests run: 29, Failures: 0, Errors: 0, Time elapsed: 2.437 sec [junit] Running org.apache.hadoop.fs.s3.TestS3Credentials [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.195 sec [junit] Running org.apache.hadoop.fs.s3.TestS3FileSystem [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.237 sec [junit] Running org.apache.hadoop.fs.s3native.TestInMemoryNativeS3FileSystemContract [junit] Tests run: 35, Failures: 0, Errors: 0, Time elapsed: 2.031 sec [junit] Running org.apache.hadoop.http.TestGlobalFilter [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.535 sec [junit] Running org.apache.hadoop.http.TestServletFilter [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.826 sec [junit] Running org.apache.hadoop.io.TestArrayFile [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 6.627 sec [junit] Running org.apache.hadoop.io.TestArrayWritable [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.213 sec [junit] Running org.apache.hadoop.io.TestBloomMapFile [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.766 sec [junit] Running org.apache.hadoop.io.TestBytesWritable [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.186 sec [junit] Running org.apache.hadoop.io.TestDefaultStringifier [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.341 sec [junit] Running org.apache.hadoop.io.TestEnumSetWritable [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.2 sec [junit] Running org.apache.hadoop.io.TestGenericWritable [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.527 sec [junit] Running org.apache.hadoop.io.TestMD5Hash [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.088 sec [junit] Running org.apache.hadoop.io.TestMapFile [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.861 sec [junit] Running org.apache.hadoop.io.TestMapWritable [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.289 sec [junit] Running org.apache.hadoop.io.TestSequenceFileSerialization [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.672 sec [junit] Running org.apache.hadoop.io.TestSetFile [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.44 sec [junit] Running org.apache.hadoop.io.TestSortedMapWritable [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.263 sec [junit] Running org.apache.hadoop.io.TestText [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 1.092 sec [junit] Running org.apache.hadoop.io.TestTextNonUTF8 [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.046 sec [junit] Running org.apache.hadoop.io.TestUTF8 [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.223 sec [junit] Running org.apache.hadoop.io.TestVersionedWritable [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.075 sec [junit] Running org.apache.hadoop.io.TestWritable [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.079 sec [junit] Running org.apache.hadoop.io.TestWritableName [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.231 sec [junit] Running org.apache.hadoop.io.TestWritableUtils [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.036 sec [junit] Running org.apache.hadoop.io.compress.TestCodec [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 20.625 sec [junit] Running org.apache.hadoop.io.compress.TestCodecFactory [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.453 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFile [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 2.756 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileByteArrays [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 6.188 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileComparators [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.555 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileJClassComparatorByteArrays [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 6.104 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsByteArrays [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 0.207 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileLzoCodecsStreams [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 0.201 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsByteArrays [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 3.298 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsJClassComparatorByteArrays [junit] Tests run: 25, Failures: 0, Errors: 0, Time elapsed: 3.378 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileNoneCodecsStreams [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 3.224 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSeek [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 6.468 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSeqFileComparison [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 23.683 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileSplit [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.542 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileStreams [junit] Tests run: 19, Failures: 0, Errors: 0, Time elapsed: 3.893 sec [junit] Running org.apache.hadoop.io.file.tfile.TestTFileUnsortedByteArrays [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.875 sec [junit] Running org.apache.hadoop.io.file.tfile.TestVLong [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 4.215 sec [junit] Running org.apache.hadoop.io.retry.TestRetryProxy [junit] Tests run: 9, Failures: 0, Errors: 0, Time elapsed: 0.417 sec [junit] Running org.apache.hadoop.io.serializer.TestWritableSerialization [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.264 sec [junit] Running org.apache.hadoop.io.serializer.avro.TestAvroSerialization [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.677 sec [junit] Running org.apache.hadoop.ipc.TestIPC [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 15.459 sec [junit] Running org.apache.hadoop.ipc.TestIPCServerResponder [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 11.274 sec [junit] Running org.apache.hadoop.ipc.TestRPC [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 34.832 sec [junit] Running org.apache.hadoop.log.TestLogLevel [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.407 sec [junit] Running org.apache.hadoop.metrics.TestMetricsServlet [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 0.142 sec [junit] Running org.apache.hadoop.metrics.spi.TestOutputRecord [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.048 sec [junit] Running org.apache.hadoop.net.TestDNS [junit] Tests run: 7, Failures: 0, Errors: 0, Time elapsed: 0.107 sec [junit] Running org.apache.hadoop.net.TestScriptBasedMapping [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.489 sec [junit] Running org.apache.hadoop.net.TestSocketIOWithTimeout [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 3.091 sec [junit] Running org.apache.hadoop.record.TestBuffer [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.069 sec [junit] Running org.apache.hadoop.record.TestRecordIO [junit] Tests run: 5, Failures: 0, Errors: 0, Time elapsed: 0.209 sec [junit] Running org.apache.hadoop.record.TestRecordVersioning [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.143 sec [junit] Running org.apache.hadoop.security.TestAccessControlList [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 0.047 sec [junit] Running org.apache.hadoop.security.TestAccessToken [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 1.029 sec [junit] Running org.apache.hadoop.security.TestUnixUserGroupInformation [junit] Tests run: 4, Failures: 0, Errors: 0, Time elapsed: 0.534 sec [junit] Running org.apache.hadoop.security.authorize.TestConfiguredPolicy [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.401 sec [junit] Running org.apache.hadoop.util.TestCyclicIteration [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 0.049 sec [junit] Running org.apache.hadoop.util.TestGenericsUtil [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.247 sec [junit] Running org.apache.hadoop.util.TestIndexedSort [junit] Tests run: 2, Failures: 0, Errors: 0, Time elapsed: 2.097 sec [junit] Running org.apache.hadoop.util.TestProcfsBasedProcessTree [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 7.739 sec [junit] Running org.apache.hadoop.util.TestPureJavaCrc32 [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 1.372 sec [junit] Running org.apache.hadoop.util.TestShell [junit] Tests run: 3, Failures: 0, Errors: 0, Time elapsed: 4.265 sec [junit] Running org.apache.hadoop.util.TestStringUtils [junit] Tests run: 6, Failures: 0, Errors: 0, Time elapsed: 0.071 sec
        Hide
        Hairong Kuang added a comment -

        +1. The patch and the performance look good.

        Show
        Hairong Kuang added a comment - +1. The patch and the performance look good.
        Hide
        Konstantin Boudnik added a comment -

        I love the fact that it has a performance test now - it seems like a yet another brick to the performance wall being build!

        A couple of quick comments:

        • Hadoop is using JUnit4.5 right now. However, I'm seeing new tests coming in using old JUnit3 notations. E.g. TestCase extensions, junit.framework.* etc. Shall we start using org.junit.* instead and use
          @Test
          

          annotation instead of direct inheritance from TestCase superclass? Annotations should let us use arbitrary test names as well.

        • for future use I'd suggest to put additional tags to the JavaDoc of performanceTestDeleteSameFile. Say
          + *  PerformanceTestSuite
          

          so later on we'd be able to find all such tests at once to include them into performance suite whenever it will be created?

        Show
        Konstantin Boudnik added a comment - I love the fact that it has a performance test now - it seems like a yet another brick to the performance wall being build! A couple of quick comments: Hadoop is using JUnit4.5 right now. However, I'm seeing new tests coming in using old JUnit3 notations. E.g. TestCase extensions, junit.framework.* etc. Shall we start using org.junit.* instead and use @Test annotation instead of direct inheritance from TestCase superclass? Annotations should let us use arbitrary test names as well. for future use I'd suggest to put additional tags to the JavaDoc of performanceTestDeleteSameFile. Say + * PerformanceTestSuite so later on we'd be able to find all such tests at once to include them into performance suite whenever it will be created?
        Hide
        Boris Shkolnik added a comment -

        I've noticed there is no unit test for the case of deletion of the same file multiple time. So I added one.

        I also added the performance test (the one used in the previous comment) if someone wants to run it in the future. It is not a part of unit test. It can only be run as a separate test (I've added main() method for this purpose).

        Show
        Boris Shkolnik added a comment - I've noticed there is no unit test for the case of deletion of the same file multiple time. So I added one. I also added the performance test (the one used in the previous comment) if someone wants to run it in the future. It is not a part of unit test. It can only be run as a separate test (I've added main() method for this purpose).
        Hide
        Boris Shkolnik added a comment -

        I added currTimeMillis() to the file name.
        Since the jira is defined as a performance problems I put here some numbers before and after the change.
        Deleting file with the same name 1000 times. Printing every 100 time. Comparing with first deletion:

        old code:
        iteration=100;res =0; start=1249064570781; iterTime = 30 vs. firstTime=26
        iteration=200;res =0; start=1249064576620; iterTime = 35 vs. firstTime=26
        iteration=300;res =0; start=1249064583456; iterTime = 42 vs. firstTime=26
        iteration=400;res =0; start=1249064590637; iterTime = 50 vs. firstTime=26
        iteration=500;res =0; start=1249064598285; iterTime = 56 vs. firstTime=26
        iteration=600;res =0; start=1249064606626; iterTime = 63 vs. firstTime=26
        iteration=700;res =0; start=1249064615643; iterTime = 71 vs. firstTime=26
        iteration=800;res =0; start=1249064625248; iterTime = 76 vs. firstTime=26
        iteration=900;res =0; start=1249064635654; iterTime = 83 vs. firstTime=26

        new code:
        iteration=100;res =0; start=1249066273893; iterTime = 22 vs. firstTime=22
        iteration=200;res =0; start=1249066278729; iterTime = 22 vs. firstTime=22
        iteration=300;res =0; start=1249066283591; iterTime = 22 vs. firstTime=22
        iteration=400;res =0; start=1249066288416; iterTime = 24 vs. firstTime=22
        iteration=500;res =0; start=1249066293223; iterTime = 27 vs. firstTime=22
        iteration=600;res =0; start=1249066298018; iterTime = 30 vs. firstTime=22
        iteration=700;res =0; start=1249066302824; iterTime = 25 vs. firstTime=22
        iteration=800;res =0; start=1249066307593; iterTime = 23 vs. firstTime=22
        iteration=900;res =0; start=1249066312833; iterTime = 23 vs. firstTime=22

        Show
        Boris Shkolnik added a comment - I added currTimeMillis() to the file name. Since the jira is defined as a performance problems I put here some numbers before and after the change. Deleting file with the same name 1000 times. Printing every 100 time. Comparing with first deletion: old code: iteration=100;res =0; start=1249064570781; iterTime = 30 vs. firstTime=26 iteration=200;res =0; start=1249064576620; iterTime = 35 vs. firstTime=26 iteration=300;res =0; start=1249064583456; iterTime = 42 vs. firstTime=26 iteration=400;res =0; start=1249064590637; iterTime = 50 vs. firstTime=26 iteration=500;res =0; start=1249064598285; iterTime = 56 vs. firstTime=26 iteration=600;res =0; start=1249064606626; iterTime = 63 vs. firstTime=26 iteration=700;res =0; start=1249064615643; iterTime = 71 vs. firstTime=26 iteration=800;res =0; start=1249064625248; iterTime = 76 vs. firstTime=26 iteration=900;res =0; start=1249064635654; iterTime = 83 vs. firstTime=26 new code: iteration=100;res =0; start=1249066273893; iterTime = 22 vs. firstTime=22 iteration=200;res =0; start=1249066278729; iterTime = 22 vs. firstTime=22 iteration=300;res =0; start=1249066283591; iterTime = 22 vs. firstTime=22 iteration=400;res =0; start=1249066288416; iterTime = 24 vs. firstTime=22 iteration=500;res =0; start=1249066293223; iterTime = 27 vs. firstTime=22 iteration=600;res =0; start=1249066298018; iterTime = 30 vs. firstTime=22 iteration=700;res =0; start=1249066302824; iterTime = 25 vs. firstTime=22 iteration=800;res =0; start=1249066307593; iterTime = 23 vs. firstTime=22 iteration=900;res =0; start=1249066312833; iterTime = 23 vs. firstTime=22
        Hide
        He Yongqiang added a comment -

        Then what about adding the file's create time?

        Show
        He Yongqiang added a comment - Then what about adding the file's create time?
        Hide
        Koji Noguchi added a comment -

        Maybe we can also add the file's path info as well as the timestamp when it is moved to trash.

        HADOOP-4970 from 0.20.

        I sometimes create a file named .Trash in my hdfs home directory for disabling trash in my account.

        Never thought of that. Interesting.

        Show
        Koji Noguchi added a comment - Maybe we can also add the file's path info as well as the timestamp when it is moved to trash. HADOOP-4970 from 0.20. I sometimes create a file named .Trash in my hdfs home directory for disabling trash in my account. Never thought of that. Interesting.
        Hide
        Tsz Wo Nicholas Sze added a comment -

        > For me, the contract is we try to move the files to Trash but if that fails, we simply delete them.

        I agree. I sometimes create a file named .Trash in my hdfs home directory for disabling trash in my account.

        Show
        Tsz Wo Nicholas Sze added a comment - > For me, the contract is we try to move the files to Trash but if that fails, we simply delete them. I agree. I sometimes create a file named .Trash in my hdfs home directory for disabling trash in my account.
        Hide
        He Yongqiang added a comment -

        Maybe we can also add the file's path info as well as the timestamp when it is moved to trash.

        Show
        He Yongqiang added a comment - Maybe we can also add the file's path info as well as the timestamp when it is moved to trash.
        Hide
        dhruba borthakur added a comment -

        > In short, I want the Trash feature to stay as simple as it is now without involving the Namenode much.

        Ok, sounds good to me.

        Show
        dhruba borthakur added a comment - > In short, I want the Trash feature to stay as simple as it is now without involving the Namenode much. Ok, sounds good to me.
        Hide
        Koji Noguchi added a comment -

        At present, it is very easy to find out the last version of the file in Trash.

        If all the nodes in the cluster are in the same timezone, timestamp would (almost) serve this purpose?

        Another option would be to make the Trash client retrieve the contents of the Trash directory and then scan what files pre-exist in the list.

        listStatus is one of the most expensive call to Namenode right now.
        I really want to avoid an extra overhead to the namenode with this common command.

        Also, this code is not fool-proof because there is no atomicity between the exists and the rename.

        True. But I haven't seen this become a problem yet.
        For me, the contract is we try to move the files to Trash but if that fails, we simply delete them.
        We completely delete the files if rename fails twice in a row anyway.

        In short, I want the Trash feature to stay as simple as it is now without involving the Namenode much.

        Show
        Koji Noguchi added a comment - At present, it is very easy to find out the last version of the file in Trash. If all the nodes in the cluster are in the same timezone, timestamp would (almost) serve this purpose? Another option would be to make the Trash client retrieve the contents of the Trash directory and then scan what files pre-exist in the list. listStatus is one of the most expensive call to Namenode right now. I really want to avoid an extra overhead to the namenode with this common command. Also, this code is not fool-proof because there is no atomicity between the exists and the rename. True. But I haven't seen this become a problem yet. For me, the contract is we try to move the files to Trash but if that fails, we simply delete them. We completely delete the files if rename fails twice in a row anyway. In short, I want the Trash feature to stay as simple as it is now without involving the Namenode much.
        Hide
        dhruba borthakur added a comment -

        The proposal of using the timestamp instead of a monotonically increasing number solves the issue. However, it is not very user-friendly. What do other filesystems that have a Trash feature do in this case... the only other FS that I know of implements this feature uses the monotonically increasing number instead of appending the timestamp. At present, it is very easy to find out the last version of the file in Trash.

        Another option would be to make the Trash client retrieve the contents of the Trash directory and then scan what files pre-exist in the list. Also, this code is not fool-proof because there is no atomicity between the exists and the rename.

        Show
        dhruba borthakur added a comment - The proposal of using the timestamp instead of a monotonically increasing number solves the issue. However, it is not very user-friendly. What do other filesystems that have a Trash feature do in this case... the only other FS that I know of implements this feature uses the monotonically increasing number instead of appending the timestamp. At present, it is very easy to find out the last version of the file in Trash. Another option would be to make the Trash client retrieve the contents of the Trash directory and then scan what files pre-exist in the list. Also, this code is not fool-proof because there is no atomicity between the exists and the rename.
        Hide
        Doug Cutting added a comment -

        > This solution assumes that all client machines have synchronized clocks.

        We don't require that these are strictly ascending, just that they're unique. There's very little chance that two nodes, whether they have synchronized clocks or not, will collide to the millisecond, and, if they do, one will have to try twice. But the chances of ever having to try more than twice are even smaller.

        Show
        Doug Cutting added a comment - > This solution assumes that all client machines have synchronized clocks. We don't require that these are strictly ascending, just that they're unique. There's very little chance that two nodes, whether they have synchronized clocks or not, will collide to the millisecond, and, if they do, one will have to try twice. But the chances of ever having to try more than twice are even smaller.
        Hide
        Hairong Kuang added a comment -

        Wow! After Koji figured out the cause of the problem, he talked about the same timestamp solution as what Doug suggested. This reminded me of a Chinese saying: all heroes have the same great idea!

        One minor suggestion... This solution assumes that all client machines have synchronized clocks. We could eliminate possible conflicts by appending client's machine name as well.

        Show
        Hairong Kuang added a comment - Wow! After Koji figured out the cause of the problem, he talked about the same timestamp solution as what Doug suggested. This reminded me of a Chinese saying: all heroes have the same great idea! One minor suggestion... This solution assumes that all client machines have synchronized clocks. We could eliminate possible conflicts by appending client's machine name as well.
        Hide
        Koji Noguchi added a comment -

        Thanks Doug. +1 for the change.

        Created HADOOP-5714 for a new metric (for tracking number of getFileInfo calls).

        Show
        Koji Noguchi added a comment - Thanks Doug. +1 for the change. Created HADOOP-5714 for a new metric (for tracking number of getFileInfo calls).
        Hide
        Doug Cutting added a comment -

        Maybe this should be changed to something like:

        while (fs.exists(trashPath)) {
          trashPath = new Path(orig + "." + System.currentTimeMillis());
        }
        
        Show
        Doug Cutting added a comment - Maybe this should be changed to something like: while (fs.exists(trashPath)) { trashPath = new Path(orig + "." + System .currentTimeMillis()); }
        Hide
        Koji Noguchi added a comment -

        From 0.20, HADOOP-4970 changed to preserve the full path for Trash and this should reduce the chances of name
        clashing. However, we still have this problem if a user decides to repeat create&delete on some temp files.

        Show
        Koji Noguchi added a comment - From 0.20, HADOOP-4970 changed to preserve the full path for Trash and this should reduce the chances of name clashing. However, we still have this problem if a user decides to repeat create&delete on some temp files.
        Hide
        Koji Noguchi added a comment -

        Looking at the Trash.java,

        118         String orig = trashPath.toString();
        119         for (int j = 1; fs.exists(trashPath); j++) {
        120           trashPath = new Path(orig + "." + j);
        121         }
        122         if (fs.rename(path, trashPath))           // move to current trash
        123           return true;
        

        it seems like fs.exists were called 506 times before the single rename in the original description.

        When user changed his script to simply call dfs -rmr $outdir, namenode came back to normal.

        Show
        Koji Noguchi added a comment - Looking at the Trash.java, 118 String orig = trashPath.toString(); 119 for (int j = 1; fs.exists(trashPath); j++) { 120 trashPath = new Path(orig + "." + j); 121 } 122 if (fs.rename(path, trashPath)) // move to current trash 123 return true; it seems like fs.exists were called 506 times before the single rename in the original description. When user changed his script to simply call dfs -rmr $outdir, namenode came back to normal.
        Hide
        Koji Noguchi added a comment -

        We observed Namenode was spending more time in GC than usual.
        No metrics stood out.

        Audit showed many rename operations like the following.

        2009-04-17 23:00:03,957 INFO org.apache.hadoop.fs.FSNamesystem.audit: ugi=user1,users,  
          ip=/99.111.222.111      cmd=rename
          src=/user/user1/agg_results/20090417/part-00489      dst=/user/user1/.Trash/Current/part-00489.507   
          perm=user1:users:rw-r--r--
        

        It turns out that this user had large number of small jobs and was calling

           hadoop dfs -rmr $outdir/\* 
        

        after each job.

        Show
        Koji Noguchi added a comment - We observed Namenode was spending more time in GC than usual. No metrics stood out. Audit showed many rename operations like the following. 2009-04-17 23:00:03,957 INFO org.apache.hadoop.fs.FSNamesystem.audit: ugi=user1,users, ip=/99.111.222.111 cmd=rename src=/user/user1/agg_results/20090417/part-00489 dst=/user/user1/.Trash/Current/part-00489.507 perm=user1:users:rw-r--r-- It turns out that this user had large number of small jobs and was calling hadoop dfs -rmr $outdir/\* after each job.

          People

          • Assignee:
            Boris Shkolnik
            Reporter:
            Koji Noguchi
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development