Bigtop / BIGTOP-895

A number of testcases in TestCLI are failing with (at least) Hadoop 2.0.3 and later

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.6.0
    • Component/s: Tests
    • Labels:
      None

      Description

      The list of failing tests is attached.

      1. failing-testCLI.txt
        6 kB
        Konstantin Boudnik
      2. BIGTOP-895.patch
        754 kB
        Anatoli Fomenko
      3. BIGTOP-895.patch
        581 kB
        Anatoli Fomenko
      4. 0001-BIGTOP-895.-A-number-of-testcases-in-TestCLI-are-fai.patch
        634 kB
        Anatoli Fomenko
      5. 0002-BIGTOP-895.-A-number-of-testcases-in-TestCLI-are-fai.patch
        2 kB
        Anatoli Fomenko

          Activity

          Roman Shaposhnik added a comment -

          The latest test runs show no signs of failures! Kudos, Anatoli! Closing the issue now.

          Anatoli Fomenko added a comment -

          Added a second patch (on top of the current master) that addresses the previous comment.

          Anatoli Fomenko added a comment -

          Thanks for the review. I will create /tmp/testcli with proper permissions. That will make the Bigtop TestCLI framework more well-rounded.

          Roman Shaposhnik added a comment -

          The test looks much better now, thanks! There's still one little issue, though: it seems that you need to create /tmp/testcli somewhere in the Before method. Otherwise the first batch of tests, which expect the root of /tmp/testcli to be there, fails.

          Anatoli Fomenko added a comment -

          A new version of the patch is attached, per the modified approach.

          Anatoli Fomenko added a comment -

          To make these tests work on a real cluster, the update approach has been modified:

          • The HDFS root permissions modification has been removed.
          • Hadoop's testHDFSConf.xml has been preprocessed:
            • added a TEST_DIR_ABSOLUTE variable to the test cases to replace the HDFS root, which worked well for a mini cluster but is inconvenient for a real cluster (this approach already existed in the current Bigtop testConf.xml)
            • added a USER_NAME variable to work with the user's home dir for relative path operations
            • removed the failing tests, leaving a total of 461
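
The variable substitution described above can be sketched in plain Java. This is only an illustration under stated assumptions: the class name PlaceholderExpander is hypothetical, the sample values (/tmp/testcli and the user "cos") are borrowed from elsewhere in this thread, and the code is not taken from the patch. It uses the literal String.replace rather than replaceAll, so the replacement text is never treated as a regular expression.

```java
import java.util.Arrays;

/** Sketch: expand TEST_DIR_ABSOLUTE and USER_NAME placeholders in CLI test arguments. */
class PlaceholderExpander {
    private final String testDirAbsolute; // assumed value, e.g. "/tmp/testcli"
    private final String userName;        // assumed value, e.g. the user running the tests

    PlaceholderExpander(String testDirAbsolute, String userName) {
        this.testDirAbsolute = testDirAbsolute;
        this.userName = userName;
    }

    /** Returns a copy of args with both placeholders replaced literally. */
    String[] expand(String[] args) {
        String[] out = new String[args.length];
        for (int i = 0; i < args.length; i++) {
            // String.replace is literal: no regex metacharacters are interpreted.
            out[i] = args[i]
                .replace("TEST_DIR_ABSOLUTE", testDirAbsolute)
                .replace("USER_NAME", userName);
        }
        return out;
    }

    public static void main(String[] argv) {
        PlaceholderExpander expander = new PlaceholderExpander("/tmp/testcli", "cos");
        String[] cmd = {"-mkdir", "-p", "TEST_DIR_ABSOLUTE/dir0", "/user/USER_NAME/dir1"};
        // prints [-mkdir, -p, /tmp/testcli/dir0, /user/cos/dir1]
        System.out.println(Arrays.toString(expander.expand(cmd)));
    }
}
```

Keeping the test root in one variable means the same test definitions can point at / on a mini cluster and at a writable directory on a real one.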
          Konstantin Boudnik added a comment -

          I see two issues in this patch:

          • in bigtop-tests/test-artifacts/hadoop/src/main/groovy/org/apache/bigtop/itest/hadoop/hdfs/FSCmdExecutor.java
            you still have
            args[i] = args[i].replaceAll("TEST_DIR_ABSOLUTE", TestCLI.TEST_DIR_ABSOLUTE);
            that breaks the compilation
          • the following tests are failing:
            13/04/23 19:49:36 INFO cli.CLITestHelper: 260: mkdir: Test recreate of existing directory with -p succeeds
            13/04/23 19:49:36 INFO cli.CLITestHelper: 284: test: non existent file (absolute path)
            13/04/23 19:49:36 INFO cli.CLITestHelper: 285: test: non existent file (relative path)
            13/04/23 19:49:36 INFO cli.CLITestHelper: 286: test: non existent directory (absolute path)
            13/04/23 19:49:36 INFO cli.CLITestHelper: 287: test: non existent directory (relative path)
            13/04/23 19:49:36 INFO cli.CLITestHelper: 288: test: Test for hdfs:// path - non existent file
            13/04/23 19:49:36 INFO cli.CLITestHelper: 289: test: Test for hdfs:// path - non existent directory
            13/04/23 19:49:36 INFO cli.CLITestHelper: 290: test: Test for Namenode's path - non existent file
            13/04/23 19:49:36 INFO cli.CLITestHelper: 291: test: Test for Namenode's path - non existent directory
            

          the 1st:

          13/04/23 19:49:36 INFO cli.CLITestHelper:                     Test ID: [260]
          13/04/23 19:49:36 INFO cli.CLITestHelper:            Test Description: [mkdir: Test recreate of existing directory with -p succeeds]
          13/04/23 19:49:36 INFO cli.CLITestHelper: 
          13/04/23 19:49:36 INFO cli.CLITestHelper:               Test Commands: [-fs hdfs://localhost:8020 -rm -r -f dir0]
          13/04/23 19:49:36 INFO cli.CLITestHelper:               Test Commands: [-fs hdfs://localhost:8020 -mkdir -p dir0/dir1]
          13/04/23 19:49:36 INFO cli.CLITestHelper:               Test Commands: [-fs hdfs://localhost:8020 -mkdir -p dir0/dir1]
          13/04/23 19:49:36 INFO cli.CLITestHelper: 
          13/04/23 19:49:36 INFO cli.CLITestHelper:            Cleanup Commands: [-fs hdfs://localhost:8020 -rm -r dir0]
          13/04/23 19:49:36 INFO cli.CLITestHelper: 
          13/04/23 19:49:36 INFO cli.CLITestHelper:                  Comparator: [ExactComparator]
          13/04/23 19:49:36 INFO cli.CLITestHelper:          Comparision result:   [fail]
          13/04/23 19:49:36 INFO cli.CLITestHelper:             Expected output:   []
          13/04/23 19:49:36 INFO cli.CLITestHelper:               Actual output:   [13/04/23 19:49:04 TRACE ipc.ProtobufRpcEngine: 1: Call -> null@localhost/127.0.0.1:8020: getFileInfo {src: "/user/cos/dir0/dir1"}
          13/04/23 19:49:04 DEBUG ipc.Client: IPC Client (164014493) connection to localhost/127.0.0.1:8020 from cos sending #5045
          13/04/23 19:49:04 DEBUG ipc.Client: IPC Client (164014493) connection to localhost/127.0.0.1:8020 from cos got value #5045
          13/04/23 19:49:04 DEBUG ipc.ProtobufRpcEngine: Call: getFileInfo took 0ms
          13/04/23 19:49:04 TRACE ipc.ProtobufRpcEngine: 1: Response <- null@localhost/127.0.0.1:8020: getFileInfo {fs { fileType: IS_DIR path: "" length: 0 permission { perm: 493 } owner: "cos" group: "supergroup" modification_time: 1366771744038 access_time: 0 block_replication: 0 blocksize: 0 }}
          

          That might be the cause of the rest of the failures.

          Anatoli Fomenko added a comment -

          Adding updated patch:

          • testHDFSConf.xml copied from hadoop
          • TestCLI.java updated to accommodate new test file
          • after removing the 20% of testHDFSConf.xml tests that were failing, 100% of the tests pass
          Anatoli Fomenko added a comment -

          After a closer look and some updates to TestCLI.java, including changing the HDFS / permissions to "777", 80% of the tests pass.

          The remaining 20% of the tests fail for the following reasons:

          • Commands such as
            -fs hdfs://localhost:8020 -count -q /dir1

            provide different output than expected:

            [mini cluster] Expected output:   [( |\t)*10( |\t)*9( |\t)*1048576( |\t)*1048576( |\t)*1( |\t)*0( |\t)*0 /dir1]
            [Bigtop cluster] Actual output:   [        none             inf            none             inf            1            0                  0 /dir1 
            
          • Commands such as
            -fs hdfs://localhost:8020 -chown newowner:newgroup hdfs:///file1

            and

            -fs hdfs://localhost:8020 -chgrp newgroup /file1

            fail due to absence of the expected user and group.

          • Commands such as "help for dfsadmin report" fail for a reason that is not quite clear, perhaps because TestCLI extends CLITestHelper rather than CLITestHelperDFS, or perhaps because of superuser requirements.
          • The cause of the failure of commands such as
            hdfs dfs -fs hdfs://localhost:8020 -moveFromLocal file:...

            is unclear.
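
The -count -q mismatch listed above can be reproduced with plain java.util.regex. The sketch below assumes the comparator does a find-style match of the expected pattern against an output line; the class name and the quota-bearing sample line are made up for illustration. The none/inf columns are what -count -q prints when no quota is set on the directory, so an expected pattern that hard-codes quota values cannot match that output.

```java
import java.util.regex.Pattern;

/** Sketch: why the mini-cluster pattern rejects the Bigtop cluster's -count -q output. */
class CountQuotaRegexDemo {
    // Expected pattern as quoted in the comment above.
    static final String EXPECTED =
        "( |\\t)*10( |\\t)*9( |\\t)*1048576( |\\t)*1048576( |\\t)*1( |\\t)*0( |\\t)*0 /dir1";

    /** True if the pattern is found anywhere in the given output line. */
    static boolean matches(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        // Hypothetical line with quotas set, shaped like the expected pattern.
        String withQuota = "          10           9         1048576         1048576"
            + "            1            0                  0 /dir1";
        // Line reported from the Bigtop cluster in the comment above.
        String noQuota = "        none             inf            none             inf"
            + "            1            0                  0 /dir1";
        System.out.println(matches(EXPECTED, withQuota)); // prints true
        System.out.println(matches(EXPECTED, noQuota));   // prints false
    }
}
```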

          Anatoli Fomenko added a comment -

          Roman,

          Thank you for your comments. Here is how they will be addressed as this patch progresses:
          1. The xml file will be taken from the hadoop test source file that will be used as a dependency.
          2. 755 will be used instead, with elevated privileges as referenced in 3.

          I'm planning to work on it over the weekend and at the beginning of next week, so a week until completion is a safe bet.

          Roman Shaposhnik added a comment -

          Great progress! There are still a few things I don't fully understand:

          1. why do we need a copy of the xml file if we're not changing it? Can't we just use the one that comes in hadoop-hdfs test jar?
          2. FileUtil.chmod("/", "754") – I'm not sure 754 (which would leave the 'other' class unable to cd into the directory) is the right setting
          3. you can't really call FileUtil.chmod without elevating your privileges. Perhaps you may want to do what some of the other HDFS tests are doing. E.g. https://git-wip-us.apache.org/repos/asf?p=bigtop.git;a=blob;f=bigtop-tests/test-artifacts/hadoop/src/main/groovy/org/apache/bigtop/itest/hadoop/hdfs/TestFsck.groovy;h=62efd7c9964675cdc04b120834191947d3c79f2e;hb=HEAD#l29

          Finally, what are the odds we can get closure on this within a week? I'd like to see what's holding up 0.6.0 at this point.
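
To make the point about 754 concrete, the following sketch decodes octal modes into the familiar rwx form and checks the execute (traverse) bit for the "other" class. The class and method names are hypothetical and the code only does bit arithmetic; it does not call any Hadoop API.

```java
/** Sketch: decode octal permission modes and check directory traversal for "other". */
class ModeBits {
    /** Renders a 9-bit POSIX-style mode (e.g. 0754) as a string such as "rwxr-xr--". */
    static String toSymbolic(int mode) {
        char[] flags = {'r', 'w', 'x'};
        StringBuilder sb = new StringBuilder(9);
        for (int bit = 8; bit >= 0; bit--) {
            boolean set = (mode & (1 << bit)) != 0;
            sb.append(set ? flags[(8 - bit) % 3] : '-');
        }
        return sb.toString();
    }

    /** True if the "other" class has the execute bit, which on a directory permits cd. */
    static boolean otherCanTraverse(int mode) {
        return (mode & 0001) != 0;
    }

    public static void main(String[] args) {
        for (String octal : new String[] {"754", "755", "777"}) {
            int mode = Integer.parseInt(octal, 8);
            System.out.println(octal + " " + toSymbolic(mode)
                + " other can traverse: " + otherCanTraverse(mode));
        }
    }
}
```

With 754 the last triplet is r--, so users outside the owner and group can list the directory but not enter it; 755 and 777 both keep the execute bit for "other".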

          Anatoli Fomenko added a comment -

          Attached for review is a preliminary patch that replaces testConf.xml (~140 tests) with the current Hadoop testHDFSConf.xml (~600 tests). ~590 tests pass.
          Additional work and testing are in progress.

          Anatoli Fomenko added a comment - edited

          In the course of the investigation, it seemed appropriate to replace the current testConf.xml (~140 tests) with the new Hadoop smoke tests in testHDFSConf.xml (~600 tests).

          It has been found that quite a few tests in testHDFSConf.xml, while passing on the hadoop-common smoke infrastructure using a mini-cluster, fail on a default Bigtop cluster. There are a few unrelated reasons for that:

          1. The regexps for files in RegexpComparator in testHDFSConf.xml have a hard-coded replication factor of 1, which is fine for a mini-cluster but fails on a real cluster with a default replication factor of 3. This issue should be resolved in HADOOP-9464.
          2. As opposed to the Bigtop testConf.xml, where the default "root" working HDFS directory is set to /tmp/testcli, which is widely accessible, the testHDFSConf.xml default HDFS directory is set to /. Since using a different directory would require prepending the "root" prefix in each test case, a better solution would be to set the HDFS / permissions on a Bigtop cluster to 777.
          3. Output from test commands such as
            -count -q /dir1

            is entirely different on the mini-cluster and the Bigtop cluster.

          4. Some tests require superuser privileges, which needs to be addressed per test case.

          These issues are planned to be resolved. Also, reuse of the Hadoop testHDFSConf.xml, along with the majority of the test helper infrastructure, will be achieved by making the TestCLI class extend CLITestHelperDFS instead of the current CLITestHelper.
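
The replication factor problem from item 1 can be illustrated with a small regex sketch. Everything here is an assumption for illustration: the pattern only imitates the style of an -ls check, the sample output lines are invented, and the relaxed variant is just one possible fix, not necessarily what HADOOP-9464 does.

```java
import java.util.regex.Pattern;

/** Sketch: an -ls style pattern with a fixed vs. relaxed replication column. */
class ReplicationRegexDemo {
    /** Builds a hypothetical -ls pattern; `replication` is the regex for that column. */
    static String lsPattern(String replication) {
        return "^-rw-r--r--( )*" + replication + "( )*[a-z]*( )*supergroup( )*0( )*"
            + "[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}( )*/file1$";
    }

    /** True if the pattern is found in the given output line. */
    static boolean matches(String regex, String line) {
        return Pattern.compile(regex).matcher(line).find();
    }

    public static void main(String[] args) {
        // Invented sample lines: replication 1 (mini-cluster) and 3 (real cluster default).
        String rep1 = "-rw-r--r--   1 cos supergroup          0 2013-04-23 19:49 /file1";
        String rep3 = "-rw-r--r--   3 cos supergroup          0 2013-04-23 19:49 /file1";

        System.out.println(matches(lsPattern("1"), rep1));       // prints true
        System.out.println(matches(lsPattern("1"), rep3));       // prints false
        System.out.println(matches(lsPattern("[0-9]+"), rep3));  // prints true
    }
}
```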


            People

            • Assignee: Anatoli Fomenko
            • Reporter: Konstantin Boudnik
            • Votes: 0
            • Watchers: 4
