Mahout
  1. Mahout
  2. MAHOUT-1142

TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI is broken with Hadoop 0.23.5

    Details

    • Type: Bug Bug
    • Status: Closed
    • Priority: Minor Minor
    • Resolution: Fixed
    • Affects Version/s: 0.8
    • Fix Version/s: 0.8
    • Component/s: None
    • Labels:
      None

      Description

      Stacktrace

      java.lang.AssertionError: number of clean eigenvectors expected:<3> but was:<0>
      at org.junit.Assert.fail(Assert.java:91)
      at org.junit.Assert.failNotEquals(Assert.java:645)
      at org.junit.Assert.assertEquals(Assert.java:126)
      at org.junit.Assert.assertEquals(Assert.java:470)
      at org.apache.mahout.math.hadoop.decomposer.TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI(TestDistributedLanczosSolverCLI.java:143)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:44)
      at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
      at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:41)
      at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
      at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
      at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:31)
      at org.junit.runners.BlockJUnit4ClassRunner.runNotIgnored(BlockJUnit4ClassRunner.java:79)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:71)
      at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:49)
      at org.junit.runners.ParentRunner$3.run(ParentRunner.java:193)
      at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:52)
      at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:191)
      at org.junit.runners.ParentRunner.access$000(ParentRunner.java:42)
      at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:184)
      at org.junit.runners.ParentRunner.run(ParentRunner.java:236)
      at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:236)
      at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:134)
      at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:113)
      at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
      at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
      at java.lang.reflect.Method.invoke(Method.java:597)
      at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
      at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
      at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
      at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:103)
      at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:74)

      1. MAHOUT-1142-3.patch
        1 kB
        Mike Percy
      2. MAHOUT-1142-2.patch
        1 kB
        Mike Percy
      3. mahout-1142.patch
        1 kB
        Jimmy Xiang

        Issue Links

          Activity

          Hide
          Jimmy Xiang added a comment -

          This test is broken due to HADOOP-8906.

          The issue is that if the path pattern is something like "/parent_path/file/*", the original FileSystem#globStatus will return the status of the file "/parent_path/file". The new one (HADOOP-8906) returns nothing since "file" is a file, not a folder.

          Show
          Jimmy Xiang added a comment - This test is broken due to HADOOP-8906 . The issue is that if the path pattern is something like "/parent_path/file/*", the original FileSystem#globStatus will return the status of the file "/parent_path/file". The new one ( HADOOP-8906 ) returns nothing since "file" is a file, not a folder.
          Hide
          Sean Owen added a comment -

          0.23.x / 2.x is not supported, and this change doesn't compile against 1.x. That is probably fixable though.

          I read the issue. You're saying that globbing "foo/*" would return the status of "foo" now, but it doesn't. If it was 'fixed' to return nothing, that would also seem to not work, and isn't correct either. I assume the Hadoop issue did fix globbing problems, but they seem to be related to multiple wildcards.

          I am not sure this is the issue, if there is one? it works vs the stable supported version of Hadoop, so I'm going to close, but if we can identify a backwards-compatible change that makes it work in both cases that's cool.

          Show
          Sean Owen added a comment - 0.23.x / 2.x is not supported, and this change doesn't compile against 1.x. That is probably fixable though. I read the issue. You're saying that globbing "foo/*" would return the status of "foo" now, but it doesn't. If it was 'fixed' to return nothing, that would also seem to not work, and isn't correct either. I assume the Hadoop issue did fix globbing problems, but they seem to be related to multiple wildcards. I am not sure this is the issue, if there is one? it works vs the stable supported version of Hadoop, so I'm going to close, but if we can identify a backwards-compatible change that makes it work in both cases that's cool.
          Hide
          Mike Percy added a comment -

          Updating Jimmy's patch to update the file offsets against trunk. I don't have perms to reopen this JIRA, but the updated patch compiles for me against Hadoop 1.0.4.

          Both of these work for me after applying Jimmy's patch:

          (1) mvn clean install -Dtest=TestDistributedLanczosSolverCLI,TestDistributedRowMatrix -DfailIfNoTests=false

          (2) mvn clean install -Dhadoop.version=0.23.5 -Dtest=TestDistributedLanczosSolverCLI,TestDistributedRowMatrix -DfailIfNoTests=false

          Before applying the patch, only (1) succeeds. After applying the patch, both (1) and (2) succeed.

          Show
          Mike Percy added a comment - Updating Jimmy's patch to update the file offsets against trunk. I don't have perms to reopen this JIRA, but the updated patch compiles for me against Hadoop 1.0.4. Both of these work for me after applying Jimmy's patch: (1) mvn clean install -Dtest=TestDistributedLanczosSolverCLI,TestDistributedRowMatrix -DfailIfNoTests=false (2) mvn clean install -Dhadoop.version=0.23.5 -Dtest=TestDistributedLanczosSolverCLI,TestDistributedRowMatrix -DfailIfNoTests=false Before applying the patch, only (1) succeeds. After applying the patch, both (1) and (2) succeed.
          Hide
          Mike Percy added a comment -

          It's worth noting that HADOOP-8906 is designed to make "hadoop fs -ls blah/*" return nothing if "blah" is a file, which is consistent with Linux/Unix systems. If "blah" is a directory, it will still list the contents. This patch just differentiates between the two and tries to maintain backwards compatibility.

          Show
          Mike Percy added a comment - It's worth noting that HADOOP-8906 is designed to make "hadoop fs -ls blah/*" return nothing if "blah" is a file, which is consistent with Linux/Unix systems. If "blah" is a directory, it will still list the contents. This patch just differentiates between the two and tries to maintain backwards compatibility.
          Hide
          Jimmy Xiang added a comment -

          Reopen it, which is a valid issue.

          Show
          Jimmy Xiang added a comment - Reopen it, which is a valid issue.
          Hide
          Sean Owen added a comment -

          This is the same patch. isDirectory() is deprecated and its replacement throws IOException. That's mostly a detail but I do think it's worth noting that #1 the priority is being compatible with the most recent stable release.

          The thing is that the test is already passing a directory, so I don't see how this would change anything. Either that's wrong, and the test needs to be fixed... or it's not, and the problem is elsewhere. Either way this does not seem like the patch to apply (even overlooking the compile/deprecation problem).

          Can anyone explain the above, and/or provide a patch suitable for 1.x?

          Show
          Sean Owen added a comment - This is the same patch. isDirectory() is deprecated and its replacement throws IOException. That's mostly a detail but I do think it's worth noting that #1 the priority is being compatible with the most recent stable release. The thing is that the test is already passing a directory, so I don't see how this would change anything. Either that's wrong, and the test needs to be fixed... or it's not, and the problem is elsewhere. Either way this does not seem like the patch to apply (even overlooking the compile/deprecation problem). Can anyone explain the above, and/or provide a patch suitable for 1.x?
          Hide
          Jimmy Xiang added a comment -

          The test passes a directory. Inside the core code, it passes directories as well in many places. However, at one time, it passes a file. It is easy to find out by running the same test from eclipse with a break point in the place this patch touched.

          Show
          Jimmy Xiang added a comment - The test passes a directory. Inside the core code, it passes directories as well in many places. However, at one time, it passes a file. It is easy to find out by running the same test from eclipse with a break point in the place this patch touched.
          Hide
          Mike Percy added a comment -

          Here are some examples.

          Passes a directory:

          • TestDistributedRowMatrix.testTranspose()
          • TestDistributedRowMatrix.testMatrixTimesMatrix()

          Passes a file:

          • TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI()

          I'll attach a patch that uses the less-deprecated API, FileSystem.get(conf).getFileStatus(rowPath).isDir(). Sadly, in 0.23.x and 2.0.x they have deprecated isDir() in favor of isDirectory() - even though only isDir() is available in Hadoop 1.x.x. So there may still be deprecation warnings when compiled against 0.23/2.0 versions but that seems to be the right way to handle this case.

          Show
          Mike Percy added a comment - Here are some examples. Passes a directory: TestDistributedRowMatrix.testTranspose() TestDistributedRowMatrix.testMatrixTimesMatrix() Passes a file: TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI() I'll attach a patch that uses the less-deprecated API, FileSystem.get(conf).getFileStatus(rowPath).isDir(). Sadly, in 0.23.x and 2.0.x they have deprecated isDir() in favor of isDirectory() - even though only isDir() is available in Hadoop 1.x.x. So there may still be deprecation warnings when compiled against 0.23/2.0 versions but that seems to be the right way to handle this case.
          Hide
          Mike Percy added a comment -

          Updated patch using getFileStatus().isDir() API.

          Show
          Mike Percy added a comment - Updated patch using getFileStatus().isDir() API.
          Hide
          Sean Owen added a comment -

          I don't see where it passes a file – "distMatrix" is a directory, or should be and appears to be. This would be the better fix, since it is supposed to consume a directory. It seems like it does though.

          I suspect, though I don't know, that this is not the problem. You can get a "0" in other cases, like when the RNG seeds differently, in my past experience. It could be a red herring, but it could also explain why shuffling up the dependencies and such changes things slightly to make this fail.

          Show
          Sean Owen added a comment - I don't see where it passes a file – "distMatrix" is a directory, or should be and appears to be. This would be the better fix, since it is supposed to consume a directory. It seems like it does though. I suspect, though I don't know, that this is not the problem. You can get a "0" in other cases, like when the RNG seeds differently, in my past experience. It could be a red herring, but it could also explain why shuffling up the dependencies and such changes things slightly to make this fail.
          Hide
          Mike Percy added a comment -

          I hacked up the file to get some debug info and below is what I got running under the default (Hadoop 1.0.0) profile. The test is not passing a directory in the below stacktrace, it's a file.

          rowPath is a file. Path = file:/tmp/mahout-TestDistributedLanczosSolverCLI-7708908523833467904/output/rawEigenvectors
          java.lang.Exception: Stack trace for debug
            at org.apache.mahout.math.hadoop.DistributedRowMatrix.iterateAll(DistributedRowMatrix.java:146)
            at org.apache.mahout.math.hadoop.DistributedRowMatrix.iterator(DistributedRowMatrix.java:292)
            at org.apache.mahout.math.hadoop.decomposer.EigenVerificationJob.verifyEigens(EigenVerificationJob.java:248)
            at org.apache.mahout.math.hadoop.decomposer.EigenVerificationJob.run(EigenVerificationJob.java:159)
            at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:159)
            at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:112)
            at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver$DistributedLanczosSolverJob.run(DistributedLanczosSolver.java:284)
            at org.apache.mahout.math.hadoop.decomposer.TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI(TestDistributedLanczosSolverCLI.java:109)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
            at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
            at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
            at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
            at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28)
            at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30)
            at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263)
            at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68)
            at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47)
            at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231)
            at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60)
            at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229)
            at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50)
            at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222)
            at org.junit.runners.ParentRunner.run(ParentRunner.java:300)
            at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252)
            at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141)
            at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
            at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
            at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
            at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
            at java.lang.reflect.Method.invoke(Method.java:597)
            at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
            at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
            at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
            at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
            at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)
          

          Sean, are you saying this should not be supported? I don't see why it would be problematic. However, an alternative could be to check for isDir() in the DistributedRowMatrix constructor and throw an IllegalArgumentException if a non-directory path is detected. We would also need to clarify the semantics in the javadocs. It seems a bit strong but at least it's consistent. Thoughts?

          Show
          Mike Percy added a comment - I hacked up the file to get some debug info and below is what I got running under the default (Hadoop 1.0.0) profile. The test is not passing a directory in the below stacktrace, it's a file. rowPath is a file. Path = file:/tmp/mahout-TestDistributedLanczosSolverCLI-7708908523833467904/output/rawEigenvectors java.lang.Exception: Stack trace for debug at org.apache.mahout.math.hadoop.DistributedRowMatrix.iterateAll(DistributedRowMatrix.java:146) at org.apache.mahout.math.hadoop.DistributedRowMatrix.iterator(DistributedRowMatrix.java:292) at org.apache.mahout.math.hadoop.decomposer.EigenVerificationJob.verifyEigens(EigenVerificationJob.java:248) at org.apache.mahout.math.hadoop.decomposer.EigenVerificationJob.run(EigenVerificationJob.java:159) at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:159) at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.run(DistributedLanczosSolver.java:112) at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver$DistributedLanczosSolverJob.run(DistributedLanczosSolver.java:284) at org.apache.mahout.math.hadoop.decomposer.TestDistributedLanczosSolverCLI.testDistributedLanczosSolverEVJCLI(TestDistributedLanczosSolverCLI.java:109) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:28) at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:30) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:252) at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:141) at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:597) at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189) at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165) at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85) at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115) at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75) Sean, are you saying this should not be supported? I don't see why it would be problematic. However, an alternative could be to check for isDir() in the DistributedRowMatrix constructor and throw an IllegalArgumentException if a non-directory path is detected. We would also need to clarify the semantics in the javadocs. It seems a bit strong but at least it's consistent. Thoughts?
          Hide
          Sean Owen added a comment -

          OK, it's much farther down. It seems fine to support this, if it's not merely a test error, but seems actually like the desired behavior in the first place. This is a latent bug then – I don't know, didn't write this code. I agree with checking for a directory. It just has to be done in a way that works with 1.x; it's OK if that is later deprecated if it's not in 1.x. So yes the last patch you attached looks right.

          Show
          Sean Owen added a comment - OK, it's much farther down. It seems fine to support this, if it's not merely a test error, but seems actually like the desired behavior in the first place. This is a latent bug then – I don't know, didn't write this code. I agree with checking for a directory. It just has to be done in a way that works with 1.x; it's OK if that is later deprecated if it's not in 1.x. So yes the last patch you attached looks right.
          Hide
          Mike Percy added a comment -

          Great!

          Show
          Mike Percy added a comment - Great!
          Hide
          Hudson added a comment -

          Integrated in Mahout-Quality #1834 (See https://builds.apache.org/job/Mahout-Quality/1834/)
          MAHOUT-1142 support single file input (Revision 1436840)

          Result = SUCCESS
          srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436840
          Files :

          • /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java
          Show
          Hudson added a comment - Integrated in Mahout-Quality #1834 (See https://builds.apache.org/job/Mahout-Quality/1834/ ) MAHOUT-1142 support single file input (Revision 1436840) Result = SUCCESS srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1436840 Files : /mahout/trunk/core/src/main/java/org/apache/mahout/math/hadoop/DistributedRowMatrix.java
          Hide
          Mike Percy added a comment -

          Thanks for the commit Sean!

          Show
          Mike Percy added a comment - Thanks for the commit Sean!

            People

            • Assignee:
              Sean Owen
              Reporter:
              Jimmy Xiang
            • Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

              • Created:
                Updated:
                Resolved:

                Development