Uploaded image for project: 'Mahout'
  1. Mahout
  2. MAHOUT-780

job jars fail on OS X due to case-insensitive name conflict on 'license'



    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.5
    • Fix Version/s: 0.6
    • Component/s: build
    • Labels:
    • Environment:

      Mac OS X


      Dan explains it well below. The workaround is to make the 'license' folder into a 'licenses' folder, but, where does this come from? anyone know?

      With SVN 'At revision 1152597.', and freshly rebuilt:

      jar -tvf /Users/danbri/Documents/workspace/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar

      grep -i license

      19355 Sat Feb 26 19:16:30 CET 2011 META-INF/LICENSE.txt
      11358 Sun Apr 11 21:45:12 CEST 2010 META-INF/LICENSE
      1596 Mon Dec 20 15:47:30 CET 2010 LICENSE
      0 Sun Dec 01 11:57:24 CET 2002 license/
      4083 Sun Dec 01 11:57:24 CET 2002 license/LICENSE.dom-documentation.txt
      3595 Sun Dec 01 11:57:24 CET 2002 license/LICENSE.dom-software.txt
      804 Sun Dec 01 11:57:24 CET 2002 license/LICENSE.sax.txt
      2827 Sun Dec 01 11:57:24 CET 2002 license/LICENSE.txt
      1274 Sun Dec 01 11:57:24 CET 2002 license/README.dom.txt
      715 Sun Dec 01 11:57:24 CET 2002 license/README.sax.txt
      672 Sun Dec 01 11:57:24 CET 2002 license/README.txt

      This situation seems to quite confuse Hadoop. The underlying OSX
      filesystem doesn't support file and directory names differing only by
      case; see http://developer.apple.com/library/mac/#documentation/Java/Conceptual/Java14Development/01-JavaOverview/JavaOverview.html

      mahout lucene.vector --dir solr/data/index/ --output bar/vecs --field
      label --idField id --dictOut bar/dict.out --norm 2

      Running on hadoop, using HADOOP_HOME=/Users/danbri/working/hadoop/hadoop-0.20.2
      MAHOUT-JOB: /Users/danbri/Documents/workspace/trunk/examples/target/mahout-examples-0.6-SNAPSHOT-job.jar

      Exception in thread "main" java.io.IOException: Mkdirs failed to
      create /tmp/hadoop/hadoop-unjar5018665014541152120/license
      at org.apache.hadoop.util.RunJar.unJar(RunJar.java:48)
      at org.apache.hadoop.util.RunJar.main(RunJar.java:136)

      That Hadoop error message is somewhat unhelpful, especially for those
      who doubt their hadoop knowhow; but technically correct. The
      /tmp/hadoop and its subdirectory exist and are writeable. The problem
      is the specific file/dir names being written into it. That wasn't so
      obvious. So I went chasing around configuring hadoop tmp dirs,
      checking it existed and was writable in local and in hdfs dirs, ...
      then ... I finally, belatedly tried unzipping the jar with 'jar -xvf '
      to see what was special about 'license', and got the same error from
      commandline 'jar' that upset !file.getParentFile().isDirectory() in
      Hadoop's ./src/core/org/apache/hadoop/util/RunJar.java:

      java.io.IOException: license : could not create directory
      at sun.tools.jar.Main.extractFile(Main.java:909)
      at sun.tools.jar.Main.extract(Main.java:852)
      at sun.tools.jar.Main.run(Main.java:242)
      at sun.tools.jar.Main.main(Main.java:1149)

      (this is the same error that trips up hadoop)

      This seems to be reproducible; I did an svn up, mvn clean and mvn
      package, let all the tests run and pass, and confirm that the same
      thing happens.

      I compared an early job .jar from 0.5, where all was fine. Any
      suggestions for best quick fix?


        1. MAHOUT-780.patch
          0.8 kB
          Sean R. Owen



            • Assignee:
              srowen Sean R. Owen
              srowen Sean R. Owen
            • Votes:
              0 Vote for this issue
              1 Start watching this issue


              • Created: