  Mahout / MAHOUT-2074




    • Type: New Feature
    • Status: In Progress
    • Priority: Critical
    • Resolution: Unresolved
    • Affects Version/s: 14.1
    • Fix Version/s: 14.2
    • Component/s: None
    • Labels:


      Have a [WIP] Dockerfile which (assuming a binary release) pulls the appropriate version of Spark and places Spark and Mahout in /opt/spark and /opt/mahout respectively.

      Would like to add full Mahout build capabilities (this should not be difficult) in a second file.

      These files currently use an ENTRYPOINT ["entrypoint.sh"] instruction and some environment variables (none uncommon to Spark or Mahout, aside from a $MAHOUT_CLASSPATH env variable).
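      As a sketch, the layout described above (binary release into /opt, plus the entrypoint) could look like the following; the base image, tarball names, and unpacked directory names are placeholders on my part, not the actual [WIP] file:

```dockerfile
# Hypothetical sketch: base image, tarball names, and versions are placeholders.
FROM openjdk:8-jre

ENV SPARK_HOME=/opt/spark \
    MAHOUT_HOME=/opt/mahout

# ADD auto-extracts local tarballs; rename the unpacked dirs into /opt/{spark,mahout}.
ADD spark-bin.tgz /opt/
ADD mahout-bin.tgz /opt/
RUN mv /opt/spark-* "$SPARK_HOME" && mv /opt/apache-mahout-* "$MAHOUT_HOME"

COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```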

      The entrypoint.sh essentially checks whether the command is for a worker or a driver, and runs as such. Currently I'm just dumping the entire $MAHOUT_HOME/lib/*.jar into $MAHOUT_CLASSPATH and adding it to $SPARK_CLASSPATH.
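      In entrypoint.sh, that classpath assembly might look like the following sketch (the function name is mine; $MAHOUT_HOME and $SPARK_CLASSPATH are assumed to be set by the image):

```shell
#!/usr/bin/env sh
# Sketch only: dumps every jar under $MAHOUT_HOME/lib into a
# colon-separated $MAHOUT_CLASSPATH and appends it to $SPARK_CLASSPATH.

build_mahout_classpath() {
  cp=""
  for jar in "$MAHOUT_HOME"/lib/*.jar; do
    [ -e "$jar" ] || continue        # glob matched nothing: skip the literal
    cp="${cp:+$cp:}$jar"             # join entries with ':'
  done
  printf '%s' "$cp"
}

MAHOUT_CLASSPATH="$(build_mahout_classpath)"
export SPARK_CLASSPATH="${SPARK_CLASSPATH:+$SPARK_CLASSPATH:}$MAHOUT_CLASSPATH"
```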

      If the entrypoint.sh file detects a driver, it will launch spark-submit. IIRC (though I do not think that I do), spark-submit can handle any driver. Pat Ferrel, does this sound correct? Otherwise we just add the Mahout class to be passed to spark-submit as a --class command parameter.
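      The worker/driver dispatch could be sketched like this; the role names, the worker launch via spark-class, and the application jar path are assumptions on my part, not the actual entrypoint.sh:

```shell
#!/usr/bin/env sh
# Sketch of the entrypoint.sh dispatch: the first argument picks the role.
# The "worker"/"driver" names and the jar path below are hypothetical.

dispatch() {
  role="$1"; shift
  case "$role" in
    driver)
      # Pass the Mahout driver class to spark-submit via --class;
      # the application jar path is a placeholder.
      mahout_class="$1"; shift
      exec "$SPARK_HOME/bin/spark-submit" \
        --class "$mahout_class" \
        --driver-class-path "$MAHOUT_CLASSPATH" \
        "$MAHOUT_HOME/mahout-spark.jar" "$@"
      ;;
    worker)
      # Run a standalone Spark worker in the foreground.
      exec "$SPARK_HOME/bin/spark-class" \
        org.apache.spark.deploy.worker.Worker "$SPARK_MASTER_URL"
      ;;
    *)
      echo "entrypoint.sh: unknown role '$role'" >&2
      return 1
      ;;
  esac
}

if [ "$#" -gt 0 ]; then dispatch "$@"; fi
```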

      Though it may be better, in 14.2 or 15.0, to migrate to an entirely new build chain, e.g. CMake (I would suggest), given our large amount of native code (hopefully soon to be added).

      Though it's nearly finished, we may want to punt this Dockerfile for 14.1, or mark it Experimental, which is likely the better option.


          Issue Links



              • Assignee:
                joe_o Joe Olson
                Andrew_Palumbo Andrew Palumbo
              • Votes:
                0
              • Watchers:
                3


                • Created: