Mahout / MAHOUT-2074

Dockerfile(s)


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Critical
    • Resolution: Won't Fix
    • Affects Version/s: 14.1
    • Fix Version/s: classic-15.0
    • Component/s: None
    • Labels: None

    Description

      Have a [WIP] Dockerfile which (assuming a binary release) pulls the appropriate version of Spark and places Spark and Mahout in /opt/spark and /opt/mahout respectively.

      Would like to add full Mahout build capabilities (this should not be difficult) in a second Dockerfile.
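
      Roughly, the binary-release Dockerfile is along the lines of the sketch below. This is a sketch only: the base image, version numbers, and download URLs/archive names are placeholders for illustration, not what is actually committed.

        FROM openjdk:8-jre-slim

        # Placeholder versions -- the real Dockerfile pins whatever the 14.1 release targets.
        ARG SPARK_VERSION=2.4.5
        ARG MAHOUT_VERSION=14.1

        # Pull binary releases of Spark and Mahout and unpack them under /opt/spark and /opt/mahout.
        RUN apt-get update && apt-get install -y curl && \
            curl -fsSL "https://archive.apache.org/dist/spark/spark-${SPARK_VERSION}/spark-${SPARK_VERSION}-bin-hadoop2.7.tgz" \
              | tar -xz -C /opt && \
            mv /opt/spark-${SPARK_VERSION}-bin-hadoop2.7 /opt/spark && \
            curl -fsSL "https://archive.apache.org/dist/mahout/${MAHOUT_VERSION}/apache-mahout-distribution-${MAHOUT_VERSION}.tar.gz" \
              | tar -xz -C /opt && \
            mv /opt/apache-mahout-distribution-${MAHOUT_VERSION} /opt/mahout

        ENV SPARK_HOME=/opt/spark \
            MAHOUT_HOME=/opt/mahout

        COPY entrypoint.sh /entrypoint.sh
        RUN chmod +x /entrypoint.sh
        ENTRYPOINT ["/entrypoint.sh"]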

      These files currently use an ENTRYPOINT ["entrypoint.sh"] instruction and some environment variables (none uncommon to Spark or Mahout, aside from a $MAHOUT_CLASSPATH env variable).

      entrypoint.sh essentially checks whether the command is for a worker or a driver, and runs as such. Currently I'm just dumping the entire $MAHOUT_HOME/lib/*.jar into $MAHOUT_CLASSPATH and adding it to SPARK_CLASSPATH.
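
      The classpath handling amounts to something like the following shell sketch (only $MAHOUT_HOME, $MAHOUT_CLASSPATH, and $SPARK_CLASSPATH are assumed to be set; everything else is illustrative):

        # Collect every jar under $MAHOUT_HOME/lib into a colon-separated classpath.
        MAHOUT_CLASSPATH=""
        for jar in "${MAHOUT_HOME}"/lib/*.jar; do
          MAHOUT_CLASSPATH="${MAHOUT_CLASSPATH:+${MAHOUT_CLASSPATH}:}${jar}"
        done
        export MAHOUT_CLASSPATH

        # Make the Mahout jars visible to Spark as well.
        export SPARK_CLASSPATH="${SPARK_CLASSPATH:+${SPARK_CLASSPATH}:}${MAHOUT_CLASSPATH}"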

      If entrypoint.sh detects a driver, it will launch spark-submit. IIRC (though I don't think that I do), spark-submit can handle any driver. pferrel, does this sound correct? Otherwise we just add the Mahout class to be passed to spark-submit as a command parameter.
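
      The worker/driver dispatch in entrypoint.sh is roughly the following. The "worker"/"driver" argument convention is illustrative of the current WIP, not a final interface; the main class is passed through as a command parameter in the driver case.

        #!/usr/bin/env bash
        set -e

        ROLE="$1"; shift

        case "${ROLE}" in
          worker)
            # Join an existing master as a Spark worker; the master URL is the next argument.
            exec "${SPARK_HOME}/bin/spark-class" org.apache.spark.deploy.worker.Worker "$@"
            ;;
          driver)
            # Launch the requested job; the Mahout main class is the next argument.
            exec "${SPARK_HOME}/bin/spark-submit" \
              --driver-class-path "${MAHOUT_CLASSPATH}" \
              --class "$1" "${@:2}"
            ;;
          *)
            echo "usage: entrypoint.sh {worker|driver} ..." >&2
            exit 1
            ;;
        esac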

      Though it may be better to migrate to an entirely new build chain (e.g. CMake, which I would suggest) in 14.2 or 15.0, given our large amount of native code (hopefully soon to be added).

      Though it's nearly finished, we may want to punt this Dockerfile past 14.1; marking it Experimental is likely a better option.

            People

              Assignee: Joe Olson (joe_o)
              Reporter: Andrew Palumbo (Andrew_Palumbo)
              Votes: 0
              Watchers: 4
