Apache Submarine / SUBMARINE-16

[Submarine] Correct the default directory path in HDFS for "checkpoint_path"


Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Duplicate
    • Affects Version/s: None
    • Fix Version/s: 0.2.0
    • Component/s: None
    • Labels: None

    Description

       

      yarn jar $HADOOP_BASE_DIR/home/share/hadoop/yarn/hadoop-yarn-submarine-3.2.0-SNAPSHOT.jar job run \
       -verbose \
       -wait_job_finish \
       -keep_staging_dir \
       --env DOCKER_JAVA_HOME=/usr/lib/jvm/java-8-oracle \
       --env DOCKER_HADOOP_HDFS_HOME=/hadoop-3.2.0-SNAPSHOT \
       --name tf-job-001 \
       --docker_image tangzhankun/tensorflow \
       --input_path hdfs://default/user/yarn/cifar-10-data \
       --worker_resources memory=4G,vcores=2 \
       --worker_launch_cmd "cd /cifar10_estimator && python cifar10_main.py --data-dir=%input_path% --job-dir=%checkpoint_path% --num-gpus=0 --train-steps=5"

       

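      The script above should work, but in my testing the job failed because an invalid path was passed to "--job-dir". The value should be a URI starting with "hdfs://", but %checkpoint_path% was replaced with a relative path, as the submitter log shows: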

      2018-09-19 23:19:34,729 INFO yarnservice.YarnServiceJobSubmitter: Worker command =[cd /cifar10_estimator && python cifar10_main.py --data-dir=hdfs://default/user/yarn/cifar-10-data --job-dir=submarine/jobs/tf-job-001/staging/checkpoint_path --num-gpus=0 --train-steps=2]
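
      The difference between the two paths in the logged command can be checked mechanically. The following is only a minimal sketch, not Submarine's implementation: `is_qualified_uri` and `qualify` are hypothetical helpers, and resolving relative paths against `/user/<user>` on the `hdfs://default` filesystem is an assumption based on the `--input_path` used in this report.

```python
import posixpath
from urllib.parse import urlparse


def is_qualified_uri(path: str) -> bool:
    """Return True if the path carries an explicit scheme such as hdfs://."""
    return urlparse(path).scheme != ""


def qualify(path: str, default_fs: str, user: str) -> str:
    """Hypothetical helper: turn a bare path into a fully qualified URI.

    Assumption: relative paths resolve against /user/<user>, the usual
    HDFS home-directory convention. Submarine itself may differ.
    """
    if is_qualified_uri(path):
        return path  # already usable as --job-dir
    if not path.startswith("/"):
        path = posixpath.join("/user", user, path)
    return default_fs.rstrip("/") + path


# Values copied from the worker command in the log above.
data_dir = "hdfs://default/user/yarn/cifar-10-data"
job_dir = "submarine/jobs/tf-job-001/staging/checkpoint_path"

print(is_qualified_uri(data_dir))
print(is_qualified_uri(job_dir))
print(qualify(job_dir, "hdfs://default", "yarn"))
```

      Under these assumptions, the `--data-dir` value passes the check while the `--job-dir` value does not, which matches the failure described above.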

          People

            Assignee: tangzhankun (Zhankun Tang)
            Reporter: tangzhankun (Zhankun Tang)
            Votes: 0
            Watchers: 2
