Hadoop HDFS / HDFS-13396

start-dfs.sh and stop-dfs.sh have a malformed command; they don't use the workers file

Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 3.0.0, 3.0.1
    • Fix Version/s: None
    • Component/s: hdfs
    • Labels: None
    • Environment: Hadoop 3.0.1 binary distribution on Gentoo Linux, IcedTea JRE

    Description

      In 3.0.1's start-dfs.sh, the command to start the datanodes reads as follows:

      hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
          --workers \
          --config "${HADOOP_CONF_DIR}" \
          --daemon start \
          datanode ${dataStartOpt}
      

       This doesn't work; executing the script produces this:

      hdfs@msba02a ~ $ $HADOOP_HOME/sbin/start-dfs.sh
      Starting namenodes on [msba02a.bus.emory.ddns]
      Starting datanodes
      ^/opt/hadoop/3: ssh: Could not resolve hostname ^/opt/hadoop/3.0.1/etc/hadoop/workers: Name or service not known
      pdsh@msba02a: ^/opt/hadoop/3: ssh exited with exit code 255
      Starting secondary namenodes [msba02a]
      
      

      As the output shows, the path to the workers file under HADOOP_CONF_DIR (with a leading ^) is being treated as the hostname of a machine to connect to, rather than as a file listing the hosts.

      The workaround I developed involves the --hostnames option like so, changing the one-name-per-line workers file into a comma-separated list:

      hadoop_uservar_su hdfs datanode "${HADOOP_HDFS_HOME}/bin/hdfs" \
          --workers \
          --hostnames `sed -e ':a' -e 'N' -e '$!ba' -e 's/\n/,/g' ${HADOOP_CONF_DIR}/workers` \
          --config "${HADOOP_CONF_DIR}" \
          --daemon start \
          datanode ${dataStartOpt}
      
      

      A similar change had to be made to stop-dfs.sh. I've verified that HADOOP_HDFS_HOME and HADOOP_CONF_DIR are set correctly within the script at the point where this command executes.
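
The conversion step in the workaround above (one hostname per line to a comma-separated list) could also be sketched as a small helper. This is only an illustration under stated assumptions: the function name `workers_to_csv` is hypothetical and not part of Hadoop, and skipping blank lines and `#` comments is an added assumption that the original sed one-liner does not make. It relies only on the standard `sed` and `paste` utilities.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): join a one-host-per-line
# workers file into a single comma-separated list.
workers_to_csv() {
  local workers_file="$1"
  # Strip comments and surrounding whitespace, drop empty lines,
  # then join the remaining lines with commas.
  sed -e 's/#.*$//' \
      -e 's/^[[:space:]]*//' \
      -e 's/[[:space:]]*$//' \
      -e '/^$/d' "${workers_file}" | paste -sd, -
}
```

Under the same assumptions, the result could be passed to the option used in the workaround, for example `--hostnames "$(workers_to_csv "${HADOOP_CONF_DIR}/workers")"`, which also keeps the argument quoted.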

      This problem also exists in start-dfs.sh/stop-dfs.sh in 3.0.0, although the original invocation differs slightly from 3.0.1.

      In 3.0.1, I'm also running into a separate problem getting datanodes started (this worked in 3.0.0), but I couldn't reach that problem until I got past this one.

       

          People

            Assignee: Unassigned
            Reporter: Jeff Hubbs (Gravelator)
            Votes: 0
            Watchers: 2
