
HADOOP-7596: Enable jsvc to work with Hadoop RPM package

    Details

    • Type: Bug
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.20.204.0
    • Fix Version/s: 0.20.205.0
    • Component/s: build
    • Labels:
      None
    • Environment:

      Java 6, RedHat EL 5.6

      Description

      For a secure Hadoop 0.20.2xx cluster, the datanode can only run with a 32-bit JVM because Hadoop only packages a 32-bit jsvc. The build process should download the proper jsvc version based on the build architecture. In addition, the shell script should be enhanced to locate the Hadoop jar files in the proper location.
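
      A minimal sketch of the kind of layout detection the startup scripts need, assuming the HADOOP_PREFIX/HADOOP_CONF_DIR variables used elsewhere in this issue and an illustrative /usr/share/hadoop fallback path (this is not the patch itself):

      # Sketch only: detect whether we run from a release tarball layout or a
      # packaged (RPM/DEB) layout, then build the classpath from that location.
      if [ -d "${HADOOP_PREFIX}/share/hadoop" ]; then
        HADOOP_JAR_DIR="${HADOOP_PREFIX}/share/hadoop"   # tar layout
      else
        HADOOP_JAR_DIR="/usr/share/hadoop"               # binary (package) layout, assumed path
      fi

      CLASSPATH="${HADOOP_CONF_DIR:-/etc/hadoop}"
      for jar in "${HADOOP_JAR_DIR}"/hadoop-*.jar "${HADOOP_JAR_DIR}"/lib/*.jar; do
        [ -e "$jar" ] && CLASSPATH="${CLASSPATH}:${jar}"
      done
      export CLASSPATH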

      Attachments

      1. HADOOP-7596-3.patch (16 kB) - Eric Yang
      2. HADOOP-7596-2.patch (17 kB) - Eric Yang
      3. HADOOP-7596.patch (17 kB) - Eric Yang


          Activity

          Matt Foley added a comment -

          Closed upon release of 0.20.205.0

          Devaraj Das added a comment -

          I just committed this (to the 20-security branch). Thanks, Eric!

          Bruno Mahé added a comment -

          Thanks a lot for your reply. I understand the motivation now.
          But imho, this is negligible compared to:

          • Everything else in the hadoop stack.
          • The advantages that consistency and readability would bring.
          • The effort to maintain such micro-optimizations across all the GNU/Linux distributions (whose LSB implementations differ wildly and are in some cases partly broken).

          Thank you again very much for your patience.

          Eric Yang added a comment -

          So you end up with a similar structure as the deb side and less repetition.

          The code is optimized for each Linux distro. There is a subtle difference between "daemon --user root" and "daemon" in Redhat: in the second case it does not run "runuser", for performance reasons. In the Debian case, it always runs start-stop-daemon regardless of whether a user is specified. Because of that performance difference, and because the generalization code is already in hadoop-daemon.sh, there is no need to put generalization code in the vendor-specific scripts.

          Bruno Mahé added a comment -

          Anyway, this is going off-topic. I just meant to replace in your patch:

          -  daemon --user hdfs ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" stop datanode
          +  if [ -n "$HADOOP_SECURE_DN_USER" ]; then
          +    daemon ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" stop datanode
          +  else
          +    daemon --user hdfs ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" stop datanode
          +  fi
          

          by something similar to what you did on the Debian side, which would look something like:

          +if [ -n "$HADOOP_SECURE_DN_USER" ]; then
          +  DN_USER="root"
          +else
          +  DN_USER="hdfs"
          +fi
          

          And then later on:

          -  daemon --user hdfs ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" stop datanode
          +  daemon --user $DN_USER ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" stop datanode
          

          So you end up with a similar structure as the deb side and less repetition.

          Bruno Mahé added a comment -
          Bruno, the Debian system uses a .pid file to check for process aliveness. The filename generated by hadoop-daemon.sh can have a discrepancy between the starting user and the effective running user. This is why this fix only applies to the Debian family. Redhat uses /var/lock/subsys to track process aliveness; hence, in Redhat there is no discrepancy between the pid file name and the user that started the process, because the lock has a fixed name.

          I am not sure I follow what you mean by "track process aliveness". Lock files and pid files have different purposes.
          And as far as I can tell from https://github.com/apache/hadoop-common/blob/branch-0.20-security-205/src/packages/rpm/init.d/hadoop-datanode you use $PIDFILE to check aliveness, where PIDFILE="${HADOOP_PID_DIR}/hadoop-hdfs-datanode.pid".

          Does that mean a secure cluster on Redhat does not have all the issues the Debian platform has? (dropping privileges and changing user...)

          Eric Yang added a comment -

          Does this mean I'll have to adduser hdfs to run a non-secure cluster? If so I'd much rather not.

          Ravi, hdfs is added as part of the rpm/deb package installation. This is the common behavior for rpm-installed packages. The headless user is preconfigured as part of the package installation process.

          I would be expecting an identical behavior in both init scripts and use DN_USER/IDENT_USER for both cases

          Bruno, the Debian system uses a .pid file to check for process aliveness. The filename generated by hadoop-daemon.sh can have a discrepancy between the starting user and the effective running user. This is why this fix only applies to the Debian family. Redhat uses /var/lock/subsys to track process aliveness; hence, in Redhat there is no discrepancy between the pid file name and the user that started the process, because the lock has a fixed name.
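
          The headless-user preconfiguration mentioned above is typically done in a package preinstall scriptlet; a minimal sketch, assuming a hadoop group, a /var/lib/hadoop/hdfs home and a nologin shell (none of these details are taken from the actual Hadoop packaging):

          %pre
          # Sketch of an RPM %pre scriptlet: create the hdfs headless user at
          # install time if it does not already exist. Group, home and shell
          # are illustrative assumptions.
          getent group hadoop >/dev/null || groupadd -r hadoop
          getent passwd hdfs >/dev/null || \
            useradd -r -g hadoop -d /var/lib/hadoop/hdfs -s /sbin/nologin -c "Hadoop HDFS" hdfs
          exit 0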

          Bruno Mahé added a comment -

          Thank you very much for your reply.

          But I am still confused.

          I see

          +if [ -n "$HADOOP_SECURE_DN_USER" ]; then
          +  DN_USER="root"
          +  IDENT_USER=${HADOOP_SECURE_DN_USER}
          +else
          +  DN_USER="hdfs"
          +  IDENT_USER=${DN_USER}
          +fi
          

          defined in src/packages/deb/init.d/hadoop-datanode but not in src/packages/rpm/init.d/hadoop-datanode? (I am looking at https://github.com/apache/hadoop-common/blob/branch-0.20-security-205/src/packages/rpm/init.d/hadoop-datanode so hopefully I am on the right branch)
          I would be expecting an identical behavior in both init scripts and use DN_USER/IDENT_USER for both cases

          Ravi Prakash added a comment -

          For non-secure cluster, datanode process is hard coded to hdfs user.

          Does this mean I'll have to adduser hdfs to run a non-secure cluster? If so I'd much rather not.

          Eric Yang added a comment -

          Out of curiosity, what is the rationale of having a ${IDENT_USER} and ${DN_USER} ?

          For a secure datanode, the script should be started as the root user and then drop privileges to HADOOP_SECURE_DN_USER. In the secure deployment context, IDENT_USER=hdfs and DN_USER=root.

          hdfs user is still hardcoded in several locations

          For non-secure cluster, datanode process is hard coded to hdfs user.

          Isn't it a dangerous behaviour if a user customizes this directory? (and by mistake or knowingly sets it, for instance, to /var/log)

          The chmod code was legacy code. The log dir test is there to make it less destructive while maintaining backward compatibility.

          Bruno Mahé added a comment -

          +  if start-stop-daemon --start --quiet --oknodo --pidfile ${HADOOP_PID_DIR}/hadoop-${IDENT_USER}-datanode.pid -c ${DN_USER} -x ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh -- --config ${HADOOP_CONF_DIR} start datanode; then

          Out of curiosity, what is the rationale of having a ${IDENT_USER} and ${DN_USER}?

          +  daemon --user hdfs ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh --config "${HADOOP_CONF_DIR}" start datanode

          hdfs user is still hardcoded in several locations

          ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh is used in many locations. It may be useful to use a variable instead.

          -chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
          +touch $HADOOP_LOG_DIR/.hadoop_test > /dev/null 2>&1
          +TEST_LOG_DIR=$?
          +if [ "${TEST_LOG_DIR}" = "0" ]; then
          +  rm -f $HADOOP_LOG_DIR/.hadoop_test
          +else
          +  chown $HADOOP_IDENT_STRING $HADOOP_LOG_DIR
          +fi

          Isn't it a dangerous behaviour if a user customizes this directory? (and by mistake or knowingly sets it, for instance, to /var/log)

          Also, in some places I see the LSB log* functions being used and in other places I see "echo -n", although this is out of scope for this ticket.

          Devaraj Das added a comment -

          +1

          Eric Yang added a comment -

          The issues fixed in this jira do not require forward porting to trunk because the changes are already in trunk. Thanks to Devaraj for reviewing this patch.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12492861/HADOOP-7596-3.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/130//console

          This message is automatically generated.

          Eric Yang added a comment -

          Also fixed the Debian init.d script, where the startup user was set incorrectly, in HADOOP-7596-2.patch.

          Eric Yang added a comment -
          • Removed uid/gid from this patch.
          • Removed legacy code for setting HADOOP_HOME to $HADOOP_PREFIX/share/hadoop/bin.
          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12492644/HADOOP-7596-2.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/124//console

          This message is automatically generated.

          Eric Yang added a comment -

          The dependency on openjdk was a mistake on my part. HDFS-2192 and MAPREDUCE-2728 have patches to address this problem for trunk.

          Bruno Mahé added a comment -

          Sun Java is required

          So why do we have the following line:
          https://github.com/apache/hadoop-common/blob/trunk/hadoop-common-project/hadoop-common/src/main/packages/deb/hadoop.control/control

          Depends: openjdk-6-jre-headless

          I am not that familiar with packaging in hadoop itself, so I am sorry if I have missed some bits of information.

          Eric Yang added a comment -
          • Fixed secure datanode user.
          • Improved layout detection for tar layout vs binary layout.
          Allen Wittenauer added a comment -

          http://wiki.apache.org/hadoop/HadoopJavaVersions
          Eric Yang added a comment -

          For Kerberos-secured Hadoop, Sun Java is required. The Java definition fix is to ensure that we limit compatibility to Sun Java for now, until Hadoop has been verified to work with other vendors' Java.

          Allen Wittenauer added a comment -

          Hadoop only works with Sun Java.

          This isn't true, and it is one of the reasons why attempting to figure out which java to use programmatically is full of potholes.

          Eric Yang added a comment -

          The datanode user is garbled in the patch.

          Eric Yang added a comment -

          /usr/lib/jvm/java-6-sun is not valid on my machine (I have the JDK package installed). So why this path?

          Using update-alternatives prompts the user to select a java if multiple javas are installed. This can cause the installer to hang. Hadoop only works with Sun Java. Therefore, this change was made in trunk to set JAVA_HOME to /usr/lib/jvm/java-6-sun for Debian systems. Part of this patch brings 0.20.2xx in sync with the trunk code.

          + DN_USER="-c hdfs"

          Good catch, I made a mistake in my patch. Glad the mistake was caught. Thank you.

          Hadoop QA added a comment -

          -1 overall. Here are the results of testing the latest attachment
          http://issues.apache.org/jira/secure/attachment/12492542/HADOOP-7596.patch
          against trunk revision .

          +1 @author. The patch does not contain any @author tags.

          -1 tests included. The patch doesn't appear to include any new or modified tests.
          Please justify why no new tests are needed for this patch.
          Also please list what manual steps were performed to verify this patch.

          -1 patch. The patch command could not apply the patch.

          Console output: https://builds.apache.org/job/PreCommit-HADOOP-Build/116//console

          This message is automatically generated.

          Bruno Mahé added a comment -

          I am confused.

          -      JAVA_HOME=`update-alternatives --config java | grep java | cut -f2 -d':' | cut -f2 -d' ' | sed -e 's/\/bin\/java//'`
          +      JAVA_HOME=/usr/lib/jvm/java-6-sun
          

          /usr/lib/jvm/java-6-sun is not valid on my machine (I have the JDK package installed). So why this path?

          And also I see

          +if [ -n "$HADOOP_SECURE_DN_USER" ]; then
          +  DN_USER=""
          +else
          +  DN_USER="-c hdfs"
          +fi
          +
          

          but then it is used later on in paths. Here is one example:

          -	if start-stop-daemon --start --quiet --oknodo --pidfile ${HADOOP_PID_DIR}/hadoop-hdfs-datanode.pid -c hdfs -x ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh -- --config ${HADOOP_CONF_DIR} start datanode; then
          +	if start-stop-daemon --start --quiet --oknodo --pidfile ${HADOOP_PID_DIR}/hadoop-${DN_USER}-datanode.pid -c ${DN_USER} -x ${HADOOP_PREFIX}/sbin/hadoop-daemon.sh -- --config ${HADOOP_CONF_DIR} start datanode; then
          

          So

          --pidfile ${HADOOP_PID_DIR}/hadoop-${DN_USER}-datanode.pid 
          

          would be expanded to

          --pidfile ${HADOOP_PID_DIR}/hadoop-c hdfs-datanode.pid 
          
          Eric Yang added a comment -

          The patch is correct. I made a mistake on local testing.

          Eric Yang added a comment -

          Cancelling the patch because the classpath should be set up differently for the tar layout and the binary layout.

          Eric Yang added a comment -
          • Bundle both 32bit and 64bit jsvc in release tar ball.
          • Change scripts to construct class path properly.
          • Bug fix for start up script to utilize proper jsvc.
          Eric Yang added a comment -

          This jira is only valid for the 0.20.205 and 0.20.2xx branches. For 0.23+, HDFS-2289 should be applied.

          Roman Shaposhnik added a comment -

          I thought we had agreed over at HDFS-2289 to recompile jsvc as part of the build process. Is this JIRA somehow different?

          Jakob Homan added a comment -

          ok.

          Eric Yang added a comment -

          Jakob, yes, but there is only one release tarball, which contains both i386 and x86_64 libraries. There is no per-architecture location for the jsvc bundled in Hadoop; hence, the release tarball effectively ships the 32-bit jsvc only.

          Ideally, there should be one tarball/rpm/deb per architecture. However, having a separate tarball per architecture prevents us from having a single signed hadoop-core*.jar file to push to the Maven repository.

          Therefore, to solve this problem, we should probably retain the single-tarball release but bundle both jsvc binaries in it. The only change would be in the shell script, which would do late binding to select the right jsvc at runtime.

          RPM/DEB packages should only include the jsvc matching the build architecture.

          Does this sound reasonable?
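
          A sketch of what that runtime late binding could look like in the startup script; the jsvc.i386/jsvc.amd64 file names under libexec and the "64-Bit" JVM check are illustrative assumptions, not the actual patch:

          # Sketch only: pick the bundled jsvc matching the JVM bitness at startup.
          if "${JAVA_HOME}/bin/java" -version 2>&1 | grep -q '64-Bit'; then
            JSVC_ARCH=amd64
          else
            JSVC_ARCH=i386
          fi
          JSVC="${HADOOP_PREFIX}/libexec/jsvc.${JSVC_ARCH}"
          if [ ! -x "${JSVC}" ]; then
            echo "No jsvc binary for ${JSVC_ARCH} under ${HADOOP_PREFIX}/libexec" >&2
            exit 1
          fi
          # hadoop-daemon.sh would then launch the secure datanode through ${JSVC}.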

          Jakob Homan added a comment -

          And this isn't solved by setting -Djsvc.location={url to 64 bit package}?
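
          On the build side, a hedged sketch of feeding a per-architecture jsvc through that property; the URL variables and the build target are placeholders, not real values:

          # Sketch only: choose a jsvc download URL for the architecture being
          # built and pass it to the build via -Djsvc.location.
          case "$(uname -m)" in
            x86_64) JSVC_URL="${JSVC_DOWNLOAD_URL_AMD64}" ;;
            *)      JSVC_URL="${JSVC_DOWNLOAD_URL_I386}" ;;
          esac
          ant "${BUILD_TARGET}" -Djsvc.location="${JSVC_URL}"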


            People

            • Assignee: Eric Yang
            • Reporter: Eric Yang
            • Votes: 0
            • Watchers: 7
