Uploaded image for project: 'Hadoop Common'
  1. Hadoop Common
  2. HADOOP-2857

libhdfs: no way to set JVM args other than classpath

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 0.16.0
    • Fix Version/s: 0.18.0
    • Component/s: None
    • Labels:
      None
    • Hadoop Flags:
      Reviewed

      Description

      I would like a way to set other Java system properties and/or other Java VM arguments when using libhdfs - i.e. I want to be able to say start a profiler, attach a debugger, or increase the memory available to the VM.

      At present, new JVMs are initialised from the getJNIEnv(void) method in src/c++/libhdfs/hdfsJniHelper.c
      This method initialises the -Djava.class.path JVM argument from the CLASSPATH environment variable. I am proposing that another env variable should be added, the contents of which is passed (almost) verbatim as JVM arguments.

      Eg, say the env var JVM_ARGS is used. The string from the env var would be tokenised on "one or more spaces", and would be passed as additional JVM arguments, by make a larger JavaVMOption options[] array, and setting vm_args.nOptions = 1+ number of passed arguments.

      Only flaw I can see with this is that none of the passed parameters can contain spaces.

      1. patch.libhdfs.jvm.args
        2 kB
        Craig Macdonald
      2. v2.patch.libhdfs.jvm.args
        3 kB
        Craig Macdonald

        Activity

        Hide
        craigm Craig Macdonald added a comment -

        Patch attached. Arguments are read from HADOOP_JVM_ARGS env var, space delimited. Applies cleanly to trunk and 0.16 branch.

        Show
        craigm Craig Macdonald added a comment - Patch attached. Arguments are read from HADOOP_JVM_ARGS env var, space delimited. Applies cleanly to trunk and 0.16 branch.
        Hide
        cutting Doug Cutting added a comment -

        There's already a HADOOP_OPTS environment variable that's used by other scripts when starting Java. If these are meant to be libhdfs-specific options, then the environment variable name should make that clear. Thus it might be called LIBHDFS_OPTS.

        Show
        cutting Doug Cutting added a comment - There's already a HADOOP_OPTS environment variable that's used by other scripts when starting Java. If these are meant to be libhdfs-specific options, then the environment variable name should make that clear. Thus it might be called LIBHDFS_OPTS.
        Hide
        craigm Craig Macdonald added a comment -

        I'm quite happy to rename the variable. To HADOOP_OPTS would make most sense to me, as the options that can be set (any JVM args including System properties etc, memory consumption, etc), are exactly the same as someone might set for Hadoop, so why should it be any different to HADOOP_OPTS? A user of a C application that uses libhdfs might not know that it's libhdfs that the C application uses, but rather just Hadoop in general, so HADOOP_OPTS might be better than LIBHDFS_OPTS. However, I'm happy to take arguments either way.

        Make the call on the env vaar name, and can anyone provide feedback - I'm not a natural C coder. Should I be using strtok_r instead of strtok? Do I need to do the tokenisation twice?

        Show
        craigm Craig Macdonald added a comment - I'm quite happy to rename the variable. To HADOOP_OPTS would make most sense to me, as the options that can be set (any JVM args including System properties etc, memory consumption, etc), are exactly the same as someone might set for Hadoop, so why should it be any different to HADOOP_OPTS? A user of a C application that uses libhdfs might not know that it's libhdfs that the C application uses, but rather just Hadoop in general, so HADOOP_OPTS might be better than LIBHDFS_OPTS. However, I'm happy to take arguments either way. Make the call on the env vaar name, and can anyone provide feedback - I'm not a natural C coder. Should I be using strtok_r instead of strtok? Do I need to do the tokenisation twice?
        Hide
        craigm Craig Macdonald added a comment -

        Corrected env var name as per Doug's suggestion; Added code documentation for the affected method, describing added env var.

        Show
        craigm Craig Macdonald added a comment - Corrected env var name as per Doug's suggestion; Added code documentation for the affected method, describing added env var.
        Hide
        hadoopqa Hadoop QA added a comment -

        -1 overall. Here are the results of testing the latest attachment
        http://issues.apache.org/jira/secure/attachment/12380413/v2.patch.libhdfs.jvm.args
        against trunk revision 645773.

        @author +1. The patch does not contain any @author tags.

        tests included -1. The patch doesn't appear to include any new or modified tests.
        Please justify why no tests are needed for this patch.

        javadoc +1. The javadoc tool did not generate any warning messages.

        javac +1. The applied patch does not generate any new javac compiler warnings.

        release audit +1. The applied patch does not generate any new release audit warnings.

        findbugs +1. The patch does not introduce any new Findbugs warnings.

        core tests +1. The patch passed core unit tests.

        contrib tests +1. The patch passed contrib unit tests.

        Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/testReport/
        Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
        Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/artifact/trunk/build/test/checkstyle-errors.html
        Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/console

        This message is automatically generated.

        Show
        hadoopqa Hadoop QA added a comment - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12380413/v2.patch.libhdfs.jvm.args against trunk revision 645773. @author +1. The patch does not contain any @author tags. tests included -1. The patch doesn't appear to include any new or modified tests. Please justify why no tests are needed for this patch. javadoc +1. The javadoc tool did not generate any warning messages. javac +1. The applied patch does not generate any new javac compiler warnings. release audit +1. The applied patch does not generate any new release audit warnings. findbugs +1. The patch does not introduce any new Findbugs warnings. core tests +1. The patch passed core unit tests. contrib tests +1. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hadoop-Patch/2266/console This message is automatically generated.
        Hide
        craigm Craig Macdonald added a comment -

        Not sure that this patch needs tests. The libhdfs test harness creates a namenode then runs the entire test program once only.

        Show
        craigm Craig Macdonald added a comment - Not sure that this patch needs tests. The libhdfs test harness creates a namenode then runs the entire test program once only.
        Hide
        owen.omalley Owen O'Malley added a comment -

        I just committed this. Thanks, Craig!

        ps. Next time please try to keep tabs out, they are set differently in different editors...

        Show
        owen.omalley Owen O'Malley added a comment - I just committed this. Thanks, Craig! ps. Next time please try to keep tabs out, they are set differently in different editors...
        Hide
        hudson Hudson added a comment -
        Show
        hudson Hudson added a comment - Integrated in Hadoop-trunk #475 (See http://hudson.zones.apache.org/hudson/job/Hadoop-trunk/475/ )

          People

          • Assignee:
            craigm Craig Macdonald
            Reporter:
            craigm Craig Macdonald
          • Votes:
            0 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Time Tracking

              Estimated:
              Original Estimate - 5h
              5h
              Remaining:
              Remaining Estimate - 5h
              5h
              Logged:
              Time Spent - Not Specified
              Not Specified

                Development