Apache Drill / DRILL-6263

Improvements to DoY initial experience

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.13.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      As part of the Drill 1.13 release process, I tested Drill-on-YARN (DoY) after a year of not having used it. That gap revealed several improvements that would help first-time users.

      • Copy the USAGE.md file into the Drill home directory with the name "DRILL_YARN_USAGE.md".
      • Change the drill-on-yarn-example.conf file to be a valid file for the default Drill and YARN configurations, using the following values:
          heap: "2G"
          max-direct-memory: "2G"
          memory-mb: 5125
      
      • Change the drill-on-yarn-example.conf to disable SSL by default. Just comment out the following line:
          #ssl-enabled: true
      
      • Change the drill-on-yarn-example.conf to disable authorization by default. That is, comment out the following line:
          #auth-type: "drill"
      
      • Change the drill-on-yarn-example.conf to use no AM node labels by default. That is, comment out the following line:
          #node-label-expr: "drill-am"
      

      Failure to comment out this line results in the following error:

      Failed to start Drill application master
        Caused by: Submit application failed
        Caused by: Invalid resource request, node label not enabled but request contains label expression
      

      Also, add this to the Troubleshooting section in USAGE.md.

      • Change DrillOnYarnConfig.findSuffix to allow the .tar suffix. This is what one ends up with if the Mac does its automatic extract. A tar file is larger than the compressed version, but there is no reason it should not be allowed (assuming YARN supports it).
      • Otherwise, change DrillOnYarnConfig.getRemoteDrillHome(), where we emit the error "does not name a valid archive", to differentiate between no suffix and an unsupported suffix. (I got the following error and had to look at the source to figure out what I'd done wrong):
      drill.yarn.drill-install.client-path does not name a valid archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
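
The two suffix items above could be combined roughly as follows. This is only a sketch, not the actual DrillOnYarnConfig code: the class ArchiveSuffixCheck, its methods, and the accepted suffix list are hypothetical, and the list assumes YARN accepts a plain .tar archive. The "contains a letter" test is a heuristic so that a version-numbered directory name such as apache-drill-1.13.0 is reported as having no suffix rather than an unsupported one.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

/** Hypothetical helper for illustration; not part of Drill. */
final class ArchiveSuffixCheck {

  /** Assumed set of accepted archive suffixes, including the proposed ".tar". */
  private static final List<String> SUPPORTED =
      Arrays.asList(".tar.gz", ".tgz", ".zip", ".tar");

  enum Kind { SUPPORTED, NO_SUFFIX, UNSUPPORTED_SUFFIX }

  /** Classifies the file name at the end of the given path. */
  static Kind classify(String path) {
    String name = path.substring(path.lastIndexOf('/') + 1).toLowerCase(Locale.ROOT);
    for (String suffix : SUPPORTED) {
      if (name.endsWith(suffix)) {
        return Kind.SUPPORTED;
      }
    }
    int dot = name.lastIndexOf('.');
    if (dot <= 0 || dot == name.length() - 1) {
      return Kind.NO_SUFFIX;
    }
    // Heuristic: ".0" in "apache-drill-1.13.0" is a version part, not a suffix.
    boolean hasLetter = name.substring(dot + 1).chars().anyMatch(Character::isLetter);
    return hasLetter ? Kind.UNSUPPORTED_SUFFIX : Kind.NO_SUFFIX;
  }

  /** Returns a specific error message, or empty if the path is acceptable. */
  static Optional<String> errorFor(String key, String path) {
    switch (classify(path)) {
      case NO_SUFFIX:
        return Optional.of(key + " has no archive suffix (expected one of "
            + SUPPORTED + "): " + path);
      case UNSUPPORTED_SUFFIX:
        return Optional.of(key + " has an unsupported archive suffix (expected one of "
            + SUPPORTED + "): " + path);
      default:
        return Optional.empty();
    }
  }

  public static void main(String[] args) {
    String key = "drill.yarn.drill-install.client-path";
    String[] samples = {
      "/Users/paulrogers/bin/apache-drill-1.13.0.tar",
      "/Users/paulrogers/bin/apache-drill-1.13.0",
      "/Users/paulrogers/bin/apache-drill-1.13.0.rar"
    };
    for (String path : samples) {
      System.out.println(errorFor(key, path).orElse("OK: " + path));
    }
  }
}
```

With this shape, the user sees which of the two mistakes was made and which suffixes are accepted, without reading the source.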
      
      • Change the newly-added error reporting code in DrillOnYarn.displayError to omit the exception cause if it just repeats the main error message. Here is the full error message from above; the second line is redundant:
      drill.yarn.drill-install.client-path does not name a valid archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
        Caused by: drill.yarn.drill-install.client-path does not name a valid archive: /Users/paulrogers/bin/apache-drill-1.13.0.tar
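
One way to drop the redundant line is to skip any cause whose message equals the message printed just before it. The sketch below is illustrative only: ErrorDisplay and format are hypothetical names, not the real DrillOnYarn.displayError, and the "Caused by:" layout is copied from the output shown in this ticket.

```java
/** Hypothetical formatter for illustration; not the actual DrillOnYarn code. */
final class ErrorDisplay {

  /** Formats an error and its causes, skipping causes that repeat the previous message. */
  static String format(Throwable error) {
    StringBuilder out = new StringBuilder();
    String last = String.valueOf(error.getMessage());
    out.append(last);
    for (Throwable cause = error.getCause(); cause != null; cause = cause.getCause()) {
      String msg = String.valueOf(cause.getMessage());
      if (!msg.equals(last)) {
        out.append("\n  Caused by: ").append(msg);
        last = msg;
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String msg = "drill.yarn.drill-install.client-path does not name a valid archive: "
        + "/Users/paulrogers/bin/apache-drill-1.13.0.tar";
    // The wrapper and its cause carry the same message, so only one line is printed.
    System.out.println(format(new RuntimeException(msg, new IllegalStateException(msg))));
  }
}
```

Causes with distinct messages, such as the three-line node label error earlier in this ticket, are still printed in full.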
      
      • Add to USAGE.md pointers for how to set up a basic HDFS, ZK and YARN configuration. Mostly just state what is to be done and point to the relevant Hadoop docs, "Pseudo-Distributed Operation". In particular, we want to create an actual HDFS file system, not use the default of local file system.
      • Add to USAGE.md a description of the supported YARN (actually Hadoop) versions. Feature was developed with 2.7.1. Currently verifying with 2.9.0. Probably needs to be rechecked on the 3.x series.
      • Add to USAGE.md the fact that Drill is built with, and includes the jars for, Hadoop 2.7.1. It is not clear what version compatibility Hadoop has; are these jars compatible with the latest 2.x series Hadoop? With Hadoop 3.x?
      • Until DRILL-6268 is fixed, explain that the HDFS configuration must use port 8020. Also, add this to the Troubleshooting section in USAGE.md.
      • Add to USAGE.md, Troubleshooting: if configuration issues cause Drill to fail to start, then Drill-on-YARN will blacklist each node after several tries. Unfortunately, the YARN UI appears to not provide access to the logs for failed application containers. So, to track down the failure, look for the container logs in YARN. In the default single-node install, they are in $HADOOP_HOME/logs/userlogs/application_xxx/container_xx_00000y where y=1 is the AM, y>1 are the Drillbit containers.
      • In USAGE.md, change the following line:
      cp $DRILL_HOME/conf/drill-override-example.conf $DRILL_SITE/drill-override.conf
      

      To the following:

      cp $DRILL_HOME/conf/drill-override.conf $DRILL_SITE
      

      Without this change, Drill will fail to start and you'll see the following in the drillbit.log file in the YARN container log directory:

      2018-03-17 16:11:25,293 [main] ERROR o.a.d.e.r.u.s.PamUserAuthenticator - Problem in finding the native library of JPAM (Pluggable Authenticator Module API). Make sure to set Drillbit JVM option 'java.library.path' to point to the directory where the native JPAM exists.
      java.lang.UnsatisfiedLinkError: no jpam in java.library.path
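
To locate that drillbit.log, the container log layout described in the Troubleshooting item above can be sketched as a small helper. Everything here is illustrative: ContainerLogs and its methods are hypothetical, the directory layout is the default single-node one quoted in this ticket, and the rule that the container numbered 1 is the AM is taken from the same item rather than from any YARN guarantee.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper for illustration; not part of Drill or YARN. */
final class ContainerLogs {

  /** Builds $HADOOP_HOME/logs/userlogs/<applicationId> for a default single-node install. */
  static Path userLogsDir(String hadoopHome, String applicationId) {
    return Paths.get(hadoopHome, "logs", "userlogs", applicationId);
  }

  /** Maps a container directory name to its role using the trailing sequence number. */
  static String roleOf(String containerDirName) {
    String tail = containerDirName.substring(containerDirName.lastIndexOf('_') + 1);
    long sequence = Long.parseLong(tail);
    return sequence == 1 ? "AM" : "Drillbit";
  }

  public static void main(String[] args) {
    // Made-up application and container ids, used only to show the shape of the names.
    System.out.println(userLogsDir("/opt/hadoop", "application_1521328000000_0001"));
    System.out.println(roleOf("container_1521328000000_0001_01_000001"));
    System.out.println(roleOf("container_1521328000000_0001_01_000002"));
  }
}
```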
      

      None of these are show stoppers; each is just a bit of sand in the gears that makes progress slower than it need be.

          People

            Assignee: Unassigned
            Reporter: Paul Rogers (paul-rogers)