  1. Apache Drill
  2. DRILL-6521

When using Hadoop configs, allow dfs connection to be unset

Details

    • Type: Improvement
    • Status: Open
    • Priority: Minor
    • Resolution: Unresolved
    • Affects Version/s: 1.14.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None

    Description

      Drill usually works stand-alone. In a Hadoop HDFS environment, the documentation says to set the HDFS configuration in the storage plugin config.

      In a MapR installation, MaprFS settings are applied automatically.

      However, when Drill runs on an existing HDFS cluster, one must often provide more than the simple HDFS URL. In a secure cluster in particular, other configuration settings are also needed. At present, these must be copied out of the HDFS config files into the Drill storage plugin config, and the two must then be updated in tandem, which is clearly less than ideal.

      Drill does allow the user to add Hadoop configs to the class path, although it looks like the previous HADOOP_CONF setting has been removed in recent releases. The user can edit drill-env.sh to add the Hadoop class path to EXTN_CLASSPATH (but see DRILL-6520).

      This is all good, but Drill still requires that the "dfs" storage plugin config contain a connection. Omit the connection and we get:

      Please retry: Error while creating/ updating storage : The value of property fs.defaultFS must not be null
      

      I would expect to be able to omit this property if the value is provided by the standard Hadoop core-site.xml file (and to do so without Drill crashing, per DRILL-6520).
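
      As an illustration of the request, a "dfs" storage plugin config with the connection left out might look roughly like the sketch below. This is hypothetical: the workspace and format entries are placeholder assumptions, not taken from a real cluster, and per this report Drill 1.14.0 rejects such a config with the fs.defaultFS error. The intent is that the file system URL would instead come from fs.defaultFS in the core-site.xml found on the class path.

```json
{
  "type": "file",
  "enabled": true,
  "workspaces": {
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "parquet": {
      "type": "parquet"
    }
  }
}
```

      The only difference from a typical "dfs" config is the absent "connection" property; everything else would stay as it is today.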

            People

              Assignee: Unassigned
              Reporter: Paul Rogers (paul-rogers)