Apache Drill / DRILL-1290

Document Configuration Steps for Different Hadoop Distributions

    Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: Client - HTTP
    • Labels:
      None
    • Environment:

      OS: Centos 6.4
      HDFS: CDH5.1
      Drill: 0.4.0

      Description

      In web GUI, I can successfully create a new storage plugin named "myhdfs" using "file:///":

      {
        "type": "file",
        "enabled": true,
        "connection": "file:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
      

      However, if I change "file:///" to "hdfs:///" to point to HDFS rather than the local file system, the plugin is not saved and the Drill log reports "[qtp416200645-67] DEBUG o.a.d.e.server.rest.StorageResources - Unable to create/ update plugin: myhdfs".

      {
        "type": "file",
        "enabled": true,
        "connection": "hdfs:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
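
      The two configurations above differ only in the "connection" value. As a minimal sketch (not a confirmed fix for this issue), the Python below derives the HDFS variant from the local one and checks whether a connection URI names a NameNode explicitly. The helper names and the host namenode.example.com:8020 are hypothetical; only the "connection" key and the workspace entries come from the configurations above.

```python
import json
from urllib.parse import urlsplit

# Trimmed copy of the "myhdfs" plugin from the report (formats omitted for brevity).
LOCAL_PLUGIN = {
    "type": "file",
    "enabled": True,
    "connection": "file:///",
    "workspaces": {
        "root": {"location": "/", "writable": False, "storageformat": None},
        "tmp": {"location": "/tmp", "writable": True, "storageformat": "csv"},
    },
}


def with_connection(plugin: dict, connection: str) -> dict:
    """Return a deep copy of the plugin config with a different connection URI."""
    updated = json.loads(json.dumps(plugin))  # JSON round trip gives a deep copy
    updated["connection"] = connection
    return updated


def has_explicit_authority(connection: str) -> bool:
    """True if the URI names a host (and optionally a port), e.g. hdfs://host:8020/."""
    return bool(urlsplit(connection).netloc)


# "hdfs:///" carries no host, so it relies on a default filesystem being configured.
implicit = with_connection(LOCAL_PLUGIN, "hdfs:///")

# Hypothetical NameNode address, used only to illustrate the explicit form.
explicit = with_connection(LOCAL_PLUGIN, "hdfs://namenode.example.com:8020/")

if __name__ == "__main__":
    print(json.dumps(explicit, indent=2))
```

      The explicit form does not depend on the process resolving a default filesystem, which makes it a useful comparison point when "hdfs:///" is rejected.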
      

      My cluster runs CDH5 HDFS, and all client configurations are valid. For example, on the drillbit server:

      [root@hdm ~]# hdfs dfs -ls /
      Found 3 items
      drwxr-xr-x   - hbase hbase               0 2014-08-04 22:55 /hbase
      drwxrwxrwt   - hdfs  supergroup          0 2014-07-31 16:31 /tmp
      drwxr-xr-x   - hdfs  supergroup          0 2014-07-11 12:06 /user
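
      The listing above works because the Hadoop client reads its default filesystem from its own configuration (the fs.defaultFS property in core-site.xml). Whether the Drill process sees that same configuration is a separate question and is only an assumption here. The sketch below shows one way to read the property from a core-site.xml document; the sample XML and the host in it are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Hypothetical core-site.xml content; a real file lives in the Hadoop client
# configuration directory and its values will differ.
SAMPLE_CORE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://namenode.example.com:8020</value>
  </property>
</configuration>
"""


def read_default_fs(core_site_xml: str) -> Optional[str]:
    """Return the fs.defaultFS value from core-site.xml text, or None if absent."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name") == "fs.defaultFS":
            value = prop.findtext("value")
            return value.strip() if value else None
    return None


if __name__ == "__main__":
    print(read_default_fs(SAMPLE_CORE_SITE))
```

      If the value found this way is a full hdfs://host:port URI, the same URI can be tried as the plugin's "connection" value in place of "hdfs:///".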
      

      Is there anything wrong with the storage plugin syntax for HDFS?
      If so, can the Drill log print more debug information to show why the update failed?
      Thanks.

              People

              • Assignee: Unassigned
              • Reporter: Hao Zhu (haozhu)
              • Votes: 0
              • Watchers: 3
