Uploaded image for project: 'Apache Drill'
  1. Apache Drill
  2. DRILL-1290

Document Configuration Steps for Different Hadoop Distributions

    XMLWordPrintableJSON

Details

    • New Feature
    • Status: Open
    • Major
    • Resolution: Unresolved
    • None
    • Future
    • Client - HTTP
    • None
    • OS: Centos 6.4
      HDFS: CDH5.1
      Drill: 0.4.0

    Description

      In web GUI, I can successfully create a new storage plugin named "myhdfs" using "file:///":

      {
        "type": "file",
        "enabled": true,
        "connection": "file:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
      

      However if I try to change "file:///" to "hdfs:///" to point to HDFS other than local file system, drill log errors out "[qtp416200645-67] DEBUG o.a.d.e.server.rest.StorageResources - Unable to create/ update plugin: myhdfs".

      {
        "type": "file",
        "enabled": true,
        "connection": "hdfs:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
      

      On my cluster, I am using CDH5 hdfs, and it all client configurations are valid. For example, on the drillbit server:

      [root@hdm ~]# hdfs dfs -ls /
      Found 3 items
      drwxr-xr-x   - hbase hbase               0 2014-08-04 22:55 /hbase
      drwxrwxrwt   - hdfs  supergroup          0 2014-07-31 16:31 /tmp
      drwxr-xr-x   - hdfs  supergroup          0 2014-07-11 12:06 /user
      

      Is there anything wrong with the storage plugin syntax for HDFS?
      If so, can drill log prints more debug info to show the reason why it failed?
      Thanks.

      Attachments

        Issue Links

          Activity

            People

              Unassigned Unassigned
              haozhu Hao Zhu
              Votes:
              0 Vote for this issue
              Watchers:
              4 Start watching this issue

              Dates

                Created:
                Updated: