  1. Apache Drill
  2. DRILL-1290

Document Configuration Steps for Different Hadoop Distributions


Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: Future
    • Component/s: Client - HTTP
    • Labels: None
    • Environment:
      OS: CentOS 6.4
      HDFS: CDH5.1
      Drill: 0.4.0

    Description

      In the web GUI, I can successfully create a new storage plugin named "myhdfs" using "file:///":

      {
        "type": "file",
        "enabled": true,
        "connection": "file:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
      

      However, if I change "file:///" to "hdfs:///" so that the plugin points to HDFS rather than the local file system, the update fails and the Drill log reports "[qtp416200645-67] DEBUG o.a.d.e.server.rest.StorageResources - Unable to create/ update plugin: myhdfs".

      {
        "type": "file",
        "enabled": true,
        "connection": "hdfs:///",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          },
          "tmp": {
            "location": "/tmp",
            "writable": true,
            "storageformat": "csv"
          }
        },
        "formats": {
          "psv": {
            "type": "text",
            "extensions": [
              "tbl"
            ],
            "delimiter": "|"
          },
          "csv": {
            "type": "text",
            "extensions": [
              "csv"
            ],
            "delimiter": ","
          },
          "tsv": {
            "type": "text",
            "extensions": [
              "tsv"
            ],
            "delimiter": "\t"
          },
          "parquet": {
            "type": "parquet"
          },
          "json": {
            "type": "json"
          }
        }
      }
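
      One variation that may be worth trying, offered as a sketch and not as a confirmed fix: "hdfs:///" names no NameNode, so the Hadoop client has to resolve it from fs.defaultFS in a core-site.xml that the drillbit can see. Writing the NameNode into the connection string removes that dependency. The host namenode.example.com and port 8020 below are placeholders, not values taken from this cluster, and the block is abbreviated: the remaining workspaces and formats would stay as in the configuration above.

      ```json
      {
        "type": "file",
        "enabled": true,
        "connection": "hdfs://namenode.example.com:8020/",
        "workspaces": {
          "root": {
            "location": "/",
            "writable": false,
            "storageformat": null
          }
        }
      }
      ```

      If the explicit URI is accepted while "hdfs:///" is not, that would suggest the drillbit is not picking up the cluster's Hadoop client configuration.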
      

      My cluster runs CDH5 HDFS, and all client configurations are valid. For example, on the drillbit server:

      [root@hdm ~]# hdfs dfs -ls /
      Found 3 items
      drwxr-xr-x   - hbase hbase               0 2014-08-04 22:55 /hbase
      drwxrwxrwt   - hdfs  supergroup          0 2014-07-31 16:31 /tmp
      drwxr-xr-x   - hdfs  supergroup          0 2014-07-11 12:06 /user
      

      Is there anything wrong with the storage plugin syntax for HDFS?
      If so, can the Drill log print more debug information showing why the update failed?
      Thanks.


          People

            Assignee: Unassigned
            Reporter: Hao Zhu (haozhu)
