Apache Drill / DRILL-1075

Cannot create hdfs as connection type in storage engine: server throws HTTP 500 error

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Critical
    • Resolution: Invalid
    • Affects Version/s: None
    • Fix Version/s: 0.4.0
    • Component/s: Client - HTTP
    • Labels: None

      Description

      The server at port 8047 throws:
      HTTP ERROR 500
      Problem accessing /storage/config/update. Reason:
      Request failed.

      Config file:

      {
        "type" : "file",
        "enabled" : true,
        "connection" : "hdfs:///",
        "workspaces" : {
          "root" : { "location" : "/", "writable" : false, "storageformat" : null },
          "default" : { "location" : "/user/root", "writable" : true, "storageformat" : null },
          "tmp" : { "location" : "/tmp", "writable" : true, "storageformat" : "csv" }
        },
        "formats" : {
          "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" },
          "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," },
          "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" },
          "parquet" : { "type" : "parquet" },
          "json" : { "type" : "json" }
        }
      }

        Activity

        Vivian Summers added a comment -

        git.commit.id.abbrev=79c1502
        Jacques Nadeau added a comment -

        It looks like you have forgotten to put in your namenode host and port. Also, please provide the exception logs from the server when you see errors like this. Thanks.
        Vivian Summers added a comment -

        I've tried hdfs://cent64:8020 and hdfs://cent64:50070 as well, but I get the same error. Nothing is written to drillbit.log or drillbit.out. It only works if I use file:///. I also tried maprfs:///, which throws the same error. The node is running CDH5. Tests work fine using file:/// as the storage engine.
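        A quick way to isolate a problem like this is to test the namenode URI with the stock Hadoop client, outside Drill. The sketch below is a minimal check assuming the hdfs://cent64:8020 URI from the comment above; the HdfsUriCheck class name is hypothetical. Note that 50070 is normally the namenode's HTTP port, while hdfs:// connection URIs need the RPC port (typically 8020).

        import java.net.URI;

        import org.apache.hadoop.conf.Configuration;
        import org.apache.hadoop.fs.FileStatus;
        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.fs.Path;

        // Minimal sketch: verify that the HDFS URI used in the storage plugin
        // is reachable with the plain Hadoop client, independently of Drill.
        public class HdfsUriCheck {
          public static void main(String[] args) throws Exception {
            URI uri = URI.create("hdfs://cent64:8020/"); // URI from the comment above
            Configuration conf = new Configuration();
            try (FileSystem fs = FileSystem.get(uri, conf)) {
              // Listing "/" forces an RPC to the namenode; if the host or port is
              // wrong (e.g. the HTTP port 50070 instead of the RPC port 8020),
              // this call fails with a connection or protocol error.
              for (FileStatus s : fs.listStatus(new Path("/"))) {
                System.out.println(s.getPath());
              }
            }
          }
        }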
        Cliff Buchanan added a comment -

        Assigning to Sudheesh for further analysis.
        Amit Katti added a comment - edited

        I just finished testing Drill on vanilla Hadoop, CDH4, and CDH5, and it works successfully.

        For Drill to work on Hadoop, you need to provide the connection as "connection" : "hdfs://10.10.30.156:8020/".
        The storage plugin for dfs is below:

        {
          "type" : "file",
          "enabled" : true,
          "connection" : "hdfs://10.10.30.156:8020/",
          "workspaces" : {
            "root" : {
              "location" : "/user/root/drill",
              "writable" : true,
              "storageformat" : "null"
            },
            "tmp" : {
              "location" : "/tmp",
              "writable" : true,
              "storageformat" : "csv"
            },
            "drillTestDir" : {
              "location" : "/drill/testdata/",
              "writable" : false,
              "storageformat" : "parquet"
            }
          },
          "formats" : {
            "psv" : {
              "type" : "text",
              "extensions" : [ "tbl" ],
              "delimiter" : "|"
            },
            "csv" : {
              "type" : "text",
              "extensions" : [ "csv" ],
              "delimiter" : ","
            },
            "tsv" : {
              "type" : "text",
              "extensions" : [ "tsv" ],
              "delimiter" : "t"
            },
            "parquet" : {
              "type" : "parquet"
            },
            "json" : {
              "type" : "json"
            }
          }
        }
        

        For this to work, you need the jars listed below on the Drill classpath (locations may vary by installation):
        /opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop/hadoop-annotations-2.0.0-cdh4.7.0.jar
        /opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop/hadoop-auth-2.0.0-cdh4.7.0.jar
        /opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop/hadoop-common-2.0.0-cdh4.7.0.jar
        /opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop-hdfs/hadoop-hdfs-2.0.0-cdh4.7.0.jar
        /opt/cloudera/parcels/CDH-4.7.0-1.cdh4.7.0.p0.40/lib/hadoop-mapreduce/hadoop-mapreduce-client-core-2.0.0-cdh4.7.0.jar

        There is a separate Drill JIRA open to ensure this is configured correctly: DRILL-1160
        https://issues.apache.org/jira/browse/DRILL-1160

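        Since classpath mismatches are the usual cause here, a minimal sketch for confirming which Hadoop client Drill actually loaded follows; VersionInfo is part of hadoop-common, and the HadoopClasspathCheck class name is hypothetical.

        import java.security.CodeSource;

        import org.apache.hadoop.fs.FileSystem;
        import org.apache.hadoop.util.VersionInfo;

        // Minimal sketch: print the Hadoop client version and the jar that
        // FileSystem was loaded from, to confirm the CDH jars listed above
        // actually won on the classpath.
        public class HadoopClasspathCheck {
          public static void main(String[] args) {
            System.out.println("Hadoop version: " + VersionInfo.getVersion());
            CodeSource src = FileSystem.class.getProtectionDomain().getCodeSource();
            System.out.println("FileSystem loaded from: "
                + (src == null ? "bootstrap" : src.getLocation()));
          }
        }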
        Amit Katti added a comment -

        You also need to point to the correct ZooKeeper in drill-override.conf:

        drill.exec: {
          cluster-id: "working_cdh_drill",
          zk.connect: "10.10.30.156:2181"
        }
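        As with the HDFS URI, the ZooKeeper address can be verified outside Drill. Below is a minimal sketch using the standard ZooKeeper client; the connect string is the one from the comment above, and the ZkCheck class name is hypothetical.

        import java.util.concurrent.CountDownLatch;
        import java.util.concurrent.TimeUnit;

        import org.apache.zookeeper.Watcher;
        import org.apache.zookeeper.ZooKeeper;

        // Minimal sketch: confirm the zk.connect address from drill-override.conf
        // is reachable before starting the drillbit.
        public class ZkCheck {
          public static void main(String[] args) throws Exception {
            CountDownLatch connected = new CountDownLatch(1);
            ZooKeeper zk = new ZooKeeper("10.10.30.156:2181", 5000, event -> {
              if (event.getState() == Watcher.Event.KeeperState.SyncConnected) {
                connected.countDown();
              }
            });
            try {
              if (connected.await(5, TimeUnit.SECONDS)) {
                System.out.println("ZooKeeper reachable");
              } else {
                System.out.println("Timed out connecting to ZooKeeper");
              }
            } finally {
              zk.close();
            }
          }
        }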
        Jacques Nadeau added a comment -

        Looks like you just have to get the configuration right.
        petty zhang added a comment -

        Hi,

        I've also encountered a similar issue with the hdfs storage engine: Drill cannot read a table stored in HDFS.

        Version info:
        Drill 0.7.0
        Apache Hadoop 2.5.0

        Does Drill 0.7.0 support HDFS on Apache Hadoop 2.5.0?

        Thanks
        Hema Kumar S added a comment - edited

        Amit Katti
        I'm using CDH 4.5 and Drill 0.7.0, and I'm trying to store a file in HDFS (using CREATE TABLE).
        If I add the CDH jars to the Drill classpath, I get the error below:
        java.lang.UnsupportedOperationException: This is supposed to be overridden by subclasses.

        at com.google.protobuf.GeneratedMessage.getUnknownFields(GeneratedMessage.java:180) ~[protobuf-java-2.5.0.jar:na]
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$GetFileInfoRequestProto.getSerializedSize(ClientNamenodeProtocolProtos.java:30108) ~[hadoop-hdfs-2.0.0-cdh4.5.0.jar:na]
        at com.google.protobuf.AbstractMessageLite.toByteString(AbstractMessageLite.java:49) ~[protobuf-java-2.5.0.jar:na]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.constructRpcRequest(ProtobufRpcEngine.java:149) ~[hadoop-common-2.0.0-cdh4.5.0.jar:na]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:193) ~[hadoop-common-2.0.0-cdh4.5.0.jar:na]
        at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source) ~[na:na]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_51]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_51]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_51]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_51]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164) ~[hadoop-common-2.0.0-cdh4.5.0.jar:na]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83) ~[hadoop-common-2.0.0-cdh4.5.0.jar:na]
        at com.sun.proxy.$Proxy33.getFileInfo(Unknown Source) ~[na:na]
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:629) ~[hadoop-hdfs-2.0.0-cdh4.5.0.jar:na]
        at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:1545) ~[hadoop-hdfs-2.0.0-cdh4.5.0.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:820) ~[hadoop-hdfs-2.0.0-cdh4.5.0.jar:na]

        If I don't add the CDH jars to the Drill classpath, I get the error below:

        ERROR o.a.d.e.s.text.DrillTextRecordWriter - Unable to create file: /tmp/table/1_13_0.csv

        java.io.IOException: Failed on local exception: java.nio.channels.ClosedByInterruptException; Host Details : local host is: "**********"; destination host is: "*********":8020;
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.ipc.Client.call(Client.java:1414) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.ipc.Client.call(Client.java:1363) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206) ~[hadoop-common-2.4.1.jar:na]
        at com.sun.proxy.$Proxy38.create(Unknown Source) ~[na:na]
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.7.0_51]
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) ~[na:1.7.0_51]
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.7.0_51]
        at java.lang.reflect.Method.invoke(Method.java:606) ~[na:1.7.0_51]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:190) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:103) ~[hadoop-common-2.4.1.jar:na]
        at com.sun.proxy.$Proxy38.create(Unknown Source) ~[na:na]
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:258) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1600) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1465) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1390) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:394) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem$6.doCall(DistributedFileSystem.java:390) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:390) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:334) ~[hadoop-hdfs-2.4.1.jar:na]
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:906) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:887) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:784) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:773) ~[hadoop-common-2.4.1.jar:na]
        at org.apache.drill.exec.store.text.DrillTextRecordWriter.startNewSchema(DrillTextRecordWriter.java:81) ~[drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
        at org.apache.drill.exec.store.StringOutputRecordWriter.updateSchema(StringOutputRecordWriter.java:57) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
        at org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema(WriterRecordBatch.java:162) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]
        at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:113) [drill-java-exec-0.7.0-rebuffed.jar:0.7.0]

        Does Drill 0.7.0 support HDFS storage in CDH 4.5?

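        The first stack trace above is the typical symptom of mixed protobuf generations on one classpath: protobuf-java 2.5.x loaded alongside Hadoop classes generated against protoc 2.4.x (as CDH4's Hadoop was). A minimal sketch for pinpointing which jars the conflicting classes actually came from follows; the MixedJarCheck class name is hypothetical.

        import java.security.CodeSource;

        // Minimal sketch: print where each class in the conflicting pair was
        // loaded from; the jar locations usually identify the mismatch.
        public class MixedJarCheck {
          public static void main(String[] args) throws Exception {
            printOrigin("com.google.protobuf.GeneratedMessage");
            printOrigin("org.apache.hadoop.hdfs.DistributedFileSystem");
          }

          static void printOrigin(String className) throws ClassNotFoundException {
            CodeSource src = Class.forName(className).getProtectionDomain().getCodeSource();
            System.out.println(className + " -> "
                + (src == null ? "bootstrap" : src.getLocation()));
          }
        }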

          People

          • Assignee: Jacques Nadeau
          • Reporter: Vivian Summers
          • Votes: 0
          • Watchers: 7

          Dates

          • Created:
          • Updated:
          • Resolved: