Hadoop HDFS / HDFS-7488

HDFS Windows CIFS Gateway

Details

    • Type: New Feature
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: 2.4.0
    • Fix Version/s: None
    • Component/s: None
    • Labels: None
    • Environment: HDP 2.1

    Description

      Stakeholders are pressing for native Windows file share access to our Hadoop clusters.

      I've used the NFS gateway several times, and while it's theoretically viable for users now that UID mapping is implemented in 2.5, insecure NFS makes the security of our fully Kerberized clusters pointless.

      We really need a CIFS gateway to enforce authentication, which NFSv3 doesn't do (would NFSv4?).

      I've even tried Samba over an NFS gateway loopback mount point (don't laugh - they want it that badly), and set the HDFS atime precision to an hour to prevent FSNamesystem.setTimes() Java exceptions in the gateway logs, but the NFS server still doesn't like the Windows CIFS client's actions:

      2014-12-08 16:31:38,053 ERROR nfs3.RpcProgramNfs3 (RpcProgramNfs3.java:setattr(346)) - Setting file size is not supported when setattr, fileId: 25597
      2014-12-08 16:31:38,065 INFO  nfs3.WriteManager (WriteManager.java:handleWrite(136)) - No opened stream for fileId:25597
      2014-12-08 16:31:38,122 INFO  nfs3.OpenFileCtx (OpenFileCtx.java:receivedNewWriteInternal(624)) - Have to change stable write to unstable write:FILE_SYNC
      

      A debug trace of the Samba server shows it trying to set metadata timestamps, which hangs indefinitely and results in a zero-byte file being created when copying a file into HDFS /tmp via the Windows mapped drive.

      ...
       smb_set_file_time: setting utimes to modified values.
      file_ntime: actime: Thu Jan  1 01:00:00 1970
      file_ntime: modtime: Mon Dec  8 16:31:38 2014
      file_ntime: ctime: Thu Jan  1 01:00:00 1970
      file_ntime: createtime: Thu Jan  1 01:00:00 1970
      

      This is the traceback from the NFS gateway log when the HDFS access time precision (dfs.namenode.accesstime.precision) was set to 0:

      org.apache.hadoop.ipc.RemoteException(java.io.IOException): Access time for hdfs is not configured.  Please set dfs.namenode.accesstime.precision configuration parameter.
              at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimes(FSNamesystem.java:1960)
              at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setTimes(NameNodeRpcServer.java:950)
              at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setTimes(ClientNamenodeProtocolServerSideTranslatorPB.java:833)
              at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
              at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
              at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
              at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
              at java.security.AccessController.doPrivileged(Native Method)
              at javax.security.auth.Subject.doAs(Subject.java:415)
              at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1594)
              at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
      ...
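
      For reference, the setting named in the exception above lives in hdfs-site.xml on the NameNode. A minimal sketch of the one-hour precision described earlier: the value is in milliseconds, 3600000 is also the stock default, and 0 disables access times, which is what produces the "Access time for hdfs is not configured" error. This only avoids the setTimes() exception; it does not address the setattr and write errors in the NFS gateway log.

      ```xml
      <!-- hdfs-site.xml (NameNode): sketch only, adjust to your deployment -->
      <property>
        <name>dfs.namenode.accesstime.precision</name>
        <!-- one hour, in milliseconds; 0 turns access time tracking off -->
        <value>3600000</value>
      </property>
      ```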
      

      Regards,

      Hari Sekhon
      http://www.linkedin.com/in/harisekhon

          People

            Assignee: Unassigned
            Reporter: Hari Sekhon (harisekhon)
            Votes: 0
            Watchers: 10