Details

    • Type: Bug
    • Status: Open
    • Priority: Major
    • Resolution: Unresolved
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None

      Description

      Yo Dawg, I heard Hadoop 2.3.0 included an NFS service, so Bigtop 0.8.0 should also contain an NFS service, so you can mount stuff while you mount stuff.

        Issue Links

          Activity

          Sean Mackrory added a comment -

          Attaching an initial patch. There are a couple of things I still intend to do:

          • Test that the Debian packages can be installed and run.
          • On CentOS 6 you have to install nfs-utils and nfs-utils-libs. I need to determine which packages provide these dependencies on other distros.
          • This will conflict with the system's NFS service by default, but I don't want to declare a package conflict, since the two can be configured to coexist on different ports.
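To illustrate the port clash: the gateway's portmap, mountd and nfs3 daemons bind the standard NFS ports (111, 4242 and 2049 by default), so on a stock CentOS 6 box the system services have to be stopped first, or one side moved to other ports. A rough sketch (the `hadoop-hdfs-portmap`/`hadoop-hdfs-nfs3` init script names are hypothetical; the eventual Bigtop package names may differ):

```shell
# Free the standard NFS ports for the HDFS gateway (run as root).
service nfs stop        # stop the kernel NFS server (port 2049)
service rpcbind stop    # free port 111 for the gateway's portmap
# Start the gateway daemons (hypothetical Bigtop service names):
service hadoop-hdfs-portmap start
service hadoop-hdfs-nfs3 start
```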
          Sean Mackrory added a comment -

          So on SUSE you just need nfs-utils, and on Ubuntu you just need nfs-common. I'll add the appropriate package dependencies in a subsequent patch. The init script's use of hadoop-daemon.sh also needs to be changed, since that script now emits a deprecation warning when used to invoke hdfs. To make this work, the following configuration is needed:

          NameNode's core-site.xml:

          <property>
             <name>hadoop.proxyuser.hdfs.groups</name>
             <value>*</value>
             <description>
               Set this to '*' to allow the gateway user to proxy any group.
             </description>
          </property>
          <property>
              <name>hadoop.proxyuser.hdfs.hosts</name>
              <value>*</value>
              <description>
               Set this to '*' to allow requests from any hosts to be proxied.
              </description>
          </property>
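The wildcards above allow the hdfs user to proxy any group from any host. The same proxyuser keys also accept comma-separated lists, so the scope can be narrowed; a sketch, assuming a group named nfs-users and a gateway host named nfsgw.example.com (both hypothetical):

```xml
<!-- Hypothetical narrowed proxyuser config: only members of the
     nfs-users group, proxied only from the gateway host itself. -->
<property>
  <name>hadoop.proxyuser.hdfs.groups</name>
  <value>nfs-users</value>
</property>
<property>
  <name>hadoop.proxyuser.hdfs.hosts</name>
  <value>nfsgw.example.com</value>
</property>
```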
          

          NameNode's hdfs-site.xml:

          <property>
              <name>dfs.namenode.accesstime.precision</name>
              <value>3600000</value>
              <description>Access time for an HDFS file is precise up to this value. Default value is 1 hour. 0 disables access times for HDFS.</description>
          </property>
          

          NFS Gateway's hdfs-site.xml:

          <property>
            <name>dfs.nfs3.dump.dir</name>
            <value>/tmp/.hdfs-nfs</value>
            <description>Needs to have enough space to buffer all out-of-sequence writes</description>
          </property>
          

          And once the servers are started you can mount the filesystem with

          mount -t nfs -o vers=3,proto=tcp,nolock <hostname>:/ /hdfs_nfs_mount
          
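Before running the mount command above, it's worth checking that the gateway's RPC services are actually registered; a quick verification sketch, run on the gateway host:

```shell
# Confirm the gateway registered its services with the portmapper:
rpcinfo -p localhost     # should list portmapper, mountd and nfs entries
showmount -e localhost   # should show "/" as the exported filesystem
# Create the mount point if it doesn't exist yet:
mkdir -p /hdfs_nfs_mount
```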
          jay vyas added a comment -

          There is also the fuse mount. What's the difference between the two, and what is the preferred way of mounting HDFS?

          Sean Mackrory added a comment -

          I am aware of the fuse mount, although I don't know much about how it works internally. I suspect the fuse mount is going to be a bit less sophisticated than the NFS mount - usually when I hear about the former it's with negative overtones. Which one you prefer probably depends on your use case. Both are included in a Certain Distribution of Hadoop - I'm just backporting this so that it's there for those who want to use it, now that Bigtop is moving to 2.3.0, which includes this feature.

          Mark Grover added a comment -

          Jay, fuse integration with Hadoop does have a few limitations. The design doc for HDFS's NFS native integration has more details on it.
          You can refer to the design doc at https://issues.apache.org/jira/secure/attachment/12580444/HADOOP-NFS-Proposal.pdf
          and the relevant umbrella JIRA at HDFS-4750

          jay vyas added a comment -

          Looks like there is renewed interest in NFS packaging for HDFS. Yes, fuse is limited. Looking forward to seeing this. I can test a patch if anyone has one.

          Roman Shaposhnik added a comment -

          Sean Mackrory any chance you can finish this up for the DEB side? Would be really cool to have this functionality in 0.9.0


            People

            • Assignee:
              Sean Mackrory
              Reporter:
              Sean Mackrory
            • Votes:
              0
              Watchers:
              4
