Bigtop / BIGTOP-1694

puppet: Make httpfs subscribe to core-site and hdfs-site

    Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.8.0
    • Fix Version/s: 1.0.0
    • Component/s: deployment
    • Labels:
      None

      Description

      When Puppet is applied to a cluster, the HttpFS service can start before the core-site.xml template has been filled in. HttpFS then falls back to the default value of fs.defaultFS, which is 'file:///', and serves the local file system instead of HDFS. If Hue is also installed on the cluster, its file browser shows the local file system instead of HDFS.
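
      To illustrate the fallback described above, here is a minimal Python sketch. It is not Hadoop's own configuration loader: the function name `effective_default_fs` is hypothetical, and it assumes an unfilled core-site.xml is simply a `<configuration>` element with no fs.defaultFS property.

```python
import xml.etree.ElementTree as ET

# Default used when fs.defaultFS is not set (the local file system).
LOCAL_FS_DEFAULT = "file:///"


def effective_default_fs(core_site_xml):
    """Return fs.defaultFS from a core-site.xml string, or the local default.

    Hypothetical helper for illustration only; it mimics the fallback
    behaviour, not Hadoop's actual Configuration class.
    """
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == "fs.defaultFS":
            value = (prop.findtext("value") or "").strip()
            if value:
                return value
    return LOCAL_FS_DEFAULT
```

      With an empty configuration the sketch returns the local file system URI, which is the situation HttpFS is in when it starts before the template is written.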

      1. BIGTOP-1694.1.patch
        2 kB
        Peter Slawski
      2. BIGTOP-1694.2.patch
        2 kB
        Peter Slawski

        Activity

        petersla Peter Slawski added a comment -

        The attached patch fixes this issue by making HttpFS subscribe to core-site.xml and hdfs-site.xml, since HttpFS reads configuration values from both files, as described in the HttpFS documentation.

        cos Konstantin Boudnik added a comment -

        I see that you removed some subscriptions. So they aren't required?

        petersla Peter Slawski added a comment -

        None of them are removed. The diff only looks that way because I added a line break after the subscription to File["/etc/hadoop-httpfs/conf/httpfs-site.xml"]:

        -      subscribe => [Package["hadoop-httpfs"], File["/etc/hadoop-httpfs/conf/httpfs-site.xml"], File["/etc/hadoop-httpfs/conf/httpfs-env.sh"], File["/etc/hadoop-httpfs/conf/httpfs-signature.secret"]],
        +      subscribe => [Package["hadoop-httpfs"], File["/etc/hadoop/conf/core-site.xml"], File["/etc/hadoop/conf/hdfs-site.xml"], File["/etc/hadoop-httpfs/conf/httpfs-site.xml"],
        +        File["/etc/hadoop-httpfs/conf/httpfs-env.sh"], File["/etc/hadoop-httpfs/conf/httpfs-signature.secret"]],
        
        cos Konstantin Boudnik added a comment -

        Oh, I see. I think it would be easier to read if the new files were added at the end, but this way works too. Presuming the patch has been tested (please confirm), I am tentatively +1 on it. Thanks!

        petersla Peter Slawski added a comment -

        Yes, here is the testing I did for this patch on the latest master branch. I'll attach a second patch with the new files added at the end. Thanks!

        Without the patch, HttpFS uses file:/// for fs.defaultFS:

        -bash-4.1$ cat /var/log/hadoop-httpfs/httpfs.log | grep Name
        2015-02-21 01:48:11,255  INFO HttpFSServerWebApp [][:]  Connects to Namenode [file:///]
        

        With the patch, HttpFS uses hdfs://:

        -bash-4.1$ cat /var/log/hadoop-httpfs/httpfs.log | grep Name
        2015-02-21 01:35:55,728  INFO HttpFSServerWebApp [][:]  Connects to Namenode [hdfs://ip-10-225-181-95.ec2.internal:8020]
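
        The log check above can also be scripted. The following is a minimal Python sketch, assuming the "Connects to Namenode [...]" line format shown in these two log excerpts; the function name `namenode_scheme` is hypothetical.

```python
import re
from urllib.parse import urlparse

# Matches the URI inside "Connects to Namenode [<uri>]" as seen in httpfs.log.
NAMENODE_RE = re.compile(r"Connects to Namenode \[([^\]]+)\]")


def namenode_scheme(log_line):
    """Return the URI scheme HttpFS reported for its file system, or None.

    Hypothetical helper: it only inspects one log line and assumes the
    message format shown in the excerpts above.
    """
    match = NAMENODE_RE.search(log_line)
    if match is None:
        return None
    return urlparse(match.group(1)).scheme
```

        A result of "file" indicates that HttpFS started with the default fs.defaultFS, while "hdfs" indicates that it picked up the value from core-site.xml.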
        

        With the patch, listing the root directory through HttpFS shows the HDFS files, matching the output of hadoop fs -ls /:

        -bash-4.1$ curl "http://localhost:14000/webhdfs/v1?op=liststatus&user.name=hadoop" | python -m json.tool | grep pathSuffix
          % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                         Dload  Upload   Total   Spent    Left  Speed
        100  1124    0  1124    0     0  19136      0 --:--:-- --:--:-- --:--:-- 19379
                        "pathSuffix": "benchmarks", 
                        "pathSuffix": "hbase", 
                        "pathSuffix": "solr", 
                        "pathSuffix": "tmp", 
                        "pathSuffix": "user", 
                        "pathSuffix": "var", 
        -bash-4.1$ hadoop fs -ls /
        Found 6 items
        drwxrwxrwx   - hdfs  hadoop          0 2015-02-21 01:37 /benchmarks
        drwxr-xr-x   - hbase hbase           0 2015-02-21 01:37 /hbase
        drwxr-xr-x   - solr  solr            0 2015-02-21 01:37 /solr
        drwxrwxrwt   - hdfs  hadoop          0 2015-02-21 01:41 /tmp
        drwxr-xr-x   - hdfs  hadoop          0 2015-02-21 01:41 /user
        drwxr-xr-x   - hdfs  hadoop          0 2015-02-21 01:37 /var
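
        The comparison between the two listings can be automated as well. This is a minimal Python sketch, assuming the standard WebHDFS LISTSTATUS response layout (a "FileStatuses" object holding a "FileStatus" array); the function names are hypothetical, and the JSON text would come from the curl call above.

```python
import json


def path_suffixes(liststatus_json):
    """Extract the pathSuffix of every entry in a LISTSTATUS response."""
    body = json.loads(liststatus_json)
    statuses = body["FileStatuses"]["FileStatus"]
    return [entry["pathSuffix"] for entry in statuses]


def matches_hdfs_listing(liststatus_json, hdfs_paths):
    """Check that HttpFS lists the same root entries as `hadoop fs -ls /`.

    hdfs_paths holds absolute paths such as "/tmp"; the leading slash is
    dropped before comparing, and ordering is ignored.
    """
    expected = sorted(path.lstrip("/") for path in hdfs_paths)
    return sorted(path_suffixes(liststatus_json)) == expected
```

        If HttpFS were still serving the local file system, the suffixes would be local directories and the comparison would fail.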
        
        cos Konstantin Boudnik added a comment -

        Thanks +1. I will commit it shortly!

        cos Konstantin Boudnik added a comment -

        Committed and pushed. Thanks Peter!


          People

          • Assignee:
            petersla Peter Slawski
            Reporter:
            petersla Peter Slawski