  1. Bigtop
  2. BIGTOP-952

init-hdfs.sh is dog slow. Let's replace it with direct HDFS API calls and better layout management

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Blocker
    • Resolution: Fixed
    • Affects Version/s: 0.5.0
    • Fix Version/s: 0.8.0
    • Component/s: deployment
    • Labels:
      None

      Description

      As proposed in this patch by Roman Shaposhnik, there's a very efficient way of creating the layout in HDFS: a tar file plus a Groovy script that calls directly into the DFS APIs.

      Let's make it happen.

      1. BIGTOP-952.patch
        13 kB
        jay vyas
      2. BIGTOP-952.patch
        13 kB
        jay vyas
      3. BIGTOP-952-tested.patch
        5 kB
        jay vyas
      4. BIGTOP-952-tested-refined.patch
        6 kB
        jay vyas
      5. provision2.groovy
        1 kB
        jay vyas
      6. untar.groovy
        2 kB
        Konstantin Boudnik

        Issue Links

          Activity

          cos Konstantin Boudnik added a comment -

          reattaching the script to this ticket.

          bmahe Bruno Mahé added a comment -

          Nice!
          Some comments though:

          • There is no main method
          • There is no license header
          • Arguments are not checked

          Also, can we link this ticket to the ticket in which this patch was first proposed? I forgot the ticket number.

          cos Konstantin Boudnik added a comment -

          Bruno, this is a groovy script, not a class/program: it really doesn't need a main method.
          Also, it needs to be improved upon - the initial patch is just a proof of concept.

          bmahe Bruno Mahé added a comment -

          Main methods are good practices. See Python/Perl for examples.

          cos Konstantin Boudnik added a comment -

          It is highly ironic that you mentioned Python, Perl and "good practices" in a single sentence

          jayunit100 jay vyas added a comment -

          It's good to move away from bash, but I think the tarball approach will reduce the usability of the unified HCFS semantics provided in init-hdfs.sh.

          Thus, as of the latter patch in BIGTOP-1200, we've defined a JSON text file which ANY DFS initializer can easily parse and use.

          Any chance we can re-roll this patch to use that JSON file rather than a tarball?

          cos Konstantin Boudnik added a comment -

          ok, bigtop-groovy package is in now - we can work on this and BIGTOP-1200

          rvs Roman Shaposhnik added a comment -

          Konstantin Boudnik & jay vyas indeed! Let's make this one our first Bigtop groovy script that is intended for general-purpose use. I suggest that we keep this JIRA purely for implementing a script that can initialize the state of HDFS based on some kind of a model (maybe even different ones like tar, content of the local file tree, etc.) and define the policy in BIGTOP-1200.

          Sounds good?

          jayunit100 jay vyas added a comment - - edited

          sure Roman Shaposhnik... +1 for just getting this patch in to use some kind of model.... no need to couple it to any particular implementation. If you want to push it out as is, I can do a second patch which updates it to use whatever we agree on in BIGTOP-1200 .

          (created follow up JIRA: https://issues.apache.org/jira/browse/BIGTOP-1210)

          jayunit100 jay vyas added a comment -

          Can somebody suggest how I can glue in groovy-based provisioners to replace the otherwise easy-to-run init-hdfs.sh in the provisioner? In particular, it's not clear to me

          • how do we run groovy scripts in a bigtop runtime?
          • whether or not the groovy script needs to be a jar to run (I hope it doesn't, that sort of defeats the purpose of a script)?
          • are there any Java-based provisioning actions which I can piggyback this onto?

          If so, I can lend a hand on this JIRA, I think.

          jayunit100 jay vyas added a comment -

          Well, after some more thought, I guess it doesn't need to be a jar. I might be overthinking this.

          I guess we can just do this, and then call this groovy file in the same place that we used to call init-hdfs.sh.

          #!/bin/bash                                                       
          export HADOOP_CONF_DIR=/etc/hadoop/conf/                                                                                                          
          //usr/bin/env groovy -cp /usr/lib/hadoop/lib/*jar "$0" $@; exit $?
          
          //.... romans code goes here, 
          
          

          right?

          cos Konstantin Boudnik added a comment -

          Jay,

          as we have a groovy runtime now, you should be able to use a groovy shebang directly, e.g.

          #!/usr/lib/bigtop-groovy/bin/groovy
          
          println "let's start with something fun!"
          // etc...
          
          jayunit100 jay vyas added a comment -

          Okay, made some progress on this. Here's a snippet of the WIP code for provisioning from JSON (FYI, I have to switch the {'s to ['s in the BIGTOP-1200 file):

          import groovy.json.JsonSlurper;
          import java.io.FileNotFoundException;
          import java.io.FileReader;
          import java.util.List;
          import java.util.Map;
          import org.apache.hadoop.conf.Configuration;
          import org.apache.hadoop.fs.permission.FsPermission;
          import org.apache.hadoop.fs.FileSystem;
          import org.apache.hadoop.fs.Path;
          import java.io.BufferedReader;
          import java.io.InputStreamReader;
          import java.io.OutputStreamWriter;
          import java.io.Writer;
          
          def uri = "hdfs://"
          
          def v = new JsonSlurper();
          
          def jsonParser = new JsonSlurper();
          
          def json = "init-hcfs.json";
          
          def parsedData = jsonParser.parse( new FileReader(json));
          
          def dirs = (List) parsedData.get("dir");
          
          Configuration conf = new Configuration();
          //Configuration.dumpConfiguration(conf, new java.io.PrintWriter(System.out));
          
          FileSystem fs = FileSystem.get(conf);
          System.out.println(fs.getClass());
          
          System.out.println("DIRS : " + dirs + " DIRSSSS");
          dirs.each() {
            println(it);
            name=it[0];
            mode=it[1];
            user=it[2];
            group=it[3];
          
            Path file = new Path(uri+name);
            System.out.println("mkdirs " + name + " " + mode);
            System.out.println("Owner " + name + " " + user + " " + group);
            // mode is an octal string such as "1777", hence radix 8
            //fs.mkdirs(file, new FsPermission(Short.parseShort(mode, 8)));
            //fs.setOwner(file, user, group);
          
          }
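
          A note on the commented-out mkdirs call: mode strings such as "1777" or "755" (they show up in the provisioning output further down this ticket) read as octal permissions, so parsing them with the default radix 10 would give the wrong bits to a constructor such as Hadoop's FsPermission(short). A minimal sketch of the conversion in plain Java follows; the class and method names are hypothetical and are not taken from the attached patches.

```java
// Hypothetical helper for illustration only; not part of the patch.
final class ModeParser {
    private ModeParser() {}

    /**
     * Converts a permission string such as "1777" or "755" into the
     * numeric bits it denotes, reading the digits as octal.
     * Returns null when the layout entry leaves the mode unset.
     */
    static Short parseMode(String mode) {
        if (mode == null) {
            return null; // caller keeps the file system default
        }
        return Short.parseShort(mode, 8); // radix 8, not the default 10
    }

    public static void main(String[] args) {
        System.out.println(parseMode("1777")); // octal 1777 is decimal 1023
        System.out.println(parseMode("755"));  // octal 755 is decimal 493
        System.out.println(parseMode(null));
    }
}
```

          Returning null for a missing mode is just one possible convention; it lets the caller skip the permission argument for entries like /hbase that carry no mode.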
          
          cos Konstantin Boudnik added a comment -

          I think you can simplify

          dirs.each() {
            println(it);
            name=it[0];
            mode=it[1];
            user=it[2];
            group=it[3];
          

          to something like (not tested)

          def (name, mode, user, group) = it
          
          jayunit100 jay vyas added a comment - - edited

          thanks cos, will do.

          Is it okay if we keep this JIRA as a pure groovy file system provisioning one, and

          • leave out the oozie stuff that init-hdfs.sh still contains?
          • avoid deleting init-hdfs.sh for now, as refactoring deployment is really a separate task?

          Then we can tie it all together in a separate JIRA (BIGTOP-1235), which I think we should very carefully inject into the code base, as it will completely change bigtop deployment.

          jayunit100 jay vyas added a comment -

          attached is the latest pass (provision2.groovy). I've spot tested it and it appears to create the dirs correctly (/tmp/, /user/hive, ...).

          Any more feedback before I formalize patches for BIGTOP-1200 and this one?

          jayunit100 jay vyas added a comment -

          Tested patch which uses BIGTOP-1200 for provisioning

          jayunit100 jay vyas added a comment -

          Refined the patch again.
          This is now ready for review !

          (Excuse the differently named patches: it helps me keep them straight in my head. I hope it's not too annoying or causing issues.)

          cos Konstantin Boudnik added a comment -

          Jay, a few comments:

          • I don't see the groovy file belonging in the puppet directory. That's the wrong place for it. Where do you plan to call it from? Similarly to init-hdfs.sh? I think we need to move it to bigtop-utils or something. Does anyone have a different opinion?
          • the file is missing the ASL boilerplate
          • is this a script or a compilable class? If the former, then use the correct shebang to point to Bigtop's groovy interpreter
          • don't use printlns for logging
          • I don't like the fact that the classpath needs to be constructed manually. Is there a better way?
          • if(! args.length == 1) { should look like if (args.length != 1) {
          • def v = new JsonSlurper(); isn't used
          • there's a bunch of unused imports
          • lines shouldn't be longer than 80 symbols. And they definitely shouldn't be 669 chars long
          • else if should be on the same line as '}' in here
            }
            
            else if(! new File(args[0]).exists()) { 
            
          • what is exit 1;? Are you referring to System.exit? Please be explicit
          • be consistent with indentation: it should be 2 spaces, and 4 for continuation lines. This one is wrong:
            dirs.each() { 
                  System.out.println("here " + it);
            
          • what's the point of keeping commented-out lines?
          • hard-coded paths like "/usr/lib/hadoop-mapreduce/" should be declared as named constants
          • use safe Groovy casting in cases like def dirs = (List) parsedData.get("dir"); It can be replaced with parsedData.get("dir") as List
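
          The argument-handling points in the list above (the negated length check, the else-if placement and the explicit System.exit) can be sketched together. Because ! binds more tightly than ==, the original condition negates args.length before comparing it with 1, so it does not test what it appears to. The following is a plain-Java illustration only; the class name, the messages and the usage string are hypothetical and not taken from the patch.

```java
import java.io.File;

// Hypothetical illustration of the argument checks discussed in review.
final class ArgCheck {
    private ArgCheck() {}

    /** Returns null when the arguments are usable, else an error message. */
    static String validate(String[] args) {
        if (args.length != 1) {
            return "Usage: provision <layout-file>";
        } else if (!new File(args[0]).exists()) {
            return "Layout file not found: " + args[0];
        }
        return null;
    }

    public static void main(String[] args) {
        String error = validate(args);
        if (error != null) {
            System.err.println(error);
            System.exit(1); // explicit call rather than a bare "exit 1"
        }
        System.out.println("Using layout file " + args[0]);
    }
}
```

          Keeping the checks in a method that returns a message, instead of exiting inside the check, makes them easy to exercise without terminating the JVM.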
          jayunit100 jay vyas added a comment -

          Hi cos: some questions before I resubmit. FYI, the formatting should be much better this time, since I now have a dev setup that does hadoop-style formatting properly.

          • okay, I'll move it to bigtop-utils, and yes, there is definitely some more cleanup I can do.
          • I think it can be a script: I can add the groovy shebang.

          Now a question for you:

          • the manual classpath setting is just an example of how to invoke it. How do you feel if we wait to decide on the exact right way we invoke for provisioning later on, in BIGTOP-1235? I've intentionally created that JIRA so that we can get this functional utility in as a first pass, and then refine it with a more global view of the system later.
          bmahe Bruno Mahé added a comment -

          Looked at the script, and in addition to Cos's comments:

          • Can you split the code into functions? At least a main, makeUser, oozieStuff...
          • I see a mix of log and println statements
          cos Konstantin Boudnik added a comment -

          "decide on the exact right way we invoke for provisioning later on, in BIGTOP-1235?"

          sounds good to me.

          jayunit100 jay vyas added a comment - - edited

          Okay cos/bruno. I've responded to everything, and here's what I've got so far:

          (There might be some syntax errors etc., as this is an untested code block, just a quick rewrite incorporating suggestions.)

          https://gist.github.com/jayunit100/9401503

          It's untested, so don't bother analyzing the functional logic too much just yet, as I will confirm it in VMs myself when I submit the official patch.

          Any initial thoughts? I'm about to put up a patch and officially test it later tonight.

          ***************************
          Responses/fixes inline:

          • bruno: I split up stuff into methods etc. Probably a good idea, it was getting complex.
          • in the process I also had to make methods return things; I personally hate methods that work on global variables. The overall result is cleaner and more modular code, so I guess it was worth it, even though the script was a little more hacker-friendly before.
          • moved it to bigtop-utils as per cos's suggestion.
          • added ASL boilerplate
          • added groovy shebang
          • I haven't really used printlns for logs, but in a script we want err.println and out.println. I think the usage is pretty uniform here, unless there is a specific one you want to point out that should be LOG?
          • re: classpath, that'll be resolved in BIGTOP-1235
          • fixed the 80-character line formatting
          • System.exit clarified. Good point. Don't really know what I was thinking.
          • removed commented lines
          • hard-coded paths externalized (actually, I think these should be pulled in from env variables eventually, but for now I put them in vars so they can easily be edited).
          • switched to groovy-style casts
            ***********************
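
          On the println-versus-LOG question, here is a small illustration of routing messages through a logger instead of System.out. It uses java.util.logging so that it runs with only the JDK. The provisioning output later in this ticket appears to go through commons-logging, so treat the class and method names here as hypothetical stand-ins rather than the patch's code.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical stand-in for the script's logging; names are illustrative.
final class ProvisionLog {
    private static final Logger LOG =
        Logger.getLogger(ProvisionLog.class.getName());

    private ProvisionLog() {}

    /** Logs a directory creation at INFO level and returns the message. */
    static String logMkdirs(String path, String mode) {
        String message = "mkdirs " + path + " " + mode;
        LOG.info(message);
        return message;
    }

    /** Logs a failure with its cause at SEVERE level and returns the message. */
    static String logFailure(String path, Exception cause) {
        String message = "Failed to provision " + path;
        LOG.log(Level.SEVERE, message, cause);
        return message;
    }

    public static void main(String[] args) {
        logMkdirs("/tmp", "1777");
        logFailure("/hbase", new IllegalStateException("example failure"));
    }
}
```

          With a logger, level and destination are decided by configuration rather than by which println each line happened to use.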
          cos Konstantin Boudnik added a comment -

          I think there are still a couple of printlns left here and there.
          Also, I believe the oozie_libs method can be sped up by using the DFS API instead of the shell-outs (essentially what we are trying to avoid in the first place, right?)

          jayunit100 jay vyas added a comment - - edited
          • yup, I'll remove the printlns.
          • and, wrt API calls: they could be faster, but I ran into issues doing recursive copies using the direct copyFromLocal API. Not sure where the logic for recursive/glob copying lives. Will have to look, maybe in FsShell, and see how it uses the API for globs. Not too hard to do.

          Are those the only issues you see? If so, I'll clean all this up, test it, and probably put the patch in by the middle of this week!

          cos Konstantin Boudnik added a comment -

          Yeah, I think recursive copy isn't available to the client - as dumb as it is. FsShell seems to be the best bet.
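
          If shelling out to FsShell is to be avoided, one option is to walk the local tree and issue one copy call per regular file. The sketch below covers only the local walk, using java.nio; the Hadoop copy call is left as a comment, and the class name is hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper: enumerates local files that a caller could then
// copy one by one (for example with FileSystem.copyFromLocalFile).
final class LocalTreeWalker {
    private LocalTreeWalker() {}

    /** Returns all regular files below root, sorted; empty if root is no directory. */
    static List<Path> regularFiles(Path root) {
        if (!Files.isDirectory(root)) {
            return Collections.emptyList();
        }
        try (Stream<Path> paths = Files.walk(root)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : regularFiles(root)) {
            // The relative path would determine the target path on the DFS side.
            System.out.println(root.relativize(file));
        }
    }
}
```

          Whether this beats FsShell depends on how many files are involved, since each file still costs at least one call to the file system.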

          jayunit100 jay vyas added a comment - - edited

          Okay! Here's a preliminary working copy (with a few naughty System.outs, soon to be removed, still floating around).

          I've handled the copying of local->distributedFS jars like this:

          public void copyJars(FileSystem fs, File input, String jarstr, Path target) {
            input.listFiles(new FilenameFilter() {
              public boolean accept(File f, String filename) {
                return filename.contains(jarstr) && filename.endsWith("jar")
              }
            }).each({ file ->
              System.out.println("copying " + file);
              fs.copyFromLocalFile(new Path(file.getAbsolutePath()), target)
            });
          }
          ...
          copyJars(
              fs, new File("/usr/lib/hive/lib"),
              "", new Path("/user/oozie/share/lib/hive/"))
          

          The full code is here:
          https://gist.github.com/jayunit100/9479790

          The idea here is to just copy the files directly, without using globbing. I tried the FsShell calls and they seemed to fail. It also seemed like bad practice to "recode" FsShell in a different context, with all the reflection and stuff.
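
          The jar selection step can also be tried on its own, separated from the Hadoop copy so that no cluster is needed. This is a sketch in plain Java with hypothetical names; unlike the snippet above it matches on ".jar" rather than "jar", which is an assumption about the intent.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: picks the jars in a local directory whose file
// name contains a given fragment (an empty fragment selects every jar).
final class JarSelector {
    private JarSelector() {}

    static boolean matches(String filename, String namePart) {
        return filename.contains(namePart) && filename.endsWith(".jar");
    }

    /** Returns matching jars in sorted order; empty if dir cannot be listed. */
    static List<File> select(File dir, String namePart) {
        File[] found = dir.listFiles((d, name) -> matches(name, namePart));
        if (found == null) { // dir is missing, not a directory, or unreadable
            return new ArrayList<>();
        }
        Arrays.sort(found);
        return new ArrayList<>(Arrays.asList(found));
    }

    public static void main(String[] args) {
        for (File jar : select(new File("/usr/lib/hive/lib"), "")) {
            System.out.println("would copy " + jar);
        }
    }
}
```

          The null check matters because File.listFiles returns null, not an empty array, when the directory cannot be listed.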

          Any other preliminary thoughts?

          After this I'll do some cleanup and then have a patch you can apply directly! It is working to provision on my machines:

          bash-4.1$ groovy -classpath /usr/lib/hadoop/hadoop-common-2.0.6-alpha.jar:/root/.m2/repository/org/apache/bigtop/itest/itest-common/0.8.0-SNAPSHOT/itest-common-0.8.0-SNAPSHOT.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/etc/hadoop/conf/:/usr/lib/hadoop/hadoop-common-2.0.6-alpha.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/hadoop-auth.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-hdfs/hadoop-hdfs.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar provision.groovy /vagrant/init-hcfs.json
          Mar 11, 2014 5:09:29 AM org.apache.commons.logging.Log$info call
          INFO: Provisioning file system for file system from Configuration: hdfs://vagrant.bigtop1:17020
          SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
          SLF4J: Defaulting to no-operation (NOP) logger implementation
          SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
          Mar 11, 2014 5:09:30 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
          WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: 
          PROVISIONING WITH FILE SYSTEM : class org.apache.hadoop.hdfs.DistributedFileSystem
          
          here [/tmp, 1777, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /tmp 1777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /tmp null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/var/log, 1775, yarn, mapred]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /var/log 1775
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /var/log yarn mapred
          here [/tmp/hadoop-yarn, 777, mapred, mapred]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /tmp/hadoop-yarn 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /tmp/hadoop-yarn mapred mapred
          here [/var/log/hadoop-yarn/apps, 1777, yarn, mapred]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /var/log/hadoop-yarn/apps 1777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /var/log/hadoop-yarn/apps yarn mapred
          here [/hbase, null, hbase, hbase]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /hbase null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /hbase hbase hbase
          here [/solr, null, solr, solr]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /solr null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /solr solr solr
          here [/benchmarks, 777, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /benchmarks 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /benchmarks null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user, 755, HCFS_SUPER_USER, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user 755
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user HCFS_SUPER_USER null
          here [/user/history, 755, mapred, mapred]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/history 755
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/history mapred mapred
          here [/user/jenkins, 777, jenkins, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/jenkins 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/jenkins jenkins null
          here [/user/hive, 777, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/hive 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/hive null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/root, 777, root, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/root 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/root root null
          here [/user/hue, 777, hue, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/hue 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/hue hue null
          here [/user/sqoop, 777, sqoop, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/sqoop 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/sqoop sqoop null
          here [/user/oozie, 777, oozie]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie 777
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie oozie null
          here [/user/oozie/share, null, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib, null, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/hive, null, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/hive null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/hive null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/mapreduce, null, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/mapreduce null null
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/mapreduce-streaming, null, null, null]
          Mar 11, 2014 5:09:30 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce-streaming null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/mapreduce-streaming null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/distcp, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/distcp null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/distcp null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/pig, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/pig null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/pig null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/hive, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/hive null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/hive null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/mapreduce-streaming, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce-streaming null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/mapreduce-streaming null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/distcp, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/distcp null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/distcp null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          here [/user/oozie/share/lib/pig, null, null, null]
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/pig null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Owner /user/oozie/share/lib/pig null null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Skipping ... user null
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: current user: tom
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: current user: alice
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: current user: bigtop
          Mar 11, 2014 5:09:31 AM org.apache.commons.logging.Log$info call
          INFO: Now running some basic shell commands for setting up oozie shared libraries.
          copying /usr/lib/hive/lib/hive-beeline.jar
          copying /usr/lib/hive/lib/jackson-jaxrs-1.8.8.jar
          copying /usr/lib/hive/lib/hive-hwi-0.11.0.jar
          copying /usr/lib/hive/lib/hive-jdbc-0.11.0.jar
          copying /usr/lib/hive/lib/antlr-runtime-3.4.jar
          copying /usr/lib/hive/lib/ST4-4.0.4.jar
          copying /usr/lib/hive/lib/jline-0.9.94.jar
          copying /usr/lib/hive/lib/jetty-util-6.1.26.jar
          copying /usr/lib/hive/lib/hive-common.jar
          copying /usr/lib/hive/lib/hive-cli.jar
          copying /usr/lib/hive/lib/hive-metastore.jar
          copying /usr/lib/hive/lib/hive-shims-0.11.0.jar
          copying /usr/lib/hive/lib/JavaEWAH-0.3.2.jar
          copying /usr/lib/hive/lib/hive-shims.jar
          copying /usr/lib/hive/lib/snappy-0.2.jar
          copying /usr/lib/hive/lib/jackson-mapper-asl-1.8.8.jar
          copying /usr/lib/hive/lib/hive-hwi.jar
          copying /usr/lib/hive/lib/commons-dbcp-1.4.jar
          copying /usr/lib/hive/lib/servlet-api-2.5-20081211.jar
          copying /usr/lib/hive/lib/commons-configuration-1.6.jar
          copying /usr/lib/hive/lib/jackson-core-asl-1.8.8.jar
          copying /usr/lib/hive/lib/avro-1.7.1.jar
          copying /usr/lib/hive/lib/derby-10.4.2.0.jar
          copying /usr/lib/hive/lib/tempus-fugit-1.1.jar
          copying /usr/lib/hive/lib/avro-mapred-1.7.1.jar
          copying /usr/lib/hive/lib/hive-contrib.jar
          copying /usr/lib/hive/lib/datanucleus-enhancer-2.0.3.jar
          copying /usr/lib/hive/lib/datanucleus-rdbms-2.0.3.jar
          copying /usr/lib/hive/lib/hive-metastore-0.11.0.jar
          copying /usr/lib/hive/lib/log4j-1.2.16.jar
          copying /usr/lib/hive/lib/hive-service-0.11.0.jar
          copying /usr/lib/hive/lib/commons-lang-2.4.jar
          copying /usr/lib/hive/lib/commons-io-2.4.jar
          copying /usr/lib/hive/lib/hive-serde-0.11.0.jar
          copying /usr/lib/hive/lib/guava-11.0.2.jar
          copying /usr/lib/hive/lib/hive-common-0.11.0.jar
          copying /usr/lib/hive/lib/slf4j-api-1.6.1.jar
          copying /usr/lib/hive/lib/commons-pool-1.5.4.jar
          copying /usr/lib/hive/lib/hive-hbase-handler-0.11.0.jar
          copying /usr/lib/hive/lib/hive-contrib-0.11.0.jar
          copying /usr/lib/hive/lib/commons-logging-1.0.4.jar
          copying /usr/lib/hive/lib/json-20090211.jar
          copying /usr/lib/hive/lib/zookeeper.jar
          copying /usr/lib/hive/lib/xz-1.0.jar
          copying /usr/lib/hive/lib/commons-collections-3.2.1.jar
          copying /usr/lib/hive/lib/hive-hbase-handler.jar
          copying /usr/lib/hive/lib/libthrift-0.9.0.jar
          copying /usr/lib/hive/lib/hive-cli-0.11.0.jar
          copying /usr/lib/hive/lib/commons-logging-api-1.0.4.jar
          copying /usr/lib/hive/lib/javolution-5.5.1.jar
          copying /usr/lib/hive/lib/jackson-xc-1.8.8.jar
          copying /usr/lib/hive/lib/protobuf-java-2.4.1.jar
          copying /usr/lib/hive/lib/jdo2-api-2.3-ec.jar
          copying /usr/lib/hive/lib/maven-ant-tasks-2.1.3.jar
          copying /usr/lib/hive/lib/datanucleus-connectionpool-2.0.3.jar
          copying /usr/lib/hive/lib/jetty-6.1.26.jar
          copying /usr/lib/hive/lib/hive-beeline-0.11.0.jar
          copying /usr/lib/hive/lib/commons-compress-1.4.1.jar
          copying /usr/lib/hive/lib/metrics-core-2.1.2.jar
          copying /usr/lib/hive/lib/hive-service.jar
          copying /usr/lib/hive/lib/commons-codec-1.4.jar
          copying /usr/lib/hive/lib/commons-cli-1.2.jar
          copying /usr/lib/hive/lib/hive-exec.jar
          copying /usr/lib/hive/lib/libfb303-0.9.0.jar
          copying /usr/lib/hive/lib/datanucleus-core-2.0.3.jar
          copying /usr/lib/hive/lib/hive-jdbc.jar
          copying /usr/lib/hive/lib/hive-serde.jar
          copying /usr/lib/hive/lib/hive-exec-0.11.0.jar
          copying /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
          copying /usr/lib/hadoop-mapreduce/hadoop-streaming-2.0.6-alpha.jar
          copying /usr/lib/hadoop-mapreduce/hadoop-distcp-2.0.6-alpha.jar
          copying /usr/lib/hadoop-mapreduce/hadoop-distcp.jar
          copying /usr/lib/pig/lib/jython-standalone-2.5.3.jar
          copying /usr/lib/pig/pig-0.11.1-smoketests.jar
          copying /usr/lib/pig/piggybank.jar
          copying /usr/lib/pig/pig-0.11.1.jar
          copying /usr/lib/pig/pig-0.11.1-withouthadoop.jar
          copying /usr/lib/pig/pig.jar
          copying /usr/lib/pig/lib/jython-standalone-2.5.3.jar
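
          The "copying ..." lines above come from listing a local directory and copying each matching jar. What follows is a minimal sketch of the selection step only, written in plain Java so it runs without a Hadoop classpath. The names JarSelector and selectJars are hypothetical and not part of the attached patch. The sketch matches on a ".jar" suffix, and the actual copy into the distributed file system (done with fs.copyFromLocalFile in the script) is deliberately left out.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the attached patch: it only decides
// which local files would be copied; it never touches HDFS.
final class JarSelector {

    // Keep names that contain `needle` and end with ".jar".
    // An empty needle matches every jar in the list.
    static List<String> selectJars(List<String> fileNames, String needle) {
        List<String> selected = new ArrayList<>();
        for (String name : fileNames) {
            if (name.contains(needle) && name.endsWith(".jar")) {
                selected.add(name);
            }
        }
        Collections.sort(selected); // deterministic order for logging
        return selected;
    }

    // List a local directory (non-recursive) and apply the same filter.
    static List<String> selectJars(File dir, String needle) {
        String[] names = dir.list(); // null if dir is missing or not a directory
        List<String> all = new ArrayList<>();
        if (names != null) {
            Collections.addAll(all, names);
        }
        return selectJars(all, needle);
    }

    public static void main(String[] args) {
        // Assumption: the directory to scan is passed as the first argument.
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String jar : selectJars(dir, "")) {
            System.out.println("copying " + new File(dir, jar));
        }
    }
}
```

          Keeping the filter separate from the copy makes it easy to check which jars would be picked up before anything is written to the file system.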
          
          jayunit100 jay vyas added a comment -

          Okay... I guess no news is good news,
          so I'm working on an official patch with proper boilerplate, formatting, and logging for this now.

          jayunit100 jay vyas added a comment - - edited

          Oops, I meant: hey doctor Konstantin Boudnik... and Roman Shaposhnik might also be interested in this.

          Finally! Here's a tested and functional patch for BIGTOP-1200-based provisioning.

          Shall we also create a JIRA to deprecate init-hdfs.sh in the interim, before we polish off BIGTOP-1235?

          I've tested it and it works for my VMs:

          bash-4.1$ groovy -classpath /usr/lib/hadoop/hadoop-common-2.0.6-alpha.jar:/root/.m2/repository/org/apache/bigtop/itest/itest-common/0.8.0-SNAPSHOT/itest-common-0.8.0-SNAPSHOT.jar:/usr/lib/hadoop/lib/guava-11.0.2.jar:/etc/hadoop/conf/:/usr/lib/hadoop/hadoop-common-2.0.6-alpha.jar:/usr/lib/hadoop/lib/commons-configuration-1.6.jar:/usr/lib/hadoop/lib/commons-lang-2.5.jar:/usr/lib/hadoop/hadoop-auth.jar:/usr/lib/hadoop/lib/slf4j-api-1.6.1.jar:/usr/lib/hadoop-hdfs/hadoop-hdfs.jar:/usr/lib/hadoop/lib/protobuf-java-2.4.0a.jar /vagrant/provision.groovy /vagrant/init-hcfs.json
          Mar 13, 2014 4:01:31 AM org.apache.commons.logging.Log$info call
          INFO: Provisioning file system for file system from Configuration: hdfs://vagrant.bigtop1:17020
          SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
          SLF4J: Defaulting to no-operation (NOP) logger implementation
          SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.
          Mar 13, 2014 4:01:31 AM org.apache.hadoop.util.NativeCodeLoader <clinit>
          WARNING: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: PROVISIONING WITH FILE SYSTEM : class org.apache.hadoop.hdfs.DistributedFileSystem
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /tmp null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /tmp
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /var/log yarn mapred
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /tmp/hadoop-yarn mapred mapred
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /var/log/hadoop-yarn/apps yarn mapred
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /hbase hbase hbase
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /solr solr solr
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /benchmarks null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /benchmarks
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user HCFS_SUPER_USER null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/history mapred mapred
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/jenkins jenkins null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/hive null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/hive
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/root root null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/hue hue null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/sqoop sqoop null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie oozie null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/hive null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/hive
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/mapreduce
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce-streaming null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/mapreduce-streaming
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/distcp null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/distcp
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/pig null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/pig
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/hive null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/hive
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/mapreduce-streaming null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/mapreduce-streaming
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/distcp null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/distcp
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: mkdirs /user/oozie/share/lib/pig null null
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$warn$0 call
          WARNING: No owner specified for /user/oozie/share/lib/pig
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: current user: tom
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: current user: alice
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: current user: bigtop
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: Now copying Jars into the DFS for oozie 
          Mar 13, 2014 4:01:32 AM org.apache.commons.logging.Log$info call
          INFO: This might take a few seconds...
          Mar 13, 2014 4:01:35 AM org.apache.commons.logging.Log$info call
          INFO: Total jars copied into the DFS : 79
          
          
          cos Konstantin Boudnik added a comment -

          Mostly looking good! A few minor things:

          • shebang #!/usr/bin/env /opt/groovy-2.2.1/bin/groovy should point to /usr/lib/bigtop-groovy/bin/groovy
          • there's a bunch of unused imports. Clean them up
          • exit should read System.exit
          • def v = new JsonSlurper(); is unused
          • if I am not mistaken you can reduce the following
            def dirs = (List) parsedData.get("dir");
            def users = (List) parsedData.get("user");
            

            to

            def dirs = parsedData.dir as List;
            def users = parsedData.user as List;
            

            because - oh the greatness of Groovy - JSON elements automatically become the object's properties with the correct names. Also, safe casting is a good idea

          • def root_user is unused
          • 117: Path file = new Path(name); is unused; perhaps you meant to use it in place of calling the Path constructor later?
          • could you please use named constants in place of hardcoded paths at the end of the script?
            Thanks!
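
A minimal sketch of the property-style access suggested above. The JSON text is a hypothetical stand-in, since the real contents of init-hcfs.json are not shown in this thread; only the "dir" and "user" keys come from the review comment.

```groovy
import groovy.json.JsonSlurper

// Hypothetical stand-in for init-hcfs.json (layout assumed for illustration only).
def text = '''
{
  "dir":  [["/tmp", "1777", null, null], ["/hbase", null, "hbase", "hbase"]],
  "user": ["tom", "alice", "bigtop"]
}
'''
def parsedData = new JsonSlurper().parseText(text)

// Map-style lookup with an explicit cast, as in the lines under review.
def dirsOld = (List) parsedData.get("dir")

// Property-style lookup with `as List`, as suggested in the review.
def dirs = parsedData.dir as List
def users = parsedData.user as List

// Both styles return the same parsed list.
assert dirs == dirsOld
println "dirs=${dirs.size()} users=${users}"
```

Both forms read the same map entry; the property form is simply shorter to read.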
          jayunit100 jay vyas added a comment -

          Thanks Cos, I will put these changes in now. Looks like we're almost there.

          jayunit100 jay vyas added a comment -

          Reattached new patch (also fixed a feature... it wasn't working properly for super_users).

          cos Konstantin Boudnik added a comment -

          Ok, almost there!

          • declare things like OOZIE_SHARE="/user/oozie/share/lib/"; as def final ... It is a script, but why not make the code easier to read?
          • still unused imports in
            import java.io.FileReader;
            ...
            import org.apache.hadoop.fs.permission.FsPermission;
            ...
            import org.apache.commons.logging.Log;
            
          • if(user != null && user.equals("HCFS_SUPER_USER")) can be shortened to if (user?.equals("HCFS_SUPER_USER"))
          • and I would suggest changing the iteration over the dirs to
              def final dir = new Path(name);
              if (mode == null) {
                fs.mkdirs(dir);
              } else {
                fs.mkdirs(dir, new FsPermission((short) mode));
              }
              if (user != null) {
                fs.setOwner(dir, user, group);
              } else {
                LOG.warn("No owner specified for " + name);
              }
            

            as you can see I am using a constant for dir instead of calling the constructor each time (I think this was your initial intention too?) and added a pair of '{' '}' to the else branch of the last if

          • you can add the final qualifier on line 152 as well
          • there's an extra empty line at 139 and 161
            The rest is great! Thanks!
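
A small sketch of why the safe-navigation form suggested above can replace the explicit null check. The closure names and the sample user list are made up for illustration; only the "HCFS_SUPER_USER" marker comes from the thread.

```groovy
// Explicit null check followed by equals, as in the patch under review.
def longForm = { user -> user != null && user.equals("HCFS_SUPER_USER") }

// Safe navigation: yields null (not false) when user is null.
def shortForm = { user -> user?.equals("HCFS_SUPER_USER") }

for (user in [null, "hbase", "HCFS_SUPER_USER"]) {
    // Compare under Groovy truth, which is what an `if` condition applies.
    assert (longForm(user) as boolean) == (shortForm(user) as boolean)
}
println "both forms agree under Groovy truth"
```

The two forms differ only in the value produced for a null user (false versus null), and an `if` treats both as false.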
          jayunit100 jay vyas added a comment -

          Thanks Cos. Which unused imports? I think they're all being used now, but maybe there are some repeats. Each import is listed below with the line that uses it:

          import groovy.json.JsonSlurper; (93)
          import java.io.FileReader; (95)
          import org.apache.hadoop.conf.Configuration; (105)
          import org.apache.hadoop.fs.FileSystem; (114)
          import org.apache.hadoop.fs.Path; (130)
          import org.apache.commons.logging.Log; (12)
          import org.apache.commons.logging.LogFactory; (12)
          import org.apache.hadoop.fs.permission.FsPermission; (132)

          cos Konstantin Boudnik added a comment - - edited

          Right, FsPermission is repeated twice - that's why. As for logging.Log - the 12th line of the patch is empty. Are we looking at two different versions? I am using the latest you've attached.

          Also, just caught this one: the line

          if (!args.length == 1) 

          won't work: '!' binds tighter than '==', so the negation is applied to args.length alone. Please use the '!=' operator instead
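
To see the problem in isolation, here is a minimal sketch. The arrays are hypothetical stand-ins for the script's args, and the variable names are invented for the example.

```groovy
// Hypothetical argument arrays standing in for the script's `args`.
def oneArg  = ["init-hcfs.json"] as String[]
def twoArgs = ["a.json", "b.json"] as String[]

// Parsed as (!x.length) == 1: the length is first coerced to a boolean
// and negated, and that boolean is then compared with the integer 1.
def brokenOne = (!oneArg.length == 1)
def brokenTwo = (!twoArgs.length == 1)

// The intended check: "anything other than exactly one argument".
def fixedOne = (oneArg.length != 1)
def fixedTwo = (twoArgs.length != 1)

// The broken form gives the same answer for both inputs, so a usage
// check written that way cannot tell a wrong argument count from a right one.
assert brokenOne == brokenTwo
assert !fixedOne && fixedTwo
println "broken: ${brokenOne}/${brokenTwo}, fixed: ${fixedOne}/${fixedTwo}"
```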

          jayunit100 jay vyas added a comment - - edited

          Hey Cos, thanks! You're right. I will patch those, and while doing it, FYI, I'm adding a couple of other minor improvements as well.

          **************

          After more testing, I'm finding funny /tmp permissions when reading from init-hcfs.json. Duh: JSON doesn't like octals. I think we had better encode the permissions as strings. This patch will thus include an update to init-hcfs.json as well.

          jayunit100 jay vyas added a comment -

          New patch !

          • encodes BIGTOP-1200 json file perms as strings, not ints.
          • use Short.decode
          • fix the argument validator logic and make it prettier
          • added some of the final qualifiers for some of those immutables
          • added summary of # dirs created
          • cached the new Path() calls

          I realize that I could create another JIRA just for fixing the JSON file, but I hope it's okay if we just commit these two files together, since BIGTOP-952 and BIGTOP-1200 are so closely coupled anyway.
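
A sketch of the octal issue behind the switch to string-encoded permissions. The JSON entries and key names below are hypothetical and are not the actual init-hcfs.json format; the point is only how Short.decode treats a leading zero compared with a bare JSON number.

```groovy
import groovy.json.JsonSlurper

// Hypothetical entry: a sticky, world-writable /tmp stored as a string.
def entry = new JsonSlurper().parseText('{"path": "/tmp", "mode": "01777"}')

// A leading "0" makes Short.decode read the digits as octal.
short mode = Short.decode(entry.mode)
assert mode == 1023                              // octal 1777 is decimal 1023
assert Integer.toOctalString(mode) == "1777"

// The same digits as a bare JSON number are read as decimal,
// which is a different permission bit pattern.
def asNumber = new JsonSlurper().parseText('{"mode": 1777}').mode
assert asNumber == 1777
assert Integer.toOctalString(asNumber) == "3361"
println "string-encoded: ${Integer.toOctalString(mode)}, bare number: ${Integer.toOctalString(asNumber)}"
```

JSON numbers may not carry a leading zero, so an octal literal cannot be written directly in the file; a string keeps the digits intact until the script decodes them.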

          jayunit100 jay vyas added a comment -

          more cleanup

          cos Konstantin Boudnik added a comment -

          Like it - especially the clever lazy error evaluation one!
          I am ok with committing the changes to the json file together with the script - I don't see any harm in it. One last thing caught my attention: the usage string has the classpath awkwardly split at '.'. Do you think it would make sense to split it at ':' instead? It would look a bit more natural.

          jayunit100 jay vyas added a comment -

          Hi Cos, sure, I just manually edited the patch file for this.
          So are we all set? If so, next week I can start on BIGTOP-1235!

          cos Konstantin Boudnik added a comment -

          Thanks Jay! +1 - the patch looks good and is ready for commit. I will do this in the morning to give someone else a chance to chime in if they feel like it.

          rvs Roman Shaposhnik added a comment -

          +1 from me as well.

          cos Konstantin Boudnik added a comment -

          Committed to the master. Thanks Jay!


            People

            • Assignee:
              jayunit100 jay vyas
              Reporter:
              cos Konstantin Boudnik
            • Votes:
              0
              Watchers:
              8
