Bigtop / BIGTOP-1089

BigPetStore: A polyglot big data processing blueprint

    Details

    • Type: New Feature
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.8.0
    • Component/s: blueprints
    • Labels: None

      Description

      The need for templates for processing big data pipelines is obvious. Moreover, given the increasing overlap across different big data and NoSQL projects, such a template will provide a ground truth for comparing the behaviour and approach of different tools on a common, easily comprehended problem.

      This ticket formalizes the conversation in mailing list archives regarding the BigPetStore proposal.

      At the moment (with the exception of word count), there are very few examples of big data problems that have been solved by a variety of different technologies. And even for word count, there aren't a lot of templates which can be customized for applications.

      Comparatively, other application developer communities (e.g. the Rails community, users of Maven archetypes, etc.) have a plethora of template applications which can be used to kickstart their applications and use cases.

      This big pet store JIRA thus aims to do the following:

      0) Curate a single, central, standard input data set. (Modified: instead, generate a large input data set on the fly.)

      1) Define a big data processing pipeline (using the pet store theme, except morphing it to be analytics- rather than transaction-oriented), and implement basic aggregations in Hive, Pig, etc.

      2) Sink the results of step 1 into some kind of NoSQL store or search engine.

      Some implementation details (open to change; please comment/review):

      • The initial data source will be raw text or (better yet) some kind of automatically generated data.
      • The source will initially go in bigtop/blueprints.
      • The application sources can be in any modern JVM language (Java, Scala, Groovy, Clojure), since Bigtop already supports Scala, Java, and Groovy natively, and Clojure is easy to support with the right jars.
      • Each "job" will be named according to the corresponding DAG of the big data pipeline.
      • All jobs should (not sure if this is a requirement?) be controlled by a global program (maybe Oozie?) which runs the tasks in order and can easily be customized to use different tools at different stages.
      • For now, all outputs will be to files, so that users don't require servers to run the app.
      • Final data sinks will be into a highly available, transaction-oriented store (Solr/HBase/...).
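      As a sketch of the "global program" idea above, a driver could simply run each stage of the DAG in order. This is only an illustration under assumed stage names (GENERATE, CLEAN, AGGREGATE, SINK are hypothetical, not the actual job names in the project):

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a driver that runs pipeline stages in order. */
public class PipelineDriver {

    /** Assumed stages of a BigPetStore-like DAG, in execution order. */
    enum Stage { GENERATE, CLEAN, AGGREGATE, SINK }

    /** Runs each stage in order and returns the names of the stages run. */
    static List<String> runAll() {
        List<String> trace = new ArrayList<>();
        for (Stage s : Stage.values()) {
            // A real driver would submit a Hadoop/Pig/Hive job here;
            // this sketch only records the order of execution.
            trace.add(s.name());
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(String.join(" -> ", runAll()));
    }
}
```

      Swapping a different tool into a stage would then be a matter of changing which job one stage submits, without touching the ordering logic.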

      This ticket will be completed once a first iteration of BigPetStore is complete using 3 ecosystem components, along with a depiction of the pipeline which can be used for development.

      I've assigned this to myself; I hope that's okay? It seems that at the moment I'm the only one working on it.

      1. BIGTOP-1089.patch
        156 kB
        jay vyas
      2. BIGTOP-1089.patch
        162 kB
        jay vyas
      3. BIGTOP-1089.patch
        161 kB
        jay vyas
      4. BIGTOP-1089.patch
        136 kB
        jay vyas
      5. BIGTOP-1089.patch
        136 kB
        jay vyas

        Issue Links

          Activity

          jay vyas added a comment -

          Some minor updates above. Currently the very raw working tree for source is here:

          https://github.com/jayunit100/hadoop-example-jobs/blob/master/src/main/java/org/bigtop/bigpetstore/PetStoreTransactionGeneratorJob.java

          The above is just a stub of a raw input data set generator, based on a dummy input format that simply generates random "transactions". Sample output:

          enno_watson big chew toy
          jay_farlson dog treats
          jay_yang big chew toy
          jacob_jones dog food
          sanford_yang dog treats
          jay_stevens premium dog food
          sanford_walbright fish food
          andrew_vyas dog food
          enno_stevens dog treats

          I will have to update this with more fields, and JSONify it later, so that it's more extensible moving forward. Presumably you would want purchase price, state, etc.
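          For reference, a minimal, Hadoop-free sketch of what such a random-transaction generator could look like. The name and product lists here are illustrative only, not the actual ones used in the stub:

```java
import java.util.Random;

/** Illustrative sketch of a random "transaction" generator (not the real stub). */
public class TransactionSketch {

    // Hypothetical sample data, loosely modeled on the sample output above.
    static final String[] FIRST = {"jay", "enno", "sanford", "jacob", "andrew"};
    static final String[] LAST  = {"vyas", "watson", "yang", "jones", "stevens"};
    static final String[] PRODUCTS =
        {"big chew toy", "dog treats", "dog food", "premium dog food", "fish food"};

    /** Produces one "user product" line, like the sample output above. */
    static String nextTransaction(Random rng) {
        String user = FIRST[rng.nextInt(FIRST.length)] + "_"
                    + LAST[rng.nextInt(LAST.length)];
        String product = PRODUCTS[rng.nextInt(PRODUCTS.length)];
        return user + " " + product;
    }

    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed so runs are reproducible
        for (int i = 0; i < 5; i++) {
            System.out.println(nextTransaction(rng));
        }
    }
}
```

          Adding the extra fields mentioned above would just mean extending the record with more drawn values (price, state, timestamp) before it is written out.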

          jay vyas added a comment -

          Update: still early stages, but I've made some progress. I've created a staging GitHub project for this. The input format is complete, with two different statistical distributions. I'm now in the process of adding the embedded Hive/Pig ETL parts of the flow.

          At that point I'll submit a formal patch and move it into "blueprints" if the patch goes through.
          https://github.com/jayunit100/bigpetstore

          jay vyas added a comment - edited

          Update:

          It's still under development here: https://github.com/jayunit100/bigpetstore . After it's stable I'll put in the first patch, and move development to Bigtop if it's approved.

          • Generates Gaussian-distributed data in the DFS for pet store transactions (i.e. the number of purchases by each individual is smeared on a Gaussian distribution).
          • Currently it does basic aggregations of the raw data using Pig and Hive, and compares that the two approaches result in identical outputs.
          • It also does some classification of US "states" based on similar transaction profiles, using Mahout.

          I need some help on some things:

          • more unit tests.
          • adding Crunch, DataFu, and any other Bigtop packages: more examples means more usefulness as a big data sandbox app.
          • documentation
          • deployment on a cluster

          I'd like to work with the Bigtop community directly on this; maybe we can open it up to the broader Hadoop community for some feedback this week.

          Bruno Mahé added a comment -

          Great!
          I haven't had as much time to look into it as I had wished, but it does not build for me:

          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.1:compile (default-compile) on project hadoop-examples: Compilation failure: Compilation failure:
          [ERROR] /home/bruno/freesoftware/bigpetstore/src/main/java/org/bigtop/bigpetstore/etl/HiveETL.java:[16,40] cannot find symbol
          [ERROR] symbol:   class TestPetStoreHiveCode
          [ERROR] location: package org.bigtop.bigpetstore.generator
          [ERROR] /home/bruno/freesoftware/bigpetstore/src/main/java/org/bigtop/bigpetstore/etl/HiveETL.java:[43,24] cannot find symbol
          [ERROR] symbol:   class TestPetStoreHiveCode
          [ERROR] location: class org.bigtop.bigpetstore.etl.HiveETL
          [ERROR] -> [Help 1]
          [ERROR] 
          

          Not sure if this is expected, but I have not had time to look into why yet.

          Also, before asking for contributions, I would recommend adding the code to Apache Bigtop. Otherwise you would have to keep track of all the contributions made through GitHub to ensure all the IP can be correctly integrated.

          jay vyas added a comment -

          Thanks Bruno. I will try to format a patch as soon as the build is stable.
          There have been lots of changes over the last couple of weeks. I agree the sooner we get it in, the better.
          It will also probably increase visibility and make it easier for me to solicit patches.

          jay vyas added a comment -

          Well, I think we are very close: I've also added Crunch to the API. At this point, running:

            export HADOOP_HOME=./hadoop-1.2.1
            export HIVE_HOME=./hive-0.11.0
            export HIVE_CONF_DIR=./hive-0.11.0/conf/
            export HADOOP_CONF_DIR=./hadoop-1.2.1/conf/
            mvn verify 
          

          passes for me. Does it work for anyone else?

          This is all preliminary, and there is still cleanup to do, so no rush; but in case anyone is following this JIRA I wanted to leave an update.

          The results:

          crunch:{cat-food=3, leather-collar=1, fish-food=1, duck-caller=1, dog-food=2, organic-dog-food=1, steel-leash=1}
          pig:{cat-food=3, leather-collar=1, fish-food=1, duck-caller=1, dog-food=2, organic-dog-food=1, steel-leash=1}
          hive:{cat-food=3, leather-collar=1, fish-food=1, duck-caller=1, dog-food=2, organic-dog-food=1, steel-leash=1}
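          Since the three engines should agree, the comparison step can be as simple as parsing the three result lines and checking the maps for equality. This is only a sketch of the idea; the actual comparison code in the project may differ:

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: verify that Crunch, Pig and Hive produced identical aggregates. */
public class ResultComparison {

    /** Parses an "engine:{k=v, k=v}" style line into a product-to-count map. */
    static Map<String, Integer> parse(String line) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        String body = line.substring(line.indexOf('{') + 1, line.lastIndexOf('}'));
        for (String pair : body.split(",\\s*")) {
            String[] kv = pair.split("=");
            counts.put(kv[0], Integer.parseInt(kv[1]));
        }
        return counts;
    }

    /** True iff all three engines produced the same product counts. */
    static boolean allAgree(String crunch, String pig, String hive) {
        Map<String, Integer> c = parse(crunch);
        return c.equals(parse(pig)) && c.equals(parse(hive));
    }

    public static void main(String[] args) {
        String out = "{cat-food=3, leather-collar=1, dog-food=2}";
        System.out.println(allAgree("crunch:" + out, "pig:" + out, "hive:" + out));
    }
}
```

          Map equality is order-independent, so the engines can emit products in any order and still be counted as agreeing.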
          
          Bruno Mahé added a comment -

          Just to confirm: should "mvn clean install" work? It does not for me.

          jay vyas added a comment - edited
          • This is preliminary and not ready for a patch just yet, as I'd like to set up a reproducible (i.e. Vagrant) deployment of it, to be confident that it works on more than just my setups.
          jays-MacBook-Pro:bigpetstore Jpeerindex$ mvn clean install verify
          [INFO] Scanning for projects...
          [WARNING] 
          [WARNING] Some problems were encountered while building the effective model for jay.rhbd:hadoop-examples:jar:1.0-SNAPSHOT
          [WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.hive:hive-contrib:jar -> version ${hive.version} vs 0.11.0 @ line 183, column 15
          [WARNING] 
          [WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
          [WARNING] 
          [WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
          [WARNING] 
          [INFO]                                                                         
          [INFO] ------------------------------------------------------------------------
          [INFO] Building hadoop-examples 1.0-SNAPSHOT
          [INFO] ------------------------------------------------------------------------
          [WARNING] Could not transfer metadata asm:asm/maven-metadata.xml from/to local.repository (file:../../local.repository/trunk): No connector available to access repository local.repository (file:../../local.repository/trunk) of type legacy using the available factories WagonRepositoryConnectorFactory
          [WARNING] The POM for javax.jdo:jdo2-api:jar:2.3-ec is missing, no dependency information available
          [INFO] 
          [INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-examples ---
          [INFO] Deleting /Users/Jpeerindex/Development/bigpetstore/target
          [INFO] 
          [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-examples ---
          [INFO] Using 'UTF-8' encoding to copy filtered resources.
          [INFO] Copying 4 resources
          [INFO] 
          [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hadoop-examples ---
          [INFO] Changes detected - recompiling the module!
          [INFO] Compiling 16 source files to /Users/Jpeerindex/Development/bigpetstore/target/classes
          [WARNING] error: error reading /Users/Jpeerindex/.m2/repository/it/unimi/dsi/fastutil/6.5.7/fastutil-6.5.7.jar; cannot read zip file entry
          [WARNING] Note: /Users/Jpeerindex/Development/bigpetstore/src/main/java/org/bigtop/bigpetstore/clustering/Mh1.java uses or overrides a deprecated API.
          [WARNING] Note: Recompile with -Xlint:deprecation for details.
          [WARNING] Note: Some input files use unchecked or unsafe operations.
          [WARNING] Note: Recompile with -Xlint:unchecked for details.
          [INFO] 
          [INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-integration-test-sources) @ hadoop-examples ---
          [INFO] Test Source directory: /Users/Jpeerindex/Development/bigpetstore/src/integration/java added.
          [INFO] 
          [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-examples ---
          [INFO] Using 'UTF-8' encoding to copy filtered resources.
          [INFO] Copying 1 resource
          [INFO] 
          [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-examples ---
          [INFO] Changes detected - recompiling the module!
          [INFO] Compiling 2 source files to /Users/Jpeerindex/Development/bigpetstore/target/test-classes
          [WARNING] error: error reading /Users/Jpeerindex/.m2/repository/it/unimi/dsi/fastutil/6.5.7/fastutil-6.5.7.jar; cannot read zip file entry
          [INFO] 
          [INFO] --- maven-surefire-plugin:2.12:test (default-test) @ hadoop-examples ---
          [INFO] Surefire report directory: /Users/Jpeerindex/Development/bigpetstore/target/surefire-reports
          
          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          
          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Running org.bigtop.bigpetstore.generator.TestPetStoreTransactionGeneratorJob
          SLF4J: Class path contains multiple SLF4J bindings.
          SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-jcl/1.7.5/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
          SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
          SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
          memory : 1047
          2014-01-08 09:16:06.740 java[24030:1903] Unable to load realm info from SCDynamicStore
          14/01/08 09:16:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          14/01/08 09:16:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
          14/01/08 09:16:06 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
          AZ _ 2
          AK _ 2
          CT _ 2
          OK _ 2
          CO _ 2
          CA _ 6
          NY _ 4
          14/01/08 09:16:07 INFO mapred.JobClient: Running job: job_local_0001
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000002_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000002_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000003_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000004_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000004_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000005_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000005_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000006_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000006_0' done.
          14/01/08 09:16:07 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Merger: Merging 7 sorted segments
          14/01/08 09:16:07 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 1758 bytes
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:07 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
          14/01/08 09:16:07 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to petstoredata/Wed Jan 08 09:16:06 EST 2014
          14/01/08 09:16:07 INFO mapred.LocalJobRunner: reduce > reduce
          14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
          14/01/08 09:16:08 INFO mapred.JobClient:  map 100% reduce 100%
          14/01/08 09:16:08 INFO mapred.JobClient: Job complete: job_local_0001
          14/01/08 09:16:08 INFO mapred.JobClient: Counters: 17
          14/01/08 09:16:08 INFO mapred.JobClient:   File Output Format Counters 
          14/01/08 09:16:08 INFO mapred.JobClient:     Bytes Written=1728
          14/01/08 09:16:08 INFO mapred.JobClient:   FileSystemCounters
          14/01/08 09:16:08 INFO mapred.JobClient:     FILE_BYTES_READ=20254
          14/01/08 09:16:08 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=296259
          14/01/08 09:16:08 INFO mapred.JobClient:   File Input Format Counters 
          14/01/08 09:16:08 INFO mapred.JobClient:     Bytes Read=0
          14/01/08 09:16:08 INFO mapred.JobClient:   Map-Reduce Framework
          14/01/08 09:16:08 INFO mapred.JobClient:     Map output materialized bytes=1786
          14/01/08 09:16:08 INFO mapred.JobClient:     Map input records=20
          14/01/08 09:16:08 INFO mapred.JobClient:     Reduce shuffle bytes=0
          14/01/08 09:16:08 INFO mapred.JobClient:     Spilled Records=40
          14/01/08 09:16:08 INFO mapred.JobClient:     Map output bytes=1704
          14/01/08 09:16:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=8482979840
          14/01/08 09:16:08 INFO mapred.JobClient:     SPLIT_RAW_BYTES=497
          14/01/08 09:16:08 INFO mapred.JobClient:     Combine input records=0
          14/01/08 09:16:08 INFO mapred.JobClient:     Reduce input records=20
          14/01/08 09:16:08 INFO mapred.JobClient:     Reduce input groups=20
          14/01/08 09:16:08 INFO mapred.JobClient:     Combine output records=0
          14/01/08 09:16:08 INFO mapred.JobClient:     Reduce output records=20
          14/01/08 09:16:08 INFO mapred.JobClient:     Map output records=20
          ===>BigPetStore,storeCode_AK,1	rick,kemp,Thu Jan 15 23:08:51 EST 1970,19.1,fuzzy-collar
          ===>BigPetStore,storeCode_AK,2	rick,kemp,Sun Dec 14 00:51:07 EST 1969,7.5,cat-food
          ===>BigPetStore,storeCode_AZ,1	aaron,medina,Fri Jan 23 03:01:17 EST 1970,25.1,leather-collar
          ===>BigPetStore,storeCode_AZ,2	aaron,medina,Thu Jan 08 21:33:52 EST 1970,10.5,dog-food
          ===>BigPetStore,storeCode_CA,1	elizabeth,suarez,Fri Dec 26 17:43:10 EST 1969,7.5,cat-food
          ===>BigPetStore,storeCode_CA,2	elizabeth,suarez,Wed Jan 21 00:53:53 EST 1970,11.75,fish-food
          ===>BigPetStore,storeCode_CA,3	elizabeth,suarez,Mon Jan 12 00:21:28 EST 1970,11.75,fish-food
          ===>BigPetStore,storeCode_CA,4	elizabeth,suarez,Fri Jan 16 08:24:05 EST 1970,11.75,fish-food
          ===>BigPetStore,storeCode_CA,5	elizabeth,suarez,Sun Dec 07 04:32:44 EST 1969,10.5,dog-food
          ===>BigPetStore,storeCode_CA,6	elizabeth,suarez,Thu Dec 11 19:48:25 EST 1969,16.5,organic-dog-food
          ===>BigPetStore,storeCode_CO,1	donna,melendez,Mon Jan 19 13:41:48 EST 1970,10.5,dog-food
          ===>BigPetStore,storeCode_CO,2	curtis,jackson,Mon Dec 29 04:32:17 EST 1969,15.1,choke-collar
          ===>BigPetStore,storeCode_CT,1	ray,mueller,Tue Jan 13 06:24:08 EST 1970,10.5,dog-food
          ===>BigPetStore,storeCode_CT,2	ray,mueller,Wed Jan 21 10:04:42 EST 1970,7.5,cat-food
          ===>BigPetStore,storeCode_NY,1	sam,curry,Sun Jan 04 18:32:32 EST 1970,19.75,fish-food
          ===>BigPetStore,storeCode_NY,2	raymond,beck,Sat Dec 13 00:15:28 EST 1969,7.5,cat-food
          ===>BigPetStore,storeCode_NY,3	raymond,beck,Tue Jan 13 15:03:54 EST 1970,20.1,steel-leash
          ===>BigPetStore,storeCode_NY,4	raymond,beck,Sat Dec 27 11:54:25 EST 1969,7.5,cat-food
          ===>BigPetStore,storeCode_OK,1	gage,frost,Thu Dec 25 09:34:31 EST 1969,40.1,rodent-cage
          ===>BigPetStore,storeCode_OK,2	aaron,ross,Sat Dec 20 19:21:20 EST 1969,10.5,dog-food
          14/01/08 09:16:08 INFO generator.TestPetStoreTransactionGeneratorJob: Created 20 , file was 1704 bytes.
          Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.698 sec
          
          Results :
          
          Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
          
          [INFO] 
          [INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hadoop-examples ---
          [INFO] Building jar: /Users/Jpeerindex/Development/bigpetstore/target/hadoop-examples-1.0-SNAPSHOT.jar
          [INFO] 
          [INFO] --- maven-failsafe-plugin:2.12:integration-test (integration-tests) @ hadoop-examples ---
          [INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
          
          -------------------------------------------------------
           T E S T S
          -------------------------------------------------------
          Running org.bigtop.bigpetstore.integration.ITBigPetStore
          SLF4J: Class path contains multiple SLF4J bindings.
          SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-jcl/1.7.5/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
          SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
          SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
          2014-01-08 09:16:09.195 java[24038:1903] Unable to load realm info from SCDynamicStore
          14/01/08 09:16:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
          14/01/08 09:16:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
          14/01/08 09:16:09 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
          AZ _ 1
          AK _ 1
          CT _ 1
          OK _ 1
          CO _ 1
          CA _ 3
          NY _ 2
          14/01/08 09:16:09 INFO mapred.JobClient: Running job: job_local_0001
          14/01/08 09:16:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:09 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
          [... identical map-task log blocks repeated for attempts attempt_local_0001_m_000001_0 through attempt_local_0001_m_000006_0 ...]
          14/01/08 09:16:09 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:09 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:09 INFO mapred.Merger: Merging 7 sorted segments
          14/01/08 09:16:09 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 891 bytes
          14/01/08 09:16:09 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:09 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:09 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
          14/01/08 09:16:09 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /tmp/BigPetStore1389190568881/generated
          14/01/08 09:16:09 INFO mapred.LocalJobRunner: reduce > reduce
          14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
          14/01/08 09:16:10 INFO mapred.JobClient:  map 100% reduce 100%
          14/01/08 09:16:10 INFO mapred.JobClient: Job complete: job_local_0001
          14/01/08 09:16:10 INFO mapred.JobClient: Counters: 17
          14/01/08 09:16:10 INFO mapred.JobClient:   File Output Format Counters 
          14/01/08 09:16:10 INFO mapred.JobClient:     Bytes Written=873
          14/01/08 09:16:10 INFO mapred.JobClient:   FileSystemCounters
          14/01/08 09:16:10 INFO mapred.JobClient:     FILE_BYTES_READ=19387
          14/01/08 09:16:10 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=291694
          14/01/08 09:16:10 INFO mapred.JobClient:   File Input Format Counters 
          14/01/08 09:16:10 INFO mapred.JobClient:     Bytes Read=0
          14/01/08 09:16:10 INFO mapred.JobClient:   Map-Reduce Framework
          14/01/08 09:16:10 INFO mapred.JobClient:     Map output materialized bytes=919
          14/01/08 09:16:10 INFO mapred.JobClient:     Map input records=10
          14/01/08 09:16:10 INFO mapred.JobClient:     Reduce shuffle bytes=0
          14/01/08 09:16:10 INFO mapred.JobClient:     Spilled Records=20
          14/01/08 09:16:10 INFO mapred.JobClient:     Map output bytes=857
          14/01/08 09:16:10 INFO mapred.JobClient:     Total committed heap usage (bytes)=1039663104
          14/01/08 09:16:10 INFO mapred.JobClient:     SPLIT_RAW_BYTES=497
          14/01/08 09:16:10 INFO mapred.JobClient:     Combine input records=0
          14/01/08 09:16:10 INFO mapred.JobClient:     Reduce input records=10
          14/01/08 09:16:10 INFO mapred.JobClient:     Reduce input groups=10
          14/01/08 09:16:10 INFO mapred.JobClient:     Combine output records=0
          14/01/08 09:16:10 INFO mapred.JobClient:     Reduce output records=10
          14/01/08 09:16:10 INFO mapred.JobClient:     Map output records=10
          output : /tmp/BigPetStore1389190568881/generated/part-r-00000
          BigPetStore,storeCode_AK,1	jennifer,patrick,Tue Jan 06 20:48:34 EST 1970,19.1,fuzzy-collar
          BigPetStore,storeCode_AZ,1	billie,paul,Wed Dec 31 07:32:56 EST 1969,10.5,dog-food
          BigPetStore,storeCode_CA,1	christine,dunn,Wed Dec 17 12:51:16 EST 1969,7.5,cat-food
          BigPetStore,storeCode_CA,2	brent,kerr,Sat Dec 20 17:53:03 EST 1969,7.5,cat-food
          BigPetStore,storeCode_CA,3	everett,christensen,Fri Jan 23 23:47:54 EST 1970,7.5,cat-food
          BigPetStore,storeCode_CO,1	sandy,bernard,Fri Jan 09 20:15:59 EST 1970,30.1,antelope snacks
          BigPetStore,storeCode_CT,1	holly,o'neal,Sat Dec 27 08:44:02 EST 1969,10.5,dog-food
          BigPetStore,storeCode_NY,1	victor,wilson,Sun Jan 18 08:25:34 EST 1970,20.1,steel-leash
          BigPetStore,storeCode_NY,2	victor,wilson,Mon Dec 22 19:14:13 EST 1969,20.1,steel-leash
          BigPetStore,storeCode_OK,1	robbie,finley,Wed Jan 14 21:29:44 EST 1970,7.5,cat-food
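As a sanity check on the record layout shown above, the following standalone sketch parses the sample lines and tallies products (the field split is inferred from the sample output, not taken from the BigPetStore sources); the tally reproduces the product counts that appear in the Crunch result later in this log:

```python
from collections import Counter

# Sample records copied from the generated part-r-00000 output above.
records = [
    "BigPetStore,storeCode_AK,1\tjennifer,patrick,Tue Jan 06 20:48:34 EST 1970,19.1,fuzzy-collar",
    "BigPetStore,storeCode_AZ,1\tbillie,paul,Wed Dec 31 07:32:56 EST 1969,10.5,dog-food",
    "BigPetStore,storeCode_CA,1\tchristine,dunn,Wed Dec 17 12:51:16 EST 1969,7.5,cat-food",
    "BigPetStore,storeCode_CA,2\tbrent,kerr,Sat Dec 20 17:53:03 EST 1969,7.5,cat-food",
    "BigPetStore,storeCode_CA,3\teverett,christensen,Fri Jan 23 23:47:54 EST 1970,7.5,cat-food",
    "BigPetStore,storeCode_CO,1\tsandy,bernard,Fri Jan 09 20:15:59 EST 1970,30.1,antelope snacks",
    "BigPetStore,storeCode_CT,1\tholly,o'neal,Sat Dec 27 08:44:02 EST 1969,10.5,dog-food",
    "BigPetStore,storeCode_NY,1\tvictor,wilson,Sun Jan 18 08:25:34 EST 1970,20.1,steel-leash",
    "BigPetStore,storeCode_NY,2\tvictor,wilson,Mon Dec 22 19:14:13 EST 1969,20.1,steel-leash",
    "BigPetStore,storeCode_OK,1\trobbie,finley,Wed Jan 14 21:29:44 EST 1970,7.5,cat-food",
]

def parse(record):
    """Split one record into its apparent fields (layout inferred from the sample)."""
    key, value = record.split("\t")
    brand, store_code, txn_id = key.split(",")
    # The date contains no commas, so a plain comma split yields five fields.
    fname, lname, date, price, product = value.split(",")
    return {"store": store_code, "product": product, "price": float(price)}

product_counts = Counter(parse(r)["product"] for r in records)
print(product_counts)
# Counter({'cat-food': 4, 'dog-food': 2, 'steel-leash': 2, 'fuzzy-collar': 1, 'antelope snacks': 1})
```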
          crunch : Text(/tmp/BigPetStore1389190568881/generated/part-r-00000)  857
          14/01/08 09:16:10 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
          14/01/08 09:16:11 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
          14/01/08 09:16:11 INFO input.FileInputFormat: Total input paths to process : 1
          14/01/08 09:16:11 WARN snappy.LoadSnappy: Snappy native library not loaded
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating REDUCE in /tmp/mapred/local/archive/-6557058057938015743_-1411752611_1916134038/file/tmp/crunch-43589425/p2-work-6652890227012638320 with rwxr-xr-x
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/REDUCE as /tmp/mapred/local/archive/-6557058057938015743_-1411752611_1916134038/file/tmp/crunch-43589425/p2/REDUCE
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating COMBINE in /tmp/mapred/local/archive/-4649982853460913085_-948268024_1916134038/file/tmp/crunch-43589425/p2-work-7992331867547637651 with rwxr-xr-x
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/COMBINE as /tmp/mapred/local/archive/-4649982853460913085_-948268024_1916134038/file/tmp/crunch-43589425/p2/COMBINE
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating MAP in /tmp/mapred/local/archive/-4624020441112257_-678227803_1916134038/file/tmp/crunch-43589425/p2-work-6455515777437153419 with rwxr-xr-x
          14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/MAP as /tmp/mapred/local/archive/-4624020441112257_-678227803_1916134038/file/tmp/crunch-43589425/p2/MAP
          14/01/08 09:16:11 INFO jobcontrol.CrunchControlledJob: Running job "org.bigtop.bigpetstore.etl.CrunchETL: Text(/tmp/BigPetStore1389190568881/generated/part-r-00000... (1/1)"
          14/01/08 09:16:11 INFO jobcontrol.CrunchControlledJob: Job status available at: http://localhost:8080/
          14/01/08 09:16:11 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:11 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:11 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:11 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:11 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:11 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:11 INFO mapred.Task: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:11 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:11 INFO mapred.Task: Task 'attempt_local_0002_m_000000_0' done.
          14/01/08 09:16:11 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:11 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:11 INFO mapred.Merger: Merging 1 sorted segments
          14/01/08 09:16:11 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 76 bytes
          14/01/08 09:16:11 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:11 INFO mapred.Task: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:11 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:11 INFO mapred.Task: Task attempt_local_0002_r_000000_0 is allowed to commit now
          14/01/08 09:16:11 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to /tmp/crunch-43589425/p2/output
          14/01/08 09:16:11 INFO mapred.LocalJobRunner: reduce > reduce
          14/01/08 09:16:11 INFO mapred.Task: Task 'attempt_local_0002_r_000000_0' done.
          Crunch:::  {cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
           {cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
          14/01/08 09:16:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
          14/01/08 09:16:12 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
          14/01/08 09:16:12 INFO input.FileInputFormat: Total input paths to process : 1
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating REDUCE in /tmp/mapred/local/archive/1179138723816890415_2073868059_1916135038/file/tmp/crunch-43589425/p4-work--1244561190751016429 with rwxr-xr-x
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/REDUCE as /tmp/mapred/local/archive/1179138723816890415_2073868059_1916135038/file/tmp/crunch-43589425/p4/REDUCE
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating COMBINE in /tmp/mapred/local/archive/-5645924133996583671_-268209654_1916135038/file/tmp/crunch-43589425/p4-work-4417588436280480321 with rwxr-xr-x
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/COMBINE as /tmp/mapred/local/archive/-5645924133996583671_-268209654_1916135038/file/tmp/crunch-43589425/p4/COMBINE
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating MAP in /tmp/mapred/local/archive/5188044211822062668_-676380761_1916135038/file/tmp/crunch-43589425/p4-work-3041935114194345962 with rwxr-xr-x
          14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/MAP as /tmp/mapred/local/archive/5188044211822062668_-676380761_1916135038/file/tmp/crunch-43589425/p4/MAP
          14/01/08 09:16:12 INFO jobcontrol.CrunchControlledJob: Running job "org.bigtop.bigpetstore.etl.CrunchETL: Text(/tmp/BigPetStore1389190568881/generated/part-r-00000... (1/1)"
          14/01/08 09:16:12 INFO jobcontrol.CrunchControlledJob: Job status available at: http://localhost:8080/
          14/01/08 09:16:12 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:12 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:12 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:12 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:12 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:12 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:12 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:12 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:12 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0' done.
          14/01/08 09:16:12 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:12 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:12 INFO mapred.Merger: Merging 1 sorted segments
          14/01/08 09:16:12 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 76 bytes
          14/01/08 09:16:12 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:12 INFO mapred.Task: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:12 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:12 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is allowed to commit now
          14/01/08 09:16:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to /tmp/crunch-43589425/p4/output
          14/01/08 09:16:12 INFO mapred.LocalJobRunner: reduce > reduce
          14/01/08 09:16:12 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
          Crunch:::  {cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
          14/01/08 09:16:12 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: file:///
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
          id_details: {drop: NULL,code: NULL,transaction: NULL,lname: NULL,fname: NULL,date: NULL,price: NULL,product: chararray}
          {drop: NULL,code: NULL,transaction: NULL,lname: NULL,fname: NULL,date: NULL,price: NULL,product: chararray}
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
          uniqcnt: {product: chararray,count: long}
          Schema : {product: chararray,count: long}
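The uniqcnt schema printed above ({product: chararray, count: long}) is a per-product GROUP_BY/COUNT, consistent with the aliases the log lists later (transactions, transactionsG, uniqcnt). As a hedged illustration only (the Pig statements in the comments are a guess at the script's shape, not its actual source), the same aggregation in plain Python:

```python
from collections import Counter

# Product column extracted by hand from the sample transactions earlier in this log.
transactions = [
    "fuzzy-collar", "dog-food", "cat-food", "cat-food", "cat-food",
    "antelope snacks", "dog-food", "steel-leash", "steel-leash", "cat-food",
]

# Rough equivalent of the presumed Pig steps:
#   transactionsG = GROUP transactions BY product;
#   uniqcnt = FOREACH transactionsG GENERATE group, COUNT(transactions);
uniqcnt = sorted(Counter(transactions).items())
print(uniqcnt)
# [('antelope snacks', 1), ('cat-food', 4), ('dog-food', 2), ('fuzzy-collar', 1), ('steel-leash', 2)]
```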
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
          14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
          14/01/08 09:16:12 INFO pigstats.ScriptState: Pig features used in the script: GROUP_BY
          14/01/08 09:16:13 INFO optimizer.LogicalPlanOptimizer: {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, DuplicateForEachColumnRewrite, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
          14/01/08 09:16:13 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false
          14/01/08 09:16:13 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
          14/01/08 09:16:13 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
          14/01/08 09:16:13 INFO pigstats.ScriptState: Pig script settings are added to the job
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Setting up single store job
          14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Key [pig.schematuple] is false, will not generate code.
          14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Starting process to move generated code to distributed cache
          14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /var/folders/qd/n3xqkhkx5b37xb3p_npnsgp40000gn/T/1389190573212-0
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Reduce phase detected, estimating # of required reducers.
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
          14/01/08 09:16:13 INFO mapReduceLayer.InputSizeReducerEstimator: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=-1
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Could not estimate number of reducers and no requested or default parallelism set. Defaulting to 1 reducer.
          14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Setting Parallelism to 1
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.
          14/01/08 09:16:13 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
          14/01/08 09:16:13 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
          14/01/08 09:16:13 INFO input.FileInputFormat: Total input paths to process : 1
          14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths to process : 1
          14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths (combined) to process : 1
          14/01/08 09:16:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:13 INFO mapReduceLayer.PigRecordReader: Current split being processed file:/tmp/BigPetStore1389190568881/generated/part-r-00000:0+857
          14/01/08 09:16:13 INFO mapred.MapTask: io.sort.mb = 100
          14/01/08 09:16:13 INFO mapred.MapTask: data buffer = 79691776/99614720
          14/01/08 09:16:13 INFO mapred.MapTask: record buffer = 262144/327680
          14/01/08 09:16:13 INFO util.SpillableMemoryManager: first memory handler call- Usage threshold init = 65404928(63872K) used = 102924280(100511K) committed = 110362624(107776K) max = 110362624(107776K)
          14/01/08 09:16:13 INFO data.SchemaTupleBackend: Key [pig.schematuple] was not set... will not generate code.
          14/01/08 09:16:13 INFO mapReduceLayer.PigGenericMapReduce$Map: Aliases being processed per job phase (AliasName[line,offset]): M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C:  R: uniqcnt[5,11],sym[5,40],sym[5,40]
          14/01/08 09:16:13 INFO mapred.MapTask: Starting flush of map output
          14/01/08 09:16:13 INFO mapred.MapTask: Finished spill 0
          14/01/08 09:16:13 INFO mapred.Task: Task:attempt_local_0004_m_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:13 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:13 INFO mapred.Task: Task 'attempt_local_0004_m_000000_0' done.
          14/01/08 09:16:13 INFO mapred.Task:  Using ResourceCalculatorPlugin : null
          14/01/08 09:16:13 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:13 INFO mapred.Merger: Merging 1 sorted segments
          14/01/08 09:16:13 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1069 bytes
          14/01/08 09:16:13 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:13 WARN data.SchemaTupleBackend: SchemaTupleBackend has already been initialized
          14/01/08 09:16:13 INFO mapReduceLayer.PigMapReduce$Reduce: Aliases being processed per job phase (AliasName[line,offset]): M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C:  R: uniqcnt[5,11],sym[5,40],sym[5,40]
          14/01/08 09:16:13 INFO mapred.Task: Task:attempt_local_0004_r_000000_0 is done. And is in the process of commiting
          14/01/08 09:16:13 INFO mapred.LocalJobRunner: 
          14/01/08 09:16:13 INFO mapred.Task: Task attempt_local_0004_r_000000_0 is allowed to commit now
          14/01/08 09:16:13 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0004_r_000000_0' to file:/tmp/temp1086917246/tmp-1333463362
          14/01/08 09:16:13 INFO mapred.LocalJobRunner: reduce > reduce
          14/01/08 09:16:13 INFO mapred.Task: Task 'attempt_local_0004_r_000000_0' done.
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: HadoopJobId: job_local_0004
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: Processing aliases csvdata,id_details,sym,transactions,transactionsG,uniqcnt
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: detailed locations: M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C:  R: uniqcnt[5,11],sym[5,40],sym[5,40]
          14/01/08 09:16:13 WARN pigstats.PigStatsUtil: Failed to get RunningJob for job job_local_0004
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: 100% complete
          14/01/08 09:16:13 INFO pigstats.SimplePigStats: Detected Local mode. Stats reported below may be incomplete
          14/01/08 09:16:13 INFO pigstats.SimplePigStats: Script Statistics: 
          
          HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
          1.1.1	0.12.0	Jpeerindex	2014-01-08 09:16:13	2014-01-08 09:16:13	GROUP_BY
          
          Success!
          
          Job Stats (time in seconds):
          JobId	Alias	Feature	Outputs
          job_local_0004	csvdata,id_details,sym,transactions,transactionsG,uniqcnt	GROUP_BY	file:/tmp/temp1086917246/tmp-1333463362,
          
          Input(s):
          Successfully read records from: "/tmp/BigPetStore1389190568881/generated"
          
          Output(s):
          Successfully stored records in: "file:/tmp/temp1086917246/tmp-1333463362"
          
          Job DAG:
          job_local_0004
          
          
          14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: Success!
          14/01/08 09:16:13 WARN data.SchemaTupleBackend: SchemaTupleBackend has already been initialized
          14/01/08 09:16:13 INFO input.FileInputFormat: Total input paths to process : 1
          14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths to process : 1
          14/01/08 09:16:13 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
          14/01/08 09:16:13 INFO metastore.ObjectStore: ObjectStore, initialize called
          14/01/08 09:16:13 INFO util.SpillableMemoryManager: first memory handler call - Collection threshold init = 65404928(63872K) used = 102841632(100431K) committed = 110362624(107776K) max = 110362624(107776K)
          14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
          14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
          14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
          14/01/08 09:16:14 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
          14/01/08 09:16:14 INFO DataNucleus.Persistence: Property javax.jdo.option.NonTransactionalRead unknown - will be ignored
          14/01/08 09:16:14 INFO DataNucleus.Persistence: ================= Persistence Configuration ===============
          14/01/08 09:16:14 INFO DataNucleus.Persistence: DataNucleus Persistence Factory - Vendor: "DataNucleus"  Version: "2.0.3"
          14/01/08 09:16:14 INFO DataNucleus.Persistence: DataNucleus Persistence Factory initialised for datastore URL="jdbc:derby:;databaseName=/tmp/metastore/metastore_db;create=true" driver="org.apache.derby.jdbc.EmbeddedDriver" userName="APP"
          14/01/08 09:16:14 INFO DataNucleus.Persistence: ===========================================================
          14/01/08 09:16:14 INFO Datastore.Schema: Initialising Catalog "", Schema "APP" using "None" auto-start option
          14/01/08 09:16:14 INFO Datastore.Schema: Catalog "", Schema "APP" initialised - managing 0 classes
          14/01/08 09:16:15 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
          14/01/08 09:16:15 INFO DataNucleus.MetaData: Registering listener for metadata initialisation
          14/01/08 09:16:15 INFO metastore.ObjectStore: Initialized ObjectStore
          14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 28, column 6 : cvc-elt.1: Cannot find the declaration of element 'jdo'. - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
          14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 374, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
          14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 421, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
          14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 443, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
          [The same MetaData Parser warning is repeated for package.jdo lines 478, 515, 556, 597, 638, 683, 728, and 756.]
          14/01/08 09:16:15 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MDatabase [Table : DBS, InheritanceStrategy : new-table]
          14/01/08 09:16:15 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MDatabase.parameters [Table : DATABASE_PARAMS]
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 unique key(s) for table DBS
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 0 foreign key(s) for table DBS
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 index(es) for table DBS
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 1 unique key(s) for table DATABASE_PARAMS
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 1 foreign key(s) for table DATABASE_PARAMS
          14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 index(es) for table DATABASE_PARAMS
          14/01/08 09:16:15 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MDatabase
          Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_355588244.txt
          14/01/08 09:16:16 INFO exec.HiveHistory: Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_355588244.txt
          14/01/08 09:16:16 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809169065325454685237184.pipeout
          14/01/08 09:16:16 INFO service.HiveServer: Running the query: set hive.fetch.output.serde = org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          14/01/08 09:16:16 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809169065325454685237184.pipeout
          14/01/08 09:16:16 INFO service.HiveServer: Running the query: DROP TABLE hive_bigpetstore_etl
          14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=Driver.run>
          14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
          14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=compile>
          14/01/08 09:16:16 INFO parse.ParseDriver: Parsing command: DROP TABLE hive_bigpetstore_etl
          14/01/08 09:16:16 INFO parse.ParseDriver: Parse Completed
          14/01/08 09:16:16 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:16 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:16 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
          14/01/08 09:16:16 INFO metastore.ObjectStore: ObjectStore, initialize called
          14/01/08 09:16:16 INFO metastore.ObjectStore: Initialized ObjectStore
          14/01/08 09:16:16 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MColumnDescriptor [Table : CDS, InheritanceStrategy : new-table]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MSerDeInfo [Table : SERDES, InheritanceStrategy : new-table]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MStringList [Table : SKEWED_STRING_LIST, InheritanceStrategy : new-table]
          14/01/08 09:16:16 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MStorageDescriptor [Table : SDS, InheritanceStrategy : new-table]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTable [Table : TBLS, InheritanceStrategy : new-table]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MSerDeInfo.parameters [Table : SERDE_PARAMS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStringList.internalList [Table : SKEWED_STRING_LIST_VALUES]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MTable.parameters [Table : TABLE_PARAMS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MTable.partitionKeys [Table : PARTITION_KEYS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.bucketCols [Table : BUCKETING_COLS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.parameters [Table : SD_PARAMS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColNames [Table : SKEWED_COL_NAMES]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColValueLocationMaps [Table : SKEWED_COL_VALUE_LOC_MAP]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColValues [Table : SKEWED_VALUES]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.sortCols [Table : SORT_COLS]
          14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MColumnDescriptor.cols [Table : COLUMNS_V2]
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SERDES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table SERDES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table SERDES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_STRING_LIST
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table SKEWED_STRING_LIST
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table SKEWED_STRING_LIST
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 unique key(s) for table TBLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table TBLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 4 index(es) for table TBLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table CDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table CDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table CDS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SD_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SD_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SD_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_STRING_LIST_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SKEWED_STRING_LIST_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SKEWED_STRING_LIST_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table TABLE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table TABLE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table TABLE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table COLUMNS_V2
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table COLUMNS_V2
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table COLUMNS_V2
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SORT_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SORT_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SORT_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SKEWED_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SKEWED_VALUES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_KEYS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_KEYS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_KEYS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_COL_NAMES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SKEWED_COL_NAMES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SKEWED_COL_NAMES
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table BUCKETING_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table BUCKETING_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table BUCKETING_COLS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SERDE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SERDE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SERDE_PARAMS
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_COL_VALUE_LOC_MAP
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SKEWED_COL_VALUE_LOC_MAP
          14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SKEWED_COL_VALUE_LOC_MAP
          14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MColumnDescriptor
          14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MSerDeInfo
          14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MStorageDescriptor
          14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MTable
          14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MFieldSchema
          14/01/08 09:16:16 INFO ql.Driver: Semantic Analysis Completed
          14/01/08 09:16:16 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
          14/01/08 09:16:16 INFO ql.Driver: </PERFLOG method=compile start=1389190576175 end=1389190576979 duration=804>
          14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=Driver.execute>
          14/01/08 09:16:16 INFO ql.Driver: Starting command: DROP TABLE hive_bigpetstore_etl
          14/01/08 09:16:16 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190576175 end=1389190576997 duration=822>
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: drop_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=drop_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MIndex [Table : IDXS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MIndex.parameters [Table : INDEX_PARAMS]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 unique key(s) for table IDXS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 foreign key(s) for table IDXS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 5 index(es) for table IDXS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table INDEX_PARAMS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table INDEX_PARAMS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table INDEX_PARAMS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartition [Table : PARTITIONS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MPartition.parameters [Table : PARTITION_PARAMS]
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MPartition.values [Table : PARTITION_KEY_VALS]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 unique key(s) for table PARTITIONS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 foreign key(s) for table PARTITIONS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 4 index(es) for table PARTITIONS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_KEY_VALS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_KEY_VALS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_KEY_VALS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_PARAMS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_PARAMS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_PARAMS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTablePrivilege [Table : TBL_PRIVS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TBL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TBL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table TBL_PRIVS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege [Table : TBL_COL_PRIVS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TBL_COL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TBL_COL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table TBL_COL_PRIVS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartitionPrivilege [Table : PART_PRIVS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PART_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PART_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table PART_PRIVS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartitionColumnPrivilege [Table : PART_COL_PRIVS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PART_COL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PART_COL_PRIVS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table PART_COL_PRIVS
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
          14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTableColumnStatistics [Table : TAB_COL_STATS, InheritanceStrategy : new-table]
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TAB_COL_STATS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TAB_COL_STATS
          14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table TAB_COL_STATS
          14/01/08 09:16:17 INFO metastore.hivemetastoressimpl: deleting  file:/tmp/hive_bigpetstore_etl
          14/01/08 09:16:17 INFO metastore.hivemetastoressimpl: Deleted the diretory file:/tmp/hive_bigpetstore_etl
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190576979 end=1389190577783 duration=804>
          OK
          14/01/08 09:16:17 INFO ql.Driver: OK
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=releaseLocks>
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190577783 end=1389190577783 duration=0>
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190576175 end=1389190577783 duration=1608>
          14/01/08 09:16:17 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:null, properties:null)
          CREATE TABLE hive_bigpetstore_etl (  state STRING,  trans_id STRING,  lname STRING,  fname STRING,  date STRING,  price STRING,  product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES  ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE
          14/01/08 09:16:17 INFO service.HiveServer: Running the query: CREATE TABLE hive_bigpetstore_etl (  state STRING,  trans_id STRING,  lname STRING,  fname STRING,  date STRING,  price STRING,  product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES  ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.run>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=compile>
          14/01/08 09:16:17 INFO parse.ParseDriver: Parsing command: CREATE TABLE hive_bigpetstore_etl (  state STRING,  trans_id STRING,  lname STRING,  fname STRING,  date STRING,  price STRING,  product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES  ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE
          14/01/08 09:16:17 INFO parse.ParseDriver: Parse Completed
          14/01/08 09:16:17 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
          14/01/08 09:16:17 INFO parse.SemanticAnalyzer: Creating table hive_bigpetstore_etl position=13
          14/01/08 09:16:17 INFO ql.Driver: Semantic Analysis Completed
          14/01/08 09:16:17 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=compile start=1389190577784 end=1389190577810 duration=26>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.execute>
          14/01/08 09:16:17 INFO ql.Driver: Starting command: CREATE TABLE hive_bigpetstore_etl (  state STRING,  trans_id STRING,  lname STRING,  fname STRING,  date STRING,  price STRING,  product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES  ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190577784 end=1389190577812 duration=28>
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: create_table: Table(tableName:hive_bigpetstore_etl, dbName:default, owner:Jpeerindex, createTime:1389190577, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:state, type:string, comment:null), FieldSchema(name:trans_id, type:string, comment:null), FieldSchema(name:lname, type:string, comment:null), FieldSchema(name:fname, type:string, comment:null), FieldSchema(name:date, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), FieldSchema(name:product, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.contrib.serde2.RegexSerDe, parameters:{output.format.string=%1$s %2$s %3$s %4$s %5$s, serialization.format=1, input.regex=(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null))
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=create_table: Table(tableName:hive_bigpetstore_etl, dbName:default, owner:Jpeerindex, createTime:1389190577, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:state, type:string, comment:null), FieldSchema(name:trans_id, type:string, comment:null), FieldSchema(name:lname, type:string, comment:null), FieldSchema(name:fname, type:string, comment:null), FieldSchema(name:date, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), FieldSchema(name:product, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.contrib.serde2.RegexSerDe, parameters:{output.format.string=%1$s %2$s %3$s %4$s %5$s, serialization.format=1, input.regex=(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:	)([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null))	
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190577810 end=1389190577921 duration=111>
          OK
          14/01/08 09:16:17 INFO ql.Driver: OK
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=releaseLocks>
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190577921 end=1389190577921 duration=0>
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190577784 end=1389190577921 duration=137>
          14/01/08 09:16:17 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:null, properties:null)
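          As a reading aid for the CREATE TABLE step above: the RegexSerDe `input.regex` splits each raw generated line into the seven columns of `hive_bigpetstore_etl`. The sketch below checks that pattern against a hand-written record (the record and all field values are hypothetical, standing in for output of the BigPetStore data generator; the `(?:\t)` group is the literal tab that appears inside the regex in the log):

```python
import re

# Hypothetical sample record, illustrative only; real records come from
# the BigPetStore generator.
sample = "BigPetStore,storeCode_AZ,1\tjay,vyas,Mon Jan 06 2014,10.5,dog-food"

# Same pattern as the RegexSerDe "input.regex" in the CREATE TABLE above;
# the seven capture groups map to state, trans_id, lname, fname, date,
# price, and product.
pattern = re.compile(
    r"(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:\t)"
    r"([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*"
)

match = pattern.match(sample)
state, trans_id, lname, fname, date, price, product = match.groups()
print(state, trans_id, lname, fname, date, price, product)
```

Note that every column is declared STRING, so type conversion (e.g. of `price`) is deferred to the queries that read the table.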
          14/01/08 09:16:17 INFO service.HiveServer: Running the query: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.run>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=compile>
          14/01/08 09:16:17 INFO parse.ParseDriver: Parsing command: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl
          14/01/08 09:16:17 INFO parse.ParseDriver: Parse Completed
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO ql.Driver: Semantic Analysis Completed
          14/01/08 09:16:17 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=compile start=1389190577922 end=1389190577951 duration=29>
          14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.execute>
          14/01/08 09:16:17 INFO ql.Driver: Starting command: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl
          14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190577922 end=1389190577953 duration=31>
          Loading data to table default.hive_bigpetstore_etl
          14/01/08 09:16:17 INFO exec.Task: Loading data to table default.hive_bigpetstore_etl from file:/tmp/BigPetStore1389190568881/generated
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO exec.StatsTask: Executing stats task
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_database: default
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_database: default	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          Table default.hive_bigpetstore_etl stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 857, raw_data_size: 0]
          14/01/08 09:16:18 INFO exec.Task: Table default.hive_bigpetstore_etl stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 857, raw_data_size: 0]
          14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190577951 end=1389190578133 duration=182>
          OK
          14/01/08 09:16:18 INFO ql.Driver: OK
          14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=releaseLocks>
          14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190578133 end=1389190578134 duration=1>
          14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190577922 end=1389190578134 duration=212>
          14/01/08 09:16:18 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:null, properties:null)
          Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_312988160.txt
          14/01/08 09:16:18 INFO exec.HiveHistory: Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_312988160.txt
          14/01/08 09:16:18 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809163997506011824464437.pipeout
          14/01/08 09:16:18 INFO service.HiveServer: Running the query: set hive.fetch.output.serde = org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
          14/01/08 09:16:18 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809163997506011824464437.pipeout
          14/01/08 09:16:18 INFO service.HiveServer: Running the query: select product,count(*) as cnt from hive_bigpetstore_etl group by product
          14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=Driver.run>
          14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
          14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=compile>
          14/01/08 09:16:18 INFO parse.ParseDriver: Parsing command: select product,count(*) as cnt from hive_bigpetstore_etl group by product
          14/01/08 09:16:18 INFO parse.ParseDriver: Parse Completed
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for source tables
          14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
          14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex	ip=unknown-ip-addr	cmd=get_table : db=default tbl=hive_bigpetstore_etl	
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for subqueries
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for destination tables
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for FS(6)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for SEL(5)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for GBY(4)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for RS(3)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for GBY(2)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for SEL(1)
          14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for TS(0)
          14/01/08 09:16:18 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
          14/01/08 09:16:18 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
          14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed plan generation
          14/01/08 09:16:18 INFO ql.Driver: Semantic Analysis Completed
          14/01/08 09:16:18 INFO exec.ListSinkOperator: Initializing Self 7 OP
          14/01/08 09:16:18 INFO exec.ListSinkOperator: Operator 7 OP initialized
          14/01/08 09:16:18 INFO exec.ListSinkOperator: Initialization Done 7 OP
          14/01/08 09:16:18 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:product, type:string, comment:null), FieldSchema(name:cnt, type:bigint, comment:null)], properties:null)
          14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=compile start=1389190578171 end=1389190578435 duration=264>
          14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=Driver.execute>
          14/01/08 09:16:18 INFO ql.Driver: Starting command: select product,count(*) as cnt from hive_bigpetstore_etl group by product
          Total MapReduce jobs = 1
          14/01/08 09:16:18 INFO ql.Driver: Total MapReduce jobs = 1
          14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190578171 end=1389190578438 duration=267>
          Launching Job 1 out of 1
          14/01/08 09:16:18 INFO ql.Driver: Launching Job 1 out of 1
          14/01/08 09:16:18 INFO exec.Utilities: Cache Content Summary for file:/tmp/hive_bigpetstore_etl length: 857 file count: 2 directory count: 1
          14/01/08 09:16:18 INFO exec.ExecDriver: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=857
          Number of reduce tasks not specified. Estimated from input data size: 1
          14/01/08 09:16:18 INFO exec.Task: Number of reduce tasks not specified. Estimated from input data size: 1
          In order to change the average load for a reducer (in bytes):
          14/01/08 09:16:18 INFO exec.Task: In order to change the average load for a reducer (in bytes):
            set hive.exec.reducers.bytes.per.reducer=<number>
          14/01/08 09:16:18 INFO exec.Task:   set hive.exec.reducers.bytes.per.reducer=<number>
          In order to limit the maximum number of reducers:
          14/01/08 09:16:18 INFO exec.Task: In order to limit the maximum number of reducers:
            set hive.exec.reducers.max=<number>
          14/01/08 09:16:18 INFO exec.Task:   set hive.exec.reducers.max=<number>
          In order to set a constant number of reducers:
          14/01/08 09:16:18 INFO exec.Task: In order to set a constant number of reducers:
            set mapred.reduce.tasks=<number>
          14/01/08 09:16:18 INFO exec.Task:   set mapred.reduce.tasks=<number>
          14/01/08 09:16:18 INFO exec.ExecDriver: Generating plan file file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10003/plan.xml
          14/01/08 09:16:18 INFO exec.ExecDriver: Executing: ./hadoop-1.2.1/bin/hadoop jar /Users/Jpeerindex/.m2/repository/org/apache/hive/hive-exec/0.11.0/hive-exec-0.11.0.jar org.apache.hadoop.hive.ql.exec.ExecDriver  -plan file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10003/plan.xml   -jobconffile file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10002/jobconf.xml
          Warning: $HADOOP_HOME is deprecated.
          
          Execution log at: /tmp/Jpeerindex/hive.log
          2014-01-08 09:16:20.585 java[24070:1903] Unable to load realm info from SCDynamicStore
          Job running in-process (local Hadoop)
          Hadoop job information for null: number of mappers: 0; number of reducers: 0
          2014-01-08 09:16:22,185 null map = 0%,  reduce = 100%
          Ended Job = job_local1258421034_0001
          Execution completed successfully
          14/01/08 09:16:22 INFO exec.Task: Execution completed successfully
          Mapred Local Task Succeeded . Convert the Join into MapJoin
          14/01/08 09:16:22 INFO exec.Task: Mapred Local Task Succeeded . Convert the Join into MapJoin
          14/01/08 09:16:22 INFO exec.ExecDriver: Execution completed successfully
          14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190578436 end=1389190582579 duration=4143>
          OK
          14/01/08 09:16:22 INFO ql.Driver: OK
          14/01/08 09:16:22 INFO ql.Driver: <PERFLOG method=releaseLocks>
          14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190582579 end=1389190582579 duration=0>
          14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190578171 end=1389190582579 duration=4408>
          14/01/08 09:16:22 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:[FieldSchema(name:product, type:string, comment:null), FieldSchema(name:cnt, type:bigint, comment:null)], properties:null)
          14/01/08 09:16:22 INFO mapred.FileInputFormat: Total input paths to process : 1
          crunch:{cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
          pig:{cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
          hive:{null=1, cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=1, steel-leash=2}
          Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.777 sec
          
          Results :
          
          Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
          
          [WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
          [INFO] 
          [INFO] --- maven-failsafe-plugin:2.12:verify (integration-tests) @ hadoop-examples ---
          [INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
          [WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
          [INFO] 
          [INFO] --- maven-install-plugin:2.4:install (default-install) @ hadoop-examples ---
          [INFO] Installing /Users/Jpeerindex/Development/bigpetstore/target/hadoop-examples-1.0-SNAPSHOT.jar to /Users/Jpeerindex/.m2/repository/jay/rhbd/hadoop-examples/1.0-SNAPSHOT/hadoop-examples-1.0-SNAPSHOT.jar
          [INFO] Installing /Users/Jpeerindex/Development/bigpetstore/pom.xml to /Users/Jpeerindex/.m2/repository/jay/rhbd/hadoop-examples/1.0-SNAPSHOT/hadoop-examples-1.0-SNAPSHOT.pom
          [INFO] 
          [INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-examples ---
          [INFO] Using 'UTF-8' encoding to copy filtered resources.
          [INFO] Copying 4 resources
          [INFO] 
          [INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hadoop-examples ---
          [INFO] Nothing to compile - all classes are up to date
          [INFO] 
          [INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-integration-test-sources) @ hadoop-examples ---
          [INFO] Test Source directory: /Users/Jpeerindex/Development/bigpetstore/src/integration/java added.
          [INFO] 
          [INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-examples ---
          [INFO] Using 'UTF-8' encoding to copy filtered resources.
          [INFO] Copying 1 resource
          [INFO] 
          [INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-examples ---
          [INFO] Nothing to compile - all classes are up to date
          [INFO] 
          [INFO] --- maven-surefire-plugin:2.12:test (default-test) @ hadoop-examples ---
          [INFO] Skipping execution of surefire because it has already been run for this configuration
          [INFO] 
          [INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hadoop-examples ---
          [INFO] 
          [INFO] --- maven-failsafe-plugin:2.12:integration-test (integration-tests) @ hadoop-examples ---
          [INFO] Skipping execution of surefire because it has already been run for this configuration
          [INFO] 
          [INFO] --- maven-failsafe-plugin:2.12:verify (integration-tests) @ hadoop-examples ---
          [INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
          [WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESS
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 20.054s
          [INFO] Finished at: Wed Jan 08 09:16:22 EST 2014
          [INFO] Final Memory: 24M/81M
          [INFO] ------------------------------------------------------------------------
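As a side note on the Hive ETL step in the log above: the hive_bigpetstore_etl table parses the generated records with RegexSerDe, using the input.regex visible in the create_table audit line. The pattern can be sanity-checked outside Hive; the sketch below is purely illustrative (parse_record, COLUMNS, and the sample string are this note's own names, not BigPetStore code), with the regex and record text copied from the log. When a record doesn't match, RegexSerDe emits NULL columns, which is consistent with the stray null row in the hive result above.

```python
import re

# input.regex copied from the hive_bigpetstore_etl table definition;
# \t stands for the literal tab character that appears in the log.
PATTERN = re.compile(
    r"(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?:\t)"
    r"([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*"
)

# Capture groups map positionally onto the table's declared columns.
COLUMNS = ["state", "trans_id", "lname", "fname", "date", "price", "product"]

def parse_record(line):
    """Split one generated BigPetStore line into the ETL table's columns;
    returns None for a non-matching line (what Hive would surface as NULLs)."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return dict(zip(COLUMNS, m.groups()))

sample = ("BigPetStore,storeCode_AK,2\t"
          "rick,kemp,Sun Dec 14 00:51:07 EST 1969,7.5,cat-food")
print(parse_record(sample))
```

One thing this makes easy to see: the regex pins the two-letter state code and the tab separator exactly, so any change to the generator's record layout silently turns into NULL rows on the Hive side rather than a hard failure.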
          
jay vyas added a comment - edited

Thanks for catching that; there was some dead code in there causing failures for "clean install". I've just pushed a patch for this: https://github.com/jayunit100/bigpetstore/commit/f2d72e8a923eee131d198001cf7b767d2aecfa07. Now it should work. This is preliminary, and I'm not ready to put the patch in just yet: I'd like to set up a reproducible (i.e. Vagrant) deployment of it first, to be confident that it works on more than just my own setup.

jays-MacBook-Pro:bigpetstore Jpeerindex$ mvn clean install verify
[INFO] Scanning for projects...
[WARNING] 
[WARNING] Some problems were encountered while building the effective model for jay.rhbd:hadoop-examples:jar:1.0-SNAPSHOT
[WARNING] 'dependencies.dependency.(groupId:artifactId:type:classifier)' must be unique: org.apache.hive:hive-contrib:jar -> version ${hive.version} vs 0.11.0 @ line 183, column 15
[WARNING] 
[WARNING] It is highly recommended to fix these problems because they threaten the stability of your build.
[WARNING] 
[WARNING] For this reason, future Maven versions might no longer support building such malformed projects.
[WARNING] 
[INFO] 
[INFO] ------------------------------------------------------------------------
[INFO] Building hadoop-examples 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[WARNING] Could not transfer metadata asm:asm/maven-metadata.xml from/to local.repository (file:../../local.repository/trunk): No connector available to access repository local.repository (file:../../local.repository/trunk) of type legacy using the available factories WagonRepositoryConnectorFactory
[WARNING] The POM for javax.jdo:jdo2-api:jar:2.3-ec is missing, no dependency information available
[INFO] 
[INFO] --- maven-clean-plugin:2.5:clean (default-clean) @ hadoop-examples ---
[INFO] Deleting /Users/Jpeerindex/Development/bigpetstore/target
[INFO] 
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hadoop-examples ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 16 source files to /Users/Jpeerindex/Development/bigpetstore/target/classes
[WARNING] error: error reading /Users/Jpeerindex/.m2/repository/it/unimi/dsi/fastutil/6.5.7/fastutil-6.5.7.jar; cannot read zip file entry
[WARNING] Note: /Users/Jpeerindex/Development/bigpetstore/src/main/java/org/bigtop/bigpetstore/clustering/Mh1.java uses or overrides a deprecated API.
[WARNING] Note: Recompile with -Xlint:deprecation for details.
[WARNING] Note: Some input files use unchecked or unsafe operations.
[WARNING] Note: Recompile with -Xlint:unchecked for details.
[INFO] 
[INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-integration-test-sources) @ hadoop-examples ---
[INFO] Test Source directory: /Users/Jpeerindex/Development/bigpetstore/src/integration/java added.
[INFO] 
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO] 
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-examples ---
[INFO] Changes detected - recompiling the module!
[INFO] Compiling 2 source files to /Users/Jpeerindex/Development/bigpetstore/target/test-classes
[WARNING] error: error reading /Users/Jpeerindex/.m2/repository/it/unimi/dsi/fastutil/6.5.7/fastutil-6.5.7.jar; cannot read zip file entry
[INFO] 
[INFO] --- maven-surefire-plugin:2.12:test (default-test) @ hadoop-examples ---
[INFO] Surefire report directory: /Users/Jpeerindex/Development/bigpetstore/target/surefire-reports
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.bigtop.bigpetstore.generator.TestPetStoreTransactionGeneratorJob
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-jcl/1.7.5/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
memory : 1047
2014-01-08 09:16:06.740 java[24030:1903] Unable to load realm info from SCDynamicStore
14/01/08 09:16:06 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/08 09:16:06 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/08 09:16:06 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
AZ _ 2
AK _ 2
CT _ 2
OK _ 2
CO _ 2
CA _ 6
NY _ 4
14/01/08 09:16:07 INFO mapred.JobClient: Running job: job_local_0001
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000002_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000002_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000003_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000004_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000004_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000005_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000005_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:07 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:07 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:07 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:07 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_m_000006_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_m_000006_0' done.
14/01/08 09:16:07 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Merger: Merging 7 sorted segments
14/01/08 09:16:07 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 1758 bytes
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
14/01/08 09:16:07 INFO mapred.LocalJobRunner: 
14/01/08 09:16:07 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
14/01/08 09:16:07 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to petstoredata/Wed Jan 08 09:16:06 EST 2014
14/01/08 09:16:07 INFO mapred.LocalJobRunner: reduce > reduce
14/01/08 09:16:07 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
14/01/08 09:16:08 INFO mapred.JobClient:  map 100% reduce 100%
14/01/08 09:16:08 INFO mapred.JobClient: Job complete: job_local_0001
14/01/08 09:16:08 INFO mapred.JobClient: Counters: 17
14/01/08 09:16:08 INFO mapred.JobClient:   File Output Format Counters 
14/01/08 09:16:08 INFO mapred.JobClient:     Bytes Written=1728
14/01/08 09:16:08 INFO mapred.JobClient:   FileSystemCounters
14/01/08 09:16:08 INFO mapred.JobClient:     FILE_BYTES_READ=20254
14/01/08 09:16:08 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=296259
14/01/08 09:16:08 INFO mapred.JobClient:   File Input Format Counters 
14/01/08 09:16:08 INFO mapred.JobClient:     Bytes Read=0
14/01/08 09:16:08 INFO mapred.JobClient:   Map-Reduce Framework
14/01/08 09:16:08 INFO mapred.JobClient:     Map output materialized bytes=1786
14/01/08 09:16:08 INFO mapred.JobClient:     Map input records=20
14/01/08 09:16:08 INFO mapred.JobClient:     Reduce shuffle bytes=0
14/01/08 09:16:08 INFO mapred.JobClient:     Spilled Records=40
14/01/08 09:16:08 INFO mapred.JobClient:     Map output bytes=1704
14/01/08 09:16:08 INFO mapred.JobClient:     Total committed heap usage (bytes)=8482979840
14/01/08 09:16:08 INFO mapred.JobClient:     SPLIT_RAW_BYTES=497
14/01/08 09:16:08 INFO mapred.JobClient:     Combine input records=0
14/01/08 09:16:08 INFO mapred.JobClient:     Reduce input records=20
14/01/08 09:16:08 INFO mapred.JobClient:     Reduce input groups=20
14/01/08 09:16:08 INFO mapred.JobClient:     Combine output records=0
14/01/08 09:16:08 INFO mapred.JobClient:     Reduce output records=20
14/01/08 09:16:08 INFO mapred.JobClient:     Map output records=20
===>BigPetStore,storeCode_AK,1	rick,kemp,Thu Jan 15 23:08:51 EST 1970,19.1,fuzzy-collar
===>BigPetStore,storeCode_AK,2	rick,kemp,Sun Dec 14 00:51:07 EST 1969,7.5,cat-food
===>BigPetStore,storeCode_AZ,1	aaron,medina,Fri Jan 23 03:01:17 EST 1970,25.1,leather-collar
===>BigPetStore,storeCode_AZ,2	aaron,medina,Thu Jan 08 21:33:52 EST 1970,10.5,dog-food
===>BigPetStore,storeCode_CA,1	elizabeth,suarez,Fri Dec 26 17:43:10 EST 1969,7.5,cat-food
===>BigPetStore,storeCode_CA,2	elizabeth,suarez,Wed Jan 21 00:53:53 EST 1970,11.75,fish-food
===>BigPetStore,storeCode_CA,3	elizabeth,suarez,Mon Jan 12 00:21:28 EST 1970,11.75,fish-food
===>BigPetStore,storeCode_CA,4	elizabeth,suarez,Fri Jan 16 08:24:05 EST 1970,11.75,fish-food
===>BigPetStore,storeCode_CA,5	elizabeth,suarez,Sun Dec 07 04:32:44 EST 1969,10.5,dog-food
===>BigPetStore,storeCode_CA,6	elizabeth,suarez,Thu Dec 11 19:48:25 EST 1969,16.5,organic-dog-food
===>BigPetStore,storeCode_CO,1	donna,melendez,Mon Jan 19 13:41:48 EST 1970,10.5,dog-food
===>BigPetStore,storeCode_CO,2	curtis,jackson,Mon Dec 29 04:32:17 EST 1969,15.1,choke-collar
===>BigPetStore,storeCode_CT,1	ray,mueller,Tue Jan 13 06:24:08 EST 1970,10.5,dog-food
===>BigPetStore,storeCode_CT,2	ray,mueller,Wed Jan 21 10:04:42 EST 1970,7.5,cat-food
===>BigPetStore,storeCode_NY,1	sam,curry,Sun Jan 04 18:32:32 EST 1970,19.75,fish-food
===>BigPetStore,storeCode_NY,2	raymond,beck,Sat Dec 13 00:15:28 EST 1969,7.5,cat-food
===>BigPetStore,storeCode_NY,3	raymond,beck,Tue Jan 13 15:03:54 EST 1970,20.1,steel-leash
===>BigPetStore,storeCode_NY,4	raymond,beck,Sat Dec 27 11:54:25 EST 1969,7.5,cat-food
===>BigPetStore,storeCode_OK,1	gage,frost,Thu Dec 25 09:34:31 EST 1969,40.1,rodent-cage
===>BigPetStore,storeCode_OK,2	aaron,ross,Sat Dec 20 19:21:20 EST 1969,10.5,dog-food
14/01/08 09:16:08 INFO generator.TestPetStoreTransactionGeneratorJob: Created 20 , file was 1704 bytes.
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.698 sec

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] 
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hadoop-examples ---
[INFO] Building jar: /Users/Jpeerindex/Development/bigpetstore/target/hadoop-examples-1.0-SNAPSHOT.jar
[INFO] 
[INFO] --- maven-failsafe-plugin:2.12:integration-test (integration-tests) @ hadoop-examples ---
[INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
-------------------------------------------------------
 T E S T S
-------------------------------------------------------
Running org.bigtop.bigpetstore.integration.ITBigPetStore
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-jcl/1.7.5/slf4j-jcl-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/Users/Jpeerindex/.m2/repository/org/slf4j/slf4j-log4j12/1.6.1/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
2014-01-08 09:16:09.195 java[24038:1903] Unable to load realm info from SCDynamicStore
14/01/08 09:16:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
14/01/08 09:16:09 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/08 09:16:09 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
AZ _ 1
AK _ 1
CT _ 1
OK _ 1
CO _ 1
CA _ 3
NY _ 2
14/01/08 09:16:09 INFO mapred.JobClient: Running job: job_local_0001
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000000_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000000_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000001_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000001_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000002_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000002_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000003_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000003_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000004_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000004_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000005_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000005_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:09 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:09 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:09 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:09 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_m_000006_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_m_000006_0' done.
14/01/08 09:16:09 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Merger: Merging 7 sorted segments
14/01/08 09:16:09 INFO mapred.Merger: Down to the last merge-pass, with 7 segments left of total size: 891 bytes
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task:attempt_local_0001_r_000000_0 is done. And is in the process of commiting
14/01/08 09:16:09 INFO mapred.LocalJobRunner:
14/01/08 09:16:09 INFO mapred.Task: Task attempt_local_0001_r_000000_0 is allowed to commit now
14/01/08 09:16:09 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0001_r_000000_0' to /tmp/BigPetStore1389190568881/generated
14/01/08 09:16:09 INFO mapred.LocalJobRunner: reduce > reduce
14/01/08 09:16:09 INFO mapred.Task: Task 'attempt_local_0001_r_000000_0' done.
14/01/08 09:16:10 INFO mapred.JobClient: map 100% reduce 100%
14/01/08 09:16:10 INFO mapred.JobClient: Job complete: job_local_0001
14/01/08 09:16:10 INFO mapred.JobClient: Counters: 17
14/01/08 09:16:10 INFO mapred.JobClient:   File Output Format Counters
14/01/08 09:16:10 INFO mapred.JobClient:     Bytes Written=873
14/01/08 09:16:10 INFO mapred.JobClient:   FileSystemCounters
14/01/08 09:16:10 INFO mapred.JobClient:     FILE_BYTES_READ=19387
14/01/08 09:16:10 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=291694
14/01/08 09:16:10 INFO mapred.JobClient:   File Input Format Counters
14/01/08 09:16:10 INFO mapred.JobClient:     Bytes Read=0
14/01/08 09:16:10 INFO mapred.JobClient:   Map-Reduce Framework
14/01/08 09:16:10 INFO mapred.JobClient:     Map output materialized bytes=919
14/01/08 09:16:10 INFO mapred.JobClient:     Map input records=10
14/01/08 09:16:10 INFO mapred.JobClient:     Reduce shuffle bytes=0
14/01/08 09:16:10 INFO mapred.JobClient:     Spilled Records=20
14/01/08 09:16:10 INFO mapred.JobClient:     Map output bytes=857
14/01/08 09:16:10 INFO mapred.JobClient:     Total committed heap usage (bytes)=1039663104
14/01/08 09:16:10 INFO mapred.JobClient:     SPLIT_RAW_BYTES=497
14/01/08 09:16:10 INFO mapred.JobClient:     Combine input records=0
14/01/08 09:16:10 INFO mapred.JobClient:     Reduce input records=10
14/01/08 09:16:10 INFO mapred.JobClient:     Reduce input groups=10
14/01/08 09:16:10 INFO mapred.JobClient:     Combine output records=0
14/01/08 09:16:10 INFO mapred.JobClient:     Reduce output records=10
14/01/08 09:16:10 INFO mapred.JobClient:     Map output records=10
output : /tmp/BigPetStore1389190568881/generated/part-r-00000
BigPetStore,storeCode_AK,1 jennifer,patrick,Tue Jan 06 20:48:34 EST 1970,19.1,fuzzy-collar
BigPetStore,storeCode_AZ,1 billie,paul,Wed Dec 31 07:32:56 EST 1969,10.5,dog-food
BigPetStore,storeCode_CA,1 christine,dunn,Wed Dec 17 12:51:16 EST 1969,7.5,cat-food
BigPetStore,storeCode_CA,2 brent,kerr,Sat Dec 20 17:53:03 EST 1969,7.5,cat-food
BigPetStore,storeCode_CA,3 everett,christensen,Fri Jan 23 23:47:54 EST 1970,7.5,cat-food
BigPetStore,storeCode_CO,1 sandy,bernard,Fri Jan 09 20:15:59 EST 1970,30.1,antelope snacks
BigPetStore,storeCode_CT,1 holly,o'neal,Sat Dec 27 08:44:02 EST 1969,10.5,dog-food
BigPetStore,storeCode_NY,1 victor,wilson,Sun Jan 18 08:25:34 EST 1970,20.1,steel-leash
BigPetStore,storeCode_NY,2 victor,wilson,Mon Dec 22 19:14:13 EST 1969,20.1,steel-leash
BigPetStore,storeCode_OK,1 robbie,finley,Wed Jan 14 21:29:44 EST 1970,7.5,cat-food
crunch : Text(/tmp/BigPetStore1389190568881/generated/part-r-00000) 857
14/01/08 09:16:10 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/08 09:16:11 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/08 09:16:11 INFO input.FileInputFormat: Total input paths to process : 1
14/01/08 09:16:11 WARN snappy.LoadSnappy: Snappy native library not loaded
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating REDUCE in /tmp/mapred/local/archive/-6557058057938015743_-1411752611_1916134038/file/tmp/crunch-43589425/p2-work-6652890227012638320 with rwxr-xr-x
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/REDUCE as /tmp/mapred/local/archive/-6557058057938015743_-1411752611_1916134038/file/tmp/crunch-43589425/p2/REDUCE
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/REDUCE as /tmp/mapred/local/archive/-6557058057938015743_-1411752611_1916134038/file/tmp/crunch-43589425/p2/REDUCE
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating COMBINE in /tmp/mapred/local/archive/-4649982853460913085_-948268024_1916134038/file/tmp/crunch-43589425/p2-work-7992331867547637651 with rwxr-xr-x
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/COMBINE as /tmp/mapred/local/archive/-4649982853460913085_-948268024_1916134038/file/tmp/crunch-43589425/p2/COMBINE
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/COMBINE as /tmp/mapred/local/archive/-4649982853460913085_-948268024_1916134038/file/tmp/crunch-43589425/p2/COMBINE
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Creating MAP in /tmp/mapred/local/archive/-4624020441112257_-678227803_1916134038/file/tmp/crunch-43589425/p2-work-6455515777437153419 with rwxr-xr-x
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/MAP as /tmp/mapred/local/archive/-4624020441112257_-678227803_1916134038/file/tmp/crunch-43589425/p2/MAP
14/01/08 09:16:11 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p2/MAP as /tmp/mapred/local/archive/-4624020441112257_-678227803_1916134038/file/tmp/crunch-43589425/p2/MAP
14/01/08 09:16:11 INFO jobcontrol.CrunchControlledJob: Running job "org.bigtop.bigpetstore.etl.CrunchETL: Text(/tmp/BigPetStore1389190568881/generated/part-r-00000... (1/1)"
14/01/08 09:16:11 INFO jobcontrol.CrunchControlledJob: Job status available at: http://localhost:8080/
14/01/08 09:16:11 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:11 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:11 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:11 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:11 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:11 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:11 INFO mapred.Task: Task:attempt_local_0002_m_000000_0 is done. And is in the process of commiting
14/01/08 09:16:11 INFO mapred.LocalJobRunner:
14/01/08 09:16:11 INFO mapred.Task: Task 'attempt_local_0002_m_000000_0' done.
14/01/08 09:16:11 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:11 INFO mapred.LocalJobRunner:
14/01/08 09:16:11 INFO mapred.Merger: Merging 1 sorted segments
14/01/08 09:16:11 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 76 bytes
14/01/08 09:16:11 INFO mapred.LocalJobRunner:
14/01/08 09:16:11 INFO mapred.Task: Task:attempt_local_0002_r_000000_0 is done. And is in the process of commiting
14/01/08 09:16:11 INFO mapred.LocalJobRunner:
14/01/08 09:16:11 INFO mapred.Task: Task attempt_local_0002_r_000000_0 is allowed to commit now
14/01/08 09:16:11 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0002_r_000000_0' to /tmp/crunch-43589425/p2/output
14/01/08 09:16:11 INFO mapred.LocalJobRunner: reduce > reduce
14/01/08 09:16:11 INFO mapred.Task: Task 'attempt_local_0002_r_000000_0' done.
Crunch::: {cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
{cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
14/01/08 09:16:11 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/08 09:16:12 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/08 09:16:12 INFO input.FileInputFormat: Total input paths to process : 1
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating REDUCE in /tmp/mapred/local/archive/1179138723816890415_2073868059_1916135038/file/tmp/crunch-43589425/p4-work--1244561190751016429 with rwxr-xr-x
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/REDUCE as /tmp/mapred/local/archive/1179138723816890415_2073868059_1916135038/file/tmp/crunch-43589425/p4/REDUCE
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/REDUCE as /tmp/mapred/local/archive/1179138723816890415_2073868059_1916135038/file/tmp/crunch-43589425/p4/REDUCE
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating COMBINE in /tmp/mapred/local/archive/-5645924133996583671_-268209654_1916135038/file/tmp/crunch-43589425/p4-work-4417588436280480321 with rwxr-xr-x
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/COMBINE as /tmp/mapred/local/archive/-5645924133996583671_-268209654_1916135038/file/tmp/crunch-43589425/p4/COMBINE
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/COMBINE as /tmp/mapred/local/archive/-5645924133996583671_-268209654_1916135038/file/tmp/crunch-43589425/p4/COMBINE
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Creating MAP in /tmp/mapred/local/archive/5188044211822062668_-676380761_1916135038/file/tmp/crunch-43589425/p4-work-3041935114194345962 with rwxr-xr-x
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/MAP as /tmp/mapred/local/archive/5188044211822062668_-676380761_1916135038/file/tmp/crunch-43589425/p4/MAP
14/01/08 09:16:12 INFO filecache.TrackerDistributedCacheManager: Cached /tmp/crunch-43589425/p4/MAP as /tmp/mapred/local/archive/5188044211822062668_-676380761_1916135038/file/tmp/crunch-43589425/p4/MAP
14/01/08 09:16:12 INFO jobcontrol.CrunchControlledJob: Running job "org.bigtop.bigpetstore.etl.CrunchETL: Text(/tmp/BigPetStore1389190568881/generated/part-r-00000... (1/1)"
14/01/08 09:16:12 INFO jobcontrol.CrunchControlledJob: Job status available at: http://localhost:8080/
14/01/08 09:16:12 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:12 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:12 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:12 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:12 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:12 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:12 INFO mapred.Task: Task:attempt_local_0003_m_000000_0 is done. And is in the process of commiting
14/01/08 09:16:12 INFO mapred.LocalJobRunner:
14/01/08 09:16:12 INFO mapred.Task: Task 'attempt_local_0003_m_000000_0' done.
14/01/08 09:16:12 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:12 INFO mapred.LocalJobRunner:
14/01/08 09:16:12 INFO mapred.Merger: Merging 1 sorted segments
14/01/08 09:16:12 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 76 bytes
14/01/08 09:16:12 INFO mapred.LocalJobRunner:
14/01/08 09:16:12 INFO mapred.Task: Task:attempt_local_0003_r_000000_0 is done. And is in the process of commiting
14/01/08 09:16:12 INFO mapred.LocalJobRunner:
14/01/08 09:16:12 INFO mapred.Task: Task attempt_local_0003_r_000000_0 is allowed to commit now
14/01/08 09:16:12 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0003_r_000000_0' to /tmp/crunch-43589425/p4/output
14/01/08 09:16:12 INFO mapred.LocalJobRunner: reduce > reduce
14/01/08 09:16:12 INFO mapred.Task: Task 'attempt_local_0003_r_000000_0' done.
Crunch::: {cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
14/01/08 09:16:12 INFO executionengine.HExecutionEngine: Connecting to hadoop file system at: file:///
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
id_details: {drop: NULL,code: NULL,transaction: NULL,lname: NULL,fname: NULL,date: NULL,price: NULL,product: chararray}
{drop: NULL,code: NULL,transaction: NULL,lname: NULL,fname: NULL,date: NULL,price: NULL,product: chararray}
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
uniqcnt: {product: chararray,count: long}
Schema : {product: chararray,count: long}
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning IMPLICIT_CAST_TO_CHARARRAY 2 time(s).
14/01/08 09:16:12 WARN pig.PigServer: Encountered Warning USING_OVERLOADED_FUNCTION 2 time(s).
14/01/08 09:16:12 INFO pigstats.ScriptState: Pig features used in the script: GROUP_BY
14/01/08 09:16:13 INFO optimizer.LogicalPlanOptimizer: {RULES_ENABLED=[AddForEach, ColumnMapKeyPrune, DuplicateForEachColumnRewrite, FilterLogicExpressionSimplifier, GroupByConstParallelSetter, ImplicitSplitInserter, LimitOptimizer, LoadTypeCastInserter, MergeFilter, MergeForEach, NewPartitionFilterOptimizer, PartitionFilterOptimizer, PushDownForEachFlatten, PushUpFilter, SplitFilter, StreamTypeCastInserter]}
14/01/08 09:16:13 INFO mapReduceLayer.MRCompiler: File concatenation threshold: 100 optimistic? false
14/01/08 09:16:13 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size before optimization: 1
14/01/08 09:16:13 INFO mapReduceLayer.MultiQueryOptimizer: MR plan size after optimization: 1
14/01/08 09:16:13 INFO pigstats.ScriptState: Pig script settings are added to the job
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: mapred.job.reduce.markreset.buffer.percent is not set, set to default 0.3
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Setting up single store job
14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Key [pig.schematuple] is false, will not generate code.
14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Starting process to move generated code to distributed cache
14/01/08 09:16:13 INFO data.SchemaTupleFrontend: Distributed cache not supported or needed in local mode. Setting key [pig.schematuple.local.dir] with code temp directory: /var/folders/qd/n3xqkhkx5b37xb3p_npnsgp40000gn/T/1389190573212-0
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Reduce phase detected, estimating # of required reducers.
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Using reducer estimator: org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.InputSizeReducerEstimator
14/01/08 09:16:13 INFO mapReduceLayer.InputSizeReducerEstimator: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=-1
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Could not estimate number of reducers and no requested or default parallelism set. Defaulting to 1 reducer.
14/01/08 09:16:13 INFO mapReduceLayer.JobControlCompiler: Setting Parallelism to 1
14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: 1 map-reduce job(s) waiting for submission.
14/01/08 09:16:13 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/01/08 09:16:13 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
14/01/08 09:16:13 INFO input.FileInputFormat: Total input paths to process : 1
14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths to process : 1
14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths (combined) to process : 1
14/01/08 09:16:13 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:13 INFO mapReduceLayer.PigRecordReader: Current split being processed file:/tmp/BigPetStore1389190568881/generated/part-r-00000:0+857
14/01/08 09:16:13 INFO mapred.MapTask: io.sort.mb = 100
14/01/08 09:16:13 INFO mapred.MapTask: data buffer = 79691776/99614720
14/01/08 09:16:13 INFO mapred.MapTask: record buffer = 262144/327680
14/01/08 09:16:13 INFO util.SpillableMemoryManager: first memory handler call- Usage threshold init = 65404928(63872K) used = 102924280(100511K) committed = 110362624(107776K) max = 110362624(107776K)
14/01/08 09:16:13 INFO data.SchemaTupleBackend: Key [pig.schematuple] was not set... will not generate code.
14/01/08 09:16:13 INFO mapReduceLayer.PigGenericMapReduce$Map: Aliases being processed per job phase (AliasName[line,offset]): M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C: R: uniqcnt[5,11],sym[5,40],sym[5,40]
14/01/08 09:16:13 INFO mapred.MapTask: Starting flush of map output
14/01/08 09:16:13 INFO mapred.MapTask: Finished spill 0
14/01/08 09:16:13 INFO mapred.Task: Task:attempt_local_0004_m_000000_0 is done. And is in the process of commiting
14/01/08 09:16:13 INFO mapred.LocalJobRunner:
14/01/08 09:16:13 INFO mapred.Task: Task 'attempt_local_0004_m_000000_0' done.
14/01/08 09:16:13 INFO mapred.Task: Using ResourceCalculatorPlugin : null
14/01/08 09:16:13 INFO mapred.LocalJobRunner:
14/01/08 09:16:13 INFO mapred.Merger: Merging 1 sorted segments
14/01/08 09:16:13 INFO mapred.Merger: Down to the last merge-pass, with 1 segments left of total size: 1069 bytes
14/01/08 09:16:13 INFO mapred.LocalJobRunner:
14/01/08 09:16:13 WARN data.SchemaTupleBackend: SchemaTupleBackend has already been initialized
14/01/08 09:16:13 INFO mapReduceLayer.PigMapReduce$Reduce: Aliases being processed per job phase (AliasName[line,offset]): M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C: R: uniqcnt[5,11],sym[5,40],sym[5,40]
14/01/08 09:16:13 INFO mapred.Task: Task:attempt_local_0004_r_000000_0 is done. And is in the process of commiting
14/01/08 09:16:13 INFO mapred.LocalJobRunner:
14/01/08 09:16:13 INFO mapred.Task: Task attempt_local_0004_r_000000_0 is allowed to commit now
14/01/08 09:16:13 INFO output.FileOutputCommitter: Saved output of task 'attempt_local_0004_r_000000_0' to file:/tmp/temp1086917246/tmp-1333463362
14/01/08 09:16:13 INFO mapred.LocalJobRunner: reduce > reduce
14/01/08 09:16:13 INFO mapred.Task: Task 'attempt_local_0004_r_000000_0' done.
14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: HadoopJobId: job_local_0004
14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: Processing aliases csvdata,id_details,sym,transactions,transactionsG,uniqcnt
14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: detailed locations: M: csvdata[1,10],id_details[2,13],transactions[3,15],transactionsG[4,16] C: R: uniqcnt[5,11],sym[5,40],sym[5,40]
14/01/08 09:16:13 WARN pigstats.PigStatsUtil: Failed to get RunningJob for job job_local_0004
14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: 100% complete
14/01/08 09:16:13 INFO pigstats.SimplePigStats: Detected Local mode. Stats reported below may be incomplete
14/01/08 09:16:13 INFO pigstats.SimplePigStats: Script Statistics:

HadoopVersion	PigVersion	UserId	StartedAt	FinishedAt	Features
1.1.1	0.12.0	Jpeerindex	2014-01-08 09:16:13	2014-01-08 09:16:13	GROUP_BY

Success!

Job Stats (time in seconds):
JobId	Alias	Feature	Outputs
job_local_0004	csvdata,id_details,sym,transactions,transactionsG,uniqcnt	GROUP_BY	file:/tmp/temp1086917246/tmp-1333463362,

Input(s):
Successfully read records from: "/tmp/BigPetStore1389190568881/generated"

Output(s):
Successfully stored records in: "file:/tmp/temp1086917246/tmp-1333463362"

Job DAG:
job_local_0004

14/01/08 09:16:13 INFO mapReduceLayer.MapReduceLauncher: Success!
14/01/08 09:16:13 WARN data.SchemaTupleBackend: SchemaTupleBackend has already been initialized
14/01/08 09:16:13 INFO input.FileInputFormat: Total input paths to process : 1
14/01/08 09:16:13 INFO util.MapRedUtil: Total input paths to process : 1
14/01/08 09:16:13 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
14/01/08 09:16:13 INFO metastore.ObjectStore: ObjectStore, initialize called
14/01/08 09:16:13 INFO util.SpillableMemoryManager: first memory handler call - Collection threshold init = 65404928(63872K) used = 102841632(100431K) committed = 110362624(107776K) max = 110362624(107776K)
14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.resources" but it cannot be resolved.
14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.core.runtime" but it cannot be resolved.
14/01/08 09:16:14 ERROR DataNucleus.Plugin: Bundle "org.eclipse.jdt.core" requires "org.eclipse.text" but it cannot be resolved.
14/01/08 09:16:14 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
14/01/08 09:16:14 INFO DataNucleus.Persistence: Property javax.jdo.option.NonTransactionalRead unknown - will be ignored
14/01/08 09:16:14 INFO DataNucleus.Persistence: ================= Persistence Configuration ===============
14/01/08 09:16:14 INFO DataNucleus.Persistence: DataNucleus Persistence Factory - Vendor: "DataNucleus" Version: "2.0.3"
14/01/08 09:16:14 INFO DataNucleus.Persistence: DataNucleus Persistence Factory initialised for datastore URL="jdbc:derby:;databaseName=/tmp/metastore/metastore_db;create=true" driver="org.apache.derby.jdbc.EmbeddedDriver" userName="APP"
14/01/08 09:16:14 INFO DataNucleus.Persistence: ===========================================================
14/01/08 09:16:14 INFO Datastore.Schema: Initialising Catalog "", Schema "APP" using "None" auto-start option
14/01/08 09:16:14 INFO Datastore.Schema: Catalog "", Schema "APP" initialised - managing 0 classes
14/01/08 09:16:15 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
14/01/08 09:16:15 INFO DataNucleus.MetaData: Registering listener for metadata initialisation
14/01/08 09:16:15 INFO metastore.ObjectStore: Initialized ObjectStore
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 28, column 6 : cvc-elt.1: Cannot find the declaration of element 'jdo'. - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 374, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 421, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 443, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 478, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 515, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 556, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 597, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 638, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 683, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 728, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 WARN DataNucleus.MetaData: MetaData Parser encountered an error in file "jar:file:/Users/Jpeerindex/.m2/repository/org/apache/hive/hive-metastore/0.11.0/hive-metastore-0.11.0.jar!/package.jdo" at line 756, column 13 : The content of element type "class" must match "(extension*,implements*,datastore-identity?,primary-key?,inheritance?,version?,join*,foreign-key*,index*,unique*,column*,field*,property*,query*,fetch-group*,extension*)". - Please check your specification of DTD and the validity of the MetaData XML that you have specified.
14/01/08 09:16:15 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MDatabase [Table : DBS, InheritanceStrategy : new-table]
14/01/08 09:16:15 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MDatabase.parameters [Table : DATABASE_PARAMS]
14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 unique key(s) for table DBS
14/01/08 09:16:15 INFO Datastore.Schema: Validating 0 foreign key(s) for table DBS
14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 index(es) for table DBS
14/01/08 09:16:15 INFO Datastore.Schema: Validating 1 unique key(s) for table DATABASE_PARAMS
14/01/08 09:16:15 INFO Datastore.Schema: Validating 1 foreign key(s) for table DATABASE_PARAMS
14/01/08 09:16:15 INFO Datastore.Schema: Validating 2 index(es) for table DATABASE_PARAMS
14/01/08 09:16:15 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MDatabase
Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_355588244.txt
14/01/08 09:16:16 INFO exec.HiveHistory: Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_355588244.txt
14/01/08 09:16:16 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809169065325454685237184.pipeout
14/01/08 09:16:16 INFO service.HiveServer: Running the query: set hive.fetch.output.serde = org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
14/01/08 09:16:16 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809169065325454685237184.pipeout
14/01/08 09:16:16 INFO service.HiveServer: Running the query: DROP TABLE hive_bigpetstore_etl
14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=Driver.run>
14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=compile>
14/01/08 09:16:16 INFO parse.ParseDriver: Parsing command: DROP TABLE hive_bigpetstore_etl
14/01/08 09:16:16 INFO parse.ParseDriver: Parse Completed
14/01/08 09:16:16 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
14/01/08 09:16:16 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl
14/01/08 09:16:16 INFO metastore.HiveMetaStore: 0: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
14/01/08 09:16:16 INFO metastore.ObjectStore: ObjectStore, initialize called
14/01/08 09:16:16 INFO metastore.ObjectStore: Initialized ObjectStore
14/01/08 09:16:16 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table.
14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MColumnDescriptor [Table : CDS, InheritanceStrategy : new-table]
14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MSerDeInfo [Table : SERDES, InheritanceStrategy : new-table]
14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MStringList [Table : SKEWED_STRING_LIST, InheritanceStrategy : new-table]
14/01/08 09:16:16 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table.
14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MStorageDescriptor [Table : SDS, InheritanceStrategy : new-table] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTable [Table : TBLS, InheritanceStrategy : new-table] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MSerDeInfo.parameters [Table : SERDE_PARAMS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStringList.internalList [Table : SKEWED_STRING_LIST_VALUES] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MTable.parameters [Table : TABLE_PARAMS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MTable.partitionKeys [Table : PARTITION_KEYS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.bucketCols [Table : BUCKETING_COLS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.parameters [Table : SD_PARAMS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColNames [Table : SKEWED_COL_NAMES] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColValueLocationMaps [Table : SKEWED_COL_VALUE_LOC_MAP] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.skewedColValues [Table : SKEWED_VALUES] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence 
of Field : org.apache.hadoop.hive.metastore.model.MStorageDescriptor.sortCols [Table : SORT_COLS] 14/01/08 09:16:16 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MColumnDescriptor.cols [Table : COLUMNS_V2] 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SERDES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table SERDES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table SERDES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_STRING_LIST 14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table SKEWED_STRING_LIST 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table SKEWED_STRING_LIST 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 unique key(s) for table TBLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table TBLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 4 index(es) for table TBLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table CDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 0 foreign key(s) for table CDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 index(es) for table CDS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SD_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SD_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SD_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_STRING_LIST_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table 
SKEWED_STRING_LIST_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SKEWED_STRING_LIST_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table TABLE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table TABLE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table TABLE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table COLUMNS_V2 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table COLUMNS_V2 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table COLUMNS_V2 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SORT_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SORT_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SORT_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SKEWED_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SKEWED_VALUES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_KEYS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_KEYS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_KEYS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_COL_NAMES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SKEWED_COL_NAMES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SKEWED_COL_NAMES 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table BUCKETING_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table BUCKETING_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 
index(es) for table BUCKETING_COLS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SERDE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 foreign key(s) for table SERDE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 index(es) for table SERDE_PARAMS 14/01/08 09:16:16 INFO Datastore.Schema: Validating 1 unique key(s) for table SKEWED_COL_VALUE_LOC_MAP 14/01/08 09:16:16 INFO Datastore.Schema: Validating 2 foreign key(s) for table SKEWED_COL_VALUE_LOC_MAP 14/01/08 09:16:16 INFO Datastore.Schema: Validating 3 index(es) for table SKEWED_COL_VALUE_LOC_MAP 14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MColumnDescriptor 14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MSerDeInfo 14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MStorageDescriptor 14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MTable 14/01/08 09:16:16 INFO DataNucleus.MetaData: Listener found initialisation for persistable class org.apache.hadoop.hive.metastore.model.MFieldSchema 14/01/08 09:16:16 INFO ql.Driver: Semantic Analysis Completed 14/01/08 09:16:16 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null) 14/01/08 09:16:16 INFO ql.Driver: </PERFLOG method=compile start=1389190576175 end=1389190576979 duration=804> 14/01/08 09:16:16 INFO ql.Driver: <PERFLOG method=Driver.execute> 14/01/08 09:16:16 INFO ql.Driver: Starting command: DROP TABLE hive_bigpetstore_etl 14/01/08 09:16:16 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190576175 end=1389190576997 duration=822> 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 
14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: drop_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=drop_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MIndex [Table : IDXS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MIndex.parameters [Table : INDEX_PARAMS] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 unique key(s) for table IDXS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 foreign key(s) for table IDXS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 5 index(es) for table IDXS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table INDEX_PARAMS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table INDEX_PARAMS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table INDEX_PARAMS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartition [Table : PARTITIONS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MPartition.parameters [Table : PARTITION_PARAMS] 14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Field : org.apache.hadoop.hive.metastore.model.MPartition.values [Table : PARTITION_KEY_VALS] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 unique key(s) for table PARTITIONS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 foreign key(s) for table PARTITIONS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 4 index(es) for table PARTITIONS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_KEY_VALS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_KEY_VALS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_KEY_VALS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PARTITION_PARAMS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PARTITION_PARAMS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table PARTITION_PARAMS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTablePrivilege [Table : TBL_PRIVS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TBL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TBL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table TBL_PRIVS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTableColumnPrivilege [Table : TBL_COL_PRIVS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TBL_COL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TBL_COL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table TBL_COL_PRIVS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartitionPrivilege [Table : PART_PRIVS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PART_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PART_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table PART_PRIVS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MPartitionColumnPrivilege [Table : PART_COL_PRIVS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table PART_COL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table PART_COL_PRIVS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 3 index(es) for table PART_COL_PRIVS 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MFieldSchema" is tagged as "embedded-only" so does not have its own datastore table. 14/01/08 09:16:17 INFO DataNucleus.Datastore: The class "org.apache.hadoop.hive.metastore.model.MOrder" is tagged as "embedded-only" so does not have its own datastore table. 
14/01/08 09:16:17 INFO DataNucleus.Persistence: Managing Persistence of Class : org.apache.hadoop.hive.metastore.model.MTableColumnStatistics [Table : TAB_COL_STATS, InheritanceStrategy : new-table] 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 unique key(s) for table TAB_COL_STATS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 1 foreign key(s) for table TAB_COL_STATS 14/01/08 09:16:17 INFO Datastore.Schema: Validating 2 index(es) for table TAB_COL_STATS 14/01/08 09:16:17 INFO metastore.hivemetastoressimpl: deleting file:/tmp/hive_bigpetstore_etl 14/01/08 09:16:17 INFO metastore.hivemetastoressimpl: Deleted the diretory file:/tmp/hive_bigpetstore_etl 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190576979 end=1389190577783 duration=804> OK 14/01/08 09:16:17 INFO ql.Driver: OK 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=releaseLocks> 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190577783 end=1389190577783 duration=0> 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190576175 end=1389190577783 duration=1608> 14/01/08 09:16:17 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:null, properties:null) CREATE TABLE hive_bigpetstore_etl ( state STRING, trans_id STRING, lname STRING, fname STRING, date STRING, price STRING, product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE 14/01/08 09:16:17 INFO service.HiveServer: Running the query: CREATE TABLE hive_bigpetstore_etl ( state STRING, trans_id STRING, lname STRING, fname STRING, date STRING, price STRING, product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = 
"(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.run> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=TimeToSubmit> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=compile> 14/01/08 09:16:17 INFO parse.ParseDriver: Parsing command: CREATE TABLE hive_bigpetstore_etl ( state STRING, trans_id STRING, lname STRING, fname STRING, date STRING, price STRING, product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE 14/01/08 09:16:17 INFO parse.ParseDriver: Parse Completed 14/01/08 09:16:17 INFO parse.SemanticAnalyzer: Starting Semantic Analysis 14/01/08 09:16:17 INFO parse.SemanticAnalyzer: Creating table hive_bigpetstore_etl position=13 14/01/08 09:16:17 INFO ql.Driver: Semantic Analysis Completed 14/01/08 09:16:17 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null) 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=compile start=1389190577784 end=1389190577810 duration=26> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.execute> 14/01/08 09:16:17 INFO ql.Driver: Starting command: CREATE TABLE hive_bigpetstore_etl ( state STRING, trans_id STRING, lname STRING, fname STRING, date STRING, price STRING, product STRING) ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe' WITH SERDEPROPERTIES ("input.regex" = "(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*" , "output.format.string" = "%1$s %2$s %3$s %4$s %5$s") STORED AS TEXTFILE 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190577784 end=1389190577812 duration=28> 
14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: create_table: Table(tableName:hive_bigpetstore_etl, dbName:default, owner:Jpeerindex, createTime:1389190577, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:state, type:string, comment:null), FieldSchema(name:trans_id, type:string, comment:null), FieldSchema(name:lname, type:string, comment:null), FieldSchema(name:fname, type:string, comment:null), FieldSchema(name:date, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), FieldSchema(name:product, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.contrib.serde2.RegexSerDe, parameters:{output.format.string=%1$s %2$s %3$s %4$s %5$s, serialization.format=1, input.regex=(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null)) 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=create_table: Table(tableName:hive_bigpetstore_etl, dbName:default, owner:Jpeerindex, createTime:1389190577, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[FieldSchema(name:state, type:string, comment:null), FieldSchema(name:trans_id, type:string, comment:null), FieldSchema(name:lname, type:string, comment:null), FieldSchema(name:fname, type:string, comment:null), FieldSchema(name:date, type:string, comment:null), FieldSchema(name:price, type:string, comment:null), 
FieldSchema(name:product, type:string, comment:null)], location:null, inputFormat:org.apache.hadoop.mapred.TextInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.contrib.serde2.RegexSerDe, parameters:{output.format.string=%1$s %2$s %3$s %4$s %5$s, serialization.format=1, input.regex=(?:BigPetStore,storeCode_)([A-Z][A-Z]),([0-9]*)(?: )([a-z]*),([a-z]*),([A-Z][^,]*),([^,]*),([^,]*).*}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[], parameters:{}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE, privileges:PrincipalPrivilegeSet(userPrivileges:null, groupPrivileges:null, rolePrivileges:null)) 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190577810 end=1389190577921 duration=111> OK 14/01/08 09:16:17 INFO ql.Driver: OK 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=releaseLocks> 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190577921 end=1389190577921 duration=0> 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190577784 end=1389190577921 duration=137> 14/01/08 09:16:17 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:null, properties:null) 14/01/08 09:16:17 INFO service.HiveServer: Running the query: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.run> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=TimeToSubmit> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=compile> 14/01/08 09:16:17 INFO parse.ParseDriver: Parsing command: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl 14/01/08 09:16:17 INFO parse.ParseDriver: Parse Completed 
14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO ql.Driver: Semantic Analysis Completed 14/01/08 09:16:17 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null) 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=compile start=1389190577922 end=1389190577951 duration=29> 14/01/08 09:16:17 INFO ql.Driver: <PERFLOG method=Driver.execute> 14/01/08 09:16:17 INFO ql.Driver: Starting command: LOAD DATA INPATH '/tmp/BigPetStore1389190568881/generated' INTO TABLE hive_bigpetstore_etl 14/01/08 09:16:17 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190577922 end=1389190577953 duration=31> Loading data to table default.hive_bigpetstore_etl 14/01/08 09:16:17 INFO exec.Task: Loading data to table default.hive_bigpetstore_etl from file:/tmp/BigPetStore1389190568881/generated 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:17 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 
09:16:18 INFO exec.StatsTask: Executing stats task 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_database: default 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_database: default 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=alter_table: db=default tbl=hive_bigpetstore_etl newtbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl 14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl Table default.hive_bigpetstore_etl stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 857, raw_data_size: 0] 14/01/08 09:16:18 INFO exec.Task: Table default.hive_bigpetstore_etl stats: [num_partitions: 0, num_files: 2, num_rows: 0, total_size: 857, raw_data_size: 0] 14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190577951 end=1389190578133 duration=182> OK 14/01/08 09:16:18 INFO ql.Driver: OK 14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=releaseLocks> 14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190578133 end=1389190578134 duration=1> 14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190577922 end=1389190578134 duration=212> 14/01/08 09:16:18 INFO service.HiveServer: Returning schema: 
Schema(fieldSchemas:null, properties:null)
Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_312988160.txt
14/01/08 09:16:18 INFO exec.HiveHistory: Hive history file=/tmp/Jpeerindex/hive_job_log_Jpeerindex_24038@jays-MacBook-Pro.local_201401080916_312988160.txt
14/01/08 09:16:18 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809163997506011824464437.pipeout
14/01/08 09:16:18 INFO service.HiveServer: Running the query: set hive.fetch.output.serde = org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
14/01/08 09:16:18 INFO service.HiveServer: Putting temp output to file /tmp/Jpeerindex/Jpeerindex_24038@jays-MacBook-Pro.local_2014010809163997506011824464437.pipeout
14/01/08 09:16:18 INFO service.HiveServer: Running the query: select product,count(*) as cnt from hive_bigpetstore_etl group by product
14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=Driver.run>
14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=TimeToSubmit>
14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=compile>
14/01/08 09:16:18 INFO parse.ParseDriver: Parsing command: select product,count(*) as cnt from hive_bigpetstore_etl group by product
14/01/08 09:16:18 INFO parse.ParseDriver: Parse Completed
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Starting Semantic Analysis
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed phase 1 of Semantic Analysis
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for source tables
14/01/08 09:16:18 INFO metastore.HiveMetaStore: 0: get_table : db=default tbl=hive_bigpetstore_etl
14/01/08 09:16:18 INFO HiveMetaStore.audit: ugi=Jpeerindex ip=unknown-ip-addr cmd=get_table : db=default tbl=hive_bigpetstore_etl
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for subqueries
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Get metadata for destination tables
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed getting MetaData in Semantic Analysis
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for FS(6)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for SEL(5)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for GBY(4)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for RS(3)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for GBY(2)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for SEL(1)
14/01/08 09:16:18 INFO ppd.OpProcFactory: Processing for TS(0)
14/01/08 09:16:18 INFO physical.MetadataOnlyOptimizer: Looking for table scans where optimization is applicable
14/01/08 09:16:18 INFO physical.MetadataOnlyOptimizer: Found 0 metadata only table scans
14/01/08 09:16:18 INFO parse.SemanticAnalyzer: Completed plan generation
14/01/08 09:16:18 INFO ql.Driver: Semantic Analysis Completed
14/01/08 09:16:18 INFO exec.ListSinkOperator: Initializing Self 7 OP
14/01/08 09:16:18 INFO exec.ListSinkOperator: Operator 7 OP initialized
14/01/08 09:16:18 INFO exec.ListSinkOperator: Initialization Done 7 OP
14/01/08 09:16:18 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:product, type:string, comment:null), FieldSchema(name:cnt, type:bigint, comment:null)], properties:null)
14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=compile start=1389190578171 end=1389190578435 duration=264>
14/01/08 09:16:18 INFO ql.Driver: <PERFLOG method=Driver.execute>
14/01/08 09:16:18 INFO ql.Driver: Starting command: select product,count(*) as cnt from hive_bigpetstore_etl group by product
Total MapReduce jobs = 1
14/01/08 09:16:18 INFO ql.Driver: Total MapReduce jobs = 1
14/01/08 09:16:18 INFO ql.Driver: </PERFLOG method=TimeToSubmit start=1389190578171 end=1389190578438 duration=267>
Launching Job 1 out of 1
14/01/08 09:16:18 INFO ql.Driver: Launching Job 1 out of 1
14/01/08 09:16:18 INFO exec.Utilities: Cache Content Summary for file:/tmp/hive_bigpetstore_etl length: 857 file count: 2 directory count: 1
14/01/08 09:16:18 INFO exec.ExecDriver: BytesPerReducer=1000000000 maxReducers=999 totalInputFileSize=857
Number of reduce tasks not specified. Estimated from input data size: 1
14/01/08 09:16:18 INFO exec.Task: Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
14/01/08 09:16:18 INFO exec.Task: In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
14/01/08 09:16:18 INFO exec.Task:   set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
14/01/08 09:16:18 INFO exec.Task: In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
14/01/08 09:16:18 INFO exec.Task:   set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
14/01/08 09:16:18 INFO exec.Task: In order to set a constant number of reducers:
  set mapred.reduce.tasks=<number>
14/01/08 09:16:18 INFO exec.Task:   set mapred.reduce.tasks=<number>
14/01/08 09:16:18 INFO exec.ExecDriver: Generating plan file file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10003/plan.xml
14/01/08 09:16:18 INFO exec.ExecDriver: Executing: ./hadoop-1.2.1/bin/hadoop jar /Users/Jpeerindex/.m2/repository/org/apache/hive/hive-exec/0.11.0/hive-exec-0.11.0.jar org.apache.hadoop.hive.ql.exec.ExecDriver -plan file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10003/plan.xml -jobconffile file:/tmp/Jpeerindex/hive_2014-01-08_09-16-18_171_5555758322630256543/-local-10002/jobconf.xml
Warning: $HADOOP_HOME is deprecated.
Execution log at: /tmp/Jpeerindex/hive.log
2014-01-08 09:16:20.585 java[24070:1903] Unable to load realm info from SCDynamicStore
Job running in-process (local Hadoop)
Hadoop job information for null: number of mappers: 0; number of reducers: 0
2014-01-08 09:16:22,185 null map = 0%, reduce = 100%
Ended Job = job_local1258421034_0001
Execution completed successfully
14/01/08 09:16:22 INFO exec.Task: Execution completed successfully
Mapred Local Task Succeeded . Convert the Join into MapJoin
14/01/08 09:16:22 INFO exec.Task: Mapred Local Task Succeeded . Convert the Join into MapJoin
14/01/08 09:16:22 INFO exec.ExecDriver: Execution completed successfully
14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=Driver.execute start=1389190578436 end=1389190582579 duration=4143>
OK
14/01/08 09:16:22 INFO ql.Driver: OK
14/01/08 09:16:22 INFO ql.Driver: <PERFLOG method=releaseLocks>
14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=releaseLocks start=1389190582579 end=1389190582579 duration=0>
14/01/08 09:16:22 INFO ql.Driver: </PERFLOG method=Driver.run start=1389190578171 end=1389190582579 duration=4408>
14/01/08 09:16:22 INFO service.HiveServer: Returning schema: Schema(fieldSchemas:[FieldSchema(name:product, type:string, comment:null), FieldSchema(name:cnt, type:bigint, comment:null)], properties:null)
14/01/08 09:16:22 INFO mapred.FileInputFormat: Total input paths to process : 1
crunch:{cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
pig:{cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=2, steel-leash=2}
hive:{null=1, cat-food=4, antelope snacks=1, fuzzy-collar=1, dog-food=1, steel-leash=2}
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.777 sec
Results :
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0
[WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
[INFO]
[INFO] --- maven-failsafe-plugin:2.12:verify (integration-tests) @ hadoop-examples ---
[INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
[WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
[INFO]
[INFO] --- maven-install-plugin:2.4:install (default-install) @ hadoop-examples ---
[INFO] Installing /Users/Jpeerindex/Development/bigpetstore/target/hadoop-examples-1.0-SNAPSHOT.jar to /Users/Jpeerindex/.m2/repository/jay/rhbd/hadoop-examples/1.0-SNAPSHOT/hadoop-examples-1.0-SNAPSHOT.jar
[INFO] Installing /Users/Jpeerindex/Development/bigpetstore/pom.xml to /Users/Jpeerindex/.m2/repository/jay/rhbd/hadoop-examples/1.0-SNAPSHOT/hadoop-examples-1.0-SNAPSHOT.pom
[INFO]
[INFO] --- maven-resources-plugin:2.6:resources (default-resources) @ hadoop-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 4 resources
[INFO]
[INFO] --- maven-compiler-plugin:3.1:compile (default-compile) @ hadoop-examples ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- build-helper-maven-plugin:1.7:add-test-source (add-integration-test-sources) @ hadoop-examples ---
[INFO] Test Source directory: /Users/Jpeerindex/Development/bigpetstore/src/integration/java added.
[INFO]
[INFO] --- maven-resources-plugin:2.6:testResources (default-testResources) @ hadoop-examples ---
[INFO] Using 'UTF-8' encoding to copy filtered resources.
[INFO] Copying 1 resource
[INFO]
[INFO] --- maven-compiler-plugin:3.1:testCompile (default-testCompile) @ hadoop-examples ---
[INFO] Nothing to compile - all classes are up to date
[INFO]
[INFO] --- maven-surefire-plugin:2.12:test (default-test) @ hadoop-examples ---
[INFO] Skipping execution of surefire because it has already been run for this configuration
[INFO]
[INFO] --- maven-jar-plugin:2.4:jar (default-jar) @ hadoop-examples ---
[INFO]
[INFO] --- maven-failsafe-plugin:2.12:integration-test (integration-tests) @ hadoop-examples ---
[INFO] Skipping execution of surefire because it has already been run for this configuration
[INFO]
[INFO] --- maven-failsafe-plugin:2.12:verify (integration-tests) @ hadoop-examples ---
[INFO] Failsafe report directory: /Users/Jpeerindex/Development/bigpetstore/target/failsafe-reports
[WARNING] File encoding has not been set, using platform encoding MacRoman, i.e. build is platform dependent!
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 20.054s
[INFO] Finished at: Wed Jan 08 09:16:22 EST 2014
[INFO] Final Memory: 24M/81M
[INFO] ------------------------------------------------------------------------
          jay vyas added a comment -

          Great news on this, folks! We finally have a stable, production-quality codebase, with profiles for each ecosystem tool and preliminary testing in a real Hadoop cluster.

          • First phase of testing (generation of transactions) works, bigpetstore now works in bigtop-deploy/vm/vagrant-puppet based VMs.
          • We also now have maven profiles for pig, hive and crunch. You can build and run any ecosystem ETL using those profiles.

          So, once I finish testing the whole pipeline in pseudo-distributed mode, I'll be crafting the first official bigpetstore patch!

          Note: it can overload VMs because it creates many map tasks (one per state), by nature of the custom generating input format.
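The "one task per state" behavior can be sketched in plain Python (this is illustrative only, not the actual BigPetStore classes): a generating InputFormat has no real input files, so it fabricates one split per state and each mapper then synthesizes that state's transactions. The seven state codes here are the ones visible in the generated records below.

```python
# Hypothetical sketch of a generating InputFormat's split planning:
# one split per state -> one map task per state.
STATES = ["AK", "AZ", "CA", "CO", "CT", "NY", "OK"]  # store codes seen in the output below

def get_splits(states):
    # Each dict stands in for an InputSplit; a real implementation would
    # return Hadoop InputSplit objects carrying the state (and maybe a seed).
    return [{"state": code} for code in states]

print(len(get_splits(STATES)))  # 7, matching "number of splits:7" in the job log
```

Seven splits lines up with the "number of splits:7" message in the log that follows, which is why task count scales with the number of states rather than with input size.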

          14/02/16 02:37:53 INFO mapreduce.JobSubmitter: number of splits:7
          14/02/16 02:37:53 WARN conf.Configuration: mapred.jar is deprecated. Instead, use mapreduce.job.jar
          14/02/16 02:37:53 WARN conf.Configuration: mapred.output.value.class is deprecated. Instead, use mapreduce.job.output.value.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.mapoutput.value.class is deprecated. Instead, use mapreduce.map.output.value.class
          14/02/16 02:37:53 WARN conf.Configuration: mapreduce.map.class is deprecated. Instead, use mapreduce.job.map.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.job.name is deprecated. Instead, use mapreduce.job.name
          14/02/16 02:37:53 WARN conf.Configuration: mapreduce.inputformat.class is deprecated. Instead, use mapreduce.job.inputformat.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
          14/02/16 02:37:53 WARN conf.Configuration: mapreduce.outputformat.class is deprecated. Instead, use mapreduce.job.outputformat.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
          14/02/16 02:37:53 WARN conf.Configuration: mapred.output.key.class is deprecated. Instead, use mapreduce.job.output.key.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.mapoutput.key.class is deprecated. Instead, use mapreduce.map.output.key.class
          14/02/16 02:37:53 WARN conf.Configuration: mapred.working.dir is deprecated. Instead, use mapreduce.job.working.dir
          14/02/16 02:37:55 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1392513928307_0005
          14/02/16 02:37:58 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
          14/02/16 02:37:58 INFO client.YarnClientImpl: Submitted application application_1392513928307_0005 to ResourceManager at vagrant.bigtop1/127.0.0.1:8032
          14/02/16 02:37:58 INFO mapreduce.Job: The url to track the job: http://vagrant.bigtop1:20888/proxy/application_1392513928307_0005/
          14/02/16 02:37:58 INFO mapreduce.Job: Running job: job_1392513928307_0005
          14/02/16 02:38:07 INFO mapreduce.Job: Job job_1392513928307_0005 running in uber mode : false
          14/02/16 02:38:07 INFO mapreduce.Job:  map 0% reduce 0%
          14/02/16 02:38:35 INFO mapreduce.Job:  map 14% reduce 0%
          14/02/16 02:38:44 INFO mapreduce.Job:  map 29% reduce 0%
          14/02/16 02:38:45 INFO mapreduce.Job: Task Id : attempt_1392513928307_0005_m_000001_0, Status : FAILED
          
          Killed by external signal
          
          14/02/16 02:38:54 INFO mapreduce.Job: Task Id : attempt_1392513928307_0005_m_000004_0, Status : FAILED
          
          Killed by external signal
          
          14/02/16 02:38:55 INFO mapreduce.Job:  map 57% reduce 0%
          14/02/16 02:39:13 INFO mapreduce.Job:  map 71% reduce 0%
          14/02/16 02:39:22 INFO mapreduce.Job:  map 71% reduce 1%
          14/02/16 02:39:23 INFO mapreduce.Job:  map 86% reduce 2%
          14/02/16 02:39:26 INFO mapreduce.Job:  map 86% reduce 3%
          14/02/16 02:39:31 INFO mapreduce.Job:  map 86% reduce 2%
          14/02/16 02:40:27 INFO mapreduce.Job:  map 100% reduce 2%
          14/02/16 02:40:28 INFO mapreduce.Job:  map 100% reduce 5%
          14/02/16 02:40:29 INFO mapreduce.Job:  map 100% reduce 9%
          14/02/16 02:40:30 INFO mapreduce.Job:  map 100% reduce 10%
          14/02/16 02:40:32 INFO mapreduce.Job:  map 100% reduce 14%
          14/02/16 02:40:33 INFO mapreduce.Job:  map 100% reduce 17%
          14/02/16 02:40:57 INFO mapreduce.Job:  map 100% reduce 27%
          14/02/16 02:40:58 INFO mapreduce.Job:  map 100% reduce 30%
          14/02/16 02:40:59 INFO mapreduce.Job:  map 100% reduce 37%
          14/02/16 02:41:26 INFO mapreduce.Job:  map 100% reduce 47%
          14/02/16 02:41:27 INFO mapreduce.Job:  map 100% reduce 57%
          14/02/16 02:41:53 INFO mapreduce.Job:  map 100% reduce 67%
          14/02/16 02:41:54 INFO mapreduce.Job:  map 100% reduce 70%
          14/02/16 02:41:55 INFO mapreduce.Job:  map 100% reduce 77%
          14/02/16 02:42:18 INFO mapreduce.Job:  map 100% reduce 80%
          14/02/16 02:42:21 INFO mapreduce.Job:  map 100% reduce 90%
          14/02/16 02:42:22 INFO mapreduce.Job:  map 100% reduce 93%
          14/02/16 02:42:23 INFO mapreduce.Job:  map 100% reduce 97%
          14/02/16 02:42:26 INFO mapreduce.Job:  map 100% reduce 100%
          14/02/16 02:42:26 INFO mapreduce.Job: Job job_1392513928307_0005 completed successfully
          14/02/16 02:42:26 WARN mapred.JobConf: The variable mapred.child.ulimit is no longer used.
          14/02/16 02:42:26 INFO mapreduce.Job: Counters: 45
          	File System Counters
          		FILE: Number of bytes read=1067
          		FILE: Number of bytes written=2755986
          		FILE: Number of read operations=0
          		FILE: Number of large read operations=0
          		FILE: Number of write operations=0
          		HDFS: Number of bytes read=497
          		HDFS: Number of bytes written=867
          		HDFS: Number of read operations=104
          		HDFS: Number of large read operations=0
          		HDFS: Number of write operations=60
          	Job Counters 
          		Failed map tasks=2
          		Killed reduce tasks=11
          		Launched map tasks=9
          		Launched reduce tasks=41
          		Other local map tasks=9
          		Total time spent by all maps in occupied slots (ms)=308583
          		Total time spent by all reduces in occupied slots (ms)=1013311
          	Map-Reduce Framework
          		Map input records=10
          		Map output records=10
          		Map output bytes=867
          		Map output materialized bytes=2147
          		Input split bytes=497
          		Combine input records=0
          		Combine output records=0
          		Reduce input groups=10
          		Reduce shuffle bytes=2147
          		Reduce input records=10
          		Reduce output records=10
          		Spilled Records=20
          		Shuffled Maps =210
          		Failed Shuffles=0
          		Merged Map outputs=210
          		GC time elapsed (ms)=9317
          		CPU time spent (ms)=21960
          		Physical memory (bytes) snapshot=4822437888
          		Virtual memory (bytes) snapshot=59736100864
          		Total committed heap usage (bytes)=2466344960
          	Shuffle Errors
          		BAD_ID=0
          		CONNECTION=0
          		IO_ERROR=0
          		WRONG_LENGTH=0
          		WRONG_MAP=0
          		WRONG_REDUCE=0
          	File Input Format Counters 
          		Bytes Read=0
          	File Output Format Counters 
          		Bytes Written=867
          [root@vagrant vagrant]# hadoop fs -cat /tmp/bps2/*
          BigPetStore,storeCode_CO,1	heidi,o'neill,Sun Dec 28 01:54:42 UTC 1969,15.1,choke-collar
          BigPetStore,storeCode_CT,1	shawn,cantrell,Sat Jan 24 05:08:29 UTC 1970,19.1,fuzzy-collar
          BigPetStore,storeCode_OK,1	herbert,dejesus,Fri Jan 16 08:14:57 UTC 1970,10.5,dog-food
          BigPetStore,storeCode_AZ,1	walter,richardson,Wed Dec 31 19:45:21 UTC 1969,10.5,dog-food
          BigPetStore,storeCode_CA,1	natasha,caldwell,Thu Dec 18 04:46:14 UTC 1969,11.75,fish-food
          BigPetStore,storeCode_CA,2	natasha,caldwell,Sat Jan 17 00:50:34 UTC 1970,7.5,cat-food
          BigPetStore,storeCode_CA,3	natasha,caldwell,Sun Jan 25 19:31:17 UTC 1970,11.75,fish-food
          BigPetStore,storeCode_NY,1	margaret,sims,Wed Jan 21 03:56:34 UTC 1970,10.5,dog-food
          BigPetStore,storeCode_NY,2	margaret,sims,Sun Dec 28 06:44:04 UTC 1969,19.75,fish-food
          BigPetStore,storeCode_AK,1	sharon,vargas,Thu Jan 22 15:46:47 UTC 1970,19.1,fuzzy-collar
          [root@vagrant vagrant]# hadoop fs -cat /tmp/bps3/*
          BigPetStore,storeCode_CO,1	shawn,cantrell,Sat Jan 24 05:08:29 UTC 1970,10.5,dog-food
          BigPetStore,storeCode_CT,1	clarence,robles,Wed Jan 21 18:14:05 UTC 1970,10.5,dog-food
          BigPetStore,storeCode_OK,1	tia,mckee,Tue Jan 06 18:35:34 UTC 1970,5.1,hay-bail
          BigPetStore,storeCode_AZ,1	judy,drake,Mon Dec 29 04:55:38 UTC 1969,30.1,snake-bite ointment
          BigPetStore,storeCode_CA,1	darrell,watkins,Mon Dec 08 15:04:55 UTC 1969,11.75,fish-food
          BigPetStore,storeCode_CA,2	mickey,garrison,Sat Jan 17 20:53:21 UTC 1970,11.75,fish-food
          BigPetStore,storeCode_CA,3	mickey,garrison,Fri Jan 23 14:59:35 UTC 1970,7.5,cat-food
          BigPetStore,storeCode_NY,1	clarence,robles,Wed Jan 21 18:14:05 UTC 1970,20.1,steel-leash
          BigPetStore,storeCode_NY,2	valerie,wise,Sun Jan 04 03:11:53 UTC 1970,20.1,steel-leash
          BigPetStore,storeCode_AK,1	lindsey,mcneil,Fri Jan 16 13:43:11 UTC 1970,19.1,fuzzy-collar
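As a plain-Python stand-in for the product aggregation the Hive/Pig/Crunch ETLs run (select product, count(*) ... group by product), the same group-by can be applied to a few of the /tmp/bps2 sample records above. The "last comma-separated field is the product" layout is read off the sample lines, not taken from the actual BPS schema.

```python
from collections import Counter

# A handful of the generated transaction records printed above.
records = [
    "BigPetStore,storeCode_CO,1\theidi,o'neill,Sun Dec 28 01:54:42 UTC 1969,15.1,choke-collar",
    "BigPetStore,storeCode_OK,1\therbert,dejesus,Fri Jan 16 08:14:57 UTC 1970,10.5,dog-food",
    "BigPetStore,storeCode_AZ,1\twalter,richardson,Wed Dec 31 19:45:21 UTC 1969,10.5,dog-food",
    "BigPetStore,storeCode_CA,2\tnatasha,caldwell,Sat Jan 17 00:50:34 UTC 1970,7.5,cat-food",
]

# Group by the last field (product) and count, like the Hive query.
counts = Counter(line.rsplit(",", 1)[-1] for line in records)
print(dict(counts))  # {'choke-collar': 1, 'dog-food': 2, 'cat-food': 1}
```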
          
          
          jay vyas added a comment -

          Hi folks. FYI, another update.

          • bigpetstore now tested on EMR for generating/processing up to 1,000,000 records
          • bigpetstore now processes data w/ pig

          Will probably put this in as a first iteration, and then add Hive, etc., as later patches.

          If anyone wants to play with it, you can check here:
          https://github.com/jayunit100/bigpetstore

          I think about 5,000,000,000 records comes to roughly 1 TB of data.
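A rough sanity check of that figure, using one of the sample transaction lines from an earlier comment as a stand-in record (the real average record size is an assumption here):

```python
# Back-of-envelope: bytes per record x record count, in TB.
sample = "BigPetStore,storeCode_AK,1\tsharon,vargas,Thu Jan 22 15:46:47 UTC 1970,19.1,fuzzy-collar\n"
records = 5_000_000_000
total_tb = records * len(sample) / 1e12
print(round(total_tb, 2))  # ~0.44 TB at ~88 bytes/record
```

So at the ~88-byte records shown so far this is closer to half a TB; hitting a full TB at 5 billion records implies average records of about 200 bytes, which is plausible once more fields are added.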

          As always, anyone wanting to hack on this, let me know. I'm going to keep it on GitHub a little while longer while I clean it up, but I can put it into bigtop soon.

          jay vyas added a comment -

          Another bigpetstore update: we now have a continuous integration server for this project, which auto-increments versions and runs the data generation. Anyone interested in running it or building their forks, feel free to ping me directly (I haven't put the URL online because the server isn't secure just yet), but anyone who wants to put in a pull request is welcome to use it to test their code.

          jay vyas added a comment -

          Hi folks, another update on this: we added some visualization widgets, and I've now run BigPetStore both on EMR/HDFS and on Red Hat-sponsored AMIs with GlusterFS, processing 1,000,000 records. The app still needs cleanup before I try to submit a patch, but in the interim, here's some eye candy of what BigPetStore-generated and -processed pet store transactions will look like.

          https://s3.amazonaws.com/uploads.hipchat.com/30405/198980/WTVGEBriu6VuNmA/Screen%20Shot%202014-03-26%20at%2011.47.16%20AM.jpg

          And here is the US map of dog food purchases!
          https://s3.amazonaws.com/uploads.hipchat.com/30405/198980/RrKpgzy0lXY6X02/bps.jpg

          Konstantin Boudnik added a comment -

          jay vyas, how do you think we can integrate it into bigtop?

          jay vyas added a comment - - edited

          Hi cos. The slides are here:

          http://jayunit100.github.io/bigpetstore/slides.html
          And the current video: https://www.youtube.com/watch?v=OVB3nEKN94k

          There's still a fair amount of cleanup to do (minor stuff), but I can integrate it as its own submodule whenever; I'm ready to push it in at any time.

          For now I'm developing it in my GitHub repo, but I can start integrating it with bigtop at any time; it's a great integration test for the whole ecosystem, IMO.

          Maybe we can discuss at the hackathon next week.

          jay vyas added a comment -

          FYI, as per some conversations here at ApacheCon, I'm going to fast-track a patch for this so that folks such as Sean Mackrory and Alex Newman can jump in and contribute (thanks in advance)!

          FYI, the current places for improvement (will create JIRAs):

          • rewrite pom.xml into a build.gradle script (how does Gradle support a notion of "profiles"?)
          • test Hive code at scale and add configurability for the JDBC connection (right now it's just local/embedded)
          • test mahout code at scale
          • update docs for above ^^
          • package the BPS web app into a Jetty app that launches as a build profile or task in code (right now it's in the gh-pages branch on GitHub)
          jay vyas added a comment - - edited

          ((( PATCH ATTACHED ))) Okay, great. After this goes in I'll create a boatload of improvement JIRAs. It's a little raw ATM, but good call to get it in so we can start iterating against it as a group.

          jay vyas added a comment -

          I just realized there were contributions from others. I've added their names to this patch with the approximate number of lines they added.

          Konstantin Boudnik added a comment - - edited

          I've added their names to this patch with the approximate number of lines they added.

          My Hadoop is bigger than yours, eh?

          jay vyas added a comment -

          wellllll... hmmm.... not sure about that. BUT im pretty damn sure that my PETSTORE is bigger than yours

          Konstantin Boudnik added a comment -

          Won't argue about that

          Konstantin Boudnik added a comment -

          Any volunteers to review the patch?

          Sean Mackrory added a comment -

          I'd be happy to - I'll try to get it done today.

          jay vyas added a comment -

          We just did a code review; here are all the issues we found. Most we can roll into a next-iteration JIRA, but a few may be important enough that we'd better fix them now before putting them into the patch.

          • Apache licenses: put them where necessary
          • digraph name (arch.dot)
          • org.apache.bigtop version in pom.xml (easy inline fix)
          • leverage the bigtop super pom in the BPS pom for consistency
          • Crunch code is unstable and just a dev class
          • redundant deps in profiles. Not even sure if they are overridden or just result in classpath dupes.
          • hive-setup shell script parameterization and cleanup (it should probably go in a dev/ folder since it's really just for dev testing)
          • remove IntelliJ headers from some autogenerated files
          • BigPetStore Contract: possibly reimplement
          Konstantin Boudnik added a comment -

          As a rule of thumb, here are a couple of very important rules for new code:

          • the ASL boilerplate should be present
          • no @author tags are allowed. Basically, by contributing your code to the ASF you're transferring the rights to it. Hence, no author thingy
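          A quick way to enforce the no-@author rule before posting a patch is a recursive grep over the source tree. The sketch below is illustrative (the sample file and directory names are made up), not part of any actual Bigtop tooling:

          ```shell
          # Hedged sketch: scan a source tree for forbidden @author tags.
          # The sample file created here is purely illustrative.
          tmpdir=$(mktemp -d)
          mkdir -p "$tmpdir/src"
          printf '/** @author someone */\nclass Foo {}\n' > "$tmpdir/src/Foo.java"

          # -r: recurse, -l: list matching files only
          matches=$(grep -rl '@author' "$tmpdir/src")
          if [ -n "$matches" ]; then
            echo "found @author tags in:"
            echo "$matches"
          fi
          ```

          Running something like this as a pre-patch sanity check catches the tags before a reviewer has to.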
          Konstantin Boudnik added a comment - - edited

          Another thing: perhaps I am wrong, but this structure

           bigtop_bigpetstore/setuphive.sh                    |   11 +
           .../bigpetstore/integration/BigPetStoreHiveIT.java |   94 +++
           .../integration/BigPetStoreMahoutIT.java           |   74 ++
           .../bigpetstore/integration/BigPetStorePigIT.java  |  149 ++++
           .../bigtop/bigpetstore/integration/ITUtils.java    |  133 ++++
          

          looks pretty chaotic, don't you think? There are
          bigpetstore and bigtop/bigpetstore under bigtop_bigpetstore, which makes it incredibly hard to understand the purpose of it.

          jay vyas added a comment - - edited

          1) Okay, am adding the essential fixes now...

          2) Re: chaotic dirs ~~~ I think that's an artifact of the way the patch summary print truncates the preceding directories;
          the "bigpetstore" directory occurs far down the tree.

          ├── src
          │   ├── integration
          │   │   └── java
          │   │       └── org
          │   │           └── bigtop
          │   │               └── bigpetstore
          │   │                   └── integration
          ....
          
          jay vyas added a comment -

          okay ... updated patch attached with a couple of pom cleanups .

          Sean Mackrory added a comment - - edited

          Thanks for posting all the notes from our review the other day. I think most of them are minor enough to fix in follow-up JIRAs, since this is a new module and doesn't have to be perfect before we start letting other people collaborate on it. I just tried your latest patch, liked the POM changes, and was able to build, run the pig tests, etc.

          A few notes I think you should address before we +1 and commit it:

          • src/integration/java/org/bigtop/bigpetstore/integration/BigPetStorePigIT.java still has author information in the IntelliJ header
          • It looks like StringUtils.java and log4j.properties came from other projects. While the licenses may be identical, they are from a different project and I think we need a more formal declaration of where the files came from. Other files are still missing the license boilerplate that Cos mentioned, although I'm not sure that's a hard requirement, as the license is distributed with the project as a whole. Either way, it's good to add the headers.
          • Changing the digraph name should be really easy, so it'd be nice to just get that done and not have anyone wondering where "ethane" comes from
          jay vyas added a comment -

          okay... reattached the patch... thanks for reviewing this on a Saturday!
          (1) added the boilerplate,
          (2) updated the arch.dot,
          (3) deleted a few dead code snippets.
          Can't wait to do this code drop; get ready for about 600 bigpetstore follow-up JIRAs!

          MY PETSTORE IS BIGGER THAN YOURS

          Sean Mackrory added a comment -

          My only last complaint is that setuphive.sh is downloading Hadoop from an official Apache site while Hive is coming from what is presumably a mirror? Can we use http://archive.apache.org/dist/hive/hive-0.12.0/hive-0.12.0.tar.gz instead?

          Beyond that I'm a +1 if Cos has no other objections. Do you want to commit this with the 'BIGTOP-1089. BigPetStore: A polyglot big data processing blueprint' commit message and then upload the result of `git format-patch` so it'll have your name, etc. on it?
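          For reference, the `git format-patch` flow being described might look like this (the repo contents and author identity below are illustrative):

          ```shell
          # Hedged sketch: commit with the JIRA-style message, then generate a
          # mailbox-format patch that carries the author's name in its From: line.
          repo=$(mktemp -d)
          cd "$repo"
          git init -q .
          git config user.name "jay vyas"
          git config user.email "jay@example.com"

          echo demo > README
          git add README
          git commit -q -m 'BIGTOP-1089. BigPetStore: A polyglot big data processing blueprint'

          git format-patch -1 HEAD     # writes 0001-BIGTOP-1089....patch
          grep '^From:' 0001-*.patch   # From: jay vyas <jay@example.com>
          ```

          The resulting file can be attached to the JIRA, and whoever commits it preserves the original authorship.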

          jay vyas added a comment -

          hiya Sean, no prob... yeah, that setuphive.sh script needed some cleanup... here's a reattach.

          jay vyas added a comment - - edited

          and RE: format-patch, I always do it that way, ever since Mark Grover showed me how to put patches in.
          I think the most recent patch does give credit where it's due, right? (I think that's where the git signoff thingy comes into play.)

          Sean Mackrory added a comment -

          Looks good! I'll go ahead and commit if Konstantin Boudnik has no -1?

          jay vyas added a comment -

          (drumroll)

          Sean Mackrory added a comment -

          I think the most recent patch does give credit where it's due, right?

          Yes - I assumed (you know what happens when you assume) from the naming of the file that it had been done differently - but you are right.

          Konstantin Boudnik added a comment -

          Jay, I appreciate the clarification on the dir-tree structure. But that's exactly my point: you end up with something like bigtop-bigpetstore/src/integration/java/org/bigtop/bigpetstore/integration - an ugly path with multiple repetitions of names like integration. Does it have to be that complicated? We can clean up the tree later on, but I do think it has to be cleaned up.

          Also, the package should be org.apache.bigtop, not just org.bigtop.

          jay vyas added a comment - - edited

          (new patch) ^^
          Hi Cos, good points.
          I fixed both the tree and package structures, and updated the README accordingly, so now:

          • it's "org.apache.bigtop" instead of "org.bigtop"
          • the "integration" tests don't include "integration" in the package name anymore.
          └── src
              ├── integration
              │   └── java
              │       └── org
              │           └── apache
              │               └── bigtop
              │                   └── bigpetstore
              ├── main
              │   ├── java
              │   │   └── org
              │   │       └── apache
              │   │           └── bigtop
              │   │               └── bigpetstore
              │   │                   ├── clustering
              │   │                   ├── contract
              │   │                   ├── etl
              │   │                   ├── generator
              │   │                   └── util
              │   └── resources
              └── test
                  ├── java
                  │   └── org
                  │       └── apache
                  │           └── bigtop
                  │               └── bigpetstore
                  │                   ├── docs
                  │                   └── generator
                  └── resources
          
          
          jay vyas added a comment - - edited

          oops, left out the last commit; re-attaching new patch. (updated the comment above with the directory tree)

          jay vyas added a comment -

          Konstantin Boudnik See "tree" output above... looks okay now?

          Sean Mackrory I guess we can now start on some follow-up JIRAs:

          • Convert the build to gradle
          • Rewrite the bigpetstore webapp (jetty task that launches it locally)
          • Finish the hive column slicer and add directions for running hive on a cluster
          • Finish the mahout recommender
          • Refine the hive development testing tooling (i.e. setuphive.sh - can we replace it with a gradle task?)
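          As a starting point for that last item, the download half of setuphive.sh could collapse into a small version-pinned helper that a gradle task (or anything else) drives. `fetch_dist` and its echo-only behavior below are hypothetical; the hive URL it produces is the archive.apache.org link suggested in the review above:

          ```shell
          # Hedged sketch: build a reproducible archive.apache.org URL from a
          # project name and a pinned version. fetch_dist is a hypothetical helper.
          fetch_dist() {
            project="$1"
            version="$2"
            url="http://archive.apache.org/dist/${project}/${project}-${version}/${project}-${version}.tar.gz"
            echo "would fetch: $url"   # a real script would run: wget "$url"
          }

          fetch_dist hive 0.12.0
          # would fetch: http://archive.apache.org/dist/hive/hive-0.12.0/hive-0.12.0.tar.gz
          ```

          Pinning everything to archive.apache.org keeps the fetch reproducible instead of depending on whichever mirror happens to be up.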
          Konstantin Boudnik added a comment -

          Yup, thanks. As I said - we can always fine-tune it later if we find issues with it.

          One last set of comments that needs to be addressed:

          • BPSRecommnder, NumericalIdUtils, and TestDocs are missing the license
          • TestNumericalIdUtils and TestPetStoreTransactionGeneratorJob have two sets of licenses in the wrong place

          There are a number of classes that have the package statement first and then the license, and a bunch where it's the other way around. We can fix it later, I guess. BTW, IDEA has a great mechanism for consistent license boilerplate placement - you don't need to do this manually.
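          One cheap way to catch the ordering problem mechanically (outside of IDEA) is to compare the line numbers of the header and the package statement. This is an illustrative sketch, not existing Bigtop tooling; the sample file it checks is deliberately mis-ordered:

          ```shell
          # Hedged sketch: order_ok succeeds when the ASL header appears before
          # the package statement in a Java file.
          order_ok() {
            pkg=$(grep -n -m1 '^package ' "$1" | cut -d: -f1)
            lic=$(grep -n -m1 'Licensed to the Apache Software Foundation' "$1" | cut -d: -f1)
            # trivially ok if either marker is absent; otherwise license must come first
            [ -z "$pkg" ] || [ -z "$lic" ] || [ "$lic" -lt "$pkg" ]
          }

          bad=$(mktemp)
          printf 'package org.apache.bigtop.bigpetstore;\n/* Licensed to the Apache Software Foundation */\n' > "$bad"
          order_ok "$bad" || echo "license after package: $bad"
          ```

          Wired into the build, a check like this would flag the mis-ordered files before review instead of after.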

          jay vyas added a comment -

          okie dokie! one more attachment ^^

          • Added license where missing
          • Added license above "package" declarations
          • Removed extra license texts
          jay vyas added a comment -

          Hi folks. Are we all set on this patch now?

          Sean Mackrory added a comment -

          I'll go ahead and commit this shortly.

          Konstantin Boudnik added a comment -

          Please do, Sean.
          Another consequent improvement to be done is to apply consistent formatting to the java source code. Let's open a ticket for it as well. Thanks

          Sean Mackrory added a comment -

          That is, I'll commit it as soon as I reset my ASF credentials like I was supposed to. Unfortunately that involves reading an email sent with a PGP key I've never interacted with before, so it may take me a minute

          Sean Mackrory added a comment -

          Oh it's my key! And committed...

          jay vyas added a comment -

          No rush, Sean! Just making sure whether we needed to roll a new patch update.
          Thanks to Bruno, Cos, and Sean for the reviews; I know this was a lot of work to test and audit!

          jay vyas added a comment -

          Awesome, thanks Sean!
          And thanks also to Matt Fenwick, Nigel Savage, and the OpenStack folks for helping me with the web app and with testing at scale.
          Hope to see more contributions; just FYI, I added your names to the commit message.


            People

            • Assignee: jay vyas
            • Reporter: jay vyas
            • Votes: 0
            • Watchers: 7