Details

    • Type: Task
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.13.0
    • Component/s: None
    • Labels:
      None

      Description

      I cannot cope with Hive's build infrastructure any more. I have started working on porting the project to Maven. When I have made some solid progress I will put the entire thing on GitHub for review. Then we can talk about switching the project somehow.

        Issue Links

        1.
        Create maven branch Sub-task Resolved Brock Noland
         
        2.
        Milestone 1: Compile source code under maven Sub-task Resolved Brock Noland
         
        3.
        Milestone 2: Generate tests under maven Sub-task Resolved Brock Noland
         
        4.
        Milestone 3: Some tests pass under maven Sub-task Resolved Brock Noland
         
        5.
        Milestone 4: Most tests pass under maven Sub-task Resolved Brock Noland
         
        6.
        Root pom is malformed Sub-task Resolved Edward Capriolo
         
        7.
        Milestone 5: PTest2 maven support Sub-task Resolved Brock Noland
         
        8.
        Milestone 6: All tests pass under hadoop 1 Sub-task Resolved Brock Noland
         
        9.
        Fix PTest2 Maven support Sub-task Resolved Brock Noland
         
        10.
        Merge maven branch into trunk Sub-task Resolved Brock Noland
         
        11.
        Add assembly (i.e.) tar creation to pom Sub-task Resolved Szehon Ho
         
        12.
        Ability to compile odbc and re-generate generated code stored in source control Sub-task Resolved Brock Noland
         
        13.
        fix saveVersion.sh to work on mac Sub-task Resolved Owen O'Malley
         
        14.
        Create script for removing ant artifacts after merge Sub-task Resolved Brock Noland
         
        15.
        Create profile to generate protobuf Sub-task Resolved Brock Noland
         
        16.
        Merge latest trunk into branch and fix resulting tests Sub-task Resolved Brock Noland
         
        17.
        Ensure all artifacts are prefixed with hive- Sub-task Resolved Brock Noland
         
        18.
        Fix eclipse:eclipse maven goal Sub-task Resolved Carl Steinbach
         
        19.
        Verify versions of libraries post maven merge Sub-task Resolved Brock Noland
         
        20. Separate reactor root or aggregator from parent pom Sub-task Open Unassigned
         
        21.
        Fix broken tests after maven merge (1) Sub-task Resolved Brock Noland
         
        22.
        Generate javadoc and source jars Sub-task Resolved Szehon Ho
         
        23. Create something like ant testreport Sub-task Open Unassigned
         
        24.
        Cleanup transitive dependencies Sub-task Resolved Unassigned
         
        25.
        Tar files should extract to the directory of the same name minus tar.gz Sub-task Resolved Brock Noland
         
        26.
        Fix binary packaging build eg include hcatalog, resolve pom issues Sub-task Resolved Brock Noland
         
        27.
        log4j properties appear to have been lost in maven upgrade Sub-task Resolved Sergey Shelukhin
         
        28.
        Fix hadoop2 execution environment Milestone 1 Sub-task Resolved Vikram Dixit K
         
        29. Remove versions from child module dependencies Sub-task Patch Available Unassigned
         
        30.
        Fix issues with new paths to jar in hcatalog Sub-task Resolved Brock Noland
         
        31.
        Rename HCatalog HBase Storage Handler artifact id Sub-task Resolved Brock Noland
         
        32.
        Fix hadoop2 execution environment Milestone 2 Sub-task Resolved Vikram Dixit K
         
        33.
        PTest2 should support build-only args Sub-task Resolved Brock Noland
         
        34.
        Shade Kryo dependency Sub-task Resolved Brock Noland
         
        35.
        Fix eclipse:eclipse post shim aggregation changes Sub-task Resolved Szehon Ho
         
        36.
        generate eclipse settings files that comply with hive code convention Sub-task Resolved Unassigned
         
        37.
        Implement checkstyle in maven Sub-task Closed Lars Francke
         

          Activity

          Brock Noland added a comment -

          Is the branch public? For my part I leave on vacation in a few days and I am booked up until I leave but it'd be interesting to see the work.

          Edward Capriolo added a comment -

          https://github.com/edwardcapriolo/hive

          I am working on getting ql ATM.

          Owen O'Malley added a comment -

          Suggestions:

          • separate the ql and exec jars. We should have a jar that includes just the Hive code and not the dependencies.
          • remove the ant dir, we don't need it after moving to maven.
          • i'd suggest making the top level the parent pom and then making a separate aggregation directory for packaging
          • if you want to look at the attempt i started back in april, i pushed it to https://github.com/omalley/hive on the maven branch.
          Edward Capriolo added a comment -

          "remove the ant dir, we don't need it after moving to maven."

          Right, I was actually thinking we can have the project in a state where either could build it for a while but it might be nice to cut bait with the old system.

          "separate the ql and exec jars. We should have a jar that includes just the Hive code and not the dependencies."
          Makes sense. I have just been focused on removing input formats and zk.

          I will look at what you did here: https://github.com/omalley/hive. I am punting on many issues on my first pass, and there are lots of things I do not yet know exactly how to solve.

          Edward Capriolo added a comment -

          I like what you did with the shims, that was one I just punted on.

          Brock Noland added a comment -

          Just an FYI as I know Xuefu Zhang had expressed interest in this as well. He is OOO until next week as well.

          Edward Capriolo added a comment -

          I am hitting a weird blocker now with antlr generation of the ql/hive-exec project. My antlr+plugin combination was able to generate the hive-metastore .g files ok but is having issues with HiveLexer.g and HiveParser.g. That is my biggest blocker at the moment. If the issue keeps up I may switch to exec for the time being.

          Sergey Shelukhin added a comment -

          Would this be a good time to change module structure? I can do a followup patch after this is done. It would be nice to separate metastore client from server, both for potential external usage, and for internal features where metastore server wants to involve QL bits.

          Edward Capriolo added a comment -

          Right now we are not making any branches/patches yet. Our plan is to hack at github and then, once we get everything working the way we like, open a hive branch and do it all again. Breaking up the metastore sounds ok.

          Roshan Naik added a comment -

          Curious: is ant's 'makepom' task (to convert an ivy file into a pom file) a useful starting point for such an effort?

          Edward Capriolo added a comment -

          Sergey Shelukhin I think what you are saying is the thrift part of hive-metastore should be a submodule. That seems easy to do. I can do that. In other words, the thrift generated classes would live in their own sub project.

          Roshan Naik I will look at that. Currently though there is a lot of stuff in there I do not want. I generally like gutting things and stripping them down. I am afraid a tool like 'makepom' will just produce very ugly and complicated poms, but I am not saying I won't try it.

          Sergey Shelukhin added a comment -

          I was thinking about splitting metastore client from server, not just thrift (thrift should be in client too), so that users of metastore wouldn't have to depend on server. In particular, right now metastore server cannot use anything in QL without indirect code, because QL uses metastore client/common bits.

          Sergey Shelukhin added a comment -

          I figure if there will be disturbance in the build anyway, I can do it right after to not have disturbance for too long

          Brock Noland added a comment -

          Sergey, I think that makes sense, but I think it would make sense to complete the mavenization effort first.

          Brock Noland added a comment -

          Hi,

          I'd really like to help work on this and spent the past couple of days doing so and plan to work on it next week as well. However, I think we shouldn't commit the change until after the 0.12 branch has been created. I have attached my WIP patch HIVE-5107-wip.patch. Basically I borrowed heavily from both Edward's and Owen's approaches with the additional belief that during the conversion we should move as few files as possible until after we have moved to maven. Thus I have only moved files when it would otherwise create a circular dependency.

          Current status:

          • All java source including tests (minus the generated tests) is compiling
          • The new it-tests module contains any test which would have generated a circular dependency
          • Generated tests are not being generated (they will be generated in the it-tests module)
          • Some tests pass, most do not because the required test.* properties are not set
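The circular-dependency point can be made concrete. Maven builds the modules of a multi-module project in dependency order and refuses to build when the module graph contains a cycle. The following is only a minimal sketch of that ordering check: the three-module graph and the class and method names (`ModuleOrderSketch`, `buildOrder`) are assumptions made for illustration, not Hive's real module graph or any Maven API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; module names and edges are simplified assumptions.
class ModuleOrderSketch {

    /**
     * Returns the modules in an order where each module comes after all of its
     * dependencies, or throws if the graph contains a cycle.
     * deps maps a module to the modules it depends on; every module must be a key.
     */
    static List<String> buildOrder(Map<String, List<String>> deps) {
        // Number of not-yet-built dependencies per module.
        Map<String, Integer> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : deps.entrySet()) {
            remaining.put(e.getKey(), e.getValue().size());
        }
        Deque<String> ready = new ArrayDeque<>();
        for (Map.Entry<String, Integer> e : remaining.entrySet()) {
            if (e.getValue() == 0) {
                ready.add(e.getKey());
            }
        }
        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            String done = ready.remove();
            order.add(done);
            // Every module that depended on 'done' has one fewer thing to wait for.
            for (Map.Entry<String, List<String>> e : deps.entrySet()) {
                if (e.getValue().contains(done)) {
                    int left = remaining.merge(e.getKey(), -1, Integer::sum);
                    if (left == 0) {
                        ready.add(e.getKey());
                    }
                }
            }
        }
        if (order.size() != deps.size()) {
            throw new IllegalStateException("cyclic dependency among modules");
        }
        return order;
    }

    public static void main(String[] args) {
        Map<String, List<String>> deps = new LinkedHashMap<>();
        deps.put("common", Collections.<String>emptyList());
        deps.put("metastore", Arrays.asList("common"));
        deps.put("ql", Arrays.asList("common", "metastore"));
        System.out.println(buildOrder(deps));

        // A test inside metastore that needs ql adds the edge metastore -> ql.
        deps.put("metastore", Arrays.asList("common", "ql"));
        try {
            buildOrder(deps);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Moving that test into a separate module that depends on both removes the cycle.
        deps.put("metastore", Arrays.asList("common"));
        deps.put("itests", Arrays.asList("metastore", "ql"));
        System.out.println(buildOrder(deps));
    }
}
```

The last step is the design choice described in the status list: rather than restructuring the production modules, the offending tests go into a downstream module that may depend on everything.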

          Time Comparison:

          • ant clean package vs mvn clean package
            • ant 2.5 minutes
            • mvn 30 seconds
          • rm -rf ~/.ivy2 && ant very-clean clean package vs rm -rf ~/.m2/repository && mvn clean package -DskipTests
            • ant 25 minutes
            • mvn 5 minutes

          How to use:

          1. Apply patch
          2. Execute maven-rollforward.sh

          How the patch was generated:

          1. Execute maven-rollback.sh
          2. git add .
          3. git status | grep deleted: | awk '{print $NF}' | xargs git rm
          4. git diff origin/trunk

          Next steps:

          • Generate TestCli* tests in the it-tests module without moving the existing tests' location.
          • Trim down the size of hive-ql-$version-exec.jar. It includes all deps at present as opposed to only the ones we need.
          • Get tests passing.

          Output of mvn clean package -DskipTests:

          [INFO] Reactor Summary:
          [INFO]
          [INFO] Hive .............................................. SUCCESS [0.106s]
          [INFO] Hive Ant Utilities ................................ SUCCESS [1.066s]
          [INFO] Hive Shims Common ................................. SUCCESS [0.392s]
          [INFO] Hive Shims 0.20 ................................... SUCCESS [0.243s]
          [INFO] Hive Shims Secure Common .......................... SUCCESS [0.414s]
          [INFO] Hive Shims 0.20S .................................. SUCCESS [0.133s]
          [INFO] Hive Shims 0.23 ................................... SUCCESS [0.409s]
          [INFO] Hive Shims ........................................ SUCCESS [0.495s]
          [INFO] Hive Common ....................................... SUCCESS [0.484s]
          [INFO] Hive Serde ........................................ SUCCESS [2.009s]
          [INFO] Hive Metastore .................................... SUCCESS [4.204s]
          [INFO] Hive TestUtils .................................... SUCCESS [0.027s]
          [INFO] Hive Query Language ............................... SUCCESS [9.673s]
          [INFO] Hive Service ...................................... SUCCESS [1.296s]
          [INFO] Hive JDBC ......................................... SUCCESS [0.315s]
          [INFO] Hive Beeline ...................................... SUCCESS [0.156s]
          [INFO] Hive CLI .......................................... SUCCESS [0.210s]
          [INFO] Hive Contrib ...................................... SUCCESS [0.201s]
          [INFO] Hive HBase Handler ................................ SUCCESS [0.196s]
          [INFO] Hive HCatalog ..................................... SUCCESS [0.001s]
          [INFO] Hive HCatalog Core ................................ SUCCESS [0.662s]
          [INFO] Hive HCatalog Pig Adapter ......................... SUCCESS [0.240s]
          [INFO] Hive HCatalog Server Extensions ................... SUCCESS [0.177s]
          [INFO] Hive HCatalog Webhcat Java Client ................. SUCCESS [0.172s]
          [INFO] Hive HCatalog Webhcat ............................. SUCCESS [0.273s]
          [INFO] Hive HCatalog HBase Storage Handler ............... SUCCESS [0.307s]
          [INFO] Hive HWI .......................................... SUCCESS [0.109s]
          [INFO] Hive Integration Tests ............................ SUCCESS [0.030s]
          [INFO] ------------------------------------------------------------------------
          [INFO] BUILD SUCCESS
          [INFO] ------------------------------------------------------------------------
          [INFO] Total time: 24.490s
          [INFO] Finished at: Fri Sep 06 14:45:25 CDT 2013
          [INFO] Final Memory: 104M/799M
          [INFO] ------------------------------------------------------------------------
          
          Edward Capriolo added a comment -

          Nice. So it's not going to be like building software on the punch card machine anymore? Once we release 0.12 we should cut a branch for these changes and test fixing.

          Carl Steinbach added a comment -

          Brock Noland Do you have any theories about why your mavenized build completes in 1/5 the amount of time required by the current build?

          Brock Noland added a comment -

          I think a large part is ivy vs maven. Ivy seems to do a lot more lookups because the dependencies for each module are not well defined. I still have to add generating the TestCli* tests to the maven build so the comparison above wasn't perfectly fair. However, I don't think that will materially add to the build time.

          Edward Capriolo added a comment -

          Carl Steinbach The reason I see is that Ivy is always contacting the public internet to test checksums of everything. With maven, once you have the jars in your .m2/repo it never contacts the internet anymore for checksums. Maven works totally offline, while Ivy is constantly checksumming things and doing web requests.

          Edward Capriolo added a comment -

          Brock Noland Now that 0.12 is cut, let's start a branch for this. We can convert this into an umbrella issue to fix the tests and other things that come up.

          Brock Noland added a comment -

          Sounds good! After thinking about this for some time, I don't want to check any svn mv commands into the branch. Therefore any patch to the maven branch would require us to:

          1) (if moves were performed) Add the commands to maven-rollforward.sh and maven-rollback.sh
          2) Execute maven-rollback.sh
          3) Generate diff
          4) After approval commit diff

          Additionally when pulling from the branch we would:

          1) Execute maven-rollback.sh
          2) pull
          3) Execute maven-rollforward.sh
          4) Continue working

          That is a little painful but it ensures we stay in sync with trunk as much as possible.

          Brock Noland added a comment -

          FYI the maven branch is not working at present. I merged in from trunk and therefore have to deal with the new vectorization code gen and such. Additionally this has exposed the fact that because the qfile tests require the uber hive-exec jar we'll have to deal with them differently. I should have a patch to get the branch in a working condition Monday.

          Brock Noland added a comment -

          FYI anyone interested in this project should think about watching HIVE-5610 which is the jira we'll use to actually execute the merge.

          Vaibhav Gumashta added a comment -

          Hi Brock Noland! Thanks for the awesome effort. I have one question regarding the organization of tests. Some of the tests have been moved to the itests folder whereas some live in the original package. Is there a good reason for having that structure? For example, some of the unit test files for the service package live in service/src/test/org/apache/hive/service, while some of them have moved to itests/hive-unit/src/test/java/org/apache/hive/service.

          Brock Noland added a comment -

          itests are any tests that have cyclical dependencies or require that the packages be built. Typically only integration tests have those requirements, thus I have named it itests.

          Vaibhav Gumashta added a comment -

          I'm not well-versed in maven, but wouldn't it be cleaner to move all the tests to itests? I think it might become confusing when adding new tests if the tests for a package are split into different locations.

          Edward Capriolo added a comment -

          Generally it is better to put unit tests closest to the code they are testing. This makes it easier to determine test coverage.

          Integration tests usually involve testing across modules.

          Ideally we want tests to be localized. Someone working in hive-avro should not have to run tests unrelated to avro to add a feature. I think that is what we are aiming for: clean separation and easier testing without a full run.

          Brock Noland added a comment -

          This is an area of opinion...so know that the following is a bunch of opinion, I don't mean to present it as fact. Since I am presenting opinion, I want to apologize in advance should I fail to appropriately express my opinion and cause offense. I mean none. I also don't want to presume too much, so if I define things you are well versed in, I am very sorry, I simply don't want to make assumptions that might reduce the clarity of my statements.

          "I'm not well-versed in maven, but wouldn't it be cleaner to move all the tests to itests?"

          I don't think so. To be clear, I don't like the way this is broken up, but compared with the old ant based build, I am willing to live with this 100 times over. That old build had degraded into "a fate worse than death". Ideally we would re-write the "unit" tests in itest into true unit tests, as defined below, and move them back into the regular build.

          "I think it might become confusing when adding new tests if the tests for a package are split into different locations."

          I would strongly suggest that no new tests should be added to itests. If a new "unit" test is being added there it's because it is not in fact a unit test.

          Often you have two kinds of tests:

          • unit tests - tests that individually only test a "small" piece of functionality making them extremely fast
          • integration tests - cross module testing e.g. test that test the integration of one or more components

          I strongly believe, and it's maven's assumption, that unit tests live with the code they are testing (the pattern is module/src/main/java, module/src/test/java). This is good because:

          • After making changes within a module you can do sanity tests by easily running the tests within only that module
          • Tests are limited to the dependencies of their own module; allowing unit tests to import any dependency results in what are supposed to be "unit" tests degrading into integration tests

          Because the previous hive build was monolithic it wasn't easy to make changes within a single module and then run sanity tests on only that module. Additionally because any project could use nearly any dependency, a large number of what should have been "unit" tests degraded into integration tests.

          When I say "degrade into an integration test" I mean that a unit test should test a specific piece of functionality and be extremely fast, generally sub-second. In my experience, if a test takes more than one to two seconds it's either a poorly written unit test, a poorly written class, or an integration test. Often it's because infrastructure which is not being tested is instantiated rather than mocked, e.g. starting a server as opposed to mocking a server.
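
          To illustrate the "mock the server, don't start it" point, here is a minimal sketch in plain Java. Every name in it (MetadataClient, TableLocator, FakeMetadataClient) is hypothetical and made up for illustration; none of it is Hive code. It assumes the class under test depends on a small interface, so a test can hand it an in-memory fake instead of a running server.

```java
import java.util.HashMap;
import java.util.Map;

public class TableLookupExample {

    /** Hypothetical interface to a remote metadata server (not a Hive API). */
    interface MetadataClient {
        String fetchLocation(String tableName);
    }

    /** Code under test: it only knows about the interface, not about any server. */
    static class TableLocator {
        private final MetadataClient client;

        TableLocator(MetadataClient client) {
            this.client = client;
        }

        /** Returns the table location with exactly one trailing slash. */
        String qualifiedLocation(String tableName) {
            String location = client.fetchLocation(tableName);
            if (location == null) {
                throw new IllegalArgumentException("Unknown table: " + tableName);
            }
            return location.endsWith("/") ? location : location + "/";
        }
    }

    /** In-memory stand-in used by tests, so no server is ever started. */
    static class FakeMetadataClient implements MetadataClient {
        private final Map<String, String> locations = new HashMap<>();

        FakeMetadataClient with(String tableName, String location) {
            locations.put(tableName, location);
            return this;
        }

        @Override
        public String fetchLocation(String tableName) {
            return locations.get(tableName);
        }
    }

    public static void main(String[] args) {
        TableLocator locator = new TableLocator(
                new FakeMetadataClient().with("sales", "/warehouse/sales"));
        System.out.println(locator.qualifiedLocation("sales"));
    }
}
```

          A test written against the fake touches no network, process, or disk, which is what keeps it in sub-second territory. A mocking library could replace the hand-written fake; the hand-written version is used here only to keep the sketch free of extra dependencies.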

          TL;DR: I'd love to move the tests in itest back to their owning module; it just requires some work to do so. Huge +1 to anyone willing to take on the challenge!

          phanikumar added a comment -

          Hi, I am trying to compile the code using maven but I am getting an error. Can someone help me?

          [ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:2.3.2:compile (default-compile) on project hive-exec: Compilation failure
          [ERROR] /home/datafreak/Downloads/hive-trunk/ql/src/java/org/apache/hadoop/hive/ql/metadata/Hive.java:[2402,15] error: cannot find symbol
          [ERROR] -> [Help 1]
          [ERROR]
          [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
          [ERROR] Re-run Maven using the -X switch to enable full debug logging.
          [ERROR]
          [ERROR] For more information about the errors and possible solutions, please read the following articles:
          [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
          [ERROR]
          [ERROR] After correcting the problems, you can resume the build with the command
          [ERROR] mvn <goals> -rf :hive-exec


            People

            • Assignee:
              Edward Capriolo
              Reporter:
              Edward Capriolo
            • Votes:
              6
              Watchers:
              18

              Dates

              • Created:
                Updated:
                Resolved:

                Development