Hive / HIVE-4675

Create new parallel unit test environment

    Details

    • Type: Improvement
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: None
    • Fix Version/s: 0.12.0
    • Component/s: Testing Infrastructure
    • Labels:
      None

      Description

      The current ptest tool is great, but it has the following limitations:

      -Requires an NFS filer
      -Unless the NFS filer is dedicated, ptests can become IO bound easily
      -Investigating failures is troublesome because the source directory for the failure is not saved
      -Ignoring or isolating tests is not supported
      -No unit tests for the ptest framework exist

      It'd be great to have a ptest tool that addresses these limitations.

      1. HIVE-4675.patch
        415 kB
        Brock Noland
      2. HIVE-4675.patch
        405 kB
        Brock Noland
      3. HIVE-4675.patch
        404 kB
        Brock Noland
      4. HIVE-4675.patch
        657 kB
        Brock Noland

          Activity

          Brock Noland created issue -
          Brock Noland added a comment -

          First off, sorry for the large patch. I created this tool as a personal side project. It eventually became quite useful as an internal tool.

          I would like to contribute the Hive parallel unit testing framework to Hive proper. Currently it's on GitHub: https://github.com/brockn/hive-ptest/ but I'd love to generate a patch for Hive and upload it here if it's of interest. It has the following features:

          -Does not require an NFS filer
          -Utilizes multiple disks and multiple CPUs on a slave host
          -Entire source directory for any test failure is saved
          -Highly configurable. A properties file is used for base configuration and many items can be overridden at runtime
          -Patch builds can take a URL, meaning the patch can be pulled directly from JIRA
          -Reliable. We are using this to test all patches internally
          -Tests can be easily isolated or ignored
          -The framework itself is tested via unit tests

          Utilizing 8 hosts internally we have been able to get the trunk unit tests down to 60 minutes.
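
The "properties file for base configuration, overridden at runtime" idea in the list above can be pictured with a small sketch. This is not ptest2 code (ptest2 is written in Java); it is a Python illustration only, and the property names `repository` and `hosts` are hypothetical, chosen just for the example.

```python
def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that have no '=' separator
        props[key.strip()] = value.strip()
    return props


def effective_config(base_text, overrides):
    """Return the base properties with runtime overrides applied on top."""
    config = parse_properties(base_text)
    config.update(overrides)
    return config


# Hypothetical property names, used only for illustration.
BASE = """
# base configuration
repository = trunk
hosts = 8
"""

config = effective_config(BASE, {"hosts": "4"})
print(config)
```

The design point is simply that the file supplies defaults and anything passed at run time wins, so one base file can serve many builds.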

          Brock Noland made changes -
          Field Original Value New Value
          Description "-Required an NFS filer" "-Requires an NFS filer" (rest of the description unchanged)
          Shreepadma Venugopalan added a comment -

          +1 to the proposal.

          Ashutosh Chauhan added a comment -

          Does it use the existing ptest framework, or is it a complete rewrite? It looks like a complete rewrite, since it's in Java and the earlier one was in Python. If so, shall we remove the earlier framework at the same time we adopt this one? Having two frameworks doing the same thing will be confusing. Thoughts?

          Brock Noland added a comment -

          Hi,

          Yes it's a re-write. I am in favor of your proposal. I'll generate a patch which does the swap.

          Brock

          Ashutosh Chauhan added a comment -

          Hold off on generating the patch. Let's see if others agree or disagree with the proposal.

          Brock Noland added a comment -

          Sounds good! FWIW, I figured we'd let the patch sit for a while.

          Edward Capriolo added a comment -

          This sounds great. Not requiring python is a positive.

          Brock Noland added a comment -

          Thank you for the feedback!!

          Ashutosh Chauhan added a comment -

          Kevin Wilfong, Gang Tim Liu, Pamela Vagata: are you guys still using the Python-based ptest framework? This new framework looks promising.

          Brock Noland made changes -
          Link This issue blocks HIVE-4739 [ HIVE-4739 ]
          Brock Noland made changes -
          Link This issue is related to HIVE-4290 [ HIVE-4290 ]
          Ashutosh Chauhan added a comment -

          Brock Noland, I think you should go ahead and create the patch. Let's not rip out the Python ptest yet. Let's keep it around while we familiarize ourselves with the new framework.

          Brock Noland added a comment -

          Sounds good, will do!

          Brock Noland made changes -
          Link This issue is related to HIVE-4609 [ HIVE-4609 ]
          Brock Noland added a comment -

          https://reviews.facebook.net/D11427
          Brock Noland made changes -
          Attachment HIVE-4675.patch [ 12588904 ]
          Brock Noland made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Fix Version/s 0.12.0 [ 12324312 ]
          Brock Noland made changes -
          Link This issue is related to HIVE-4815 [ HIVE-4815 ]
          Brock Noland added a comment -

          To summarize, here is where we are at:

          -HIVE-4675 (this jira) contains ptest2 which works on a fixed number of dedicated hosts
          -HIVE-4815 updates ptest2 to work with ec2 spot instances
          -An Apache Jenkins build has been executing over the holiday weekend

          What are your thoughts on committing HIVE-4675 and HIVE-4815, Ashutosh Chauhan?

          I have been using ptest2 to execute Hive trunk builds on the Apache jenkins: https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/. The tests have been running on ec2 spot instances. Things have been going very well. It takes about 2 hours via 8 XL ec2 spot instances. Since build 27 I have only been making minor changes.

          Summary of Builds since stabilization

          • Passed [28, 29, 33, 34, 35, 36, 38, 39, 40]
          • Failed due to a flaky test [27, 30, 37]
          • Failed due to my mistake: [31, 32] (compilation error), [41] (restarted the webserver hosting the service)
          Ashutosh Chauhan added a comment -

          Awesome work, Brock! I saw the ptest builds on the Apache Jenkins machines, really cool stuff. I haven't looked at the patches yet, but will start taking a look soon. cc: Edward Capriolo, who might be interested in this work as well.

          Brock Noland added a comment -

          Sounds good! FWIW, the other builds are waiting on an INFRA issue: INFRA-6525

          Vikram Dixit K added a comment -

          I used this framework to run tests on hive on a single node. It took about half the time that it normally takes which is great. However, I am unable to figure out the failing tests. I got a message that goes:

          TestOrcHCatLoader has one or more failing tests...

          Also, it doesn't seem like the output is integrated with the ant testreport target. It would be great to see a summary of failing tests. Could you please elaborate on how to get an idea of the failing tests?

          Thanks!

          Brock Noland added a comment -

          Hi,

          Great to hear! The TEST-.xml file should be in the "logs" directory in the working dir. Typically we run this via jenkins and then in the jenkins build script copy the TEST-.xml files into a directory for jenkins to parse.
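
The copy step described above could look roughly like the following. It is a sketch only, not the actual Jenkins build script: the function name, the recursive search, and the sample file names are assumptions, and the destination is assumed to live outside the "logs" directory.

```python
import shutil
import tempfile
from pathlib import Path


def collect_reports(logs_dir, dest_dir):
    """Copy XML files whose names start with 'TEST-' from logs_dir into dest_dir.

    Searching logs_dir recursively is an assumption about its layout.
    Returns the copied file names in sorted path order.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(Path(logs_dir).rglob("*.xml")):
        if path.is_file() and path.name.startswith("TEST-"):
            shutil.copy2(path, dest / path.name)
            copied.append(path.name)
    return copied


# Demo on a throwaway directory; all names below are made up for the example.
with tempfile.TemporaryDirectory() as work:
    logs = Path(work, "logs")
    (logs / "TestExample").mkdir(parents=True)
    (logs / "TestExample" / "TEST-example.xml").write_text("<testsuite/>")
    (logs / "TestExample" / "hive.log").write_text("not a report")
    copied = collect_reports(logs, Path(work, "reports"))
    print(copied)
```

Jenkins can then be pointed at the destination directory to parse the reports and list the failing tests.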

          I think we could generate some kind of report as well, did you want to create an enhancement request describing what you'd like?

          Brock

          Vikram Dixit K added a comment -

          Brock Noland I have raised HIVE-4842 for the same.

          Thanks!

          Brock Noland made changes -
          Link This issue blocks HIVE-4842 [ HIVE-4842 ]
          Brock Noland added a comment -

          Hi,

          Just curious if anyone has started reviewing this yet?

          The reason I ask is HIVE-4815 did a lot of refactoring of this code so that patch by itself is quite large. If no one has started reviewing this then I am inclined to merge this patch and HIVE-4815 because the total size of the combined patch is actually smaller than the current patch on this JIRA.

          TLDR: If I merge this jira with HIVE-4815 the result is a smaller patch.

          Brock

          Ashutosh Chauhan added a comment -

          I for one haven't gotten around to looking at this one. It's quite a big patch, so I need to allocate a continuous block of time for it :) I am fine with combining the two patches.

          Brock Noland added a comment -

          Sounds good, will do! I said it earlier but I'll say it again, sorry for the large patch!

          Brock Noland made changes -
          Status Patch Available [ 10002 ] Open [ 1 ]
          Brock Noland added a comment -

          Other than Ashutosh, I haven't heard of interest in reviewing this change, so I am assuming no one else has started reviewing it. I'll go ahead and consolidate the patches as previously discussed.

          Brock Noland made changes -
          Attachment HIVE-4675.patch [ 12592384 ]
          Brock Noland made changes -
          Remote Link This issue links to "Review (Web Link)" [ 12416 ]
          Brock Noland added a comment -

          Linked to Review item: https://reviews.facebook.net/D11427?large=true#toc
          Brock Noland made changes -
          Link This issue blocks HIVE-4862 [ HIVE-4862 ]
          Brock Noland added a comment -

          Minor changes: Remove whitespace, fix xml dependency version, and allow configuring the number of log dirs per test "profile"

          Brock Noland made changes -
          Attachment HIVE-4675.patch [ 12592466 ]
          Brock Noland made changes -
          Status Open [ 1 ] Patch Available [ 10002 ]
          Brock Noland added a comment -

          The latest patch supports pre-commit testing. Since this patch is so damn large already, I won't be making any more changes to it other than critical/blocker bug fixes.

          Brock Noland made changes -
          Attachment HIVE-4675.patch [ 12592626 ]
          Brock Noland added a comment -

          OK, the latest patch is up and running pre-commit tests. The only issue I've seen is that the pre-commit build has a lot of patches to test! I know of ways we can increase throughput, but I think we should wait until this mega-patch is committed to avoid making it even larger.

          Brock Noland made changes -
          Link This issue blocks HIVE-4882 [ HIVE-4882 ]
          Edward Capriolo added a comment -

          +1 it is already working well. Will commit in 24 hours.

          Edward Capriolo added a comment -

          I noticed the patch had some junk files in there like :

          rig
          A testutils/ptest2/src/test/resources/test-outputs/skewjoin.q-ab8536a7-1b5c-45ed-ba29-14450f27db8b-hive.log
          ! testutils/ptest2/src/test/resources/test-outputs/skewjoin.q-ab8536a7-1b5c-45ed-ba29-14450f27db8b-hive.log.orig

          Can you please regenerate?

          Brock Noland added a comment -

          Hi,

          I have the *-hive.log files in there as part of the output parsing tests to ensure that we don't accidentally try and parse the hive.log logs. I have stripped them down to only a few lines. Not sure where the .orig file came from as it's not in the patch?
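
The kind of check those fixtures guard can be pictured with a small filter. This is an illustrative Python sketch, not the ptest2 parser: the function names and the matching rule are assumptions, and `skewjoin.q.out` is a made-up file name. The hive.log name is the fixture quoted earlier in this thread.

```python
def is_hive_log(filename):
    """True for Hive's own log files: 'hive.log' or anything ending in '-hive.log'."""
    return filename == "hive.log" or filename.endswith("-hive.log")


def files_to_parse(filenames):
    """Keep only the files an output parser should read, preserving order."""
    return [name for name in filenames if not is_hive_log(name)]


candidates = [
    "skewjoin.q-ab8536a7-1b5c-45ed-ba29-14450f27db8b-hive.log",  # fixture from this thread
    "skewjoin.q.out",  # hypothetical test output file
]
print(files_to_parse(candidates))
```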

          Brock

          Edward Capriolo added a comment -

          I understand. It is all good then. The .orig files were generated by my patch program.

          Brock Noland made changes -
          Link This issue blocks HIVE-4889 [ HIVE-4889 ]
          Edward Capriolo added a comment -

          Nice work Brock. Big data does not need shared NFS file systems for unit tests.

          Edward Capriolo made changes -
          Status Patch Available [ 10002 ] Resolved [ 5 ]
          Resolution Fixed [ 1 ]
          Brock Noland added a comment -

          Thanks!! I have a couple small fixes which I'll submit tomorrow and then we can move the ptest build infrastructure to the official source tree!

          Ashutosh Chauhan added a comment -

          This issue has been fixed and released as part of the 0.12 release. If you find further issues, please create a new JIRA and link it to this one.

          Ashutosh Chauhan made changes -
          Status Resolved [ 5 ] Closed [ 6 ]
          Transition                      Time In Source Status   Execution Times   Last Executer      Last Execution Date
          Patch Available -> Open         24d 18h 53m             1                 Brock Noland       15/Jul/13 15:52
          Open -> Patch Available         14d 14h 54m             2                 Brock Noland       16/Jul/13 02:49
          Patch Available -> Resolved     3d 1h 59m               1                 Edward Capriolo    19/Jul/13 04:48
          Resolved -> Closed              88d 19h 42m             1                 Ashutosh Chauhan   16/Oct/13 00:31

            People

            • Assignee: Brock Noland
            • Reporter: Brock Noland
            • Votes: 0
            • Watchers: 7
