Bigtop
  1. Bigtop
  2. BIGTOP-1213

TestHadoopExamples smokes: orderering of tests; other improvements

    Details

    • Type: Improvement Improvement
    • Status: Resolved
    • Priority: Major Major
    • Resolution: Fixed
    • Affects Version/s: 0.7.0
    • Fix Version/s: 0.8.0
    • Component/s: tests
    • Labels:
      None

      Description

      According to the JDK contract for a Map, the order of iteration is not gauranteed.

      But in TestHadoopExamples.groovy, we define a map and iterate through it.

      The tests should run in the defined order,

      1) both for determinism, as well

      2) so that TeraGen always runs before TeraSort

       static Map examples =
          [
              pi :'5 10',
              wordcount :"$EXAMPLES/text $EXAMPLES_OUT/wordcount",
              teragen :"${terasort_rows} teragen${terasortid}",
              terasort :"teragen${terasortid} terasort${terasortid}",
              teravalidate :"terasort${terasortid} tervalidate${terasortid}",
              ... 
          ];
      

      examples.each

      { k, v -> res[k] = [k.toString(), v.toString()] as Object[]; }
      
      

      While implementing this JIRA, we can also add in some other minor improvements :

      • Parameterization of calculate pi so that it can run fast on VMS
      • Remove "sleep" , as its not in newer versions of YARN ( i think, need to confirm)

      TL;DR, Im proposing a fix to add concreted deterministic JDK-independant ordering to the Map so tests always run in same order + a few minor upgrades to this class while were at it.

      1. BIGTOP-1213.patch
        7 kB
        Dasha Boudnik
      2. BIGTOP-1213.patch
        2 kB
        Dasha Boudnik

        Activity

        Hide
        Dasha Boudnik added a comment -

        Looking into this.

        Show
        Dasha Boudnik added a comment - Looking into this.
        Hide
        Dasha Boudnik added a comment -

        All the suggested changes are included in this patch. Namely:

        • TeraGen should now always run before Terasort. I changed Map to LinkedHashMap, as the latter guarantees to iterate in the order in which the entries were put into the map
        • parameters can be set for pi; I kept the default at 2 maps, 1000 samples just to avoid changing the code more than I needed to
        • sleep is no longer included, as it's indeed not in newer versions for YARN

        Hope that covers everything!

        Show
        Dasha Boudnik added a comment - All the suggested changes are included in this patch. Namely: TeraGen should now always run before Terasort. I changed Map to LinkedHashMap, as the latter guarantees to iterate in the order in which the entries were put into the map parameters can be set for pi; I kept the default at 2 maps, 1000 samples just to avoid changing the code more than I needed to sleep is no longer included, as it's indeed not in newer versions for YARN Hope that covers everything!
        Hide
        jay vyas added a comment - - edited

        Thanks Dasha... how did you make this patch? (someone else correct me if im wrong... ) but I tried to apply but i think you might have formatted the patch differently... Git can't apply it for me. Here is my workflow that mark grover initially gave me. Can you try it out and resubmit the patch? Thanks!

        1.put the changes all in 1 commit w/ a umbrella commit message i.e. "BIGTOP-1213: TestHadoopExamples fixes" .

        2. run

         git format-patch HEAD^..HEAD --stdout > BIGTOP-1213.patch
        

        3. Upload the BIGTOP-1213.patch file to this JIRA

        Show
        jay vyas added a comment - - edited Thanks Dasha... how did you make this patch? (someone else correct me if im wrong... ) but I tried to apply but i think you might have formatted the patch differently... Git can't apply it for me. Here is my workflow that mark grover initially gave me. Can you try it out and resubmit the patch? Thanks! 1.put the changes all in 1 commit w/ a umbrella commit message i.e. " BIGTOP-1213 : TestHadoopExamples fixes" . 2. run git format-patch HEAD^..HEAD --stdout > BIGTOP-1213.patch 3. Upload the BIGTOP-1213 .patch file to this JIRA
        Hide
        Konstantin Boudnik added a comment -

        Dasha, looks like you have the traces of BIGTOP-1208 in this patch

        Show
        Konstantin Boudnik added a comment - Dasha, looks like you have the traces of BIGTOP-1208 in this patch
        Hide
        Dasha Boudnik added a comment -

        Jay, my apologies – I think this should be good

        Show
        Dasha Boudnik added a comment - Jay, my apologies – I think this should be good
        Hide
        jay vyas added a comment -

        Hi Dasha . The patch looks pretty good ! .. but still having a little issue applying it.

        I think maybe the IntelliJ Headers or something are messing it up?

        Maybe try the workflow i pasted above out? it will definetly work for you:

         git format-patch HEAD^..HEAD --stdout > BIGTOP-1213.patch
        

        And then just attach the .patch file to this JIRA.

        Then, for example of a correct formatted patch see this one:
        https://issues.apache.org/jira/secure/attachment/12627913/BIGTOP-1200.1.patch

        Then I will test the patch out for you on my VMs and let you know

        Show
        jay vyas added a comment - Hi Dasha . The patch looks pretty good ! .. but still having a little issue applying it. I think maybe the IntelliJ Headers or something are messing it up? Maybe try the workflow i pasted above out? it will definetly work for you: git format-patch HEAD^..HEAD --stdout > BIGTOP-1213.patch And then just attach the .patch file to this JIRA. Then, for example of a correct formatted patch see this one: https://issues.apache.org/jira/secure/attachment/12627913/BIGTOP-1200.1.patch Then I will test the patch out for you on my VMs and let you know
        Hide
        Konstantin Boudnik added a comment -

        BTW, the original type Map that has been used in the test isn't a Java Map - Groovy's uses LInkedHashMap as a default map implementation. It doesn't hurt to make things a little bit more expressive - like has been done in this patch - but it wasn't a bug to begin with. Re-labeling as "Improvement".

        Show
        Konstantin Boudnik added a comment - BTW, the original type Map that has been used in the test isn't a Java Map - Groovy's uses LInkedHashMap as a default map implementation. It doesn't hurt to make things a little bit more expressive - like has been done in this patch - but it wasn't a bug to begin with. Re-labeling as "Improvement".
        Hide
        Konstantin Boudnik added a comment -

        Jay, the patch is applying perfectly fine with
        patch -p0 < BIGTOP-1213.patch
        If you prefer using git apply they make sure to provide -p0 option because git's default is -p1

        Show
        Konstantin Boudnik added a comment - Jay, the patch is applying perfectly fine with patch -p0 < BIGTOP-1213 .patch If you prefer using git apply they make sure to provide -p0 option because git's default is -p1
        Hide
        jay vyas added a comment -

        Ahh okay, thanks cos ! so i guess in bigtop we support both p0 and p1 for commits? Id prefer if we use the git default in the future as a standard for patches, but fair enough.

        @cos good point regarding the default Map implementation : Sorry - im still a groovy novice.
        @Dasha ill apply/test this patch shortly and thanks again for the patch .

        Show
        jay vyas added a comment - Ahh okay, thanks cos ! so i guess in bigtop we support both p0 and p1 for commits? Id prefer if we use the git default in the future as a standard for patches, but fair enough. @cos good point regarding the default Map implementation : Sorry - im still a groovy novice. @Dasha ill apply/test this patch shortly and thanks again for the patch .
        Hide
        Konstantin Boudnik added a comment -

        Jay, -p0 or -p1 is coming from the projects that are hosted on different VCS like svn. The case in point is Hadoop where you have to be flexible as people do use git for development but svn for central repo.

        Show
        Konstantin Boudnik added a comment - Jay, -p0 or -p1 is coming from the projects that are hosted on different VCS like svn. The case in point is Hadoop where you have to be flexible as people do use git for development but svn for central repo.
        Hide
        Dasha Boudnik added a comment -

        Hi Jay, any updates on the patch?

        Show
        Dasha Boudnik added a comment - Hi Jay, any updates on the patch?
        Hide
        jay vyas added a comment -

        (testing now)...

        Show
        jay vyas added a comment - (testing now)...
        Hide
        jay vyas added a comment - - edited

        +1 to commit this . yup worked just fine ... Thanks Dasha !

        Sorry took so long, my "automated system for testing smokes" isnt being very automatic today (note that i only ran teragen, and i think the patch is simple enough it doesnt need a full battery of tests) ...

        Show
        jay vyas added a comment - - edited +1 to commit this . yup worked just fine ... Thanks Dasha ! Sorry took so long, my "automated system for testing smokes" isnt being very automatic today (note that i only ran teragen, and i think the patch is simple enough it doesnt need a full battery of tests) ...
        Hide
        Dasha Boudnik added a comment -

        Great! No worries at all – just wanted to make sure you're alright over there since it's my fix you're testing

        Show
        Dasha Boudnik added a comment - Great! No worries at all – just wanted to make sure you're alright over there since it's my fix you're testing
        Hide
        Konstantin Boudnik added a comment -

        I have changed the synopsis: details should go to description instead

        Pushed to master. Thanks Dasha!

        Show
        Konstantin Boudnik added a comment - I have changed the synopsis: details should go to description instead Pushed to master. Thanks Dasha!

          People

          • Assignee:
            Dasha Boudnik
            Reporter:
            jay vyas
          • Votes:
            0 Vote for this issue
            Watchers:
            5 Start watching this issue

            Dates

            • Created:
              Updated:
              Resolved:

              Development