Apache MADlib / MADLIB-847

One-way ANOVA produces inconsistent results

Details

    • Type: Bug
    • Status: Resolved
    • Priority: Major
    • Resolution: Unresolved

    Description

      The one-way ANOVA test appears to be sensitive to how the input data is distributed: the same rows, stored in a table distributed by (level), produce different results. It is not yet clear exactly which circumstances trigger this.

      Example:

       
      select madlib.one_way_anova(level, value) from anova_test;
                                                        one_way_anova                                                  
      -----------------------------------------------------------------------------------------------------------------
       (3.19605033565888,28.6618271033655,4,77,0.79901258391472,0.372231520822928,2.14654734813501,0.0830236307841792)
      (1 row)
      
      CREATE TABLE anova_test2 as SELECT * FROM anova_test distributed by (level);
      
      select madlib.one_way_anova(level, value) from anova_test2;
                                                       one_way_anova                                                  
      ----------------------------------------------------------------------------------------------------------------
       (2.75936790108985,29.0985095379345,4,77,0.689841975272463,0.377902721271877,1.82544855181438,0.13247440016284)
      (1 row)
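
      For reference, a one-way ANOVA depends only on the per-group count, sum, and sum of squares, so the result should not change with how rows are split across partitions. The sketch below illustrates that expected invariance in Python. It is not MADlib's implementation: the function names and the sample rows are hypothetical, and the returned tuple assumes the first fields of the output above are (SS between, SS within, df between, df within), which is consistent with the arithmetic of the values shown. Under that reading, the two runs above agree on the total sum of squares (about 31.8579) and on the degrees of freedom (4, 77), and differ only in how the total is split between groups and within groups.

```python
import math
from collections import defaultdict
from functools import reduce


def partial_stats(rows):
    """Per-group [count, sum, sum of squares] for one partition of (level, value) rows."""
    stats = defaultdict(lambda: [0, 0.0, 0.0])
    for level, value in rows:
        s = stats[level]
        s[0] += 1
        s[1] += value
        s[2] += value * value
    return stats


def merge_stats(a, b):
    """Combine two partial states by adding the per-group accumulators."""
    merged = defaultdict(lambda: [0, 0.0, 0.0])
    for part in (a, b):
        for level, (n, sx, sxx) in part.items():
            m = merged[level]
            m[0] += n
            m[1] += sx
            m[2] += sxx
    return merged


def one_way_anova(stats):
    """Return (ss_between, ss_within, df_between, df_within, f_statistic)."""
    n_total = sum(n for n, _, _ in stats.values())
    grand_mean = sum(sx for _, sx, _ in stats.values()) / n_total
    ss_between = sum(n * (sx / n - grand_mean) ** 2 for n, sx, _ in stats.values())
    # Textbook shortcut formula; simple, but not the most numerically stable choice.
    ss_within = sum(sxx - sx * sx / n for n, sx, sxx in stats.values())
    df_between = len(stats) - 1
    df_within = n_total - len(stats)
    f_statistic = (ss_between / df_between) / (ss_within / df_within)
    return ss_between, ss_within, df_between, df_within, f_statistic


# Hypothetical sample data, not taken from the anova_test table.
rows = [("a", 1.0), ("a", 2.0), ("a", 3.0),
        ("b", 2.0), ("b", 4.0), ("b", 6.0),
        ("c", 5.0), ("c", 5.0), ("c", 8.0)]

# Two ways of splitting the same rows across three partitions.
parts_by_level = [[r for r in rows if r[0] == lvl] for lvl in ("a", "b", "c")]
parts_round_robin = [rows[i::3] for i in range(3)]

result_by_level = one_way_anova(reduce(merge_stats, map(partial_stats, parts_by_level)))
result_round_robin = one_way_anova(reduce(merge_stats, map(partial_stats, parts_round_robin)))

# Both partitionings should agree up to floating-point rounding.
same = all(math.isclose(x, y) for x, y in zip(result_by_level, result_round_robin))
print(result_by_level, result_round_robin, same)
```

      A merge step that overwrote a group's accumulators instead of adding them would behave correctly when each group lives on a single partition and incorrectly otherwise. That is one hypothetical way an aggregate could become distribution-sensitive; this issue does not identify it as the cause.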
      
      

          People

            haying Xixuan (Aaron) Feng
            cwelton Caleb Welton