  1. Spark
  2. SPARK-31094

Removing redundant rules in the output of Frequent Pattern Growth Algorithm

    Details

    • Type: Brainstorming
    • Status: Resolved
    • Priority: Minor
    • Resolution: Incomplete
    • Affects Version/s: 2.4.5
    • Fix Version/s: None
    • Component/s: ML

      Description

      This issue proposes implementing an is.redundant() function for Spark ML, similar to the one in the R arules package: https://rdrr.io/cran/arules/man/is.redundant.html

      By definition:

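      A rule is redundant if a more general rule with the same or a higher confidence exists. That is, a more specific rule is redundant if it is only equally or even less predictive than a more general rule. A rule is more general if it has the same consequent but one or more items removed from the antecedent.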

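      Because FP-Growth exhaustively enumerates every frequent itemset above the support threshold, many of the association rules derived from its output are redundant. There is therefore merit in implementing this function in Spark: filtering out redundant rules reduces the total number of rules in the output and leaves a more concise rule set in which each remaining rule is at least as predictive as the more specific rules it replaces.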
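
A minimal sketch of the redundancy check described above, in plain Python rather than Spark, to make the definition concrete. The `Rule` type and the `is_redundant` helper are hypothetical names for illustration; the field names simply mirror the antecedent, consequent, and confidence columns of Spark ML's association rule output. The sketch assumes a rule is compared only against rules present in the same input list, and it is not the arules implementation.

```python
from itertools import combinations
from typing import FrozenSet, List, NamedTuple


class Rule(NamedTuple):
    # Hypothetical container for one association rule (illustration only).
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    confidence: float


def is_redundant(rules: List[Rule]) -> List[bool]:
    """Return one flag per rule: True if a more general rule in the
    same list has the same or a higher confidence."""
    # Index confidences by (antecedent, consequent) for constant-time lookup.
    conf = {(r.antecedent, r.consequent): r.confidence for r in rules}
    flags = []
    for r in rules:
        items = sorted(r.antecedent)
        redundant = False
        # Enumerate every proper subset of the antecedent (sizes 0 .. n-1).
        for k in range(len(items)):
            for subset in combinations(items, k):
                general = conf.get((frozenset(subset), r.consequent))
                if general is not None and general >= r.confidence:
                    redundant = True
                    break
            if redundant:
                break
        flags.append(redundant)
    return flags


rules = [
    Rule(frozenset({"bread"}), frozenset({"butter"}), 0.8),
    Rule(frozenset({"bread", "milk"}), frozenset({"butter"}), 0.8),
    Rule(frozenset({"bread", "jam"}), frozenset({"butter"}), 0.9),
    Rule(frozenset({"milk"}), frozenset({"butter"}), 0.5),
]
flags = is_redundant(rules)
print(flags)  # [False, True, False, False]
```

Here {bread, milk} => {butter} is flagged because the more general rule {bread} => {butter} is equally confident, while {bread, jam} => {butter} is kept because adding jam raises the confidence. The subset enumeration is exponential in the antecedent size, so a distributed version would likely need a different strategy, such as grouping rules by consequent.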

            People

            • Assignee: Unassigned
            • Reporter: Dyex719 Aditya Addepalli