Spark / SPARK-19899

FPGrowth input column naming


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Major
    • Resolution: Fixed
    • Affects Version/s: 2.2.0
    • Fix Version/s: 2.2.0
    • Component/s: ML
    • Labels: None

    Description

      The current implementation extends HasFeaturesCol. Personally, I find this rather unfortunate. Until now we have used a consistent convention: if we mix in HasFeaturesCol, the featuresCol should be VectorUDT.

      Using the same Param for an array<T> (and possibly for array<array<T>> once PrefixSpan is ported to ml) will be confusing for users.

      I would like to suggest adding a new trait (let's say HasTransactionsCol) to clearly indicate that the input type differs from that of the other Estimators.
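
      A minimal sketch of what such a trait might look like, assuming it follows the same pattern as the existing shared Param traits. Only the name HasTransactionsCol comes from the suggestion above; the default column name "transactions", the doc string, and the DemoParams class are hypothetical and are there only to make the sketch compile on its own.

```scala
import org.apache.spark.ml.param.{Param, ParamMap, Params}
import org.apache.spark.ml.util.Identifiable

// Hypothetical trait: a dedicated Param for an array<T> input column,
// kept separate from featuresCol (which is expected to hold VectorUDT).
trait HasTransactionsCol extends Params {
  final val transactionsCol: Param[String] = new Param[String](
    this, "transactionsCol", "name of the transactions column (array<T>)")

  // Assumed default, chosen only for this illustration.
  setDefault(transactionsCol, "transactions")

  final def getTransactionsCol: String = $(transactionsCol)
}

// Hypothetical concrete holder, used only to exercise the trait.
class DemoParams(override val uid: String) extends HasTransactionsCol {
  def this() = this(Identifiable.randomUID("demo"))

  def setTransactionsCol(value: String): this.type = set(transactionsCol, value)

  override def copy(extra: ParamMap): DemoParams = defaultCopy(extra)
}
```

      With a separate Param like this, an Estimator that mixes in the trait would expose getTransactionsCol rather than getFeaturesCol, so the expected input type is visible from the Param name itself.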

            People

              Assignee: Maciej Szymkiewicz (zero323)
              Reporter: Maciej Szymkiewicz (zero323)
              Shepherd: Joseph K. Bradley
              Votes: 0
              Watchers: 5
