  1. Spark
  2. SPARK-31400

The catalogString doesn't distinguish Vectors in ml and mllib

    Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • Affects Version/s: 2.4.5
    • Fix Version/s: 3.1.0
    • Component/s: ML, MLlib
    • Labels:
      None
    • Environment:

      Ubuntu 16.04

      Description

      Bug Description

      The `catalogString` is not detailed enough to distinguish pyspark.ml.linalg.Vectors from pyspark.mllib.linalg.Vectors.

      How to reproduce the bug

      The example below is taken from the official documentation (Python code). All other lines are left untouched; only the Vectors import line is changed, as follows:

      # from pyspark.ml.linalg import Vectors
      from pyspark.mllib.linalg import Vectors
      

      Or you can directly execute the following code snippet:

      # Assumes an active SparkSession named `spark`, as in the pyspark shell.
      from pyspark.ml.feature import MinMaxScaler
      # from pyspark.ml.linalg import Vectors   # import used by the official example
      from pyspark.mllib.linalg import Vectors  # swapped import that triggers the error
      dataFrame = spark.createDataFrame([
          (0, Vectors.dense([1.0, 0.1, -1.0]),),
          (1, Vectors.dense([2.0, 1.1, 1.0]),),
          (2, Vectors.dense([3.0, 10.1, 3.0]),)
      ], ["id", "features"])
      scaler = MinMaxScaler(inputCol="features", outputCol="scaledFeatures")
      scalerModel = scaler.fit(dataFrame)  # raises IllegalArgumentException
      

      The call to scaler.fit raises the following error:

      IllegalArgumentException: 'requirement failed: Column features must be of type struct<type:tinyint,size:int,indices:array<int>,values:array<double>> but was actually struct<type:tinyint,size:int,indices:array<int>,values:array<double>>.'
      

      However, the actual struct and the required struct are printed as exactly the same string, so the message gives the programmer no useful information. I would suggest making the catalogString distinguish pyspark.ml.linalg.Vectors from pyspark.mllib.linalg.Vectors.
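
One possible way to make such a message informative is sketched below in plain Python; it is an illustration of the idea, not Spark's code and not necessarily how the issue was fixed. It prints the class of each type next to its catalog string, so two types with identical struct layouts can still be told apart. The names MLVectorUDT, MLlibVectorUDT, describe, and check_column_type are hypothetical stand-ins, not Spark APIs.

```python
# Hypothetical stand-ins for the two vector types; these are NOT Spark classes.
STRUCT = "struct<type:tinyint,size:int,indices:array<int>,values:array<double>>"


class MLVectorUDT:
    """Stands in for the type behind pyspark.ml.linalg vectors."""
    catalog_string = STRUCT


class MLlibVectorUDT:
    """Stands in for the type behind pyspark.mllib.linalg vectors."""
    catalog_string = STRUCT


def describe(data_type):
    """Return 'module.ClassName:catalog_string' for a type instance."""
    cls = type(data_type)
    return f"{cls.__module__}.{cls.__qualname__}:{data_type.catalog_string}"


def check_column_type(col_name, expected, actual):
    """Raise ValueError if `actual` is not the same class as `expected`."""
    if type(actual) is not type(expected):
        raise ValueError(
            f"requirement failed: Column {col_name} must be of type "
            f"{describe(expected)} but was actually {describe(actual)}."
        )


if __name__ == "__main__":
    try:
        check_column_type("features", MLVectorUDT(), MLlibVectorUDT())
    except ValueError as err:
        # The two class names differ even though the struct strings match.
        print(err)
```

With the class name included, the two sides of the message differ even though both struct strings are identical.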

      Thanks!

       

            People

            • Assignee:
              junpeiz Junpei Zhou
              Reporter:
              junpeiz Junpei Zhou
