  1. Spark
  2. SPARK-4791

Create SchemaRDD from case classes with multiple constructors


Details

    • Type: Improvement
    • Status: Resolved
    • Priority: Minor
    • Resolution: Fixed
    • 1.3.0
    • None
    • Component/s: SQL
    • None

    Description

      Issue: An RDD of case classes can usually be converted to a SchemaRDD, with Spark SQL inferring the schema from the case class metadata. However, if the case class has multiple constructors, ScalaReflection.schemaFor cannot tell which constructor to read the fields from, and schema inference fails.

      Motivation: In spark.ml, I would like to create a class with the following signature:
      ```
      case class LabeledPoint(label: Double, features: Vector, weight: Double) {
        // Auxiliary constructor: weight defaults to 1.0
        def this(label: Double, features: Vector) = this(label, features, 1.0)
      }
      ```

      Proposed fix: Change ScalaReflection.schemaFor so that it checks whether the class has multiple constructors. If it does, schemaFor should use the primary constructor. This will not change the behavior of existing code, since schemaFor currently supports only case classes with exactly one constructor.
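
      The constructor lookup described above could be sketched roughly as follows. This is a standalone illustration, not the code in ScalaReflection.schemaFor: the helper name primaryConstructorParams and the object PrimaryConstructorSketch are hypothetical, Seq[Double] stands in for Vector so the sketch has no Spark dependency, and it assumes Scala 2.11 or later (for termNames and paramLists).

      ```scala
      import scala.reflect.runtime.universe._

      // Stand-in for the class in the issue; Seq[Double] replaces Vector (assumption).
      case class LabeledPoint(label: Double, features: Seq[Double], weight: Double) {
        def this(label: Double, features: Seq[Double]) = this(label, features, 1.0)
      }

      object PrimaryConstructorSketch {
        // Hypothetical helper: returns (name, type) pairs for the parameters
        // of the primary constructor of T.
        def primaryConstructorParams[T: TypeTag]: Seq[(String, Type)] = {
          val tpe = typeOf[T]
          val ctorSymbol = tpe.member(termNames.CONSTRUCTOR)
          val primary: MethodSymbol =
            if (ctorSymbol.isMethod) {
              // Exactly one constructor: the symbol is the method itself.
              ctorSymbol.asMethod
            } else {
              // Several constructors: the symbol is overloaded, so search its
              // alternatives for the one flagged as the primary constructor.
              ctorSymbol.asTerm.alternatives
                .collect { case s if s.isMethod => s.asMethod }
                .find(_.isPrimaryConstructor)
                .getOrElse(sys.error(s"No primary constructor found for $tpe"))
            }
          // Use the first parameter list, as for a plain case class.
          primary.paramLists.head.map(p => (p.name.toString, p.typeSignature))
        }
      }
      ```

      Calling PrimaryConstructorSketch.primaryConstructorParams[LabeledPoint] should then report the three fields label, features, and weight, ignoring the two-argument auxiliary constructor.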

          People

            Assignee: Joseph K. Bradley (josephkb)
            Reporter: Joseph K. Bradley (josephkb)